Data transfer

A basic requirement for transferring data is an SSH key pair whose public key is stored in the UDE SelfCare Portal; see Security -> SSH-keys in the SelfCare Portal.

Please note that the clusters have different access concepts: magnitUDE allows direct login to the frontend/login nodes, whereas amplitUDE is reached by a ‘jump’ through the jump host; see also Access with SSH.

In general, data transfers must be initiated from outside the cluster, because for IT security reasons the HPC systems only allow incoming SSH connections. All commands/tools must therefore be started on the local/user client system, which pushes data to or pulls data from the cluster. An exception is the transfer of data from magnitUDE to amplitUDE; see Copy files to amplitUDE in the magnitUDE section.

The command line tools scp and rsync can be used to transfer files between local computers and the file systems of the HPC systems. In the commands below, replace the target server TARGET and the storage location DIRECTORY according to the remarks in the section for the respective cluster.


General notes: RSYNC

To transfer files using rsync with compression during the transfer (option ‘z’), archive mode, i.e. recursive processing and preservation of almost all file attributes (option ‘a’), and screen output of each transferred file plus a summary (option ‘v’), use the following syntax:

rsync -azv <your-files> <username>@TARGET:DIRECTORY
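
Because every transfer is started on the local client, fetching data from the cluster uses the same syntax with source and destination swapped. The following is a minimal sketch of both directions; the helper names hpc_push and hpc_pull, the variable names and the user name jdoe are hypothetical, and TARGET and DIRECTORY remain the placeholders described above.

```shell
#!/bin/sh
# Sketch: push/pull helpers, to be run on the local client.
# HPC_USER, TARGET and REMOTE_DIR are placeholders; set them for your cluster.
HPC_USER="${HPC_USER:-jdoe}"            # hypothetical user name
TARGET="${TARGET:-TARGET}"              # server of the respective cluster
REMOTE_DIR="${REMOTE_DIR:-DIRECTORY}"   # storage location on that server

# Push: copy a local file or directory ($1) to the cluster.
hpc_push() {
    rsync -azv "$1" "${HPC_USER}@${TARGET}:${REMOTE_DIR}"
}

# Pull: copy a path below REMOTE_DIR ($1) from the cluster
# into a local directory ($2).
hpc_pull() {
    rsync -azv "${HPC_USER}@${TARGET}:${REMOTE_DIR}/$1" "$2"
}

# Usage (commented out because it needs a real TARGET and DIRECTORY):
#   hpc_push ./input-data
#   hpc_pull results ./
```

In both cases the command runs on the local machine; only the order of the two path arguments changes.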

General notes: SCP

Alternatively, the scp tool can be used. Large data should be reduced in size beforehand using compression (e.g. zip format). The syntax is:

scp <your-files> <username>@TARGET:DIRECTORY
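
As an illustration of compressing data before an scp transfer, the sketch below packs a directory into a single archive. It uses tar with gzip rather than zip, which is an assumption about what is installed on the local client; the directory and file names are made up for the demo, and the scp command is only printed because TARGET and DIRECTORY are placeholders.

```shell
#!/bin/sh
# Sketch: pack a directory into one compressed archive before using scp.
set -eu

# Demo data in a temporary directory so the sketch is harmless to run.
workdir="$(mktemp -d)"
mkdir -p "$workdir/dataset"
printf 'sample\n' > "$workdir/dataset/a.txt"
printf 'sample\n' > "$workdir/dataset/b.txt"

# Create a gzip-compressed tar archive of the directory.
archive="$workdir/dataset.tar.gz"
tar -czf "$archive" -C "$workdir" dataset

# List the archive contents to verify it before the transfer.
tar -tzf "$archive"

# Transfer command with the placeholders from the syntax above
# (printed here, not executed).
echo "scp $archive <username>@TARGET:DIRECTORY"

# On the cluster, the archive could be unpacked with: tar -xzf dataset.tar.gz
```

Transferring one archive instead of many small files also avoids the per-file overhead of scp.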

magnitUDE

Copy files from/to local client

The TARGET should be one of the frontend/login nodes of magnitUDE.

The storage location (DIRECTORY) should have one of the following forms:

/scratch/USERNAME/<subdirectory>/  # transfer to parallel file storage `SCRATCH`
/homes/USERNAME/<subdirectory>/  # transfer to university wide `HOME` storage

Copy files to amplitUDE

Because amplitUDE has a dedicated transfer server, data can be copied directly from magnitUDE to amplitUDE. In this case, one of the transfer commands above can be executed directly on a frontend node of magnitUDE. The TARGET should be

gateway.amplitude.uni-due.de

and the storage location (DIRECTORY) should have one of the following forms:

/lustre/scratch/<your-workspace>/  # transfer to parallel file storage `SCRATCH`
/lustre/hpc_home/USERNAME/<subdirectory>/  # transfer to permanent data storage `HPC_HOME`
/homes/USERNAME/<subdirectory>/  # transfer to university wide `HOME` storage
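
Put together, a transfer from magnitUDE SCRATCH into an amplitUDE workspace could look like the following sketch, run on a magnitUDE frontend node. The helper name copy_to_amplitude, the user name jdoe, and the example names project-a and my-workspace are hypothetical, and the sketch assumes the user name is the same on both clusters.

```shell
#!/bin/sh
# Sketch: to be run on a magnitUDE frontend node.
HPC_USER="${HPC_USER:-jdoe}"             # hypothetical; assumed identical on both clusters
GATEWAY="gateway.amplitude.uni-due.de"   # amplitUDE transfer server (TARGET)

# Copy a directory from magnitUDE SCRATCH into an amplitUDE workspace.
# $1: subdirectory below /scratch/<user>/ on magnitUDE
# $2: workspace name below /lustre/scratch/ on amplitUDE
copy_to_amplitude() {
    rsync -azv "/scratch/${HPC_USER}/$1" "${HPC_USER}@${GATEWAY}:/lustre/scratch/$2/"
}

# Usage (hypothetical names):
#   copy_to_amplitude project-a my-workspace
```

Without a trailing slash on the source, rsync creates the directory project-a inside the workspace; with a trailing slash it would copy only the directory's contents.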

amplitUDE

Copy files from/to local client

The TARGET should be the gateway/transfer server of amplitUDE, gateway.amplitude.uni-due.de.

The storage location (DIRECTORY) should have one of the following forms:

/lustre/scratch/<your-workspace>/  # transfer to parallel file workspace storage `SCRATCH`
/lustre/hpc_home/USERNAME/<subdirectory>/  # transfer to permanent data storage `HPC_HOME`
/homes/USERNAME/<subdirectory>/  # transfer to university wide `HOME` storage

Because an upstream jump host balances the load between the frontend nodes of the cluster, a dedicated transfer/gateway server is provided for file transfers with scp and rsync.

For example, to transfer a file from your local machine to your SCRATCH workspace on amplitUDE, run the following command from your local terminal:

scp /path/to/local/file <username>@gateway.amplitude.uni-due.de:/lustre/scratch/<your-workspace>/

Replace:

  • /path/to/local/file with the full path to the file you want to transfer.

  • <username> with your actual username.

  • <your-workspace> with the directory under /lustre/scratch/ that corresponds to your workspace.

If you’re transferring large files or directories, consider using rsync instead of scp. rsync is more efficient for large or repeated transfers because a repeated run skips files that are already up to date, transfers only the parts of files that have changed, and can continue an interrupted transfer.

For example, to transfer a directory to your workspace on amplitUDE:

rsync -avz /path/to/local/directory <username>@gateway.amplitude.uni-due.de:/lustre/scratch/<your-workspace>/

  • The -a option (archive mode) copies directories recursively and preserves file permissions and timestamps.

  • The -v option increases verbosity so you can monitor the transfer.

  • The -z option compresses data during the transfer, which can speed it up on slow connections.

This can be especially useful when working with large datasets or projects that need regular updates.
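
For regular updates of large datasets, it can help to preview a transfer first and to run it in a way that tolerates interruptions. The sketch below wraps two standard rsync options for this: --dry-run, which only lists what would be copied, and --partial, which keeps partially transferred files so that a re-run can reuse them. The helper names, the user name jdoe and the workspace name my-workspace are hypothetical.

```shell
#!/bin/sh
# Sketch: preview an rsync transfer, then run it so that it can be resumed.
# The default destination uses a hypothetical user and workspace name.
DEST="${DEST:-jdoe@gateway.amplitude.uni-due.de:/lustre/scratch/my-workspace/}"

# List what would be transferred from $1 without copying anything.
preview_transfer() {
    rsync -avz --dry-run "$1" "$DEST"
}

# Transfer $1; --partial keeps incomplete files after an interruption,
# --progress shows per-file progress.
resumable_transfer() {
    rsync -avz --partial --progress "$1" "$DEST"
}

# Usage (commented out because it needs access to the gateway):
#   preview_transfer /path/to/local/directory
#   resumable_transfer /path/to/local/directory
```

If the connection drops, calling resumable_transfer again with the same arguments continues the update instead of copying everything from scratch.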

Copy files from magnitUDE

See the section Copy files to amplitUDE under magnitUDE.


From internet sources

Due to IT security guidelines, the HPC systems have no direct connection to networks outside the UDE (i.e. the Internet). The web proxy proxy.hpc.uni-due.de is available so that users can access sources on the Internet. The environment variables $HTTP_PROXY and $HTTPS_PROXY are set accordingly on the HPC systems by default.
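
Some command line tools read only the lowercase variable names; curl, for example, ignores an uppercase HTTP_PROXY. The following sketch, to be run on the HPC system, shows the current settings and mirrors them to the lowercase names if those are not set. The function name sync_proxy_vars is hypothetical, and whether the lowercase variables are already set on the clusters is not stated here.

```shell
#!/bin/sh
# Sketch: inspect the proxy settings and mirror them to lowercase names.
echo "HTTP_PROXY=${HTTP_PROXY:-<not set>}"
echo "HTTPS_PROXY=${HTTPS_PROXY:-<not set>}"

# Export http_proxy/https_proxy from the uppercase variables,
# but do not overwrite values that are already present.
sync_proxy_vars() {
    if [ -n "${HTTP_PROXY:-}" ] && [ -z "${http_proxy:-}" ]; then
        export http_proxy="$HTTP_PROXY"
    fi
    if [ -n "${HTTPS_PROXY:-}" ] && [ -z "${https_proxy:-}" ]; then
        export https_proxy="$HTTPS_PROXY"
    fi
}

sync_proxy_vars
echo "http_proxy=${http_proxy:-<not set>}"
echo "https_proxy=${https_proxy:-<not set>}"
```

After this, tools started from the same shell pick up the proxy through either spelling of the variables.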

Only specific internet sources are enabled on this proxy. Further sources can be added on request to hpc-support@uni-due.de. The request must include an informal description of the source and its content, and a justification of why access is needed.