Downloadclient

FileDownloadState

The state a file can be in before/while/after downloading.

BaseExtractionTool(program_name, useability_check_args, extract_args, logger=logging.log)

Initialises an extraction tool object

Parameters:

Name Type Description Default
program_name str

the name of the archive extraction program, e.g., unzip

required
useability_check_args str

the arguments passed to the extraction program to test whether it is installed, e.g., --version

required
extract_args str

the arguments that will be passed to the program for extraction

required
logger LoggerFunction

optional decorated logging.log object that can be passed from the calling daemon or client.

log

is_useable()

Checks if the extraction tool is installed and usable

Returns:

Type Description
bool

True if it is usable, otherwise False

try_extraction(archive_file_path, file_to_extract, dest_dir_path)

Calls the extraction program to extract a file from an archive

Parameters:

Name Type Description Default
archive_file_path str

path to the archive

required
file_to_extract str

file name to extract from the archive

required
dest_dir_path str

destination directory where the extracted file will be stored

required

Returns:

Type Description
bool

True on success, otherwise False
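
The interface described above can be pictured as a thin wrapper around a command-line program. The following is a minimal sketch under stated assumptions, not the library's implementation: the class name `SketchExtractionTool`, the `{placeholder}` style used for `extract_args`, and the use of `subprocess.run` are all hypothetical choices made for illustration.

```python
import shlex
import subprocess
import sys


class SketchExtractionTool:
    """Hypothetical stand-in that mirrors the documented interface."""

    def __init__(self, program_name, useability_check_args, extract_args):
        self.program_name = program_name
        self.useability_check_args = useability_check_args
        # Assumption: extract_args is a template with named placeholders.
        self.extract_args = extract_args

    def _run(self, args):
        """Run the program with an argument string; True if it exits with 0."""
        cmd = [self.program_name] + shlex.split(args)
        try:
            completed = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        except OSError:
            # Raised when the program is missing or cannot be executed.
            return False
        return completed.returncode == 0

    def is_useable(self):
        """Check that the program can be started with the check arguments."""
        return self._run(self.useability_check_args)

    def try_extraction(self, archive_file_path, file_to_extract, dest_dir_path):
        """Fill the template and run the program; True on success."""
        args = self.extract_args.format(
            archive_file_path=shlex.quote(archive_file_path),
            file_to_extract=shlex.quote(file_to_extract),
            dest_dir_path=shlex.quote(dest_dir_path),
        )
        return self._run(args)


# The running Python interpreter serves as a program that is certain to exist.
probe = SketchExtractionTool(sys.executable, "--version", "")
print(probe.is_useable())
```

Returning a bool rather than raising keeps the caller free to try several tools in turn and use the first one that reports itself as usable.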

DownloadClient(client=None, logger=None, tracing=True, check_admin=False, check_pcache=False)

Initialises the basic settings for a DownloadClient object

Parameters:

Name Type Description Default
client Optional[Client]

Optional: rucio.client.client.Client object. If None, a new object will be created.

None
external_traces

Optional: reference to a list where traces can be added

required
logger Optional[LoggerFunction]

Optional: logging.Logger object. If None, default logger will be used.

None

download_pfns(items, num_threads=2, trace_custom_fields=None, traces_copy_out=None, deactivate_file_download_exceptions=False)

Download items with a given PFN. This function can only download files, no datasets.

Parameters:

Name Type Description Default
items list[dict[str, Any]]

List of dictionaries, each describing a file to download. Keys:
pfn - PFN string of this file
did - DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed
rse - RSE name (e.g. 'CERN-PROD_DATADISK'). RSE expressions are not allowed
base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
adler32 - Optional: the adler32 checksum to compare the downloaded file's adler32 checksum with
md5 - Optional: the md5 checksum to compare the downloaded file's md5 checksum with
transfer_timeout - Optional: timeout for the download protocols. (Default: None)
check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

required
num_threads int

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

2
trace_custom_fields Optional[dict[str, Any]]

Custom key value pairs to send with the traces

None
traces_copy_out Optional[list[dict[str, Any]]]

reference to an external list, where the traces should be uploaded

None
deactivate_file_download_exceptions bool

If true, file download exceptions are not raised

False

Returns:

Type Description
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState. clientState can be one of the following: ALREADY_DONE, DONE, FILE_NOT_FOUND, FAIL_VALIDATE, FAILED

Raises:

Type Description
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download
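
To make the shape of `items` concrete, here is a small sketch that builds one entry and applies the constraint stated above (no wildcards in the DID). The helper names `make_pfn_item` and `download`, the `scope:name` check, and the example PFN are hypothetical and only serve the illustration. The import path `rucio.client.downloadclient` is assumed from the page title; the call itself needs a configured Rucio environment, so it is kept in a separate function.

```python
from typing import Any


def make_pfn_item(pfn: str, did: str, rse: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: build one entry of the `items` list."""
    if "*" in did:
        raise ValueError("wildcards are not allowed in the DID")
    scope, sep, name = did.partition(":")
    if not (scope and sep and name):
        # Assumption: a DID string has the form 'scope:name'.
        raise ValueError("expected a DID of the form 'scope:name'")
    item: dict[str, Any] = {"pfn": pfn, "did": did, "rse": rse}
    # Optional keys such as base_dir, no_subdir, adler32 or md5.
    item.update(options)
    return item


def download(items: list[dict[str, Any]], num_threads: int = 2) -> list[dict[str, Any]]:
    """Pass the items to download_pfns; requires a working Rucio setup."""
    # Imported lazily so building items does not require Rucio to be installed.
    from rucio.client.downloadclient import DownloadClient

    client = DownloadClient()
    return client.download_pfns(items, num_threads=num_threads)


example_item = make_pfn_item(
    "root://host.example//data/file.name",  # made-up PFN for illustration
    "scope:file.name",
    "CERN-PROD_DATADISK",
    base_dir="downloads",
    no_subdir=True,
)
print(example_item)
```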

download_dids(items, num_threads=2, trace_custom_fields=None, traces_copy_out=None, deactivate_file_download_exceptions=False, sort=None)

Download items with given DIDs. This function can also download datasets and wildcarded DIDs.

Parameters:

Name Type Description Default
items list[dict[str, Any]]

List of dictionaries, each describing an item to download. Keys:
did - DID string of this file (e.g. 'scope:file.name')
filters - Filter to select DIDs for download. Optional if did is given
rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download
impl - Optional: name of the protocol implementation to be used to download this item
no_resolve_archives - Optional: bool indicating whether archives should not be considered for download. (Default: False)
resolve_archives - Deprecated: use no_resolve_archives instead
force_scheme - Optional: force a specific scheme to download this item. (Default: None)
base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset
ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
transfer_timeout - Optional: timeout for the download protocols. (Default: None)
transfer_speed_timeout - Optional: minimum allowed transfer speed (in KBps). Ignored if transfer_timeout is set. Otherwise, used to compute the default timeout. (Default: 500)
check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

required
num_threads int

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

2
trace_custom_fields Optional[dict[str, Any]]

Custom key value pairs to send with the traces.

None
traces_copy_out Optional[list[dict[str, Any]]]

reference to an external list, where the traces should be uploaded

None
deactivate_file_download_exceptions bool

If true, file download exceptions are not raised

False
sort Optional[SORTING_ALGORITHMS_LITERAL]

Select the best replica using a replica sorting algorithm. Available algorithms:
geoip - based on src/dst IP topographical distance
closeness - based on src/dst closeness
dynamic - Rucio Dynamic Smart Sort (tm)

None

Returns:

Type Description
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

Raises:

Type Description
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download
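
Because the return value is a list of dictionaries carrying a clientState, a caller will usually want to tally the outcomes, particularly when deactivate_file_download_exceptions is set and failures no longer raise. The sketch below assumes that the clientState values listed for download_pfns apply here as well, and that ALREADY_DONE and DONE both count as success. The function names and the sample data are hypothetical.

```python
from collections import Counter

# States documented for download_pfns; assumed to apply to download_dids too.
KNOWN_STATES = ("ALREADY_DONE", "DONE", "FILE_NOT_FOUND", "FAIL_VALIDATE", "FAILED")

# Assumption: these two states mean the file is available locally.
SUCCESS_STATES = {"ALREADY_DONE", "DONE"}


def summarize_states(results):
    """Count result entries per clientState."""
    return Counter(entry.get("clientState", "UNKNOWN") for entry in results)


def unsuccessful(results):
    """Return the entries whose clientState is not a success state."""
    return [entry for entry in results if entry.get("clientState") not in SUCCESS_STATES]


# Made-up results in the documented shape, standing in for a real return value.
sample_results = [
    {"did": "scope:a", "clientState": "DONE"},
    {"did": "scope:b", "clientState": "FAILED"},
    {"did": "scope:c", "clientState": "ALREADY_DONE"},
]

for state, count in summarize_states(sample_results).items():
    print(state, count)
for entry in unsuccessful(sample_results):
    print("needs attention:", entry["did"])
```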

Download items using a given metalink file.

Parameters:

Name Type Description Default
item dict[str, Any]

dictionary describing an item to download. Keys:
base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
transfer_timeout - Optional: timeout for the download protocols. (Default: None)
check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

required
num_threads int

Suggestion of number of threads to use for the download. It will be lowered if it's too high.

2
trace_custom_fields Optional[dict[str, Any]]

Custom key value pairs to send with the traces.

None
traces_copy_out Optional[list[dict[str, Any]]]

reference to an external list, where the traces should be uploaded

None
deactivate_file_download_exceptions bool

If true, file download exceptions are not raised

False

Returns:

Type Description
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

Raises:

Type Description
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download

download_aria2c(items, trace_custom_fields=None, filters=None, deactivate_file_download_exceptions=False, sort=None)

Uses aria2c to download the items with given DIDs. This function can also download datasets and wildcarded DIDs. It can only download files that are available via https/davs. aria2c must be installed and X509_USER_PROXY must be set.

Parameters:

Name Type Description Default
items list[dict[str, Any]]

List of dictionaries, each describing an item to download. Keys:
did - DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed
rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download
base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset
ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

required
trace_custom_fields Optional[dict[str, Any]]

Custom key value pairs to send with the traces

None
filters Optional[dict[str, Any]]

dictionary containing filter options

None
deactivate_file_download_exceptions bool

If true, file download exceptions are not raised

False
sort Optional[SORTING_ALGORITHMS_LITERAL]

Select the best replica using a replica sorting algorithm. Available algorithms:
geoip - based on src/dst IP topographical distance
closeness - based on src/dst closeness
dynamic - Rucio Dynamic Smart Sort (tm)

None

Returns:

Type Description
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

Raises:

Type Description
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something went wrong during the download (e.g. aria2c could not be started)
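
Since download_aria2c depends on an installed aria2c and on X509_USER_PROXY being set, it can be useful to check both before calling it. The following is a sketch of such a pre-flight check using only the standard library. The function name `aria2c_prerequisites` and its injectable parameters are hypothetical, and this is not how the client itself performs the check.

```python
import os
import shutil


def aria2c_prerequisites(environ=None, which=shutil.which):
    """Return a list of unmet prerequisites; an empty list means ready.

    `environ` and `which` are injectable so the check can be exercised
    without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    problems = []
    if which("aria2c") is None:
        problems.append("aria2c executable not found on PATH")
    if not environ.get("X509_USER_PROXY"):
        problems.append("X509_USER_PROXY is not set")
    return problems


# Report the state of the current environment.
for problem in aria2c_prerequisites():
    print("missing prerequisite:", problem)
```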

preferred_impl(sources)

Finds the optimum protocol impl preferred by the client and supported by the remote RSE.

Parameters:

Name Type Description Default
sources list[dict[str, Any]]

List of sources for a given DID

required

Raises:

Type Description
RucioException(msg)

general exception with msg for more details.