Downloadclient
FileDownloadState
The state a file can be in before/while/after downloading.
BaseExtractionTool(program_name, useability_check_args, extract_args, logger=logging.log)
Initialises an extraction tool object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
program_name | str | the name of the archive extraction program, e.g., unzip | required |
useability_check_args | str | the arguments of the extraction program used to test whether it is installed, e.g., --version | required |
extract_args | str | the arguments that will be passed to the program for extraction | required |
logger | LoggerFunction | optional decorated logging.log object that can be passed from the calling daemon or client | log |
is_useable()
Checks if the extraction tool is installed and usable
Returns:
Type | Description |
---|---|
bool | True if it is usable, otherwise False |
try_extraction(archive_file_path, file_to_extract, dest_dir_path)
Calls the extraction program to extract a file from an archive
Parameters:
Name | Type | Description | Default |
---|---|---|---|
archive_file_path | str | path to the archive | required |
file_to_extract | str | file name to extract from the archive | required |
dest_dir_path | str | destination directory where the extracted file will be stored | required |
Returns:
Type | Description |
---|---|
bool | True on success, otherwise False |
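The interface above (a program name, arguments for an installation check, and arguments for extraction) can be illustrated with a minimal sketch. This is not the Rucio implementation: the class name SketchExtractionTool is hypothetical, and the idea that extract_args is a template with named placeholders is an assumption made only for this example.

```python
import shlex
import subprocess
import sys


class SketchExtractionTool:
    """Hypothetical stand-in mirroring the documented interface (not Rucio code)."""

    def __init__(self, program_name, useability_check_args, extract_args):
        self.program_name = program_name
        self.useability_check_args = useability_check_args
        # Assumption: extract_args is a template with named placeholders.
        self.extract_args = extract_args
        self._useable = None  # cached result of the installation check

    def is_useable(self):
        """Run the program with its check arguments; exit code 0 means usable."""
        if self._useable is None:
            cmd = [self.program_name] + shlex.split(self.useability_check_args)
            try:
                self._useable = subprocess.run(cmd, capture_output=True).returncode == 0
            except OSError:
                # Raised when the program is missing or not executable.
                self._useable = False
        return self._useable

    def try_extraction(self, archive_file_path, file_to_extract, dest_dir_path):
        """Fill the template and run the program; True on exit code 0."""
        if not self.is_useable():
            return False
        args = self.extract_args.format(
            archive_file_path=shlex.quote(archive_file_path),
            file_to_extract=shlex.quote(file_to_extract),
            dest_dir_path=shlex.quote(dest_dir_path),
        )
        cmd = [self.program_name] + shlex.split(args)
        try:
            return subprocess.run(cmd, capture_output=True).returncode == 0
        except OSError:
            return False


# The Python interpreter stands in for a real archive program here.
tool = SketchExtractionTool(sys.executable, "--version", "-c pass")
missing = SketchExtractionTool("no-such-extraction-program", "--version", "")
```

With a real archive program such as unzip, the template in this sketch might read "{archive_file_path} {file_to_extract} -d {dest_dir_path}"; the placeholder names are part of the sketch, not of the documented API.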
DownloadClient(client=None, logger=None, tracing=True, check_admin=False, check_pcache=False)
Initialises the basic settings for a DownloadClient object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | Optional[Client] | Optional: rucio.client.client.Client object. If None, a new object will be created. | None |
external_traces | | Optional: reference to a list where traces can be added | required |
logger | Optional[LoggerFunction] | Optional: logging.Logger object. If None, default logger will be used. | None |
download_pfns(items, num_threads=2, trace_custom_fields=None, traces_copy_out=None, deactivate_file_download_exceptions=False)
Download items with a given PFN. This function can only download files, no datasets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | list[dict[str, Any]] | List of dictionaries, each describing a file to download. Keys: pfn - PFN string of this file; did - DID string of this file (e.g. 'scope:file.name'), wildcards are not allowed; rse - RSE name (e.g. 'CERN-PROD_DATADISK'), RSE expressions are not allowed; base_dir - Optional: base directory where the downloaded files will be stored (Default: '.'); no_subdir - Optional: if true, files are written directly into base_dir (Default: False); adler32 - Optional: the adler32 checksum to compare the downloaded file's adler32 checksum with; md5 - Optional: the md5 checksum to compare the downloaded file's md5 checksum with; transfer_timeout - Optional: timeout for the download protocols (Default: None); check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum. | required |
num_threads | int | Suggested number of threads to use for the download. It will be lowered if it is too high. | 2 |
trace_custom_fields | Optional[dict[str, Any]] | Custom key-value pairs to send with the traces | None |
traces_copy_out | Optional[list[dict[str, Any]]] | Reference to an external list where the traces should be uploaded | None |
deactivate_file_download_exceptions | bool | If True, file download exceptions are not raised | False |
Returns:
Type | Description |
---|---|
list[dict[str, Any]] | a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState. clientState can be one of the following: ALREADY_DONE, DONE, FILE_NOT_FOUND, FAIL_VALIDATE, FAILED |
Raises:
Type | Description |
---|---|
InputValidationError | if one of the input items is in the wrong format |
NoFilesDownloaded | if no files could be downloaded |
NotAllFilesDownloaded | if not all files could be downloaded |
RucioException | if something unexpected went wrong during the download |
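As a hedged illustration of the item format described above, the sketch below builds one item dictionary and checks it locally before calling download_pfns. The helper names make_pfn_item and download_by_pfn are hypothetical, the PFN and checksum values are placeholders, and treating the three keys not marked Optional (pfn, did, rse) as mandatory is an assumption drawn from the key list. The client argument is expected to be a DownloadClient instance.

```python
REQUIRED_PFN_KEYS = ("pfn", "did", "rse")  # assumption: keys not marked Optional


def make_pfn_item(pfn, did, rse, base_dir=".", **optional):
    """Build one item dictionary for download_pfns."""
    if "*" in did:
        raise ValueError("wildcards are not allowed in the DID: %s" % did)
    item = {"pfn": pfn, "did": did, "rse": rse, "base_dir": base_dir}
    item.update(optional)  # e.g. adler32, md5, no_subdir, transfer_timeout
    return item


def download_by_pfn(client, items, num_threads=2):
    """Check the items locally, then hand them to client.download_pfns."""
    for item in items:
        missing = [key for key in REQUIRED_PFN_KEYS if not item.get(key)]
        if missing:
            raise ValueError("item %r lacks keys: %s" % (item, ", ".join(missing)))
    return client.download_pfns(items, num_threads=num_threads)


item = make_pfn_item(
    pfn="root://example.org:1094//path/to/file.name",  # placeholder PFN
    did="scope:file.name",
    rse="CERN-PROD_DATADISK",
    adler32="0a1b2c3d",  # placeholder checksum
)
```

The local check only catches obviously malformed items early; the client performs its own validation and raises InputValidationError as documented.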
download_dids(items, num_threads=2, trace_custom_fields=None, traces_copy_out=None, deactivate_file_download_exceptions=False, sort=None)
Download items with given DIDs. This function can also download datasets and wildcarded DIDs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | list[dict[str, Any]] | List of dictionaries, each describing an item to download. Keys: did - DID string of this file (e.g. 'scope:file.name'); filters - filter to select DIDs for download, optional if did is given; rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download; impl - Optional: name of the protocol implementation to be used to download this item; no_resolve_archives - Optional: bool indicating whether archives should not be considered for download (Default: False); resolve_archives - Deprecated: use no_resolve_archives instead; force_scheme - Optional: force a specific scheme to download this item (Default: None); base_dir - Optional: base directory where the downloaded files will be stored (Default: '.'); no_subdir - Optional: if true, files are written directly into base_dir (Default: False); nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset; ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue (Default: False); transfer_timeout - Optional: timeout for the download protocols (Default: None); transfer_speed_timeout - Optional: minimum allowed transfer speed (in KBps), ignored if transfer_timeout is set, otherwise used to compute the default timeout (Default: 500); check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum. | required |
num_threads | int | Suggested number of threads to use for the download. It will be lowered if it is too high. | 2 |
trace_custom_fields | Optional[dict[str, Any]] | Custom key-value pairs to send with the traces | None |
traces_copy_out | Optional[list[dict[str, Any]]] | Reference to an external list where the traces should be uploaded | None |
deactivate_file_download_exceptions | bool | If True, file download exceptions are not raised | False |
sort | Optional[SORTING_ALGORITHMS_LITERAL] | Select best replica by replica sorting algorithm. Available algorithms: | None |
Returns:
Type | Description |
---|---|
list[dict[str, Any]] | a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState |
Raises:
Type | Description |
---|---|
InputValidationError | if one of the input items is in the wrong format |
NoFilesDownloaded | if no files could be downloaded |
NotAllFilesDownloaded | if not all files could be downloaded |
RucioException | if something unexpected went wrong during the download |
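A minimal sketch of calling download_dids and summarising the outcome per clientState follows. The helper name download_and_summarise is hypothetical, and the client argument is expected to be a DownloadClient instance. The sketch assumes that, with deactivate_file_download_exceptions=True, failed files are reported through their clientState entry rather than through an exception.

```python
from collections import Counter


def download_and_summarise(client, dids, base_dir=".", rse=None):
    """Download DIDs and count the results per clientState."""
    items = []
    for did in dids:
        item = {"did": did, "base_dir": base_dir}
        if rse is not None:
            item["rse"] = rse  # RSE name or RSE expression
        items.append(item)
    # Assumption: with exceptions deactivated, failures appear in clientState.
    results = client.download_dids(
        items,
        num_threads=2,
        deactivate_file_download_exceptions=True,
    )
    return Counter(result["clientState"] for result in results)
```

Because a DID may address a dataset or contain wildcards, the number of returned entries can exceed the number of input items; counting by state avoids assuming a one-to-one mapping.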
download_from_metalink_file(item, metalink_file_path, num_threads=2, trace_custom_fields=None, traces_copy_out=None, deactivate_file_download_exceptions=False)
Download items using a given metalink file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
item | dict[str, Any] | Dictionary describing an item to download. Keys: base_dir - Optional: base directory where the downloaded files will be stored (Default: '.'); no_subdir - Optional: if true, files are written directly into base_dir (Default: False); ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue (Default: False); transfer_timeout - Optional: timeout for the download protocols (Default: None); check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum. | required |
num_threads | int | Suggested number of threads to use for the download. It will be lowered if it is too high. | 2 |
trace_custom_fields | Optional[dict[str, Any]] | Custom key-value pairs to send with the traces | None |
traces_copy_out | Optional[list[dict[str, Any]]] | Reference to an external list where the traces should be uploaded | None |
deactivate_file_download_exceptions | bool | If True, file download exceptions are not raised | False |
Returns:
Type | Description |
---|---|
list[dict[str, Any]] | a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState |
Raises:
Type | Description |
---|---|
InputValidationError | if one of the input items is in the wrong format |
NoFilesDownloaded | if no files could be downloaded |
NotAllFilesDownloaded | if not all files could be downloaded |
RucioException | if something unexpected went wrong during the download |
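The call pattern for a metalink download can be sketched as below. The helper name download_from_metalink and the SUCCESS_STATES set are hypothetical; the sketch assumes the clientState values match those listed for download_pfns, and the client argument is expected to be a DownloadClient instance.

```python
SUCCESS_STATES = {"DONE", "ALREADY_DONE"}  # assumption: same states as download_pfns


def download_from_metalink(client, metalink_file_path, base_dir="."):
    """Download everything listed in a metalink file; return (results, failed)."""
    # A single item dictionary carries the options for all files in the metalink.
    item = {"base_dir": base_dir, "no_subdir": True, "ignore_checksum": False}
    results = client.download_from_metalink_file(
        item,
        metalink_file_path,
        num_threads=2,
        deactivate_file_download_exceptions=True,
    )
    failed = [r for r in results if r.get("clientState") not in SUCCESS_STATES]
    return results, failed
```

Returning the failed entries separately lets a caller retry or report them without parsing exception messages.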
download_aria2c(items, trace_custom_fields=None, filters=None, deactivate_file_download_exceptions=False, sort=None)
Uses aria2c to download the items with given DIDs. This function can also download datasets and wildcarded DIDs. It can only download files that are available via https/davs. aria2c needs to be installed and X509_USER_PROXY needs to be set!
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | list[dict[str, Any]] | List of dictionaries, each describing an item to download. Keys: did - DID string of this file (e.g. 'scope:file.name'), wildcards are not allowed; rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download; base_dir - Optional: base directory where the downloaded files will be stored (Default: '.'); no_subdir - Optional: if true, files are written directly into base_dir (Default: False); nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset; ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue (Default: False); check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum. | required |
trace_custom_fields | Optional[dict[str, Any]] | Custom key-value pairs to send with the traces | None |
filters | Optional[dict[str, Any]] | Dictionary containing filter options | None |
deactivate_file_download_exceptions | bool | If True, file download exceptions are not raised | False |
sort | Optional[SORTING_ALGORITHMS_LITERAL] | Select best replica by replica sorting algorithm. Available algorithms: | None |
Returns:
Type | Description |
---|---|
list[dict[str, Any]] | a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState |
Raises:
Type | Description |
---|---|
InputValidationError | if one of the input items is in the wrong format |
NoFilesDownloaded | if no files could be downloaded |
NotAllFilesDownloaded | if not all files could be downloaded |
RucioException | if something went wrong during the download (e.g. aria2c could not be started) |
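Since download_aria2c requires aria2c to be installed and X509_USER_PROXY to be set, a pre-flight check along the following lines may be useful before calling it. The function name aria2c_problems is hypothetical, and this check is a sketch that runs outside the client; it does not reproduce whatever checks the client performs internally.

```python
import os
import shutil


def aria2c_problems(environ=None, which=shutil.which):
    """Return a list of unmet prerequisites; an empty list means ready."""
    environ = os.environ if environ is None else environ
    problems = []
    if which("aria2c") is None:
        problems.append("aria2c executable not found on PATH")
    proxy = environ.get("X509_USER_PROXY")
    if not proxy:
        problems.append("X509_USER_PROXY is not set")
    elif not os.path.isfile(proxy):
        problems.append("X509_USER_PROXY does not point to a file: %s" % proxy)
    return problems
```

The environ and which parameters exist only so the check can be exercised without touching the real environment; calling aria2c_problems() with no arguments inspects the current process.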
preferred_impl(sources)
Finds the optimum protocol impl preferred by the client and supported by the remote RSE.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sources | list[dict[str, Any]] | List of sources for a given DID | required |
Raises:
Type | Description |
---|---|
RucioException(msg) | general exception with msg for more details |