Downloadclient

Classes

FileDownloadState

The state a file can be in before/while/after downloading.

BaseExtractionTool

BaseExtractionTool(
    program_name,
    useability_check_args,
    extract_args,
    logger=logging.log,
)

Initialises an extraction tool object.

PARAMETER DESCRIPTION
program_name

the name of the archive extraction program, e.g., unzip

TYPE: str

useability_check_args

the arguments passed to the extraction program to test whether it is installed, e.g., --version

TYPE: str

extract_args

the arguments that will be passed to the program for extraction

TYPE: str

logger

optional decorated logging.log object that can be passed from the calling daemon or client.

TYPE: LoggerFunction DEFAULT: log

Functions

is_useable
is_useable()

Checks if the extraction tool is installed and usable

RETURNS DESCRIPTION
bool

True if it is usable otherwise False
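As a rough illustration of what such a check involves, the sketch below runs a program with its check arguments and treats a zero exit code as "usable". It uses only the standard library; `is_tool_usable` is a hypothetical helper and is not claimed to match how BaseExtractionTool implements is_useable.

```python
import shutil
import subprocess


def is_tool_usable(program_name: str, useability_check_args: str) -> bool:
    """Return True if the program is on PATH and the check command exits with 0.

    Hypothetical stand-in for illustration, not the Rucio implementation.
    """
    # Cheap first test: is there an executable with that name at all?
    if shutil.which(program_name) is None:
        return False
    try:
        result = subprocess.run(
            [program_name, *useability_check_args.split()],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The program could not be started or did not answer in time.
        return False
    return result.returncode == 0
```

Calling `is_tool_usable("unzip", "--version")` would mirror the example values given for program_name and useability_check_args above.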

try_extraction
try_extraction(
    archive_file_path, file_to_extract, dest_dir_path
)

Calls the extraction program to extract a file from an archive

PARAMETER DESCRIPTION
archive_file_path

path to the archive

TYPE: str

file_to_extract

file name to extract from the archive

TYPE: str

dest_dir_path

destination directory where the extracted file will be stored

TYPE: str

RETURNS DESCRIPTION
bool

True on success otherwise False
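The reference does not say how extract_args is combined with the three path arguments. The sketch below assumes, purely for illustration, that extract_args is a %-style template with placeholders named after the try_extraction parameters; `build_extract_command` and `try_extraction_sketch` are hypothetical names.

```python
import shlex
import subprocess


def build_extract_command(program_name, extract_args, archive_file_path,
                          file_to_extract, dest_dir_path):
    """Fill the argument template and return an argv list.

    Assumption: extract_args uses %(name)s placeholders; this is not
    confirmed by the reference above.
    """
    values = {
        # Quote each value so paths with spaces survive shlex.split below.
        "archive_file_path": shlex.quote(archive_file_path),
        "file_to_extract": shlex.quote(file_to_extract),
        "dest_dir_path": shlex.quote(dest_dir_path),
    }
    return [program_name, *shlex.split(extract_args % values)]


def try_extraction_sketch(program_name, extract_args, archive_file_path,
                          file_to_extract, dest_dir_path) -> bool:
    """Run the extraction command; True on exit code 0, otherwise False."""
    cmd = build_extract_command(program_name, extract_args, archive_file_path,
                                file_to_extract, dest_dir_path)
    try:
        return subprocess.run(cmd, check=False).returncode == 0
    except OSError:
        # The program is missing or could not be executed.
        return False
```

Returning a boolean instead of raising keeps the same contract as try_extraction: the caller can simply move on to another tool when one fails.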

DownloadClient

DownloadClient(
    client=None,
    logger=None,
    tracing=True,
    check_admin=False,
    check_pcache=False,
)

Initialises the basic settings for a DownloadClient object.

PARAMETER DESCRIPTION
client

Optional: rucio.client.client.Client object. If None, a new object will be created.

TYPE: Optional[Client] DEFAULT: None

external_traces

Optional: reference to a list where traces can be added

logger

Optional: logging.Logger object. If None, default logger will be used.

TYPE: Optional[LoggerFunction] DEFAULT: None

Functions

download_pfns
download_pfns(
    items,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
)

Download items with a given PFN. This function can only download files, no datasets.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing a file to download. Keys:
    pfn - PFN string of this file
    did - DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed
    rse - RSE name (e.g. 'CERN-PROD_DATADISK'). RSE expressions are not allowed
    base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
    no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
    adler32 - Optional: the adler32 checksum to compare the downloaded file's adler32 checksum with
    md5 - Optional: the md5 checksum to compare the downloaded file's md5 checksum with
    transfer_timeout - Optional: timeout for the download protocols. (Default: None)
    check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

TYPE: list[dict[str, Any]]

num_threads

Suggested number of threads to use for the download. It is lowered if it is too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If True, file download exceptions are not raised.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
list[dict[str, Any]]

A list of dictionaries with an entry for each file, containing the input options, the did, and the clientState. The clientState can be one of the following: ALREADY_DONE, DONE, FILE_NOT_FOUND, FAIL_VALIDATE, FAILED.

RAISES DESCRIPTION
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download
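A minimal sketch of preparing input for download_pfns, using only the item keys documented above. `make_pfn_item` and `download_by_pfn` are hypothetical helpers, the client-side checks are an illustration rather than Rucio's own validation, and `client` is assumed to be an already constructed DownloadClient.

```python
from typing import Any


def make_pfn_item(pfn: str, did: str, rse: str, **options: Any) -> dict[str, Any]:
    """Build one download_pfns item from the three mandatory keys.

    Optional keys such as base_dir, no_subdir, adler32 or md5 can be
    passed as keyword arguments.
    """
    # The reference states that wildcards are not allowed for this function.
    if "*" in did:
        raise ValueError(f"wildcards are not allowed in DID: {did!r}")
    # The documented example format is 'scope:file.name'.
    if ":" not in did:
        raise ValueError(f"DID should look like 'scope:name': {did!r}")
    item: dict[str, Any] = {"pfn": pfn, "did": did, "rse": rse}
    item.update(options)
    return item


def download_by_pfn(client: Any, items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pass the prepared items to an existing DownloadClient instance."""
    return client.download_pfns(items, num_threads=2)
```

A caller would build one item per file, for example `make_pfn_item(pfn, "scope:file.name", "CERN-PROD_DATADISK", base_dir="/tmp/dl")`, where `pfn` is whatever PFN string the caller already holds.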

download_dids
download_dids(
    items,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
    sort=None,
)

Download items with given DIDs. This function can also download datasets and wildcarded DIDs.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing an item to download. Keys:
    did - DID string of this file (e.g. 'scope:file.name')
    filters - Filter to select DIDs for download. Optional if DID is given
    rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download
    impl - Optional: name of the protocol implementation to be used to download this item.
    no_resolve_archives - Optional: bool indicating whether archives should not be considered for download (Default: False)
    resolve_archives - Deprecated: use no_resolve_archives instead
    force_scheme - Optional: force a specific scheme to download this item. (Default: None)
    base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
    no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
    nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset
    ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
    transfer_timeout - Optional: timeout for the download protocols. (Default: None)
    transfer_speed_timeout - Optional: minimum allowed transfer speed (in KBps). Ignored if transfer_timeout is set. Otherwise, used to compute the default timeout (Default: 500)
    check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

TYPE: list[dict[str, Any]]

num_threads

Suggested number of threads to use for the download. It is lowered if it is too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces.

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If True, file download exceptions are not raised.

TYPE: bool DEFAULT: False

sort

Select best replica by replica sorting algorithm. Available algorithms: geoip - based on src/dst IP topographical distance

TYPE: Optional[SORTING_ALGORITHMS_LITERAL] DEFAULT: None

RETURNS DESCRIPTION
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

RAISES DESCRIPTION
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download
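Because the return value carries one clientState per file, a caller can inspect the outcome without relying on exceptions. The sketch below assumes each result dictionary stores the state under a 'clientState' key and that `client` is an existing DownloadClient; both helper names are hypothetical.

```python
from collections import Counter
from typing import Any


def summarise_client_states(results: list[dict[str, Any]]) -> Counter:
    """Count how many files ended in each clientState."""
    # Assumption: the state is stored under the 'clientState' key.
    return Counter(result["clientState"] for result in results)


def download_dids_quietly(client: Any, dids: list[str], base_dir: str = "."):
    """Download the given DIDs and return (results, per-state counts).

    deactivate_file_download_exceptions=True is used so that, per the
    parameter description, file download exceptions are not raised and
    the states can be inspected instead.
    """
    items = [{"did": did, "base_dir": base_dir} for did in dids]
    results = client.download_dids(
        items,
        num_threads=2,
        deactivate_file_download_exceptions=True,
    )
    return results, summarise_client_states(results)
```

Checking the counts for FAILED, FAIL_VALIDATE, or FILE_NOT_FOUND afterwards lets a script decide whether to retry only the affected files.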

download_from_metalink_file
download_from_metalink_file(
    item,
    metalink_file_path,
    num_threads=2,
    trace_custom_fields=None,
    traces_copy_out=None,
    deactivate_file_download_exceptions=False,
)

Download items using a given metalink file.

PARAMETER DESCRIPTION
item

Dictionary describing an item to download. Keys:
    base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
    no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
    ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
    transfer_timeout - Optional: timeout for the download protocols. (Default: None)
    check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

TYPE: dict[str, Any]

num_threads

Suggested number of threads to use for the download. It is lowered if it is too high.

TYPE: int DEFAULT: 2

trace_custom_fields

Custom key value pairs to send with the traces.

TYPE: Optional[dict[str, Any]] DEFAULT: None

traces_copy_out

reference to an external list, where the traces should be uploaded

TYPE: Optional[list[dict[str, Any]]] DEFAULT: None

deactivate_file_download_exceptions

If True, file download exceptions are not raised.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

RAISES DESCRIPTION
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something unexpected went wrong during the download
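Unlike the other download functions, this one takes a single item dictionary with a small set of keys. The sketch below is a hypothetical client-side guard that flags keys not listed in the item description before calling the method; whether the library itself rejects or ignores unknown keys is not stated here, and `client` is assumed to be an existing DownloadClient.

```python
from typing import Any

# Keys taken from the item description of download_from_metalink_file.
METALINK_ITEM_KEYS = frozenset({
    "base_dir",
    "no_subdir",
    "ignore_checksum",
    "transfer_timeout",
    "check_local_with_filesize_only",
})


def unknown_metalink_item_keys(item: dict[str, Any]) -> set[str]:
    """Return the keys in item that the reference does not document."""
    return set(item) - METALINK_ITEM_KEYS


def download_from_metalink(client: Any, metalink_file_path: str,
                           item: dict[str, Any]) -> list[dict[str, Any]]:
    """Validate the item locally, then delegate to the DownloadClient."""
    unknown = unknown_metalink_item_keys(item)
    if unknown:
        raise ValueError(f"undocumented item keys: {sorted(unknown)}")
    return client.download_from_metalink_file(item, metalink_file_path)
```

The guard mainly catches keys such as rse or did that are valid for download_dids but are not listed for this function.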

download_aria2c
download_aria2c(
    items,
    trace_custom_fields=None,
    filters=None,
    deactivate_file_download_exceptions=False,
    sort=None,
)

Uses aria2c to download the items with given DIDs. This function can also download datasets and wildcarded DIDs. It can only download files that are available via https/davs. aria2c must be installed and X509_USER_PROXY must be set.

PARAMETER DESCRIPTION
items

List of dictionaries, each describing an item to download. Keys:
    did - DID string of this file (e.g. 'scope:file.name'). Wildcards are not allowed
    rse - Optional: RSE name (e.g. 'CERN-PROD_DATADISK') or RSE expression from where to download
    base_dir - Optional: base directory where the downloaded files will be stored. (Default: '.')
    no_subdir - Optional: if true, files are written directly into base_dir. (Default: False)
    nrandom - Optional: if the DID addresses a dataset, nrandom files will be randomly chosen for download from the dataset
    ignore_checksum - Optional: if true, skips the checksum validation between the downloaded file and the Rucio catalogue. (Default: False)
    check_local_with_filesize_only - Optional: if true, already downloaded files will not be validated by checksum.

TYPE: list[dict[str, Any]]

trace_custom_fields

Custom key value pairs to send with the traces

TYPE: Optional[dict[str, Any]] DEFAULT: None

filters

dictionary containing filter options

TYPE: Optional[dict[str, Any]] DEFAULT: None

deactivate_file_download_exceptions

If True, file download exceptions are not raised.

TYPE: bool DEFAULT: False

sort

Select best replica by replica sorting algorithm. Available algorithms: geoip - based on src/dst IP topographical distance

TYPE: Optional[SORTING_ALGORITHMS_LITERAL] DEFAULT: None

RETURNS DESCRIPTION
list[dict[str, Any]]

a list of dictionaries with an entry for each file, containing the input options, the did, and the clientState

RAISES DESCRIPTION
InputValidationError

if one of the input items is in the wrong format

NoFilesDownloaded

if no files could be downloaded

NotAllFilesDownloaded

if not all files could be downloaded

RucioException

if something went wrong during the download (e.g. aria2c could not be started)
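Since download_aria2c depends on an installed aria2c binary and on X509_USER_PROXY, a short pre-flight check can report both problems before any download is attempted. This is a standard-library sketch; `aria2c_preflight` is a hypothetical helper and not part of the DownloadClient API.

```python
import os
import shutil
from collections.abc import Callable, Mapping
from typing import Optional


def aria2c_preflight(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of problems; an empty list means both requirements are met.

    env and which are injectable so the check can be exercised without
    touching the real environment.
    """
    env = os.environ if env is None else env
    problems = []
    if which("aria2c") is None:
        problems.append("aria2c executable not found on PATH")
    # Only checks that the variable is set and non-empty, not that the
    # proxy file exists or is still valid.
    if not env.get("X509_USER_PROXY"):
        problems.append("X509_USER_PROXY is not set")
    return problems
```

A script could call `aria2c_preflight()` first and fall back to download_dids when the returned list is not empty.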

preferred_impl
preferred_impl(sources)

Finds the optimal protocol implementation (impl) preferred by the client and supported by the remote RSE.

PARAMETER DESCRIPTION
sources

List of sources for a given DID

TYPE: list[dict[str, Any]]

RAISES DESCRIPTION
RucioException(msg)

general exception with msg for more details.

Functions