
conveyor-preparer (transfer preparer) is the main entry point into the transfer machinery. It leverages topological information to pick the best source replica for the transfer. It also decides whether the transfer has to be handled by the throttler. For all new Rucio installations, it is recommended to run this daemon and to activate it by setting the conveyor/use_preparer = True configuration option.

Preparer:

  • finds all source RSEs which have a replica of the desired file
  • filters out the source RSEs which don't respect administrative constraints
  • ensures protocol compatibility between sources and destination
  • performs path computations to find the best sources
  • transitions the transfer request either to a Waiting or to a Queued state

Source replica selection

One of the main jobs done by the preparer is the selection of the replica to be used as the transfer source. For that, it relies on multiple RSE attributes and on the configured protocols. This section summarizes the configuration parameters that can influence the preparer at this stage.

We will use the notation section/option to speak about a configuration value to be set in rucio.cfg like this:

[section]
option = value
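
As a small illustration of this notation, the sketch below parses an inline fragment with Python's standard configparser module. The fragment stands in for rucio.cfg and only uses the conveyor/use_preparer option mentioned earlier; it is not meant to show how Rucio itself loads its configuration.

```python
import configparser

# Inline stand-in for a rucio.cfg fragment (illustrative only).
CFG_TEXT = """
[conveyor]
use_preparer = True
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# "conveyor/use_preparer" means: section "conveyor", option "use_preparer".
use_preparer = parser.getboolean("conveyor", "use_preparer")
print(use_preparer)
```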

The preparer will start by retrieving all the possible sources from the database.

In the following step, the preparer will skip all sources which don't respect the administrative constraints. For example, it will ignore source RSEs with availability_read=False (unless the preparer is run with --ignore-availability). It also respects the restricted_read and restricted_write RSE attributes for the source and the destination.
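
The filtering idea can be sketched roughly as follows. This is a simplified illustration and not Rucio's code: the Source class, the filter_sources function and the privileged flag are hypothetical names, and the assumption that a restricted_read source is only usable by a privileged request is made for the example only.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical, simplified view of a candidate source RSE."""
    rse: str
    availability_read: bool = True
    restricted_read: bool = False


def filter_sources(sources, ignore_availability=False, privileged=False):
    """Drop sources that fail the (assumed) administrative checks."""
    kept = []
    for src in sources:
        # Unreadable RSEs are skipped unless availability is ignored.
        if not src.availability_read and not ignore_availability:
            continue
        # Assumption: restricted sources need a privileged request.
        if src.restricted_read and not privileged:
            continue
        kept.append(src)
    return kept


candidates = [
    Source("RSE1"),
    Source("RSE2", availability_read=False),
    Source("RSE3", restricted_read=True),
]
print([s.rse for s in filter_sources(candidates)])
```

Passing ignore_availability=True in this sketch mimics the effect of running the preparer with --ignore-availability: RSE2 would then be kept.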

Some request attributes will impact the source selection. For example, the preparer will skip source RSEs which don't match the source_replica_expression or allow_tape_source conditions. It will also ignore requests which require a transfertool that this preparer cannot use. The request attributes are either inherited from the rule, or set by another transfer daemon (for example: preparer).

The next step is to perform the path computation. At this stage, the preparer uses the distance between RSEs to perform shortest-path computations. If multi-hopping is enabled via transfers/use_multihop, then the configuration value transfers/hop_penalty, together with the RSE attributes available_for_multihop and hop_penalty, will influence the distances for multi-hop paths. Each hop, even for single-hop transfers, must respect the protocol compatibility between the source of the hop and its destination. The SCHEME_MAP constant defines the compatibility between protocols. For source RSEs, only protocols with a non-zero third_party_copy_read value are considered, ordered by priority. The same applies to the destination, where third_party_copy_write is used instead.
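
To make the effect of a hop penalty more concrete, here is a minimal shortest-path sketch using Dijkstra's algorithm. It is an illustration under stated assumptions, not Rucio's implementation: the best_path function and its parameters are hypothetical, a single global penalty is added for every intermediate RSE, and only RSEs listed in multihop_rses (standing in for available_for_multihop) may act as intermediates. Protocol compatibility checks are left out.

```python
import heapq


def best_path(distances, source, dest, multihop_rses=frozenset(), hop_penalty=0):
    """Return (cost, [rse, ...]) for the cheapest path, or None.

    distances maps (src_rse, dst_rse) to an integer distance.
    """
    graph = {}
    for (src, dst), dist in distances.items():
        graph.setdefault(src, []).append((dst, dist))

    queue = [(0, source, [source])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dest:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        # Assumption: only multihop-enabled RSEs may forward a transfer.
        if node != source and node not in multihop_rses:
            continue
        for nxt, dist in graph.get(node, []):
            if nxt in seen:
                continue
            # Assumption: each intermediate RSE adds the hop penalty.
            extra = hop_penalty if nxt != dest else 0
            heapq.heappush(queue, (cost + dist + extra, nxt, path + [nxt]))
    return None


distances = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 5}
print(best_path(distances, "A", "C", multihop_rses={"B"}))
print(best_path(distances, "A", "C", multihop_rses={"B"}, hop_penalty=10))
```

In this toy topology, the two-hop route through B wins when there is no penalty, while a penalty of 10 makes the direct link from A to C cheaper.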

Note: the distances between RSEs are set by the administrator via:

rucio-admin rse add-distance --distance 1 RSE1 RSE2
# Note: before rucio 1.30 (as a consequence: also in the current LTS release 1.29),
# the --ranking option was used for the same purpose. The --distance option
# could still be set and was mentioned in documentation alongside --ranking
# but was completely ignored by rucio.
# On 1.29, you'll have to use the following command:
rucio-admin rse add-distance --ranking 1 RSE1 RSE2

Once all valid paths are found, after all the filtering done previously, the paths are ordered using the following rules:

  • The source ranking is compared first. Source ranking is an integer which is decreased each time a particular source fails to perform this particular transfer. It is thus equal to 0 on the first attempt, and is decreased after each transfer failure, before the transfer is retried. This ensures that problematic sources are much less likely to be re-used.
  • On equal source ranking, the RSE type is checked. Disk sources are preferred over tape.
  • On equal source RSE type, the distance between the source RSE and the destination RSE is compared.
  • On equal distance, we prefer single-hop paths.
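
The ordering rules above can be expressed as a single sort key. The sketch below is only an illustration: CandidatePath, path_sort_key and order_paths are hypothetical names, and the fields are a simplified stand-in for whatever the preparer actually tracks for each path.

```python
from dataclasses import dataclass


@dataclass
class CandidatePath:
    """Hypothetical summary of one candidate path to the destination."""
    source_rse: str
    source_ranking: int  # 0 on first try, decreased after each failure
    is_tape: bool        # True if the source RSE is a tape RSE
    distance: int        # distance between source and destination
    hops: int            # number of hops in the path


def path_sort_key(path):
    return (
        -path.source_ranking,  # higher ranking first
        path.is_tape,          # disk (False) before tape (True)
        path.distance,         # shorter distance first
        path.hops > 1,         # single-hop before multi-hop
    )


def order_paths(paths):
    return sorted(paths, key=path_sort_key)


paths = [
    CandidatePath("TAPE1", 0, True, 1, 1),
    CandidatePath("DISK_FAILED", -1, False, 1, 1),
    CandidatePath("DISK_FAR", 0, False, 3, 1),
    CandidatePath("DISK_MULTIHOP", 0, False, 2, 2),
    CandidatePath("DISK_DIRECT", 0, False, 2, 1),
]
print([p.source_rse for p in order_paths(paths)])
```

Because Python compares tuples element by element, each later criterion is only consulted when all earlier ones are equal, which mirrors the "on equal ..." wording of the rules.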