retrieve: best practices

I just wanted to note some best practices for speeding up retrievals, to make sure I understand them correctly. Assume I have several files that I want to retrieve, and I don't want to download everything recursively. As far as I understand, I should:

retrieve a list of files

option 1

  • use slk_helpers gen_file_query <file1> <file2> ... <fileN> to create a JSON query
  • use slk_helpers search_limited to run a search based on that query and obtain a search ID
  • run slk retrieve with that search ID
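
The three steps of option 1 could be scripted roughly as follows. This is only a sketch under assumptions that are not confirmed in this post: that gen_file_query prints the JSON query on stdout, that the search ID is the last integer printed by search_limited, and that slk retrieve takes the search ID followed by a target directory. The names retrieve_file_list, run_cli and parse_search_id are made up for illustration.

```python
import re
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run a command and return its stdout; raises on a non-zero exit code."""
    result = subprocess.run(list(argv), check=True, capture_output=True, text=True)
    return result.stdout


def parse_search_id(output: str) -> int:
    """ASSUMPTION: the search ID is the last integer in the command output."""
    numbers = re.findall(r"\d+", output)
    if not numbers:
        raise ValueError(f"no search ID found in: {output!r}")
    return int(numbers[-1])


def retrieve_file_list(files: Sequence[str], target_dir: str, run: Runner = run_cli) -> int:
    """Option 1: one query, one search, one retrieval for the whole file list."""
    if not files:
        raise ValueError("need at least one file")
    # Step 1: build a JSON query that matches exactly the given files.
    # ASSUMPTION: the query is printed on stdout.
    query = run(["slk_helpers", "gen_file_query", *files]).strip()
    # Step 2: run the search and extract its ID.
    search_id = parse_search_id(run(["slk_helpers", "search_limited", query]))
    # Step 3: retrieve everything the search found.
    # ASSUMPTION: argument order is <search id> <target directory>.
    run(["slk", "retrieve", str(search_id), target_dir])
    return search_id
```

Passing the runner as a parameter keeps the command sequence checkable without access to the tape archive; on a real system the default run_cli would be used.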

This should avoid tapes being ejected and stored again when several of those files are on the same tape, right? We could implement this if slk_retrieve receives a list of files as input.

option 2

  • use slk_helpers gfbt <file1> <file2> ... <fileN> --gen-search-query to create a JSON query for each group of files (one group per tape)
  • run slk_helpers search_limited on those queries to obtain search IDs
  • run slk_retrieve in parallel, once for each search ID
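
Using threads, option 2 might look like the sketch below. It starts from the per-tape queries and deliberately leaves out parsing the gfbt output, because that format is not described here. The same assumptions apply as for any scripted use of these commands: the search ID is taken to be the last integer printed by search_limited, and slk retrieve is assumed to take the search ID followed by a target directory. retrieve_groups and max_workers are hypothetical names, and whether parallel retrievals are advisable at all is exactly the open question.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run a command and return its stdout; raises on a non-zero exit code."""
    result = subprocess.run(list(argv), check=True, capture_output=True, text=True)
    return result.stdout


def last_integer(output: str) -> int:
    """ASSUMPTION: the search ID is the last integer in the command output."""
    numbers = re.findall(r"\d+", output)
    if not numbers:
        raise ValueError(f"no search ID found in: {output!r}")
    return int(numbers[-1])


def retrieve_groups(
    queries: Sequence[str],
    target_dir: str,
    run: Runner = run_cli,
    max_workers: int = 2,
) -> List[int]:
    """Option 2: one search per tape group, then one retrieval per search ID."""
    # Searches are cheap compared to tape access, so run them one after another.
    search_ids = [
        last_integer(run(["slk_helpers", "search_limited", query]))
        for query in queries
    ]

    def retrieve(search_id: int) -> int:
        # ASSUMPTION: argument order is <search id> <target directory>.
        run(["slk", "retrieve", str(search_id), target_dir])
        return search_id

    # Each worker thread handles the files of one tape at a time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order and re-raises worker errors.
        return list(pool.map(retrieve, search_ids))
```

Keeping max_workers small limits how many tapes are requested at once; the same grouping would carry over to one job script per search ID instead of threads.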

Is option 2 recommendable? I imagine creating a job script for each group of files that are on the same tape to retrieve them (or using threads). Or does it have no advantage over option 1? I assume that none of the files are cached.

Edited by Lars Buntemeyer