If some of the data or software your jobs depend on is available on the web, HTCondor can transfer those files to your jobs directly from their HTTP address.
While our Overview of Data Management on OSG Connect describes how you can stage data, files, or even software in OSG Connect data locations, any web-accessible file in a non-OSG Connect location can be transferred directly to your jobs if:
- the file is accessible via an HTTP/HTTPS address
- the file is less than 1GB in size (if larger, you'll need to pre-stage it for stash-based transfer)
- the server or website the file is on can handle large numbers of your jobs accessing it simultaneously
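The first two conditions can be checked before you submit. The sketch below is one possible pre-flight check in Python, not part of HTCondor or OSG Connect. The helper names (`is_http_url`, `fits_size_limit`, `remote_size`) are hypothetical, and it assumes "1GB" means 10**9 bytes and that the server reports a `Content-Length` header in response to a HEAD request. The third condition, server capacity, cannot be verified from a script like this.

```python
from typing import Optional
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Assumption: "1GB" is read as 10**9 bytes; adjust if your limit is defined differently.
MAX_BYTES = 10**9


def is_http_url(url: str) -> bool:
    """Return True if the URL uses the http or https scheme and names a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def fits_size_limit(size_bytes: Optional[int], limit: int = MAX_BYTES) -> bool:
    """Return True only when the size is known and strictly below the limit."""
    return size_bytes is not None and size_bytes < limit


def remote_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Ask the server for the file size with a HEAD request.

    Returns None when the server does not report Content-Length.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

A typical use would be `is_http_url(url) and fits_size_limit(remote_size(url))`. A file whose size the server does not report is treated as not fitting, which errs on the side of pre-staging.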
Importantly, you'll also want to make sure your job executable knows how to handle the file (un-tar, etc.) from within the working directory of the job, just like it would for any other input file.
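As an illustration of handling a transferred tarball, here is a minimal sketch of a helper that a Python job executable could call. The function name `unpack_input` is hypothetical. The sketch assumes the file arrives in the job's working directory under its base name (for example `file.tar.gz`) and that you trust the archive's contents.

```python
import tarfile
from pathlib import Path
from typing import List


def unpack_input(archive_name: str, dest: str = ".") -> List[str]:
    """Extract a (possibly compressed) tar archive into dest.

    Returns the member names so the caller can locate the unpacked files.
    """
    archive = Path(archive_name)
    if not archive.is_file():
        raise FileNotFoundError(f"expected input file not found: {archive}")

    # "r:*" lets tarfile detect gzip, bzip2, or xz compression automatically.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(path=dest, filter="data")
        else:
            tar.extractall(path=dest)
    return names
```

The executable would call `unpack_input("file.tar.gz")` at the start of the job, then run the software or read the data from the extracted paths.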
Transfer Files via HTTP
Use an HTTP URL in combination with the transfer_input_files statement in your HTCondor submit file.
    # submit file example
    log = my_job.$(Cluster).$(Process).log
    error = my_job.$(Cluster).$(Process).err
    output = my_job.$(Cluster).$(Process).out

    # transfer software tarball from public via http
    transfer_input_files = http://www.website.com/path/file.tar.gz

    ...other submit file details...
Multiple URLs may be specified in any order, using a comma-separated list, and a combination of URLs and files from other locations (e.g. within /home) can be provided in the list. For example,
    # transfer software tarball from public via http
    # transfer input data from home via htcondor file transfer
    transfer_input_files = http://www.website.com/path/file1.tar.gz, http://www.website.com/path/file2.tar.gz, my_data.csv
This page was updated on May 20, 2022 at 18:10 from start/data/file-transfer-via-http.md.