If some of the data or software your jobs depend on is available via the web, you can have such files transferred by HTCondor using the appropriate HTTP address!
While our Overview of Data Mangement on OSG Connect describes how you can stage data, files, or even software in OSG Connect locations, any web-accessible file can be transferred directly to your jobs IF:
- the file is accessible via an HTTP address
- the file is less than 1GB in size (if larger, you'll need to pre-stage them for stash-based transfer
- the server or website they're on can handle large numbers of your jobs accessing them simultaneously
Importantly, you'll also want to make sure your job executable knows how to handle the file (un-tar, etc.) from within the working directory of the job, just like it would for any other input file.
Transfer Files via HTTP
Use an HTTP URL in
combination with the
transfer_input_files statement in your HTCondor submit file.
# submit file example log = my_job.$(Cluster).$(Process).log error = my_job.$(Cluster).$(Process).err output = my_job.$(Cluster).$(Process).out # transfer software tarball from public via http transfer_input_files = http://www.website.com/path/file.tar.gz ...other submit file details...
Multiple URLs can
be specified using a comma-separated list, and a combination of URLs and
/home directory can be provided in a comma separated list. For example,
# transfer software tarball from public via http # transfer input data from home via htcondor file transfer transfer_input_files = http://www.website.com/path/file1.tar.gz, http://www.website.com/path/file2.tar.gz, my_data.csv
This page was updated on Jan 25, 2022 at 14:36 from start/data/file-transfer-via-http.md.