A key feature of @arv-run@ is the ability to introspect the command line to determine which arguments are file inputs, and transform those paths so they are usable inside the Arvados container. In the above example, @HWI-ST1027_129_D0THKACXX.1_2.fastq@ is transformed into @/keep/3229739b505d2b878b62aed09895a55a+142/HWI-ST1027_129_D0THKACXX.1_1.fastq@. In the above example, @arv-run@ works together with @arv-mount@ to identify that the file is already part of an Arvados collection. In this case, it will use the existing collection without any upload step. If you specify a file that is only available on the local filesystem, @arv-run@ will upload a new collection and use that.
-@arv-run@ will parallelize on the files listed on the command line after @--@. You may specify @--batch-size N@ after the @--@ but before listing any files to specify how many files to provide put on the command line for each task. The syntax is designed to mimic standard shell syntax, so it is usually necessary to quote the metacharacters < > and | as either \< \> and \| or '<' '>' and '|'.
+h2. Parallel tasks
+
+@arv-run@ will parallelize over files listed on the command line after @--@.
<notextile>
<pre>
-$ <span class="userinput">cd ~/keep/by_id/3229739b505d2b878b62aed09895a55a+142</span>
-$ <span class="userinput">ls *.fastq</span>
HWI-ST1027_129_D0THKACXX.1_1.fastq HWI-ST1027_129_D0THKACXX.1_2.fastq
-$ <span class="userinput">arv-run grep -H -n ATTGGAGGAAAGATGAGTGAC -- *.fastq \> output.txt</span>
+$ <span class="userinput">arv-run grep -H -n ATTGGAGGAAAGATGAGTGAC -- *.fastq</span>
Running pipeline qr1hi-d1hrv-mg3bju0u7r6w241
</pre>
</notextile>
-You may use stdin @<@ redirection on multiple files. This will create a separate task for each input file:
+You may also use stdin @<@ redirection on multiple files. This will create a separate task for each input file. Because the syntax is designed to mimic standard shell syntax, it is necessary to quote the metacharacters @<@, @>@ and @|@ as either @\<@, @\>@ and @\|@ or @'<'@, @'>'@ and @'|'@.
<notextile>
<pre>
By default, the pipeline will be submitted to your configured Arvados instance. Use @arv-run --local@ to run the command locally using "arv-crunch-job".
+You may specify @--batch-size N@ after the @--@ but before listing any files to set how many files are put on the command line for each task.
+
h1. Examples
Run one @grep@ task per file, with each input file piped from stdin. Redirect the output to output.txt.
component["script_parameters"]["command"] = slots[2:]
pipeline = {
- "name": " ".join(starting_args),
+ "name": " | ".join([s[0] for s in slots[2:]]),
+ "description": "@" + " ".join(starting_args) + "@",
"components": {
"command": component
},