X-Git-Url: https://git.arvados.org/arvados.git/blobdiff_plain/f34a8d68bdd096cf1b019a9806bd1e6eba028d77..5fec43173f9744cbf49e422468a71cd8a9b5d9d3:/doc/user/topics/arv-copy.html.textile.liquid

diff --git a/doc/user/topics/arv-copy.html.textile.liquid b/doc/user/topics/arv-copy.html.textile.liquid
index 15c9623224..a05620d62d 100644
--- a/doc/user/topics/arv-copy.html.textile.liquid
+++ b/doc/user/topics/arv-copy.html.textile.liquid
@@ -15,7 +15,7 @@ This tutorial describes how to copy Arvados objects from one cluster to another
 
 h2. arv-copy
 
-@arv-copy@ allows users to copy collections, workflow definitions and projects from one cluster to another.
+@arv-copy@ allows users to copy collections, workflow definitions and projects from one cluster to another. You can also use @arv-copy@ to import resources from HTTP URLs into Keep.
 
 For projects, @arv-copy@ will copy all the collections workflow definitions owned by the project, and recursively copy subprojects.
 
@@ -71,10 +71,14 @@ Additionally, if you need to specify the storage classes where to save the copie
 
 h3. How to copy a workflow
 
+Copying workflows requires @arvados-cwl-runner@ to be available in your @$PATH@.
+
 We will use the uuid @jutro-7fd4e-mkmmq53m1ze6apx@ as an example workflow.
 
+Arv-copy will infer the source cluster is @jutro@ from the object uuid, and destination cluster is @pirca@ from @--project-uuid@.
+
 
-
~$ arv-copy --src jutro --dst pirca --project-uuid pirca-j7d0g-ecak8knpefz8ere jutro-7fd4e-mkmmq53m1ze6apx
+
~$ arv-copy --project-uuid pirca-j7d0g-ecak8knpefz8ere jutro-7fd4e-mkmmq53m1ze6apx
 ae480c5099b81e17267b7445e35b4bc7+180: 23M / 23M 100.0%
 2463fa9efeb75e099685528b3b9071e0+438: 156M / 156M 100.0%
 jutro-4zz18-vvvqlops0a0kpdl: 94M / 94M 100.0%
@@ -91,8 +95,10 @@ h3. How to copy a project
 
 We will use the uuid @jutro-j7d0g-xj19djofle3aryq@ as an example project.
 
+Arv-copy will infer the source cluster is @jutro@ from the source project uuid, and destination cluster is @pirca@ from @--project-uuid@.
+
 
-
~$ peteramstutz@shell:~$ arv-copy --project-uuid pirca-j7d0g-lr8sq3tx3ovn68k jutro-j7d0g-xj19djofle3aryq
+
~$ arv-copy --project-uuid pirca-j7d0g-lr8sq3tx3ovn68k jutro-j7d0g-xj19djofle3aryq
 2021-09-08 21:29:32 arvados.arv-copy[6377] INFO:
 2021-09-08 21:29:32 arvados.arv-copy[6377] INFO: Success: created copy with uuid pirca-j7d0g-ig9gvu5piznducp
 
@@ -101,3 +107,23 @@ We will use the uuid @jutro-j7d0g-xj19djofle3aryq@ as an example project.
 The name and description of the original project will be used for the destination copy. If a project already exists with the same name, collections and workflow definitions will be copied into the project with the same name.
 
 If you would like to copy the project but not its subproject, you can use the @--no-recursive@ flag.
+
+h3. Importing HTTP resources to Keep
+
+You can also use @arv-copy@ to copy the contents of a HTTP URL into Keep. When you do this, Arvados keeps track of the original URL the resource came from. This allows you to refer to the resource by its original URL in Workflow inputs, but actually read from the local copy in Keep.
+
+
+
~$ arv-copy --project-uuid tordo-j7d0g-lr8sq3tx3ovn68k https://example.com/index.html
+tordo-4zz18-dhpb6y9km2byb94
+2023-10-06 10:15:36 arvados.arv-copy[374147] INFO: Success: created copy with uuid tordo-4zz18-dhpb6y9km2byb94
+
+
+
+In addition, when importing from HTTP URLs, you may provide a different cluster than the destination in @--src@. This tells @arv-copy@ to search the other cluster for a collection associated with that URL, and if found, copy the collection from that cluster instead of downloading from the original URL.
+
+The following @arv-copy@ command line options affect the behavior of HTTP import.
+
+table(table table-bordered table-condensed).
+|_. Option |_. Description |
+|==--varying-url-params== VARYING_URL_PARAMS|A comma separated list of URL query parameters that should be ignored when storing HTTP URLs in Keep.|
+|==--prefer-cached-downloads==|If a HTTP URL is found in Keep, skip upstream URL freshness check (will not notice if the upstream has changed, but also not error if upstream is unavailable).|