From: Tom Clegg Date: Fri, 14 Mar 2014 14:17:46 +0000 (-0400) Subject: Merge branch 'master' into 2257-inequality-conditions X-Git-Tag: 1.1.0~2709^2~97^2^2~2 X-Git-Url: https://git.arvados.org/arvados.git/commitdiff_plain/35336cd73e444534cb2eda20e3730464cc4e6553?hp=63e8f77e963949d4187555411e7b5c60fc850468 Merge branch 'master' into 2257-inequality-conditions --- diff --git a/COPYING b/COPYING new file mode 100644 index 0000000000..4006e686da --- /dev/null +++ b/COPYING @@ -0,0 +1,11 @@ +Server-side components of Arvados contained in the apps/ and services/ +directories, including the API Server, Workbench, and Crunch, are licensed +under the GNU Affero General Public License version 3 (see agpl-3.0.txt) + +The Arvados client Software Development Kits contained in the sdk/ directory, +example scripts in the crunch_scripts/ directory, and code samples in the +Arvados documentation are licensed under the Apache License, Version 2.0 (see +LICENSE-2.0.txt) + +The Arvados Documentation located in the doc/ directory is licensed under the +Creative Commons Attribution-Share Alike 3.0 United States (see by-sa-3.0.txt) \ No newline at end of file diff --git a/LICENSE-2.0.txt b/LICENSE-2.0.txt new file mode 100644 index 0000000000..d645695673 --- /dev/null +++ b/LICENSE-2.0.txt @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof.
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/README b/README new file mode 100644 index 0000000000..c7a36c355b --- /dev/null +++ b/README @@ -0,0 +1,21 @@ +Welcome to Arvados! + +The main Arvados web site is + https://arvados.org + +The Arvados public wiki is located at + https://arvados.org/projects/arvados/wiki + +The Arvados public bug tracker is located at + https://arvados.org/projects/arvados/issues + +For support see + http://doc.arvados.org/user/getting_started/community.html + +Installation documentation is located at + http://doc.arvados.org/install + +If you wish to build the documentation yourself, follow the instructions in +doc/README to build the documentation, then consult the "Install Guide". + +See COPYING for information about Arvados Free Software licenses. 
diff --git a/agpl-3.0.txt b/agpl-3.0.txt new file mode 100644 index 0000000000..dba13ed2dd --- /dev/null +++ b/agpl-3.0.txt @@ -0,0 +1,661 @@ + GNU AFFERO GENERAL PUBLIC LICENSE + Version 3, 19 November 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU Affero General Public License is a free, copyleft license for +software and other kinds of works, specifically designed to ensure +cooperation with the community in the case of network server software. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +our General Public Licenses are intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + Developers that use our General Public Licenses protect your rights +with two steps: (1) assert copyright on the software, and (2) offer +you this License which gives you legal permission to copy, distribute +and/or modify the software. + + A secondary benefit of defending all users' freedom is that +improvements made in alternate versions of the program, if they +receive widespread use, become available for other developers to +incorporate. Many developers of free software are heartened and +encouraged by the resulting cooperation. However, in the case of +software used on network servers, this result may fail to come about. +The GNU General Public License permits making a modified version and +letting the public access it on a server without ever releasing its +source code to the public. + + The GNU Affero General Public License is designed specifically to +ensure that, in such cases, the modified source code becomes available +to the community. It requires the operator of a network server to +provide the source code of the modified version running there to the +users of that server. Therefore, public use of a modified version, on +a publicly accessible server, gives the public access to the source +code of the modified version. + + An older license, called the Affero General Public License and +published by Affero, was designed to accomplish similar goals. This is +a different license, not a version of the Affero GPL, but Affero has +released a new version of the Affero GPL which permits relicensing under +this license. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU Affero General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. 
+ + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. 
+ + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. 
This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. + + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. 
Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. + + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. 
+Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). 
+ + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". 
+ + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. + + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. 
+ + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Remote Network Interaction; Use with the GNU General Public License. + + Notwithstanding any other provision of this License, if you modify the +Program, your modified version must prominently offer all users +interacting with it remotely through a computer network (if your version +supports such interaction) an opportunity to receive the Corresponding +Source of your version by providing access to the Corresponding Source +from a network server at no charge, through some standard or customary +means of facilitating copying of software. This Corresponding Source +shall include the Corresponding Source for any work covered by version 3 +of the GNU General Public License that is incorporated pursuant to the +following paragraph. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU General Public License into a single +combined work, and to convey the resulting work. The terms of this +License will continue to apply to the part which is the covered work, +but the work with which it is combined will remain governed by version +3 of the GNU General Public License. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU Affero General Public License from time to time. Such new versions +will be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU Affero General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU Affero General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU Affero General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU Affero General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU Affero General Public License for more details. + + You should have received a copy of the GNU Affero General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. + +Also add information on how to contact you by electronic and paper mail. + + If your software can interact with users remotely through a computer +network, you should also make sure that it provides a way for users to +get its source. For example, if your program is a web application, its +interface could display a "Source" link that leads users to an archive +of the code. There are many ways you could offer source, and different +solutions will be better for different programs; see section 13 for the +specific requirements. + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU AGPL, see +<http://www.gnu.org/licenses/>.
diff --git a/apps/workbench/.gitignore b/apps/workbench/.gitignore index 89efb8185f..a656a5bc61 100644 --- a/apps/workbench/.gitignore +++ b/apps/workbench/.gitignore @@ -22,6 +22,7 @@ /config/environments/development.rb /config/environments/test.rb /config/environments/production.rb +/config/application.yml /config/piwik.yml diff --git a/apps/workbench/Gemfile b/apps/workbench/Gemfile index 66734ef3cd..6ae12f75db 100644 --- a/apps/workbench/Gemfile +++ b/apps/workbench/Gemfile @@ -43,7 +43,7 @@ gem 'less-rails' # gem 'capistrano' # To use debugger -# gem 'debugger' +#gem 'byebug' gem 'rvm-capistrano', :group => :test @@ -54,3 +54,4 @@ gem 'RedCloth' gem 'piwik_analytics' gem 'httpclient' gem 'themes_for_rails' +gem "deep_merge", :require => 'deep_merge/rails_compat' \ No newline at end of file diff --git a/apps/workbench/Gemfile.lock b/apps/workbench/Gemfile.lock index 7f4dc8e289..c7ffeb0057 100644 --- a/apps/workbench/Gemfile.lock +++ b/apps/workbench/Gemfile.lock @@ -51,6 +51,7 @@ GEM coffee-script-source (1.6.3) commonjs (0.2.7) daemon_controller (1.1.7) + deep_merge (1.0.1) erubis (2.7.0) execjs (2.0.2) highline (1.6.20) @@ -153,6 +154,7 @@ DEPENDENCIES bootstrap-sass (~> 3.1.0) bootstrap-x-editable-rails coffee-rails (~> 3.2.0) + deep_merge httpclient jquery-rails less diff --git a/apps/workbench/app/assets/javascripts/application.js b/apps/workbench/app/assets/javascripts/application.js index 3b697a6aa9..e7884b9516 100644 --- a/apps/workbench/app/assets/javascripts/application.js +++ b/apps/workbench/app/assets/javascripts/application.js @@ -41,6 +41,7 @@ jQuery(function($){ } targets.fadeToggle(200); }); + $(document). on('ajax:send', function(e, xhr) { $('.loading').fadeTo('fast', 1); @@ -139,4 +140,4 @@ jQuery(function($){ fixer.duplicateTheadTr(); fixer.fixThead(); }); -})(jQuery); +}); diff --git a/apps/workbench/app/assets/javascripts/editable.js b/apps/workbench/app/assets/javascripts/editable.js index 804eeb2d8f..e37b9444c1 100644 --- a/apps/workbench/app/assets/javascripts/editable.js +++ b/apps/workbench/app/assets/javascripts/editable.js @@ -1,5 +1,14 @@ $.fn.editable.defaults.ajaxOptions = {type: 'put', dataType: 'json'}; $.fn.editable.defaults.send = 'always'; + +// Default for editing is popup. I experimented with inline which is a little +// nicer in that it shows up right under the mouse instead of nearby. However, +// the inline box is taller than the regular content, which causes the page +// layout to shift unless we make the table rows tall, which leaves a lot of +// wasted space when not editing. Also inline can get cut off if the page is +// too narrow, when the popup box will just move to do the right thing. 
+//$.fn.editable.defaults.mode = 'inline'; + $.fn.editable.defaults.params = function (params) { var a = {}; var key = params.pk.key; @@ -7,4 +16,10 @@ $.fn.editable.defaults.params = function (params) { a[key] = {}; a[key][params.name] = params.value; return a; -}; \ No newline at end of file +}; + +$.fn.editable.defaults.validate = function (value) { + if (value == "***invalid***") { + return "Invalid selection"; + } +} diff --git a/apps/workbench/app/assets/javascripts/pipeline_instances.js b/apps/workbench/app/assets/javascripts/pipeline_instances.js new file mode 100644 index 0000000000..ee14e3b781 --- /dev/null +++ b/apps/workbench/app/assets/javascripts/pipeline_instances.js @@ -0,0 +1,46 @@ + +(function() { + var run_pipeline_button_state = function() { + var a = $('a.editable.required.editable-empty'); + if (a.length > 0) { + $("#run-pipeline-button").addClass("disabled"); + } + else { + $("#run-pipeline-button").removeClass("disabled"); + } + } + + $.fn.editable.defaults.success = function (response, newValue) { + var tag = $(this); + if (tag.hasClass("required")) { + if (newValue && newValue.trim() != "") { + tag.removeClass("editable-empty"); + tag.parent().css("background-color", ""); + tag.parent().prev().css("background-color", ""); + } + else { + tag.addClass("editable-empty"); + tag.parent().css("background-color", "#ffdddd"); + tag.parent().prev().css("background-color", "#ffdddd"); + } + } + run_pipeline_button_state(); + } + + $(window).on('load', function() { + var a = $('a.editable.required'); + for (var i = 0; i < a.length; i++) { + var tag = $(a[i]); + if (tag.hasClass("editable-empty")) { + tag.parent().css("background-color", "#ffdddd"); + tag.parent().prev().css("background-color", "#ffdddd"); + } + else { + tag.parent().css("background-color", ""); + tag.parent().prev().css("background-color", ""); + } + } + run_pipeline_button_state(); + } ); + +})(); diff --git a/apps/workbench/app/assets/javascripts/pipeline_instances.js.coffee b/apps/workbench/app/assets/javascripts/pipeline_instances.js.coffee deleted file mode 100644 index 761567942f..0000000000 --- a/apps/workbench/app/assets/javascripts/pipeline_instances.js.coffee +++ /dev/null @@ -1,3 +0,0 @@ -# Place all the behaviors and hooks related to the matching controller here. -# All this logic will automatically be available in application.js. -# You can use CoffeeScript in this file: http://jashkenas.github.com/coffee-script/ diff --git a/apps/workbench/app/assets/javascripts/selection.js b/apps/workbench/app/assets/javascripts/selection.js new file mode 100644 index 0000000000..9213b70a71 --- /dev/null +++ b/apps/workbench/app/assets/javascripts/selection.js @@ -0,0 +1,172 @@ +//= require jquery +//= require jquery_ujs + +/** Javascript for local persistent selection. 
*/ + +get_selection_list = null; +form_selection_sources = {}; + +jQuery(function($){ + var storage = localStorage; // sessionStorage + + get_selection_list = function() { + if (!storage.persistentSelection) { + storage.persistentSelection = JSON.stringify([]); + } + return JSON.parse(storage.persistentSelection); + } + + var put_storage = function(lst) { + storage.persistentSelection = JSON.stringify(lst); + } + + var add_selection = function(uuid, name, href, type) { + var lst = get_selection_list(); + lst.push({"uuid": uuid, "name": name, "href": href, "type": type}); + put_storage(lst); + update_count(); + }; + + var remove_selection = function(uuid) { + var lst = get_selection_list(); + for (var i = 0; i < lst.length; i++) { + if (lst[i].uuid == uuid) { + lst.splice(i, 1); + i--; + } + } + put_storage(lst); + update_count(); + }; + + var remove_selection_click = function(e) { + remove_selection($(this).val()); + }; + + var clear_selections = function() { + put_storage([]); + update_count(); + } + + var update_count = function(e) { + var lst = get_selection_list(); + $("#persistent-selection-count").text(lst.length); + if (lst.length > 0) { + $('#selection-form-content').html( + '
  • Clear selections
  • ' + + '
  • ' + + '
  • '); + + for (var i = 0; i < lst.length; i++) { + $('#selection-form-content > li > table').append("" + + "" + + "" + + "" + + + "" + + "" + + "" + + + "" + + "" + lst[i].type + "" + + "" + + + ""); + } + } else { + $('#selection-form-content').html("
  • No selections.
  • "); + } + + var checkboxes = $('.persistent-selection:checkbox'); + for (i = 0; i < checkboxes.length; i++) { + for (var j = 0; j < lst.length; j++) { + if (lst[j].uuid == $(checkboxes[i]).val()) { + checkboxes[i].checked = true; + break; + } + } + if (j == lst.length) { + checkboxes[i].checked = false; + } + } + + $('.remove-selection').on('click', remove_selection_click); + $('#clear_selections_button').on('click', clear_selections); + }; + + $(document). + on('change', '.persistent-selection:checkbox', function(e) { + //console.log($(this)); + //console.log($(this).val()); + + var inc = 0; + if ($(this).is(":checked")) { + add_selection($(this).val(), $(this).attr('friendly_name'), $(this).attr('href'), $(this).attr('friendly_type')); + } + else { + remove_selection($(this).val()); + } + }); + + + $(window).on('load storage', update_count); + + $('#selection-form-content').on("click", function(e) { + e.stopPropagation(); + }); +}); + +add_form_selection_sources = null; +select_form_sources = null; + +(function() { + var form_selection_sources = {}; + add_form_selection_sources = function (src) { + for (var i = 0; i < src.length; i++) { + var t = form_selection_sources[src[i].type]; + if (!t) { + t = form_selection_sources[src[i].type] = {}; + } + if (!t[src[i].uuid]) { + t[src[i].uuid] = src[i]; + } + } + }; + + select_form_sources = function(type) { + var ret = []; + + if (get_selection_list) { + var lst = get_selection_list(); + if (lst.length > 0) { + var text = "― Selections ―"; + var span = document.createElement('span'); + span.innerHTML = text; + ret.push({text: span.innerHTML, value: "***invalid***"}); + + for (var i = 0; i < lst.length; i++) { + if (lst[i].type == type) { + ret.push({text: lst[i].name, value: lst[i].uuid}) + } + } + } + } + + var text = "― Recent ―"; + var span = document.createElement('span'); + span.innerHTML = text; + ret.push({text: span.innerHTML, value: "***invalid***"}); + + var t = form_selection_sources[type]; + for (var key in t) { + if (t.hasOwnProperty(key)) { + var obj = t[key]; + ret.push({text: obj.name, value: obj.uuid}) + } + } + return ret; + }; +})(); + diff --git a/apps/workbench/app/assets/javascripts/sizing.js b/apps/workbench/app/assets/javascripts/sizing.js index 388f727990..55d2301387 100644 --- a/apps/workbench/app/assets/javascripts/sizing.js +++ b/apps/workbench/app/assets/javascripts/sizing.js @@ -11,14 +11,14 @@ function graph_zoom(divId, svgId, scale) { } function smart_scroll_fixup(s) { - console.log(s); + //console.log(s); if (s != null && s.type == 'shown.bs.tab') { s = [s.target]; } else { s = $(".smart-scroll"); } - console.log(s); + //console.log(s); for (var i = 0; i < s.length; i++) { a = s[i]; var h = window.innerHeight - a.getBoundingClientRect().top - 20; diff --git a/apps/workbench/app/assets/stylesheets/application.css.scss b/apps/workbench/app/assets/stylesheets/application.css.scss index 3b57784c74..455e4c0a9f 100644 --- a/apps/workbench/app/assets/stylesheets/application.css.scss +++ b/apps/workbench/app/assets/stylesheets/application.css.scss @@ -122,7 +122,7 @@ ul.arvados-nav li ul li { } .inline-progress-container { - width: 100px; + width: 100%; display:inline-block; } @@ -176,3 +176,12 @@ table.table-fixed-header-row tbody { position:relative; top:1.5em; } + +/* Setting the height needs to be fixed with javascript. 
*/ +.dropdown-menu { + padding-right: 20px; + max-height: 440px; + width: 400px; + overflow-y: auto; +} + diff --git a/apps/workbench/app/assets/stylesheets/pipeline_templates.css.scss b/apps/workbench/app/assets/stylesheets/pipeline_templates.css.scss index 35d2946bb0..c70377a6ff 100644 --- a/apps/workbench/app/assets/stylesheets/pipeline_templates.css.scss +++ b/apps/workbench/app/assets/stylesheets/pipeline_templates.css.scss @@ -1,3 +1,30 @@ // Place all the styles related to the PipelineTemplates controller here. // They will automatically be included in application.css. // You can use Sass (SCSS) here: http://sass-lang.com/ + +.pipeline_color_legend { + padding-left: 1em; + padding-right: 1em; +} + +table.pipeline-components-table { + width: 100%; + table-layout: fixed; + overflow: hidden; +} + +table.pipeline-components-table thead th { + text-align: bottom; +} +table.pipeline-components-table div.progress { + margin-bottom: 0; +} + +table.pipeline-components-table td { + overflow: hidden; + text-overflow: ellipsis; +} + +td.required { + background: #ffdddd; +} diff --git a/apps/workbench/app/assets/stylesheets/selection.css b/apps/workbench/app/assets/stylesheets/selection.css new file mode 100644 index 0000000000..147d6fe93b --- /dev/null +++ b/apps/workbench/app/assets/stylesheets/selection.css @@ -0,0 +1,29 @@ +#persistent-selection-list { + width: 500px; +} + +#selection-form-content > li > a, #selection-form-content > li > input { + display: block; + padding: 3px 20px; + clear: both; + font-weight: normal; + line-height: 1.42857; + color: rgb(51, 51, 51); + white-space: nowrap; + border: none; + background: transparent; + width: 100%; + text-align: left; +} + +#selection-form-content li table tr { + padding: 3px 20px; + line-height: 1.42857; + border-top: 1px solid rgb(221, 221, 221); +} + +#selection-form-content a:hover, #selection-form-content a:focus, #selection-form-content input:hover, #selection-form-content input:focus, #selection-form-content tr:hover { + text-decoration: none; + color: rgb(38, 38, 38); + background-color: whitesmoke; +} \ No newline at end of file diff --git a/apps/workbench/app/controllers/actions_controller.rb b/apps/workbench/app/controllers/actions_controller.rb new file mode 100644 index 0000000000..74e5831235 --- /dev/null +++ b/apps/workbench/app/controllers/actions_controller.rb @@ -0,0 +1,99 @@ +class ActionsController < ApplicationController + + skip_before_filter :find_object_by_uuid, only: :post + + def combine_selected_files_into_collection + lst = [] + files = [] + params["selection"].each do |s| + m = CollectionsHelper.match(s) + if m and m[1] and m[2] + lst.append(m[1] + m[2]) + files.append(m) + end + end + + collections = Collection.where(uuid: lst) + + chash = {} + collections.each do |c| + c.reload() + chash[c.uuid] = c + end + + combined = "" + files.each do |m| + mt = chash[m[1]+m[2]].manifest_text + if m[4] + IO.popen(['arv-normalize', '--extract', m[4][1..-1]], 'w+b') do |io| + io.write mt + io.close_write + while buf = io.read(2**20) + combined += buf + end + end + else + combined += chash[m[1]+m[2]].manifest_text + end + end + + normalized = '' + IO.popen(['arv-normalize'], 'w+b') do |io| + io.write combined + io.close_write + while buf = io.read(2**20) + normalized += buf + end + end + + require 'digest/md5' + + d = Digest::MD5.new() + d << normalized + newuuid = "#{d.hexdigest}+#{normalized.length}" + + env = Hash[ENV]. + merge({ + 'ARVADOS_API_HOST' => + $arvados_api_client.arvados_v1_base. + sub(/\/arvados\/v1/, ''). 
+ sub(/^https?:\/\//, ''), + 'ARVADOS_API_TOKEN' => Thread.current[:arvados_api_token], + 'ARVADOS_API_HOST_INSECURE' => + Rails.configuration.arvados_insecure_https ? 'true' : 'false' + }) + + IO.popen([env, 'arv-put', '--raw'], 'w+b') do |io| + io.write normalized + io.close_write + while buf = io.read(2**20) + + end + end + + newc = Collection.new({:uuid => newuuid, :manifest_text => normalized}) + newc.save! + + chash.each do |k,v| + l = Link.new({ + tail_kind: "arvados#collection", + tail_uuid: k, + head_kind: "arvados#collection", + head_uuid: newuuid, + link_class: "provenance", + name: "provided" + }) + l.save! + end + + redirect_to controller: 'collections', action: :show, id: newc.uuid + end + + def post + if params["combine_selected_files_into_collection"] + combine_selected_files_into_collection + else + redirect_to :back + end + end +end diff --git a/apps/workbench/app/controllers/application_controller.rb b/apps/workbench/app/controllers/application_controller.rb index e94428e92d..61351d6449 100644 --- a/apps/workbench/app/controllers/application_controller.rb +++ b/apps/workbench/app/controllers/application_controller.rb @@ -24,6 +24,7 @@ class ApplicationController < ActionController::Base def unprocessable(message=nil) @errors ||= [] + @errors << message if message render_error status: 422 end @@ -109,6 +110,7 @@ class ApplicationController < ActionController::Base def create @object ||= model_class.new params[model_class.to_s.underscore.singularize] @object.save! + respond_to do |f| f.json { render json: @object } f.html { @@ -318,14 +320,14 @@ class ApplicationController < ActionController::Base } } - @@notification_tests.push lambda { |controller, current_user| - Job.limit(1).where(created_by: current_user.uuid).each do - return nil - end - return lambda { |view| - view.render partial: 'notifications/jobs_notification' - } - } + #@@notification_tests.push lambda { |controller, current_user| + # Job.limit(1).where(created_by: current_user.uuid).each do + # return nil + # end + # return lambda { |view| + # view.render partial: 'notifications/jobs_notification' + # } + #} @@notification_tests.push lambda { |controller, current_user| Collection.limit(1).where(created_by: current_user.uuid).each do diff --git a/apps/workbench/app/controllers/collections_controller.rb b/apps/workbench/app/controllers/collections_controller.rb index b6997b9759..d46ec0354c 100644 --- a/apps/workbench/app/controllers/collections_controller.rb +++ b/apps/workbench/app/controllers/collections_controller.rb @@ -102,8 +102,16 @@ class CollectionsController < ApplicationController end Collection.where(uuid: @object.uuid).each do |u| - @prov_svg = ProvenanceHelper::create_provenance_graph u.provenance, "provenance_svg", {:direction => :top_down, :combine_jobs => :script_only} rescue nil - @used_by_svg = ProvenanceHelper::create_provenance_graph u.used_by, "used_by_svg", {:direction => :top_down, :combine_jobs => :script_only, :pdata_only => true} rescue nil + puts request + @prov_svg = ProvenanceHelper::create_provenance_graph(u.provenance, "provenance_svg", + {:request => request, + :direction => :bottom_up, + :combine_jobs => :script_only}) rescue nil + @used_by_svg = ProvenanceHelper::create_provenance_graph(u.used_by, "used_by_svg", + {:request => request, + :direction => :top_down, + :combine_jobs => :script_only, + :pdata_only => true}) rescue nil end end diff --git a/apps/workbench/app/controllers/jobs_controller.rb b/apps/workbench/app/controllers/jobs_controller.rb index 
d302bffad5..4705bb5204 100644 --- a/apps/workbench/app/controllers/jobs_controller.rb +++ b/apps/workbench/app/controllers/jobs_controller.rb @@ -14,7 +14,10 @@ class JobsController < ApplicationController nodes << c end - @svg = ProvenanceHelper::create_provenance_graph nodes, "provenance_svg", {:all_script_parameters => true, :script_version_nodes => true} + @svg = ProvenanceHelper::create_provenance_graph nodes, "provenance_svg", { + :request => request, + :all_script_parameters => true, + :script_version_nodes => true} end def index diff --git a/apps/workbench/app/controllers/keep_disks_controller.rb b/apps/workbench/app/controllers/keep_disks_controller.rb index 482a2d33be..cc89228832 100644 --- a/apps/workbench/app/controllers/keep_disks_controller.rb +++ b/apps/workbench/app/controllers/keep_disks_controller.rb @@ -1,2 +1,7 @@ class KeepDisksController < ApplicationController + def create + defaults = { is_readable: true, is_writable: true } + @object = KeepDisk.new defaults.merge(params[:keep_disk] || {}) + super + end end diff --git a/apps/workbench/app/controllers/pipeline_instances_controller.rb b/apps/workbench/app/controllers/pipeline_instances_controller.rb index 42cb2e9d44..c2a398c9b2 100644 --- a/apps/workbench/app/controllers/pipeline_instances_controller.rb +++ b/apps/workbench/app/controllers/pipeline_instances_controller.rb @@ -45,6 +45,38 @@ class PipelineInstancesController < ApplicationController end def show + if @object.components.empty? and @object.pipeline_template_uuid + template = PipelineTemplate.find(@object.pipeline_template_uuid) + pipeline = {} + template.components.each do |component_name, component_props| + pipeline[component_name] = {} + component_props.each do |k, v| + if k == :script_parameters + pipeline[component_name][:script_parameters] = {} + v.each do |param_name, param_value| + if param_value.is_a? 
Hash + if param_value[:value] + pipeline[component_name][:script_parameters][param_name] = param_value[:value] + elsif param_value[:default] + pipeline[component_name][:script_parameters][param_name] = param_value[:default] + elsif param_value[:optional] != nil or param_value[:required] != nil or param_value[:dataclass] != nil + pipeline[component_name][:script_parameters][param_name] = "" + else + pipeline[component_name][:script_parameters][param_name] = param_value + end + else + pipeline[component_name][:script_parameters][param_name] = param_value + end + end + else + pipeline[component_name][k] = v + end + end + end + @object.components= pipeline + @object.save + end + @pipelines = [@object] if params[:compare] @@ -56,6 +88,7 @@ class PipelineInstancesController < ApplicationController provenance, pips = graph(@pipelines) @prov_svg = ProvenanceHelper::create_provenance_graph provenance, "provenance_svg", { + :request => request, :all_script_parameters => true, :combine_jobs => :script_and_version, :script_version_nodes => true, @@ -127,6 +160,7 @@ class PipelineInstancesController < ApplicationController @pipelines = @objects @prov_svg = ProvenanceHelper::create_provenance_graph provenance, "provenance_svg", { + :request => request, :all_script_parameters => true, :combine_jobs => :script_and_version, :script_version_nodes => true, @@ -141,6 +175,20 @@ class PipelineInstancesController < ApplicationController %w(Compare Graph) end + def update + updates = params[@object.class.to_s.underscore.singularize.to_sym] + if updates["components"] + require 'deep_merge/rails_compat' + updates["components"] = updates["components"].deeper_merge(@object.components) + end + super + end + + def index + @objects ||= model_class.limit(20).all + super + end + protected def for_comparison v if v.is_a? Hash or v.is_a? 
Array diff --git a/apps/workbench/app/controllers/pipeline_templates_controller.rb b/apps/workbench/app/controllers/pipeline_templates_controller.rb index fdbebcfaed..98101b5e9b 100644 --- a/apps/workbench/app/controllers/pipeline_templates_controller.rb +++ b/apps/workbench/app/controllers/pipeline_templates_controller.rb @@ -1,2 +1,15 @@ class PipelineTemplatesController < ApplicationController + + def show + @objects = [] + PipelineInstance.where(pipeline_template_uuid: @object.uuid).each do |pipeline| + @objects.push(pipeline) + end + super + end + + def show_pane_list + %w(Components Pipelines Attributes Metadata JSON API) + end + end diff --git a/apps/workbench/app/controllers/users_controller.rb b/apps/workbench/app/controllers/users_controller.rb index 3ccaa525ce..c33de2d034 100644 --- a/apps/workbench/app/controllers/users_controller.rb +++ b/apps/workbench/app/controllers/users_controller.rb @@ -1,6 +1,7 @@ class UsersController < ApplicationController skip_before_filter :find_object_by_uuid, :only => :welcome skip_around_filter :thread_with_mandatory_api_token, :only => :welcome + before_filter :ensure_current_user_is_admin, only: :sudo def welcome if current_user @@ -9,6 +10,23 @@ class UsersController < ApplicationController end end + def show_pane_list + if current_user.andand.is_admin + super | %w(Admin) + else + super + end + end + + def sudo + resp = $arvados_api_client.api(ApiClientAuthorization, '', { + api_client_authorization: { + owner_uuid: @object.uuid + } + }) + redirect_to root_url(api_token: resp[:api_token]) + end + def home @showallalerts = false @my_ssh_keys = AuthorizedKey.where(authorized_user_uuid: current_user.uuid) diff --git a/apps/workbench/app/helpers/application_helper.rb b/apps/workbench/app/helpers/application_helper.rb index cd8e5279dd..e608572f05 100644 --- a/apps/workbench/app/helpers/application_helper.rb +++ b/apps/workbench/app/helpers/application_helper.rb @@ -3,6 +3,10 @@ module ApplicationHelper controller.current_user end + def self.match_uuid(uuid) + /^([0-9a-z]{5})-([0-9a-z]{5})-([0-9a-z]{15})$/.match(uuid.to_s) + end + def current_api_host Rails.configuration.arvados_v1_base.gsub /https?:\/\/|\/arvados\/v1/,'' end @@ -67,7 +71,7 @@ module ApplicationHelper end end style_opts[:class] = (style_opts[:class] || '') + ' nowrap' - link_to link_name, { controller: resource_class.to_s.underscore.pluralize, action: 'show', id: link_uuid }, style_opts + link_to link_name, { controller: resource_class.to_s.tableize, action: 'show', id: link_uuid }, style_opts else attrvalue end @@ -100,4 +104,123 @@ module ApplicationHelper :class => "editable" }.merge(htmloptions) end + + def render_editable_subattribute(object, attr, subattr, template, htmloptions={}) + if object + attrvalue = object.send(attr) + subattr.each do |k| + if attrvalue and attrvalue.is_a? Hash + attrvalue = attrvalue[k] + else + break + end + end + end + + datatype = nil + required = true + if template + #puts "Template is #{template.class} #{template.is_a? Hash} #{template}" + if template.is_a? Hash + if template[:output_of] + return raw("#{template[:output_of]}") + end + if template[:dataclass] + dataclass = template[:dataclass] + end + if template[:optional] != nil + required = (template[:optional] != "true") + end + if template[:required] != nil + required = template[:required] + end + end + end + + rsc = template + if template.is_a? 
Hash + if template[:value] + rsc = template[:value] + elsif template[:default] + rsc = template[:default] + end + end + + return link_to_if_arvados_object(rsc) if !object + return link_to_if_arvados_object(attrvalue) if !object.attribute_editable? attr + + if dataclass + begin + dataclass = dataclass.constantize + rescue NameError + end + else + dataclass = ArvadosBase.resource_class_for_uuid(rsc) + end + + if dataclass && dataclass.is_a?(Class) + datatype = 'select' + elsif dataclass == 'number' + datatype = 'number' + else + if template.is_a? Array + # ?!? + elsif template.is_a? String + if /^\d+$/.match(template) + datatype = 'number' + else + datatype = 'text' + end + end + end + + id = "#{object.uuid}-#{subattr.join('-')}" + dn = "[#{attr}]" + subattr.each do |a| + dn += "[#{a}]" + end + + if attrvalue.is_a? String + attrvalue = attrvalue.strip + end + + if dataclass and dataclass.is_a? Class + items = [] + if attrvalue and !attrvalue.empty? + items.append({name: attrvalue, uuid: attrvalue, type: dataclass.to_s}) + end + #dataclass.where(uuid: attrvalue).each do |item| + # items.append({name: item.uuid, uuid: item.uuid, type: dataclass.to_s}) + #end + dataclass.limit(10).each do |item| + items.append({name: item.uuid, uuid: item.uuid, type: dataclass.to_s}) + end + end + + lt = link_to attrvalue, '#', { + "data-emptytext" => "none", + "data-placement" => "bottom", + "data-type" => datatype, + "data-url" => url_for(action: "update", id: object.uuid, controller: object.class.to_s.pluralize.underscore), + "data-title" => "Set value for #{subattr[-1].to_s}", + "data-name" => dn, + "data-pk" => "{id: \"#{object.uuid}\", key: \"#{object.class.to_s.underscore}\"}", + "data-showbuttons" => "false", + "data-value" => attrvalue, + :class => "editable #{'required' if required}", + :id => id + }.merge(htmloptions) + + lt += raw("\n") + + lt + end end diff --git a/apps/workbench/app/helpers/collections_helper.rb b/apps/workbench/app/helpers/collections_helper.rb index b2eee48ea6..7b548dfb84 100644 --- a/apps/workbench/app/helpers/collections_helper.rb +++ b/apps/workbench/app/helpers/collections_helper.rb @@ -4,4 +4,8 @@ module CollectionsHelper {source: x.tail_uuid, target: x.head_uuid, type: x.name} end end + + def self.match(uuid) + /^([a-f0-9]{32})(\+[0-9]+)?(\+.*?)?(\/.*)?$/.match(uuid.to_s) + end end diff --git a/apps/workbench/app/helpers/pipeline_instances_helper.rb b/apps/workbench/app/helpers/pipeline_instances_helper.rb index 348004620e..c52d339158 100644 --- a/apps/workbench/app/helpers/pipeline_instances_helper.rb +++ b/apps/workbench/app/helpers/pipeline_instances_helper.rb @@ -1,30 +1,4 @@ module PipelineInstancesHelper - def pipeline_summary object=nil - object ||= @object - ret = {todo:0, running:0, queued:0, done:0, failed:0, total:0} - object.components.values.each do |c| - ret[:total] += 1 - case - when !c[:job] - ret[:todo] += 1 - when c[:job][:success] - ret[:done] += 1 - when c[:job][:failed] - ret[:failed] += 1 - when c[:job][:finished_at] - ret[:running] += 1 # XXX finished but !success and !failed?? - when c[:job][:started_at] - ret[:running] += 1 - else - ret[:queued] += 1 - end - end - ret.merge! Hash[ret.collect do |k,v| - [('percent_' + k.to_s).to_sym, - ret[:total]<1 ? 0 : (100.0*v/ret[:total]).floor] - end] - ret - end def pipeline_jobs object=nil object ||= @object @@ -42,22 +16,37 @@ module PipelineInstancesHelper end def render_pipeline_job pj - if pj[:percent_done] - pj[:progress_bar] = raw("
    ") - elsif pj[:progress] - raw("
    ") - end + pj[:progress_bar] = render partial: 'job_progress', locals: {:j => pj[:job]} pj[:output_link] = link_to_if_arvados_object pj[:output] pj[:job_link] = link_to_if_arvados_object pj[:job][:uuid] pj end + protected def pipeline_jobs_newschool object ret = [] i = -1 - object.components.each do |cname, c| + + comp = [] + + template = PipelineTemplate.find(@object.pipeline_template_uuid) rescue nil + if template + order = PipelineTemplatesHelper::sort_components(template.components) + order.each do |k| + if object.components[k] + comp.push([k, object.components[k]]) + end + end + else + object.components.each do |k, v| + comp.push([k, v]) + end + end + + comp.each do |cname, c| + puts cname, c i += 1 pj = {index: i, name: cname} pj[:job] = c[:job].is_a?(Hash) ? c[:job] : {} diff --git a/apps/workbench/app/helpers/pipeline_templates_helper.rb b/apps/workbench/app/helpers/pipeline_templates_helper.rb index be82878a8e..0540047e9c 100644 --- a/apps/workbench/app/helpers/pipeline_templates_helper.rb +++ b/apps/workbench/app/helpers/pipeline_templates_helper.rb @@ -1,2 +1,24 @@ +require 'tsort' + +class Hash + include TSort + def tsort_each_node(&block) + keys.sort.each(&block) + end + + def tsort_each_child(node) + if self[node] + self[node][:script_parameters].sort.map do |k, v| + if v.is_a? Hash and v[:output_of] + yield v[:output_of].to_sym + end + end + end + end +end + module PipelineTemplatesHelper + def self.sort_components(components) + components.tsort + end end diff --git a/apps/workbench/app/helpers/provenance_helper.rb b/apps/workbench/app/helpers/provenance_helper.rb index 6d6ae5516c..66754d20b2 100644 --- a/apps/workbench/app/helpers/provenance_helper.rb +++ b/apps/workbench/app/helpers/provenance_helper.rb @@ -7,13 +7,15 @@ module ProvenanceHelper @visited = {} @jobs = {} end - + def self.collection_uuid(uuid) - m = /^([a-f0-9]{32}(\+[0-9]+)?)(\+.*)?$/.match(uuid.to_s) + m = CollectionsHelper.match(uuid) if m - #if m[2] - return m[1] - #else + if m[2] + return m[1]+m[2] + else + return m[1] + end # Collection.where(uuid: ['contains', m[1]]).each do |u| # puts "fixup #{uuid} to #{u.uuid}" # return u.uuid @@ -24,17 +26,28 @@ module ProvenanceHelper end end + def url_for u + p = { :host => @opts[:request].host, + :port => @opts[:request].port, + :protocol => @opts[:request].protocol } + p.merge! u + Rails.application.routes.url_helpers.url_for (p) + end + def determine_fillcolor(n) fillcolor = %w(aaaaaa aaffaa aaaaff aaaaaa ffaaaa)[n || 0] || 'aaaaaa' "style=filled,fillcolor=\"##{fillcolor}\"" end def describe_node(uuid) + uuid = uuid.to_sym bgcolor = determine_fillcolor @opts[:pips][uuid] if @opts[:pips] rsc = ArvadosBase::resource_class_for_uuid uuid.to_s if rsc - href = "/#{rsc.to_s.underscore.pluralize rsc}/#{uuid}" + href = url_for ({:controller => rsc.to_s.tableize, + :action => :show, + :id => uuid.to_s }) #"\"#{uuid}\" [label=\"#{rsc}\\n#{uuid}\",href=\"#{href}\"];\n" if rsc == Collection @@ -44,11 +57,12 @@ module ProvenanceHelper #puts "empty!" return "\"#{uuid}\" [label=\"(empty collection)\"];\n" end + puts "#{uuid.class} #{@pdata[uuid]}" if @pdata[uuid] #puts @pdata[uuid] if @pdata[uuid][:name] return "\"#{uuid}\" [label=\"#{@pdata[uuid][:name]}\",href=\"#{href}\",shape=oval,#{bgcolor}];\n" - else + else files = nil if @pdata[uuid].respond_to? 
:files files = @pdata[uuid].files @@ -67,12 +81,13 @@ module ProvenanceHelper if i < files.length label += "\\n⋮" end + #puts "#{uuid} #{label} #{files}" return "\"#{uuid}\" [label=\"#{label}\",href=\"#{href}\",shape=oval,#{bgcolor}];\n" end end end - return "\"#{uuid}\" [label=\"#{rsc}\",href=\"#{href}\",#{bgcolor}];\n" end + return "\"#{uuid}\" [label=\"#{rsc}\",href=\"#{href}\",#{bgcolor}];\n" end "\"#{uuid}\" [#{bgcolor}];\n" end @@ -99,7 +114,7 @@ module ProvenanceHelper gr = "\"#{head}\" -> \"#{tail}\"" end if extra.length > 0 - gr += "[" + gr += " [" extra.each do |k, v| gr += "#{k}=\"#{v}\"," end @@ -209,6 +224,8 @@ module ProvenanceHelper gr += edge(job_uuid(job), job[:script_version], {:label => "script_version"}) end end + elsif rsc == Link + # do nothing else gr += describe_node(uuid) end @@ -216,8 +233,12 @@ module ProvenanceHelper @pdata.each do |k, link| if link[:head_uuid] == uuid.to_s and link[:link_class] == "provenance" + href = url_for ({:controller => Link.to_s.tableize, + :action => :show, + :id => link[:uuid] }) + gr += describe_node(link[:tail_uuid]) - gr += edge(link[:head_uuid], link[:tail_uuid], {:label => link[:name], :href => "/links/#{link[:uuid]}"}) + gr += edge(link[:head_uuid], link[:tail_uuid], {:label => link[:name], :href => href}) gr += generate_provenance_edges(link[:tail_uuid]) end end @@ -230,7 +251,10 @@ module ProvenanceHelper def describe_jobs gr = "" @jobs.each do |k, v| - gr += "\"#{k}\" [href=\"/jobs?" + href = url_for ({:controller => Job.to_s.tableize, + :action => :index }) + + gr += "\"#{k}\" [href=\"#{href}?" n = 0 v.each do |u| @@ -241,11 +265,11 @@ module ProvenanceHelper gr += "\",label=\"" if @opts[:combine_jobs] == :script_only - gr += uuid = "#{v[0][:script]}" + gr += "#{v[0][:script]}" elsif @opts[:combine_jobs] == :script_and_version - gr += uuid = "#{v[0][:script]}" + gr += "#{v[0][:script]}" # Just show the name but the nodes will be distinct else - gr += uuid = "#{v[0][:script]}\\n#{v[0][:finished_at]}" + gr += "#{v[0][:script]}\\n#{v[0][:finished_at]}" end gr += "\",#{determine_fillcolor n}];\n" end @@ -289,8 +313,8 @@ edge [fontsize=10]; gr += "}" svg = "" - #puts gr - + puts gr + require 'open3' Open3.popen2("dot", "-Tsvg") do |stdin, stdout, wait_thr| diff --git a/apps/workbench/app/models/arvados_base.rb b/apps/workbench/app/models/arvados_base.rb index 72b76a5229..fbf7ee5e79 100644 --- a/apps/workbench/app/models/arvados_base.rb +++ b/apps/workbench/app/models/arvados_base.rb @@ -61,13 +61,16 @@ class ArvadosBase < ActiveRecord::Base attr_reader :kind @columns end + def self.column(name, sql_type = nil, default = nil, null = true) ActiveRecord::ConnectionAdapters::Column.new(name.to_s, default, sql_type.to_s, null) end + def self.attribute_info self.columns @attribute_info end + def self.find(uuid, opts={}) if uuid.class != String or uuid.length < 27 then raise 'argument to find() must be a uuid string. 
Acceptable formats: warehouse locator or string with format xxxxx-xxxxx-xxxxxxxxxxxxxxx' @@ -84,21 +87,27 @@ class ArvadosBase < ActiveRecord::Base end new.private_reload(hash) end + def self.order(*args) ArvadosResourceList.new(self).order(*args) end + def self.where(*args) ArvadosResourceList.new(self).where(*args) end + def self.limit(*args) ArvadosResourceList.new(self).limit(*args) end + def self.eager(*args) ArvadosResourceList.new(self).eager(*args) end + def self.all(*args) ArvadosResourceList.new(self).all(*args) end + def save obdata = {} self.class.columns.each do |col| @@ -128,8 +137,11 @@ class ArvadosBase < ActiveRecord::Base end end + @new_record = false + self end + def save! self.save or raise Exception.new("Save failed") end @@ -169,6 +181,7 @@ class ArvadosBase < ActiveRecord::Base @links = $arvados_api_client.api Link, '', { _method: 'GET', where: o, eager: true } @links = $arvados_api_client.unpack_api_response(@links) end + def all_links return @all_links if @all_links res = $arvados_api_client.api Link, '', { @@ -181,9 +194,11 @@ class ArvadosBase < ActiveRecord::Base } @all_links = $arvados_api_client.unpack_api_response(res) end + def reload private_reload(self.uuid) end + def private_reload(uuid_or_hash) raise "No such object" if !uuid_or_hash if uuid_or_hash.is_a? Hash @@ -206,8 +221,14 @@ class ArvadosBase < ActiveRecord::Base end end @all_links = nil + @new_record = false self end + + def to_param + uuid + end + def dup super.forget_uuid! end @@ -275,6 +296,10 @@ class ArvadosBase < ActiveRecord::Base (name if self.respond_to? :name) || uuid end + def selection_label + friendly_link_name + end + protected def forget_uuid! diff --git a/apps/workbench/app/models/collection.rb b/apps/workbench/app/models/collection.rb index bda5523d8c..6bc55bde3d 100644 --- a/apps/workbench/app/models/collection.rb +++ b/apps/workbench/app/models/collection.rb @@ -1,4 +1,5 @@ class Collection < ArvadosBase + def total_bytes if files tot = 0 @@ -24,4 +25,5 @@ class Collection < ArvadosBase def used_by $arvados_api_client.api "collections/#{self.uuid}/", "used_by" end + end diff --git a/apps/workbench/app/models/pipeline_instance.rb b/apps/workbench/app/models/pipeline_instance.rb index da6116e916..ccb88351a7 100644 --- a/apps/workbench/app/models/pipeline_instance.rb +++ b/apps/workbench/app/models/pipeline_instance.rb @@ -16,9 +16,9 @@ class PipelineInstance < ArvadosBase end end end - + def attribute_editable?(attr) - attr == 'name' + attr.to_sym == :name || (attr.to_sym == :components and self.active == nil) end def attributes_for_display diff --git a/apps/workbench/app/views/application/_content.html.erb b/apps/workbench/app/views/application/_content.html.erb index 02efdf9999..53444a5c9c 100644 --- a/apps/workbench/app/views/application/_content.html.erb +++ b/apps/workbench/app/views/application/_content.html.erb @@ -25,7 +25,6 @@ <% end %> <% content_for :js do %> - $(window).on('load', function() { - $('ul.nav-tabs > li > a').on('shown.bs.tab', smart_scroll_fixup); - }); + $(window).on('load', smart_scroll_fixup); + $(document).on('shown.bs.tab', 'ul.nav-tabs > li > a', smart_scroll_fixup); <% end %> diff --git a/apps/workbench/app/views/application/_job_progress.html.erb b/apps/workbench/app/views/application/_job_progress.html.erb new file mode 100644 index 0000000000..a25acc3a04 --- /dev/null +++ b/apps/workbench/app/views/application/_job_progress.html.erb @@ -0,0 +1,20 @@ +<% percent_total_tasks = 100 / (j[:tasks_summary][:done] + j[:tasks_summary][:running] + 
j[:tasks_summary][:failed] + j[:tasks_summary][:todo]) rescue 0 %> + +<% if defined? scaleby %> + <% percent_total_tasks *= scaleby %> +<% end %> + +<% if not defined? scaleby %> +
    +<% end %> + + + + + + + + +<% if not defined? scaleby %> +
    +<% end %> diff --git a/apps/workbench/app/views/application/_job_status_label.html.erb b/apps/workbench/app/views/application/_job_status_label.html.erb new file mode 100644 index 0000000000..87b70fe0e8 --- /dev/null +++ b/apps/workbench/app/views/application/_job_status_label.html.erb @@ -0,0 +1,11 @@ +<% if j[:success] %> + <%= if defined? title then title else 'success' end %> +<% elsif j[:success] == false %> + <%= if defined? title then title else 'failed' end %> +<% elsif j[:finished_at] %> + <%= if defined? title then title else 'finished' end %> +<% elsif j[:started_at] %> + <%= if defined? title then title else 'running' end %> +<% else %> + <%= if defined? title then title else 'not running' end %> +<% end %> diff --git a/apps/workbench/app/views/application/_pipeline_progress.html.erb b/apps/workbench/app/views/application/_pipeline_progress.html.erb new file mode 100644 index 0000000000..d478f65ddc --- /dev/null +++ b/apps/workbench/app/views/application/_pipeline_progress.html.erb @@ -0,0 +1,8 @@ +<% component_frac = 1.0 / p.components.length %> +
    + <% p.components.each do |k,c| %> + <% if c[:job] %> + <%= render partial: "job_progress", locals: {:j => c[:job], :scaleby => component_frac } %> + <% end %> + <% end %> +
    diff --git a/apps/workbench/app/views/application/_pipeline_status_label.html.erb b/apps/workbench/app/views/application/_pipeline_status_label.html.erb new file mode 100644 index 0000000000..020ce81c57 --- /dev/null +++ b/apps/workbench/app/views/application/_pipeline_status_label.html.erb @@ -0,0 +1,13 @@ +<% if p.success %> + finished +<% elsif p.success == false %> + failed +<% elsif p.active %> + running +<% else %> + <% if (p.components.select do |k,v| v[:job] end).length == 0 %> + not started + <% else %> + not running + <% end %> +<% end %> diff --git a/apps/workbench/app/views/application/_selection_checkbox.html.erb b/apps/workbench/app/views/application/_selection_checkbox.html.erb new file mode 100644 index 0000000000..4d47d892c5 --- /dev/null +++ b/apps/workbench/app/views/application/_selection_checkbox.html.erb @@ -0,0 +1,8 @@ +<%if object %> +<%= check_box_tag 'uuids[]', object.uuid, false, { + :class => 'persistent-selection', + :friendly_type => object.class.name, + :friendly_name => object.selection_label, + :href => "#{url_for controller: object.class.name.tableize, action: 'show', id: object.uuid }" +} %> +<% end %> diff --git a/apps/workbench/app/views/application/_show_recent.html.erb b/apps/workbench/app/views/application/_show_recent.html.erb index c58c628ee9..ef4a8d1f04 100644 --- a/apps/workbench/app/views/application/_show_recent.html.erb +++ b/apps/workbench/app/views/application/_show_recent.html.erb @@ -8,9 +8,12 @@ <% attr_blacklist = ' created_at modified_at modified_by_user_uuid modified_by_client_uuid updated_at' %> +<%= form_tag do |f| %> + + <% @objects.first.attributes_for_display.each do |attr, attrvalue| %> <% next if attr_blacklist.index(" "+attr) %> <% @objects.each do |object| %> + + <% object.attributes_for_display.each do |attr, attrvalue| %> <% next if attr_blacklist.index(" "+attr) %>
    @@ -26,6 +29,10 @@
    + <%= render :partial => "selection_checkbox", :locals => {:object => object} %> + @@ -55,3 +62,5 @@
    <% end %> + +<% end %> diff --git a/apps/workbench/app/views/collections/_index_tbody.html.erb b/apps/workbench/app/views/collections/_index_tbody.html.erb index eb9c93fbc3..96b73979eb 100644 --- a/apps/workbench/app/views/collections/_index_tbody.html.erb +++ b/apps/workbench/app/views/collections/_index_tbody.html.erb @@ -1,6 +1,9 @@ <% @collections.each do |c| %> + + <%= render :partial => "selection_checkbox", :locals => {:object => c} %> + <%= link_to_if_arvados_object c.uuid %> diff --git a/apps/workbench/app/views/collections/_show_files.html.erb b/apps/workbench/app/views/collections/_show_files.html.erb index 385af8a272..956958eddb 100644 --- a/apps/workbench/app/views/collections/_show_files.html.erb +++ b/apps/workbench/app/views/collections/_show_files.html.erb @@ -1,5 +1,6 @@ + @@ -7,6 +8,7 @@ + @@ -14,26 +16,38 @@ <% if @object then @object.files.sort_by{|f|[f[0],f[1]]}.each do |file| %> - <% file_path = "#{file[0]}/#{file[1]}" %> - - + <% f0 = file[0] %> + <% f0 = '' if f0 == '.' %> + <% f0 = f0[2..-1] if f0[0..1] == './' %> + <% f0 += '/' if not f0.empty? %> + <% file_path = "#{f0}#{file[1]}" %> + + + - + - + - - + + <% end; end %>
    path file size
    - <%= file[0] %> -
    + <%= check_box_tag 'uuids[]', @object.uuid+'/'+file_path, false, { + :class => 'persistent-selection', + :friendly_type => "File", + :friendly_name => "#{@object.uuid}/#{file_path}", + :href => "#{url_for controller: 'collections', action: 'show', id: @object.uuid }/#{file_path}" + } %> + + <%= file[0] %> + - <%= link_to file[1], {controller: 'collections', action: 'show_file', uuid: @object.uuid, file: file_path, size: file[2], disposition: 'inline'}, {title: 'View in browser'} %> - + <%= link_to file[1], {controller: 'collections', action: 'show_file', uuid: @object.uuid, file: file_path, size: file[2], disposition: 'inline'}, {title: 'View in browser'} %> + - <%= raw(human_readable_bytes_html(file[2])) %> - + <%= raw(human_readable_bytes_html(file[2])) %> + -
    - <%= link_to raw(''), {controller: 'collections', action: 'show_file', uuid: @object.uuid, file: file_path, size: file[2], disposition: 'attachment'}, {class: 'btn btn-info btn-sm', title: 'Download'} %> -
    -
    +
    + <%= link_to raw(''), {controller: 'collections', action: 'show_file', uuid: @object.uuid, file: file_path, size: file[2], disposition: 'attachment'}, {class: 'btn btn-info btn-sm', title: 'Download'} %> +
    +
    diff --git a/apps/workbench/app/views/collections/_show_recent.html.erb b/apps/workbench/app/views/collections/_show_recent.html.erb index 3cedb57e85..a3b93d84e6 100644 --- a/apps/workbench/app/views/collections/_show_recent.html.erb +++ b/apps/workbench/app/views/collections/_show_recent.html.erb @@ -15,8 +15,11 @@
    +<%= form_tag do |f| %> + + @@ -26,6 +29,7 @@ + @@ -38,6 +42,9 @@ <%= render partial: 'index_tbody' %>
    uuid contents owner
    + +<% end %> +
    <% content_for :footer_js do %> diff --git a/apps/workbench/app/views/jobs/_show_recent.html.erb b/apps/workbench/app/views/jobs/_show_recent.html.erb index 85331f3e44..304a3b5c1f 100644 --- a/apps/workbench/app/views/jobs/_show_recent.html.erb +++ b/apps/workbench/app/views/jobs/_show_recent.html.erb @@ -35,24 +35,12 @@ - <% if j.success == false %> - - <% elsif j.success %> - - <% elsif j.running %> - - <% else %> - - <% end %> + <%= render partial: 'job_status_label', locals: {:j => j} %> - <% if j.started_at and not j.finished_at %> - <% percent_total_tasks = 100 / (j.tasks_summary[:running] + j.tasks_summary[:done] + j.tasks_summary[:todo]) rescue 0 %> -
    -
    -
    +
    + <%= render partial: 'job_progress', locals: {:j => j} %>
    - <% end %> <%= link_to_if_arvados_object j.uuid %> diff --git a/apps/workbench/app/views/layouts/application.html.erb b/apps/workbench/app/views/layouts/application.html.erb index 1dc6284c83..abef47136f 100644 --- a/apps/workbench/app/views/layouts/application.html.erb +++ b/apps/workbench/app/views/layouts/application.html.erb @@ -42,14 +42,6 @@ padding-top: 1.25em; } - /* Setting the height needs to be fixed with javascript. */ - .dropdown-menu { - padding-right: 20px; - max-height: 440px; - width: 400px; - overflow-y: auto; - } - @media (min-width: 768px) { .left-nav { position: fixed; @@ -93,7 +85,12 @@
  • -<%= link_to controller.breadcrumb_page_name, request.fullpath %> + <%= link_to controller.breadcrumb_page_name, request.fullpath %> +
  • +
  • + <%= form_tag do |f| %> + <%= render :partial => "selection_checkbox", :locals => {:object => @object} %> + <% end %>
  • <% end %> <% end %> @@ -118,18 +115,18 @@ --> - <% if current_user.is_active %> + {% endif %} + Next: {{ p.title }} + {% assign nx = 0 %} + {% assign n = 1 %} + {% endif %} + {% if p.url == page.url %} + {% assign nx = 1 %} + {% else %} + {% assign prev = p %} + {% endif %} + {% endfor %} + {% endfor %} +{% endfor %} +{% if n == 0 && prev != "" %} +
    + Previous: {{ prev.title }} + {% assign n = 1 %} +{% endif %} \ No newline at end of file diff --git a/doc/_layouts/default.html.liquid b/doc/_layouts/default.html.liquid index 732b2addda..4585b7032a 100644 --- a/doc/_layouts/default.html.liquid +++ b/doc/_layouts/default.html.liquid @@ -11,6 +11,7 @@ + - + + +
    +

    Creative Commons Notice

    + +

    Creative Commons is not a party to this License, and + makes no warranty whatsoever in connection with the Work. + Creative Commons will not be liable to You or any party + on any legal theory for any damages whatsoever, including + without limitation any general, special, incidental or + consequential damages arising in connection to this + license. Notwithstanding the foregoing two (2) sentences, + if Creative Commons has expressly identified itself as + the Licensor hereunder, it shall have all rights and + obligations of Licensor.

    + +

    Except for the limited purpose of indicating to the + public that the Work is licensed under the CCPL, Creative + Commons does not authorize the use by either party of the + trademark "Creative Commons" or any related trademark or + logo of Creative Commons without the prior written + consent of Creative Commons. Any permitted use will be in + compliance with Creative Commons' then-current trademark + usage guidelines, as may be published on its website or + otherwise made available upon request from time to time. + For the avoidance of doubt, this trademark restriction + does not form part of this License.

    + +

    Creative Commons may be contacted at http://creativecommons.org/.

    +
    +
    + + + diff --git a/doc/user/copying/copying.html.textile.liquid b/doc/user/copying/copying.html.textile.liquid new file mode 100644 index 0000000000..2ab868102c --- /dev/null +++ b/doc/user/copying/copying.html.textile.liquid @@ -0,0 +1,11 @@ +--- +layout: default +navsection: userguide +title: "Arvados Free Software Licenses" +... + +Server-side components of Arvados contained in the apps/ and services/ directories, including the API Server, Workbench, and Crunch, are licenced under the "GNU Affero General Public License version 3":agpl-3.0.html. + +The Arvados client Software Development Kits contained in the sdk/ directory, example scripts in the crunch_scripts/ directory, and code samples in the Aravados documentation are licensed under the "Apache License, Version 2.0":LICENSE-2.0.html + +The Arvados Documentation located in the doc/ directory is licensed under the "Creative Commons Attribution-Share Alike 3.0 United States":by-sa-3.0.html diff --git a/doc/user/examples/crunch-examples.html.textile.liquid b/doc/user/examples/crunch-examples.html.textile.liquid index b657a68c9f..13bb1ae086 100644 --- a/doc/user/examples/crunch-examples.html.textile.liquid +++ b/doc/user/examples/crunch-examples.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Examples title: "Crunch examples" - ... -h1. Crunch examples - Several crunch scripts are included with Arvados in the "/crunch_scripts directory":https://arvados.org/projects/arvados/repository/revisions/master/show/crunch_scripts. They are intended to provide examples and starting points for writing your own scripts. h4. bwa-aln diff --git a/doc/user/getting_started/check-environment.html.textile.liquid b/doc/user/getting_started/check-environment.html.textile.liquid index 6cf35f340a..2908e6e742 100644 --- a/doc/user/getting_started/check-environment.html.textile.liquid +++ b/doc/user/getting_started/check-environment.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Getting Started title: "Checking your environment" - ... -h1. Checking your environment - First you should "log into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login if you have not already done so. If @arv user current@ is able to access the API server, it will print out information about your account. Check that you are able to access the Arvados API server using the following command: @@ -42,5 +38,3 @@ However, if you receive the following message: bc. ARVADOS_API_HOST and ARVADOS_API_TOKEN need to be defined as environment variables Then follow the instructions for "getting an API token,":{{site.baseurl}}/user/reference/api-tokens.html and try @arv user current@ again. - -Once you are able to access the API server, you are ready proceed to the first tutorial: "Storing and retrieving data using Arvados Keep.":{{site.baseurl}}/user/tutorials/tutorial-keep.html diff --git a/doc/user/getting_started/community.html.textile.liquid b/doc/user/getting_started/community.html.textile.liquid index c910ac1f42..8b6e22d1fd 100644 --- a/doc/user/getting_started/community.html.textile.liquid +++ b/doc/user/getting_started/community.html.textile.liquid @@ -1,12 +1,8 @@ --- layout: default navsection: userguide -navmenu: Getting Started title: Arvados Community and Getting Help - ... -h1. Arvados Community and Getting Help - h2. 
On the web diff --git a/doc/user/getting_started/ssh-access.html.textile.liquid b/doc/user/getting_started/ssh-access.html.textile.liquid index 3c40315ad7..e4a2b9c8b9 100644 --- a/doc/user/getting_started/ssh-access.html.textile.liquid +++ b/doc/user/getting_started/ssh-access.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Getting Started title: Accessing an Arvados VM over ssh - ... -h1. Accessing an Arvados Virtual Machine over ssh - Arvados requires a public @ssh@ key in order to securely log in to an Arvados VM instance, or to access an Arvados @git@ repository. This document is divided up into three sections. @@ -133,7 +129,7 @@ h1(#workbench). Adding your key to Arvados Workbench h3. From the workbench dashboard -If you have no @ssh@ keys registered, there should be a notification asking you to provide your @ssh@ public key. On the Workbench dashboard (in this guide, this is "https://workbench.{{ site.arvados_api_host }}/":https://workbench.{{ site.arvados_api_host }}/ ), look for the envelope icon 1 in upper right corner (the number indicates there are new notifications). Click on this icon and a dropdown menu should appear with a message asking you to add your public key. Paste your public key into the text area provided and click on the check button to submit the key. You are now ready to "log into an Arvados VM":#login. +If you have no @ssh@ keys registered, there should be a notification asking you to provide your @ssh@ public key. On the Workbench dashboard (in this guide, this is "https://{{ site.arvados_workbench_host }}/":https://{{ site.arvados_workbench_host }}/ ), look for the envelope icon 1 in upper right corner (the number indicates there are new notifications). Click on this icon and a dropdown menu should appear with a message asking you to add your public key. Paste your public key into the text area provided and click on the check button to submit the key. You are now ready to "log into an Arvados VM":#login. h3. Alternate way to add ssh keys @@ -184,6 +180,7 @@ Since the above command line is cumbersome, it can be greatly simplfied by addin
    Host *.arvados
       ProxyCommand ssh -a -x -p2222 turnout@switchyard.{{ site.arvados_api_host }} $SSH_PROXY_FLAGS %h
    +  User you
       ForwardAgent yes
       ForwardX11 no
     
    @@ -191,7 +188,7 @@ Since the above command line is cumbersome, it can be greatly simplfied by addin This will recognize any host ending in ".arvados" and automatically apply the proxy, user and forwarding settings from the configuration file, allowing you to log in with a much simpler command: -notextile.
    $ ssh you@shell.arvados
    +notextile.
    $ ssh shell.arvados
    h2(#windowsvm). Logging in using PuTTY (Windows) diff --git a/doc/user/getting_started/workbench.html.textile.liquid b/doc/user/getting_started/workbench.html.textile.liquid index 71041b3ea2..48a4c470b8 100644 --- a/doc/user/getting_started/workbench.html.textile.liquid +++ b/doc/user/getting_started/workbench.html.textile.liquid @@ -1,18 +1,13 @@ --- layout: default navsection: userguide -navmenu: Getting Started title: Accessing Arvados Workbench - ... -h1. Accessing Arvados Workbench Access the Arvados beta test instance available using this link: -"https://workbench.{{ site.arvados_api_host }}/":https://workbench.{{ site.arvados_api_host }}/ +"https://{{ site.arvados_workbench_host }}/":https://{{ site.arvados_workbench_host }}/ If you are accessing Arvados for the first time, you will be asked to log in using a Google account. Arvados uses only your name and email address from Google services for identification, and will never access any personal information. Once you are logged in, the Workbench page may indicate your account status is *New / inactive*. If this is the case, contact the administrator of the Arvados instance to activate your account. Once your account is active, logging in to the Workbench will present you with a system status dashboard. This gives a summary of data, configuration, and activity in the Arvados instance. - -Next, we will "configure your account for ssh access to an Arvados virtual machine (VM).":ssh-access.html diff --git a/doc/user/index.html.textile.liquid b/doc/user/index.html.textile.liquid index 982b1c3fc3..03a9e60239 100644 --- a/doc/user/index.html.textile.liquid +++ b/doc/user/index.html.textile.liquid @@ -2,12 +2,9 @@ layout: default navsection: userguide title: Welcome to Arvados! - ... -h1. Welcome to Arvados! - -This guide is intended to introduce new users to the Arvados system. It covers initial configuration required to use the system and then presents several tutorials on using Arvados to do data processing. +This guide is intended to introduce new users to the Arvados system. It covers initial configuration required to access the system and then presents several tutorials on using Arvados to do data processing. This user guide introduces how to use the major components of Arvados. These are: @@ -26,9 +23,11 @@ To get the most value out of this guide, you should be comfortable with the foll # Programming in @python@ # Revision control using @git@ -The examples in this guide uses the public Arvados instance located at "https://workbench.{{ site.arvados_api_host }}/":https://workbench.{{ site.arvados_api_host }}/ . You must have an account in order to use this service. If you would like to request an account, please send an email to "arvados@curoverse.com":mailto:arvados@curoverse.com . +We also recommend you read the "Arvados Platform Overview":https://arvados.org/projects/arvados/wiki#Platform-Overview for an introduction and background information about Arvados. + +The examples in this guide uses the Arvados instance located at "https://{{ site.arvados_workbench_host }}/":https://{{ site.arvados_workbench_host }}/ . If you are using a different Arvados instance replace @{{ site.arvados_workbench_host }}@ with your private instance in all of the examples in this guide. -If you are using a different Arvados instance replace @{{ site.arvados_api_host }}@ with your private instance in all of the examples in this guide. 
+The Arvados public beta instance is located at "https://workbench.qr1hi.arvadosapi.com/":https://workbench.qr1hi.arvadosapi.com/ . You must have an account in order to use this service. If you would like to request an account, please send an email to "arvados@curoverse.com":mailto:arvados@curoverse.com . h2. Typographic conventions @@ -36,11 +35,11 @@ This manual uses the following typographic conventions:
      -
    • Code blocks which are set aside from the text indicate user input to the system. Commands that should be entered into a Unix shell are indicated by the directory where you should enter the command ('~' indicates your home directory) followed by '$', followed by the highlighted command to enter (do not enter the '$'), and possibly followed by example command output in black. For example, the following block indicates that you should type "ls foo" while in your home directory and the expected output will be "foo". - +
    • Code blocks which are set aside from the text indicate user input to the system. Commands that should be entered into a Unix shell are indicated by the directory where you should enter the command ('~' indicates your home directory) followed by '$', followed by the highlighted command to enter (do not enter the '$'), and possibly followed by example command output in black. For example, the following block indicates that you should type "ls foo.*" while in your home directory and the expected output will be "foo.input" and "foo.output".
~$ ls foo.*
foo.input  foo.output
      -
    • + +
    • Code blocks inline with text emphasize specific programs, files, or options that are being discussed.
• Bold text emphasizes specific items to look for when discussing Arvados Workbench pages.
    • @@ -49,4 +48,3 @@ foo
    -Now begin by "accessing the Arvados workbench.":getting_started/workbench.html diff --git a/doc/user/reference/api-tokens.html.textile.liquid b/doc/user/reference/api-tokens.html.textile.liquid index d47c1ccdae..018c71c678 100644 --- a/doc/user/reference/api-tokens.html.textile.liquid +++ b/doc/user/reference/api-tokens.html.textile.liquid @@ -1,16 +1,12 @@ --- layout: default navsection: userguide -navmenu: Reference title: "Getting an API token" - ... -h1. Reference: Getting an API token - The Arvados API token is a secret key that enables the @arv@ command line client to access Arvados with the proper permissions. -Access the Arvados workbench using this link: "https://workbench.{{ site.arvados_api_host }}/":https://workbench.{{ site.arvados_api_host }}/ +Access the Arvados workbench using this link: "https://{{ site.arvados_workbench_host }}/":https://{{ site.arvados_workbench_host }}/ (Replace @{{ site.arvados_api_host }}@ with the hostname of your local Arvados instance if necessary.) diff --git a/doc/user/reference/sdk-cli.html.textile.liquid b/doc/user/reference/sdk-cli.html.textile.liquid index c79563161c..f44fef2bf4 100644 --- a/doc/user/reference/sdk-cli.html.textile.liquid +++ b/doc/user/reference/sdk-cli.html.textile.liquid @@ -1,12 +1,9 @@ --- layout: default navsection: userguide -navmenu: Reference title: "Command line interface" ... -h1. Reference: Command Line Interface - *First, you should be "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* h3. Usage diff --git a/doc/user/topics/keep.html.textile.liquid b/doc/user/topics/keep.html.textile.liquid new file mode 100644 index 0000000000..dae133abee --- /dev/null +++ b/doc/user/topics/keep.html.textile.liquid @@ -0,0 +1,48 @@ +--- +layout: default +navsection: userguide +title: "How Keep works" +... + +In Keep, information is stored in *data blocks*. Data blocks are normally between 1 byte and 64 megabytes in size. If a file exceeds the maximum size of a single data block, the file will be split across multiple data blocks until the entire file can be stored. These data blocks may be stored and replicated across multiple disks, servers, or clusters. Each data block has its own identifier for the contents of that specific data block. + +In order to reassemble the file, Keep stores a *collection* data block which lists in sequence the data blocks that make up the original file. A collection data block may store the information for multiple files, including a directory structure. + +In this example we will use @c1bad4b39ca5a924e481008009d94e32+210@ which we added to Keep in "the first Keep tutorial":{{ site.baseurl }}/users/tutorial/tutorial-keep.html. First let us examine the contents of this collection using @arv keep get@: + + +
    ~$ arv keep get c1bad4b39ca5a924e481008009d94e32+210
    +. 204e43b8a1185621ca55a94839582e6f+67108864 b9677abbac956bd3e86b1deb28dfac03+67108864 fc15aff2a762b13f521baf042140acec+67108864 323d2a3ce20370c4ca1d3462a344f8fd+25885655 0:227212247:var-GS000016015-ASM.tsv.bz2
    +
    +
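
As a supplement to the walk-through that follows, here is a minimal Python sketch of how a manifest line like the one above breaks down into a stream name, block locators, and file segments. It is illustrative only and does not use the Arvados SDK; the values are copied from the @arv keep get@ output shown above.

import re

# Manifest line copied from the `arv keep get` output above.
manifest = (". 204e43b8a1185621ca55a94839582e6f+67108864 "
            "b9677abbac956bd3e86b1deb28dfac03+67108864 "
            "fc15aff2a762b13f521baf042140acec+67108864 "
            "323d2a3ce20370c4ca1d3462a344f8fd+25885655 "
            "0:227212247:var-GS000016015-ASM.tsv.bz2")

tokens = manifest.split()
stream_name = tokens[0]                                    # "." is the top-level directory
blocks = [t for t in tokens[1:] if re.match(r"[0-9a-f]{32}\+\d+$", t)]
segments = [t for t in tokens[1:] if re.match(r"\d+:\d+:", t)]

print(stream_name)   # .
print(blocks)        # four data block locators: md5 hash plus size hint
print(segments)      # ['0:227212247:var-GS000016015-ASM.tsv.bz2']
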
    + +The command @arv keep get@ fetches the contents of the locator @c1bad4b39ca5a924e481008009d94e32+210@. This is a locator for a collection data block, so it fetches the contents of the collection. In this example, this collection consists of a single file @var-GS000016015-ASM.tsv.bz2@ which is 227212247 bytes long, and is stored using four sequential data blocks, 204e43b8a1185621ca55a94839582e6f+67108864, b9677abbac956bd3e86b1deb28dfac03+67108864, fc15aff2a762b13f521baf042140acec+67108864, 323d2a3ce20370c4ca1d3462a344f8fd+25885655. + +Let's use @arv keep get@ to download the first datablock: + +notextile.
    ~$ cd /scratch/you
    +/scratch/you$ arv keep get 204e43b8a1185621ca55a94839582e6f+67108864 > block1
    + +{% include 'notebox_begin' %} + +When you run this command, you may get this API warning: + +notextile.
    WARNING:root:API lookup failed for collection 204e43b8a1185621ca55a94839582e6f+67108864 (<class 'apiclient.errors.HttpError'>: <HttpError 404 when requesting https://qr1hi.arvadosapi.com/arvados/v1/collections/204e43b8a1185621ca55a94839582e6f%2B67108864?alt=json returned "Not Found">)
    + +This happens because @arv keep get@ tries to find a collection with this identifier. When that fails, it emits this warning, then looks for a datablock instead, which succeeds. + +{% include 'notebox_end' %} + +Let's look at the size and compute the md5 hash of @block1@: + + +
    /scratch/you$ ls -l block1
    +-rw-r--r-- 1 you group 67108864 Dec  9 20:14 block1
    +/scratch/you$ md5sum block1
    +204e43b8a1185621ca55a94839582e6f  block1
    +
    +
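
The same check can be scripted. The following is a minimal Python sketch that recomputes the locator of the block downloaded above; @compute_locator@ is an illustrative helper, not part of the Arvados SDK, and @block1@ is the local copy made with @arv keep get@.

import hashlib

def compute_locator(path):
    # A Keep block locator is the md5 hash of the block's contents,
    # followed by "+" and a size hint (the block's length in bytes).
    md5 = hashlib.md5()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            md5.update(chunk)
            size += len(chunk)
    return "%s+%d" % (md5.hexdigest(), size)

print(compute_locator("block1"))
# expected output: 204e43b8a1185621ca55a94839582e6f+67108864
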
    + +Notice that the block identifer 204e43b8a1185621ca55a94839582e6f+67108864 consists of: +* the md5 hash @204e43b8a1185621ca55a94839582e6f@ which matches the md5 hash of @block1@ +* a size hint @67108864@ which matches the size of @block1@ diff --git a/doc/user/topics/running-pipeline-command-line.html.textile.liquid b/doc/user/topics/running-pipeline-command-line.html.textile.liquid new file mode 100644 index 0000000000..1b8550febc --- /dev/null +++ b/doc/user/topics/running-pipeline-command-line.html.textile.liquid @@ -0,0 +1,119 @@ +--- +layout: default +navsection: userguide +title: "Running a pipeline on the command line" +... + +In "Writing a pipeline":{{ site.baseurl }}/user/tutorials/tutorial-firstscript.html, we learned how to create a pipeline template on the command-line. Let's create one that doesn't require any user input to start: + + +
    ~$ cat >the_pipeline <<EOF
    +{
    +  "name":"Filter md5 hash values",
    +  "components":{
    +    "do_hash":{
    +      "script":"hash.py",
    +      "script_parameters":{
    +        "input": "887cd41e9c613463eab2f0d885c6dd96+83"
    +      },
    +      "script_version":"you:master"
    +    },
    +    "filter":{
    +      "script":"0-filter.py",
    +      "script_parameters":{
    +        "input":{
    +          "output_of":"do_hash"
    +        }
    +      },
    +      "script_version":"you:master"
    +    }
    +  }
    +}
    +EOF
    +~$ arv pipeline_template create --pipeline-template "$(cat the_pipeline)"
    +
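
If @arv pipeline_template create@ rejects the file, a quick way to confirm that @the_pipeline@ is well-formed JSON is a short Python check like the one below. It is purely illustrative and assumes the heredoc above was written to the current directory.

import json

# json.load raises ValueError if the_pipeline is not valid JSON.
with open("the_pipeline") as f:
    template = json.load(f)

print(template["name"])                # Filter md5 hash values
print(sorted(template["components"]))  # ['do_hash', 'filter']
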
    + +You can run this pipeline from the command line using @arv pipeline run@, filling in the UUID that you received from @arv pipeline_template create@: + + +
    ~$ arv pipeline run --template qr1hi-p5p6p-xxxxxxxxxxxxxxx
    +2013-12-16 14:08:40 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    +do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 queued 2013-12-16T14:08:40Z
    +filter  -                           -
    +
    +2013-12-16 14:08:51 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    +do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 8e1b6acdd3f2f1da722538127c5c6202+56
    +filter  qr1hi-8i9sb-w5k40fztqgg9i2x queued 2013-12-16T14:08:50Z
    +
    +2013-12-16 14:09:01 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    +do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 8e1b6acdd3f2f1da722538127c5c6202+56
    +filter  qr1hi-8i9sb-w5k40fztqgg9i2x 735ac35adf430126cf836547731f3af6+56
    +
    +
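
In the template above, the @filter@ component declares its input as the @output_of@ @do_hash@, which is why the log shows @filter@ entering the queue only after @do_hash@ has produced its output. The Python sketch below only restates that dependency-ordering rule; it is not how Arvados schedules jobs (that happens server-side, and this commit adds a Ruby tsort helper for the same idea in Workbench).

# Components copied from the pipeline template above; a nested
# {"output_of": ...} value is what expresses a dependency.
components = {
    "do_hash": {"script_parameters": {"input": "887cd41e9c613463eab2f0d885c6dd96+83"}},
    "filter":  {"script_parameters": {"input": {"output_of": "do_hash"}}},
}

def run_order(components):
    order, seen = [], set()
    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for value in components[name]["script_parameters"].values():
            if isinstance(value, dict) and "output_of" in value:
                visit(value["output_of"])  # dependencies come first
        order.append(name)
    for name in sorted(components):
        visit(name)
    return order

print(run_order(components))   # ['do_hash', 'filter']
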
    + +This instantiates your pipeline and displays a live feed of its status. The new pipeline instance will also show up on the Workbench %(rarr)→% Compute %(rarr)→% Pipeline instances page. + +Arvados adds each pipeline component to the job queue as its dependencies are satisfied (or immediately if it has no dependencies) and finishes when all components are completed or failed and there is no more work left to do. + +The Keep locators of the output of each of @"do_hash"@ and @"filter"@ component are available from the output log shown above. The output is also available on the Workbench by navigating to %(rarr)→% Compute %(rarr)→% Pipeline instances %(rarr)→% pipeline uuid under the *id* column %(rarr)→% components. + + +
    ~$ arv keep get 8e1b6acdd3f2f1da722538127c5c6202+56/md5sum.txt
    +0f1d6bcf55c34bed7f92a805d2d89bbf alice.txt
    +504938460ef369cd275e4ef58994cffe bob.txt
    +8f3b36aff310e06f3c5b9e95678ff77a carol.txt
    +~$ arv keep get 735ac35adf430126cf836547731f3af6+56/0-filter.txt
    +0f1d6bcf55c34bed7f92a805d2d89bbf alice.txt
    +
    +
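+
+The actual @0-filter.py@ crunch script lives in your git repository and runs inside Crunch; purely as an illustration of the rule it applies, the standalone Python sketch below keeps only the lines of an @md5sum.txt@-style listing whose hash starts with 0.
+
+<pre><code>
+def zero_filter(lines):
+    # Keep only the entries whose md5 hash begins with '0'.
+    return [line for line in lines if line.split()[0].startswith('0')]
+
+listing = [
+    '0f1d6bcf55c34bed7f92a805d2d89bbf alice.txt',
+    '504938460ef369cd275e4ef58994cffe bob.txt',
+    '8f3b36aff310e06f3c5b9e95678ff77a carol.txt',
+]
+print('\n'.join(zero_filter(listing)))  # only the alice.txt line remains
+</code></pre>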
    + +Indeed, the filter has picked out just the "alice" file as having a hash that starts with 0. + +h3. Running a pipeline with different parameters + +Notice that the pipeline template explicitly specifies the Keep locator for the input: + + +
    ...
    +    "do_hash":{
    +      "script_parameters":{
    +        "input": "887cd41e9c613463eab2f0d885c6dd96+83"
    +      },
    +    }
    +...
    +
    +
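+
+To run the same template against a different collection, you can override that value on the command line with an argument of the form @component::parameter=value@, as the next example does. The Python sketch below is illustrative only, not the actual @arv@ implementation; it simply shows how such an argument maps onto the template's @script_parameters@.
+
+<pre><code>
+def apply_override(components, override):
+    # Parse 'component::parameter=value' and set it in script_parameters.
+    target, value = override.split('=', 1)
+    component, parameter = target.split('::', 1)
+    components[component]['script_parameters'][parameter] = value
+
+components = {'do_hash': {'script_parameters':
+              {'input': '887cd41e9c613463eab2f0d885c6dd96+83'}}}
+apply_override(components, 'do_hash::input=c1bad4b39ca5a924e481008009d94e32+210')
+print(components['do_hash']['script_parameters']['input'])
+</code></pre>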
    + +You can specify values for pipeline component script_parameters like this: + + +
    ~$ arv pipeline run --template qr1hi-p5p6p-xxxxxxxxxxxxxxx do_hash::input=c1bad4b39ca5a924e481008009d94e32+210
    +2013-12-17 20:31:24 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    +do_hash qr1hi-8i9sb-rffhuay4jryl2n2 queued 2013-12-17T20:31:24Z
    +filter  -                           -
    +
    +2013-12-17 20:31:34 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    +do_hash qr1hi-8i9sb-rffhuay4jryl2n2 {:done=>1, :running=>1, :failed=>0, :todo=>0}
    +filter  -                           -
    +
    +2013-12-17 20:31:55 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    +do_hash qr1hi-8i9sb-rffhuay4jryl2n2 880b55fb4470b148a447ff38cacdd952+54
    +filter  qr1hi-8i9sb-j347g1sqovdh0op queued 2013-12-17T20:31:55Z
    +
    +2013-12-17 20:32:05 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    +do_hash qr1hi-8i9sb-rffhuay4jryl2n2 880b55fb4470b148a447ff38cacdd952+54
    +filter  qr1hi-8i9sb-j347g1sqovdh0op 490cd451c8108824b8a17e3723e1f236+19
    +
    +
    + +Now check the output: + + +
    ~$ arv keep get 880b55fb4470b148a447ff38cacdd952+54/md5sum.txt
    +44b8ae3fde7a8a88d2f7ebd237625b4f var-GS000016015-ASM.tsv.bz2
    +~$ arv keep get 490cd451c8108824b8a17e3723e1f236+19/0-filter.txt
    +~$
    +
    +
+ +Since none of the files in the collection have a hash that starts with 0, the output of the filter component is empty. diff --git a/doc/user/tutorials/tutorial-gatk-variantfiltration.html.textile.liquid b/doc/user/topics/tutorial-gatk-variantfiltration.html.textile.liquid similarity index 98% rename from doc/user/tutorials/tutorial-gatk-variantfiltration.html.textile.liquid rename to doc/user/topics/tutorial-gatk-variantfiltration.html.textile.liquid index 3bf05a5dbd..f83c19934a 100644 --- a/doc/user/tutorials/tutorial-gatk-variantfiltration.html.textile.liquid +++ b/doc/user/topics/tutorial-gatk-variantfiltration.html.textile.liquid @@ -1,20 +1,16 @@ --- layout: default navsection: userguide -navmenu: Tutorials title: "Using GATK with Arvados" - ... -h1. Using GATK with Arvados - This tutorial demonstrates how to use the Genome Analysis Toolkit (GATK) with Arvados. In this example we will install GATK and then create a VariantFiltration job to assign pass/fail scores to variants in a VCF file. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* h2. Installing GATK -Download the GATK binary tarball[1] -- e.g., @GenomeAnalysisTK-2.6-4.tar.bz2@ -- and "copy it to your Arvados VM":tutorial-keep.html. +Download the GATK binary tarball[1] -- e.g., @GenomeAnalysisTK-2.6-4.tar.bz2@ -- and "copy it to your Arvados VM":{{site.baseurl}}/user/tutorials/tutorial-keep.html.
    ~$ arv keep put GenomeAnalysisTK-2.6-4.tar.bz2
    diff --git a/doc/user/tutorials/tutorial-job-debug.html.textile.liquid b/doc/user/topics/tutorial-job-debug.html.textile.liquid
    similarity index 95%
    rename from doc/user/tutorials/tutorial-job-debug.html.textile.liquid
    rename to doc/user/topics/tutorial-job-debug.html.textile.liquid
    index 28052089b3..0974e51697 100644
    --- a/doc/user/tutorials/tutorial-job-debug.html.textile.liquid
    +++ b/doc/user/topics/tutorial-job-debug.html.textile.liquid
    @@ -1,18 +1,14 @@
     ---
     layout: default
     navsection: userguide
    -navmenu: Tutorials
     title: "Debugging a Crunch script"
    -
     ...
     
    -h1. Debugging a Crunch script
    -
 To test changes to a script by running a job, the change must be pushed into @git@, the job queued asynchronously, and the actual execution may be run on any compute server.  As a result, debugging a script can be difficult and time-consuming.  This tutorial demonstrates using @arv-crunch-job@ to run your job in your local VM.  This avoids the job queue and allows you to execute the script from your uncommitted git tree.
     
     *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html*
     
    -This tutorial uses _you_ to denote your username.  Replace _you_ with your user name in all the following examples.
+This tutorial uses *@you@* to denote your username.  Replace *@you@* with your username in all the following examples.
     
     h2. Create a new script
     
    @@ -37,7 +33,7 @@ Instead of a git commit hash, we provide the path to the directory in the "scrip
     
    ~/you/crunch_scripts$ cat >~/the_job <<EOF
     {
      "script":"hello-world.py",
    - "script_version":"/home/you/you",
    + "script_version":"/home/you/you",
      "script_parameters":{}
     }
     EOF
    @@ -46,7 +42,7 @@ EOF
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  node localhost - 1 slots
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  start
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  script hello-world.py
    -2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  script_version /home/you/you
    +2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  script_version /home/you/you
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  script_parameters {}
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  runtime_constraints {"max_tasks_per_node":0}
     2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827  start level 0
    @@ -75,7 +71,7 @@ bc. 2013-12-12_21:36:42 qr1hi-8i9sb-okzukfzkpbrnhst 29827 0 stderr hello world
 The script's output is captured in the log, which is useful for print statement debugging. However, although this script returned a status code of 0 (success), the job failed.  Why?  For a job to complete successfully, scripts must explicitly add their output to Keep, and then tell Arvados about it.  Here is a second try:
     
     
    -
    ~/you/crunch_scripts$ cat >hello-world.py <<EOF
    +
    ~/you/crunch_scripts$ cat >hello-world-fixed.py <<EOF
     #!/usr/bin/env python
     
     import arvados
    @@ -101,7 +97,7 @@ EOF
     ~/you/crunch_scripts$ cat >~/the_job <<EOF
     {
      "script":"hello-world-fixed.py",
    - "script_version":"/home/you/you",
    + "script_version":"/home/you/you",
      "script_parameters":{}
     }
     EOF
    @@ -110,7 +106,7 @@ EOF
     2013-12-12_21:56:59 qr1hi-8i9sb-79260ykfew5trzl 31578  node localhost - 1 slots
     2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  start
     2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  script hello-world.py
    -2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  script_version /home/you/you
    +2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  script_version /home/you/you
     2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  script_parameters {}
     2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  runtime_constraints {"max_tasks_per_node":0}
     2013-12-12_21:57:00 qr1hi-8i9sb-79260ykfew5trzl 31578  start level 0
    @@ -153,4 +149,3 @@ Read and write data to @/tmp/@ instead of Keep. This only works with the Python
     
     notextile. 
    ~$ export KEEP_LOCAL_STORE=/tmp
-Next, "parallel tasks.":tutorial-parallel.html diff --git a/doc/user/tutorials/tutorial-job1.html.textile.liquid b/doc/user/topics/tutorial-job1.html.textile.liquid similarity index 74% rename from doc/user/tutorials/tutorial-job1.html.textile.liquid rename to doc/user/topics/tutorial-job1.html.textile.liquid index a0dd896033..61eaa639a6 100644 --- a/doc/user/tutorials/tutorial-job1.html.textile.liquid +++ b/doc/user/topics/tutorial-job1.html.textile.liquid @@ -1,32 +1,20 @@ --- layout: default navsection: userguide -navmenu: Tutorials -title: "Running a Crunch job" - +title: "Running a Crunch job on the command line" ... -h1. Running a crunch job - -This tutorial introduces the concepts and use of the Crunch job system using the @arv@ command line tool and Arvados Workbench. +This tutorial introduces how to run individual Crunch jobs using the @arv@ command line tool. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* -In "retrieving data using Keep,":tutorial-keep.html we downloaded a file from Keep and did some computation with it (specifically, computing the md5 hash of the complete file). While a straightforward way to accomplish a computational task, there are several obvious drawbacks to this approach: -* Large files require significant time to download. -* Very large files may exceed the scratch space of the local disk. -* We are only able to use the local CPU to process the file. +You will create a job to run the "hash" crunch script. The "hash" script computes the md5 hash of each file in a collection. -The Arvados "Crunch" framework is designed to support processing very large data batches (gigabytes to terabytes) efficiently, and provides the following benefits: -* Increase concurrency by running tasks asynchronously, using many CPUs and network interfaces at once (especially beneficial for CPU-bound and I/O-bound tasks respectively). -* Track inputs, outputs, and settings so you can verify that the inputs, settings, and sequence of programs you used to arrive at an output is really what you think it was. -* Ensure that your programs and workflows are repeatable with different versions of your code, OS updates, etc. -* Interrupt and resume long-running jobs consisting of many short tasks. -* Maintain timing statistics automatically, so they're there when you want them. +h2. Jobs -For your first job, you will run the "hash" crunch script using the Arvados system. The "hash" script computes the md5 hash of each file in a collection. +Crunch pipelines consist of one or more jobs. A "job" is a single run of a specific version of a crunch script with a specific input. You can also run jobs individually. -Crunch jobs are described using JSON objects. For example: +A request to run a crunch job is described using a JSON object. For example: 
    ~$ cat >the_job <<EOF
    @@ -46,7 +34,7 @@ EOF
     * @<the_job@ redirects standard output to a file called @the_job@
     * @"script"@ specifies the name of the script to run.  The script is searched for in the "crunch_scripts/" subdirectory of the @git@ checkout specified by @"script_version"@.
    -* @"script_version"@ specifies the version of the script that you wish to run.  This can be in the form of an explicit @git@ revision hash, or in the form "repository:branch" (in which case it will take the HEAD of the specified branch).  Arvados logs the script version that was used in the run, enabling you to go back and re-run any past job with the guarantee that the exact same code will be used as was used in the previous run.  You can access a list of available @git@ repositories on the Arvados workbench under _Compute %(rarr)→% Code repositories_.
    +* @"script_version"@ specifies the version of the script that you wish to run.  This can be in the form of an explicit @git@ revision hash, or in the form "repository:branch" (in which case it will take the HEAD of the specified branch).  Arvados logs the script version that was used in the run, enabling you to go back and re-run any past job with the guarantee that the exact same code will be used as was used in the previous run.  You can access a list of available @git@ repositories on the Arvados workbench under "Compute %(rarr)→% Code repositories":http://{{site.arvados_workbench_host}}/repositories .
     * @"script_parameters"@ are provided to the script.  In this case, the input is the locator for the collection that we inspected in the previous section.
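+
+Before submitting, it can help to sanity-check the job description against the rules above. The Python sketch below is only an illustration (the regular expression is an assumption made for this example, not Arvados' own validation), and the example field values simply echo this tutorial.
+
+<pre><code>
+import re
+
+def check_job(job):
+    # script_version may be a git revision hash or 'repository:branch'.
+    version_ok = re.match(r'^([0-9a-f]{7,40}|\S+:\S+)$', job.get('script_version', ''))
+    return (bool(job.get('script')) and bool(version_ok)
+            and isinstance(job.get('script_parameters'), dict))
+
+job = {
+    'script': 'hash.py',
+    'script_version': 'you:master',
+    'script_parameters': {'input': 'c1bad4b39ca5a924e481008009d94e32+210'},
+}
+print(check_job(job))  # True
+</code></pre>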
     
     Use @arv job create@ to actually submit the job.  It should print out a JSON object which describes the newly created job:
    @@ -98,7 +86,7 @@ The job is now queued and will start running as soon as it reaches the front of
     
     h2. Monitor job progress
     
    -Go to the Workbench dashboard.  Your job should be at the top of the "Recent jobs" table.  This table refreshes automatically.  When the job has completed successfully, it will show finished in the *Status* column.
    +Go to the "Workbench dashboard":http://{{site.arvados_workbench_host}}.  Your job should be at the top of the "Recent jobs" table.  This table refreshes automatically.  When the job has completed successfully, it will show finished in the *Status* column.
     
     On the command line, you can access log messages while the job runs using @arv job log_tail_follow@:
     
    @@ -108,7 +96,7 @@ This will print out the last several lines of the log for that job.
     
     h2. Inspect the job output
     
    -On the workbench dashboard, look for the *Output* column of the *Recent jobs* table.  Click on the link under *Output* for your job to go to the files page with the job output.  The files page lists all the files that were output by the job.  Click on the link under the *files* column to view a file, or click on the download icon  to download the output file.
    +On the "Workbench dashboard":http://{{site.arvados_workbench_host}}, look for the *Output* column of the *Recent jobs* table.  Click on the link under *Output* for your job to go to the files page with the job output.  The files page lists all the files that were output by the job.  Click on the link under the *files* column to view a file, or click on the download icon  to download the output file.
     
     On the command line, you can use @arv job get@ to access a JSON object describing the output:
     
    @@ -137,7 +125,7 @@ On the command line, you can use @arv job get@ to access a JSON object describin
      "cancelled_by_user_uuid":null,
      "started_at":"2013-12-16T20:44:36Z",
      "finished_at":"2013-12-16T20:44:53Z",
    - "output":"880b55fb4470b148a447ff38cacdd952+54",
    + "output":"dd755dbc8d49a67f4fe7dc843e4f10a6+54",
      "success":true,
      "running":false,
      "is_locked_by_uuid":"qr1hi-tpzed-9zdpkpni2yddge6",
    @@ -157,12 +145,12 @@ On the command line, you can use @arv job get@ to access a JSON object describin
     
    -* @"output"@ is the unique identifier for this specific job's output. This is a Keep collection. Because the output of Arvados jobs should be deterministic, the known expected output is 880b55fb4470b148a447ff38cacdd952+54. +* @"output"@ is the unique identifier for this specific job's output. This is a Keep collection. Because the output of Arvados jobs should be deterministic, the known expected output is dd755dbc8d49a67f4fe7dc843e4f10a6+54. Now you can list the files in the collection: -
    ~$ arv keep ls 880b55fb4470b148a447ff38cacdd952+54
    +
    ~$ arv keep ls dd755dbc8d49a67f4fe7dc843e4f10a6+54
     md5sum.txt
     
    @@ -170,8 +158,8 @@ md5sum.txt This collection consists of the @md5sum.txt@ file. Use @arv keep get@ to show the contents of the @md5sum.txt@ file: -
    ~$ arv keep get 880b55fb4470b148a447ff38cacdd952+54/md5sum.txt
    -44b8ae3fde7a8a88d2f7ebd237625b4f var-GS000016015-ASM.tsv.bz2
    +
    ~$ arv keep get dd755dbc8d49a67f4fe7dc843e4f10a6+54/md5sum.txt
    +44b8ae3fde7a8a88d2f7ebd237625b4f ./var-GS000016015-ASM.tsv.bz2
     
    @@ -221,15 +209,13 @@ The log collection consists of one log file named with the job id. You can acce 2013-12-16_20:44:39 qr1hi-8i9sb-1pm1t02dezhupss 7575 status: 1 done, 1 running, 0 todo 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 1 child 7716 on compute13.1 exit 0 signal 0 success=true 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 1 success in 13 seconds -2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 1 output 880b55fb4470b148a447ff38cacdd952+54 +2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 1 output dd755dbc8d49a67f4fe7dc843e4f10a6+54 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 wait for last 0 children to finish 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 status: 2 done, 0 running, 0 todo 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 release job allocation 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 Freeze not implemented 2013-12-16_20:44:52 qr1hi-8i9sb-1pm1t02dezhupss 7575 collate -2013-12-16_20:44:53 qr1hi-8i9sb-1pm1t02dezhupss 7575 output 880b55fb4470b148a447ff38cacdd952+54 +2013-12-16_20:44:53 qr1hi-8i9sb-1pm1t02dezhupss 7575 output dd755dbc8d49a67f4fe7dc843e4f10a6+54+K@qr1hi 2013-12-16_20:44:53 qr1hi-8i9sb-1pm1t02dezhupss 7575 finish
- -This concludes the first tutorial. In the next tutorial, we will "write a script to compute the hash.":tutorial-firstscript.html diff --git a/doc/user/tutorials/tutorial-parallel.html.textile.liquid b/doc/user/topics/tutorial-parallel.html.textile.liquid similarity index 71% rename from doc/user/tutorials/tutorial-parallel.html.textile.liquid rename to doc/user/topics/tutorial-parallel.html.textile.liquid index be78506f5d..9d08b6988b 100644 --- a/doc/user/tutorials/tutorial-parallel.html.textile.liquid +++ b/doc/user/topics/tutorial-parallel.html.textile.liquid @@ -1,14 +1,10 @@ --- layout: default navsection: userguide -navmenu: Tutorials title: "Parallel Crunch tasks" - ... -h1. Parallel Crunch tasks - -In the tutorial "writing a crunch script,":tutorial-firstscript.html our script used a "for" loop to compute the md5 hashes for each file in sequence. This approach, while simple, is not able to take advantage of the compute cluster with multiple nodes and cores to speed up computation by running tasks in parallel. This tutorial will demonstrate how to create parallel Crunch tasks. +In the previous tutorials, we used @arvados.job_setup.one_task_per_input_file()@ to automatically parallelize our jobs by creating a separate task per file. For some types of jobs, you may need to split the work up differently, for example creating tasks to process different segments of a single large file. This tutorial will demonstrate how to create Crunch tasks directly. Start by entering the @crunch_scripts@ directory of your git repository: @@ -23,7 +19,7 @@ notextile.
    ~/you/crunch_scripts$ nano parall
     
     Add the following code to compute the md5 hash of each file in a 
     
    -
    {% include 'parallel_hash_script_py' %}
    + {% code 'parallel_hash_script_py' as python %} Make the file executable: @@ -38,13 +34,13 @@ Next, add the file to @git@ staging, commit and push:
    -You should now be able to run your new script using Crunch, with "script" referring to our new "parallel-hash.py" script. We will use a different input from our previous examples. We will use @887cd41e9c613463eab2f0d885c6dd96+83@ which consists of three files, "alice.txt", "bob.txt" and "carol.txt" (the example collection used previously in "fetching data from Arvados using Keep":tutorial-keep.html). +You should now be able to run your new script using Crunch, with "script" referring to our new "parallel-hash.py" script. We will use a different input from our previous examples. We will use @887cd41e9c613463eab2f0d885c6dd96+83@ which consists of three files, "alice.txt", "bob.txt" and "carol.txt" (the example collection used previously in "fetching data from Arvados using Keep":{{site.baseurl}}/user/tutorials/tutorial-keep.html#dir).
    ~/you/crunch_scripts$ cat >~/the_job <<EOF
     {
      "script": "parallel-hash.py",
    - "script_version": "you:master",
    + "script_version": "you:master",
      "script_parameters":
      {
       "input": "887cd41e9c613463eab2f0d885c6dd96+83"
    @@ -69,7 +65,7 @@ EOF
 Because the job ran in parallel, each instance of parallel-hash creates a separate @md5sum.txt@ as output.  Arvados automatically collates these files into a single collection, which is the output of the job:
     
     
    -
    ~/you/crunch_scripts$ arv keep get e2ccd204bca37c77c0ba59fc470cd0f7+162
    +
    ~/you/crunch_scripts$ arv keep ls e2ccd204bca37c77c0ba59fc470cd0f7+162
     md5sum.txt
     md5sum.txt
     md5sum.txt
    @@ -80,9 +76,4 @@ md5sum.txt
     
    -h2. The one job per file pattern - -This example demonstrates how to schedule a new task per file. Because this is a common pattern, the Crunch Python API contains a convenience function to "queue a task for each input file":{{site.baseurl}}/sdk/python/crunch-utility-libraries.html#one_task_per_input which reduces the amount of boilerplate code required to handle parallel jobs. - -Next, "Constructing a Crunch pipeline":tutorial-new-pipeline.html diff --git a/doc/user/tutorials/tutorial-trait-search.html.textile.liquid b/doc/user/topics/tutorial-trait-search.html.textile.liquid similarity index 99% rename from doc/user/tutorials/tutorial-trait-search.html.textile.liquid rename to doc/user/topics/tutorial-trait-search.html.textile.liquid index 6402c7e1d3..001fbbc082 100644 --- a/doc/user/tutorials/tutorial-trait-search.html.textile.liquid +++ b/doc/user/topics/tutorial-trait-search.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Tutorials title: "Querying the Metadata Database" - ... -h1. Querying the Metadata Database - This tutorial introduces the Arvados Metadata Database. The Metadata Database stores information about files in Keep. This example will use the Python SDK to find public WGS (Whole Genome Sequencing) data for people who have reported a certain medical condition. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* diff --git a/doc/user/tutorials/intro-crunch.html.textile.liquid b/doc/user/tutorials/intro-crunch.html.textile.liquid new file mode 100644 index 0000000000..46b4d6c754 --- /dev/null +++ b/doc/user/tutorials/intro-crunch.html.textile.liquid @@ -0,0 +1,17 @@ +--- +layout: default +navsection: userguide +title: Introduction to Crunch +... + +In "getting data from Keep,":tutorial-keep.html#arv-get we downloaded a file from Keep and did some computation with it (specifically, computing the md5 hash of the complete file). While a straightforward way to accomplish a computational task, there are several obvious drawbacks to this approach: +* Large files require significant time to download. +* Very large files may exceed the scratch space of the local disk. +* We are only able to use the local CPU to process the file. + +The Arvados "Crunch" framework is designed to support processing very large data batches (gigabytes to terabytes) efficiently, and provides the following benefits: +* Increase concurrency by running tasks asynchronously, using many CPUs and network interfaces at once (especially beneficial for CPU-bound and I/O-bound tasks respectively). +* Track inputs, outputs, and settings so you can verify that the inputs, settings, and sequence of programs you used to arrive at an output is really what you think it was. +* Ensure that your programs and workflows are repeatable with different versions of your code, OS updates, etc. +* Interrupt and resume long-running jobs consisting of many short tasks. +* Maintain timing statistics automatically, so they're there when you want them. 
diff --git a/doc/user/tutorials/running-external-program.html.textile.liquid b/doc/user/tutorials/running-external-program.html.textile.liquid index 7b31e17818..b555d77fe4 100644 --- a/doc/user/tutorials/running-external-program.html.textile.liquid +++ b/doc/user/tutorials/running-external-program.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Tutorials -title: "Running external programs" - +title: "Using Crunch to run external programs" ... -h1. Running external programs - This tutorial demonstrates how to use Crunch to run an external program by writing a wrapper using the Python SDK. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* @@ -27,7 +23,7 @@ notextile.
    ~/you/crunch_scripts$ nano run-md
     
     Add the following code to use the @md5sum@ program to compute the hash of each file in a collection:
     
    -
    {% include 'run_md5sum_py' %}
    + {% code 'run_md5sum_py' as python %} Make the file executable: @@ -45,27 +41,25 @@ Next, add the file to @git@ staging, commit and push: You should now be able to run your new script using Crunch, with "script" referring to our new "run-md5sum.py" script. -
    ~/you/crunch_scripts$ cat >~/the_job <<EOF
    -{
    - "script": "run-md5sum.py",
    - "script_version": "you:master",
    - "script_parameters":
    - {
    -  "input": "c1bad4b39ca5a924e481008009d94e32+210"
    - }
    -}
    -EOF
    -~/you/crunch_scripts$ arv job create --job "$(cat the_job)"
    +
    ~/you/crunch_scripts$ cat >~/the_pipeline <<EOF
     {
    - ...
    - "uuid":"qr1hi-xxxxx-xxxxxxxxxxxxxxx"
    - ...
    -}
    -~/you/crunch_scripts$ arv job get --uuid qr1hi-xxxxx-xxxxxxxxxxxxxxx
    -{
    - ...
    - "output":"4d164b1658c261b9afc6b479130016a3+54",
    - ...
    +  "name":"Run external md5sum program",
    +  "components":{
    +    "do_hash":{
    +      "script":"run-md5sum.py",
    +      "script_parameters":{
    +        "input":{
    +          "required": true,
    +          "dataclass": "Collection"
    +        }
    +      },
    +      "script_version":"you:master"
    +    }
    +  }
     }
    +EOF
    +~/you/crunch_scripts$ arv pipeline_template create --pipeline-template "$(cat ~/the_pipeline)"
     
    + +Your new pipeline template will appear on the "Workbench %(rarr)→% Compute %(rarr)→% Pipeline templates":http://{{ site.arvados_workbench_host }}/pipeline_instances page. You can run the "pipeline using workbench":tutorial-pipeline-workbench.html diff --git a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid index 130f591176..245e89066b 100644 --- a/doc/user/tutorials/tutorial-firstscript.html.textile.liquid +++ b/doc/user/tutorials/tutorial-firstscript.html.textile.liquid @@ -2,12 +2,9 @@ layout: default navsection: userguide navmenu: Tutorials -title: "Writing a Crunch script" - +title: "Writing a pipeline" ... -h1. Writing a Crunch script - In this tutorial, we will write the "hash" script demonstrated in the first tutorial. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* @@ -25,7 +22,7 @@ First, you should do some basic configuration for git (you only need to do this ~$ git config --global user.email you@example.com
    -On the Arvados Workbench, navigate to _Compute %(rarr)→% Code repositories._ You should see two repositories, one named "arvados" (under the *name* column) and a second with your user name. Next to *name* is the column *push_url*. Copy the *push_url* cell associated with your repository. This should look like git@git.{{ site.arvados_api_host }}:you.git. +On the Arvados Workbench, navigate to "Compute %(rarr)→% Code repositories":http://{{site.arvados_workbench_host}}/repositories . You should see a repository with your user name listed in the *name* column. Next to *name* is the column *push_url*. Copy the *push_url* value associated with your repository. This should look like git@git.{{ site.arvados_api_host }}:you.git. Next, on the Arvados virtual machine, clone your git repository: @@ -60,14 +57,14 @@ notextile.
    ~/you/crunch_scripts$ nano hash.p
     
     Add the following code to compute the md5 hash of each file in a collection:
     
    -
    {% include 'tutorial_hash_script_py' %}
    + {% code 'tutorial_hash_script_py' as python %} Make the file executable: notextile.
    ~/you/crunch_scripts$ chmod +x hash.py
    {% include 'notebox_begin' %} -The below steps describe how to execute the script after committing changes to git. To test the script locally, please see the "debugging a crunch script":tutorial-job-debug.html page. +The steps below describe how to execute the script after committing changes to git. To run a script locally for testing, please see "debugging a crunch script":{{site.baseurl}}/user/topics/tutorial-job-debug.html . {% include 'notebox_end' %} @@ -96,34 +93,47 @@ To git@git.qr1hi.arvadosapi.com:you.git * [new branch] master -> master
    -You should now be able to run your script using Crunch, similar to how we did it in the "first tutorial.":tutorial-job1.html The field @"script_version"@ should be @you:master@ to tell Crunch to run the script at the head of the "master" git branch, which you just uploaded. +h2. Create a pipeline template + +Next, create a file that contains the pipeline definition: -
    ~/you/crunch_scripts$ cat >~/the_job <<EOF
    -{
    - "script": "hash.py",
    - "script_version": "you:master",
    - "script_parameters":
    - {
    -  "input": "c1bad4b39ca5a924e481008009d94e32+210"
    - }
    -}
    -EOF
    -~/you/crunch_scripts$ arv job create --job "$(cat ~/the_job)"
    -{
    - ...
    - "uuid":"qr1hi-xxxxx-xxxxxxxxxxxxxxx"
    - ...
    -}
    -~/you/crunch_scripts$ arv job get --uuid qr1hi-xxxxx-xxxxxxxxxxxxxxx
    +
    ~/you/crunch_scripts$ cd ~
    +~$ cat >the_pipeline <<EOF
     {
    - ...
    - "output":"880b55fb4470b148a447ff38cacdd952+54",
    - ...
    +  "name":"My first pipeline",
    +  "components":{
    +    "do_hash":{
    +      "script":"hash.py",
    +      "script_parameters":{
    +        "input":{
    +          "required": true,
    +          "dataclass": "Collection"
    +        }
    +      },
    +      "script_version":"you:master"
    +    }
    +  }
     }
    -~/you/crunch_scripts$ arv keep get 880b55fb4470b148a447ff38cacdd952+54/md5sum.txt
    -44b8ae3fde7a8a88d2f7ebd237625b4f var-GS000016015-ASM.tsv.bz2
    +EOF
    +
    + + +* @cat@ is a standard Unix utility that simply copies standard input to standard output +* @<the_pipeline@ redirects standard output to a file called @the_pipeline@ +* @"name"@ is a human-readable name for the pipeline +* @"components"@ is a set of scripts that make up the pipeline +* The component is listed with a human-readable name (@"do_hash"@ in this example) +* @"script"@ specifies the name of the script to run. The script is searched for in the "crunch_scripts/" subdirectory of the @git@ checkout specified by @"script_version"@. +* @"script_version"@ specifies the version of the script that you wish to run. This can be in the form of an explicit @git@ revision hash, or in the form "repository:branch" (in which case it will take the HEAD of the specified branch). Arvados logs the script version that was used in the run, enabling you to go back and re-run any past job with the guarantee that the exact same code will be used as was used in the previous run. You can access a list of available @git@ repositories on the Arvados workbench under "Compute %(rarr)→% Code repositories":http://{{site.arvados_workbench_host}}//repositories . +* @"script_parameters"@ describes the parameters for the script. In this example, there is one parameter called @input@ which is @required@ and is a @Collection@. + +Now, use @arv pipeline_template create@ tell Arvados about your pipeline template: + + +
    ~$ arv pipeline_template create --pipeline-template "$(cat the_pipeline)"
     
    -Next, "debugging a crunch script.":tutorial-job-debug.html +Your new pipeline template will appear on the "Workbench %(rarr)→% Compute %(rarr)→% Pipeline templates":http://{{ site.arvados_workbench_host }}/pipeline_instances page. You can run the "pipeline using workbench":tutorial-pipeline-workbench.html diff --git a/doc/user/tutorials/tutorial-keep.html.textile.liquid b/doc/user/tutorials/tutorial-keep.html.textile.liquid index 8196363864..6a797c001a 100644 --- a/doc/user/tutorials/tutorial-keep.html.textile.liquid +++ b/doc/user/tutorials/tutorial-keep.html.textile.liquid @@ -1,13 +1,9 @@ --- layout: default navsection: userguide -navmenu: Tutorials -title: "Storing and Retrieving data using Arvados Keep" - +title: "Storing and Retrieving data using Keep" ... -h1. Storing and Retrieving data using Arvados Keep - This tutorial introduces you to the Arvados file storage system. @@ -59,7 +55,7 @@ c1bad4b39ca5a924e481008009d94e32+210 The output value @c1bad4b39ca5a924e481008009d94e32+210@ from @arv keep put@ is the Keep locator. This enables you to access the file you just uploaded, and is explained in the next section. -h2. Putting a directory +h2(#dir). Putting a directory You can also use @arv keep put@ to add an entire directory: @@ -74,47 +70,56 @@ You can also use @arv keep put@ to add an entire directory:
    +The locator @887cd41e9c613463eab2f0d885c6dd96+83@ represents a collection with multiple files. + h1. Getting Data from Keep -In Keep, information is stored in *data blocks*. Data blocks are normally between 1 byte and 64 megabytes in size. If a file exceeds the maximum size of a single data block, the file will be split across multiple data blocks until the entire file can be stored. These data blocks may be stored and replicated across multiple disks, servers, or clusters. Each data block has its own identifier for the contents of that specific data block. +h2. Using Workbench -In order to reassemble the file, Keep stores a *collection* data block which lists in sequence the data blocks that make up the original file. A collection data block may store the information for multiple files, including a directory structure. +You may access collections through the "Collections section of Arvados Workbench":https://{{ site.arvados_workbench_host }}/collections located at "https://{{ site.arvados_workbench_host }}/collections":https://{{ site.arvados_workbench_host }}/collections . You can also access individual collections and individual files within a collection. Some examples: -In this example we will use @c1bad4b39ca5a924e481008009d94e32+210@ which we added to keep in the previous section. First let us examine the contents of this collection using @arv keep get@: +* "https://{{ site.arvados_workbench_host }}/collections/c1bad4b39ca5a924e481008009d94e32+210":https://{{ site.arvados_workbench_host }}/collections/c1bad4b39ca5a924e481008009d94e32+210 +* "https://{{ site.arvados_workbench_host }}/collections/887cd41e9c613463eab2f0d885c6dd96+83/alice.txt":https://{{ site.arvados_workbench_host }}/collections/887cd41e9c613463eab2f0d885c6dd96+83/alice.txt + +h2(#arv-get). Using arv-get + +You can view the contents of a collection using @arv keep ls@: -
    /scratch/you$ arv keep get c1bad4b39ca5a924e481008009d94e32+210
    -. 204e43b8a1185621ca55a94839582e6f+67108864 b9677abbac956bd3e86b1deb28dfac03+67108864 fc15aff2a762b13f521baf042140acec+67108864 323d2a3ce20370c4ca1d3462a344f8fd+25885655 0:227212247:var-GS000016015-ASM.tsv.bz2
    +
    /scratch/you$ arv keep ls c1bad4b39ca5a924e481008009d94e32+210
    +var-GS000016015-ASM.tsv.bz2
     
    - -The command @arv keep get@ fetches the contents of the locator @c1bad4b39ca5a924e481008009d94e32+210@. This is a locator for a collection data block, so it fetches the contents of the collection. In this example, this collection consists of a single file @var-GS000016015-ASM.tsv.bz2@ which is 227212247 bytes long, and is stored using four sequential data blocks, 204e43b8a1185621ca55a94839582e6f+67108864, b9677abbac956bd3e86b1deb28dfac03+67108864, fc15aff2a762b13f521baf042140acec+67108864, 323d2a3ce20370c4ca1d3462a344f8fd+25885655. +
    /scratch/you$ arv keep ls 887cd41e9c613463eab2f0d885c6dd96+83
    +alice.txt
    +bob.txt
    +carol.txt
    +
    + -Let's use @arv keep get@ to download the first datablock: +Use @-s@ to print file sizes rounded up to the nearest kilobyte: -notextile.
    /scratch/you$ arv keep get 204e43b8a1185621ca55a94839582e6f+67108864 > block1
    + +
    /scratch/you$ arv keep ls -s c1bad4b39ca5a924e481008009d94e32+210
    +221887 var-GS000016015-ASM.tsv.bz2
    +
    +
    -Let's look at the size and compute the md5 hash of @block1@: +Use @arv keep get@ to download the contents of a collection and place it in the directory specified in the second argument (in this example, @.@ for the current directory): -
    /scratch/you$ ls -l block1
    --rw-r--r-- 1 you group 67108864 Dec  9 20:14 block1
    -/scratch/you$ md5sum block1
    -204e43b8a1185621ca55a94839582e6f  block1
    +
    /scratch/you$ arv keep get c1bad4b39ca5a924e481008009d94e32+210/ .
     
    -Notice that the block identifer 204e43b8a1185621ca55a94839582e6f+67108864 of: -* the md5 hash @204e43b8a1185621ca55a94839582e6f@ which matches the md5 hash of @block1@ -* a size hint @67108864@ which matches the size of @block1@ - -Next, let's use @arv keep get@ to download and reassemble @var-GS000016015-ASM.tsv.bz2@ using the following command: +You can also download indvidual files: -
    /scratch/you$ arv keep get c1bad4b39ca5a924e481008009d94e32+210/var-GS000016015-ASM.tsv.bz2 .
    +
    /scratch/you$ arv keep get 887cd41e9c613463eab2f0d885c6dd96+83/alice.txt .
     
    + -This downloads the file @var-GS000016015-ASM.tsv.bz2@ described by collection @c1bad4b39ca5a924e481008009d94e32+210@ from Keep and places it into the local directory. Now that we have the file, we can compute the md5 hash of the complete file: +With a local copy of the file, we can do some computation, for example computing the md5 hash of the complete file:
    /scratch/you$ md5sum var-GS000016015-ASM.tsv.bz2
    @@ -122,22 +127,40 @@ This downloads the file @var-GS000016015-ASM.tsv.bz2@ described by collection @c
     
    -h2. Accessing Collections +h2. Using arv-mount -There are a couple of other ways to access a collection. You may view the contents of a collection using @arv keep ls@: +Use @arv-mount@ to take advantage of the "File System in User Space / FUSE":http://fuse.sourceforge.net/ feature of the Linux kernel to mount a Keep collection as if it were a regular directory tree. -
    /scratch/you$ arv keep ls c1bad4b39ca5a924e481008009d94e32+210
    +
    /scratch/you$ mkdir mnt
    +/scratch/you$ arv-mount --collection c1bad4b39ca5a924e481008009d94e32+210 mnt &
    +/scratch/you$ cd mnt
    +/scratch/you/mnt$ ls
     var-GS000016015-ASM.tsv.bz2
    -/scratch/you$ arv keep ls -s c1bad4b39ca5a924e481008009d94e32+210
    -221887 var-GS000016015-ASM.tsv.bz2
    +/scratch/you/mnt$ md5sum var-GS000016015-ASM.tsv.bz2
    +44b8ae3fde7a8a88d2f7ebd237625b4f  var-GS000016015-ASM.tsv.bz2
    +/scratch/you/mnt$ cd ..
    +/scratch/you$ fusermount -u mnt
     
    -* @-s@ prints file sizes in kilobytes +You can also mount the entire Keep namespace in "magic directory" mode: -You may also access through the Arvados Workbench using a URI similar to this, where the last part of the path is the Keep locator: + +
    /scratch/you$ mkdir mnt
    +/scratch/you$ arv-mount mnt &
    +/scratch/you$ cd mnt/c1bad4b39ca5a924e481008009d94e32+210
    +/scratch/you/mnt/c1bad4b39ca5a924e481008009d94e32+210$ ls
    +var-GS000016015-ASM.tsv.bz2
    +/scratch/you/mnt/c1bad4b39ca5a924e481008009d94e32+210$ md5sum var-GS000016015-ASM.tsv.bz2
    +44b8ae3fde7a8a88d2f7ebd237625b4f  var-GS000016015-ASM.tsv.bz2
    +/scratch/you/mnt/c1bad4b39ca5a924e481008009d94e32+210$ cd ../..
    +/scratch/you$ fusermount -u mnt
    +
    +
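+
+Because the mount behaves like an ordinary directory tree, existing code can read Keep data without any Arvados-specific calls. For example, while the collection is still mounted, a short Python sketch (standard library only) can hash the file through the magic-directory path shown above:
+
+<pre><code>
+import hashlib
+
+# Any program that can read a file can read it from the mount point.
+path = 'mnt/c1bad4b39ca5a924e481008009d94e32+210/var-GS000016015-ASM.tsv.bz2'
+md5 = hashlib.md5()
+with open(path, 'rb') as f:
+    for chunk in iter(lambda: f.read(1 << 20), b''):
+        md5.update(chunk)
+print(md5.hexdigest())  # 44b8ae3fde7a8a88d2f7ebd237625b4f
+</code></pre>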
    -"https://workbench.{{ site.arvados_api_host }}/collections/c1bad4b39ca5a924e481008009d94e32+210":https://workbench.{{ site.arvados_api_host }}/collections/c1bad4b39ca5a924e481008009d94e32+210 +Using @arv-mount@ has several significant benefits: -You are now ready to proceed to the next tutorial, "running a crunch job.":tutorial-job1.html +* You can browse, open and read Keep entries as if they are regular files. +* It is easy for existing tools to access files in Keep. +* Data is downloaded on demand, it is not necessary to download an entire file or collection to start processing diff --git a/doc/user/tutorials/tutorial-new-pipeline.html.textile.liquid b/doc/user/tutorials/tutorial-new-pipeline.html.textile.liquid index d128b4b195..b09e624473 100644 --- a/doc/user/tutorials/tutorial-new-pipeline.html.textile.liquid +++ b/doc/user/tutorials/tutorial-new-pipeline.html.textile.liquid @@ -1,29 +1,28 @@ --- layout: default navsection: userguide -navmenu: Tutorials -title: "Constructing a Crunch pipeline" - +title: "Writing a multi-step pipeline" ... -h1. Constructing a Crunch pipeline - A pipeline in Arvados is a collection of crunch scripts, in which the output from one script may be used as the input to another script. *This tutorial assumes that you are "logged into an Arvados VM instance":{{site.baseurl}}/user/getting_started/ssh-access.html#login, and have a "working environment.":{{site.baseurl}}/user/getting_started/check-environment.html* +This tutorial uses *@you@* to denote your username. Replace *@you@* with your user name in all the following examples. + h2. Create a new script -Our second script will filter the output of @parallel_hash.py@ and only include hashes that start with 0. Create a new script in @crunch_scripts/@ called @0-filter.py@: +Our second script will filter the output of @hash.py@ and only include hashes that start with 0. Create a new script in ~/you/crunch_scripts/ called @0-filter.py@: -
    {% include '0_filter_py' %}
    + {% code '0_filter_py' as python %} Now add it to git: -
    $ git add 0-filter.py
    -$ git commit -m"zero filter"
    -$ git push origin master
    +
    ~/you/crunch_scripts$ chmod +x 0-filter.py
    +~/you/crunch_scripts$ git add 0-filter.py
    +~/you/crunch_scripts$ git commit -m"zero filter"
    +~/you/crunch_scripts$ git push origin master
     
    @@ -32,16 +31,19 @@ h2. Create a pipeline template Next, create a file that contains the pipeline definition: -
    $ cat >the_pipeline <<EOF
    +
    ~/you/crunch_scripts$ cat >~/the_pipeline <<EOF
     {
    -  "name":"my_first_pipeline",
    +  "name":"Filter md5 hash values",
       "components":{
         "do_hash":{
    -      "script":"parallel-hash.py",
    +      "script":"hash.py",
           "script_parameters":{
    -        "input": "887cd41e9c613463eab2f0d885c6dd96+83"
    +        "input":{
    +          "required": true,
    +          "dataclass": "Collection"
    +        }
           },
    -      "script_version":"you:master"
    +      "script_version":"you:master"
         },
         "filter":{
           "script":"0-filter.py",
    @@ -50,109 +52,22 @@ Next, create a file that contains the pipeline definition:
               "output_of":"do_hash"
             }
           },
    -      "script_version":"you:master"
    +      "script_version":"you:master"
         }
       }
     }
     EOF
    -
    +
    -* @"name"@ is a human-readable name for the pipeline -* @"components"@ is a set of scripts that make up the pipeline -* Each component is listed with a human-readable name (@"do_hash"@ and @"filter"@ in this example) -* Each item in @"components"@ is a single Arvados job, and uses the same format that we saw previously with @arv job create@ -* @"output_of"@ indicates that the @"input"@ of @"filter"@ is the @"output"@ of the @"do_hash"@ component. This is a _dependency_. Arvados uses the dependencies between jobs to automatically determine the correct order to run the jobs. +* @"output_of"@ indicates that the @input@ of the @do_hash@ component is connected to the @output@ of @filter@. This is a _dependency_. Arvados uses the dependencies between jobs to automatically determine the correct order to run the jobs. Now, use @arv pipeline_template create@ tell Arvados about your pipeline template: -
    $ arv pipeline_template create --pipeline-template "$(cat the_pipeline)"
    -qr1hi-p5p6p-xxxxxxxxxxxxxxx
    +
    ~/you/crunch_scripts$ arv pipeline_template create --pipeline-template "$(cat ~/the_pipeline)"
     
    -Your new pipeline template will appear on the Workbench %(rarr)→% Compute %(rarr)→% Pipeline templates page. - -h3. Running a pipeline - -Run the pipeline using @arv pipeline run@, using the UUID that you received from @arv pipeline create@: - - -
    $ arv pipeline run --template qr1hi-p5p6p-xxxxxxxxxxxxxxx
    -2013-12-16 14:08:40 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    -do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 queued 2013-12-16T14:08:40Z
    -filter  -                           -
    -2013-12-16 14:08:51 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    -do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 e2ccd204bca37c77c0ba59fc470cd0f7+162
    -filter  qr1hi-8i9sb-w5k40fztqgg9i2x queued 2013-12-16T14:08:50Z
    -2013-12-16 14:09:01 +0000 -- pipeline_instance qr1hi-d1hrv-vxzkp38nlde9yyr
    -do_hash qr1hi-8i9sb-hoyc2u964ecv1s6 e2ccd204bca37c77c0ba59fc470cd0f7+162
    -filter  qr1hi-8i9sb-w5k40fztqgg9i2x 735ac35adf430126cf836547731f3af6+56
    -
    -
    - -This instantiates your pipeline and displays a live feed of its status. The new pipeline instance will also show up on the Workbench %(rarr)→% Compute %(rarr)→% Pipeline instances page. - -Arvados adds each pipeline component to the job queue as its dependencies are satisfied (or immediately if it has no dependencies) and finishes when all components are completed or failed and there is no more work left to do. - -The Keep locators of the output of each of @"do_hash"@ and @"filter"@ component are available from the output log shown above. The output is also available on the Workbench by navigating to %(rarr)→% Compute %(rarr)→% Pipeline instances %(rarr)→% pipeline uuid under the *id* column %(rarr)→% components. - - -
    $ arv keep get e2ccd204bca37c77c0ba59fc470cd0f7+162/md5sum.txt
    -0f1d6bcf55c34bed7f92a805d2d89bbf alice.txt
    -504938460ef369cd275e4ef58994cffe bob.txt
    -8f3b36aff310e06f3c5b9e95678ff77a carol.txt
    -$ arv keep get 735ac35adf430126cf836547731f3af6+56
    -0f1d6bcf55c34bed7f92a805d2d89bbf alice.txt
    -
    -
    - -Indeed, the filter has picked out just the "alice" file as having a hash that starts with 0. - -h3. Running a pipeline with different parameters - -Notice that the pipeline definition explicitly specifies the Keep locator for the input: - - -
    ...
    -    "do_hash":{
    -      "script_parameters":{
    -        "input": "887cd41e9c613463eab2f0d885c6dd96+83"
    -      },
    -    }
    -...
    -
    -
    - -What if we want to run the pipeline on a different input block? One option is to define a new pipeline template, but would potentially result in clutter with many pipeline templates defined for one-off jobs. Instead, you can override values in the input of the component like this: - - -
    $ arv pipeline run --template qr1hi-d1hrv-vxzkp38nlde9yyr do_hash::input=33a9f3842b01ea3fdf27cc582f5ea2af+242
    -2013-12-17 20:31:24 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    -do_hash qr1hi-8i9sb-rffhuay4jryl2n2 queued 2013-12-17T20:31:24Z
    -filter  -                           -
    -2013-12-17 20:31:34 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    -do_hash qr1hi-8i9sb-rffhuay4jryl2n2 {:done=>1, :running=>1, :failed=>0, :todo=>0}
    -filter  -                           -
    -2013-12-17 20:31:44 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    -do_hash qr1hi-8i9sb-rffhuay4jryl2n2 {:done=>1, :running=>1, :failed=>0, :todo=>0}
    -filter  -                           -
    -2013-12-17 20:31:55 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    -do_hash qr1hi-8i9sb-rffhuay4jryl2n2 880b55fb4470b148a447ff38cacdd952+54
    -filter  qr1hi-8i9sb-j347g1sqovdh0op queued 2013-12-17T20:31:55Z
    -2013-12-17 20:32:05 +0000 -- pipeline_instance qr1hi-d1hrv-tlkq20687akys8e
    -do_hash qr1hi-8i9sb-rffhuay4jryl2n2 880b55fb4470b148a447ff38cacdd952+54
    -filter  qr1hi-8i9sb-j347g1sqovdh0op fb728f0ffe152058fa64b9aeed344cb5+54
    -
    -
    - -Now check the output: - - -
    $ arv keep ls -s fb728f0ffe152058fa64b9aeed344cb5+54
    -0 0-filter.txt
    -
    -
    +Your new pipeline template will appear on the "Workbench %(rarr)→% Compute %(rarr)→% Pipeline templates":http://{{ site.arvados_workbench_host }}/pipeline_instances page. -Here the filter script output is empty, so none of the files in the collection have hash code that start with 0. diff --git a/doc/user/tutorials/tutorial-pipeline-workbench.html.textile.liquid b/doc/user/tutorials/tutorial-pipeline-workbench.html.textile.liquid new file mode 100644 index 0000000000..52dafb7cce --- /dev/null +++ b/doc/user/tutorials/tutorial-pipeline-workbench.html.textile.liquid @@ -0,0 +1,25 @@ +--- +layout: default +navsection: userguide +title: "Running a pipeline using Workbench" +... + +notextile.
+ +# Go to "Collections":http://{{ site.arvados_workbench_host }}/collections . +# On the collections page, go to the search box and search for "tutorial". +# This should yield a collection with the contents "var-GS000016015-ASM.tsv.bz2". +# Click on the check box to the left of "var-GS000016015-ASM.tsv.bz2". This puts the collection in your persistent selection list. Click on the paperclip in the upper right to get a dropdown menu listing your current selections. +# Go to "Pipeline templates":http://{{ site.arvados_workbench_host }}/pipeline_templates . +# Look for a pipeline named "Tutorial pipeline". +# Click on the play button to the left of "Tutorial pipeline". This will take you to a new page to configure the pipeline. +# Under *parameter*, look for "input". Set the value of "input" by clicking on "none" to get an editing popup. At the top of the selection list in the editing popup will be the collection that you selected in step 4. +# You can now click on "Run pipeline" in the upper right to start the pipeline. +# This will reload the page with the pipeline queued to run. +# The page refreshes automatically every 15 seconds. You should see the pipeline run and then finish successfully. +# Once it is finished, click on the link under the *output* column. This will take you to the collection page for the output of this pipeline. +# Click on "md5sum.txt" to see the actual file that is the output of this pipeline. +# On the collection page, click on the "Provenance graph" tab to see a graphical representation of the data elements and pipelines that were involved in generating this file. + +notextile.
    + diff --git a/doc/zenweb-liquid.rb b/doc/zenweb-liquid.rb index 545a0d8f5a..0be882a48b 100644 --- a/doc/zenweb-liquid.rb +++ b/doc/zenweb-liquid.rb @@ -1,4 +1,5 @@ require 'zenweb' +require 'liquid' module ZenwebLiquid VERSION = '0.0.1' @@ -15,7 +16,6 @@ module Zenweb ## # Render a page's liquid and return the intermediate result def liquid template, content, page, binding = TOPLEVEL_BINDING - require 'liquid' Liquid::Template.file_system = Liquid::LocalFileSystem.new(File.join(File.dirname(Rake.application().rakefile), "_includes")) unless defined? @liquid_template @liquid_template = Liquid::Template.parse(template) @@ -38,4 +38,35 @@ module Zenweb @liquid_template.render(vars) end end + + class LiquidCode < Liquid::Include + Syntax = /(#{Liquid::QuotedFragment}+)(\s+(?:as)\s+(#{Liquid::QuotedFragment}+))?/o + + def initialize(tag_name, markup, tokens) + Liquid::Tag.instance_method(:initialize).bind(self).call(tag_name, markup, tokens) + + if markup =~ Syntax + @template_name = $1 + @language = $3 + @attributes = {} + else + raise SyntaxError.new("Error in tag 'code' - Valid syntax: include '[code_file]' as '[language]'") + end + end + + def render(context) + require 'coderay' + + partial = load_cached_partial(context) + html = '' + + context.stack do + html = CodeRay.scan(partial.root.nodelist.join, @language).div + end + + html + end + + Liquid::Template.register_tag('code', LiquidCode) + end end diff --git a/docker/README.md b/docker/README.md index ce0bf2a209..f521b8c901 100644 --- a/docker/README.md +++ b/docker/README.md @@ -39,27 +39,17 @@ Prerequisites none /cgroup cgroup defaults 0 0 $ sudo mount /cgroup
    - - 3. Enable IPv4 forwarding: - -
    -     $ grep ipv4.ip_forward /etc/sysctl.conf
    -     net.ipv4.ip_forward=1
    -     $ sudo sysctl net.ipv4.ip_forward=1
    -     
    - 4. [Download and run a docker binary from docker.io.](http://docs.docker.io/en/latest/installation/binaries/) + 3. [Download and run a docker binary from docker.io.](http://docs.docker.io/en/latest/installation/binaries/) -* Ruby (any version) +* Ruby (version 1.9.3 or greater) * sudo privileges to run `debootstrap` Building -------- -1. Copy `config.yml.example` to `config.yml` and edit it with settings - for your installation. -2. Run `make` to build the following Docker images: +Type `./build.sh` to configure and build the following Docker images: * arvados/api - the Arvados API server * arvados/doc - Arvados documentation @@ -67,13 +57,10 @@ Building * arvados/workbench - the Arvados console * arvados/sso - the Arvados single-signon authentication server - You may also build Docker images for individual Arvados services: - - $ make api-image - $ make doc-image - $ make warehouse-image - $ make workbench-image - $ make sso-image +`build.sh` will generate reasonable defaults for all configuration +settings. If you want more control over the way Arvados is +configured, first copy `config.yml.example` to `config.yml` and edit +it with appropriate configuration settings, and then run `./build.sh`. Running ------- diff --git a/sdk/cli/bin/arv-run-pipeline-instance b/sdk/cli/bin/arv-run-pipeline-instance index d2b1109e16..91d7192c07 100755 --- a/sdk/cli/bin/arv-run-pipeline-instance +++ b/sdk/cli/bin/arv-run-pipeline-instance @@ -2,8 +2,8 @@ # == Synopsis # -# wh-run-pipeline-instance --template pipeline-template-uuid [options] [--] [parameters] -# wh-run-pipeline-instance --instance pipeline-instance-uuid [options] +# arv-run-pipeline-instance --template pipeline-template-uuid [options] [--] [parameters] +# arv-run-pipeline-instance --instance pipeline-instance-uuid [options] # # Satisfy a pipeline template by finding or submitting a mapreduce job # for each pipeline component. @@ -21,7 +21,7 @@ # to finish. Just find out whether jobs are finished, # queued, or running for each component # -# [--create-instance-only] Do not try to satisfy any components. Just +# [--submit] Do not try to satisfy any components. Just # create an instance, print its UUID to # stdout, and exit. # @@ -80,13 +80,15 @@ $arvados_api_token = ENV['ARVADOS_API_TOKEN'] or begin require 'rubygems' - require 'google/api_client' require 'json' require 'pp' require 'trollop' -rescue LoadError + require 'google/api_client' +rescue LoadError => l + puts $: abort <<-EOS -#{$0}: fatal: some runtime dependencies are missing. +#{$0}: fatal: #{l.message} +Some runtime dependencies may be missing. Try: gem install pp google-api-client json trollop EOS end @@ -173,10 +175,14 @@ p = Trollop::Parser.new do "UUID of pipeline instance.", :short => :none, :type => :string) - opt(:create_instance_only, + opt(:submit, "Do not try to satisfy any components. Just create a pipeline instance and output its UUID.", :short => :none, :type => :boolean) + opt(:run_here, + "Manage the pipeline in process.", + :short => :none, + :type => :boolean) stop_on [:'--'] end $options = Trollop::with_standard_exception_handling p do @@ -185,13 +191,33 @@ end $debuglevel = $options[:debug_level] || ($options[:debug] && 1) || 0 if $options[:instance] - if $options[:template] or $options[:create_instance_only] - abort "#{$0}: syntax error: --instance cannot be combined with --template or --create-instance-only." + if $options[:template] or $options[:submit] + abort "#{$0}: syntax error: --instance cannot be combined with --template or --submit." 
end elsif not $options[:template] abort "#{$0}: syntax error: you must supply a --template or --instance." end +if $options[:run_here] == $options[:submit] + abort "#{$0}: syntax error: you must supply either --run-here or --submit." +end + +# Suppress SSL certificate checks if ARVADOS_API_HOST_INSECURE + +module Kernel + def suppress_warnings + original_verbosity = $VERBOSE + $VERBOSE = nil + result = yield + $VERBOSE = original_verbosity + return result + end +end + +if ENV['ARVADOS_API_HOST_INSECURE'] + suppress_warnings { OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE } +end + # Set up the API client. $client ||= Google::APIClient. @@ -493,12 +519,13 @@ class WhRunPipelineInstance if p.is_a? Hash and p[:output_of] == cname.to_s debuglog "parameter #{c2name}::#{pname} == #{c[:job][:output]}" c2[:script_parameters][pname] = c[:job][:output] + moretodo = true end end end elsif c[:job][:running] || (!c[:job][:started_at] && !c[:job][:cancelled_at]) - moretodo ||= !@options[:no_wait] + moretodo = true elsif c[:job][:cancelled_at] debuglog "component #{cname} job #{c[:job][:uuid]} cancelled." end @@ -507,6 +534,11 @@ class WhRunPipelineInstance @instance[:components] = @components @instance[:active] = moretodo report_status + + if @options[:no_wait] + moretodo = false + end + if moretodo begin sleep 10 @@ -516,7 +548,26 @@ class WhRunPipelineInstance end end end - @instance[:success] = @components.reject { |cname,c| c[:job] and c[:job][:success] }.empty? + + ended = 0 + succeeded = 0 + failed = 0 + @components.each do |cname, c| + if c[:job] + if c[:job][:finished_at] + ended += 1 + if c[:job][:success] == true + succeeded += 1 + end + end + end + end + + if ended == @components.length + @instance[:active] = false + @instance[:success] = (succeeded == @components.length) + end + @instance.save end @@ -579,7 +630,7 @@ begin end runner.apply_parameters(p.leftovers) runner.setup_instance - if $options[:create_instance_only] + if $options[:submit] runner.instance.save puts runner.instance[:uuid] else diff --git a/sdk/cli/bin/crunch-job b/sdk/cli/bin/crunch-job index 2ba36f2b25..4c8acbb59a 100755 --- a/sdk/cli/bin/crunch-job +++ b/sdk/cli/bin/crunch-job @@ -580,10 +580,6 @@ for (my $todo_ptr = 0; $todo_ptr <= $#jobstep_todo; $todo_ptr ++) $command .= "&& perl -"; } - $ENV{"PYTHONPATH"} =~ s{^}{:} if $ENV{"PYTHONPATH"}; - $ENV{"PYTHONPATH"} =~ s{^}{$ENV{CRUNCH_SRC}/sdk/python}; # xxx hack - $ENV{"PYTHONPATH"} =~ s{^}{$ENV{CRUNCH_SRC}/arvados/sdk/python:}; # xxx hack - $ENV{"PYTHONPATH"} =~ s{$}{:/usr/local/arvados/src/sdk/python}; # xxx hack $command .= "&& exec arv-mount $ENV{TASK_KEEPMOUNT} --exec $ENV{CRUNCH_SRC}/crunch_scripts/" . 
$Job->{"script"}; my @execargs = ('bash', '-c', $command); diff --git a/sdk/python/arvados/collection.py b/sdk/python/arvados/collection.py index b48b6df009..fb3dea43ac 100644 --- a/sdk/python/arvados/collection.py +++ b/sdk/python/arvados/collection.py @@ -22,6 +22,7 @@ from keep import * from stream import * import config import errors +import util def normalize_stream(s, stream): stream_tokens = [s] @@ -84,12 +85,15 @@ def normalize(collection): class CollectionReader(object): def __init__(self, manifest_locator_or_text): - if re.search(r'^[a-f0-9]{32}\+\d+(\+\S)*$', manifest_locator_or_text): + if re.search(r'^[a-f0-9]{32}(\+\d+)?(\+\S+)*$', manifest_locator_or_text): self._manifest_locator = manifest_locator_or_text self._manifest_text = None - else: + elif re.search(r'^\S+( [a-f0-9]{32,}(\+\S+)*)*( \d+:\d+:\S+)+\n', manifest_locator_or_text): self._manifest_text = manifest_locator_or_text self._manifest_locator = None + else: + raise errors.ArgumentError( + "Argument to CollectionReader must be a manifest or a collection UUID") self._streams = None def __enter__(self): diff --git a/sdk/python/arvados/errors.py b/sdk/python/arvados/errors.py index 5ea54befde..e4c69a3c83 100644 --- a/sdk/python/arvados/errors.py +++ b/sdk/python/arvados/errors.py @@ -1,5 +1,7 @@ # errors.py - Arvados-specific exceptions. +class ArgumentError(Exception): + pass class SyntaxError(Exception): pass class AssertionError(Exception): diff --git a/sdk/python/bin/arv-mount b/sdk/python/bin/arv-mount index ac9cd9bcf6..5e773dfbc6 100755 --- a/sdk/python/bin/arv-mount +++ b/sdk/python/bin/arv-mount @@ -51,10 +51,14 @@ with "--". # wait until the driver is finished initializing operations.initlock.wait() + rc = 255 try: rc = subprocess.call(args.exec_args, shell=False) - except: - rc = 255 + except OSError as e: + sys.stderr.write('arv-mount: %s -- exec %s\n' % (str(e), args.exec_args)) + rc = e.errno + except Exception as e: + sys.stderr.write('arv-mount: %s\n' % str(e)) finally: subprocess.call(["fusermount", "-u", "-z", args.mountpoint]) diff --git a/sdk/python/bin/arv-normalize b/sdk/python/bin/arv-normalize index b1a6ca7b42..755b565072 100755 --- a/sdk/python/bin/arv-normalize +++ b/sdk/python/bin/arv-normalize @@ -13,6 +13,8 @@ logger = logging.getLogger(os.path.basename(sys.argv[0])) parser = argparse.ArgumentParser( description='Read manifest on standard input and put normalized manifest on standard output.') +parser.add_argument('--extract', type=str, help="The file to extract from the input manifest") + args = parser.parse_args() import arvados @@ -21,4 +23,17 @@ r = sys.stdin.read() cr = arvados.CollectionReader(r) -print cr.manifest_text() +if args.extract: + i = args.extract.rfind('/') + if i == -1: + stream = '.' 
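CollectionReader now decides, by regular expression, whether its argument is a collection locator (32 hex digits, optional +size, optional +hints) or literal manifest text, and raises the new errors.ArgumentError for anything else. A usage sketch, assuming the arvados Python SDK is importable; the locator is the one used in the test suite below, and the manifest line is made up:

    import arvados
    from arvados import errors

    # Accepted: a locator, with or without a size and extra +hints.
    cr = arvados.CollectionReader('d6c3b8e571f1b81ebb150a45ed06c884+114+Xzizzle')

    # Accepted: literal manifest text.
    cr = arvados.CollectionReader('. d41d8cd98f00b204e9800998ecf8427e+0 0:0:empty.txt\n')

    # Rejected: neither a locator nor a manifest.
    try:
        arvados.CollectionReader('not-a-locator')
    except errors.ArgumentError as e:
        print e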
+ fn = args.extract + else: + stream = args.extract[:i] + fn = args.extract[(i+1):] + for s in cr.all_streams(): + if s.name() == stream: + if fn in s.files(): + sys.stdout.write(s.files()[fn].as_manifest()) +else: + sys.stdout.write(cr.manifest_text()) diff --git a/sdk/python/setup.py.src b/sdk/python/setup.py.src index d527d5d67d..9b82f4efe5 100644 --- a/sdk/python/setup.py.src +++ b/sdk/python/setup.py.src @@ -17,6 +17,7 @@ setup(name='arvados-python-client', 'bin/arv-put', 'bin/arv-mount', 'bin/arv-ls', + 'bin/arv-normalize', ], install_requires=[ 'python-gflags', diff --git a/sdk/python/test_collections.py b/sdk/python/test_collections.py index 3dfc72f65b..7df620d977 100644 --- a/sdk/python/test_collections.py +++ b/sdk/python/test_collections.py @@ -42,7 +42,7 @@ class LocalCollectionReaderTest(unittest.TestCase): os.environ['KEEP_LOCAL_STORE'] = '/tmp' LocalCollectionWriterTest().runTest() def runTest(self): - cr = arvados.CollectionReader('d6c3b8e571f1b81ebb150a45ed06c884+114') + cr = arvados.CollectionReader('d6c3b8e571f1b81ebb150a45ed06c884+114+Xzizzle') got = [] for s in cr.all_streams(): for f in s.all_files(): diff --git a/services/api/.gitignore b/services/api/.gitignore index 80ba000190..6ddf5231ce 100644 --- a/services/api/.gitignore +++ b/services/api/.gitignore @@ -18,6 +18,7 @@ /config/api.clinicalfuture.com.* /config/database.yml /config/initializers/omniauth.rb +/config/application.yml # asset cache /public/assets/ diff --git a/services/api/Gemfile b/services/api/Gemfile index 59b16cc7ea..ded0358394 100644 --- a/services/api/Gemfile +++ b/services/api/Gemfile @@ -5,7 +5,13 @@ gem 'rails', '~> 3.2.0' # Bundle edge Rails instead: # gem 'rails', :git => 'git://github.com/rails/rails.git' -#gem 'sqlite3' +group :test, :development do + gem 'sqlite3' +end + +# This might not be needed in :test and :development, but we load it +# anyway to make sure it always gets in Gemfile.lock and to help +# reveal install problems sooner rather than later. 
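The --extract option added to arv-normalize above treats everything up to the last '/' as the stream name and the remainder as the file name, with top-level files living in the stream named '.'. A small sketch of that split, using made-up paths:

    def split_extract_arg(path):
        i = path.rfind('/')
        if i == -1:
            return '.', path              # top-level file lives in stream "."
        return path[:i], path[i+1:]

    print split_extract_arg('foo.txt')            # ('.', 'foo.txt')
    print split_extract_arg('dir/sub/foo.txt')    # ('dir/sub', 'foo.txt')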
gem 'pg' # Start using multi_json once we are on Rails 3.2; @@ -53,3 +59,8 @@ gem 'andand' gem 'redis' gem 'test_after_commit', :group => :test + +gem 'google-api-client', '~> 0.6.3' +gem 'trollop' + +gem 'arvados-cli', '>= 0.1.20140311162926' diff --git a/services/api/Gemfile.lock b/services/api/Gemfile.lock index 3929125b37..c734ab560b 100644 --- a/services/api/Gemfile.lock +++ b/services/api/Gemfile.lock @@ -32,8 +32,21 @@ GEM activemodel (>= 3.0.0) activesupport (>= 3.0.0) rack (>= 1.1.0) + addressable (2.3.5) andand (1.3.3) arel (3.0.2) + arvados-cli (0.1.20140311162926) + activesupport (~> 3.2, >= 3.2.13) + andand (~> 1.3, >= 1.3.3) + curb (~> 0.8) + google-api-client (~> 0.6.3) + json (~> 1.7, >= 1.7.7) + oj (~> 2.0, >= 2.0.3) + trollop (~> 2.0) + autoparse (0.3.3) + addressable (>= 2.3.1) + extlib (>= 0.9.15) + multi_json (>= 1.0.0) builder (3.0.4) capistrano (2.15.5) highline @@ -48,11 +61,23 @@ GEM coffee-script-source execjs coffee-script-source (1.6.3) + curb (0.8.5) daemon_controller (1.1.7) erubis (2.7.0) execjs (2.0.2) + extlib (0.9.16) faraday (0.8.8) multipart-post (~> 1.2.0) + google-api-client (0.6.4) + addressable (>= 2.3.2) + autoparse (>= 0.3.3) + extlib (>= 0.9.15) + faraday (~> 0.8.4) + jwt (>= 0.1.5) + launchy (>= 2.1.1) + multi_json (>= 1.0.0) + signet (~> 0.4.5) + uuidtools (>= 2.1.0) hashie (1.2.0) highline (1.6.20) hike (1.2.3) @@ -65,6 +90,8 @@ GEM json (1.8.1) jwt (0.1.8) multi_json (>= 1.5) + launchy (2.4.2) + addressable (~> 2.3) libv8 (3.16.14.3) mail (2.5.4) mime-types (~> 1.16) @@ -132,11 +159,17 @@ GEM railties (~> 3.2.0) sass (>= 3.1.10) tilt (~> 1.3) + signet (0.4.5) + addressable (>= 2.2.3) + faraday (~> 0.8.1) + jwt (>= 0.1.5) + multi_json (>= 1.0.0) sprockets (2.2.2) hike (~> 1.2) multi_json (~> 1.0) rack (~> 1.0) tilt (~> 1.1, != 1.3.0) + sqlite3 (1.3.8) test_after_commit (0.2.2) therubyracer (0.12.0) libv8 (~> 3.16.14.0) @@ -146,10 +179,12 @@ GEM treetop (1.4.15) polyglot polyglot (>= 0.3.1) + trollop (2.0) tzinfo (0.3.38) uglifier (2.3.0) execjs (>= 0.3.0) json (>= 1.8.0) + uuidtools (2.1.4) PLATFORMS ruby @@ -157,7 +192,9 @@ PLATFORMS DEPENDENCIES acts_as_api andand + arvados-cli (>= 0.1.20140311162926) coffee-rails (~> 3.2.0) + google-api-client (~> 0.6.3) jquery-rails multi_json oj @@ -169,6 +206,8 @@ DEPENDENCIES redis rvm-capistrano sass-rails (>= 3.2.0) + sqlite3 test_after_commit therubyracer + trollop uglifier (>= 1.0.3) diff --git a/services/api/app/controllers/application_controller.rb b/services/api/app/controllers/application_controller.rb index 8ed554f8ca..2d37dc18cd 100644 --- a/services/api/app/controllers/application_controller.rb +++ b/services/api/app/controllers/application_controller.rb @@ -39,7 +39,7 @@ class ApplicationController < ActionController::Base if @object.save show else - render_error "Save failed" + raise "Save failed" end end @@ -50,7 +50,7 @@ class ApplicationController < ActionController::Base if @object.update_attributes attrs_to_update show else - render_error "Update failed" + raise "Update failed" end end @@ -88,7 +88,9 @@ class ApplicationController < ActionController::Base def render_error(e) logger.error e.inspect - logger.error e.backtrace.collect { |x| x + "\n" }.join('') if e.backtrace + if e.respond_to? :backtrace and e.backtrace + logger.error e.backtrace.collect { |x| x + "\n" }.join('') + end if @object and @object.errors and @object.errors.full_messages and not @object.errors.full_messages.empty? 
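Just below, ApplicationController#render is extended so that a Hash response gains a _profile section (carrying the server-side request_time) whenever the request includes a _profile parameter and a request start time was recorded. A hedged sketch of how a client might observe this; the host and token are placeholders, and nothing beyond the _profile key is promised by this patch:

    import json, urllib2

    req = urllib2.Request(
        'https://api.example.com/arvados/v1/collections?_profile=1',
        headers={'Authorization': 'OAuth2 REPLACE_WITH_API_TOKEN'})
    body = json.loads(urllib2.urlopen(req).read())
    print body.get('_profile', {}).get('request_time')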
errors = @object.errors.full_messages else @@ -313,7 +315,7 @@ class ApplicationController < ActionController::Base if supplied_token api_client_auth = ApiClientAuthorization. includes(:api_client, :user). - where('api_token=? and (expires_at is null or expires_at > now())', supplied_token). + where('api_token=? and (expires_at is null or expires_at > CURRENT_TIMESTAMP)', supplied_token). first if api_client_auth.andand.user session[:user_id] = api_client_auth.user.id @@ -444,13 +446,15 @@ class ApplicationController < ActionController::Base end def render *opts - response = opts.first[:json] - if response.is_a?(Hash) && - params[:_profile] && - Thread.current[:request_starttime] - response[:_profile] = { - request_time: Time.now - Thread.current[:request_starttime] - } + if opts.first + response = opts.first[:json] + if response.is_a?(Hash) && + params[:_profile] && + Thread.current[:request_starttime] + response[:_profile] = { + request_time: Time.now - Thread.current[:request_starttime] + } + end end super *opts end diff --git a/services/api/app/controllers/arvados/v1/api_client_authorizations_controller.rb b/services/api/app/controllers/arvados/v1/api_client_authorizations_controller.rb index 10a009807c..8fd915ddfb 100644 --- a/services/api/app/controllers/arvados/v1/api_client_authorizations_controller.rb +++ b/services/api/app/controllers/arvados/v1/api_client_authorizations_controller.rb @@ -28,6 +28,7 @@ class Arvados::V1::ApiClientAuthorizationsController < ApplicationController resource_attrs[:user_id] = User.where(uuid: resource_attrs.delete(:owner_uuid)).first.andand.id end + resource_attrs[:api_client_id] = Thread.current[:api_client].id super end diff --git a/services/api/app/controllers/user_sessions_controller.rb b/services/api/app/controllers/user_sessions_controller.rb index 71c2823dc1..3674c010cb 100644 --- a/services/api/app/controllers/user_sessions_controller.rb +++ b/services/api/app/controllers/user_sessions_controller.rb @@ -121,7 +121,8 @@ class UserSessionsController < ApplicationController api_client_auth = ApiClientAuthorization. new(user: user, api_client: @api_client, - created_by_ip_address: remote_ip) + created_by_ip_address: remote_ip, + scopes: ["all"]) api_client_auth.save! if callback_url.index('?') diff --git a/services/api/app/models/arvados_model.rb b/services/api/app/models/arvados_model.rb index 9475f0dd1d..84bdf95763 100644 --- a/services/api/app/models/arvados_model.rb +++ b/services/api/app/models/arvados_model.rb @@ -140,7 +140,7 @@ class ArvadosModel < ActiveRecord::Base def update_modified_by_fields self.created_at ||= Time.now - self.owner_uuid ||= current_default_owner + self.owner_uuid ||= current_default_owner if self.respond_to? :owner_uuid= self.modified_at = Time.now self.modified_by_user_uuid = current_user ? current_user.uuid : nil self.modified_by_client_uuid = current_api_client ? current_api_client.uuid : nil diff --git a/services/api/app/models/blob.rb b/services/api/app/models/blob.rb new file mode 100644 index 0000000000..11fab9fb59 --- /dev/null +++ b/services/api/app/models/blob.rb @@ -0,0 +1,96 @@ +class Blob + + # In order to get a Blob from Keep, you have to prove either + # [a] you have recently written it to Keep yourself, or + # [b] apiserver has recently decided that you should be able to read it + # + # To ensure that the requestor of a blob is authorized to read it, + # Keep requires clients to timestamp the blob locator with an expiry + # time, and to sign the timestamped locator with their API token. 
+ # + # A signed blob locator has the form: + # locator_hash +A blob_signature @ timestamp + # where the timestamp is a Unix time expressed as a hexadecimal value, + # and the blob_signature is the signed locator_hash + API token + timestamp. + # + class InvalidSignatureError < StandardError + end + + # Blob.sign_locator: return a signed and timestamped blob locator. + # + # The 'opts' argument should include: + # [required] :key - the Arvados server-side blobstore key + # [required] :api_token - user's API token + # [optional] :ttl - number of seconds before this request expires + # + def self.sign_locator blob_locator, opts + # We only use the hash portion for signatures. + blob_hash = blob_locator.split('+').first + + # Generate an expiry timestamp (seconds since epoch, base 16) + timestamp = (Time.now.to_i + (opts[:ttl] || 600)).to_s(16) + # => "53163cb4" + + # Generate a signature. + signature = + generate_signature opts[:key], blob_hash, opts[:api_token], timestamp + + blob_locator + '+A' + signature + '@' + timestamp + end + + # Blob.verify_signature + # Safely verify the signature on a blob locator. + # Return value: true if the locator has a valid signature, false otherwise + # Arguments: signed_blob_locator, opts + # + def self.verify_signature *args + begin + self.verify_signature! *args + true + rescue Blob::InvalidSignatureError + false + end + end + + # Blob.verify_signature! + # Verify the signature on a blob locator. + # Return value: true if the locator has a valid signature + # Arguments: signed_blob_locator, opts + # Exceptions: + # Blob::InvalidSignatureError if the blob locator does not include a + # valid signature + # + def self.verify_signature! signed_blob_locator, opts + blob_hash = signed_blob_locator.split('+').first + given_signature, timestamp = signed_blob_locator. + split('+A').last. + split('+').first. + split('@') + + if !timestamp + raise Blob::InvalidSignatureError.new 'No signature provided.' + end + if !timestamp.match /^[\da-f]+$/ + raise Blob::InvalidSignatureError.new 'Timestamp is not a base16 number.' + end + if timestamp.to_i(16) < Time.now.to_i + raise Blob::InvalidSignatureError.new 'Signature expiry time has passed.' + end + + my_signature = + generate_signature opts[:key], blob_hash, opts[:api_token], timestamp + + if my_signature != given_signature + raise Blob::InvalidSignatureError.new 'Signature is invalid.' + end + + true + end + + def self.generate_signature key, blob_hash, api_token, timestamp + OpenSSL::HMAC.hexdigest('sha1', key, + [blob_hash, + api_token, + timestamp].join('@')) + end +end diff --git a/services/api/app/models/node.rb b/services/api/app/models/node.rb index 459535b52d..805e1ccd41 100644 --- a/services/api/app/models/node.rb +++ b/services/api/app/models/node.rb @@ -8,13 +8,7 @@ class Node < ArvadosModel MAX_SLOTS = 64 - @@confdir = if Rails.configuration.respond_to? :dnsmasq_conf_dir - Rails.configuration.dnsmasq_conf_dir - elsif File.exists? '/etc/dnsmasq.d/.' 
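The new Blob class above documents the signed-locator scheme: the signature is HMAC-SHA1, keyed with the server-side blobstore key, over "<blob_hash>@<api_token>@<hex_expiry_timestamp>", and the signed locator is "<locator>+A<signature>@<hex_expiry_timestamp>". A language-neutral sketch of Blob.sign_locator (all values below are made up):

    import hmac, hashlib, time

    def sign_locator(blob_locator, key, api_token, ttl=600):
        blob_hash = blob_locator.split('+')[0]            # only the hash portion is signed
        timestamp = '%x' % (int(time.time()) + ttl)       # expiry time, base 16
        msg = '@'.join([blob_hash, api_token, timestamp])
        signature = hmac.new(key, msg, hashlib.sha1).hexdigest()
        return '%s+A%s@%s' % (blob_locator, signature, timestamp)

    print sign_locator('acbd18db4cc2f85cedef654fccc4a4d8+3',
                       key='blobstore-key', api_token='example-token')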
- '/etc/dnsmasq.d' - else - nil - end + @@confdir = Rails.configuration.dnsmasq_conf_dir @@domain = Rails.configuration.compute_node_domain rescue `hostname --domain`.strip @@nameservers = Rails.configuration.compute_node_nameservers @@ -127,8 +121,8 @@ class Node < ArvadosModel def start!(ping_url_method) ensure_permission_to_update ping_url = ping_url_method.call({ uuid: self.uuid, ping_secret: self.info[:ping_secret] }) - if (Rails.configuration.compute_node_ec2run_args rescue false) and - (Rails.configuration.compute_node_ami rescue false) + if (Rails.configuration.compute_node_ec2run_args and + Rails.configuration.compute_node_ami) ec2_args = ["--user-data '#{ping_url}'", "-t c1.xlarge -n 1", Rails.configuration.compute_node_ec2run_args, diff --git a/services/api/app/models/pipeline_instance.rb b/services/api/app/models/pipeline_instance.rb index 43497da6f4..ad96b771a4 100644 --- a/services/api/app/models/pipeline_instance.rb +++ b/services/api/app/models/pipeline_instance.rb @@ -61,6 +61,10 @@ class PipelineInstance < ArvadosModel t.collect { |r| r[2] }.inject(0.0) { |sum,a| sum += a } / t.size end + def self.queue + self.where('active = true') + end + protected def bootstrap_components if pipeline_template and (!components or components.empty?) diff --git a/services/api/app/models/user.rb b/services/api/app/models/user.rb index a85a63df7d..0896571939 100644 --- a/services/api/app/models/user.rb +++ b/services/api/app/models/user.rb @@ -124,7 +124,7 @@ class User < ArvadosModel end def check_auto_admin - if User.where("uuid not like '%-000000000000000'").where(:is_admin => true).count == 0 and not Rails.configuration.auto_admin_user.nil? + if User.where("uuid not like '%-000000000000000'").where(:is_admin => true).count == 0 and Rails.configuration.auto_admin_user if current_user.email == Rails.configuration.auto_admin_user self.is_admin = true self.is_active = true diff --git a/services/api/config/application.default.yml b/services/api/config/application.default.yml new file mode 100644 index 0000000000..ad95f0426a --- /dev/null +++ b/services/api/config/application.default.yml @@ -0,0 +1,101 @@ +# Do not use this file for site configuration. Create application.yml +# instead (see application.yml.example). 
+ +development: + force_ssl: false + cache_classes: false + whiny_nils: true + consider_all_requests_local: true + action_controller.perform_caching: false + action_mailer.raise_delivery_errors: false + action_mailer.perform_deliveries: false + active_support.deprecation: :log + action_dispatch.best_standards_support: :builtin + active_record.mass_assignment_sanitizer: :strict + active_record.auto_explain_threshold_in_seconds: 0.5 + assets.compress: false + assets.debug: true + +production: + force_ssl: true + cache_classes: true + consider_all_requests_local: false + action_controller.perform_caching: true + serve_static_assets: false + assets.compress: true + assets.compile: false + assets.digest: true + +test: + force_ssl: false + cache_classes: true + serve_static_assets: true + static_cache_control: public, max-age=3600 + whiny_nils: true + consider_all_requests_local: true + action_controller.perform_caching: false + action_dispatch.show_exceptions: false + action_controller.allow_forgery_protection: false + action_mailer.delivery_method: :test + active_support.deprecation: :stderr + active_record.mass_assignment_sanitizer: :strict + uuid_prefix: zzzzz + +common: + secret_token: ~ + uuid_prefix: <%= Digest::MD5.hexdigest(`hostname`).to_i(16).to_s(36)[0..4] %> + + git_repositories_dir: /var/cache/git + + # :none or :slurm_immediate + crunch_job_wrapper: :none + + # username, or false = do not set uid when running jobs. + crunch_job_user: crunch + + # The web service must be able to create/write this file, and + # crunch-job must be able to stat() it. + crunch_refresh_trigger: /tmp/crunch_refresh_trigger + + # Path to /etc/dnsmasq.d, or false = do not update dnsmasq data. + dnsmasq_conf_dir: false + + # Set to AMI id (ami-123456) to auto-start nodes. See app/models/node.rb + compute_node_ami: false + compute_node_ec2run_args: -g arvados-compute + compute_node_spot_bid: 0.11 + + compute_node_domain: <%= `hostname`.split('.')[1..-1].join('.').strip %> + compute_node_nameservers: + - 192.168.1.1 + compute_node_ec2_tag_enable: false + + accept_api_token: {} + + new_users_are_active: false + admin_notifier_email_from: arvados@example.com + email_subject_prefix: "[ARVADOS] " + + # Visitors to the API server will be redirected to the workbench + workbench_address: https://workbench.local:3001/ + + # The e-mail address of the user you would like to become marked as an admin + # user on their first login. + # In the default configuration, authentication happens through the Arvados SSO + # server, which uses openid against Google's servers, so in that case this + # should be an address associated with a Google account. + auto_admin_user: false + + ## Set Time.zone default to the specified zone and make Active + ## Record auto-convert to this zone. Run "rake -D time" for a list + ## of tasks for finding time zone names. Default is UTC. + #time_zone: Central Time (US & Canada) + + ## Default encoding used in templates for Ruby 1.9. + encoding: utf-8 + + # Enable the asset pipeline + assets.enabled: true + + # Version of your assets, change this if you want to expire all your assets + assets.version: "1.0" diff --git a/services/api/config/application.rb b/services/api/config/application.rb index 2f331eeffa..24648a9a58 100644 --- a/services/api/config/application.rb +++ b/services/api/config/application.rb @@ -26,37 +26,11 @@ module Server # Activate observers that should always be running. 
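The default uuid_prefix above is derived from the hostname: the MD5 digest is read as an integer, written in base 36, and truncated to five characters. A rough Python equivalent (it ignores the trailing newline that the backticked hostname in the ERB would include):

    import hashlib, socket

    def default_uuid_prefix(hostname=None):
        hostname = hostname or socket.gethostname()
        n = int(hashlib.md5(hostname).hexdigest(), 16)
        digits = '0123456789abcdefghijklmnopqrstuvwxyz'
        s = ''
        while n:
            s = digits[n % 36] + s
            n //= 36
        return s[:5]

    print default_uuid_prefix()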
# config.active_record.observers = :cacher, :garbage_collector, :forum_observer - # Set Time.zone default to the specified zone and make Active Record auto-convert to this zone. - # Run "rake -D time" for a list of tasks for finding time zone names. Default is UTC. - # config.time_zone = 'Central Time (US & Canada)' - # The default locale is :en and all translations from config/locales/*.rb,yml are auto loaded. # config.i18n.load_path += Dir[Rails.root.join('my', 'locales', '*.{rb,yml}').to_s] # config.i18n.default_locale = :de - # Configure the default encoding used in templates for Ruby 1.9. - config.encoding = "utf-8" - # Configure sensitive parameters which will be filtered from the log file. config.filter_parameters += [:password] - - # Enable the asset pipeline - config.assets.enabled = true - - # Version of your assets, change this if you want to expire all your assets - config.assets.version = '1.0' - - config.force_ssl = true - - def config.uuid_prefix(x=nil) - if x and @uuid_prefix - raise "uuid_prefix was already set to #{@uuid_prefix}" - end - @uuid_prefix ||= Digest::MD5.hexdigest(x || `hostname`.strip).to_i(16).to_s(36)[-5..-1] - end - def config.uuid_prefix=(x) - @uuid_prefix = x - end end - end diff --git a/services/api/config/application.yml.example b/services/api/config/application.yml.example new file mode 100644 index 0000000000..a9c33a4bae --- /dev/null +++ b/services/api/config/application.yml.example @@ -0,0 +1,41 @@ +# Copy this file to application.yml and edit to suit. +# +# Consult application.default.yml for the full list of configuration +# settings. +# +# The order of precedence is: +# 1. config/environments/{RAILS_ENV}.rb (deprecated) +# 2. Section in application.yml corresponding to RAILS_ENV (e.g., development) +# 3. Section in application.yml called "common" +# 4. Section in application.default.yml corresponding to RAILS_ENV +# 5. Section in application.default.yml called "common" + +development: + +production: + # At minimum, you need a nice long randomly generated secret_token here. + secret_token: ~ + + uuid_prefix: bogus + + # This is suitable for AWS; see common section below for a static example. + # + #compute_node_nameservers: <%# + require 'net/http' + ['local', 'public'].collect do |iface| + Net::HTTP.get(URI("http://169.254.169.254/latest/meta-data/#{iface}-ipv4")).match(/^[\d\.]+$/)[0] + end << '172.16.0.23' + %> + +test: + uuid_prefix: zzzzz + secret_token: <%= rand(2**512).to_s(36) %> + +common: + + # Git repositories must be readable by api server, or you won't be + # able to submit crunch jobs. 
To pass the test suites, put a clone + # of the arvados tree in {git_repositories_dir}/arvados.git or + # {git_repositories_dir}/arvados/.git + # + #git_repositories_dir: /var/cache/git diff --git a/services/api/config/database.yml.sample b/services/api/config/database.yml.sample index 62edd8431e..25fcc7ada7 100644 --- a/services/api/config/database.yml.sample +++ b/services/api/config/database.yml.sample @@ -1,13 +1,9 @@ development: - adapter: mysql - encoding: utf8 - database: arvados_development - username: arvados - password: ******** - host: localhost + adapter: sqlite3 + database: db/arvados_development.sqlite3 test: - adapter: mysql + adapter: postgresql encoding: utf8 database: arvados_test username: arvados @@ -15,7 +11,7 @@ test: host: localhost production: - adapter: mysql + adapter: postgresql encoding: utf8 database: arvados_production username: arvados diff --git a/services/api/config/environments/test.rb.example b/services/api/config/environments/test.rb.example index 1782734f83..10608c15e3 100644 --- a/services/api/config/environments/test.rb.example +++ b/services/api/config/environments/test.rb.example @@ -76,4 +76,11 @@ Server::Application.configure do # Visitors to the API server will be redirected to the workbench config.workbench_address = "http://localhost:3000/" + + # The e-mail address of the user you would like to become marked as an admin + # user on their first login. + # In the default configuration, authentication happens through the Arvados SSO + # server, which uses openid against Google's servers, so in that case this + # should be an address associated with a Google account. + config.auto_admin_user = '' end diff --git a/services/api/config/initializers/omniauth.rb.example b/services/api/config/initializers/omniauth.rb.example index cd25374a75..aefcf56625 100644 --- a/services/api/config/initializers/omniauth.rb.example +++ b/services/api/config/initializers/omniauth.rb.example @@ -4,7 +4,7 @@ APP_ID = 'arvados-server' APP_SECRET = rand(2**512).to_s(36) # CHANGE ME! # Update your custom Omniauth provider URL here -CUSTOM_PROVIDER_URL = 'http://auth.clinicalfuture.com' +CUSTOM_PROVIDER_URL = 'http://localhost:3002' Rails.application.config.middleware.use OmniAuth::Builder do provider :josh_id, APP_ID, APP_SECRET, CUSTOM_PROVIDER_URL diff --git a/services/api/config/initializers/zz_load_config.rb b/services/api/config/initializers/zz_load_config.rb new file mode 100644 index 0000000000..3399fd9bf5 --- /dev/null +++ b/services/api/config/initializers/zz_load_config.rb @@ -0,0 +1,46 @@ +$application_config = {} + +%w(application.default application).each do |cfgfile| + path = "#{::Rails.root.to_s}/config/#{cfgfile}.yml" + if File.exists? path + yaml = ERB.new(IO.read path).result(binding) + confs = YAML.load(yaml) + $application_config.merge!(confs['common'] || {}) + $application_config.merge!(confs[::Rails.env.to_s] || {}) + end +end + +Server::Application.configure do + nils = [] + $application_config.each do |k, v| + # "foo.bar: baz" --> { config.foo.bar = baz } + cfg = config + ks = k.split '.' + k = ks.pop + ks.each do |kk| + cfg = cfg.send(kk) + end + if cfg.respond_to?(k.to_sym) and !cfg.send(k).nil? + # Config must have been set already in environments/*.rb. + # + # After config files have been migrated, this mechanism should + # be deprecated, then removed. + elsif v.nil? + # Config variables are not allowed to be nil. Make a "naughty" + # list, and present it below. + nils << k + else + cfg.send "#{k}=", v + end + end + if !nils.empty? 
+ raise < :json, + :api_client_authorization => { + :owner_uuid => users(:spectator).uuid + } + }, {'HTTP_AUTHORIZATION' => "OAuth2 #{api_client_authorizations(:admin_trustedclient).api_token}"} + assert_response :success + + get "/arvados/v1/users/current", { + :format => :json + }, {'HTTP_AUTHORIZATION' => "OAuth2 #{jresponse['api_token']}"} + @jresponse = nil + assert_equal users(:spectator).uuid, jresponse['uuid'] + end + + test "refuse to create token for different user if not trusted client" do + post "/arvados/v1/api_client_authorizations", { + :format => :json, + :api_client_authorization => { + :owner_uuid => users(:spectator).uuid + } + }, {'HTTP_AUTHORIZATION' => "OAuth2 #{api_client_authorizations(:admin).api_token}"} + assert_response 403 + end + + test "refuse to create token for different user if not admin" do + post "/arvados/v1/api_client_authorizations", { + :format => :json, + :api_client_authorization => { + :owner_uuid => users(:spectator).uuid + } + }, {'HTTP_AUTHORIZATION' => "OAuth2 #{api_client_authorizations(:active_trustedclient).api_token}"} + assert_response 403 + end + end diff --git a/services/api/test/unit/blob_test.rb b/services/api/test/unit/blob_test.rb new file mode 100644 index 0000000000..ec6e67a168 --- /dev/null +++ b/services/api/test/unit/blob_test.rb @@ -0,0 +1,94 @@ +require 'test_helper' + +class BlobTest < ActiveSupport::TestCase + @@api_token = rand(2**512).to_s(36)[0..49] + @@key = rand(2**2048).to_s(36) + @@blob_data = 'foo' + @@blob_locator = Digest::MD5.hexdigest(@@blob_data) + + '+' + @@blob_data.size.to_s + + test 'correct' do + signed = Blob.sign_locator @@blob_locator, api_token: @@api_token, key: @@key + assert_equal true, Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + + test 'expired' do + signed = Blob.sign_locator @@blob_locator, api_token: @@api_token, key: @@key, ttl: -1 + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'expired, but no raise' do + signed = Blob.sign_locator @@blob_locator, api_token: @@api_token, key: @@key, ttl: -1 + assert_equal false, Blob.verify_signature(signed, + api_token: @@api_token, + key: @@key) + end + + test 'bogus, wrong block hash' do + signed = Blob.sign_locator @@blob_locator, api_token: @@api_token, key: @@key + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed.sub('acbd','abcd'), api_token: @@api_token, key: @@key) + end + end + + test 'bogus, expired' do + signed = 'acbd18db4cc2f85cedef654fccc4a4d8+3+Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@531641bf' + assert_raises Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, wrong key' do + signed = Blob.sign_locator(@@blob_locator, + api_token: @@api_token, + key: (@@key+'x')) + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, wrong api token' do + signed = Blob.sign_locator(@@blob_locator, + api_token: @@api_token.reverse, + key: @@key) + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, signature format 1' do + signed = 'acbd18db4cc2f85cedef654fccc4a4d8+3+Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@' + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, signature 
format 2' do + signed = 'acbd18db4cc2f85cedef654fccc4a4d8+3+A@531641bf' + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, signature format 3' do + signed = 'acbd18db4cc2f85cedef654fccc4a4d8+3+Axyzzy@531641bf' + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'bogus, timestamp format' do + signed = 'acbd18db4cc2f85cedef654fccc4a4d8+3+Aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@xyzzy' + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(signed, api_token: @@api_token, key: @@key) + end + end + + test 'no signature at all' do + assert_raise Blob::InvalidSignatureError do + Blob.verify_signature!(@@blob_locator, api_token: @@api_token, key: @@key) + end + end +end
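These tests exercise Blob.verify_signature! against expired, malformed, and mismatched locators. A companion sketch to the signing example earlier, behaving like the non-raising Blob.verify_signature (it returns False instead of raising); it assumes the same HMAC-SHA1 scheme, and all values are made up:

    import hmac, hashlib, re, time

    def verify_signed_locator(signed_locator, key, api_token):
        blob_hash = signed_locator.split('+')[0]
        sig_and_ts = signed_locator.split('+A')[-1].split('+')[0]
        given_signature, _, timestamp = sig_and_ts.partition('@')
        if not re.match(r'^[\da-f]+$', timestamp):
            return False                  # no signature, or malformed timestamp
        if int(timestamp, 16) < int(time.time()):
            return False                  # signature has expired
        msg = '@'.join([blob_hash, api_token, timestamp])
        expected = hmac.new(key, msg, hashlib.sha1).hexdigest()
        return expected == given_signature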