Commit 1b98e26

Some usage info
1 parent b79234b commit 1b98e26

1 file changed

+229
-0
lines changed

README.md

Lines changed: 229 additions & 0 deletions
@@ -17,12 +17,241 @@ If you would rather install from the checkout of this source code:
1717

1818
pip3 install .
1919

20+
If you would like to use the `cwlprov rerun` feature you may also need:
21+
22+
pip3 install cwlref-runner
23+
24+
25+
## Development
26+
27+
To develop cwlprov-py it is recommended to set up a new [virtualenv](https://docs.python.org/3/library/venv.html):
28+
29+
virtualenv -p python3 venv3
30+
31+
To activate the environment and install your development version of cwlprov:
32+
33+
. venv3/bin/activate
34+
pip3 install .
35+
36+
2037
## Usage
2138

2239
Use `cwlprov --help` to see all options. For instance, `cwlprov validate` checks that the folder is valid according to CWLProv.
2340

41+
$ cwlprov --help
42+
usage: cwlprov [-h] [--version] [--directory DIRECTORY] [--relative]
43+
[--absolute] [--output OUTPUT] [--verbose] [--quiet] [--hints]
44+
[--no-hints]
45+
{validate,info,who,prov,inputs,outputs,run,runs,rerun,derived,runtimes}
46+
...
47+
48+
cwlprov explores Research Objects containing provenance of Common Workflow
49+
Language executions. <https://w3id.org/cwl/prov/>
50+
51+
optional arguments:
52+
-h, --help show this help message and exit
53+
--version show program's version number and exit
54+
--directory DIRECTORY, -d DIRECTORY
55+
Path to CWLProv Research Object (default: .)
56+
--relative Output paths relative to current directory (default if
57+
-d is missing or relative)
58+
--absolute Output absolute paths (default if -d is absolute)
59+
--output OUTPUT, -o OUTPUT
60+
File to write output to (default: stdout)
61+
--verbose, -v Verbose logging (repeat for more verbose)
62+
--quiet, -q No logging or hints
63+
--hints Show hints on cwlprov usage
64+
--no-hints Do not show hints
65+
66+
commands:
67+
{validate,info,who,prov,inputs,outputs,run,runs,rerun,derived,runtimes}
68+
validate validate the CWLProv Research Object
69+
info show research object metadata
70+
who show who ran the workflow
71+
prov export workflow execution provenance in PROV format
72+
inputs list workflow/step input files/values
73+
outputs list workflow/step output files/values
74+
run show workflow execution log
75+
runs List all workflow executions in RO
76+
rerun Rerun a workflow or step
77+
derived List what was derived from a data item, based on
78+
activity usage/generation
79+
runtimes Calculate average step execution runtimes
80+
2481
The [test/](test/) folder contains some examples of workflow runs for different CWLProv profiles.
2582

83+
All `cwlprov` commands will attempt to detect the CWLProv research object from the current directory. Alternatively, use the `--directory` option to specify the root folder.
84+
85+
The `--quiet` option may be used in scripts for less verbose output. The `--verbose` option has the opposite effect and enables logging. For debug logging, use `-vv` or `--verbose --verbose`.
86+
87+
Note that the general arguments listed above must be provided *before* the _command_, e.g.:
88+
89+
cwlprov --quiet --directory /tmp/1 validate
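The ordering rule can be reproduced with a small sketch. The help text above has the shape that Python's `argparse` produces, so this assumes an `argparse`-style parser with subcommands. The parser below is hypothetical, mirrors only a few of the listed options, and is not taken from the cwlprov source.

```python
import argparse


def build_parser():
    """Build a hypothetical parser with a few global options and two commands."""
    parser = argparse.ArgumentParser(prog="cwlprov-sketch")
    # Global options live on the top-level parser.
    parser.add_argument("--directory", "-d", default=".")
    parser.add_argument("--verbose", "-v", action="count", default=0)
    parser.add_argument("--quiet", "-q", action="store_true")
    # Everything after the command name is handed to that command's parser.
    commands = parser.add_subparsers(dest="cmd")
    commands.add_parser("validate")
    run = commands.add_parser("run")
    run.add_argument("--steps", action="store_true")
    run.add_argument("id", nargs="?")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    args = parser.parse_args(["--quiet", "--directory", "/tmp/1", "validate"])
    print(args.quiet, args.directory, args.cmd)  # True /tmp/1 validate
    args = parser.parse_args(["-vv", "run", "--steps"])
    print(args.verbose, args.cmd, args.steps)  # 2 run True
```

In this sketch, `validate --quiet` is rejected because `--quiet` is only known to the top-level parser, which is consistent with the rule that general arguments come first.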
90+
91+
Many of the commands accept additional arguments, which are listed by `cwlprov COMMAND --help`, e.g.:
92+
93+
$ cwlprov run --help
94+
usage: cwlprov run [-h] [--step STEP] [--steps] [--no-steps] [--start]
95+
[--no-start] [--end] [--no-end] [--duration]
96+
[--no-duration] [--labels] [--no-labels] [--inputs]
97+
[--outputs]
98+
[id]
99+
100+
positional arguments:
101+
id workflow run UUID
102+
103+
optional arguments:
104+
-h, --help show this help message and exit
105+
--step STEP, -s STEP Show only step with given UUID
106+
--steps List steps of workflow
107+
--no-steps Do not list steps
108+
--start Show start timestamps (default)
109+
--no-start, -S Do not show start timestamps
110+
--end, -e Show end timestamps
111+
--no-end Do not show end timestamps
112+
--duration Show step duration (default)
113+
--no-duration, -D Do not show step duration
114+
--labels Show activity labels
115+
--no-labels, -L Do not show activity labels
116+
--inputs, -i Show inputs
117+
--outputs, -o Show outputs
118+
119+
120+
### Validation
121+
122+
Running `cwlprov` with no command returns exit status 0 if a CWLProv folder structure is detected:
123+
124+
$ cd test/revsort-cwlprov-0.4.0
125+
test/revsort-cwlprov-0.4.0$ cwlprov
126+
Detected CWLProv Research Object: /home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0
127+
128+
$ cd /tmp
129+
/tmp$ cwlprov
130+
ERROR:cwlprov.tool:Could not find bagit.txt, try cwlprov -d mybag/
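The error message suggests that detection depends on finding `bagit.txt`. A minimal sketch of such a search, assuming it walks from the starting folder up through its parents, could look as follows. The function name `find_bag_root` is hypothetical and the real lookup in cwlprov may differ.

```python
from pathlib import Path


def find_bag_root(start="."):
    """Return the nearest folder at or above `start` that contains bagit.txt, else None."""
    current = Path(start).resolve()
    # Check the starting folder first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / "bagit.txt").is_file():
            return candidate
    return None
```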
131+
132+
If a CWLProv research object is not detected or is invalid, `cwlprov` exits with a non-zero status, so it can guard further commands in a shell:
133+
134+
cwlprov && echo Do cwlprov-stuff
135+
ERROR:cwlprov.tool:Could not find bagit.txt, try cwlprov -d mybag/
136+
137+
Combined with the `--quiet` option, `cwlprov` can be used to find the root of a CWLProv folder:
138+
139+
test/revsort-cwlprov-0.4.0/metadata/provenance$ cwlprov -q
140+
/home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0
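The same check can be made from a Python script. The sketch below relies only on the behaviour shown above: exit status 0 and the root path on standard output when `--quiet` is used. The helper name `detect_root` and its parameters are hypothetical.

```python
import subprocess


def detect_root(command=("cwlprov", "--quiet"), cwd=None):
    """Run the command and return the path it prints, or None on failure."""
    try:
        result = subprocess.run(
            list(command), cwd=cwd, capture_output=True, text=True
        )
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return None
    if result.returncode != 0:
        # A non-zero exit status means no valid research object was detected.
        return None
    return result.stdout.strip()
```

Calling `detect_root(cwd="test/revsort-cwlprov-0.4.0/metadata/provenance")` would then return the root folder as a string, or `None` when detection fails.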
141+
142+
All commands of `cwlprov` will by default perform a _quick validation_, which confirms that all files are present with the expected file size. For instance, if we remove a file:
143+
144+
test/revsort-cwlprov-0.4.0$ rm data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376
145+
146+
test/revsort-cwlprov-0.4.0$ cwlprov
147+
ERROR:cwlprov.tool:BagIt validation failed for: /home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0: Payload-Oxum validation failed. Expected 3 files and 3333 bytes but found 2 files and 2222 bytes
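The quick validation above compares against the BagIt `Payload-Oxum`, which records the total byte count and the number of files under `data/` as `bytes.files`. A rough sketch of that comparison follows. The helper names are hypothetical, and how cwlprov reads the declared value from `bag-info.txt` is not shown here.

```python
from pathlib import Path


def payload_oxum(bag_root):
    """Return (total_bytes, file_count) for all files under data/."""
    files = [p for p in (Path(bag_root) / "data").rglob("*") if p.is_file()]
    return sum(p.stat().st_size for p in files), len(files)


def quick_check(bag_root, declared):
    """Compare the payload against a declared 'bytes.files' string."""
    octets, streams = (int(part) for part in declared.split("."))
    return payload_oxum(bag_root) == (octets, streams)
```

Because only sizes and counts are compared, this check is cheap but cannot notice a change that keeps the file size the same.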
148+
149+
To perform full validation, use `cwlprov validate`:
150+
151+
test/revsort-cwlprov-0.4.0$ cwlprov validate
152+
WARNING:bdbag.bdbagit:data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376 exists in manifest but was not found on filesystem
153+
ERROR:cwlprov.tool:BagIt validation failed for: /home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0: Bag validation failed: data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376 exists in manifest but was not found on filesystem
154+
155+
test/revsort-cwlprov-0.4.0$ git checkout .
156+
157+
test/revsort-cwlprov-0.4.0$ cwlprov validate
158+
Valid CWLProv RO: .
159+
160+
Unlike the quick validation, `cwlprov validate` will confirm checksums on all files, and thus detect byte-level changes. For instance, let's pretend `I` has been replaced with lower case `i` in a data file:
161+
162+
test/revsort-cwlprov-0.4.0$ sed -i 's/I/i/g' data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376
163+
test/revsort-cwlprov-0.4.0$ cwlprov
164+
Detected CWLProv Research Object: /home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0
165+
166+
test/revsort-cwlprov-0.4.0$ cwlprov validate
167+
WARNING:bdbag.bdbagit:data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376 sha1 validation failed: expected="327fc7aedf4f6b69a42a7c8b808dc5a7aff61376" found="60c41d3758bc8b03e78db07bc0f17d1804d2662d"
168+
ERROR:cwlprov.tool:BagIt validation failed for: /home/stain/src/cwlprov-py/test/revsort-cwlprov-0.4.0: Bag validation failed: data/32/327fc7aedf4f6b69a42a7c8b808dc5a7aff61376 sha1 validation failed: expected="327fc7aedf4f6b69a42a7c8b808dc5a7aff61376" found="60c41d3758bc8b03e78db07bc0f17d1804d2662d"
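Full validation of this kind can be sketched as a pass over a BagIt manifest, where each line holds a checksum followed by a relative path. This is an illustration only: the function name `verify_manifest` is hypothetical, the manifest file name `manifest-sha1.txt` is assumed from the `sha1` messages above, and the actual validation is delegated to the BagIt library seen in the log output.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_root, manifest="manifest-sha1.txt"):
    """Return a list of (path, problem) tuples; empty when every file matches."""
    root = Path(bag_root)
    problems = []
    for line in (root / manifest).read_text().splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, relative = line.split(maxsplit=1)
        target = root / relative
        if not target.is_file():
            problems.append((relative, "missing"))
            continue
        # Reads the whole file into memory, which is fine for a sketch.
        found = hashlib.sha1(target.read_bytes()).hexdigest()
        if found != expected:
            problems.append(
                (relative, f"sha1 mismatch: expected {expected}, found {found}")
            )
    return problems
```

A missing file and a modified file are reported separately, matching the two failure cases shown in this section.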
169+
170+
### Research Object information
171+
172+
The `cwlprov info` command gives high-level information about the research object and its identifiers.
173+
174+
test/revsort-cwlprov-0.4.0$ cwlprov info
175+
Research Object of CWL workflow run
176+
Research Object ID: arcp://uuid,d47d3d43-4830-44f0-aa32-4cda74849c63/
177+
Profile: https://w3id.org/cwl/prov/0.4.0
178+
Workflow run ID: urn:uuid:d47d3d43-4830-44f0-aa32-4cda74849c63
179+
Packaged: 2018-08-21
180+
181+
The `Profile` indicates the version of CWLProv that the research object implements,
182+
which determines which features of a workflow run are represented.
183+
184+
Note that a warning will be printed if an unknown CWLProv version is detected:
185+
186+
$ cwlprov
187+
WARNING:cwlprov.tool:Unsupported CWLProv version: {'https://w3id.org/cwl/prov/0.8.0'}
188+
Supported profiles:
189+
https://w3id.org/cwl/prov/0.6.0
190+
https://w3id.org/cwl/prov/0.5.0
191+
https://w3id.org/cwl/prov/0.4.0
192+
https://w3id.org/cwl/prov/0.3.0
193+
194+
This typically means that cwlprov-py is outdated, although that is normally harmless. Try `pip install --upgrade cwlprov` to upgrade.
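The warning prints a set of profile URIs, which suggests a simple set difference against the supported list. A small sketch of that idea follows. The names `SUPPORTED_PROFILES` and `unsupported_profiles` are hypothetical, and the URIs are copied from the output above.

```python
SUPPORTED_PROFILES = {
    "https://w3id.org/cwl/prov/0.6.0",
    "https://w3id.org/cwl/prov/0.5.0",
    "https://w3id.org/cwl/prov/0.4.0",
    "https://w3id.org/cwl/prov/0.3.0",
}


def unsupported_profiles(declared):
    """Return the declared profile URIs that are not in the supported set."""
    return set(declared) - SUPPORTED_PROFILES
```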
195+
196+
The `cwlprov who` command will try to determine the user that ran the workflow.
197+
198+
$ cwlprov who
199+
Packaged By: cwltool 1.0.20180925133620 <urn:uuid:d9c16ea5-c3fd-4c56-b125-f3a5207e6c38>
200+
Executed By: Stian Soiland-Reyes <https://orcid.org/0000-0001-9842-9718>
201+
202+
_Note that for privacy reasons, CWL executors like [cwltool](https://github.com/common-workflow-language/cwltool)
203+
do not log such user information unless this has been enabled with options like `--orcid`, `--full-name` or `--enable-user-provenance`._
204+
205+
### Workflow run
206+
207+
To list the step executions of a workflow use `cwlprov run`:
208+
209+
test/revsort-cwlprov-0.4.0$ cwlprov run
210+
2018-08-21 17:26:24.467844 Flow d47d3d43-4830-44f0-aa32-4cda74849c63 [ Run of workflow/packed.cwl#main
211+
2018-08-21 17:26:24.530884 Step 6f501717-0c97-492e-b18a-10bc096f1797 Run of workflow/packed.cwl#main/rev (0:00:01.122498)
212+
2018-08-21 17:26:25.656084 Step e7c8b2c0-dee6-4c61-b674-f0807cb47344 Run of workflow/packed.cwl#main/sorted (0:00:01.087999)
213+
2018-08-21 17:26:26.752493 Flow d47d3d43-4830-44f0-aa32-4cda74849c63 ] Run of workflow/packed.cwl#main (0:00:02.284649)
214+
Legend:
215+
[ Workflow start
216+
] Workflow end
217+
218+
The listing can be customized, see `cwlprov run --help` for details. For example:
219+
220+
test/revsort-cwlprov-0.4.0$ cwlprov --no-hints run --no-labels --start --end --no-duration
221+
2018-08-21 17:26:24.467844 Flow d47d3d43-4830-44f0-aa32-4cda74849c63 [
222+
2018-08-21 17:26:24.530884 2018-08-21 17:26:25.653382 Step 6f501717-0c97-492e-b18a-10bc096f1797
223+
2018-08-21 17:26:25.656084 2018-08-21 17:26:26.744083 Step e7c8b2c0-dee6-4c61-b674-f0807cb47344
224+
2018-08-21 17:26:26.752493 Flow d47d3d43-4830-44f0-aa32-4cda74849c63 ]
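The durations in the default listing agree with the difference between the start and end timestamps shown here. A short sketch, assuming the timestamp layout printed above, with a hypothetical helper named `duration`:

```python
from datetime import datetime

# Timestamp layout as printed in the listings above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def duration(start, end):
    """Return the elapsed time between two timestamp strings as a timedelta."""
    return datetime.strptime(end, TIMESTAMP_FORMAT) - datetime.strptime(
        start, TIMESTAMP_FORMAT
    )


if __name__ == "__main__":
    # Start and end of the first step in the listing above.
    print(duration("2018-08-21 17:26:24.530884", "2018-08-21 17:26:25.653382"))
    # prints 0:00:01.122498
```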
225+
226+
### Nested workflows
227+
228+
Nested workflows (steps that are themselves workflows) are indicated in the `cwlprov run` output with a `*`:
229+
230+
(venv3) stain@biggie:~/src/cwlprov-py/test/nested-cwlprov-0.3.0$ cwlprov run
231+
2018-08-08 22:44:06.573330 Flow 39408a40-c1c8-4852-9747-87249425be1e [ Run of workflow/packed.cwl#main
232+
2018-08-08 22:44:06.691722 Step 4f082fb6-3e4d-4a21-82e3-c685ce3deb58 Run of workflow/packed.cwl#main/create-tar (0:00:00.010133)
233+
2018-08-08 22:44:06.702976 Step 0cceeaf6-4109-4f08-940b-f06ac959944a * Run of workflow/packed.cwl#main/compile (unknown duration)
234+
2018-08-08 22:44:12.680097 Flow 39408a40-c1c8-4852-9747-87249425be1e ] Run of workflow/packed.cwl#main (0:00:06.106767)
235+
Legend:
236+
[ Workflow start
237+
* Nested provenance, use UUID to explore: cwlprov run 0cceeaf6-4109-4f08-940b-f06ac959944a
238+
] Workflow end
239+
240+
(venv3) stain@biggie:~/src/cwlprov-py/test/nested-cwlprov-0.3.0$ cwlprov run 0cceeaf6-4109-4f08-940b-f06ac959944a
241+
2018-08-08 22:44:06.607210 Flow 0cceeaf6-4109-4f08-940b-f06ac959944a [ Run of workflow/packed.cwl#main
242+
2018-08-08 22:44:06.707070 Step 83752ab4-8227-4d4a-8baa-78376df34aed Run of workflow/packed.cwl#main/untar (0:00:00.008149)
243+
2018-08-08 22:44:06.718554 Step f56d8478-a190-4251-84d9-7f69fe0f6f8b Run of workflow/packed.cwl#main/argument (0:00:00.532052)
244+
2018-08-08 22:44:07.251588 Flow 0cceeaf6-4109-4f08-940b-f06ac959944a ] Run of workflow/packed.cwl#main (0:00:00.644378)
245+
Legend:
246+
[ Workflow start
247+
] Workflow end
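For scripting, the UUIDs of nested steps can be picked out of a captured listing. The sketch below assumes the whitespace-separated layout shown above (date, time, `Step`, UUID, then an optional `*`), which may change between versions or with options such as `--no-start`. The function name `nested_step_ids` is hypothetical.

```python
def nested_step_ids(listing):
    """Return the UUIDs of steps marked with '*' in a `cwlprov run` listing."""
    ids = []
    for line in listing.splitlines():
        tokens = line.split()
        # Expected tokens: date, time, "Step", UUID, optional "*", label...
        if len(tokens) >= 5 and tokens[2] == "Step" and tokens[4] == "*":
            ids.append(tokens[3])
    return ids
```

Each returned UUID can then be passed to `cwlprov run` to explore the nested provenance.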
248+
249+
To explore the nested workflow run with other commands, provide the run UUID with the `--run` argument, e.g.:
250+
251+
test/nested-cwlprov-0.3.0$ cwlprov outputs --format=files --run 0cceeaf6-4109-4f08-940b-f06ac959944a 83752ab4-8227-4d4a-8baa-78376df34aed
252+
Output example_out:
253+
data/93/93035905e94e150874f5a881d39f3c5c6378dd38
254+
26255

27256
## License
28257
