Hello, I am glad to see the work which has been done here!
I use a lot of notebooks, and I would like to automate them and collect the output of specific cells.
I think the easiest way to do so would be to run a notebook with dotnet-repl, identify a cell, and get its raw output as a file.
I have a crude pipeline, and I am looking for a way to improve it, either with existing NuGet packages or by suggesting a feature.
The problems
1: a way to give a name or id to a cell
I am talking about a user-defined name, not an automatic one like the cell number.
- With the ipynb format, a first-line comment such as #id:myFinalOutput can be parsed later from the cell source.
- With the trx format, the first line of the cell is used to build the test name:
testName="Cell X: first_line_of_the_cell".
- Maybe a magic command would be better?
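As an illustration of the first-line-comment approach, here is a minimal Python sketch of how a post-processing step might read the id back from a cell. The function name cell_id and the exact pattern are assumptions made for this example; only the #id:name convention comes from the description above. In ipynb files the cell source can be either a string or a list of lines, so both are handled.

```python
import re

# Hypothetical convention from this issue: the first line of the cell is "#id:<name>".
ID_PATTERN = re.compile(r"^#\s*id:\s*(\S+)")


def cell_id(source):
    """Return the user-defined id found on the first line of a cell, or None."""
    # ipynb stores "source" either as one string or as a list of lines.
    text = "".join(source) if isinstance(source, list) else source
    lines = text.splitlines()
    if not lines:
        return None
    match = ID_PATTERN.match(lines[0])
    return match.group(1) if match else None
```

For example, cell_id(["#id:myFinalOutput\n", "Console.WriteLine(1);"]) gives "myFinalOutput", while a cell without the comment gives None.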
2: a way to get a raw output of the cell
- Using ipynb format, the output looks like this:
"outputs": [
  {
    "name": "stdout",
    "output_type": "stream",
    "text": [
      "0\r\n",
      "1\r\n",
      "2\r\n",
      "3\r\n"
    ]
  }
]
- Using trx format, the output looks like this:
<Output>
<StdOut>0
1
2
3</StdOut>
</Output>
It is possible to parse these files, but I think it would be better if dotnet-repl wrote the raw output directly to a file.
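For reference, this is roughly what the parsing looks like today. It is a minimal Python sketch, not part of dotnet-repl: the function names are made up for the example, and the trx layout (a UnitTestResult element carrying testName and containing Output/StdOut) is assumed from the excerpt above. XML namespaces are ignored by comparing local tag names only.

```python
import json
import xml.etree.ElementTree as ET


def ipynb_stdout(notebook_json):
    """Yield (cell_index, raw_stdout) for every cell that wrote to stdout."""
    notebook = json.loads(notebook_json)
    for index, cell in enumerate(notebook.get("cells", [])):
        chunks = []
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream" and output.get("name") == "stdout":
                text = output.get("text", "")
                # "text" may be a single string or a list of lines.
                chunks.append("".join(text) if isinstance(text, list) else text)
        if chunks:
            yield index, "".join(chunks)


def _local(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def trx_stdout(trx_xml):
    """Yield (testName, raw_stdout) for every result that has a StdOut element."""
    root = ET.fromstring(trx_xml)
    for result in root.iter():
        if _local(result.tag) != "UnitTestResult":
            continue
        for node in result.iter():
            if _local(node.tag) == "StdOut":
                yield result.get("testName"), node.text or ""
```

Both functions take the file content as a string, so they can be tried on small excerpts without touching the file system.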
Proposal for one output
The notebook
#!cellid myFinalOutput
Console.WriteLine("Hello World");
The command line
dotnet repl --run /path/to/notebook.ipynb --output-path /path/to/output.txt --cellid myFinalOutput
The output (output.txt)
Hello World
As the output is raw, it could also be HTML.
Proposal for several outputs
The same idea, extended to several cells with one output file per cell.
The notebook
#!cellid myIntermediaryOutput
2 + 2
#!cellid myFinalOutput
Console.WriteLine("Hello World");
The command line
dotnet repl --run /path/to/notebook.ipynb --output-path /path/to/folder --cellid myIntermediaryOutput,myFinalOutput
The output (folder)
- myIntermediaryOutput.txt
4
- myFinalOutput.txt
Hello World
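Until something like this exists, the same result can be approximated by post-processing the ipynb file that dotnet-repl produces. The Python sketch below is only an approximation under several assumptions: export_cells is a hypothetical helper, cells are identified with the #id:name first-line comment rather than the proposed #!cellid magic command, and returned values such as 2 + 2 are assumed to be stored under a "text/plain" entry, which may not match how every kernel serializes them.

```python
import json
import re
from pathlib import Path

# Hypothetical convention: the first line of the cell is "#id:<name>".
ID_PATTERN = re.compile(r"^#\s*id:\s*(\S+)")


def _join(text):
    """ipynb stores text either as one string or as a list of lines."""
    return "".join(text) if isinstance(text, list) else text


def _raw_output(cell):
    """Concatenate stdout streams and (assumed) text/plain results of a cell."""
    parts = []
    for output in cell.get("outputs", []):
        if output.get("output_type") == "stream" and output.get("name") == "stdout":
            parts.append(_join(output.get("text", "")))
        elif "text/plain" in output.get("data", {}):
            parts.append(_join(output["data"]["text/plain"]))
    return "".join(parts)


def export_cells(notebook_path, output_dir, wanted_ids):
    """Write <id>.txt into output_dir for each requested cell; return the paths."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for cell in notebook.get("cells", []):
        lines = _join(cell.get("source", "")).splitlines()
        match = ID_PATTERN.match(lines[0]) if lines else None
        if match and match.group(1) in wanted_ids:
            target = out / f"{match.group(1)}.txt"
            # Write bytes so that line endings in the raw output are preserved.
            target.write_bytes(_raw_output(cell).encode("utf-8"))
            written.append(target)
    return written
```

A call such as export_cells("/path/to/notebook.ipynb", "/path/to/folder", {"myIntermediaryOutput", "myFinalOutput"}) would then produce one text file per matching cell.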