Tests

Testing sessions:

x = 1
print(x*2)

Timer formatting

Timer is off by default. These tests enable it explicitly:

print(1)
import time
time.sleep(3)

Turn off timer-show to hide it

print(1)

Set :timer-rounded to no to get the full timer. (The timer string is also modified here so that my expect tests skip it.)

print(1)
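As a rough illustration of the difference between a rounded and a full timer, here is a minimal sketch. The helper name `format_cell_timer` and its `rounded` parameter are hypothetical and are not taken from ob-python-extras; only the `Cell Timer: 0:00:00` shape comes from the test output in this file.

```python
from datetime import timedelta


def format_cell_timer(elapsed_seconds: float, rounded: bool = True) -> str:
    """Hypothetical helper: render elapsed seconds as a 'Cell Timer:' line."""
    if rounded:
        # Drop the fractional part so the line stays short and stable.
        elapsed_seconds = round(elapsed_seconds)
    return f"Cell Timer: {timedelta(seconds=elapsed_seconds)}"


print(format_cell_timer(3.004217))                 # rounded timer
print(format_cell_timer(3.004217, rounded=False))  # full timer with microseconds
```

The rounded form is easier to match in expect-style tests, since the fractional part changes on every run.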

Table formatting

As org tables

By default dataframes are printed as org tables

import pandas as pd
data = {
    'Name': ['Joe', 'Eva', 'Charlie', 'David', 'Eva'],
    'Age': [44, 32, 33, 33, 22],
    'City': ['New York', 'San Francisco', 'Boston', 'Paris', 'Tokyo'],
    'Score': [92.5, 88.0, 95.2, 78.9, 90.11111],
}
df = pd.DataFrame(data)
print(df)
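| idx | Name    | Age | City          | Score    |
|-----+---------+-----+---------------+----------|
| 0   | Joe     | 44  | New York      | 92.5     |
| 1   | Eva     | 32  | San Francisco | 88.0     |
| 2   | Charlie | 33  | Boston        | 95.2     |
| 3   | David   | 33  | Paris         | 78.9     |
| 4   | Eva     | 22  | Tokyo         | 90.11111 |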
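To make the target format concrete, here is a minimal sketch of turning headers and rows into an org-style table. The helper `rows_to_org_table` is hypothetical and works on plain lists rather than dataframes; it is not how ob-python-extras does it, only an illustration of the layout.

```python
def rows_to_org_table(headers, rows):
    """Hypothetical helper: render headers and rows as an org-mode table."""
    cells = [[str(h) for h in headers]] + [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]

    def fmt(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    # Org separates the header from the body with a |---+---| rule.
    rule = "|" + "+".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join([fmt(cells[0]), rule] + [fmt(r) for r in cells[1:]])


table = rows_to_org_table(["idx", "Name", "Age"], [[0, "Joe", 44], [1, "Eva", 32]])
print(table)
```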

This respects various pandas options:

Float formatting

pd.options.display.float_format = '{:.1f}'.format
print(df.set_index("Name"))
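| Name    | Age | City          | Score |
|---------+-----+---------------+-------|
| Joe     | 44  | New York      | 92.5  |
| Eva     | 32  | San Francisco | 88.0  |
| Charlie | 33  | Boston        | 95.2  |
| David   | 33  | Paris         | 78.9  |
| Eva     | 22  | Tokyo         | 90.1  |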
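The float_format option takes a callable that maps a float to a string. As a quick check that needs no pandas, this sketch applies the same formatter to the Score values used above:

```python
# The same callable that is assigned to pd.options.display.float_format above.
fmt = '{:.1f}'.format

scores = [92.5, 88.0, 95.2, 78.9, 90.11111]
formatted = [fmt(s) for s in scores]
print(formatted)
```

Only the last value changes visibly, which matches the 90.1 in the table.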

Max rows

pd.options.display.max_rows = 10
long_df = pd.DataFrame({'A': range(200)})
print(long_df)
| idx | A |
|-----+---|
| 0   | 0 |
| 1   | 1 |
| 2   | 2 |
| 3   | 3 |
| 4   | 4 |
| 5   | 5 |
| 6   | 6 |
| 7   | 7 |
| 8   | 8 |
| 9   | 9 |

Problem – hangs when printing large dataframes.

print_org_df sets max_rows to 20 by default to avoid this issue.

import pandas as pd
long_df = pd.DataFrame({'A': range(400)})
print(long_df)
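| idx | A  |
|-----+----|
| 0   | 0  |
| 1   | 1  |
| 2   | 2  |
| 3   | 3  |
| 4   | 4  |
| 5   | 5  |
| 6   | 6  |
| 7   | 7  |
| 8   | 8  |
| 9   | 9  |
| 10  | 10 |
| 11  | 11 |
| 12  | 12 |
| 13  | 13 |
| 14  | 14 |
| 15  | 15 |
| 16  | 16 |
| 17  | 17 |
| 18  | 18 |
| 19  | 19 |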
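The capping behavior visible in the output above (only the first 20 of 400 rows are shown) can be sketched as follows. The helper `head_rows` is hypothetical and is not the actual print_org_df code; it assumes rows arrive as any iterable.

```python
from itertools import islice


def head_rows(rows, max_rows=20):
    """Hypothetical helper: keep at most max_rows rows before rendering."""
    # islice stops early, so a very long input is never fully materialized.
    return list(islice(rows, max_rows))


shown = head_rows(range(400))
print(len(shown), shown[0], shown[-1])
```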

If we make max_rows even modestly large, we can run into the hang again, depending on computing resources.

pd.options.display.max_rows = 200
long_df = pd.DataFrame({'A': range(200)})
print(long_df)
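| idx | A   |
|-----+-----|
| 0   | 0   |
| 1   | 1   |
| 2   | 2   |
| 3   | 3   |
| 4   | 4   |
| 5   | 5   |
| 6   | 6   |
| 7   | 7   |
| 8   | 8   |
| 9   | 9   |
| 10  | 10  |
| 11  | 11  |
| 12  | 12  |
| 13  | 13  |
| 14  | 14  |
| 15  | 15  |
| 16  | 16  |
| 17  | 17  |
| 18  | 18  |
| 19  | 19  |
| 20  | 20  |
| 21  | 21  |
| 22  | 22  |
| 23  | 23  |
| 24  | 24  |
| 25  | 25  |
| 26  | 26  |
| 27  | 27  |
| 28  | 28  |
| 29  | 29  |
| 30  | 30  |
| 31  | 31  |
| 32  | 32  |
| 33  | 33  |
| 34  | 34  |
| 35  | 35  |
| 36  | 36  |
| 37  | 37  |
| 38  | 38  |
| 39  | 39  |
| 40  | 40  |
| 41  | 41  |
| 42  | 42  |
| 43  | 43  |
| 44  | 44  |
| 45  | 45  |
| 46  | 46  |
| 47  | 47  |
| 48  | 48  |
| 49  | 49  |
| 50  | 50  |
| 51  | 51  |
| 52  | 52  |
| 53  | 53  |
| 54  | 54  |
| 55  | 55  |
| 56  | 56  |
| 57  | 57  |
| 58  | 58  |
| 59  | 59  |
| 60  | 60  |
| 61  | 61  |
| 62  | 62  |
| 63  | 63  |
| 64  | 64  |
| 65  | 65  |
| 66  | 66  |
| 67  | 67  |
| 68  | 68  |
| 69  | 69  |
| 70  | 70  |
| 71  | 71  |
| 72  | 72  |
| 73  | 73  |
| 74  | 74  |
| 75  | 75  |
| 76  | 76  |
| 77  | 77  |
| 78  | 78  |
| 79  | 79  |
| 80  | 80  |
| 81  | 81  |
| 82  | 82  |
| 83  | 83  |
| 84  | 84  |
| 85  | 85  |
| 86  | 86  |
| 87  | 87  |
| 88  | 88  |
| 89  | 89  |
| 90  | 90  |
| 91  | 91  |
| 92  | 92  |
| 93  | 93  |
| 94  | 94  |
| 95  | 95  |
| 96  | 96  |
| 97  | 97  |
| 98  | 98  |
| 99  | 99  |
| 100 | 100 |
| 101 | 101 |
| 102 | 102 |
| 103 | 103 |
| 104 | 104 |
| 105 | 105 |
| 106 | 106 |
| 107 | 107 |
| 108 | 108 |
| 109 | 109 |
| 110 | 110 |
| 111 | 111 |
| 112 | 112 |
| 113 | 113 |
| 114 | 114 |
| 115 | 115 |
| 116 | 116 |
| 117 | 117 |
| 118 | 118 |
| 119 | 119 |
| 120 | 120 |
| 121 | 121 |
| 122 | 122 |
| 123 | 123 |
| 124 | 124 |
| 125 | 125 |
| 126 | 126 |
| 127 | 127 |
| 128 | 128 |
| 129 | 129 |
| 130 | 130 |
| 131 | 131 |
| 132 | 132 |
| 133 | 133 |
| 134 | 134 |
| 135 | 135 |
| 136 | 136 |
| 137 | 137 |
| 138 | 138 |
| 139 | 139 |
| 140 | 140 |
| 141 | 141 |
| 142 | 142 |
| 143 | 143 |
| 144 | 144 |
| 145 | 145 |
| 146 | 146 |
| 147 | 147 |
| 148 | 148 |
| 149 | 149 |
| 150 | 150 |
| 151 | 151 |
| 152 | 152 |
| 153 | 153 |
| 154 | 154 |
| 155 | 155 |
| 156 | 156 |
| 157 | 157 |
| 158 | 158 |
| 159 | 159 |
| 160 | 160 |
| 161 | 161 |
| 162 | 162 |
| 163 | 163 |
| 164 | 164 |
| 165 | 165 |
| 166 | 166 |
| 167 | 167 |
| 168 | 168 |
| 169 | 169 |
| 170 | 170 |
| 171 | 171 |
| 172 | 172 |
| 173 | 173 |
| 174 | 174 |
| 175 | 175 |
| 176 | 176 |
| 177 | 177 |
| 178 | 178 |
| 179 | 179 |
| 180 | 180 |
| 181 | 181 |
| 182 | 182 |
| 183 | 183 |
| 184 | 184 |
| 185 | 185 |
| 186 | 186 |
| 187 | 187 |
| 188 | 188 |
| 189 | 189 |
| 190 | 190 |
| 191 | 191 |
| 192 | 192 |
| 193 | 193 |
| 194 | 194 |
| 195 | 195 |
| 196 | 196 |
| 197 | 197 |
| 198 | 198 |
| 199 | 199 |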

Printing multiple dataframes:

print(df)
print("Space between dataframes")
print(df)
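| idx | Name    | Age | City          | Score |
|-----+---------+-----+---------------+-------|
| 0   | Joe     | 44  | New York      | 92.5  |
| 1   | Eva     | 32  | San Francisco | 88.0  |
| 2   | Charlie | 33  | Boston        | 95.2  |
| 3   | David   | 33  | Paris         | 78.9  |
| 4   | Eva     | 22  | Tokyo         | 90.1  |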

Space between dataframes

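| idx | Name    | Age | City          | Score |
|-----+---------+-----+---------------+-------|
| 0   | Joe     | 44  | New York      | 92.5  |
| 1   | Eva     | 32  | San Francisco | 88.0  |
| 2   | Charlie | 33  | Boston        | 95.2  |
| 3   | David   | 33  | Paris         | 78.9  |
| 4   | Eva     | 22  | Tokyo         | 90.1  |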

In general, a space between dataframes requires the tables below it to be re-aligned. I have an advice function (adjust-org-babel-results) that does this, but it can be slow if there are many tables in the org file, so it can be disabled like this.

print(df)
print("Space between dataframes")
print(df)
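| idx | Name    | Age | City          | Score |
|-----+---------+-----+---------------+-------|
| 0   | Joe     | 44  | New York      | 92.5  |
| 1   | Eva     | 32  | San Francisco | 88.0  |
| 2   | Charlie | 33  | Boston        | 95.2  |
| 3   | David   | 33  | Paris         | 78.9  |
| 4   | Eva     | 22  | Tokyo         | 90.1  |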

Space between dataframes

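| idx | Name    | Age | City          | Score |
|-----+---------+-----+---------------+-------|
| 0   | Joe     | 44  | New York      | 92.5  |
| 1   | Eva     | 32  | San Francisco | 88.0  |
| 2   | Charlie | 33  | Boston        | 95.2  |
| 3   | David   | 33  | Paris         | 78.9  |
| 4   | Eva     | 22  | Tokyo         | 90.1  |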

Bug – tables whose cells contain | are rendered incorrectly.

We need a way to handle | characters in string values.

import pandas as pd


df = pd.DataFrame({"names": ["John \vert", "Mary", "Bob  Rob", "Alice John", "Tom"]})
print(df)
idxnames
0John

ert \

1Mary
2Bob Rob
3Alice John
4Tom

One workaround is to call to_markdown directly, since ob-python-extras converts any | that is not in a dataframe into \ to prevent org from incorrectly recognizing text as tables.

import pandas as pd

df = pd.DataFrame({"names": ["John", "Mary", "Bob|Rob", "Alice|John", "Tom"]})
print(df.to_markdown())
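Another possible direction is to escape the pipes in cell text before the table is built. The sketch below is an assumption, not something ob-python-extras is confirmed to do: the helper `escape_pipes` is hypothetical, and it assumes org's `\vert` entity (written `\vert{}` so it can sit next to other text) is an acceptable stand-in for a literal |.

```python
def escape_pipes(text: str) -> str:
    """Hypothetical helper: swap literal '|' for an org entity (an assumption)."""
    # The raw string matters: in a normal Python string literal, "\v" is the
    # vertical-tab escape, so "\vert" would start with a control character.
    return text.replace("|", r"\vert{}")


names = ["John", "Mary", "Bob|Rob", "Alice|John", "Tom"]
escaped = [escape_pipes(n) for n in names]
print(escaped)
```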

Displaying styled dataframes as pngs

Dataframes can also be displayed as styled dataframes. This is nice for exporting documents with pretty tables.

Removed because I haven’t been able to get it to work in CI.

#+name: styled_dataframes

styled_df = df.style.background_gradient()
print(styled_df)

Polars

Polars dataframes are always printed as an org table as well.

import polars as pl

df = pl.DataFrame({"x": [1, 1, 3], "y": [2, 3, 1]})
print(df)
| idx | x | y |
|-----+---+---|
| 0   | 1 | 2 |
| 1   | 1 | 3 |
| 2   | 3 | 1 |

Cell Timer: 0:00:00

Testing Tabulate

If tabulate is available, we can use it directly to format the dataframe. Support for it is built into pandas, making it the safer option.

#+name print_with_tabulate

import pandas as pd
data = {
    'Name': ['Joe', 'Eva', 'Charlie', 'David', 'Eva'],
    'Age': [44, 32, 33, 33, 22],
    'City': ['New York', 'San Francisco', 'Boston', 'Paris', 'Tokyo'],
    'Score': [92.5, 88.0, 95.2, 78.9, 90.11111],
}
df = pd.DataFrame(data)
print(df)
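| idx | Name    | Age | City          | Score    |
|-----+---------+-----+---------------+----------|
| 0   | Joe     | 44  | New York      | 92.5     |
| 1   | Eva     | 32  | San Francisco | 88.0     |
| 2   | Charlie | 33  | Boston        | 95.2     |
| 3   | David   | 33  | Paris         | 78.9     |
| 4   | Eva     | 22  | Tokyo         | 90.11111 |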

Images

ob-python-extras mocks out Python plotting so that plots can be interspersed with printed output and so that several plots can be made in one block. :)

import matplotlib.pyplot as plt
import pandas as pd

print("look!")
df = pd.DataFrame(
    {
        "x": [0, 2, 3, 4, 5, 6, 7],
        "y": [10, 11, 12, 13, 14, 15, 16],
    }
)
print(df)
df.plot(x="x", y="y", kind="line")
plt.show()
print("tada!")
| idx | x | y  |
|-----+---+----|
| 0   | 0 | 10 |
| 1   | 2 | 11 |
| 2   | 3 | 12 |
| 3   | 4 | 13 |
| 4   | 5 | 14 |
| 5   | 6 | 15 |
| 6   | 7 | 16 |

plots/babel-formatting/plot_20260202_175605_1346249.png

tada!

HTML formatting

import base64
from io import BytesIO

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Create sample data
df = pd.DataFrame(
    {
        "x": np.linspace(0, 10, 100),
        "sin": np.sin(np.linspace(0, 10, 100)),
        "cos": np.cos(np.linspace(0, 10, 100)),
    }
)

# Create matplotlib plot
plt.figure(figsize=(8, 4))
plt.plot(df["x"][:20], df["sin"][:20], label="sin")
plt.plot(df["x"][:20], df["cos"][:20], label="cos")
plt.legend()
plt.title("Sine and Cosine Waves")

# Convert plot to base64
buf = BytesIO()
plt.savefig(buf, format="png")
plt.close()
img_base64 = base64.b64encode(buf.getvalue()).decode("utf-8")

# Create HTML with table and image
html = f"""
<h1>Data Analysis Results</h1>
<p>Here's a sample of our trigonometric functions:</p>
{df.head().to_html(classes='dataframe')}
<p><b>Visualization:</b></p>
<img src="data:image/png;base64,{img_base64}"/>
<p><i>Figure 1: First few periods of sine and cosine waves</i></p>
"""

print(html)
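|   | x       | sin      | cos      |
|---+---------+----------+----------|
| 0 | 0.00000 | 0.000000 | 1.000000 |
| 1 | 0.10101 | 0.100838 | 0.994903 |
| 2 | 0.20202 | 0.200649 | 0.979663 |
| 3 | 0.30303 | 0.298414 | 0.954437 |
| 4 | 0.40404 | 0.393137 | 0.919480 |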

Visualization:

plots/babel-formatting/aa998cf338aab4a386851d0dff713417f9d85a3a.png

Figure 1: First few periods of sine and cosine waves

dataframe_image can also be used to get styled dataframes from the HTML output as PNGs.

Error handling

print(1 / 0)
x = 0
print(1 / 0)

Get more detailed errors
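One way to surface more detail, sketched here with only the standard library, is to compile the block under a recognizable filename and capture the full traceback. The function `run_with_traceback` and the `<org-src-block>` filename are hypothetical names for illustration, not the ob-python-extras implementation.

```python
import traceback


def run_with_traceback(source, filename="<org-src-block>"):
    """Hypothetical runner: return the formatted traceback, or None on success."""
    try:
        # Compiling with a filename makes the traceback report line numbers
        # relative to the block itself.
        exec(compile(source, filename, "exec"), {})
    except Exception:
        return traceback.format_exc()
    return None


report = run_with_traceback("x = 0\nprint(1 / 0)")
print(report)
```

The report names the exception type and the line within the block that raised it.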

Last line print

#+name testing_last_line_print

x = 1
print(x)
1000 * 2 + x

Edge case handling

The last line might not be an expression. Ideally this would find the last expression, but I’m settling for just not crashing.

(This is achieved by checking whether the code without the last line is valid Python: if it is, that code is exec’d first; otherwise the whole block is exec’d. I don’t like relying on _.)

#+name testing_last_line_print_not_full_expr

(
    1,
    2,
    3,
    4,
    1,
)

We also need to handle a comment on the last line. In general, print(last_line) is checked to be valid Python syntax before it is used.

#+name last_line_a_comment

print(1)
# a comment
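The fallback described in this section can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual implementation: the function name `run_block` is hypothetical, and it returns the last line's value instead of printing it.

```python
import ast


def run_block(source, namespace=None):
    """Hypothetical runner: exec a block and return the last line's value, if any."""
    namespace = {} if namespace is None else namespace
    lines = source.rstrip().splitlines()
    head = "\n".join(lines[:-1])
    last = lines[-1] if lines else ""
    try:
        # Is everything before the last line valid Python on its own?
        ast.parse(head)
        # Is the last line a standalone expression (not a comment or fragment)?
        last_expr = compile(last, "<last-line>", "eval")
    except SyntaxError:
        # Fallback: run the whole block as-is and show nothing extra.
        exec(source, namespace)
        return None
    exec(head, namespace)
    return eval(last_expr, namespace)


print(run_block("x = 1\n1000 * 2 + x"))     # last line is an expression
print(run_block("(\n    1,\n    2,\n)"))    # last line is only a closing paren
print(run_block("y = 1\n# a comment"))      # last line is a comment
```

The multi-line tuple and the trailing comment both take the fallback path, so neither crashes; the cost is that the tuple's value is not shown.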

Torch

import torch
x = torch.randn(3, 3)
print(x)

import torch

# Test tensor
x = torch.randn(3, 3)
print("Tensor:", x)
# Test dict
d = {"name": "Alice", "age": 30, "city": "New York", "hobbies": ["reading", "coding"]}
print("Dict:", d)

# Test list
l = [1, 2, 3, [4, 5, 6], {"nested": "dict"}]
print("List:", l)

# Test set
s = {1, 2, 3, 4, 5}
print("Set:", s)

import os
os.path
print(100)