Commit abd718c

Add numba parallel example
1 parent ae1c5bc commit abd718c

2 files changed: 322 additions, 0 deletions

source-code/numba/README.md

Lines changed: 2 additions & 0 deletions
@@ -12,3 +12,5 @@ can be obtained without much effort.
 1. `Primes`: code to compute the first n prime numbers comparing a pure Python
    implementation with numba JIT and eager JIT.
 1. `Ufunc`: defining a numpy ufunc using numba.
+1. `numba_parallel.ipynb`: jupyter notebook experimenting with numba's
+   parallel capabilities.
Lines changed: 320 additions & 0 deletions
@@ -0,0 +1,320 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "647ec939-d34d-4a92-8a8d-60e0078fee69",
+   "metadata": {},
+   "source": [
+    "# Requirements"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "98749cd7-43b1-45b4-8ad8-c68006996d22",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from numba import njit\n",
+    "import numpy as np\n",
+    "import random"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e3a9b00-2e46-4b57-9951-3d649a9ed193",
+   "metadata": {},
+   "source": [
+    "# Random $\\pi$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c79e332-5fb5-4103-8a02-74ef83972464",
+   "metadata": {},
+   "source": [
+    "Compute $\\pi$ by generating random points in the unit square and counting the fraction that falls inside the quarter of the unit circle contained in that square; this fraction approximates $\\pi/4$."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "75bbae32-14d6-44f4-b83d-3dda129355d2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def compute_pi(nr_tries):\n",
+    "    hits = 0\n",
+    "    for _ in range(nr_tries):\n",
+    "        x = random.random()\n",
+    "        y = random.random()\n",
+    "        if x**2 + y**2 < 1.0:\n",
+    "            hits += 1\n",
+    "    return 4.0*hits/nr_tries"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "f96298c8-d477-4da6-a19f-0de852c81329",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@njit\n",
+    "def compute_pi_jit(nr_tries):\n",
+    "    hits = 0\n",
+    "    for _ in range(nr_tries):\n",
+    "        x = random.random()\n",
+    "        y = random.random()\n",
+    "        if x**2 + y**2 < 1.0:\n",
+    "            hits += 1\n",
+    "    return 4.0*hits/nr_tries"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "id": "a0d922f3-13ba-4c6d-beeb-c8292b1baf67",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@njit(['float64(int64)'])\n",
+    "def compute_pi_jit_sign(nr_tries):\n",
+    "    hits = 0\n",
+    "    for _ in range(nr_tries):\n",
+    "        x = random.random()\n",
+    "        y = random.random()\n",
+    "        if x**2 + y**2 < 1.0:\n",
+    "            hits += 1\n",
+    "    return 4.0*hits/nr_tries"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "b830e45b-bc46-42f6-9b40-2f636c9989cd",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "27.1 ms ± 277 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit compute_pi(100_000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "b98f5c18-a5fb-468c-8f96-ca25782ebac8",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "687 µs ± 9.53 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit compute_pi_jit(100_000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "id": "78c37a87-dd0c-49c6-ac8d-85e6f21832c2",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "685 µs ± 8.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit compute_pi_jit_sign(np.int64(100_000))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "18e84532-48d9-4aa0-8218-709888e3162e",
+   "metadata": {},
+   "source": [
+    "Using numba's just-in-time compiler speeds up the computation by a factor of about 40 (687 µs versus 27.1 ms for 100,000 tries). Specifying an explicit signature makes no measurable difference (685 µs)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8fc275a6-8ac6-4481-beed-41896c5b39e9",
+   "metadata": {},
+   "source": [
+    "# Quadrature $\\pi$"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "32312a15-8070-4a8a-a198-170252d8efde",
+   "metadata": {},
+   "source": [
+    "Another method to compute $\\pi$ is to compute the definite integral\n",
+    "$$\n",
+    "\\frac{\\pi}{2} = \\int_{-1}^{1} \\sqrt{1 - x^2} dx\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 38,
+   "id": "17597694-cb80-4e2a-aa4c-c4c9e6d6de84",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@njit\n",
+    "def quad_pi_jit(nr_steps):\n",
+    "    delta = 2.0/nr_steps\n",
+    "    x = np.linspace(-1.0, 1.0, nr_steps)\n",
+    "    f = np.empty_like(x)\n",
+    "    for i in range(x.size):\n",
+    "        f[i] = np.sqrt(1.0 - x[i]**2)\n",
+    "    return 2.0*f.sum()*delta"
+   ]
+  },
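+  {
+   "cell_type": "markdown",
+   "id": "89b96c0e-78bf-4d89-b962-a3dd1cc9e92a",
+   "metadata": {},
+   "source": [
+    "We can write the computation so that each loop iteration is independent: every function value is stored in its own array element and the sum is taken afterwards, so no reduction is needed inside the loop. Note that numba does support simple scalar reductions, but only loops written with `prange` are parallelized; with `parallel=True`, a plain `range` loop like the one below still runs serially, and only the array operations numba recognizes (such as `f.sum()`) are candidates for parallel execution."
+   ]
+  },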
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "id": "39a14289-55a9-4775-b902-3d1f1b7f58ec",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@njit(parallel=True)\n",
+    "def quad_pi_par(nr_steps):\n",
+    "    delta = 2.0/nr_steps\n",
+    "    x = np.linspace(-1.0, 1.0, nr_steps)\n",
+    "    f = np.empty_like(x)\n",
+    "    for i in range(x.size):\n",
+    "        f[i] = np.sqrt(1.0 - x[i]**2)\n",
+    "    return 2.0*f.sum()*delta"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4ea90942-284e-454d-a750-bd9d08ff057e",
+   "metadata": {},
+   "source": [
+    "The pure numpy implementation for comparison."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 44,
+   "id": "f1949a6a-8230-4c9d-b642-ce82e17fede7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def quad_pi_np(nr_steps):\n",
+    "    delta = 2.0/nr_steps\n",
+    "    x = np.linspace(-1.0, 1.0, nr_steps)\n",
+    "    return 2.0*np.sqrt(1.0 - x**2).sum()*delta"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "id": "2492cd21-68d8-4b9a-9a0b-656cfb2c9e2d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "328 ms ± 34.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit quad_pi_jit(100_000_000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 51,
+   "id": "2acb30d2-92cf-4082-8572-675b7694747b",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "202 ms ± 1.19 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit quad_pi_par(100_000_000)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 52,
+   "id": "be932fa1-f46a-4e8b-a301-384adf777364",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "676 ms ± 43.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit quad_pi_np(100_000_000)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a402e83f-ccb7-4518-b8de-b14e499f994a",
+   "metadata": {},
+   "source": [
+    "The parallelized version is faster (202 ms versus 328 ms for the serial JIT version and 676 ms for pure numpy), but the parallel efficiency is far from great."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
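
As a follow-up to the notebook's quadrature example, here is a minimal sketch of how the loop could be parallelized explicitly with numba's `prange`, accumulating the sum as a scalar reduction instead of filling an intermediate array. The function name `quad_pi_prange` and the use of the midpoint rule are assumptions made for illustration and are not part of this commit. The `try`/`except` fallback exists only so that the sketch also runs (serially, in pure Python) where numba is not installed.

```python
import math

try:
    from numba import njit, prange
except ImportError:
    # Assumption for illustration: fall back to plain Python when numba
    # is missing, so the sketch still runs (serially and slowly).
    prange = range

    def njit(*args, **kwargs):
        # Support both the bare @njit and the @njit(parallel=True) forms.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def quad_pi_prange(nr_steps):
    # Midpoint rule for the integral of sqrt(1 - x^2) over [-1, 1],
    # which equals pi/2.
    delta = 2.0 / nr_steps
    total = 0.0
    # prange marks the loop as parallel; the += on a scalar is treated
    # by numba as a reduction across threads.
    for i in prange(nr_steps):
        x = -1.0 + (i + 0.5) * delta
        total += math.sqrt(1.0 - x * x)
    return 2.0 * total * delta
```

Running `%timeit quad_pi_prange(100_000_000)` next to `quad_pi_par` would show whether the explicit `prange` loop improves the parallel efficiency on a given machine; no timings are claimed here.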
