Commit e9220f0

Update 3d7pt sample output in README
1 parent d402f77 commit e9220f0

1 file changed

Lines changed: 59 additions & 27 deletions

README.md

@@ -143,12 +143,32 @@ and run on shared-memory parallel machines like general-purpose multicores.
143143
`libpluto.{so,a}` is also built and can be found in `src/.libs/`. `make install`
144144
will install it.
145145

146+
## Using Pluto
147+
148+
- Use `#pragma scop` and `#pragma endscop` around the section of code
149+
you want to parallelize/optimize.
150+
151+
- Then, run:
152+
153+
./polycc <C source file> [--pet]
154+
155+
The output file will be named <original prefix>.pluto.c unless '-o
156+
<filename>' is supplied. When --debug is used, the .cloog file used to
157+
generate code is not deleted and is named similarly. The pet frontend
158+
`--pet` is needed to process many of the test cases/examples.
159+
160+
Please refer to the documentation of Clan or PET for information on the
161+
kind of code around which one can put `#pragma scop` and `#pragma
162+
endscop`. Most of the time, although your program may not satisfy the
163+
constraints, it may be possible to work around them.
164+
165+
146166
## Trying a new example
147167

148168
- Use `#pragma scop` and `#pragma endscop` around the section of code
149169
you want to parallelize/optimize.
150170

151-
- Then, just run `./polycc <C source file>`.
171+
- Then, just run `./polycc <C source file> --pet`.
152172

153173
The transformation is also printed out, and `test.par.c` will have the
154174
parallelized code. If you want to see intermediate files, like the
@@ -177,25 +197,6 @@ where target can be orig, orig_par, opt, tiled, par, pipepar, etc. (see
177197
- `make check-pluto` to test for correctness, `make perf` to compare
178198
performance.
179199

180-
181-
## Using Pluto
182-
183-
- Use `#pragma scop` and '#pragma endscop' around the section of code
184-
you want to parallelize/optimize.
185-
186-
- Then, run
187-
188-
./polycc <C source file> --parallel --tile
189-
190-
The output file will be named <original prefix>.pluto.c unless '-o
191-
<filename>" is supplied. When --debug is used, the .cloog used to
192-
generate code is not deleted and is named similarly.
193-
194-
Please refer to the documentation of Clan or PET for information on the
195-
kind of code around which one can put `#pragma scop` and `#pragma
196-
endscop`. Most of the time, although your program may not satisfy the
197-
constraints, it may be possible to work around them.
198-
199200
## Command-line options
200201

201202
```shell
@@ -232,19 +233,50 @@ loops in that order. For eg., for heat-3d, you'll see this output when
232233
you run Pluto
233234

234235
```shell
235-
../../polycc 3d7pt.c
236-
237-
[...]
238-
236+
# With default tile sizes.
237+
../../polycc test/3d7pt.c --pet
238+
239+
[pluto] compute_deps (isl)
240+
[pluto] Number of statements: 1
241+
[pluto] Total number of loops: 4
242+
[pluto] Number of deps: 15
243+
[pluto] Maximum domain dimensionality: 4
244+
[pluto] Number of parameters: 0
245+
[pluto] Concurrent start hyperplanes found
239246
[pluto] Affine transformations [<iter coeff's> <param> <const>]
240247
241-
T(S1): (t, t+i, t+j, t+k)
248+
T(S1): (t-i, t+i, t+j, t+k)
242249
loop types (loop, loop, loop, loop)
243250
244-
[...]
251+
[Pluto] After tiling:
252+
T(S1): ((t-i)/32, (t+i)/32, (t+j)/32, (t+k)/32, t-i, t+i, t+j, t+k)
253+
loop types (loop, loop, loop, loop, loop, loop, loop, loop)
254+
255+
[Pluto] After intra_tile reschedule
256+
T(S1): ((t-i)/32, (t+i)/32, (t+j)/32, (t+k)/32, t, t+i, t+j, t+k)
257+
loop types (loop, loop, loop, loop, loop, loop, loop, loop)
258+
259+
[Pluto] After tile scheduling:
260+
T(S1): ((t-i)/32+(t+i)/32, (t+i)/32, (t+j)/32, (t+k)/32, t, t+i, t+j, t+k)
261+
loop types (loop, loop, loop, loop, loop, loop, loop, loop)
262+
263+
[pluto] using statement-wise -fs/-ls options: S1(5,8),
264+
[pluto-unroll-jam] No unroll jam loop candidates found
265+
[Pluto] Output written to 3d7pt.pluto.c
266+
267+
[pluto] Timing statistics
268+
[pluto] SCoP extraction + dependence analysis time: 0.087957s
269+
[pluto] Auto-transformation time: 0.011928s
270+
[pluto] Tile size selection time: 0.000000s
271+
[pluto] Total constraint solving time (LP/MIP/ILP) time: 0.002028s
272+
[pluto] Code generation time: 0.049415s
273+
[pluto] Other/Misc time: 0.310162s
274+
[pluto] Total time: 0.459462s
275+
[pluto] All times: 0.087957 0.011928 0.049415 0.310162
245276
```
246277
247-
Hence, the tile sizes specified correspond to t, t+i, t+j, and t+k.
278+
The tile sizes specified correspond to t-i, t+i, t+j, and t+k. Notice the
279+
multi-dimensional affine transformation function before tiling and after tiling.
248280
249281
250282
### Setting good tile sizes
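
For readers following the "Using Pluto" steps added in this diff, here is a minimal sketch of what a source file with the two pragmas might look like. The file name `smooth.c`, the function `smooth`, and the size `N` are hypothetical and do not come from the Pluto repository. The loop has affine bounds and array accesses, which is the usual shape of region these pragmas are placed around. The Clan and PET documentation remain the authority on what is accepted.

```c
/* Hypothetical example (e.g. saved as smooth.c); names are illustrative only. */
#define N 8

/* Three-point average over a 1-D array; boundary elements are left untouched. */
void smooth(double out[N], double in[N]) {
  int i;
#pragma scop
  for (i = 1; i < N - 1; i++) {
    out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
  }
#pragma endscop
}
```

Going by the README text in this commit, running `./polycc smooth.c --pet` on such a file should write the transformed code to `smooth.pluto.c`. A regular C compiler ignores the pragmas (possibly with an unknown-pragma warning), so the original file still builds unchanged.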
