Commit df554de
Add OpenMP compile/link flags to setup.py for source builds

Source builds of torchvision do not pass -fopenmp (compile) or
-lomp/-lgomp (link) flags when building the _C extension. Since
at::parallel_for is a header-only template whose #pragma omp directives
are compiled into the calling translation unit (_C.so), the missing
flags cause it to silently fall back to sequential execution.

This has had no observable effect so far because no existing torchvision
C++ kernel directly uses at::parallel_for or #pragma omp. However,
upcoming changes (e.g. pytorch#9442) introduce at::parallel_for, and without
these flags source builds would gain no speedup from parallelization.

- macOS: -Xpreprocessor -fopenmp (compile) + -lomp from PyTorch's
bundled libomp (link)
- Linux: -fopenmp (compile) + -lgomp (link)
- Windows: unchanged (uses /openmp via MSVC, already handled separately)
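
The per-platform choices listed above could be sketched roughly as follows. This is an illustrative sketch only, not the code added by this commit: the helper name `openmp_flags` and its `torch_lib_dir` parameter are hypothetical, and the `-L` entry assumes that PyTorch's bundled libomp sits in a directory the caller supplies.

```python
import sys


def openmp_flags(platform=sys.platform, torch_lib_dir=None):
    """Return (compile_args, link_args) that enable OpenMP on this platform.

    Hypothetical helper for illustration; names and layout are assumptions.
    """
    if platform == "darwin":
        # Apple clang only accepts -fopenmp when routed through -Xpreprocessor.
        compile_args = ["-Xpreprocessor", "-fopenmp"]
        link_args = ["-lomp"]
        if torch_lib_dir is not None:
            # Assumption: the caller passes the directory holding
            # PyTorch's bundled libomp so the linker can find it.
            link_args = [f"-L{torch_lib_dir}"] + link_args
        return compile_args, link_args
    if platform.startswith("linux"):
        # GCC toolchain: -fopenmp at compile time, libgomp at link time.
        return ["-fopenmp"], ["-lgomp"]
    # Windows (MSVC /openmp) and other platforms: add nothing here.
    return [], []


if __name__ == "__main__":
    compile_args, link_args = openmp_flags()
    print("compile:", compile_args)
    print("link:", link_args)
```

In a setup.py, the two returned lists would typically be appended to the extension's existing compile and link arguments before the extension object is created; where exactly that happens in torchvision's setup.py is not shown here.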
Fixes pytorch#2783
Signed-off-by: Yonghye Kwon <developer.0hye@gmail.com>
1 parent 8a5946e commit df554de
1 file changed
Lines changed: 16 additions & 0 deletions
| Original file lines | Diff lines | Change |
|---|---|---|
| 131-133 | 131-133 | unchanged |
| | 134-139 | + (6 lines added) |
| 134-136 | 140-142 | unchanged |
| 182-184 | 188-190 | unchanged |
| | 191-199 | + (9 lines added) |
| 185-190 | 200-205 | unchanged |
| | 206 | + (1 line added) |
| 191-193 | 207-209 | unchanged |