[GSoC 2026 Draft POC] r.proj: OpenMP parallelization via RAM-resident buffer #7185
krcoder123 wants to merge 3 commits into OSGeo:main
Conversation
```c
static char *make_ipol_list(void);
static char *make_ipol_desc(void);
```

[pre-commit] reported by reviewdog 🐶 (formatting suggestion for the new `interpolate_ram` declaration):

```c
void interpolate_ram(void *full_map, void *obufptr, int cell_type,
                     double col_idx, double row_idx, struct Cell_head *incellhd)
```
```c
    /* Direct memory access - Thread Safe for Reads */
    unsigned char *src = (unsigned char *)full_map +
    memcpy(obufptr, src, cell_size);
}
```

[pre-commit] reported by reviewdog 🐶 (line-wrapping suggestion for the `src` pointer initialization)
```c
G_important_message(_("Projecting..."));
for (row = 0; row < outcellhd.rows; row++) {
    /* obufptr = obuffer */;

    /* row index in input matrix */
    double row_idx = (incellhd.north - ycoord1) / incellhd.ns_res;
    if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) < 0) {
```

[pre-commit] reported by reviewdog 🐶 (suggested wrapping of the `GPJ_transform` condition):

```c
    if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) <
        0) {
```
```c
        Rast_set_null_value(obufptr, 1, cell_type);
    } else {
        /* replaces: interpolate(ibuffer, obufptr, cell_type, col_idx,
                                 row_idx, &incellhd); */
        /* CALL OUR LOCK-FREE RAM INTERPOLATOR */
        interpolate_ram(full_map_array, obufptr, cell_type, c_idx, r_idx, &incellhd);
    }
```

[pre-commit] reported by reviewdog 🐶 (style suggestions: put `else` on its own line, wrap the `interpolate_ram` call):

```c
    }
    else {
        interpolate_ram(full_map_array, obufptr, cell_type, c_idx,
                        r_idx, &incellhd);
```
```c
    xcoord2 = outcellhd.west + (outcellhd.ew_res / 2);
    ycoord2 -= outcellhd.ns_res;
#pragma omp critical
```

[pre-commit] reported by reviewdog 🐶 (formatting suggestion for the `#pragma omp critical` line)
```c
    return buf;
}
```

[pre-commit] reported by reviewdog 🐶
Thanks! Reading the whole raster into memory goes against how GRASS handles processing, so while I understand this is a proof of concept, I would like to see a plan for how to handle that.
Hi Anna, thank you for the feedback. You're right that loading the full raster into RAM conflicts with GRASS's tile-based memory model. This POC was mainly a first step to prove the projection math is parallelizable before tackling the harder cache problem. For the full GSoC proposal, my plan is a two-path approach: if the map fits within the user's memory limit, use the fast lock-free RAM buffer; if it doesn't, fall back to thread-local tile caches, where each thread maintains its own independent cache, removing lock contention while preserving GRASS's ability to process maps larger than RAM. Does that direction align with what you'd want to see in the proposal?
Yes, just more detailed analysis.
Hi Anna, I've completed a full draft of my proposal using your feedback on the memory model. The two-path architecture is documented in detail here: https://docs.google.com/document/d/1IUOA93u2iqIp0Qix-ymChkvmn3wKTUgFkOBhWCMudrA/edit?tab=t.0 Thanks!
@HuidaeCho would you mind having a look at the proposal?

Sure, reviewing it.
@krcoder123, see my suggestions for minor edits and comments in the proposal.
Related, maybe helpful: the GDAL warper has a NUM_THREADS option.
Thank you for the detailed review. I've updated the proposal to address your feedback. I clarified the motivation for the two-path design, added the user-controlled memory threshold with megabyte and percentage options, and added benchmarking both paths as an early task in the timeline. I also removed all the "GIS" references and fixed the grammar suggestions. Please let me know if anything else needs addressing.
Thank you for the pointer, I found this pretty helpful. Seeing how GDAL's NUM_THREADS handles thread-safe data access through independent chunks showed that the horizontal row-strip approach I'm going with for Path B is the right direction. The tricky part in GRASS is that `readcell` was never designed for this, so Path B needs each thread to have its own cache instance instead of sharing one. I've added the GDAL warper as a reference in the proposal.
Hi @HuidaeCho, I've updated the proposal and expanded the scope beyond just r.proj. After digging into the r.fill.stats source code, I found that row-level parallelism is blocked by a sliding ring buffer that has to advance sequentially. However, the column loop inside interpolate_row() is completely independent per cell, which makes it a clean parallelization target. r.fill.stats is now confirmed as the secondary module. I'll pick an additional module during the bonding period after auditing the remaining candidates. Please let me know if there are any changes I should make before the deadline. Thank you! https://docs.google.com/document/d/1IUOA93u2iqIp0Qix-ymChkvmn3wKTUgFkOBhWCMudrA/edit?tab=t.0
Hi @HuidaeCho, I've updated the proposal significantly since your last review. After further source audits I found that the previously proposed secondary module had sequential dependencies that would limit speedup, so I switched to r.param.scale instead. I audited process.c, identified the sequential sliding buffer as the core blocker, and applied the same RAM-preload pattern from r.proj. Draft PR #7236 shows a 1.7x speedup on a 100M-cell raster. r.geomorphon is planned as the third module, with a source audit during the bonding period. Would you be able to take a quick look at the updated proposal before the deadline? Thanks,
This draft PR demonstrates a proof-of-concept parallelization of the r.proj projection loop using OpenMP, submitted as part of my GSoC 2026 proposal for "Parallelization of existing tools."

Benchmark Results

UTM Zone 16 → Web Mercator, 100M cells, Apple M4, 8 cores — 3 runs each:

2.5x wall-time speedup across two independent projection pipelines.

Technical Approach
The core blocker for parallelizing r.proj is that the legacy `readcell` tile cache is not thread-safe. Rather than wrapping it in mutexes (which I prototyped first and confirmed serializes all threads with no speedup), this POC bypasses the cache by loading the input map into a flat RAM array before the parallel loop. Reads from a flat array are naturally thread-safe, eliminating all lock contention.
Known Limitations

- The macOS build currently needs manual OpenMP flags (`-Xclang -fopenmp`). Full implementation will wire `HAVE_OPENMP` into the configure system for Linux/Windows portability.
- `Rast_put_row` requires a `#pragma omp critical` for sequential writes. Final implementation will use indexed row buffers.
- The preload step reads the input with `Rast_get_row` directly.
- `GPJ_transform` shares a single `tproj` context across threads. Per-thread `PJ_CONTEXT` initialization is the next step.

This PR is not intended for merge. I made this PR solely to demonstrate that the projection math is parallelizable and the performance gain is real.