Gsoc parallel rparamscale#7236

Draft

krcoder123 wants to merge 4 commits intoOSGeo:mainfrom

krcoder123:gsoc-parallel-rparamscale

krcoder123 commented Mar 28, 2026 •

edited

Loading

This draft PR demonstrates a proof of concept parallelization of r.param.scale using OpenMP, submitted as part of my GSoC 2026 proposal for "Parallelization of existing tools."

Benchmark Results

100M cell raster, Apple M4, 8 cores — 3 runs each:

Method	Wall Time	CPU Usage
Serial (original)	~7.2s	93% (1 core)
Parallel (this PR)	~4.2s	250%+ (2-3 cores)

1.7x wall-time speedup on a 100M cell raster.

Technical Approach

The core blocker for parallelizing r.param.scale is the sequential sliding buffer in process(). The original implementation shuffles rows down after each row. Each row depends on the state left by the previous row, making direct row-level parallelism impossible without reading each row's neighborhood from disk multiple times, which would be slower than serial. The solution is the same RAM preload pattern used in the r.proj parallelization: load the entire input raster into a flat 2D array before the parallel region. With the full map in RAM, each output row's neighborhood window can be accessed independently. The outer row loop is parallelized with #pragma omp parallel for schedule(dynamic), with per-thread t_window and t_obs buffers to avoid write contention.

Known Limitations

OpenMP detection uses macOS-specific flags (-Xclang -fopenmp). Full implementation will wire HAVE_OPENMP into the configure system for Linux/Windows portability.
No user-controlled memory threshold yet. Full implementation will add a memory parameter consistent with other GRASS modules.
Output writes happen serially after the parallel region using indexed row buffers.

This PR is not intended for merge. It demonstrates that the surface parameter computation is parallelizable and the performance gain is real.

Kaushik Raja and others added 4 commits

March 15, 2026 23:21


          POC: parallelize r.proj via RAM-resident buffer, 2.5x speedup on 8-co…

1afd473

…re Apple M-series


          Merge branch 'main' into gsoc-parallel-rproj

5b5e270


          r.proj: use memory option to limit RAM buffer per community feedback

b75a6eb


          POC: r.param.scale OpenMP parallelization via RAM preload - 1.7x spee…

60e606f

…dup on 100M cell raster

github-actions bot added raster C module labels

github-actions bot reviewed

View reviewed changes

raster/r.param.scale/process.c

+                  ncols = Rast_window_cols();
+                  if ((region.ew_res / region.ns_res >= 1.01) ||
+                      (region.ns_res / region.ew_res >= 1.01)) {
+                      G_warning(_("E-W and N-S grid resolutions are different. Taking average."));

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                    G_warning(_("E-W and N-S grid resolutions are different. Taking average."));
          
                    G_warning(
          
                        _("E-W and N-S grid resolutions are different. Taking average."));

raster/r.param.scale/process.c

-                       */
-                  }
+                  else
+                      G_ludcmp(normal_ptr, 6, index_ptr,&temp);

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                    G_ludcmp(normal_ptr, 6, index_ptr,&temp);
          
                    G_ludcmp(normal_ptr, 6, index_ptr, &temp);

raster/r.param.scale/process.c

Comment on lines 80 to +84

    
                  /*-----------------------------------------------------------------------*/

                  /*          PROCESS INPUT RASTER AND WRITE OUT RASTER LINE BY LINE       */

                  /* Parallel row loop. Each thread gets its own local computation         */

                  /* buffers (t_window, t_obs) to avoid write contention.                  */

                  /*-----------------------------------------------------------------------*/

                  if (mparam != FEATURE)

                      for (wind_row = 0; wind_row < EDGE; wind_row++)

                          Rast_put_row(fd_out, row_out,

                                       DCELL_TYPE); /* Write out the edge cells as NULL.    */

                  else

                      for (wind_row = 0; wind_row < EDGE; wind_row++)

                          Rast_put_row(fd_out, featrow_out,

                                       CELL_TYPE); /* Write out the edge cells as NULL.    */

                  for (wind_row = 0; wind_row < wsize - 1; wind_row++)

                      Rast_get_row(fd_in, row_in + (wind_row * ncols), wind_row, DCELL_TYPE);

                  /* Read in enough of the first rows to  */

                  /* allow window to be examined.         */

                  #pragma omp parallel for schedule(dynamic) private(row, col, wind_row)

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                /*-----------------------------------------------------------------------*/
          
                /*          PROCESS INPUT RASTER AND WRITE OUT RASTER LINE BY LINE       */
          
                /* Parallel row loop. Each thread gets its own local computation         */
          
                /* buffers (t_window, t_obs) to avoid write contention.                  */
          
                /*-----------------------------------------------------------------------*/
          
                if (mparam != FEATURE)
          
                    for (wind_row = 0; wind_row < EDGE; wind_row++)
          
                        Rast_put_row(fd_out, row_out,
          
                                     DCELL_TYPE); /* Write out the edge cells as NULL.    */
          
                else
          
                    for (wind_row = 0; wind_row < EDGE; wind_row++)
          
                        Rast_put_row(fd_out, featrow_out,
          
                                     CELL_TYPE); /* Write out the edge cells as NULL.    */
          
                for (wind_row = 0; wind_row < wsize - 1; wind_row++)
          
                    Rast_get_row(fd_in, row_in + (wind_row * ncols), wind_row, DCELL_TYPE);
          
                /* Read in enough of the first rows to  */
          
                /* allow window to be examined.         */
          
                #pragma omp parallel for schedule(dynamic) private(row, col, wind_row)
          
            /*-----------------------------------------------------------------------*/
          
            /* Parallel row loop. Each thread gets its own local computation         */
          
            /* buffers (t_window, t_obs) to avoid write contention.                  */
          
            /*-----------------------------------------------------------------------*/
          
            #pragma omp parallel for schedule(dynamic) private(row, col, wind_row)

raster/r.param.scale/process.c

Comment on lines +111 to +112

		(size_t)(row + wind_row - EDGE) * ncols +
		col + wind_col - EDGE;

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                                    (size_t)(row + wind_row - EDGE) * ncols +
          
                                    col + wind_col - EDGE;
          
                                              (size_t)(row + wind_row - EDGE) * ncols +
          
                                              col + wind_col - EDGE;

raster/r.param.scale/process.c

-                                          Rast_set_d_null_value(row_out + col, 1);
-                                      }
-                                      found_null = TRUE;
+                                          Rast_set_c_null_value((CELL *)row_buffers[row] + col, 1);

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                                        Rast_set_c_null_value((CELL *)row_buffers[row] + col, 1);
          
                                        Rast_set_c_null_value(
          
                                            (CELL *)row_buffers[row] + col, 1);

raster/r.proj/main.c

-                              /* row index in input matrix    */
-                              double row_idx = (incellhd.north - ycoord1) / incellhd.ns_res;
+                          if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) < 0) {

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                        if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) < 0) {
          
                        if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) <
          
                            0) {

raster/r.proj/main.c

-                              double row_idx = (incellhd.north - ycoord1) / incellhd.ns_res;
+                          if (GPJ_transform(&oproj, &iproj, &tproj, PJ_FWD, &x1, &y1, NULL) < 0) {
+                              Rast_set_null_value(obufptr, 1, cell_type);
+                          } else {

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

raster/r.proj/main.c

-                              interpolate(ibuffer, obufptr, cell_type, col_idx, row_idx,
-                                          &incellhd);
+                              /* CALL OUR LOCK-FREE RAM INTERPOLATOR */
+                              interpolate_ram(full_map_array, obufptr, cell_type, c_idx, r_idx, &incellhd);

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                            interpolate_ram(full_map_array, obufptr, cell_type, c_idx, r_idx, &incellhd);
          
                            interpolate_ram(full_map_array, obufptr, cell_type, c_idx,
          
                                            r_idx, &incellhd);

raster/r.proj/main.c

-                      xcoord2 = outcellhd.west + (outcellhd.ew_res / 2);
-                      ycoord2 -= outcellhd.ns_res;
+                      #pragma omp critical

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

      
                    #pragma omp critical
          
            #pragma omp critical

raster/r.proj/main.c


		return buf;
		}

Contributor

github-actions bot Mar 29, 2026

[pre-commit] _{reported by reviewdog 🐶}

Suggested change

krcoder123 mentioned this pull request

[GSoC 2026 Draft POC] r.proj: OpenMP parallelization via RAM-resident buffer #7185

Draft

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C module raster