r.watershed: Can now optionally use already existing maps to optimize performance by sumitchintanwar · Pull Request #6953 · OSGeo/grass

sumitchintanwar · 2026-01-25T00:04:19Z

Overview
This PR introduces a "reuse" feature to r.watershed, allowing users to skip the computationally expensive flow accumulation and drainage direction steps if these maps already exist. This is particularly useful for:

Iterative testing of different basin threshold values.
Batch processing where the underlying elevation/flow model remains constant.

Key Changes

Core Logic: Modified main.c and RAM/SEG modules to accept existing accumulation and drainage maps as inputs.
Documentation: Added raster/r.watershed/REUSE_FEATURE.md covering workflows, limitations, and "Do's and Don'ts".
Benchmarks: Added raster/r.watershed/testsuite/BENCHMARKS.md using hyperfine.
Testing: Added new tests in testsuite/ to verify correctness.

Performance
Benchmarks performed using hyperfine on the NC SPM dataset show significant speedups when iterating on thresholds.

Repeated hyperfine benchmark runs show similarly significant improvements.

Limitations

Input Compatibility: As noted in the documentation, input maps must be generated by r.watershed. Maps from r.stream.* or other tools may have different drainage conventions and are not supported (explicitly warned in docs).
Basin Delineation Only: The reuse mode focuses on recalculating basins/streams based on thresholds; it does not re-compute flow physics.

Addressed Feedback

Replaced time with hyperfine for statistically significant benchmarking.
Added safety checks for flags (G_option_collective).
Added comprehensive documentation and tests.

Closes #6720

echoix · 2026-01-25T03:04:34Z

Did you ever try using hyperfine for actually producing the stats, doing enough runs for the result to be statistically significant, handling outliers (often caused by other programs running), and removing the shell startup time?
With the timings you showed, using time may have its limits, since the runs weren't really long. The "~4 times" is on user-mode CPU time only, and I'm a little sceptical that it isn't just variation between runs.

But any improvement is an improvement!
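
To illustrate the concern about run-to-run variation, here is a minimal Python sketch (not part of the PR) that compares two sets of timings against their spread. The timing values, the function names, and the factor `k` are all invented for illustration; hyperfine reports comparable statistics (mean and standard deviation) on its own.

```python
import statistics


def summarize(times):
    """Return (mean, sample standard deviation) of timings in seconds."""
    return statistics.mean(times), statistics.stdev(times)


def speedup_exceeds_noise(baseline, candidate, k=2.0):
    """Crude check: is the gap between the means larger than
    k times the combined spread of both samples?"""
    base_mean, base_sd = summarize(baseline)
    cand_mean, cand_sd = summarize(candidate)
    return (base_mean - cand_mean) > k * (base_sd + cand_sd)


# Hypothetical timings in seconds, invented purely for illustration.
full_run = [2.10, 2.05, 2.20, 2.15, 2.08]
reuse_run = [0.60, 0.58, 0.65, 0.61, 0.59]
noisy_run = [2.00, 2.30, 1.90, 2.25, 2.05]

print(speedup_exceeds_noise(full_run, reuse_run))  # clear gap between the means
print(speedup_exceeds_noise(full_run, noisy_run))  # gap is lost in the spread
```

A single `time` measurement gives no spread at all, which is why a handful of repeated runs is the minimum needed before claiming a speedup.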

petrasovaa · 2026-01-25T03:35:57Z

Thanks! Please check our contributing guidelines for code style (use pre-commit).

Please add a test. Include a quick test (on a smaller part of the "elevation" map and maybe on an artificial surface) and perhaps one that runs longer with a larger area; deactivate the latter so it does not slow down the CI, but I can then at least run it locally.

I am not an expert on this tool, but one of my concerns is that if a user tries to pass a flow accumulation/drainage raster computed with a different tool (e.g. r.stream.* tools can handle that), the algorithm may fail (incorrect results, segfaults). This can of course be discouraged in the documentation, but maybe there are other ways, e.g. detect the drainage conventions and check that they match.
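
One possible shape for such a check, as a rough Python sketch rather than anything taken from the PR: r.watershed documents its drainage output as integer codes whose absolute values run from 1 to 8, with 0 for depressions and negative values for flow leaving the region. The function name and the use of `None` for NULL cells are assumptions made for this example.

```python
def looks_like_r_watershed_drainage(cells):
    """Heuristic range check: every non-null cell must be an integer
    in [-8, 8]. Passing does NOT prove the map came from r.watershed;
    it only rejects encodings that are obviously different."""
    for value in cells:
        if value is None:  # None stands in for a NULL cell here
            continue
        if value != int(value) or not -8 <= value <= 8:
            return False
    return True


# Toy inputs invented for illustration.
print(looks_like_r_watershed_drainage([1, -3, 0, 8, None]))  # within range
print(looks_like_r_watershed_drainage([1, 16, 128]))         # out of range
```

A range check like this only catches gross mismatches; a map that uses the same value range with a different orientation would still pass, so the documentation warning remains necessary.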

raster/r.watershed/front/main.c

raster/r.watershed/ram/init_vars.c

sumitchintanwar · 2026-01-25T07:15:57Z

I haven't tried hyperfine. Thank you for the suggestion; I will try it for the tests. I think the actual speedup will be more modest once measured correctly with hyperfine. But hey, it's still a worthwhile improvement.

sumitchintanwar · 2026-01-25T07:19:47Z

@petrasovaa, Thanks for the feedback and the code review. I'll check the guidelines and get back to you with proper tests.

sumitchintanwar · 2026-01-27T12:03:47Z

Hey @petrasovaa, @echoix, I have made the requested changes and also added benchmarks using hyperfine along with proper documentation, tests and constraints of this feature. Ready for review when you have a chance!

sumitchintanwar · 2026-01-27T12:07:45Z

On a side note, I have to admit, I'm still climbing the GRASS learning curve a bit, but I'm really enjoying the challenge! It's super interesting digging into how this all works. Let me know what you think of the updates!

sumitchintanwar · 2026-01-28T02:14:21Z

Some of the checks are failing because of hyperfine not found. Is there anything I should do to resolve this specifically?

"raster/r.watershed/testsuite/benchmark_reuse.sh: line 15: hyperfine: command not found"
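
One common way to avoid this kind of failure is to check for the tool before invoking it and skip cleanly when it is absent. The sketch below shows that guard in Python; the actual script in the PR is a shell script, and the function name and messages here are assumptions for illustration only.

```python
import shutil
import subprocess
import sys


def run_benchmark(command, tool="hyperfine"):
    """Run `tool command` if the tool is on PATH.
    Return the tool's exit code, or None when the benchmark is skipped."""
    if shutil.which(tool) is None:
        print(f"{tool} not found, skipping benchmark", file=sys.stderr)
        return None
    # hyperfine takes the command to benchmark as a single string argument.
    return subprocess.run([tool, command], check=False).returncode
```

Returning `None` instead of raising keeps a missing optional tool from being reported as a test failure.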

petrasovaa · 2026-01-28T04:53:38Z

Given the significant changes in the code, I think we need to step back and write proper tests of the current code in a separate PR. I don't think the existing tests are comprehensive enough. Once those tests are merged, we can verify this PR is not breaking anything. Limit the region so that the tests run fast, but include a test with a larger area that we can verify locally but skip in the CI.
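
The "fast test in CI, slow test only locally" pattern can be sketched with the standard `unittest` module. Everything below is hypothetical: the environment variable name is not a GRASS convention, and `flow_accumulation_1d` is a stand-in for the real computation so that the example runs without GRASS.

```python
import os
import unittest

# Hypothetical opt-in switch; the variable name is an assumption.
RUN_SLOW = os.environ.get("RUN_SLOW_WATERSHED_TESTS") == "1"


def flow_accumulation_1d(n_cells):
    """Stand-in computation: on a 1-D slope, cell i collects
    its own cell plus every cell upstream of it."""
    return [i + 1 for i in range(n_cells)]


class TestSmallRegion(unittest.TestCase):
    def test_small(self):
        # Always runs: small input, finishes instantly.
        self.assertEqual(flow_accumulation_1d(4), [1, 2, 3, 4])


class TestLargeRegion(unittest.TestCase):
    @unittest.skipUnless(RUN_SLOW, "set RUN_SLOW_WATERSHED_TESTS=1 to run")
    def test_large(self):
        # Skipped by default so CI stays fast; enabled locally on demand.
        self.assertEqual(flow_accumulation_1d(1_000_000)[-1], 1_000_000)
```

Run with `python -m unittest` for the default (fast) set, or set the variable first to include the large case.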

sumitchintanwar · 2026-01-28T05:16:00Z

Okay, I understand, @petrasovaa. I will make a separate PR with more comprehensive tests. I would like some clarity on which additional tests I should add.

I have skipped basin delineation and TCI/SPI calculation, since those would require significant algorithm changes.
I'm thinking we'll do that after the current feature is approved.

sumitchintanwar · 2026-01-28T22:58:23Z

The tests for the reuse functionality have been added in PR #6992. I'd appreciate a review.

echoix · 2026-01-29T18:15:36Z

raster/r.watershed/testsuite/benchmark_reuse.sh

I don’t think benchmarking is appropriate to run in tests, even less in CI, where the actual hardware is variable and we won’t even do anything with the results. It’s nice to have as a reference; maybe place it in the PR, or ask whether it should be left as a separate file not picked up by the tests.

sumitchintanwar · 2026-01-30

Thanks for the review. I have skipped it in CI. Should I remove it from the PR entirely? The results are in the PR description; I just added it as something to test locally.

petrasovaa · 2026-01-29T22:18:08Z

That's a misunderstanding... Please reread my comment. To move on with this PR, we need to have tests in place that would catch any regressions of the existing functionality caused by this PR. This PR may break existing functionality, and we need to be able to catch that. Existing tests are not comprehensive enough.

"I have skipped basin delineation and TCI/SPI calculation, since those would require significant algorithm changes. I'm thinking we'll do that after the current feature is approved."

Could you explain this a little bit more?

sumitchintanwar · 2026-01-30T03:38:15Z

I am sorry for the confusion; I misunderstood. I'll add tests for the current code in a different PR to ensure that the existing functionality isn't broken.

sumitchintanwar · 2026-01-30T04:13:19Z

@petrasovaa
Basically, basin outputs depend on the threshold: different thresholds create different basin boundaries from the same flow data. This threshold-based processing is not yet implemented in reuse mode and can be planned for later.

TCI/SPI require slope computation from the elevation model, which is skipped when reusing flow maps.

To keep this change focused, reuse mode is currently limited to outputs that can be derived directly from reused flow maps.
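
The threshold dependence can be seen on a toy grid. The sketch below is an illustration only, not the module's algorithm: it marks cells whose accumulation reaches a threshold, which is the starting point from which streams (and the basins attached to them) are derived. Taking the absolute value is an assumption based on r.watershed's documented use of negative accumulation for cells that may receive flow from outside the region; the grid values are invented.

```python
def stream_cells(accumulation, threshold):
    """Mark cells whose absolute accumulation reaches the threshold."""
    return [[abs(v) >= threshold for v in row] for row in accumulation]


# Toy accumulation grid invented for illustration.
acc = [
    [1, 1, 1],
    [2, 5, 1],
    [1, 9, -12],
]

for threshold in (5, 10):
    mask = stream_cells(acc, threshold)
    count = sum(cell for row in mask for cell in row)
    print(threshold, count)  # the same flow data yields fewer stream cells at 10
```

The accumulation grid never changes between the two passes; only the threshold does, which is why reusing it across threshold experiments is attractive.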

sumitchintanwar · 2026-02-02T12:27:36Z

@petrasovaa The regression tests are added in PR #7029. I’d really appreciate a quick review.

reusing accumulation and drainage inputs for skipping unnecessary calculation

eb8b2f3

sumitchintanwar mentioned this pull request Jan 25, 2026

r.watershed: optionally use already existing maps of flow-accumulation or drainage direction #6720

Open

github-actions bot added the raster, C, and module labels Jan 25, 2026

petrasovaa linked an issue Jan 25, 2026 that may be closed by this pull request

r.watershed: optionally use already existing maps of flow-accumulation or drainage direction #6720

Open

petrasovaa reviewed Jan 25, 2026

raster/r.watershed/front/main.c (outdated, resolved)

petrasovaa reviewed Jan 25, 2026

raster/r.watershed/ram/init_vars.c (resolved)

github-actions bot reviewed Jan 25, 2026

raster/r.watershed/ram/init_vars.c (outdated, resolved)

sumitchintanwar added 4 commits January 27, 2026 05:33

wip : Fixing irregularities found while adding tests

34d5d2e

core: clarify that basin delineation is not supported

09e2795

tests: add tests for r.watershed for testing reusing maps feat.

1460adc

benchmarks: add performance benchmark for r.watershed map reuse

1449daf

sumitchintanwar requested a review from petrasovaa January 27, 2026 12:08

github-actions bot added the Python, HTML, docs, markdown, and tests labels Jan 27, 2026

sumitchintanwar force-pushed the feat/r-watershed-reuse-flow branch from e2cc694 to 1449daf Compare January 28, 2026 04:14

Merge branch 'main' into feat/r-watershed-reuse-flow

5f9851d

sumitchintanwar mentioned this pull request Jan 28, 2026

r.watershed: Add comprehensive tests for map reuse functionality #6992

Open

r.watershed: skip benchmark in CI if hyperfine is missing

7955667

echoix reviewed Jan 29, 2026

sumitchintanwar mentioned this pull request Feb 2, 2026

r.watershed: Add comprehensive regression tests #7029

Open

sumitchintanwar requested a review from echoix February 2, 2026 12:27

Merge branch 'main' into feat/r-watershed-reuse-flow

3876071
