r.watershed: Can now optionally use already existing maps to optimize performance#6953
r.watershed: Can now optionally use already existing maps to optimize performance#6953sumitchintanwar wants to merge 8 commits intoOSGeo:mainfrom
Conversation
|
Did you ever try using hyperfine for actually making the stats, doing enough runs for it to be statistically significative, handling outliers (often caused by other programs running), and removing the shell startup time. But any improvement is an improvement! |
|
Thanks! Please check our contributing guidelines for code style (use pre-commit). Please add a test. Include a quick test (on smaller part of "elevation" map and maybe on artificial surface) and perhaps one that runs longer with larger area and deactivate it to not slow down the CI, but I can then at least run it locally. I am not an expert on this tool, but one of my concerns with this is that if user tries to pass a flow accumulation/drainage raster computed with different tool (e.g. r.stream.* tools can handle that), the algorithm may potentially fail (incorrect results, segfaults). This can be of course discouraged in the documentation, but maybe there are other ways, e.g. detect the drainage conventions and check they match. |
I haven't tried hyperfine. Thank you for the suggestion. I will try that for the tests. i think the actual speedup would be more modest once measured correctly with hyperfine. But hey, It's still a worthwhile improvement. |
@petrasovaa, Thanks for the feedback and the code review. I'll check the guidelines and get back to you with proper tests. |
|
Hey @petrasovaa, @echoix, I have made the requested changes and also added benchmarks using hyperfine along with proper documentation, tests and constraints of this feature. Ready for review when you have a chance! |
|
On a side note, I have to admit, I'm still climbing the GRASS learning curve a bit, but I'm really enjoying the challenge! It's super interesting digging into how this all works. Let me know what you think of the updates!" |
|
Some of the checks are failing because of hyperfine not found. Is there anything I should do to resolve this specifically? "raster/r.watershed/testsuite/benchmark_reuse.sh: line 15: hyperfine: command not found" |
e2cc694 to
1449daf
Compare
|
Given the significant changes in the code, I think we need to step back and write proper tests of the current code in a separate PR. I don't think the existing tests are comprehensive enough. Once those tests are merged, we can verify this PR is not breaking anything. Limit the region so that the test run fast, but include a test with larger area that we can verify locally but skip it in the CI. |
Okay. I understand @petrasovaa . I will make a separate PR with more comprehensive tests. I would like some clarity on what more tests should I add. I have skipped Basin delineation, TCI Sp calculation as for those significant algorithm changes will have to be added. |
|
The Tests are for reuse functionality are added in PR #6992. I'd appreciate the review. |
There was a problem hiding this comment.
I don’t think benchmarking is appropriate to run in tests, even less in CI where the actual hardware is variable and we won’t even do something with it. It’s nice to have as a reference, maybe place it in the PR or ask if it should be left as a separate file not picked up in tests
There was a problem hiding this comment.
Thanks for the review. I have skipped it in CI. Should I remove it from the PR entirely ? Results are there in the PR description. I just added it as something to test locally.
That's a misunderstanding... Please reread my comment again. To move on with this PR, we need to have tests in place that would catch any regressions of the existing functionality caused by this PR. This PR may potentially break existing functionality and need to be able to catch that. Existing tests are not comprehensive enough.
Could explain this a little bit more? |
I am sorry for the confusion. I misunderstood. I'll add tests for the current code in a different pr, so it is ensured that the existing functionality isn't broken. |
@petrasovaa TCI/SPI require slope computation from the elevation model, which is skipped when reusing flow maps. To keep this change focused, reuse mode is currently limited to outputs that can be derived directly from reused flow maps. |
|
@petrasovaa The regression tests are added in PR #7029. I’d really appreciate a quick review. |
Overview
This PR introduces a "reuse" feature to
r.watershed, allowing users to skip the computationally expensive flow accumulation and drainage direction steps if these maps already exist. This is particularly useful for:Key Changes
main.cand RAM/SEG modules to accept existingaccumulationanddrainagemaps as inputs.raster/r.watershed/REUSE_FEATURE.mdcovering workflows, limitations, and "Do's and Don'ts".raster/r.watershed/testsuite/BENCHMARKS.mdusinghyperfine.testsuite/to verify correctness.Performance

Benchmarks performed using
hyperfineon the NC SPM dataset show significant speedups when iterating on thresholds.Multiple Benchmark tests using hyperfine show similar significant improvement like above.
Limitations
r.watershed. Maps fromr.stream.*or other tools may have different drainage conventions and are not supported (explicitly warned in docs).addressed Feedback
timewithhyperfinefor statistically significant benchmarking.G_option_collective).Closes #6720