Skip to content

Feature/replace caching#102

Draft
creisle wants to merge 17 commits into
developfrom
feature/replace-caching
Draft

Feature/replace caching#102
creisle wants to merge 17 commits into
developfrom
feature/replace-caching

Conversation

@creisle
Copy link
Copy Markdown
Member

@creisle creisle commented Feb 12, 2026

This replaces the custom caching with a standard caching library. It is very useful if you want to cache between scripts to (for example) an sqlite database so if you were generating many reports right after one another you could so so without having to make as many API calls. By default this will just use an in-memory cache (similar to the current behaviour) unless a cache_name is given.

However I am not sure if the one function i deleted is in use anywhere so that should be double checked with other stakeholders.

@creisle creisle requested a review from elewis2 February 12, 2026 07:33
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 69.64286% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.16%. Comparing base (8777c66) to head (ec6441d).
⚠️ Report is 2 commits behind head on develop.

Files with missing lines Patch % Lines
pori_python/graphkb/util.py 69.09% 17 Missing ⚠️

❌ Your patch status has failed because the patch coverage (69.64%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #102      +/-   ##
===========================================
+ Coverage    83.46%   87.16%   +3.70%     
===========================================
  Files           18       18              
  Lines         2546     2571      +25     
===========================================
+ Hits          2125     2241     +116     
+ Misses         421      330      -91     
Flag Coverage Δ
unittests 87.16% <69.64%> (+3.70%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@elewis2
Copy link
Copy Markdown
Collaborator

elewis2 commented Mar 17, 2026

Noting that tests/test_ipr/test_upload.py takes twice as long to run with this (3 mins to 6). If we want to incorporate this change to help with use of the package at scale we should probably make it optional somehow

@creisle
Copy link
Copy Markdown
Member Author

creisle commented Apr 14, 2026

Noting that tests/test_ipr/test_upload.py takes twice as long to run with this (3 mins to 6). If we want to incorporate this change to help with use of the package at scale we should probably make it optional somehow

where are you getting the 6m for that particular test file from? nothing in this PR should affect IPR upload as this was strictly gkb caching.

@elewis2
Copy link
Copy Markdown
Collaborator

elewis2 commented Apr 14, 2026

test_upload is an end-to-end test that incorporates gkb matching.

@creisle
Copy link
Copy Markdown
Member Author

creisle commented May 1, 2026

I did some load testing of this myself using the test_main script since interactions with IPR are not impacted by any of the changes in this PR. I ran several iterations on both versions all run from the same server both using python 3.11.6. Some run overlapping each other, some run not overlapping. It varies quite a bit depending on the load on the gkb server at the time the script is run (I was running other scripts against gkb from a different server while doing all this test)

master branch

======================== 8 passed in 179.95s (0:02:59) =========================
======================== 8 passed in 297.29s (0:04:57) =========================
======================== 8 passed in 377.09s (0:06:17) =========================
======================== 8 passed in 260.65s (0:04:20) =========================
======================== 8 passed in 264.46s (0:04:24) =========================
======================== 8 passed in 277.63s (0:04:37) =========================
======================== 8 passed in 176.45s (0:02:56) =========================
======================== 8 passed in 179.10s (0:02:59) =========================
======================== 8 passed in 178.38s (0:02:58) =========================
======================== 8 passed in 176.30s (0:02:56) =========================

this branch

======================== 8 passed in 164.50s (0:02:44) =========================
======================== 8 passed in 162.29s (0:02:42) =========================
======================== 8 passed in 162.59s (0:02:42) =========================
======================== 8 passed in 177.36s (0:02:57) =========================
======================== 8 passed in 197.78s (0:03:17) =========================
======================== 8 passed in 275.16s (0:04:35) =========================
======================== 8 passed in 231.05s (0:03:51) =========================
======================== 8 passed in 225.20s (0:03:45) =========================
======================== 8 passed in 175.05s (0:02:55) =========================
======================== 8 passed in 215.89s (0:03:35) =========================

PR branch: 2:42 - 4:35
master branch: 2:56 - 6:17

@elewis2
Copy link
Copy Markdown
Collaborator

elewis2 commented May 13, 2026

I've been seeing runtimes like this when I use test_upload.py to benchmark:

Branch: feature/replace-caching
Run 1: 413.1s [PASS]
Run 2: 380.3s [PASS]
Run 3: 434.2s [PASS]
Average: 409.2s

Branch: develop
Run 1: 255.9s [PASS]
Run 2: 266.9s [PASS]
Run 3: 245.6s [PASS]
Average: 256.1s

(These are from alternating runs - run on feat, switch to dev, run on dev, switch to feat, etc)

I tried with cache-control True and False and got basically the same times.

But when I use a version of test_upload that only runs create_report once, I see better times on the feature branch than on dev:

Branch: feature/replace-caching
Run 1: 190.1s [PASS]
Run 2: 200.1s [PASS]
Run 3: 189.2s [PASS]
Average: 193.2s

Branch: develop
Run 1: 206.1s [PASS]
Run 2: 204.9s [PASS]
Run 3: 207.8s [PASS]
Average: 206.3s

It seems like what's happening is the session-level implementation in feat is removing the benefit that specifically test_upload was seeing from module-level caching. So if there are multiple GKB conns being used, the benefit from caching will be removed.

I think this is an improvement in normal use. I took a look at gsc_report and it looks like a single GKB conn is used for create_report so it shouldn't affect performance there.

edit to add - I assume the rate limiter will slow things down though - appreciate having this disabled during tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants