Skip to content

fix: close file handles via context managers, fix Config string-concat and mutable default arg bugs#47

Open
suyash469 wants to merge 2 commits intoEAPD-DRB:mainfrom
suyash469-gsoc:fix/file-handle-leak-and-silent-bugs
Open

fix: close file handles via context managers, fix Config string-concat and mutable default arg bugs#47
suyash469 wants to merge 2 commits intoEAPD-DRB:mainfrom
suyash469-gsoc:fix/file-handle-leak-and-silent-bugs

Conversation

@suyash469
Copy link
Copy Markdown

Summary

Fixes three real bugs found in the API base classes during code review.

What changed

1. FileClass.py — Resource leak: unclosed file handles

All four methods (readFile, writeFile, writeFileUJson, readParamFile) used
bare open() + close(). If an exception fired between them, the file handle was
never closed — causing "Too many open files" OS errors under load and potential
data corruption on writes.

Fix: Replaced all four open/close pairs with with context managers, which
guarantee closure even on exception.

2. Config.py — Silent string concatenation in PARAMETERS_C

A missing comma between 'e' and 't' in the EmissionActivityRatio entry:

# Before (bug): 'e''t' is silently parsed by Python as 'et' → 4 dims
'EmissionActivityRatio': ['r', 'e''t', 'y', 'm'],

# After (fix): correct 5 dimensions
'EmissionActivityRatio': ['r', 'e', 't', 'y', 'm'],

@NamanmeetSingh
Copy link
Copy Markdown

Hi Suyash, great catch on the mutable default argument (kwargs={}) in the CustomThread class! That’s a subtle Python gotcha that can definitely cause state-leakage bugs.

Also, since you are refactoring FileClass.py, I’ll drop the an optimization note here:
In readFile and readParamFile, keeping json.loads(f.read()) pulls the entire text payload into memory before parsing. For standard configs this is fine, but as we prep for the OG-CLEWS integration, we will be loading massive, multi-country scenario files. Switching these to json.load(f) will allow the parser to stream directly from the file, preventing massive RAM spikes during the model runs.

Really clean fixes overall, especially the thread state correction.

@MahadevBalla
Copy link
Copy Markdown
Contributor

This is a good cleanup - switching to context managers definitely fixes the file handle leak issue but one thing I noticed is that we’re still opening files with "w", which truncates the file immediately. So if something crashes mid-write (disk full, process killed, etc.), the file could still end up partially written or empty.

Maybe it’s worth considering a temp-file + atomic rename approach here? That way the original file isn’t touched unless the new write fully succeeds. Since these JSON files hold core model data, that might make the writes a bit more resilient.

@suyash469
Copy link
Copy Markdown
Author

Hi Suyash, great catch on the mutable default argument (kwargs={}) in the CustomThread class! That’s a subtle Python gotcha that can definitely cause state-leakage bugs.

Also, since you are refactoring FileClass.py, I’ll drop the an optimization note here: In readFile and readParamFile, keeping json.loads(f.read()) pulls the entire text payload into memory before parsing. For standard configs this is fine, but as we prep for the OG-CLEWS integration, we will be loading massive, multi-country scenario files. Switching these to json.load(f) will allow the parser to stream directly from the file, preventing massive RAM spikes during the model runs.

Really clean fixes overall, especially the thread state correction.

Great point on json.load(f) vs json.loads(f.read()) — the streaming approach
is clearly better for large OG-CLEWS scenario files. Will update readFile and
readParamFile to use json.load(f) directly.

…t bug, fix mutable default arg in CustomThread

- FileClass.py: Replace bare open/close pairs with with context managers
  in readFile, writeFile, writeFileUJson, readParamFile. Prevents file handle
  leaks when exceptions are raised between open() and close().

- Config.py: Add missing comma in PARAMETERS_C['EmissionActivityRatio'].
  'e''t' was silently parsed by Python as 'et' (implicit string concat),
  giving this param only 4 dimensions instead of the required 5 (r, e, t, y, m).

- CustomThreadClass.py: Replace mutable default argument kwargs={} with
  kwargs=None sentinel. All instances that omitted kwargs previously shared
  the same dict object; now each call gets its own independent dict.
…d readParamFile

Streaming directly from the file object avoids loading the entire
JSON payload into a string before parsing, reducing peak RAM usage
for large OG-CLEWS multi-country scenario files.
@suyash469 suyash469 force-pushed the fix/file-handle-leak-and-silent-bugs branch from 735b7d2 to 8923c99 Compare March 3, 2026 05:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants