Internal Database Disk IO improvements and CPU reduction #3114
Conversation
…nts into fix/int-db-improvements
```python
# Create component card
card_class = "active" if is_active else "inactive"
if is_active and not is_alive:
```
Are these changes related or perhaps a bad merge?
Pull request overview
This PR implements significant performance optimizations to reduce CPU usage and disk I/O in Predbat, particularly for resource-constrained devices like Raspberry Pi. The core improvements include database write batching with WAL mode, throttling of file modification checks, and configurable main loop intervals.
Key Changes:
- Enabled SQLite WAL mode with batched commits (configurable interval) to drastically reduce disk I/O
- Throttled file modification checks from every 1s to every 30s to reduce filesystem overhead
- Added configurable performance mode with adjustable main loop interval (1s default, 5s in performance mode)
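The WAL-mode write batching described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration of the technique, not Predbat's actual code; the file name, table name, and `record_state` helper are placeholders:

```python
import sqlite3
import time

# Open the database and switch to Write-Ahead Logging.
# With WAL + synchronous=NORMAL, writes land in the -wal file and are
# fsync'd far less often than in the default rollback-journal mode.
conn = sqlite3.connect("predbat.db")
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")
conn.execute("CREATE TABLE IF NOT EXISTS entity_state (name TEXT, state TEXT, ts REAL)")

commit_interval = 30  # seconds, mirrors db_commit_interval
last_commit = time.time()

def record_state(name, state):
    """Buffer a write; only commit when the batch interval has elapsed.

    An interval of 0 keeps the old behaviour: commit immediately.
    """
    global last_commit
    conn.execute("INSERT INTO entity_state VALUES (?, ?, ?)", (name, state, time.time()))
    if commit_interval == 0 or time.time() - last_commit >= commit_interval:
        conn.commit()
        last_commit = time.time()
```

Uncommitted rows accumulate in the WAL file between batch commits, so the physical disk is touched roughly once per interval instead of once per entity update.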
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/faq.md | Added FAQ entry explaining callback duration warnings (5-20s normal for calculations) |
| docs/customisation.md | Documented performance mode features including loop throttling, WAL batching, and file check throttling |
| docs/components.md | Added documentation for db_commit_interval, performance_mode, and hass_loop_interval configuration options |
| docs/apps-yaml.md | Added Performance Tuning section with configuration examples for the three new settings |
| apps/predbat/web.py | Removed unused dashboard collapsible UI code, fixed car charging condition to use car_charging_hold flag, removed unused imports |
| apps/predbat/hass.py | Implemented main loop throttling (configurable interval), file check throttling (30s), callback duration warnings, conditional log flushing, and improved set_start_method error handling |
| apps/predbat/db_manager.py | Implemented periodic commit timer with configurable interval, conditional immediate commits when interval is 0 |
| apps/predbat/db_engine.py | Enabled WAL mode and NORMAL synchronous setting, removed immediate commits to enable batching, added explicit commit() method |
| apps/predbat/components.py | Added component configuration entries for db_commit_interval, performance_mode, and hass_loop_interval |
apps/predbat/db_manager.py (outdated)
```python
self.return_event = threading.Event()
self.api_started = False
self.last_success_timestamp = None
self.commit_interval = self.get_arg("db_commit_interval", 0)
```
Copilot (AI), Dec 23, 2025:
The db_commit_interval parameter is retrieved twice - once from the function parameter (line 20) and again via self.get_arg() (line 30). The function parameter is never used since self.commit_interval is set from get_arg(). Either remove the function parameter and rely solely on get_arg(), or use the function parameter value. The current implementation suggests the function parameter was intended to be used but was overlooked.
```diff
- self.commit_interval = self.get_arg("db_commit_interval", 0)
+ self.commit_interval = db_commit_interval
```
apps/predbat/db_manager.py (outdated)
```python
wait_time = 0.1
if self.commit_interval > 0:
    remaining = self.commit_interval - (time.time() - last_commit_time)
    wait_time = max(0.1, remaining)
```
Copilot (AI), Dec 23, 2025:
The wait time calculation can produce a negative `remaining` when `time.time() - last_commit_time` exceeds `commit_interval` (i.e. a commit is overdue). Note that `max(0.1, remaining)` then returns 0.1, not the negative value, so the thread never sleeps for a negative duration; still, the intent would be clearer, and the wait bounded on both sides, as `wait_time = max(0.1, min(remaining, commit_interval))`, or by handling the overdue case explicitly.
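A corrected version of the wait calculation might look like the following sketch. It assumes the surrounding commit-timer loop described in the PR; the function and parameter names are hypothetical, pulled out of the loop only so the clamping logic is easy to see:

```python
import time

def compute_wait(commit_interval, last_commit_time, now=None):
    """Return how long the commit thread should sleep.

    Clamps the result to [0.1, commit_interval]: an overdue commit
    (negative remaining) waits the minimum 0.1s, and the wait can
    never exceed the interval itself.
    """
    if now is None:
        now = time.time()
    if commit_interval <= 0:
        # Interval of 0 means "commit immediately": poll frequently.
        return 0.1
    remaining = commit_interval - (now - last_commit_time)
    return max(0.1, min(remaining, commit_interval))
```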
apps/predbat/hass.py (outdated)
```python
except ValueError:
    pass
```
Copilot (AI), Dec 23, 2025:
The try-except block silently catches ValueError when set_start_method("fork") is called, but provides no logging or context about why it failed. This is likely to handle the case where the start method has already been set (which raises RuntimeError, not ValueError) or when fork is unavailable on the platform. Consider logging when this exception occurs to help with debugging, and verify that ValueError is the correct exception type to catch - RuntimeError is raised when the context has already been set.
```diff
- except ValueError:
-     pass
+ except (ValueError, RuntimeError) as exc:
+     print("Unable to set multiprocessing start method to 'fork': {}".format(exc), file=sys.stderr)
```
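As a standalone sketch of the suggested handling (the wrapper function name is hypothetical): `set_start_method("fork")` raises `ValueError` when "fork" is unavailable on the platform and `RuntimeError` when the start method has already been set, so both are caught and logged instead of silently swallowed:

```python
import multiprocessing
import sys

def set_fork_start_method():
    """Try to use the 'fork' start method, logging (not hiding) failures."""
    try:
        multiprocessing.set_start_method("fork")
    except (ValueError, RuntimeError) as exc:
        # ValueError: 'fork' unsupported on this platform (e.g. Windows)
        # RuntimeError: the start method / context was already set
        print("Unable to set multiprocessing start method to 'fork': {}".format(exc), file=sys.stderr)
```

Calling this twice is safe: the second call hits the `RuntimeError` branch and merely logs it.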
apps/predbat/hass.py (outdated)
```python
if not self.args.get("performance_mode", False):
    self.logfile.flush()
msg_lower = msg.lower()
```
Copilot (AI), Dec 23, 2025:
Disabling log file flushing in performance mode means log entries will remain in the buffer and may be lost if the process crashes or is killed before the buffer is flushed. This trades reliability for performance. For critical error and warning messages, immediate flushing might be more appropriate regardless of performance mode, or at minimum this behavior and its implications should be documented for users enabling performance_mode.
```diff
- if not self.args.get("performance_mode", False):
-     self.logfile.flush()
- msg_lower = msg.lower()
+ msg_lower = msg.lower()
+ # Treat errors, warnings, and explicitly non-quiet messages as critical
+ is_critical = (not quiet) or msg_lower.startswith("error") or msg_lower.startswith("warn")
+ # In performance_mode we generally avoid frequent flushes, but always flush critical messages
+ if not self.args.get("performance_mode", False) or is_critical:
+     self.logfile.flush()
```
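The suggested flush policy can be sketched as a small standalone helper. The class name and constructor are hypothetical; Predbat's real logger is structured differently, but the decision logic is the same:

```python
class ThrottledLog:
    """Buffer routine log writes; always flush errors and warnings immediately."""

    def __init__(self, logfile, performance_mode=False):
        self.logfile = logfile
        self.performance_mode = performance_mode

    def log(self, msg, quiet=True):
        self.logfile.write(msg + "\n")
        msg_lower = msg.lower()
        # Errors, warnings, and explicitly non-quiet messages are critical
        is_critical = (not quiet) or msg_lower.startswith("error") or msg_lower.startswith("warn")
        # Skip frequent flushes in performance mode, except for critical lines,
        # so a crash cannot lose the messages that matter most
        if not self.performance_mode or is_critical:
            self.logfile.flush()
```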
```python
def initialize(self, db_enable, db_days, db_commit_interval=0):
    self.db_days = db_days
```
Copilot (AI), Dec 23, 2025:
This method requires at least 3 positional arguments, whereas overridden ComponentBase.initialize requires 1.
```diff
- def initialize(self, db_enable, db_days, db_commit_interval=0):
-     self.db_days = db_days
+ def initialize(self, **kwargs):
+     if "db_days" in kwargs:
+         self.db_days = kwargs["db_days"]
+     else:
+         self.db_days = self.get_arg("db_days")
```
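The signature-compatibility point can be illustrated with a stripped-down sketch. The class names follow the PR's `ComponentBase`, but the bodies here are hypothetical; the key is that the override accepts `**kwargs` so its signature matches the base class, with explicit defaults for the optional settings:

```python
class ComponentBase:
    def initialize(self, **kwargs):
        raise NotImplementedError

class DBManager(ComponentBase):
    def initialize(self, **kwargs):
        # Pull optional settings from kwargs, falling back to defaults,
        # keeping the signature compatible with ComponentBase.initialize
        self.db_enable = kwargs.get("db_enable", True)
        self.db_days = kwargs.get("db_days", 30)
        self.commit_interval = kwargs.get("db_commit_interval", 0)
```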
yes, this is true, typo
sure
…ex creation, new log_level feature

Changes:
- db_engine.py: Remove commit() after entity insert (was causing constant WAL updates)
- db_engine.py: Add database index for fast history queries
- db_manager.py: Use function param directly, fix negative wait_time, cap wait to 1s
- hass.py: Add log_level filtering feature (debug/info/warn/error)
- hass.py: Catch both ValueError and RuntimeError for set_start_method
- web.py: Restore error card class styling, accept log_level param
- components.py: Add log_level to Web Interface args
Predbat Performance Optimisation Guide
FYI - This feature was heavily created with AI, yet it works, and my disk activity and CPU have plummeted since using it.
Overview of Changes
We have implemented several optimisations to `hass.py` and the database engine to address high CPU usage and excessive Disk I/O. These changes are designed to make Predbat run more efficiently, particularly on resource-constrained devices like Raspberry Pis or older hardware.

Key Changes & Rationale
1. Database Write Batching (Critical for Disk I/O)
The Change:
- Enabled WAL (Write-Ahead Logging) mode for SQLite.
- Removed an immediate commit (`self.db.commit()`) that was accidentally running after every single entity update.
- Added a configurable batched commit interval, `db_commit_interval: 30` (seconds).

The Rationale:
Previously, every time Predbat updated a sensor or entity state, it forced a write to the physical disk immediately.
Now changes are buffered and flushed to the write-ahead log (`predbat.db-wal`) once every 30 seconds. This drastically reduces "wear and tear" on your storage and frees up system resources.

2. Throttling File Modification Checks (CPU Usage)
The Change:
The `check_modified` function (which scans all `.py` files to see if code has changed) was running every 1 second; it now runs once every 30 seconds.

The Rationale:
Scanning the file system every second consumes unnecessary CPU cycles (`sys` usage). By reducing this frequency, `hass.py` spends less time checking for file changes and more time doing actual work (or sleeping), lowering overall CPU usage.

3. Main Loop Throttling
The Change:
Increased `hass_loop_interval` to 5 seconds (up from 1s) when in `performance_mode`.

The Rationale:
If there is no work to do, there is no need to wake up every second. Sleeping for 5 seconds reduces the idle load on the CPU.
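Putting sections 2 and 3 together, the throttled main loop can be sketched as follows. The helper names (`do_work`, `check_modified`, `stop`) are hypothetical stand-ins for Predbat's real callbacks; the intervals match the values described above:

```python
import time

LOOP_INTERVAL = 5          # seconds per iteration in performance_mode (1s otherwise)
FILE_CHECK_INTERVAL = 30   # throttle for scanning .py files for modifications

def run_loop(do_work, check_modified, stop,
             loop_interval=LOOP_INTERVAL, file_check_interval=FILE_CHECK_INTERVAL):
    """Main loop: sleep loop_interval between iterations and only scan
    for modified source files every file_check_interval seconds."""
    last_file_check = 0.0
    while not stop():
        now = time.time()
        if now - last_file_check >= file_check_interval:
            check_modified()       # expensive filesystem scan, now throttled
            last_file_check = now
        do_work()
        time.sleep(loop_interval)
```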
Explanation of Log Warnings
You noticed logs like:
`Warn: Callback ... took 9.98 seconds`

What this means:
These warnings indicate that the main Predbat calculation logic (planning, fetching data, solving) took 9.98 seconds to complete. During this time, the "loop" is blocked—meaning it can't engage in other quick tasks until the calculation finishes.
Is this something to worry about?
Generally no. With `hass_loop_interval` set to 5s, a 10s calculation just means one or two "heartbeats" are skipped. As long as it finishes successfully (which it does), this is acceptable behavior for a heavy calculation task.

Summary: The recent optimizations won't necessarily make the calculation faster (that depends on CPU speed), but they stop the overhead (disk writes, file checks) from slowing it down further.