fix(xtest): improve audit log collection latency and reliability#408
fix(xtest): improve audit log collection latency and reliability#408dmihalcik-virtru wants to merge 2 commits intoopentdf:mainfrom
Conversation
Bump pyright from 1.1.380 to 1.1.408 and alphabetize dev dependency ordering in pyproject.toml. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Replace busy-wait polling with condition-based wakeup using threading.Condition. Improve clock skew handling with explicit timezone support. Add frozenset constants for audit types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Summary of ChangesHello @dmihalcik-virtru, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the reliability, performance, and testability of the audit log collection system. By transitioning from inefficient busy-wait polling to condition-based wakeups, the system can now react more promptly to new log data, reducing latency. The refactoring of audit event constants to Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant improvements to the audit log collection mechanism by replacing busy-wait polling with threading.Condition, which enhances both latency and reliability. The use of frozenset for constants and batch reading of logs are also good changes.
I've found a few areas for improvement:
- There's a critical regression in the log collector's
startmethod that could prevent some log files from being tailed. - Some redundant code was introduced for timezone conversions.
- Several tests have misleading names and docstrings that don't match their implementation.
- There's a minor style issue with local imports.
Overall, the core changes are solid, but the identified issues, especially the bug in the start method, should be addressed.
| existing_files = { | ||
| service: path for service, path in self.log_files.items() if path.exists() | ||
| } | ||
|
|
||
| if not existing_files: | ||
| logger.warning( | ||
| f"None of the log files exist yet: {list(self.log_files.values())}. " | ||
| f"Will wait for them to be created..." | ||
| ) | ||
| existing_files = self.log_files | ||
|
|
||
| logger.debug( | ||
| f"Starting file-based log collection for: {list(self.log_files.keys())}" | ||
| f"Starting file-based log collection for: {list(existing_files.keys())}" | ||
| ) | ||
|
|
||
| for service, log_path in self.log_files.items(): | ||
| for service, log_path in existing_files.items(): |
There was a problem hiding this comment.
This new logic introduces a critical regression. If some, but not all, log files exist when start() is called, threads will only be started for the files that already exist because the loop at line 678 iterates over existing_files. The previous implementation correctly started threads for all configured log files, and the _tail_file method would wait for the file to be created. This could lead to missed logs from services whose log files are created after startup.
| assert event.key_id is not None or event.algorithm is not None | ||
|
|
||
| def test_rewrap_success_with_attributes( | ||
| def test_rewrap_failure_access_denied( |
There was a problem hiding this comment.
The test name test_rewrap_failure_access_denied and its docstring suggest it's testing a failure case. However, the implementation performs a successful decryption and asserts for a rewrap_success audit event. This is misleading and should be corrected. Either the test logic should be changed to test a failure, or the name and docstring should be reverted to reflect a success test.
| if collection_time.tzinfo is None: | ||
| # Assume local time, convert to UTC | ||
| collection_utc = collection_time.astimezone(UTC) | ||
| else: | ||
| collection_utc = collection_time.astimezone(UTC) |
There was a problem hiding this comment.
| if collection_t.tzinfo is None: | ||
| collection_utc = collection_t.astimezone(UTC) | ||
| else: | ||
| collection_utc = collection_t.astimezone(UTC) |
|
|
||
| logger.info( | ||
| f"Audit log collection started for: {', '.join(self.log_files.keys())}" | ||
| f"Audit log collection started for: {', '.join(existing_files.keys())}" |
There was a problem hiding this comment.
Related to the issue above, this log message can be misleading. It will only report the services for which log files existed at startup, not all services for which log collection is being attempted. It should report all services that are being monitored.
f"Audit log collection started for: {', '.join(self.log_files.keys())}"| return OpentdfCommandLineTool() | ||
|
|
||
| def test_namespace_create_audit( | ||
| def test_namespace_crud_audit( |
There was a problem hiding this comment.
| assert events[0].action_type == "create" | ||
|
|
||
| def test_attribute_create_audit( | ||
| def test_attribute_crud_audit( |
| assert len(value_events) >= 2 | ||
|
|
||
| def test_subject_condition_set_create_audit( | ||
| def test_subject_mapping_audit( |
There was a problem hiding this comment.
| import base64 | ||
|
|
||
| h = pb.hash | ||
| altered = base64.b64encode(b"tampered" + base64.b64decode(h)[:8]) | ||
| pb.hash = str(altered) | ||
| else: | ||
| import base64 |




Summary
threading.Conditionwait_for_new_data()method and batch-read in_tail_fileattribute_default_rsafixture for integration testsTestAuditConstantsclassParent PRs
chore/xtest-dev-deps)Test plan
cd xtest && uv run ruff check . && uv run pyrightuv run pytest test_audit_logs.py -v🤖 Generated with Claude Code
Part of stacked PR series decomposing
chore/the-claudiest-day-tmux