perf(spp_programs): replace Python uniqueness checks with SQL constraints#116
perf(spp_programs): replace Python uniqueness checks with SQL constraints#116kneckinator wants to merge 1 commit into19.0from
Conversation
…ints Python @api.constrains methods performed per-record search() calls during bulk create, causing O(N^2) behavior. SQL UNIQUE constraints enforce uniqueness at the database level in O(1) per row. Includes pre-migration to deduplicate existing data before constraints apply.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the performance and data integrity of the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request is a great performance improvement, correctly replacing inefficient Python constraints with robust SQL UNIQUE constraints. The inclusion of a pre-migration script to handle existing data demonstrates thoroughness and foresight. The new tests are also excellent, verifying the database-level changes directly. I have one suggestion in the migration script to improve maintainability by reducing code duplication.
| def _deduplicate_program_memberships(cr): | ||
| """Remove duplicate (partner_id, program_id) rows from spp_program_membership.""" | ||
| cr.execute( | ||
| """ | ||
| DELETE FROM spp_program_membership | ||
| WHERE id IN ( | ||
| SELECT id FROM ( | ||
| SELECT id, | ||
| ROW_NUMBER() OVER ( | ||
| PARTITION BY partner_id, program_id | ||
| ORDER BY id | ||
| ) AS rn | ||
| FROM spp_program_membership | ||
| ) sub | ||
| WHERE rn > 1 | ||
| ) | ||
| """ | ||
| ) | ||
| if cr.rowcount: | ||
| _logger.info("Deduplicated %d duplicate program membership rows", cr.rowcount) | ||
|
|
||
|
|
||
| def _deduplicate_cycle_memberships(cr): | ||
| """Remove duplicate (partner_id, cycle_id) rows from spp_cycle_membership.""" | ||
| cr.execute( | ||
| """ | ||
| DELETE FROM spp_cycle_membership | ||
| WHERE id IN ( | ||
| SELECT id FROM ( | ||
| SELECT id, | ||
| ROW_NUMBER() OVER ( | ||
| PARTITION BY partner_id, cycle_id | ||
| ORDER BY id | ||
| ) AS rn | ||
| FROM spp_cycle_membership | ||
| ) sub | ||
| WHERE rn > 1 | ||
| ) | ||
| """ | ||
| ) | ||
| if cr.rowcount: | ||
| _logger.info("Deduplicated %d duplicate cycle membership rows", cr.rowcount) |
There was a problem hiding this comment.
The functions _deduplicate_program_memberships and _deduplicate_cycle_memberships contain nearly identical SQL logic. To improve maintainability and reduce code duplication, consider refactoring them into a single, generic helper function. This would make the script more concise and provide a reusable pattern for similar deduplication tasks in future migrations.
Here is an example of how you could refactor it:
def _deduplicate_records(cr, table_name, partition_by_cols, log_message):
"""Generic helper to delete duplicate records from a table."""
# Using .format() for table/column names is safe here as they are not from user input.
query = """
DELETE FROM {table}
WHERE id IN (
SELECT id FROM (
SELECT id,
ROW_NUMBER() OVER (
PARTITION BY {columns}
ORDER BY id
) AS rn
FROM {table}
) sub
WHERE rn > 1
)
""".format(table=table_name, columns=', '.join(partition_by_cols))
cr.execute(query)
if cr.rowcount:
_logger.info(log_message, cr.rowcount)
def _deduplicate_program_memberships(cr):
"""Remove duplicate (partner_id, program_id) rows from spp_program_membership."""
_deduplicate_records(
cr,
"spp_program_membership",
["partner_id", "program_id"],
"Deduplicated %d duplicate program membership rows",
)
def _deduplicate_cycle_memberships(cr):
"""Remove duplicate (partner_id, cycle_id) rows from spp_cycle_membership."""
_deduplicate_records(
cr,
"spp_cycle_membership",
["partner_id", "cycle_id"],
"Deduplicated %d duplicate cycle membership rows",
)This approach centralizes the deletion logic, making the script easier to read and maintain.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 19.0 #116 +/- ##
==========================================
- Coverage 70.14% 69.95% -0.20%
==========================================
Files 739 783 +44
Lines 43997 46568 +2571
==========================================
+ Hits 30863 32578 +1715
- Misses 13134 13990 +856
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Summary
Replace Python
@api.constrainsuniqueness checks with SQLUNIQUEconstraints onspp.program.membership,spp.cycle.membership, andspp.entitlement. Add a pre-migration script to deduplicate existing data before constraints apply. Bump module version to19.0.2.1.0.The Python
constrainsmethods performed per-recordsearch()calls during bulk create, causing O(N²) behavior. With 1M records, this made bulk enrollment and cycle membership creation prohibitively slow. SQLUNIQUEconstraints enforce the same invariants at the database level in O(1) per row.Changes
models/program_membership.pyUNIQUE(partner_id, program_id), removed_check_unique_partner_per_program()models/cycle_membership.pyUNIQUE(partner_id, cycle_id), removed_check_unique_partner_per_cycle()models/entitlement_base_model.pyUNIQUE(code), removed_check_unique_code()models/entitlement.py_check_unique_code()migrations/19.0.2.1.0/pre-migration.pyROW_NUMBER() OVER (PARTITION BY ...)before constraints apply. Memberships: deletes duplicates keeping oldest. Entitlements: appends-{id}suffix to duplicate codes (preserves financial records).tests/test_sql_constraints.pypg_constraintlevel, duplicate blocking via raw SQLINSERT, and legitimate multi-record scenariosContext
This is Phase 1 of 9 in the
spp_programsperformance optimization effort. SQL constraints are a prerequisite for Phase 7 (INSERT ON CONFLICT bulk membership creation).Test Plan
./scripts/test_single_module.sh spp_programspassesValidationError(Odoo wrapsIntegrityError)