perf(cdr): optimize queries for large CDR installations #207
Open
edospadoni wants to merge 7 commits into ns8
…ed geo columns Replace date_format() anti-patterns with range scans to enable index usage, add indexes on cdr source table and generated cdr_YYYY/cdr_YYYY-MM tables, pre-compute geographic columns (src_region, src_province, dst_region, dst_province) to eliminate millions of correlated subqueries in dashboard views.
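A minimal Go sketch of the range-scan idea from this commit: instead of wrapping `calldate` in `date_format()`, compute half-open bounds once and compare the bare column, so an index on `calldate` stays usable. The helper names (`yearRange`, `monthRange`, `rangePredicate`) are illustrative and not taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// yearRange returns the half-open interval [start, end) covering one calendar year.
func yearRange(year int) (start, end time.Time) {
	start = time.Date(year, time.January, 1, 0, 0, 0, 0, time.UTC)
	return start, start.AddDate(1, 0, 0)
}

// monthRange returns the half-open interval [start, end) covering one calendar month.
func monthRange(year int, month time.Month) (start, end time.Time) {
	start = time.Date(year, month, 1, 0, 0, 0, 0, time.UTC)
	return start, start.AddDate(0, 1, 0)
}

// rangePredicate renders a predicate on the bare calldate column.
// The values come from time.Time, never from user input; real code
// would more likely pass them as query parameters.
func rangePredicate(start, end time.Time) string {
	const layout = "2006-01-02"
	return fmt.Sprintf("calldate >= '%s' AND calldate < '%s'",
		start.Format(layout), end.Format(layout))
}

func main() {
	s, e := yearRange(2025)
	fmt.Println(rangePredicate(s, e))
	// calldate >= '2025-01-01' AND calldate < '2026-01-01'

	s, e = monthRange(2025, time.December)
	fmt.Println(rangePredicate(s, e))
	// calldate >= '2025-12-01' AND calldate < '2026-01-01'
}
```

The exclusive upper bound avoids having to know the last day (or last second) of a period, and December rolls over to the next year without special handling.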
…ize views
- Replace correlated IN (SELECT) subqueries for trunk type detection with a temporary table + LEFT JOIN (avoids a per-row subquery on 2M rows)
- Replace all dispositions REGEXP patterns with LIKE equivalents in 12 SQL query templates and the Go ExtractDispositions function
- Parallelize view execution in views.go using goroutines with bounded concurrency (4 workers) for independent SQL view files
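One common way to get bounded concurrency like the 4-worker setup described above is a buffered channel used as a counting semaphore. This is only a sketch under that assumption; the actual structure of `views.go` is not shown on this page, and `runBounded` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded calls run once per file, with at most `workers` calls in flight.
// It returns one error slot per file, in input order.
func runBounded(files []string, workers int, run func(file string) error) []error {
	errs := make([]error, len(files))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, f := range files {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` tasks are already running
		go func(i int, f string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			errs[i] = run(f)
		}(i, f)
	}
	wg.Wait()
	return errs
}

func main() {
	// File names are placeholders for independent SQL view files.
	files := []string{"a.sql", "b.sql", "c.sql", "d.sql", "e.sql", "f.sql"}
	errs := runBounded(files, 4, func(file string) error {
		fmt.Println("executing", file) // stand-in for running the view's SQL
		return nil
	})
	for i, err := range errs {
		if err != nil {
			fmt.Println(files[i], "failed:", err)
		}
	}
}
```

Each goroutine writes only to its own slot of `errs`, so no extra locking is needed. This only works for view files that do not depend on each other, which matches the "independent SQL view files" wording of the commit.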
edospadoni added a commit to nethesis/ns8-nethvoice that referenced this pull request on Feb 26, 2026
Points to nethesis/nethvoice-report#207 which includes:
- Indexes on source cdr table and generated cdr_YYYY/cdr_YYYY-MM tables
- Range scans replacing date_format() anti-patterns
- Pre-computed geographic columns eliminating correlated subqueries
- LEFT JOIN for trunk type detection instead of per-row subqueries
- REGEXP replaced with LIKE for disposition filters
- Parallel view execution with goroutines
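The REGEXP-to-LIKE item can be illustrated with a small conversion helper. This is a sketch, not the PR's `ExtractDispositions`: `likePattern` is a hypothetical function that assumes the pattern is a plain literal with optional `^` and `$` anchors, and that LIKE uses MySQL's default backslash escape character.

```go
package main

import (
	"fmt"
	"strings"
)

// likePattern converts a literal pattern with optional ^ and $ anchors into
// a LIKE pattern. It assumes the body contains no other regex metacharacters;
// anything richer should stay a REGEXP.
func likePattern(re string) string {
	prefix, suffix := "%", "%"
	if strings.HasPrefix(re, "^") {
		re, prefix = re[1:], "" // anchored at start: no leading wildcard
	}
	if strings.HasSuffix(re, "$") {
		re, suffix = re[:len(re)-1], "" // anchored at end: no trailing wildcard
	}
	// % and _ are wildcards in LIKE but literals in a regex, so escape them.
	re = strings.NewReplacer(`%`, `\%`, `_`, `\_`).Replace(re)
	return prefix + re + suffix
}

func main() {
	for _, p := range []string{"ANSWERED", "FOO$", "^FOO", "NO_ANSWER"} {
		fmt.Printf("REGEXP '%s' -> LIKE '%s'\n", p, likePattern(p))
	}
}
```

The escaping step matters for values containing an underscore: without it, `_` would match any single character and the LIKE filter would be looser than the REGEXP it replaces.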
This was referenced Feb 26, 2026
Prevent update timeout by removing index creation from schema.sql.tmpl (runs during container startup) and SQL templates. All DDL and geo population now runs via separate db.Exec() calls in Go using a dedicated connection pool without read/write timeout. Geo UPDATEs are batched (100K rows) to keep each operation bounded.
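The batching described in this commit can be sketched as a loop that repeats a bounded statement until a batch comes back short. `updateInBatches` and `batchExec` are hypothetical names, and the statement shape in the comment is a placeholder. The sketch also assumes every processed row ends up with a non-NULL `src_region`, otherwise the same rows would be selected again.

```go
package main

import "fmt"

// batchExec runs one bounded statement and reports how many rows it changed.
// In real code this would wrap db.Exec(...) and result.RowsAffected().
type batchExec func(limit int) (int64, error)

// updateInBatches repeats exec until a batch changes fewer rows than the
// limit, which means no unprocessed rows are left. It returns the total.
func updateInBatches(exec batchExec, limit int) (int64, error) {
	var total int64
	for {
		n, err := exec(limit)
		if err != nil {
			return total, fmt.Errorf("batch after %d rows: %w", total, err)
		}
		total += n
		if n < int64(limit) {
			return total, nil
		}
	}
}

func main() {
	// Placeholder statement shape (table and expression are assumptions):
	//   UPDATE `cdr_2025` SET src_region = ... WHERE src_region IS NULL LIMIT 100000
	remaining := int64(250000) // fake table with 250K rows still to populate
	fake := func(limit int) (int64, error) {
		n := remaining
		if n > int64(limit) {
			n = int64(limit)
		}
		remaining -= n
		return n, nil
	}
	total, err := updateInBatches(fake, 100000)
	fmt.Println(total, err) // 250000 <nil>
}
```

Keeping each statement bounded limits how long any single UPDATE holds locks, and an interrupted run can resume from where it stopped because processed rows no longer match the `IS NULL` filter.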
edospadoni added a commit to nethesis/ns8-nethvoice that referenced this pull request on Feb 26, 2026
Add idempotent index creation for queue_log, queue_log_history, report_queue, and cdr in the miner script. These tables are filtered and joined repeatedly by time range, event, queue, callid, agent, and linkedid during each consolidation run. Without proper indexes, MySQL falls back to wide scans, making the job slow and increasing lock time on busy systems. Creating targeted composite indexes improves selectivity for the hottest predicates, reduces I/O and temporary work, and keeps report generation stable as data grows.
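A rough Go sketch of idempotent index creation: check whether each index exists, add it only if missing, so a second run does nothing. The `store` interface and `memStore` fake are hypothetical stand-ins for a real connection. A real `IndexExists` could query `information_schema.statistics` filtered by `table_schema`, `table_name` and `index_name`. The two index names are the `cdr` ones listed in the PR description.

```go
package main

import "fmt"

// indexDef names an index and its comma-separated column list.
type indexDef struct {
	Name    string
	Columns string
}

// store is a hypothetical abstraction over the database connection.
type store interface {
	IndexExists(table, index string) (bool, error)
	AddIndex(table string, d indexDef) error
}

// addIndexSQL renders the DDL a real AddIndex would execute.
func addIndexSQL(table string, d indexDef) string {
	return fmt.Sprintf("ALTER TABLE `%s` ADD INDEX `%s` (%s)", table, d.Name, d.Columns)
}

// ensureIndexes creates only the missing indexes and returns how many it added.
func ensureIndexes(db store, table string, defs []indexDef) (int, error) {
	created := 0
	for _, d := range defs {
		ok, err := db.IndexExists(table, d.Name)
		if err != nil {
			return created, err
		}
		if ok {
			continue // already present: nothing to do
		}
		if err := db.AddIndex(table, d); err != nil {
			return created, err
		}
		created++
	}
	return created, nil
}

// memStore is an in-memory fake used only for this demonstration.
type memStore struct {
	indexes map[string]bool
	log     []string
}

func (m *memStore) IndexExists(table, index string) (bool, error) {
	return m.indexes[table+"."+index], nil
}

func (m *memStore) AddIndex(table string, d indexDef) error {
	m.indexes[table+"."+d.Name] = true
	m.log = append(m.log, addIndexSQL(table, d))
	return nil
}

func main() {
	db := &memStore{indexes: map[string]bool{}}
	defs := []indexDef{{"idx_cdr_calldate", "calldate"}, {"idx_cdr_linkedid", "linkedid"}}
	first, _ := ensureIndexes(db, "cdr", defs)
	second, _ := ensureIndexes(db, "cdr", defs)
	fmt.Println(first, second) // 2 0
}
```

Checking metadata first keeps repeat runs cheap, since only the initial run pays for building the indexes.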
Month tables created before the geo migration don't have the src_region/src_province/dst_region/dst_province columns. When the year table gets them via migrateGeoColumns, the subsequent INSERT ... SELECT * into the month table fails with column count mismatch. Fix by ensuring geo columns exist on month tables before running the month template.
Tables created before the geo migration don't have the geo columns. When cdr_year.sql includes NULL geo placeholders in SELECT, or when cdr_month.sql does SELECT * from a year table with geo columns, the INSERT fails with column count mismatch. Fix by:
- Including geo columns in cdr_year.sql CREATE TABLE definition and both SELECT statements (NULL placeholders)
- Calling ensureGeoColumns() on both year and month tables before running their templates
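To illustrate the fix, here is a sketch of the check that a function like `ensureGeoColumns()` might perform: compare a table's existing columns against the four geo columns and build one ALTER TABLE for whatever is missing. The helper names, the sample column list, and the `VARCHAR(64)` type are assumptions for illustration; the real column types are not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// geoColumns lists the pre-computed columns in the order they are appended.
var geoColumns = []string{"src_region", "src_province", "dst_region", "dst_province"}

// missingColumns returns the required columns absent from existing, keeping
// the order of required so tables are extended in a consistent order.
func missingColumns(existing, required []string) []string {
	have := make(map[string]bool, len(existing))
	for _, c := range existing {
		have[strings.ToLower(c)] = true
	}
	var out []string
	for _, c := range required {
		if !have[c] {
			out = append(out, c)
		}
	}
	return out
}

// addColumnsDDL builds a single ALTER TABLE adding the missing columns.
// It returns "" when nothing is missing. colType is an assumption.
func addColumnsDDL(table string, missing []string, colType string) string {
	if len(missing) == 0 {
		return ""
	}
	parts := make([]string, len(missing))
	for i, c := range missing {
		parts[i] = fmt.Sprintf("ADD COLUMN `%s` %s NULL", c, colType)
	}
	return fmt.Sprintf("ALTER TABLE `%s` %s", table, strings.Join(parts, ", "))
}

func main() {
	// Illustrative column list for a month table created before the migration.
	existing := []string{"calldate", "src", "dst", "type"}
	fmt.Println(addColumnsDDL("cdr_2025-01", missingColumns(existing, geoColumns), "VARCHAR(64)"))
}
```

Because `INSERT ... SELECT *` matches columns by position, the source and target tables need the same number of columns in the same order, which is why the columns are appended in one fixed order here.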
This reverts commit 0104d29. The index updates will instead run at FreePBX container startup (nethesis/ns8-nethvoice#717).
Stell0 added a commit to nethesis/ns8-nethvoice that referenced this pull request on Mar 3, 2026
Add conditional CDR index creation during FreePBX startup for upgrades, so existing installations get the required indexes only if missing. Add the same indexes to the CDR schema used on fresh installs to keep new deployments aligned with upgraded ones. Refs: nethesis/nethvoice-report#207
Stell0 added a commit to nethesis/ns8-nethvoice that referenced this pull request on Mar 3, 2026
Extend slow database updates to create missing indexes on queue and report tables during upgrades, without changing fresh-install schemas. Add conditional indexes on queue_log, queue_log_history, and report_queue to improve report mining query performance. Refs: nethesis/nethvoice-report#207
edospadoni added a commit to nethesis/ns8-nethvoice that referenced this pull request on Mar 4, 2026
edospadoni added a commit to nethesis/ns8-nethvoice that referenced this pull request on Mar 5, 2026
Problem

The `tasks` component queries take 15-20 minutes on installations with ~2M CDR records.

Main bottlenecks:
- `date_format()` on indexable columns preventing index usage
- […] `cdr_YYYY` and `cdr_YYYY-MM` tables
- […] `cdr` table
- `REGEXP` usage where `LIKE` suffices

Optimizations

1. Indexes on source `cdr` table
- `idx_cdr_calldate` (calldate) for range scans
- `idx_cdr_linkedid` (linkedid) for GROUP BY
- […] `schema.sql.tmpl` and idempotently in `cdr.go` (`ensureCDRSourceIndexes`)

2. Fix `date_format()` anti-pattern with range scans
- `cdr_year.sql`: `date_format(calldate, "%Y") = "YYYY"` replaced with `calldate >= 'YYYY-01-01' AND calldate < ... + INTERVAL 1 YEAR`
- `cdr_month.sql`: `date_format(calldate, "%Y-%m")` replaced with range scan using `INTERVAL 1 MONTH`
- […] `DATE(NOW() - INTERVAL 1 DAY)` / `DATE(NOW())`

3. Pre-compute trunk type with LEFT JOIN
- […] `get_trunk_name(channel) IN (SELECT channelid FROM asterisk.trunks)` with a temp table `_trunk_list` + LEFT JOIN

4. Indexes on generated `cdr_YYYY` and `cdr_YYYY-MM` tables
- […] `idx_type_calldate`, `idx_type`, `idx_cnum`, `idx_dst`, `idx_channel`, `idx_dstchannel`, `idx_type_cnum_calldate`
- […] `idx_type_calldate`, `idx_type`, `idx_cnum`, `idx_dst`, `idx_calldate`
- […] (`ADD INDEX IF NOT EXISTS`) and idempotently in Go (`ensureTableIndexes`)

5. Pre-computed geographic columns
- […] `cdr_YYYY`: `src_region`, `src_province`, `dst_region`, `dst_province`
- […] `WHERE src_region IS NULL` (only new records)

6. REGEXP replaced with LIKE for dispositions
- […] `REGEXP 'ANSWERED'` with `LIKE '%ANSWERED%'` across 12 SQL files
- […] `ExtractDispositions` in `utils.go`
- `REGEXP 'FOO$'` becomes `LIKE '%FOO'` (semantically equivalent, much faster)

7. Parallel view execution
- `views.go` now executes SQL views in parallel with 4 worker goroutines

Benchmark (installation with ~665K records/year, ~8M total)

Typical dashboard query (`SELECT type, COUNT(*) ... WHERE type='IN' AND calldate >= ... GROUP BY type`):

Daily INSERT on source `cdr` table:

Migration

No manual operations required. Migration is fully automatic:
- SQL templates (`cdr_year.sql`, `cdr_month.sql`, dashboard views): re-executed nightly by cron (`tasks cdr` + `tasks views`), so they update automatically on the first run after the update.
- Indexes on existing tables: `ensureCDRSourceIndexes()` and `ensureTableIndexes()` in `cdr.go` check for index existence via `information_schema` and add them if missing. After the first run they become a fast no-op (metadata SELECT only).
- Geographic columns: `ADD COLUMN IF NOT EXISTS` + `WHERE src_region IS NULL` ensures: […]
- Schema template (`schema.sql.tmpl`): indexes on `cdr` use `information_schema` logic for compatibility with the `@database_name` placeholder.

All operations are idempotent and safe to re-execute.

Test plan
- […] `SHOW INDEX FROM cdr_YYYY WHERE Key_name LIKE 'idx_%'`
- […] `SELECT src_region, COUNT(*) FROM cdr_YYYY WHERE type='IN' GROUP BY src_region`
- […] `EXPLAIN SELECT COUNT(*) FROM cdr_YYYY WHERE type='IN' AND calldate >= ...`
- […] `tasks cdr` with `DEBUG=1` and verify completion
- […] `tasks views` and verify completion