categories: add mmap .ndb backend for custom category lists #3157

Open
fabiodepin wants to merge 2 commits into ntop:dev from fabiodepin:categories-ndb

@fabiodepin
Contributor

Please sign (check) the below before submitting the Pull Request:

Link to the related issue: #3151

Describe changes:

Introduce a compiled .ndb backend (mmap) for external custom category matching, with LEGACY / NDB_ONLY / HYBRID modes, while keeping the existing -G / Aho-Corasick list path unchanged.
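As a rough illustration of what an mmap-backed loader involves, the sketch below maps a file read-only and validates a small header before exposing it. Everything here is hypothetical: the `demo_hdr` layout, the `"DEMO"` magic, and the `demo_db_open()` / `demo_db_close()` names are invented for the example and are not the format defined in ndpi_categories_bin.h.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical header, for illustration only. */
struct demo_hdr {
  char     magic[4];     /* "DEMO" */
  uint32_t version;
  uint32_t num_entries;
};

struct demo_db {
  const uint8_t *base;   /* start of the read-only mapping */
  size_t         len;    /* mapping length in bytes */
};

/* Map `path` read-only and validate the header. Returns 0 on success. */
static int demo_db_open(const char *path, struct demo_db *db) {
  struct stat st;
  struct demo_hdr hdr;
  void *p;
  int fd = open(path, O_RDONLY);

  if (fd < 0) return -1;
  if (fstat(fd, &st) != 0 || st.st_size < (off_t)sizeof(hdr)) {
    close(fd);
    return -1;
  }
  p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);             /* the mapping stays valid after close() */
  if (p == MAP_FAILED) return -1;

  memcpy(&hdr, p, sizeof(hdr));
  if (memcmp(hdr.magic, "DEMO", 4) != 0 || hdr.version != 1) {
    munmap(p, (size_t)st.st_size);
    return -1;
  }
  db->base = p;
  db->len = (size_t)st.st_size;
  return 0;
}

static void demo_db_close(struct demo_db *db) {
  if (db->base) munmap((void *)db->base, db->len);
  db->base = NULL;
  db->len = 0;
}
```

Because nothing is parsed or copied at load time, the cost of opening is mostly the `mmap()` call plus whatever validation the loader chooses to run; pages are faulted in lazily as lookups touch them.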

Adds ndpi_load_category_ndb_file() / ndpi_unload_category_ndb(), ndpiReader --category-ndb and --category-ndb-reload-interval, a polling-based hot-reload helper, offline builder ndpi_gen_categories_bin, shared hostname normalization (generator + runtime), and on-disk layout in ndpi_categories_bin.h (domains plus IPv4/IPv6 prefix entries).
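Sharing one normalization routine between the generator and the runtime matters because a lookup key only matches if both sides canonicalize hostnames identically. The sketch below shows the general idea with assumed rules (ASCII lowercasing and dropping a single trailing dot); `demo_normalize_host()` is a hypothetical name, and the rules the PR actually applies may differ.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical normalizer: lowercases ASCII and drops one trailing dot.
 * Returns the normalized length, or -1 if the result is empty or does
 * not fit in `out` (including the terminating NUL). */
static int demo_normalize_host(const char *in, char *out, size_t out_len) {
  size_t n = strlen(in);
  size_t i;

  if (n > 0 && in[n - 1] == '.') n--;   /* "example.com." -> "example.com" */
  if (n == 0 || n >= out_len) return -1;

  for (i = 0; i < n; i++)
    out[i] = (char)tolower((unsigned char)in[i]);
  out[n] = '\0';
  return (int)n;
}
```

Calling the same function when building the database and when looking up a hostname at runtime keeps the two sides from drifting apart.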

The generator writes the database atomically (temporary file + fsync + rename) so ndpiReader can reload a new valid file without restart.
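The temporary file + fsync + rename pattern described above can be sketched as follows. This is a minimal illustration, not the generator's code: `demo_write_atomic()` and the temporary-file naming scheme are assumptions, and `EINTR` handling and syncing the parent directory are left out for brevity.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write `len` bytes to `path` so that readers see either the old file
 * or the complete new one, never a partially written file.
 * Returns 0 on success, -1 on failure. */
static int demo_write_atomic(const char *path, const void *buf, size_t len) {
  char tmp[4096];
  const char *p = buf;
  size_t off = 0;
  int fd, n;

  /* The temp file sits next to the target: rename() is only atomic
   * within a single filesystem. */
  n = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", path, (long)getpid());
  if (n < 0 || (size_t)n >= sizeof(tmp)) return -1;

  fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) return -1;

  while (off < len) {
    ssize_t w = write(fd, p + off, len - off);
    if (w <= 0) { close(fd); unlink(tmp); return -1; }
    off += (size_t)w;
  }
  /* Flush data to stable storage before the new name becomes visible. */
  if (fsync(fd) != 0) { close(fd); unlink(tmp); return -1; }
  if (close(fd) != 0) { unlink(tmp); return -1; }
  if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
  return 0;
}
```

A process that already has the old file mapped keeps its mapping until it unmaps, because `rename()` replaces the directory entry rather than modifying the old file in place.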

fabiodepin force-pushed the categories-ndb branch 5 times, most recently from 0ed3acd to 42fe9f9 on April 16, 2026 01:27
@IvanNardi
Collaborator

Interesting stuff! This patch is quite big, so I might need some time to properly review it.
In the meantime, if possible:

  • try to fix compilation warnings that you can find in CI logs
  • we want to be able to compile the library without lpthread (via --disable-global-context-support). In that case, it is likely that we can't update the db at runtime, and all the lock/unlock operations should be no-ops

While I understand that load time is lower with this change, I really would like to see some tests and numbers about runtime/lookup performance. Is that possible?

@fabiodepin
Contributor Author

Thanks for the feedback.

That makes sense.

I'll address the CI warnings first.

Regarding builds without pthread / --disable-global-context-support: I agree that runtime DB updates should not require pthread in that configuration. I'll adjust the implementation so that lock/unlock become no-ops there, and any runtime reload-specific path is either disabled or compiled out as needed, while keeping the basic .ndb loading path working when possible.

For performance, yes — I can add tests and numbers. I'll collect:

  • load time
  • memory usage
  • runtime / lookup performance

I'll compare the legacy path (-G / existing structures) against the .ndb backend, and I can include both a real ndpiReader run and a smaller lookup-focused benchmark if useful.

fabiodepin force-pushed the categories-ndb branch 4 times, most recently from 331dd37 to 68896c5 on April 17, 2026 16:11
@fabiodepin
Contributor Author

Thanks again for the review.

Quick update on the requested points:

CI warnings:
I went through the CI logs and fixed the compilation warnings that were reported.

Build without pthread (--disable-global-context-support):
This is now supported:
• the code builds without pthread
• all lock/unlock operations are implemented as no-ops in that mode
• runtime DB updates (hot reload) are effectively disabled when global context support is off

The basic .ndb loading path still works in this configuration.
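One common way to get no-op locks in a build without pthread is a thin wrapper layer selected at compile time. The sketch below assumes a hypothetical `DEMO_USE_PTHREAD` macro standing in for whatever the configure script defines when global context support is enabled; the `demo_*` names are illustrative and not taken from the patch.

```c
#include <stddef.h>

#ifdef DEMO_USE_PTHREAD
#include <pthread.h>
typedef pthread_rwlock_t demo_lock_t;
static inline int demo_lock_init(demo_lock_t *l) { return pthread_rwlock_init(l, NULL); }
static inline int demo_lock_rd(demo_lock_t *l)   { return pthread_rwlock_rdlock(l); }
static inline int demo_lock_wr(demo_lock_t *l)   { return pthread_rwlock_wrlock(l); }
static inline int demo_unlock(demo_lock_t *l)    { return pthread_rwlock_unlock(l); }
#else
/* Placeholder type so structs embedding a lock still compile. */
typedef int demo_lock_t;
static inline int demo_lock_init(demo_lock_t *l) { (void)l; return 0; }
static inline int demo_lock_rd(demo_lock_t *l)   { (void)l; return 0; }
static inline int demo_lock_wr(demo_lock_t *l)   { (void)l; return 0; }
static inline int demo_unlock(demo_lock_t *l)    { (void)l; return 0; }
#endif

/* Hypothetical context: callers use the same code in both builds. */
struct demo_ctx {
  demo_lock_t lock;
  int generation;   /* bumped each time a new database is swapped in */
};

static int demo_read_generation(struct demo_ctx *c) {
  int g;
  demo_lock_rd(&c->lock);
  g = c->generation;
  demo_unlock(&c->lock);
  return g;
}
```

With this shape the call sites need no `#ifdef`s, and the compiler can drop the empty wrappers entirely in the single-threaded build.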

Runtime / lookup performance:
I’m currently working on benchmarks for this.

They measure:
• load time
• RSS / memory usage
• lookup performance (hostname + IPv4, hit/miss/LPM)
• basic latency percentiles (block-based)
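For readers who want to reproduce the RSS figure, one Linux-specific way to sample the current resident set size is to read `/proc/self/statm`. This is only a sketch under that assumption; the thread does not say how the benchmark collects RSS, and `demo_rss_bytes()` is a hypothetical helper.

```c
#include <stdio.h>
#include <unistd.h>

/* Linux-only: current resident set size in bytes, or -1 if unavailable.
 * /proc/self/statm reports sizes in pages; the second field is RSS. */
static long demo_rss_bytes(void) {
  unsigned long total_pages, rss_pages;
  long page = sysconf(_SC_PAGESIZE);
  FILE *f = fopen("/proc/self/statm", "r");
  int ok;

  if (!f) return -1;
  ok = (fscanf(f, "%lu %lu", &total_pages, &rss_pages) == 2);
  fclose(f);
  if (!ok || page <= 0) return -1;
  return (long)rss_pages * page;
}
```

Sampling once before and once after the load step, and reporting the difference, isolates the memory attributable to the category data from the rest of the process.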

Initial results (micro dataset) already show:
• much lower load time for .ndb
• significantly better hostname miss performance
• lookup performance overall comparable to legacy

I’m now extending this to larger datasets to provide more representative numbers. I’ll share detailed results shortly.

@fabiodepin
Contributor Author

Benchmark

I ran a dedicated benchmark (not included in this PR to keep it focused):
tests/performance/category_ndb_bench.c

It measures:

  • load time
  • RSS / memory usage
  • lookup latency (hostname + IPv4, hit/miss/LPM)
  • median and block percentiles
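Block-based percentiles are typically obtained by timing a block of many lookups at once, dividing by the block size, and then taking percentiles over the per-block averages, since a single lookup is too short to time reliably. The sketch below shows that approach with a nearest-rank percentile; the `demo_*` helpers are hypothetical and are not the code in category_ndb_bench.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

static uint64_t demo_now_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static int demo_cmp_double(const void *a, const void *b) {
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Time `blocks` blocks of `per_block` calls to fn(arg). On return,
 * out_ns[] holds the average ns per call of each block, sorted ascending. */
static void demo_measure_blocks(void (*fn)(void *), void *arg,
                                size_t blocks, size_t per_block,
                                double *out_ns) {
  size_t b, i;
  for (b = 0; b < blocks; b++) {
    uint64_t t0 = demo_now_ns();
    for (i = 0; i < per_block; i++) fn(arg);
    out_ns[b] = (double)(demo_now_ns() - t0) / (double)per_block;
  }
  qsort(out_ns, blocks, sizeof(out_ns[0]), demo_cmp_double);
}

/* Nearest-rank percentile over a sorted array; pct in 1..100, n > 0. */
static double demo_percentile(const double *sorted, size_t n, unsigned pct) {
  size_t rank = (pct * n + 99) / 100;   /* ceil(pct * n / 100) */
  if (rank < 1) rank = 1;
  if (rank > n) rank = n;
  return sorted[rank - 1];
}
```

Note that percentiles computed this way describe the distribution of block averages, so they smooth out individual slow lookups; that is worth keeping in mind when comparing tail latencies between backends.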

Key results

Large dataset (~7.5M hostnames, ~100k IPv4 rules)

  • .ndb file size: ~332 MB

Load time

  • .ndb: ~962 ms
  • legacy: ~3446 ms
    → ~3.5× faster

Memory (RSS after load)

  • .ndb: ~711 MB
  • legacy: ~1.82 GB
    → >2× reduction

Hostname lookup

  • hit:
    • .ndb: ~181 ns
    • legacy: ~167 ns
      → essentially on par
  • miss:
    • .ndb: ~226 ns
    • legacy: ~407 ns
      → ~1.8× faster

Interpretation

  • Load time and memory usage are significantly improved with .ndb
  • Hostname lookup remains competitive even at multi-million scale
  • Hostname miss is consistently faster with .ndb

Current limitation

The benchmark also highlights the current weak point:

  • IPv4 lookup in .ndb is still implemented as a linear scan (O(N))

This dominates lookup cost for large IPv4 tables and will be addressed separately.
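To make the O(N) cost concrete, a linear-scan longest-prefix match looks roughly like the sketch below: every lookup visits every rule and keeps the most specific match. The `demo_v4_rule` layout and function names are hypothetical (addresses in host byte order, prefix lengths assumed to be 0..32) and do not reflect the on-disk entry format in the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPv4 prefix rule; `net` is in host byte order. */
struct demo_v4_rule {
  uint32_t net;
  uint8_t  plen;       /* prefix length, 0..32 */
  uint16_t category;
};

static uint32_t demo_mask(uint8_t plen) {
  /* A 32-bit shift by 32 is undefined, so /0 is handled separately. */
  return plen == 0 ? 0u : 0xFFFFFFFFu << (32 - plen);
}

/* O(N) scan: checks every rule and keeps the longest matching prefix.
 * Returns the matching category, or -1 if no rule matches. */
static int demo_v4_lookup_linear(const struct demo_v4_rule *rules, size_t n,
                                 uint32_t addr) {
  int best_len = -1, best_cat = -1;
  size_t i;

  for (i = 0; i < n; i++) {
    uint32_t m = demo_mask(rules[i].plen);
    if ((addr & m) == (rules[i].net & m) && (int)rules[i].plen > best_len) {
      best_len = rules[i].plen;
      best_cat = rules[i].category;
    }
  }
  return best_cat;
}
```

With roughly 100k rules, each lookup performs on the order of 100k mask-and-compare steps, which is why this path dominates for large IPv4 tables while the hostname path does not.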

Summary

Even at large scale:

  • .ndb significantly improves load time and memory footprint
  • hostname lookup scales well and remains competitive
  • the remaining bottleneck is the IPv4 lookup path

@fabiodepin
Contributor Author

Happy to add the benchmark to the PR if that would make review easier.

@IvanNardi
Collaborator

IvanNardi commented Apr 20, 2026

> Load time
>
>   • .ndb: ~962 ms
>   • legacy: ~3446 ms
>     → ~3.5× faster
>
> Memory (RSS after load)
>
>   • .ndb: ~711 MB
>   • legacy: ~1.82 GB
>     → >2× reduction

Interesting numbers, but quite different from the ones reported in #3151:

>   • ~100x reduction in memory usage
>   • ~7x faster startup vs -G

@IvanNardi
Collaborator

> Happy to add the benchmark to the PR if that would make review easier.

Yes, please. I would like to run some tests locally before reviewing this patch.

@fabiodepin
Contributor Author

Good point — the numbers differ because they come from two different kinds of measurements.

In #3151, the numbers are from a real ndpiReader run, which includes:

  • full runtime setup
  • text parsing / runtime structure build in the legacy path
  • the full application environment

The benchmark I shared here isolates only the category load + lookup path, so it measures:

  • mmap load vs legacy load
  • lookup latency (hostname / IPv4)
  • RSS after initialization

So they are complementary rather than directly equivalent.

Regarding the benchmark: yes, I’ll add it to the PR so it can be run locally.

@fabiodepin
Contributor Author

I’ve just pushed the benchmark code to the PR:

  • tests/performance/category_ndb_bench.c
  • Makefile target in tests/performance

It’s a simple tool to compare .ndb vs legacy for:

  • load time
  • RSS
  • hostname + IPv4 lookup latency

It supports micro/scale/stress profiles and load-only / lookup runs.

A couple of notes:

  • large IPv4 tables will highlight the current O(N) lookup behavior in .ndb
  • mixed_global currently reuses per-case pools and is not yet a true cross-API mixed loop

If you want to try it quickly (from repo root):

  • Micro dataset: quick sanity check (.ndb vs legacy)
    ./tests/performance/category_ndb_bench --profile micro --backend both --mode fixed

  • Micro dataset with mixed lookup patterns
    ./tests/performance/category_ndb_bench --profile micro --backend both --mode mixed_by_case

  • Scale dataset: load / RSS comparison
    ./tests/performance/category_ndb_bench --profile scale --backend both --only-load

  • Scale dataset: lookup comparison
    ./tests/performance/category_ndb_bench --profile scale --backend both --mode mixed_by_case

  • Stress dataset: .ndb-only load path
    ./tests/performance/category_ndb_bench --profile stress --backend ndb --only-load

Let me know if you’d like any adjustments or additional scenarios.

@sonarqubecloud

sonarqubecloud Bot commented May 5, 2026

fabiodepin and others added 2 commits May 19, 2026 20:42
Introduce a compiled .ndb backend (mmap) for external custom category
matching, with LEGACY / NDB_ONLY / HYBRID modes, while keeping the
existing -G (Aho-Corasick) path unchanged.

Add ndpi_load_category_ndb_file() / ndpi_unload_category_ndb(), CLI
options (--category-ndb, --category-ndb-reload-interval), a polling-based
hot-reload helper, and the offline builder ndpi_gen_categories_bin.

Implement shared hostname normalization (generator + runtime) and define
the on-disk layout in ndpi_categories_bin.h (domains and IPv4/IPv6
prefix entries).

The generator writes the database atomically (temporary file + fsync +
rename), allowing ndpiReader to reload a valid file without restart.

category_ndb: use no-op locks when global context support is disabled
…oad benchmarks

Add a dedicated benchmark tool, category_ndb_bench, to compare the
compiled .ndb backend against the legacy custom-category path.

The benchmark measures:
- load time
- RSS / memory usage
- hostname and IPv4 lookup latency
- median and block percentiles

It supports synthetic micro/scale/stress profiles, temp or persisted
.ndb generation, load-only / lookup-only modes, and explicit backend
selection (ndb / legacy / both).

Also add guardrails and help text for large legacy runs:
- default legacy safety caps for hosts / IPv4 rules
- clear skip/error behavior depending on backend mode
- mixed_global caveat in output/help
- recommended commands in --help