feat: Incidence tracking #797

Open
RobertJacobsonCDC wants to merge 1 commit into main from RobertJacobsonCDC_755_incidence_tracking

RobertJacobsonCDC (Collaborator) commented Feb 24, 2026

Updated

Add stratified periodic value-change counters for entity properties

This PR replaces the earlier incidence-oriented design with a more general value change counter architecture.

Instead of a single optional counter per property, each tracked property can now own many counters, each identified by a counter_id and configured with a strata type (PL: PropertyList<E>). This supports multiple independent reporting workflows (e.g. daily incidence, weekly incidence, ...) over the same property.

Public API

New ContextEntitiesExt methods

  • create_value_change_counter<E, PL, P>(&mut self) -> usize
    • The low-level API, only used directly for advanced use cases.
    • Creates one stratified value-change counter for tracked property P, stratified by PL.
    • Returns the counter ID (its index in the per-property counter vector).
  • track_periodic_value_change_counts<E, PL, P>(&mut self, period: f64, counter_id: usize, handler: ...)
    • Registers periodic reporting for one existing counter.
    • Emits/handles an internal PeriodicValueChangeCountEvent<E, PL, P>.
    • Filters events by counter_id, calls the handler on the matching counter, then clears that counter.

Internal design

Counter storage

PropertyValueStoreCore<E, P> now stores:

  • value_change_counters: Vec<RefCell<Box<dyn ValueChangeCounter<E, P>>>>

This replaces incidence_counts. Counter creation pushes a new trait object into this vector and returns its index as counter_id.
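
The storage pattern described above can be sketched as follows. This is a simplified illustration, not the PR's code: `CounterStore`, `TallyCounter`, and the two-method trait are hypothetical stand-ins, and the real trait is generic over `E` and `P`.

```rust
use std::cell::RefCell;

/// Hypothetical, simplified stand-in for the type-erased counter trait.
trait ValueChangeCounter {
    fn update(&mut self);
    fn total(&self) -> usize;
}

/// A trivial concrete counter used only for this illustration.
#[derive(Default)]
struct TallyCounter {
    count: usize,
}

impl ValueChangeCounter for TallyCounter {
    fn update(&mut self) {
        self.count += 1;
    }
    fn total(&self) -> usize {
        self.count
    }
}

/// Hypothetical stand-in for the per-property store.
#[derive(Default)]
struct CounterStore {
    value_change_counters: Vec<RefCell<Box<dyn ValueChangeCounter>>>,
}

impl CounterStore {
    /// Pushes a new trait object and returns its index as the counter ID.
    fn create_counter(&mut self, counter: Box<dyn ValueChangeCounter>) -> usize {
        self.value_change_counters.push(RefCell::new(counter));
        self.value_change_counters.len() - 1
    }

    /// Updates every counter through a shared reference; the RefCell is
    /// what makes this possible without `&mut self` in this sketch.
    fn record_change(&self) {
        for counter in &self.value_change_counters {
            counter.borrow_mut().update();
        }
    }
}

fn main() {
    let mut store = CounterStore::default();
    let first = store.create_counter(Box::new(TallyCounter::default()));
    let second = store.create_counter(Box::new(TallyCounter::default()));
    store.record_change();
    println!("counter ids: {first}, {second}");
    println!(
        "total for first counter: {}",
        store.value_change_counters[first].borrow().total()
    );
}
```

Because the ID is just a vector index, IDs are stable only as long as counters are never removed, which matches the append-only creation described here.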

Counter trait and concrete implementation

The ValueChangeCounter trait erases the strata type. Concrete type: StratifiedValueChangeCounter<E, PL, P>

  • Backed by HashMap<(PL, P), usize>
  • On update, computes the stratum PL for the entity and increments the (PL, P) bucket
  • Exposes typed methods for reading and clearing counts
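
A minimal sketch of a counter keyed by (stratum, new value) is shown below. It drops the entity parameter `E` and takes the stratum as an argument rather than computing it from the context; `StratifiedCounter` and the `InfectionStatus` and `Age` definitions are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Illustrative property types; not the PR's definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum InfectionStatus {
    Susceptible,
    Infected,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Age(u8);

/// Simplified counter: one bucket per (stratum, new value) pair.
struct StratifiedCounter<PL, P> {
    counts: HashMap<(PL, P), usize>,
}

// The Eq + Hash bounds live on this impl only, because the HashMap key needs them.
impl<PL: Eq + Hash, P: Eq + Hash> StratifiedCounter<PL, P> {
    fn new() -> Self {
        Self { counts: HashMap::new() }
    }

    /// Increments the bucket for the given stratum and new property value.
    fn update(&mut self, stratum: PL, new_value: P) {
        *self.counts.entry((stratum, new_value)).or_insert(0) += 1;
    }

    /// Returns the count for a bucket, or 0 if nothing was recorded.
    fn get_count(&self, stratum: PL, value: P) -> usize {
        self.counts.get(&(stratum, value)).copied().unwrap_or(0)
    }

    /// Resets all buckets, as happens after each periodic report.
    fn clear(&mut self) {
        self.counts.clear();
    }
}

fn main() {
    let mut counter = StratifiedCounter::<(InfectionStatus,), Age>::new();
    counter.update((InfectionStatus::Infected,), Age(21));
    counter.update((InfectionStatus::Infected,), Age(21));
    counter.update((InfectionStatus::Susceptible,), Age(40));
    let infected_21 = counter.get_count((InfectionStatus::Infected,), Age(21));
    println!("infected, age 21: {infected_21}");
    counter.clear();
}
```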

Where counts are updated

Counter updates occur in PartialPropertyChangeEventCore<E, P>::emit_in_context only on true transitions (current != previous).

Periodic behavior and phase semantics

  • period must be finite and > 0.
  • First report is at now + period.
  • Event emission is scheduled with add_plan_with_phase(..., ExecutionPhase::Last) so that any same-time plans/callbacks that might mutate counts run first.
  • After the handler runs, the counter is cleared.
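
The validation and ordering rules above can be illustrated with a toy model. The `Phase` enum, `Plan` struct, and both functions are hypothetical; the real scheduler is the one behind `add_plan_with_phase`, and this sketch only assumes that plans at the same time run in phase order.

```rust
/// Toy execution phases; the derived ordering gives First < Normal < Last.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    First,
    Normal,
    Last,
}

/// Toy plan: a time, a phase, and a label standing in for the callback.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    time: f64,
    phase: Phase,
    label: &'static str,
}

/// Rejects periods that are not finite or not strictly positive.
fn validate_period(period: f64) -> Result<f64, String> {
    if period.is_finite() && period > 0.0 {
        Ok(period)
    } else {
        Err(format!("period must be finite and > 0, got {period}"))
    }
}

/// Orders plans by time, then by phase, and returns the labels in run order.
fn execution_order(mut plans: Vec<Plan>) -> Vec<&'static str> {
    plans.sort_by(|a, b| a.time.total_cmp(&b.time).then(a.phase.cmp(&b.phase)));
    plans.into_iter().map(|plan| plan.label).collect()
}

fn main() {
    let period = validate_period(1.0).expect("valid period");
    let plans = vec![
        Plan { time: period, phase: Phase::Last, label: "report" },
        Plan { time: period, phase: Phase::Normal, label: "infect" },
        Plan { time: 0.0, phase: Phase::First, label: "setup" },
    ];
    // The report shares a timestamp with "infect" but runs after it.
    println!("{:?}", execution_order(plans));
}
```

Scheduling the report in the last phase means a property change made at exactly the report time is counted in the period that is closing rather than in the next one.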

PropertyList changes

The old incidence-specific PropertyList helpers were removed.

PropertyList<E> now contributes one new strata-oriented helper:

  • get_values_for_entity(context, entity_id) -> Self

Used by StratifiedValueChangeCounter to compute PL during update(...).
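
One way such a helper could be implemented for tuples is sketched below. The `Context` layout, the `Property::get` method, and the `Infected` property are assumptions made for the example; only the name `get_values_for_entity` comes from this PR.

```rust
/// Toy context: one column per property, indexed by entity ID.
struct Context {
    ages: Vec<u8>,
    infected: Vec<bool>,
}

/// Hypothetical per-property accessor.
trait Property: Sized {
    fn get(context: &Context, entity_id: usize) -> Self;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Age(u8);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Infected(bool);

impl Property for Age {
    fn get(context: &Context, entity_id: usize) -> Self {
        Age(context.ages[entity_id])
    }
}

impl Property for Infected {
    fn get(context: &Context, entity_id: usize) -> Self {
        Infected(context.infected[entity_id])
    }
}

/// Simplified property list: a tuple that can read itself off an entity.
trait PropertyList: Sized {
    fn get_values_for_entity(context: &Context, entity_id: usize) -> Self;
}

impl<A: Property> PropertyList for (A,) {
    fn get_values_for_entity(context: &Context, entity_id: usize) -> Self {
        (A::get(context, entity_id),)
    }
}

impl<A: Property, B: Property> PropertyList for (A, B) {
    fn get_values_for_entity(context: &Context, entity_id: usize) -> Self {
        (A::get(context, entity_id), B::get(context, entity_id))
    }
}

fn main() {
    let context = Context { ages: vec![31, 8], infected: vec![true, false] };
    // The stratum is read at the moment of the call, i.e. at the instant of the change.
    let stratum = <(Age, Infected) as PropertyList>::get_values_for_entity(&context, 0);
    println!("{stratum:?}");
}
```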

Typical usage

let counter_id = context.create_value_change_counter::<Person, (InfectionStatus,), Age>();

context.track_periodic_value_change_counts::<Person, (InfectionStatus,), Age>(
    1.0,
    counter_id,
    |_context, counter| {
        // read counts by (stratum, new_value)
        let _new_age_count = counter.get_count((InfectionStatus::Infected,), Age(21));
        // write reports here
    },
);

Test coverage in this PR

  • Counter IDs increment in insertion order.
  • Counters increment only on true transitions (no-op writes ignored).
  • Periodic handlers read expected counts and matched counters are cleared afterward.
  • Existing entity/query/index/event behavior remains green under full test suite.

Questions and Issues

Advanced use cases

The "low level" ContextEntitiesExt::create_value_change_counter method is public to support an advanced use case in which client code would want to wire up their own schedulers and handlers. But this is arguably a YAGNI violation. We could:

  • make create_value_change_counter private
  • eliminate the PeriodicValueChangeCountEvent entirely and run the handler directly in the plan that currently emits this event

Resolved: This feels over-engineered. Let's support an advanced use case when we have one. Simplify the code for now.

Why the extra trait bounds (PL: Eq + Hash, P: Eq + Hash)

StratifiedValueChangeCounter<E, PL, P> stores counts in HashMap<(PL, P), usize>, which requires both PL and P to implement Eq + Hash. The bounds are intentionally local to the value-change-counter APIs, so we do not need to impose Eq/Hash on all Property implementors globally. #782 / #783 address adding Hash to the Property trait.

Start time

Resolved:

  • The start of recording of value changes ("creation of the index") happens at ExecutionPhase::First at simulation start time (default time=0, but settable via Context::set_start_time).
  • Recording value changes outside of simulation time is unsupported.
  • Start logic: The first handler event for the periodic value-change count fires at ExecutionPhase::Last at the start time.
    • Because the handler fires in ExecutionPhase::Last, plans and callbacks scheduled at the start time may already have changed property values, and those changes are recorded in the value change counter. Thus, the counter can be nonempty when the first handler event fires.
    • If client code starts the simulation at a negative time and wants the handler to run at multiples of period, it needs to set the start time to a multiple of period.
  • The handler event fires at ExecutionPhase::Last at every $\text{start\_time} + k \cdot \text{period}$, $k \in \mathbb{N}$, until simulation end.
  • End logic: If no plan is scheduled, the next handler event is not scheduled.
    • If no plan is scheduled, then the current plan is the last plan, and the simulation is about to end—there is nothing left to record.
    • The last period might be a "partial period." Client code will just need to understand this.
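
The start and end rules can be modeled with a small function that lists the times at which the handler fires. This is a sketch under stated assumptions: `report_times` is a hypothetical helper, the first report fires at the start time as described in this section, and no other plan runs in the last phase, so "a plan is still scheduled" reduces to "some other plan has a later time".

```rust
/// Returns the times at which the periodic handler fires.
///
/// After each report at time `t`, the next report is scheduled only if some
/// other plan is still pending, modeled here as any plan time greater than `t`.
fn report_times(start: f64, period: f64, other_plan_times: &[f64]) -> Vec<f64> {
    let mut times = Vec::new();
    let mut k: u32 = 0;
    loop {
        let t = start + f64::from(k) * period;
        times.push(t);
        let plans_pending = other_plan_times.iter().any(|&plan_time| plan_time > t);
        if !plans_pending {
            break; // nothing left to simulate, so no further report is scheduled
        }
        k += 1;
    }
    times
}

fn main() {
    // Start at 0 with two other plans; reports continue until the queue is empty.
    println!("{:?}", report_times(0.0, 1.0, &[0.5, 2.5]));
    // A negative start that is not a multiple of the period shifts every report.
    println!("{:?}", report_times(-3.0, 2.0, &[0.5]));
    // Choosing a start that is a multiple of the period restores alignment.
    println!("{:?}", report_times(-4.0, 2.0, &[0.5]));
}
```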

The handler's parameters

The periodic event handler is given

  • &Context: an immutable reference to the Context. A mutable reference is not required for reporting.
  • &mut StratifiedValueChangeCounter<E, PL, P>: a mutable reference to the counter. Since the counter is cleared when the handler returns, it doesn't matter if the handler mutates the counter.

We could do some work to try to give the handler a mutable reference to context. This might be worth spending a bit more time on.
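
The handler contract described above (an immutable context, a mutable counter, and a clear after the handler returns) might look like the following sketch. `run_report`, the toy `Context`, and the string-keyed `Counter` are hypothetical, and the `FnMut` bound is an assumption; the PR may use a different closure bound.

```rust
use std::collections::HashMap;

/// Toy context; the handler can only read from it.
struct Context {
    current_time: f64,
}

/// Toy counter keyed by a label instead of (stratum, value).
#[derive(Default)]
struct Counter {
    counts: HashMap<&'static str, usize>,
}

impl Counter {
    fn get_count(&self, key: &str) -> usize {
        self.counts.get(key).copied().unwrap_or(0)
    }
    fn clear(&mut self) {
        self.counts.clear();
    }
}

/// Calls the handler, then clears the counter regardless of what the handler did to it.
fn run_report<F>(context: &Context, counter: &mut Counter, mut handler: F)
where
    F: FnMut(&Context, &mut Counter),
{
    handler(context, counter);
    counter.clear();
}

fn main() {
    let context = Context { current_time: 1.0 };
    let mut counter = Counter::default();
    counter.counts.insert("infected", 3);

    run_report(&context, &mut counter, |ctx, c| {
        println!("t = {}: infected = {}", ctx.current_time, c.get_count("infected"));
        // Mutating the counter here is harmless because it is cleared next.
        c.counts.insert("scratch", 99);
    });

    println!("entries after report: {}", counter.counts.len());
}
```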

@github-actions

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 2.9 ± 0.0 2.8 2.9 1.00
large_sir::entities 13.4 ± 0.3 13.0 14.7 4.66 ± 0.11

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_whole_population 100000 33.211% 32.544% 33.995%
sample_entity sample_entity_whole_population 10000 32.910% 32.605% 33.207%
sample_entity sample_entity_whole_population 1000 32.650% 32.238% 33.011%
sample_entity sample_entity_single_property_unindexed 10000 21.899% 21.281% 22.502%
indexing query_people_count_indexed_multi-property_entities 7.257% 6.994% 7.445%
indexing query_people_indexed_multi-property_entities 5.104% 4.044% 6.135%
sampling sampling_single_unindexed_entities 3.454% 3.062% 3.863%
indexing with_query_results_indexed_multi-property_entities 2.999% 2.095% 3.675%
indexing query_people_multiple_individually_indexed_properties_entities 2.811% 2.383% 3.274%
sampling sampling_multiple_l_reservoir_entities 2.661% 1.791% 3.554%
sample_entity sample_entity_multi_property_indexed 1000 2.621% 1.651% 3.960%
sample_entity sample_entity_single_property_unindexed 100000 2.516% 2.271% 2.843%
counts reindex_after_adding_more_entities 2.453% 2.254% 2.647%
examples example-basic-infection 2.292% 2.064% 2.526%
sample_entity sample_entity_multi_property_indexed 100000 1.968% 1.470% 2.642%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_single_property_unindexed 1000 -20.548% -21.938% -19.382%
counts multi_property_unindexed_entities -10.455% -12.007% -8.606%
large_dataset bench_query_population_multi_indexed_entities -9.982% -12.261% -7.903%
indexing query_people_single_indexed_property_entities -8.626% -10.872% -6.095%
counts multi_property_indexed_entities -6.270% -6.552% -5.781%
large_dataset bench_query_population_derived_property_entities -3.960% -4.678% -3.349%
algorithm_benches algorithm_sampling_multiple_l_reservoir -3.394% -3.741% -3.088%
sampling sampling_single_known_length_entities -2.505% -3.458% -1.540%
algorithm_benches algorithm_sampling_multiple_known_length -2.393% -3.321% -1.003%
large_dataset bench_query_population_multi_unindexed_entities -2.227% -3.615% -1.146%
counts index_after_adding_entities -2.032% -2.307% -1.818%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_unindexed_entity 2.896% -1.853% 8.015%
large_dataset bench_filter_indexed_entity -2.436% -13.590% 9.808%
indexing with_query_results_multiple_individually_indexed_properties_enti 1.371% 0.725% 1.750%
indexing with_query_results_single_indexed_property_entities 1.179% 0.263% 1.988%
sample_entity sample_entity_multi_property_indexed 10000 1.170% 0.959% 1.368%
large_dataset bench_match_entity -1.140% -1.551% -0.771%
sampling sampling_multiple_known_length_entities -1.133% -1.865% -0.400%
examples example-births-deaths -0.942% -1.190% -0.654%
indexing query_people_count_multiple_individually_indexed_properties_enti -0.795% -0.939% -0.662%
sample_entity sample_entity_single_property_indexed 1000 0.713% 0.542% 0.906%
sampling sampling_multiple_unindexed_entities 0.690% 0.303% 1.115%
sample_entity sample_entity_single_property_indexed 100000 0.666% 0.418% 0.884%
algorithm_benches algorithm_sampling_single_l_reservoir -0.600% -0.917% -0.397%
sample_entity sample_entity_single_property_indexed 10000 0.574% -0.045% 1.105%
indexing query_people_count_single_indexed_property_entities 0.503% 0.147% 0.985%
large_dataset bench_query_population_indexed_property_entities -0.492% -0.934% -0.167%
counts single_property_unindexed_entities -0.467% -0.725% -0.201%
algorithm_benches algorithm_sampling_single_known_length 0.288% -0.078% 0.680%
large_dataset bench_query_population_property_entities 0.215% -0.048% 0.526%
sampling sampling_single_l_reservoir_entities 0.108% -0.414% 0.820%
counts single_property_indexed_entities -0.009% -0.385% 0.306%
algorithm_benches algorithm_sampling_single_rand_reservoir -0.008% -0.240% 0.202%

github-actions bot added a commit that referenced this pull request Feb 24, 2026
k88hudson-cfa (Collaborator) left a comment

Based on our discussion, I would make the following changes:

  • Let's have an explicit start (implying we start it from the First phase) and panic if the current time is in the past (including if it's current time but during Normal)
  • I think using a tuple for multiple output types is confusing, since it is very common to want counts stratified by multiple properties (e.g., infection status and age group). I would instead give people a pattern for a shared function, and have the docs show a multi-property example if that's how people need to count stratified properties.
  • It's important that we support multiple periods for the same index sets. Could you store counters by period/start time and reuse an existing one instead of creating a new one each time?
  • I would favour a more generic name personally (like aggregate) but I don't have a great alternative in mind, maybe somebody else does? If not we can leave it for now

RobertJacobsonCDC (Collaborator, Author) commented

For stratification, I think what we really want is this: When the tracked property value changes, e.g. Hospitalized, then we store whatever "stratification property values" the entity has at that very instant. This separates the properties we are using to stratify from the property whose values are changing.

The problem with using multi-properties is this: Suppose we count the multi-property value (Age(31), Hospitalized(true)) during the period because a 31-year-old enters the hospital. Now suppose that, by coincidence, this person has a birthday before the period ends. The multi-property changes to (Age(32), Hospitalized(true)), so the system would count that value even though it is not a new hospitalization.

So we want to only count on changes of one property but record stratification property values at the instant of the change.

Multi-properties work for stratification only if the property values you want to use for stratification don't change.
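
The birthday scenario can be worked through in a few lines. This is an illustrative model only: `Person` and both counting functions are hypothetical, and each takes the sequence of states one person passes through during a period.

```rust
use std::collections::HashMap;

/// Snapshot of the two properties for one person.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Person {
    age: u8,
    hospitalized: bool,
}

/// Counts every change of the combined (age, hospitalized) value.
fn count_multi_property(states: &[Person]) -> HashMap<(u8, bool), usize> {
    let mut counts = HashMap::new();
    for pair in states.windows(2) {
        let (previous, current) = (pair[0], pair[1]);
        if current != previous {
            *counts.entry((current.age, current.hospitalized)).or_insert(0) += 1;
        }
    }
    counts
}

/// Counts only changes of `hospitalized`, recording the age at that instant.
fn count_stratified(states: &[Person]) -> HashMap<(u8, bool), usize> {
    let mut counts = HashMap::new();
    for pair in states.windows(2) {
        let (previous, current) = (pair[0], pair[1]);
        if current.hospitalized != previous.hospitalized {
            *counts.entry((current.age, current.hospitalized)).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    // A 31-year-old is hospitalized, then has a birthday within the same period.
    let states = [
        Person { age: 31, hospitalized: false },
        Person { age: 31, hospitalized: true },
        Person { age: 32, hospitalized: true },
    ];
    println!("multi-property: {:?}", count_multi_property(&states));
    println!("stratified:     {:?}", count_stratified(&states));
}
```

The multi-property count records two entries with `hospitalized == true` for a single hospitalization, while the stratified count records one, attributed to age 31.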

@RobertJacobsonCDC RobertJacobsonCDC force-pushed the RobertJacobsonCDC_755_incidence_tracking branch 2 times, most recently from 94bfb3a to 8f7f35b Compare February 26, 2026 19:11
@RobertJacobsonCDC RobertJacobsonCDC force-pushed the RobertJacobsonCDC_755_incidence_tracking branch from 8f7f35b to 94140f3 Compare February 26, 2026 19:23
@github-actions

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 3.0 ± 0.1 2.9 3.2 1.00
large_sir::entities 13.2 ± 0.3 13.0 14.6 4.41 ± 0.13

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
counts single_property_indexed_entities 65.614% 64.578% 66.519%
counts multi_property_indexed_entities 9.874% 9.263% 10.665%
large_dataset bench_query_population_multi_unindexed_entities 7.502% 5.735% 9.445%
algorithm_benches algorithm_sampling_multiple_known_length 6.965% 6.576% 7.283%
large_dataset bench_match_entity 5.548% 5.162% 5.814%
sample_entity sample_entity_whole_population 10000 5.504% 4.723% 6.317%
sample_entity sample_entity_multi_property_indexed 10000 4.642% 3.908% 5.334%
sample_entity sample_entity_multi_property_indexed 100000 4.573% 3.016% 5.815%
sample_entity sample_entity_whole_population 1000 4.155% 3.670% 4.793%
sample_entity sample_entity_multi_property_indexed 1000 4.043% 3.318% 4.807%
indexing query_people_multiple_individually_indexed_properties_entities 3.322% 2.901% 3.729%
counts reindex_after_adding_more_entities 3.016% 2.772% 3.254%
counts index_after_adding_entities 2.778% 2.329% 3.310%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_single_property_unindexed 10000 -15.745% -17.392% -14.384%
indexing query_people_single_indexed_property_entities -7.119% -8.773% -5.487%
indexing query_people_indexed_multi-property_entities -6.099% -7.044% -5.066%
sample_entity sample_entity_whole_population 100000 -4.652% -5.426% -3.947%
sample_entity sample_entity_single_property_unindexed 1000 -4.398% -5.845% -2.770%
indexing with_query_results_multiple_individually_indexed_properties_enti -4.078% -4.329% -3.836%
large_dataset bench_query_population_multi_indexed_entities -3.400% -4.333% -2.607%
large_dataset bench_query_population_derived_property_entities -3.061% -3.816% -2.350%
sampling sampling_multiple_l_reservoir_entities -2.262% -2.373% -2.143%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_indexed_entity 11.551% -3.296% 28.170%
large_dataset bench_filter_unindexed_entity 2.208% -2.834% 7.579%
indexing query_people_count_multiple_individually_indexed_properties_enti 1.525% 0.971% 2.093%
counts multi_property_unindexed_entities -1.499% -2.591% -0.687%
indexing with_query_results_indexed_multi-property_entities -1.386% -2.142% -0.687%
examples example-births-deaths -1.150% -1.331% -0.939%
large_dataset bench_query_population_indexed_property_entities 1.058% 0.835% 1.278%
indexing query_people_count_indexed_multi-property_entities -0.833% -1.190% -0.312%
sample_entity sample_entity_single_property_indexed 100000 0.799% 0.264% 1.360%
examples example-basic-infection 0.749% -0.027% 1.526%
indexing query_people_count_single_indexed_property_entities -0.414% -0.830% -0.145%
large_dataset bench_query_population_property_entities 0.402% -0.094% 0.931%
counts single_property_unindexed_entities 0.391% -0.196% 0.918%
sample_entity sample_entity_single_property_indexed 1000 0.386% -0.011% 0.935%
indexing with_query_results_single_indexed_property_entities -0.319% -0.921% 0.264%
sampling sampling_multiple_known_length_entities 0.257% -0.102% 0.724%
sample_entity sample_entity_single_property_indexed 10000 -0.240% -1.656% 0.677%
algorithm_benches algorithm_sampling_multiple_l_reservoir -0.226% -0.585% 0.236%
sample_entity sample_entity_single_property_unindexed 100000 0.222% -0.244% 0.775%
sampling sampling_single_known_length_entities -0.203% -0.864% 0.505%
algorithm_benches algorithm_sampling_single_known_length 0.154% -0.392% 0.938%
algorithm_benches algorithm_sampling_single_rand_reservoir 0.040% -0.157% 0.301%
algorithm_benches algorithm_sampling_single_l_reservoir -0.035% -0.203% 0.101%
sampling sampling_single_l_reservoir_entities -0.024% -0.218% 0.156%
sampling sampling_multiple_unindexed_entities -0.020% -0.455% 0.408%
sampling sampling_single_unindexed_entities -0.011% -0.092% 0.078%

github-actions bot added a commit that referenced this pull request Feb 26, 2026
github-actions bot added a commit that referenced this pull request Feb 26, 2026
@github-actions

Benchmark Results

Hyperfine

Command Mean [ms] Min [ms] Max [ms] Relative
large_sir::baseline 2.4 ± 0.1 2.3 2.6 1.00
large_sir::entities 10.7 ± 0.2 10.4 11.3 4.44 ± 0.13

Criterion

Regressions (slower)
Group Bench Param Change CI Lower CI Upper
sampling sampling_multiple_unindexed_entities 11.567% 9.887% 13.307%
sampling sampling_single_unindexed_entities 7.453% 5.532% 9.563%
sample_entity sample_entity_single_property_indexed 1000 5.186% 3.893% 6.764%
indexing query_people_count_indexed_multi-property_entities 4.516% 3.555% 5.399%
indexing query_people_multiple_individually_indexed_properties_entities 4.319% 3.776% 4.875%
counts reindex_after_adding_more_entities 2.357% 2.153% 2.569%
sampling sampling_multiple_known_length_entities 1.765% 1.350% 2.116%
Improvements (faster)
Group Bench Param Change CI Lower CI Upper
sample_entity sample_entity_single_property_unindexed 10000 -19.640% -21.690% -18.245%
sample_entity sample_entity_single_property_unindexed 1000 -11.919% -12.219% -11.595%
large_dataset bench_query_population_multi_unindexed_entities -9.359% -11.061% -7.696%
sample_entity sample_entity_single_property_unindexed 100000 -9.070% -9.260% -8.881%
sample_entity sample_entity_whole_population 1000 -6.328% -6.671% -6.067%
sample_entity sample_entity_whole_population 100000 -6.048% -6.567% -5.670%
sampling sampling_single_known_length_entities -4.537% -5.216% -3.980%
large_dataset bench_match_entity -4.419% -4.657% -4.184%
large_dataset bench_query_population_derived_property_entities -4.034% -4.369% -3.715%
large_dataset bench_query_population_multi_indexed_entities -3.236% -3.902% -2.638%
sample_entity sample_entity_whole_population 10000 -3.081% -3.726% -2.554%
sample_entity sample_entity_multi_property_indexed 100000 -2.295% -3.105% -1.347%
sampling sampling_multiple_l_reservoir_entities -2.083% -2.352% -1.636%
sample_entity sample_entity_multi_property_indexed 1000 -2.075% -2.726% -1.199%
indexing query_people_single_indexed_property_entities -1.971% -2.149% -1.843%
counts index_after_adding_entities -1.468% -1.741% -1.130%
sampling sampling_single_l_reservoir_entities -1.314% -1.451% -1.203%
Unchanged / inconclusive (CI crosses 0%)
Group Bench Param Change CI Lower CI Upper
large_dataset bench_filter_indexed_entity 2.322% -11.342% 16.270%
large_dataset bench_filter_unindexed_entity -1.718% -7.420% 3.989%
sample_entity sample_entity_multi_property_indexed 10000 -1.678% -2.863% 0.664%
indexing with_query_results_multiple_individually_indexed_properties_enti -1.577% -2.110% -0.998%
counts multi_property_indexed_entities 1.255% 0.887% 1.519%
sample_entity sample_entity_single_property_indexed 10000 1.217% 0.647% 1.852%
examples example-births-deaths 1.107% 0.775% 1.357%
algorithm_benches algorithm_sampling_multiple_known_length -1.099% -1.313% -0.859%
algorithm_benches algorithm_sampling_multiple_l_reservoir -0.785% -1.022% -0.480%
large_dataset bench_query_population_indexed_property_entities -0.701% -0.907% -0.529%
indexing with_query_results_indexed_multi-property_entities 0.697% -0.586% 1.782%
sample_entity sample_entity_single_property_indexed 100000 0.626% 0.395% 0.847%
indexing with_query_results_single_indexed_property_entities 0.446% -0.112% 1.051%
algorithm_benches algorithm_sampling_single_rand_reservoir -0.433% -0.868% -0.078%
indexing query_people_count_multiple_individually_indexed_properties_enti 0.426% 0.164% 0.693%
large_dataset bench_query_population_property_entities 0.350% -0.471% 1.063%
counts single_property_indexed_entities -0.339% -0.824% 0.277%
counts multi_property_unindexed_entities 0.325% -0.736% 1.201%
algorithm_benches algorithm_sampling_single_known_length -0.317% -0.919% 0.136%
indexing query_people_indexed_multi-property_entities -0.207% -0.513% 0.191%
counts single_property_unindexed_entities -0.121% -0.747% 0.477%
algorithm_benches algorithm_sampling_single_l_reservoir 0.087% -0.033% 0.192%
examples example-basic-infection 0.068% -1.806% 1.664%
indexing query_people_count_single_indexed_property_entities 0.053% -0.149% 0.234%

@RobertJacobsonCDC RobertJacobsonCDC linked an issue Feb 26, 2026 that may be closed by this pull request
@RobertJacobsonCDC RobertJacobsonCDC linked an issue Feb 26, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

  • Incidence mechanism for properties
  • Design a replacement for "tabulators" for entities

2 participants