Merged
20 changes: 3 additions & 17 deletions docs/book/src/topics/indexing.md
@@ -29,8 +29,9 @@ Best practices:
can put all of your `Context::index_property` calls together in a main
initialization function if you prefer.
- It is not an error to call `Context::index_property` in the middle of a
running simulation or to call it twice for the same property, but it makes
little sense to do so.
running simulation or to call it twice for the same property.
- Calling `Context::index_property` enables indexing and catches the index up to
the current population at the time of the call.
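
The "enable and catch up" behaviour described in the last bullet can be illustrated with a toy model. This is a minimal sketch, not Ixa's implementation: `ToyColumn`, `add_person`, `enable_index`, and `ids_with_value` are hypothetical names invented for the example, and a plain `HashMap` stands in for whatever Ixa uses internally.

```rust
use std::collections::HashMap;

/// Toy stand-in for a property column plus an optional index (not Ixa's real types).
struct ToyColumn {
    values: Vec<u8>,                        // value for person id = position
    index: Option<HashMap<u8, Vec<usize>>>, // value -> ids, present once enabled
}

impl ToyColumn {
    fn new() -> Self {
        ToyColumn { values: Vec::new(), index: None }
    }

    /// Adds a person and keeps the index current if one is enabled.
    fn add_person(&mut self, value: u8) -> usize {
        let id = self.values.len();
        self.values.push(value);
        if let Some(index) = self.index.as_mut() {
            index.entry(value).or_default().push(id);
        }
        id
    }

    /// Enables the index and immediately catches it up to everyone already present.
    /// Calling it a second time is a no-op.
    fn enable_index(&mut self) {
        if self.index.is_some() {
            return;
        }
        let mut index: HashMap<u8, Vec<usize>> = HashMap::new();
        for (id, value) in self.values.iter().enumerate() {
            index.entry(*value).or_default().push(id);
        }
        self.index = Some(index);
    }

    /// Looks up ids by value, using the index when enabled and a full scan otherwise.
    fn ids_with_value(&self, value: u8) -> Vec<usize> {
        match &self.index {
            Some(index) => index.get(&value).cloned().unwrap_or_default(),
            None => self
                .values
                .iter()
                .enumerate()
                .filter(|(_, v)| **v == value)
                .map(|(id, _)| id)
                .collect(),
        }
    }
}

fn main() {
    let mut ages = ToyColumn::new();
    ages.add_person(30);
    ages.add_person(41);
    ages.add_person(30);

    ages.enable_index(); // mid-run call: the index now covers ids 0, 1, 2
    ages.enable_index(); // a second call changes nothing
    ages.add_person(30); // kept current as the person is created

    println!("{:?}", ages.ids_with_value(30)); // prints [0, 2, 3]
}
```

In this toy version a query never has to build or repair the index, because enabling it and adding people are the only places where index work happens.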

## Property Value Storage in Ixa

@@ -112,17 +113,6 @@ An index in Ixa is just a map between a property _value_ and the list of all
property value is (almost) as fast as looking up the property value for a given
`PersonId`.

> [!INFO] Ixa's Intelligent Indexing Strategy
>
> The picture we have painted of how Ixa implements indexing is necessarily a
> simplification. Ixa uses a lazy indexing strategy, which means if a property
> is never queried, then Ixa never actually does the work of computing the
> index. Ixa also keeps track of whether new people have been added to the
> simulation since the last time a query was run, so that it only has to update
> its index for those newly added people. These and other optimizations make
> Ixa's indexing very fast and memory efficient compared to the simplistic
> version described in this section.

## The Costs of Creating an Index

There are two costs you have to pay for indexing:
@@ -145,10 +135,6 @@ There are two costs you have to pay for indexing:
> it all at once doesn't matter: the sum of all the small efforts to update the
> index every time a person is added is equal to the cost of creating the index
> from scratch for an existing set of data.
>
> We can, however, save the effort of updating the index when property values
> _change_ if we wait until we actually run a query that needs to use the index
> before we construct the index. This is why Ixa uses a lazy indexing strategy.
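
The accounting in the note (one small update per added person adds up to the same work as a single from-scratch build) can be checked on a toy index. This is a sketch under stated assumptions: `Index`, `build_from_scratch`, and `insert_one` are hypothetical helpers written for the example, with a `BTreeMap` standing in for the real index structure.

```rust
use std::collections::BTreeMap;

/// Toy index: property value -> ids holding that value (not Ixa's real type).
type Index = BTreeMap<u8, Vec<usize>>;

/// Builds the index in one pass over an existing column.
fn build_from_scratch(values: &[u8]) -> Index {
    let mut index = Index::new();
    for (id, value) in values.iter().enumerate() {
        index.entry(*value).or_default().push(id);
    }
    index
}

/// Performs the single insertion needed when one person is added.
fn insert_one(index: &mut Index, id: usize, value: u8) {
    index.entry(value).or_default().push(id);
}

fn main() {
    let values = [7u8, 3, 7, 9, 3];

    // Incremental: one insertion per person, at the moment they are added.
    let mut incremental = Index::new();
    let mut insertions = 0;
    for (id, value) in values.iter().enumerate() {
        insert_one(&mut incremental, id, *value);
        insertions += 1;
    }

    // Batch: the same insertions, performed in a single pass.
    let batch = build_from_scratch(&values);

    assert_eq!(incremental, batch);
    println!("{insertions} insertions either way"); // prints "5 insertions either way"
}
```

Both routes perform exactly one insertion per person and end in the same index. The sketch does not model the extra work of moving an id between buckets when a property value changes.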

Usually scanning through the whole property column is so slow relative to
maintaining an index that the extra computational cost of maintaining the index
85 changes: 49 additions & 36 deletions src/entity/context_extension.rs
@@ -34,12 +34,12 @@ pub trait ContextEntitiesExt {
property_value: P,
);

/// Enables indexing of property values for the property `P`.
/// Enables full indexing of property values for the property `P`.
///
/// This method is called with the turbo-fish syntax:
/// `context.index_property::<Person, Age>()`
/// The actual computation of the index is done lazily as needed upon execution of queries,
/// not when this method is called.
///
/// This method both enables the index and catches it up to the current population.
fn index_property<E: Entity, P: Property<E>>(&mut self);

/// Enables value-count indexing of property values for the property `P`.
@@ -145,8 +145,19 @@ impl ContextEntitiesExt for Context {

// Assign the properties in the list to the new entity.
// This does not generate a property change event.
property_list
.set_values_for_entity(new_entity_id, self.entity_store.get_property_store::<E>());
property_list.set_values_for_new_entity(
new_entity_id,
self.entity_store.get_property_store_mut::<E>(),
);

// Keep all enabled indexes caught up as entities are created.
let context_ptr: *const Context = self;
let property_store = self.entity_store.get_property_store_mut::<E>();
// SAFETY: We create a shared `&Context` for read-only property access while mutably
// borrowing the property store to update index internals.
unsafe {
property_store.index_unindexed_entities_for_all_properties(&*context_ptr);
}

// Emit an `EntityCreatedEvent<Entity>`.
self.emit_event(EntityCreatedEvent::<E>::new(new_entity_id));
@@ -170,24 +181,12 @@ impl ContextEntitiesExt for Context {
) {
debug_assert!(!P::is_derived(), "cannot set a derived property");

// The algorithm is as follows
// 1. Get the previous value of the property.
// 1.1 If it's the same as `property_value`, exit.
// 1.2 Otherwise, create a `PartialPropertyChangeEvent<E, P>`.
// 2. Remove the `entity_id` from the index bucket corresponding to its old value.
// 3. For each dependent of the property, do the analog of steps 1 & 2:
// 3.1 Compute the previous value of the dependent property `Q`, creating a
// `PartialPropertyChangeEvent<E, Q>` instance if necessary.
// 3.2 Remove the `entity_id` from the index bucket corresponding to the old value of `Q`.
// 4. Set the new value of the (main) property in the property store.
// 5. Update the property index: Insert the `entity_id` into the index bucket corresponding to the new value.
// 6. Emit the property change event: convert the `PartialPropertyChangeEvent<E, P>` into a
// `event: PropertyChangeEvent<E, P>` and call `Context::emit_event(event)`.
// 7. For each dependent of the property, do the analog of steps 4-6:
// 7.1 Compute the new value of the dependent property
// 7.2 Add `entity_id` to the index bucket corresponding to the new value.
// 7.3 convert the `PartialPropertyChangeEvent<E, Q>` into a
// `event: PropertyChangeEvent<E, Q>` and call `Context::emit_event(event)`.
// The algorithm is as follows:
// 1. Snapshot previous values for the main property and its dependents by creating
// `PartialPropertyChangeEvent` instances.
// 2. Set the new value of the main property in the property store.
// 3. Emit each partial event; during emission each event computes the current value,
// updates its index (remove old/add new), and emits a `PropertyChangeEvent`.

// We need two passes over the dependents: one pass to compute all the old values and
// another to compute all the new values. We group the steps for each dependent (and, it
@@ -211,32 +210,48 @@ impl ContextEntitiesExt for Context {
// - There may be use cases for listening to "writes" that don't actually change values.

let mut dependents: Vec<Box<dyn PartialPropertyChangeEvent>> = vec![];
let property_store = self.entity_store.get_property_store::<E>();

// Create the partial property change for this value.
dependents.push(property_store.create_partial_property_change(P::id(), entity_id, self));
// Now create partial property change events for each dependent.
for dependent_idx in P::dependents() {
// Immutable: Collect the previous value to create partial property change events
{
let property_store = self.entity_store.get_property_store::<E>();

// Create the partial property change for this value.
dependents.push(property_store.create_partial_property_change(
*dependent_idx,
P::id(),
entity_id,
self,
));
// Now create partial property change events for each dependent.
for dependent_idx in P::dependents() {
dependents.push(property_store.create_partial_property_change(
*dependent_idx,
entity_id,
self,
));
}
}

// Update the value
let property_value_store = self.get_property_value_store::<E, P>();
property_value_store.set(entity_id, property_value);

// After updating the value
// Mutable: After updating the value, we update its dependents, removing old values and
// storing the new values in their respective indexes, and emit the property change event.
for dependent in dependents.into_iter() {
dependent.emit_in_context(self)
}
}

fn index_property<E: Entity, P: Property<E>>(&mut self) {
let property_id = P::index_id();
let context_ptr: *const Context = self;
let property_store = self.entity_store.get_property_store_mut::<E>();
property_store.set_property_indexed::<P>(PropertyIndexType::FullIndex);
// SAFETY: This only creates a shared reference to `Context` while mutably borrowing
// the property store to update index internals.
unsafe {
property_store.index_unindexed_entities_for_property_id(&*context_ptr, property_id);
}
}

fn index_property_counts<E: Entity, P: Property<E>>(&mut self) {
@@ -265,7 +280,6 @@ impl ContextEntitiesExt for Context {
if let Some(multi_property_id) = query.multi_property_id() {
let property_store = self.entity_store.get_property_store::<E>();
match property_store.get_index_set_with_hash_for_property_id(
self,
multi_property_id,
query.multi_property_value_hash(),
) {
@@ -305,7 +319,6 @@ impl ContextEntitiesExt for Context {
if let Some(multi_property_id) = query.multi_property_id() {
let property_store = self.entity_store.get_property_store::<E>();
match property_store.get_index_count_with_hash_for_property_id(
self,
multi_property_id,
query.multi_property_value_hash(),
) {
@@ -422,7 +435,7 @@ impl ContextEntitiesExt for Context {

#[cfg(test)]
mod tests {
use std::cell::{Ref, RefCell};
use std::cell::RefCell;
use std::rc::Rc;

use super::*;
@@ -568,7 +581,7 @@ mod tests {
}

// Helper for index tests
#[derive(Copy, Clone)]
#[derive(Copy, Clone, Debug)]
enum IndexMode {
Unindexed,
FullIndex,
@@ -608,7 +621,7 @@ mod tests {
context.with_query_results((existing_value,), &mut |people_set| {
existing_len = people_set.into_iter().count();
});
assert_eq!(existing_len, 2);
assert_eq!(existing_len, 2, "Wrong length for {mode:?}");

let mut missing_len = 0;
context.with_query_results((missing_value,), &mut |people_set| {
@@ -926,7 +939,7 @@ mod tests {

let property_store = context.entity_store.get_property_store::<Person>();
let property_value_store = property_store.get_with_id(index_id);
let bucket: Ref<IndexSet<EntityId<Person>>> = property_value_store
let bucket: &IndexSet<EntityId<Person>> = property_value_store
.get_index_set_with_hash(
(InfectionStatus::Susceptible, Vaccinated(true)).multi_property_value_hash(),
)
37 changes: 14 additions & 23 deletions src/entity/entity_set/entity_set.rs
@@ -312,8 +312,6 @@ impl<'a, E: Entity> IntoIterator for EntitySet<'a, E> {

#[cfg(test)]
mod tests {
use std::cell::RefCell;

use super::*;
use crate::entity::ContextEntitiesExt;
use crate::hashing::IndexSet;
@@ -322,17 +320,15 @@ mod tests {
define_entity!(Person);
define_property!(struct Age(u8), Person);

fn finite_set(ids: &[usize]) -> RefCell<IndexSet<EntityId<Person>>> {
RefCell::new(
ids.iter()
.copied()
.map(EntityId::<Person>::new)
.collect::<IndexSet<_>>(),
)
fn finite_set(ids: &[usize]) -> IndexSet<EntityId<Person>> {
ids.iter()
.copied()
.map(EntityId::<Person>::new)
.collect::<IndexSet<_>>()
}

fn as_entity_set(set: &RefCell<IndexSet<EntityId<Person>>>) -> EntitySet<Person> {
EntitySet::from_source(SourceSet::IndexSet(set.borrow()))
fn as_entity_set(set: &IndexSet<EntityId<Person>>) -> EntitySet<Person> {
EntitySet::from_source(SourceSet::IndexSet(set))
}

#[test]
@@ -470,16 +466,13 @@ mod tests {
let b = finite_set(&[2, 3, 4]);

let union = as_entity_set(&a).union(as_entity_set(&b));
assert_eq!(union.sort_key(), (a.borrow().len() + b.borrow().len(), 3));
assert_eq!(union.sort_key(), (a.len() + b.len(), 3));

let intersection = as_entity_set(&a).intersection(as_entity_set(&b));
assert_eq!(
intersection.sort_key(),
(a.borrow().len().min(b.borrow().len()), 6)
);
assert_eq!(intersection.sort_key(), (a.len().min(b.len()), 6));

let difference = as_entity_set(&a).difference(as_entity_set(&b));
assert_eq!(difference.sort_key(), (a.borrow().len(), 6));
assert_eq!(difference.sort_key(), (a.len(), 6));
}

#[test]
@@ -525,12 +518,10 @@ mod tests {
let population = EntitySet::<Person>::from_source(SourceSet::Population(5));
assert_eq!(population.try_len(), Some(5));

let index_data = RefCell::new(
[EntityId::new(1), EntityId::new(2), EntityId::new(3)]
.into_iter()
.collect::<IndexSet<_>>(),
);
let indexed = EntitySet::<Person>::from_source(SourceSet::IndexSet(index_data.borrow()));
let index_data = [EntityId::new(1), EntityId::new(2), EntityId::new(3)]
.into_iter()
.collect::<IndexSet<_>>();
let indexed = EntitySet::<Person>::from_source(SourceSet::IndexSet(&index_data));
assert_eq!(indexed.try_len(), Some(3));

let mut context = Context::new();
22 changes: 8 additions & 14 deletions src/entity/entity_set/entity_set_iterator.rs
@@ -10,8 +10,6 @@
//!
//! The iterator is created through `EntitySet::into_iter()`.

use std::cell::Ref;

use log::warn;
use rand::Rng;

@@ -260,7 +258,7 @@ impl<'c, E: Entity> EntitySetIterator<'c, E> {
}
}

pub(crate) fn from_index_set(set: Ref<'c, IndexSet<EntityId<E>>>) -> EntitySetIterator<'c, E> {
pub(crate) fn from_index_set(set: &'c IndexSet<EntityId<E>>) -> EntitySetIterator<'c, E> {
EntitySetIterator {
inner: EntitySetIteratorInner::Source(SourceSet::IndexSet(set).into_iter()),
}
@@ -449,8 +447,6 @@ mod tests {
set", making the tested property NOT the initial source position.
*/

use std::cell::RefCell;

use indexmap::IndexSet;

use crate::entity::entity_set::{EntitySet, SourceSet};
@@ -1053,17 +1049,15 @@ mod tests {
assert_eq!(remaining, 2);
}

fn finite_set(ids: &[usize]) -> RefCell<FxIndexSet<EntityId<Person>>> {
RefCell::new(
ids.iter()
.copied()
.map(EntityId::new)
.collect::<FxIndexSet<_>>(),
)
fn finite_set(ids: &[usize]) -> FxIndexSet<EntityId<Person>> {
ids.iter()
.copied()
.map(EntityId::new)
.collect::<FxIndexSet<_>>()
}

fn as_entity_set(set: &RefCell<FxIndexSet<EntityId<Person>>>) -> EntitySet<Person> {
EntitySet::from_source(SourceSet::IndexSet(set.borrow()))
fn as_entity_set(set: &FxIndexSet<EntityId<Person>>) -> EntitySet<Person> {
EntitySet::from_source(SourceSet::IndexSet(set))
}

#[test]
10 changes: 3 additions & 7 deletions src/entity/entity_set/source_iterator.rs
@@ -15,7 +15,6 @@
//! (`ConcretePropertySource` and `DerivedPropertySource`), which serve as both
//! set-facing wrappers and iterators.

use std::cell::Ref;
use std::fmt::{Debug, Formatter};

use ouroboros::self_referencing;
@@ -29,14 +28,14 @@ use crate::hashing::{IndexSet, IndexSetIter};
/// iterator in the `Iterator` implementation on `SourceIterator`.
#[self_referencing]
pub(super) struct IndexSetIterator<'a, E: Entity> {
index_set: Ref<'a, IndexSet<EntityId<E>>>,
index_set: &'a IndexSet<EntityId<E>>,
#[borrows(index_set)]
#[covariant]
iter: IndexSetIter<'this, EntityId<E>>,
}

impl<'a, E: Entity> IndexSetIterator<'a, E> {
pub fn from_index_set(index_set: Ref<'a, IndexSet<EntityId<E>>>) -> Self {
pub fn from_index_set(index_set: &'a IndexSet<EntityId<E>>) -> Self {
IndexSetIteratorBuilder {
index_set,
iter_builder: |index_set| index_set.iter(),
@@ -226,8 +225,6 @@ impl<'c, E: Entity> std::iter::FusedIterator for SourceIterator<'c, E> {}

#[cfg(test)]
mod tests {
use std::cell::RefCell;

use super::super::source_set::{ConcretePropertySource, SourceSet};
use crate::entity::property_value_store_core::RawPropertyValueVec;
use crate::entity::EntityId;
@@ -247,8 +244,7 @@ mod tests {
EntityId::new(3),
EntityId::new(6),
]);
let people_set = RefCell::new(people_set);
let people_set_ref = people_set.borrow();
let people_set_ref = &people_set;

let mut iter = SourceSet::IndexSet(people_set_ref).into_iter();
assert_eq!(iter.next(), Some(EntityId::new(0)));