Open
Description
The current implementation takes a boolean[] as input. Using a BitSet (https://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html) instead would be much more memory-efficient.
For example, if the dictionary size is Integer.MAX_VALUE, as it would be with the "hashing shingles" approach given in 3.2.3 of Ullman et al., I need to allocate about 2 GB of memory to store an array of booleans, since a typical JVM uses one byte per boolean element. A BitSet packs each flag into a single bit, so the same data fits in roughly one eighth of the space (about 256 MB).
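To make the suggestion concrete, here is a minimal sketch of what BitSet-based input could look like. The class name `BitSetSketch` and the helpers `fromBooleans` and `bitSetBytes` are hypothetical and not part of this library; only the `java.util.BitSet` calls are real API. The size estimate counts just the packed 64-bit words and ignores object overhead.

```java
import java.util.BitSet;

// Hypothetical illustration; none of these names exist in the library.
class BitSetSketch {

    // Convert an existing boolean[] input into an equivalent BitSet.
    static BitSet fromBooleans(boolean[] flags) {
        BitSet bits = new BitSet(flags.length);
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                bits.set(i);
            }
        }
        return bits;
    }

    // Approximate payload size of a BitSet holding n flags:
    // one bit per flag, packed into 64-bit (8-byte) words.
    static long bitSetBytes(long n) {
        return ((n + 63) / 64) * 8;
    }

    public static void main(String[] args) {
        // Mark a few shingle indices as present in a sparse set.
        BitSet shingles = new BitSet();
        shingles.set(3);
        shingles.set(1_000_000);

        // Visit only the set bits, which is what a hashing pass would need.
        for (int i = shingles.nextSetBit(0); i >= 0; i = shingles.nextSetBit(i + 1)) {
            System.out.println("shingle index present: " + i);
        }

        // Rough comparison for a dictionary of Integer.MAX_VALUE entries,
        // assuming one byte per boolean[] element.
        System.out.println("boolean[] bytes (approx): " + (long) Integer.MAX_VALUE);
        System.out.println("BitSet bytes (approx):    " + bitSetBytes(Integer.MAX_VALUE));
    }
}
```

A BitSet also grows on demand, so a sparse set only needs enough words to reach its highest set bit, and `nextSetBit` skips over the empty ranges without testing every index.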