Currently, our Q-table (self.q_table) uses exact string keys for state-action values, e.g., "2,2|8,0|13". This means only exact state matches are possible. For better generalization and learning, we want to experiment with fuzzy lookup strategies that can retrieve similar states, not just exact matches.
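To make the limitation concrete, here is a minimal sketch of the exact-match behavior. The dictionary contents and the `exact_lookup` helper are hypothetical, and the key layout `x,y|tx,ty|steps` is an assumption inferred from the example key, not something confirmed by the code base.

```python
# Hypothetical Q-table: state key -> list of action values.
# Assumed key layout: "x,y|tx,ty|steps".
q_table = {"2,2|8,0|13": [0.1, 0.5, -0.2, 0.0]}


def exact_lookup(q_table, key):
    """Return the stored action values, or None when the key was never seen."""
    return q_table.get(key)


seen = exact_lookup(q_table, "2,2|8,0|13")       # exact key: hit
near_miss = exact_lookup(q_table, "2,2|8,0|14")  # one step later: miss
```

The second query differs from a stored state only in its last field, yet it gets no value at all. That gap is what the fuzzy strategies are meant to close.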
Goals
- Implement and benchmark three different fuzzy lookup strategies for the Q-table:
  - Manual Iteration with Custom Similarity Function
  - Structured Key with Spatial Index (e.g., KD-tree)
  - Embedding-based Approximate Nearest Neighbor (ANN) Search
Tasks
1. Manual Iteration with Custom Similarity Function
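A minimal sketch of what this strategy could look like: scan every stored key, score it against the query, and accept the best match above a threshold. The key layout `x,y|tx,ty|steps`, the weighted Manhattan similarity, the `step_weight` and `threshold` defaults, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def parse_key(key: str) -> Tuple[int, ...]:
    """Split an assumed 'x,y|tx,ty|steps' key into a flat tuple of ints."""
    return tuple(int(v) for part in key.split("|") for v in part.split(","))


def similarity(a: Tuple[int, ...], b: Tuple[int, ...], step_weight: float = 0.25) -> float:
    """Hypothetical similarity: inverse of a weighted Manhattan distance."""
    *pos_a, steps_a = a
    *pos_b, steps_b = b
    dist = sum(abs(p - q) for p, q in zip(pos_a, pos_b))
    dist += step_weight * abs(steps_a - steps_b)
    return 1.0 / (1.0 + dist)


def fuzzy_lookup(
    q_table: Dict[str, List[float]], key: str, threshold: float = 0.5
) -> Optional[Tuple[str, List[float]]]:
    """Return (matched_key, values) for the most similar stored state, or None."""
    if key in q_table:  # the exact match stays the fast path
        return key, q_table[key]
    target = parse_key(key)
    best_key, best_sim = None, threshold
    for candidate in q_table:  # O(n) scan over every stored state
        sim = similarity(target, parse_key(candidate))
        if sim >= best_sim:
            best_key, best_sim = candidate, sim
    return None if best_key is None else (best_key, q_table[best_key])


q_table = {"2,2|8,0|13": [0.1, 0.5, -0.2, 0.0], "7,7|0,0|3": [0.9, 0.0, 0.0, 0.1]}
close = fuzzy_lookup(q_table, "2,2|8,0|14")  # differs only in the step count
far = fuzzy_lookup(q_table, "0,0|0,8|40")    # nothing similar enough
```

Every miss costs a full pass over the table, so this version mostly serves as a correctness baseline for the faster strategies.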
2. Structured Key with Spatial Index
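As a rough sketch of this strategy, the block below parses each string key into a numeric tuple and answers nearest-neighbor queries with a small hand-rolled KD-tree. It is a dependency-free stand-in written for illustration, not scikit-learn's implementation; in practice a library tree such as `sklearn.neighbors.KDTree` would likely replace it. The key layout `x,y|tx,ty|steps` and all names are assumptions.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


def parse_key(key: str) -> Point:
    """Split an assumed 'x,y|tx,ty|steps' key into a flat tuple of ints."""
    return tuple(int(v) for part in key.split("|") for v in part.split(","))


class Node:
    def __init__(self, point, key, axis, left, right):
        self.point, self.key, self.axis = point, key, axis
        self.left, self.right = left, right


def build(items: List[Tuple[Point, str]], depth: int = 0) -> Optional[Node]:
    """Recursively split the points on the median of one axis per level."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    point, key = items[mid]
    return Node(point, key, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def nearest(node: Optional[Node], target: Point, best=None):
    """Return (squared_distance, key) of the stored point closest to target."""
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node.point, target))
    if best is None or dist < best[0]:
        best = (dist, node.key)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    if diff * diff < best[0]:  # the other side may still hold a closer point
        best = nearest(far, target, best)
    return best


q_table = {"2,2|8,0|13": [0.1, 0.5], "7,7|0,0|3": [0.9, 0.0], "5,1|8,0|20": [0.3, 0.3]}
tree = build([(parse_key(k), k) for k in q_table])
sq_dist, match = nearest(tree, parse_key("3,2|8,0|12"))
```

The tree has to be rebuilt or updated whenever new states enter the Q-table, which is a cost worth measuring alongside query time. Unscaled Euclidean distance also treats one step the same as one grid cell, so some feature weighting may be needed.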
- Represent each state as a structured numeric key (e.g., (x, y, tx, ty, steps)).
- Use a spatial index (e.g., scikit-learn's KD-tree or BallTree) to perform fast nearest-neighbor lookups.
3. Embedding-based ANN Search
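A minimal sketch of the idea, using random-hyperplane locality-sensitive hashing from the standard library as a stand-in for a dedicated ANN library. The `embed` function, its `grid` and `max_steps` scaling constants, the `LSHIndex` class, and the key layout `x,y|tx,ty|steps` are all hypothetical; a learned embedding and a real ANN index would take their place.

```python
import random
from collections import defaultdict


def parse_key(key):
    """Split an assumed 'x,y|tx,ty|steps' key into a flat tuple of ints."""
    return tuple(int(v) for part in key.split("|") for v in part.split(","))


def embed(key, grid=10.0, max_steps=50.0):
    """Hypothetical embedding: scale each field to roughly [-0.5, 0.5]."""
    x, y, tx, ty, steps = parse_key(key)
    return [x / grid - 0.5, y / grid - 0.5,
            tx / grid - 0.5, ty / grid - 0.5,
            steps / max_steps - 0.5]


class LSHIndex:
    """Buckets vectors by the sign pattern of a few random projections."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)

    def _hash(self, vec):
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes)

    def add(self, key, vec):
        self.buckets[self._hash(vec)].append((key, vec))

    def query(self, vec):
        """Return the closest key in the query's bucket, or None if it is empty."""
        candidates = self.buckets.get(self._hash(vec), [])
        if not candidates:
            return None
        return min(candidates,
                   key=lambda kv: sum((a - b) ** 2 for a, b in zip(kv[1], vec)))[0]


q_table = {"2,2|8,0|13": [0.1, 0.5], "7,7|0,0|3": [0.9, 0.0], "5,1|8,0|20": [0.3, 0.3]}
index = LSHIndex(dim=5)
for state_key in q_table:
    index.add(state_key, embed(state_key))

match = index.query(embed("7,7|0,0|3"))
```

The search is approximate by design: a true nearest neighbor that hashes into a different bucket is missed, so the caller needs a fallback for `None` results. Recall against an exact search is the natural thing to measure here.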
- Use an ANN library (e.g., faiss, annoy, or scann) to index and search state embeddings.
Evaluation
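The evaluation criteria are not spelled out here. As one possible starting point, assuming hit rate and per-lookup latency are the quantities of interest, a small harness could run each strategy over the same query set. The `benchmark` function, its `repeats` parameter, and the `(q_table, key)` lookup signature are hypothetical.

```python
import time


def benchmark(lookup, q_table, queries, repeats=100):
    """Return (hit_rate, mean_seconds_per_lookup) for one lookup strategy."""
    hits = sum(lookup(q_table, q) is not None for q in queries)
    start = time.perf_counter()
    for _ in range(repeats):
        for q in queries:
            lookup(q_table, q)
    elapsed = time.perf_counter() - start
    return hits / len(queries), elapsed / (repeats * len(queries))


def exact_lookup(q_table, key):
    """Baseline: the current exact-match behavior."""
    return q_table.get(key)


q_table = {"2,2|8,0|13": [0.1, 0.5, -0.2, 0.0]}
queries = ["2,2|8,0|13", "2,2|8,0|14"]
hit_rate, seconds = benchmark(exact_lookup, q_table, queries, repeats=10)
```

Any strategy that takes `(q_table, key)` and returns `None` on a miss can be passed in place of `exact_lookup`, which keeps the comparison between strategies on equal footing.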
References