Add Nuplan central token extraction and USDZ dataset support for Alpasim integration #62
Open
WCJ-BERT wants to merge 4 commits into NVlabs:alpasim from
Conversation
Add USDZ dataset implementation with the following features:
- Wrap alpasim_utils.Artifact for stable USDZ parsing
- Convert Artifact data to trajdata's standard DataFrame format
- Extract velocity and acceleration from trajectory positions
- Use the project's arr_utils.quaternion_to_yaw() for consistency
- Calculate the actual dt from timestamps (falling back to 0.1 s)
- Optimize derivative computation with a single groupby pass
- Support maps and agent metadata extraction

Technical improvements:
- Proper quaternion order conversion ([x, y, z, w] -> [w, x, y, z])
- Dynamic time-step calculation from trajectory timestamps
- Efficient pandas operations for velocity/acceleration
- Clean integration with env_utils

Total: 461 lines of new USDZ dataset code
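The quaternion reordering and dynamic time-step calculation described above can be sketched as follows. This is a minimal illustration under assumed conventions; the helper names (`reorder_xyzw_to_wxyz`, `infer_dt`) and the exact signature of the project's `arr_utils.quaternion_to_yaw()` are hypothetical, not taken from the actual PR code:

```python
import numpy as np


def reorder_xyzw_to_wxyz(q_xyzw: np.ndarray) -> np.ndarray:
    """Convert an (N, 4) quaternion array from [x, y, z, w] to [w, x, y, z] order."""
    return q_xyzw[:, [3, 0, 1, 2]]


def quaternion_to_yaw(q_wxyz: np.ndarray) -> np.ndarray:
    """Extract yaw (heading about +z) from quaternions in [w, x, y, z] order."""
    w, x, y, z = q_wxyz[:, 0], q_wxyz[:, 1], q_wxyz[:, 2], q_wxyz[:, 3]
    return np.arctan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


def infer_dt(timestamps: np.ndarray, fallback: float = 0.1) -> float:
    """Median time step from trajectory timestamps, with a 0.1 s fallback."""
    if len(timestamps) < 2:
        return fallback
    dt = float(np.median(np.diff(np.asarray(timestamps, dtype=float))))
    return dt if dt > 0 else fallback
```

Using the median of consecutive timestamp differences (rather than a fixed 0.1 s) keeps velocity and acceleration derivatives consistent when logs are recorded at a different rate.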
- Add a dataset_kwargs parameter to ParallelDatasetPreprocessor
- Pass dataset_kwargs through multiprocessing to child processes
- Save dataset_kwargs in UnifiedDataset for parallel preprocessing
- This enables dataset-specific configurations to work in parallel mode

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…path
Major changes:
- Change the dataset_kwargs format from a flat to a nested dict structure
  Old: dataset_kwargs={'param': value} (shared by all datasets)
  New: dataset_kwargs={'dataset_name': {'param': value}} (per-dataset)
- Remove the yaml_config_path parameter from NuplanDataset and NuPlanObject
  Use central_tokens_config directly instead
- Update env_utils.get_raw_datasets() to support the nested dict format
  Each dataset now receives only its own specific parameters
- Optimize parallel preprocessing to avoid redundant parameter extraction
  ParallelDatasetPreprocessor now correctly handles the nested dict format
- Clean up df_cache.py: remove the unused resolution parameter
Benefits:
- Clearer separation of per-dataset parameters
- More flexible multi-dataset configuration
- Simpler parallel worker logic
Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
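The flat-to-nested change above can be made concrete with a small example. The dataset names, parameter values, and the `kwargs_for` helper here are illustrative assumptions, not the PR's actual identifiers:

```python
from typing import Any, Dict

# Old (flat) format: a single dict shared by every dataset.
flat_kwargs = {"central_tokens_config": "tokens.yaml"}

# New (nested) format: parameters keyed by dataset name, so each
# dataset sees only its own configuration.
nested_kwargs = {
    "nuplan_mini": {"central_tokens_config": "tokens.yaml"},
    "usdz": {"artifact_root": "/data/artifacts"},
}


def kwargs_for(
    dataset_name: str, dataset_kwargs: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Return only the parameters belonging to one dataset (hypothetical helper)."""
    return dataset_kwargs.get(dataset_name, {})
```

With the nested form, a dataset that was never configured simply receives an empty dict, instead of inheriting unrelated parameters from other datasets as in the flat form.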
Overview
This PR enables trajdata as a unified data source for Alpasim by adding:
- Nuplan central token extraction via central_tokens_config
- A USDZ dataset implementation built on alpasim_utils.Artifact
- Per-dataset dataset_kwargs support through parallel preprocessing

All changes are fully backward compatible with existing trajdata usage and are ready to support the Unified Scene Data Flow changes in Alpasim.