We welcome contributions! Please open GitHub issues or submit pull requests for:
- New algorithms and models
- Performance improvements
- Documentation enhancements
- Bug fixes and feature requests
This guide explains how to create and add your own custom machine learning algorithms to Neuracore. You have two options:
- Open Source Contribution: Submit a PR to add your algorithm to the Neuracore repository
- Private Algorithm: Upload your algorithm directly to your account at neuracore.com
All Neuracore algorithms must extend the NeuracoreModel class. This base class provides the foundation for creating models that can process robot data and generate actions.
- Data Types: Neuracore supports various data types (joint positions, RGB images, etc.)
- Batched Data: Input and output data is provided in batched form
- Model Architecture: You define how your model processes inputs and generates outputs
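To make the batched layout concrete, here is a small standalone PyTorch sketch (the shapes and joint groups are illustrative, not a Neuracore API):

```python
import torch

# Illustrative shapes: a batch of 4 samples, 10 timesteps, two joint groups.
B, T = 4, 10
arm_joints = torch.randn(B, T, 7)      # e.g. a 7-DoF arm
gripper_joints = torch.randn(B, T, 1)  # e.g. a 1-DoF gripper

# Batched inputs arrive as lists of items; concatenate along the feature dim.
joint_data = torch.cat([arm_joints, gripper_joints], dim=-1)  # (B, T, 8)

# Models often condition on the most recent observation only.
last_joint = joint_data[:, -1, :]  # (B, 8)
```

The model in the example below follows exactly this pattern when it consumes `JOINT_POSITIONS` data.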
Your model must inherit from NeuracoreModel and implement several required methods:
```python
from typing import Any, cast

import torch
import torch.nn as nn

from neuracore_types import (
    BatchedJointData,
    BatchedNCData,
    DataItemStats,
    DataType,
    JointDataStats,
    ModelInitDescription,
)

from neuracore.ml import (
    BatchedInferenceInputs,
    BatchedTrainingOutputs,
    BatchedTrainingSamples,
    NeuracoreModel,
)
from neuracore.ml.algorithm_utils.normalizer import MeanStdNormalizer

PROPRIO_NORMALIZER = MeanStdNormalizer
ACTION_NORMALIZER = MeanStdNormalizer


class MyCustomAlgorithm(NeuracoreModel):
    """A custom algorithm for robot control."""

    def __init__(
        self,
        model_init_description: ModelInitDescription,
        lr: float = 1e-4,
        # Add any other hyperparameters your model needs
    ):
        super().__init__(model_init_description)
        self.lr = lr

        # Access data types and statistics from the base class:
        # self.input_data_types, self.output_data_types,
        # self.input_dataset_statistics, and self.output_dataset_statistics
        # are available after calling super().__init__()

        # Stats for inputs
        joint_stats = DataItemStats()
        for stat in cast(
            list[JointDataStats],
            self.input_dataset_statistics.get(DataType.JOINT_POSITIONS, []),
        ):
            joint_stats = joint_stats.concatenate(stat.value)
        if len(joint_stats.mean) == 0:
            raise ValueError("JOINT_POSITIONS must be present in input data types")

        # Stats for outputs
        output_stats: list[DataItemStats] = []
        target_stats = DataItemStats()
        for stat in cast(
            list[JointDataStats],
            self.output_dataset_statistics.get(
                DataType.JOINT_TARGET_POSITIONS, []
            ),
        ):
            target_stats = target_stats.concatenate(stat.value)
        if len(target_stats.mean) == 0:
            raise ValueError(
                "JOINT_TARGET_POSITIONS must be present in output data types"
            )
        output_stats.append(target_stats)
        self.max_output_size = len(target_stats.mean)

        # Normalizers
        self.proprio_normalizer = PROPRIO_NORMALIZER(
            name="proprioception", statistics=[joint_stats]
        )
        self.action_normalizer = ACTION_NORMALIZER(
            name="actions", statistics=output_stats
        )

        # Output layer predicts the entire sequence directly from normalized joints
        self.output_size = self.max_output_size * self.output_prediction_horizon
        self.output_layer = nn.Linear(len(joint_stats.mean), self.output_size)

    def forward(
        self, batch: BatchedInferenceInputs
    ) -> dict[DataType, list[BatchedNCData]]:
        """Forward pass for inference.

        Args:
            batch: Input batch with observations

        Returns:
            dict[DataType, list[BatchedNCData]]: Model predictions with action
                sequences
        """
        # Extract and normalize joint positions
        batched_joint_data = cast(
            list[BatchedJointData], batch.inputs[DataType.JOINT_POSITIONS]
        )
        mask = batch.inputs_mask[DataType.JOINT_POSITIONS]

        # Concatenate all joint values: (B, T, num_joints)
        joint_data = torch.cat([bjd.value for bjd in batched_joint_data], dim=-1)
        last_joint = joint_data[:, -1, :]  # (B, num_joints)
        masked_joint = last_joint * mask

        # Normalize
        normalized_joint = self.proprio_normalizer.normalize(masked_joint)

        # Predict directly from normalized joints
        action_preds = self.output_layer(normalized_joint)  # (B, output_size)

        # Reshape to (B, T, action_dim)
        batch_size = len(batch)
        action_preds = action_preds.view(
            batch_size, self.output_prediction_horizon, self.max_output_size
        )

        # Unnormalize predictions
        predictions = self.action_normalizer.unnormalize(action_preds)

        # Format output as dict[DataType, list[BatchedNCData]]
        output_tensors: dict[DataType, list[BatchedNCData]] = {}
        batched_outputs = []
        for i in range(
            len(self.output_dataset_statistics[DataType.JOINT_TARGET_POSITIONS])
        ):
            joint_preds = predictions[:, :, i : i + 1]  # (B, T, 1)
            batched_outputs.append(BatchedJointData(value=joint_preds))
        output_tensors[DataType.JOINT_TARGET_POSITIONS] = batched_outputs
        return output_tensors

    def training_step(self, batch: BatchedTrainingSamples) -> BatchedTrainingOutputs:
        """Training step: forward pass with loss calculation.

        Args:
            batch: Training batch with inputs and targets

        Returns:
            BatchedTrainingOutputs: Training outputs with losses and metrics
        """
        # Create an inference batch from the training inputs
        inference_sample = BatchedInferenceInputs(
            inputs=batch.inputs,
            inputs_mask=batch.inputs_mask,
            batch_size=batch.batch_size,
        )

        # Get predictions
        predictions_dict = self.forward(inference_sample)

        # Extract target actions
        if DataType.JOINT_TARGET_POSITIONS not in batch.outputs:
            raise ValueError("JOINT_TARGET_POSITIONS required in batch outputs")
        batched_joints = cast(
            list[BatchedJointData], batch.outputs[DataType.JOINT_TARGET_POSITIONS]
        )
        action_targets = [bjd.value for bjd in batched_joints]
        action_data = torch.cat(action_targets, dim=-1)  # (B, T, num_joints)
        action_data = action_data.view(
            batch.batch_size, self.output_prediction_horizon, -1
        )  # (B, T, action_dim)
        target_actions = self.action_normalizer.normalize(action_data)

        # Predictions were unnormalized in forward(); re-normalize them so the
        # loss is computed in normalized action space
        pred_joints = cast(
            list[BatchedJointData],
            predictions_dict[DataType.JOINT_TARGET_POSITIONS],
        )
        pred_actions = torch.cat([bjd.value for bjd in pred_joints], dim=-1)
        pred_actions = pred_actions.view(
            batch.batch_size, self.output_prediction_horizon, -1
        )
        pred_actions_normalized = self.action_normalizer.normalize(pred_actions)

        # Calculate loss
        losses: dict[str, Any] = {}
        metrics: dict[str, Any] = {}
        if self.training:
            losses["mse_loss"] = nn.functional.mse_loss(
                pred_actions_normalized, target_actions
            )
        return BatchedTrainingOutputs(losses=losses, metrics=metrics)

    def configure_optimizers(self) -> list[torch.optim.Optimizer]:
        """Configure optimizer for training.

        Returns:
            list[torch.optim.Optimizer]: List of optimizers
        """
        return [torch.optim.Adam(self.parameters(), lr=self.lr)]

    @staticmethod
    def get_supported_input_data_types() -> set[DataType]:
        """Return the input data types supported by the model.

        Returns:
            set[DataType]: Set of supported input data types
        """
        return {DataType.JOINT_POSITIONS}

    @staticmethod
    def get_supported_output_data_types() -> set[DataType]:
        """Return the output data types supported by the model.

        Returns:
            set[DataType]: Set of supported output data types
        """
        return {DataType.JOINT_TARGET_POSITIONS}
```

Your model must implement these methods:
- `__init__`: Initialize your model with necessary layers and components
- `forward`: Define the inference logic of your model
- `training_step`: Define how your model trains on a batch of data
- `configure_optimizers`: Define what optimizers to use for training
- `configure_schedulers`: Define what optimization schedulers to use for training
- `get_supported_input_data_types`: Declare what input data types your model supports
- `get_supported_output_data_types`: Declare what output data types your model can produce
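If your model benefits from learning-rate scheduling, `configure_schedulers` would typically build a torch scheduler per optimizer. The snippet below is a standalone sketch with a stand-in model; the exact return type the base class expects is an assumption here, so check `NeuracoreModel` before relying on it:

```python
import torch
import torch.nn as nn

# Stand-in for your NeuracoreModel's parameters.
model = nn.Linear(8, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Cosine decay over an assumed 100 training steps; step() is usually
# called by the trainer once per iteration or epoch.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

initial_lr = optimizer.param_groups[0]["lr"]
scheduler.step()
decayed_lr = optimizer.param_groups[0]["lr"]
```

Inside a `NeuracoreModel`, you would construct the scheduler around the optimizer(s) returned by `configure_optimizers` and return it from `configure_schedulers`.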
For a basic algorithm, a single Python file containing your model class is sufficient. For more complex algorithms:

```
algorithms/my_algorithm/
├── __init__.py        # Empty or importing from my_algorithm.py
├── my_algorithm.py    # Main model class definition
├── modules.py         # Helper modules and components
└── requirements.txt   # Optional: additional dependencies
```
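If `__init__.py` re-exports the model class, the package root exposes it directly. A hypothetical fragment (the module and class names match the directory layout above and are illustrative):

```python
# algorithms/my_algorithm/__init__.py
# Re-export the model class from the package root.
from .my_algorithm import MyCustomAlgorithm  # noqa: F401
```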
Once your implementation is finished, you can validate your custom algorithm with:

```shell
# Validate custom algorithms before upload
neuracore-validate /path/to/your/algorithm
```

Before submitting or uploading your algorithm, you can test it locally by creating a test similar to the ones in `tests/unit/ml/algorithms`.
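As a starting point before writing a full unit test against the Neuracore test utilities, a standalone smoke test of the reshape-and-loss pattern (pure PyTorch, hypothetical dimensions, no Neuracore imports) can catch shape bugs early:

```python
import torch
import torch.nn as nn

B, horizon, action_dim, num_joints = 2, 5, 3, 8

# Stand-in for the example model's head: joints -> flat action sequence.
output_layer = nn.Linear(num_joints, horizon * action_dim)

joints = torch.randn(B, num_joints)
preds = output_layer(joints).view(B, horizon, action_dim)

# Training-step pattern: MSE between predicted and target sequences.
targets = torch.randn(B, horizon, action_dim)
loss = nn.functional.mse_loss(preds, targets)

assert preds.shape == (B, horizon, action_dim)
assert loss.ndim == 0  # scalar loss
```

A real unit test would replace the stand-in layer with your `NeuracoreModel` subclass and build the batch from Neuracore's batched data types.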
To set up the development environment:

```shell
git clone https://github.com/neuracoreai/neuracore
cd neuracore
pip install -e .[dev,ml]
```

To contribute your algorithm to the open-source repository:

- Fork the Neuracore repository
- Add your algorithm to `neuracore/ml/algorithms/your_algorithm/`
- Ensure your implementation passes all tests
- Submit a pull request to the main repository
- Go to the "Algorithms" tab on the Neuracore Dashboard
- Click the "Upload Algorithm" button
- Either:
  - Upload a single Python file containing your `NeuracoreModel` extension
  - Upload a ZIP file containing your algorithm directory
After uploading, your algorithm will appear as a trainable option when launching training jobs.
- Start Simple: Begin with a basic model architecture and gradually add complexity
- Study Existing Algorithms: Look at the algorithms in `neuracore/ml/algorithms/` for examples
- Mind Your Dependencies: If your algorithm requires additional packages, include them in a `requirements.txt` file
- Test Thoroughly: Ensure your model handles all the data types it claims to support
- Document Well: Include docstrings and comments explaining your model's architecture and approach
If you encounter issues with your algorithm:
- Verify that your model correctly handles the batch structure
- Check that your model returns outputs in the expected format
- Ensure all tensor dimensions match what Neuracore expects
- When uploading as a ZIP, make sure your module imports are correctly structured
`main`: The single development and release branch; target all PRs here.
All PRs to `main` must have a version label:

- `version:major` - Breaking changes
- `version:minor` - New features
- `version:patch` - Bug fixes
- `version:none` - No release needed (docs, chores, CI)
Commit Format: Use conventional commits:

```
<prefix>: <description>
```

Valid prefixes: `feat`, `fix`, `chore`, `docs`, `ci`, `test`, `refactor`, `style`, `perf`
For significant changes, optionally update `changelogs/pending-changelog.md` as part of your PR:

```markdown
## Summary

This release adds multi-GPU training and improves streaming performance by 40%.
```

This human-written summary is included at the top of the GitHub release notes. It is reset to a blank template automatically after each release.
- Go to Actions → Release → Run workflow
- Check `dry_run` to preview (recommended)
- Review the summary, then run again without `dry_run`
- The workflow will:
  - Analyze PRs merged to `main` since the last release
  - Determine the version bump from PR labels
  - Bump the version and push directly to `main`
  - Publish to PyPI
  - Create a GitHub release with changelog
  - Tag the release