Contribution Guide

We welcome contributions! Please open GitHub issues or submit pull requests for:

  • New algorithms and models
  • Performance improvements
  • Documentation enhancements
  • Bug fixes and feature requests

Creating Your Custom Algorithm

This guide explains how to create and add your own custom machine learning algorithms to Neuracore. You have two options:

  1. Open Source Contribution: Submit a PR to add your algorithm to the Neuracore repository
  2. Private Algorithm: Upload your algorithm directly to your account at neuracore.com

Understanding Neuracore Models

All Neuracore algorithms must extend the NeuracoreModel class. This base class provides the foundation for creating models that can process robot data and generate actions.

Key Concepts

  • Data Types: Neuracore supports various data types (joint positions, RGB images, etc.)
  • Batched Data: Input and output data is provided in batched form
  • Model Architecture: You define how your model processes inputs and generates outputs

Step 1: Extend the NeuracoreModel Class

Your model must inherit from NeuracoreModel and implement several required methods:

from typing import Any, cast

import torch
import torch.nn as nn
from neuracore_types import (
    BatchedJointData,
    BatchedNCData,
    DataItemStats,
    DataType,
    JointDataStats,
    ModelInitDescription,
)

from neuracore.ml import (
    BatchedInferenceInputs,
    BatchedTrainingOutputs,
    BatchedTrainingSamples,
    NeuracoreModel,
)
from neuracore.ml.algorithm_utils.normalizer import MeanStdNormalizer

PROPRIO_NORMALIZER = MeanStdNormalizer
ACTION_NORMALIZER = MeanStdNormalizer


class MyCustomAlgorithm(NeuracoreModel):
    """A custom algorithm for robot control."""

    def __init__(
        self,
        model_init_description: ModelInitDescription,
        lr: float = 1e-4,
        # Add any other hyperparameters your model needs
    ):
        super().__init__(model_init_description)
        self.lr = lr

        # Access data types and statistics from base class
        # self.input_data_types, self.output_data_types,
        # self.input_dataset_statistics, and self.output_dataset_statistics
        # are available after calling super().__init__()

        # Stats for inputs
        joint_stats = DataItemStats()
        for stat in cast(
            list[JointDataStats],
            self.input_dataset_statistics.get(DataType.JOINT_POSITIONS, []),
        ):
            joint_stats = joint_stats.concatenate(stat.value)
        if len(joint_stats.mean) == 0:
            raise ValueError("JOINT_POSITIONS must be present in input data types")

        # Stats for outputs
        output_stats: list[DataItemStats] = []
        target_stats = DataItemStats()
        for stat in cast(
            list[JointDataStats],
            self.output_dataset_statistics.get(
                DataType.JOINT_TARGET_POSITIONS, []
            ),
        ):
            target_stats = target_stats.concatenate(stat.value)
        if len(target_stats.mean) == 0:
            raise ValueError("JOINT_TARGET_POSITIONS must be present in output data types")
        output_stats.append(target_stats)
        self.max_output_size = len(target_stats.mean)

        # Normalizers
        self.proprio_normalizer = PROPRIO_NORMALIZER(
            name="proprioception", statistics=[joint_stats]
        )
        self.action_normalizer = ACTION_NORMALIZER(name="actions", statistics=output_stats)

        # Output layer predicts entire sequence directly from normalized joints
        self.output_size = self.max_output_size * self.output_prediction_horizon
        self.output_layer = nn.Linear(len(joint_stats.mean), self.output_size)

    def forward(
        self, batch: BatchedInferenceInputs
    ) -> dict[DataType, list[BatchedNCData]]:
        """Forward pass for inference.

        Args:
            batch: Input batch with observations

        Returns:
            dict[DataType, list[BatchedNCData]]: Model predictions with action sequences
        """
        # Extract and normalize joint positions
        batched_joint_data = cast(
            list[BatchedJointData], batch.inputs[DataType.JOINT_POSITIONS]
        )
        mask = batch.inputs_mask[DataType.JOINT_POSITIONS]

        # Concatenate all joint values: (B, T, num_joints)
        joint_data = torch.cat([bjd.value for bjd in batched_joint_data], dim=-1)
        last_joint = joint_data[:, -1, :]  # (B, num_joints)
        masked_joint = last_joint * mask

        # Normalize
        normalized_joint = self.proprio_normalizer.normalize(masked_joint)

        # Predict directly from normalized joints
        action_preds = self.output_layer(normalized_joint)  # (B, output_size)

        # Reshape to (B, T, action_dim)
        batch_size = len(batch)
        action_preds = action_preds.view(
            batch_size, self.output_prediction_horizon, self.max_output_size
        )

        # Unnormalize predictions
        predictions = self.action_normalizer.unnormalize(action_preds)

        # Format output as dict[DataType, list[BatchedNCData]]
        output_tensors: dict[DataType, list[BatchedNCData]] = {}
        batched_outputs = []
        for i in range(
            len(self.output_dataset_statistics[DataType.JOINT_TARGET_POSITIONS])
        ):
            joint_preds = predictions[:, :, i : i + 1]  # (B, T, 1)
            batched_outputs.append(BatchedJointData(value=joint_preds))
        output_tensors[DataType.JOINT_TARGET_POSITIONS] = batched_outputs

        return output_tensors

    def training_step(self, batch: BatchedTrainingSamples) -> BatchedTrainingOutputs:
        """Training step - forward pass with loss calculation.

        Args:
            batch: Training batch with inputs and targets

        Returns:
            BatchedTrainingOutputs: Training outputs with losses and metrics
        """
        # Create inference batch from training inputs
        inference_sample = BatchedInferenceInputs(
            inputs=batch.inputs,
            inputs_mask=batch.inputs_mask,
            batch_size=batch.batch_size,
        )

        # Get predictions
        predictions_dict = self.forward(inference_sample)

        # Extract target actions
        if DataType.JOINT_TARGET_POSITIONS not in batch.outputs:
            raise ValueError("JOINT_TARGET_POSITIONS required in batch outputs")

        batched_joints = cast(
            list[BatchedJointData], batch.outputs[DataType.JOINT_TARGET_POSITIONS]
        )
        action_targets = [bjd.value for bjd in batched_joints]
        action_data = torch.cat(action_targets, dim=-1)  # (B, T, num_joints)
        action_data = action_data.view(
            batch.batch_size, self.output_prediction_horizon, -1
        )  # (B, T, action_dim)

        target_actions = self.action_normalizer.normalize(action_data)

        # Get predicted actions (already normalized in forward)
        pred_joints = cast(
            list[BatchedJointData],
            predictions_dict[DataType.JOINT_TARGET_POSITIONS],
        )
        pred_actions = torch.cat([bjd.value for bjd in pred_joints], dim=-1)
        pred_actions = pred_actions.view(
            batch.batch_size, self.output_prediction_horizon, -1
        )
        pred_actions_normalized = self.action_normalizer.normalize(pred_actions)

        # Calculate loss
        losses: dict[str, Any] = {}
        metrics: dict[str, Any] = {}

        if self.training:
            losses["mse_loss"] = nn.functional.mse_loss(
                pred_actions_normalized, target_actions
            )

        return BatchedTrainingOutputs(losses=losses, metrics=metrics)

    def configure_optimizers(
        self,
    ) -> list[torch.optim.Optimizer]:
        """Configure optimizer for training.

        Returns:
            list[torch.optim.Optimizer]: List of optimizers
        """
        return [torch.optim.Adam(self.parameters(), lr=self.lr)]

    @staticmethod
    def get_supported_input_data_types() -> set[DataType]:
        """Return the data types supported by the model.

        Returns:
            set[DataType]: Set of supported input data types
        """
        return {DataType.JOINT_POSITIONS}

    @staticmethod
    def get_supported_output_data_types() -> set[DataType]:
        """Return the data types supported by the model.

        Returns:
            set[DataType]: Set of supported output data types
        """
        return {DataType.JOINT_TARGET_POSITIONS}

Step 2: Required Methods

Your model must implement these methods:

  1. __init__: Initialize your model with necessary layers and components
  2. forward: Define the inference logic of your model
  3. training_step: Define how your model trains on a batch of data
  4. configure_optimizers: Define what optimizers to use for training
  5. get_supported_input_data_types: Declare what input data types your model supports
  6. get_supported_output_data_types: Declare what output data types your model can produce

Step 3: Optional Methods

  1. configure_schedulers: Define what optimization schedulers to use for training
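
The scheduler interface is not shown in the example above. As a hedged sketch (assuming configure_schedulers returns one torch learning-rate scheduler per optimizer, and that you keep a reference to the optimizers you create), it might look like this:

    def configure_optimizers(self) -> list[torch.optim.Optimizer]:
        # Keep a reference so schedulers can be attached to the same optimizers
        self._optimizers = [torch.optim.Adam(self.parameters(), lr=self.lr)]
        return self._optimizers

    def configure_schedulers(self) -> list[torch.optim.lr_scheduler.LRScheduler]:
        # Hypothetical: one cosine-annealing schedule per optimizer
        return [
            torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=1000)
            for opt in self._optimizers
        ]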

Step 4: File Organization

For a basic algorithm, a single Python file containing your model class is sufficient. For more complex algorithms:

algorithms/my_algorithm/
├── __init__.py               # Empty or importing from my_algorithm.py
├── my_algorithm.py           # Main model class definition
├── modules.py                # Helper modules and components
└── requirements.txt          # Optional: additional dependencies
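
For example, a minimal __init__.py (assuming the layout above, with the main class in my_algorithm.py) simply re-exports the model class:

# algorithms/my_algorithm/__init__.py
from .my_algorithm import MyCustomAlgorithm

__all__ = ["MyCustomAlgorithm"]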

Step 5: Algorithm Validation

After your implementation is finished, you can validate your custom algorithm with:

# Validate custom algorithms before upload
neuracore-validate /path/to/your/algorithm

Step 6: Test Your Algorithm

Before submitting or uploading your algorithm, you can test it locally by creating a test similar to the ones in tests/unit/ml/algorithms.
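
As a minimal, hypothetical sketch (a real test would typically also exercise forward and training_step on synthetic batches), a first check might verify the declared data types:

# tests/unit/ml/algorithms/test_my_algorithm.py (hypothetical file name;
# the import path depends on where your algorithm lives)
from neuracore_types import DataType

from my_algorithm import MyCustomAlgorithm


def test_supported_data_types() -> None:
    assert DataType.JOINT_POSITIONS in MyCustomAlgorithm.get_supported_input_data_types()
    assert (
        DataType.JOINT_TARGET_POSITIONS
        in MyCustomAlgorithm.get_supported_output_data_types()
    )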

To set up the development environment:

git clone https://github.com/neuracoreai/neuracore
cd neuracore
pip install -e .[dev,ml]

Adding Your Algorithm to Neuracore

Option 1: Open Source Contribution

  1. Fork the Neuracore repository
  2. Add your algorithm to neuracore/ml/algorithms/your_algorithm/
  3. Ensure your implementation passes all tests
  4. Submit a pull request to the main repository

Option 2: Private Algorithm Upload

  1. Go to the "Algorithms" tab on the Neuracore Dashboard
  2. Click the "Upload Algorithm" button
  3. Either:
    • Upload a single Python file containing your NeuracoreModel extension
    • Upload a ZIP file containing your algorithm directory

After uploading, your algorithm will appear as a trainable option when launching training jobs.

Tips for Algorithm Development

  1. Start Simple: Begin with a basic model architecture and gradually add complexity
  2. Study Existing Algorithms: Look at the algorithms in neuracore/ml/algorithms/ for examples
  3. Mind Your Dependencies: If your algorithm requires additional packages, include them in a requirements.txt file
  4. Test Thoroughly: Ensure your model handles all the data types it claims to support
  5. Document Well: Include docstrings and comments explaining your model's architecture and approach
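
For tip 3 (Mind Your Dependencies), a requirements.txt listing hypothetical extra packages might look like:

# algorithms/my_algorithm/requirements.txt (hypothetical packages)
einops>=0.7
timm>=0.9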

Troubleshooting

If you encounter issues with your algorithm:

  • Verify that your model correctly handles the batch structure
  • Check that your model returns outputs in the expected format
  • Ensure all tensor dimensions match what Neuracore expects
  • When uploading as a ZIP, make sure your module imports are correctly structured
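
For the tensor-dimension point in particular, a quick sanity check inside forward (hypothetical, reusing the names from the example above) can surface mismatches early:

# Hypothetical shape check before unnormalizing the predictions
assert action_preds.shape == (
    batch_size,
    self.output_prediction_horizon,
    self.max_output_size,
), f"Unexpected prediction shape: {tuple(action_preds.shape)}"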

Release Process

Branch Strategy

  • main: The single development and release branch - target all PRs here

Creating PRs

All PRs to main must have a version label:

  • version:major - Breaking changes
  • version:minor - New features
  • version:patch - Bug fixes
  • version:none - No release needed (docs, chores, CI)

Commit Format: Use conventional commits:

<prefix>: <description>

Valid prefixes: feat, fix, chore, docs, ci, test, refactor, style, perf
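
For example (hypothetical messages):

feat: add transformer-based manipulation algorithm
fix: correct joint statistics concatenation in the normalizer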

Pending Changelog

For significant changes, optionally update changelogs/pending-changelog.md as part of your PR:

## Summary

This release adds multi-GPU training and improves streaming performance by 40%.

This human-written summary is included at the top of the GitHub release notes. It is reset to a blank template automatically after each release.

Creating a Release

  1. Go to Actions → Release → Run workflow
  2. Check dry_run to preview (recommended)
  3. Review the summary, then run again without dry_run
  4. The workflow will:
    • Analyze PRs merged to main since last release
    • Determine version bump from PR labels
    • Bump version and push directly to main
    • Publish to PyPI
    • Create a GitHub release with changelog
    • Tag the release