feat: add model construction docs#405

Open
sdas-neuraco wants to merge 60 commits into main from feat/add-model-docs

Conversation

@sdas-neuraco
Contributor

Features

  • Added documentation to explain how we auto-construct models.

muneebneura and others added 22 commits February 7, 2026 15:27
* fix: renaming of uploader to importer
Co-authored-by: Steven Jacobs <steven.jacobs@neuracore.com>
* feat: add local training metadata for local runs
Co-authored-by: favour-neuraco <favour@neurco.com>
Removed note about the 'main' branch being a development branch.
@github-actions
Contributor

github-actions bot commented Feb 16, 2026

PR source branch is valid

  • Source: feat/add-model-docs
  • Target: develop

@sdas-neuraco sdas-neuraco self-assigned this Feb 16, 2026
@sdas-neuraco sdas-neuraco added the version:patch non-breaking bug fixes; no input/output changes (except defaults) label Feb 16, 2026
@sdas-neuraco sdas-neuraco requested review from kwangneuraco, mark-neuracore and stepjam and removed request for kwangneuraco February 16, 2026 15:06
Contributor

Both Franka and KUKA have 7 DoF, so using Franka here is not appropriate; a UR5 would be a better choice because it has 6 DoF.

It usually begins with a single robot. You train a model on a Franka Emika Panda, define the input tensors, fix the output dimensionality, and everything works as expected. The model converges, inference is stable, and the system feels clean and well-structured. At this stage, the architecture appears robot-agnostic — but in reality, it is tightly coupled to one embodiment.

Then a second dataset is introduced, perhaps from a KUKA LBR iiwa. On the surface, the robots are similar: both are 7-DoF manipulators with wrist joints and RGB inputs. But under the hood, the differences begin to surface. Joint names differ. One robot has an extra joint. Camera placements are not identical. The ordering of data points in the logs does not match. Even wrist joint data may be represented differently.

Contributor

Same as the pic, a bit confused: why does one robot have an extra joint?

## What Is a Cross Embodiment Description?

```python
EmbodimentDescription = dict[DataType, dict[int, str]]
```
Contributor

dict[str, int]?


An `EmbodimentDescription` defines what data points exist for one robot and where each datapoint lives in index space.

Each `DataType` (for example `JOINTS`, `RGB_IMAGES`, `GRIPPER_JOINT_OPEN_AMOUNT`) maps to:
Contributor

JOINT_POSITION? I feel it would be better if the names in the doc strictly matched the code.

```python
0: "panda_joint_1",
1: "panda_joint_2",
2: "panda_joint_3",
3: "panda_joint_4",
```
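To make the shape of the type concrete, a full description for one robot might look like the sketch below. The `DataType` enum members and the camera name here are illustrative stand-ins, not taken from Neuracore's actual schema:

```python
from enum import Enum, auto

# Illustrative stand-in for the DataType enum; member names are assumptions.
class DataType(Enum):
    JOINTS = auto()
    RGB_IMAGES = auto()
    GRIPPER_JOINT_OPEN_AMOUNT = auto()

# Each DataType maps global indices to per-robot data point names.
EmbodimentDescription = dict[DataType, dict[int, str]]

panda: EmbodimentDescription = {
    DataType.JOINTS: {
        0: "panda_joint_1",
        1: "panda_joint_2",
        2: "panda_joint_3",
        3: "panda_joint_4",
    },
    DataType.RGB_IMAGES: {0: "wrist_camera"},  # hypothetical camera name
}
```

A second robot would supply its own dict with its own joint names, while reusing the same globally assigned indices wherever the data points correspond.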
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it better to put KUKA first and Franka second; otherwise the reader may be confused by the missing index without more context.

Different robots can have different numbers of joints, but training requires a **consistent tensor shape**.
In this example, the wrist joint value stays at its globally assigned index, even when some intermediate indices are empty for a given robot.

If one robot has fewer joints, Neuracore fills missing dimensions with `0` so all records align to the same width.
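A minimal sketch of that alignment, assuming hypothetical robots and index assignments (the `to_fixed_width` helper is illustrative, not a Neuracore API):

```python
# Hypothetical joint readings keyed by globally assigned index.
# Robot A defines all eight indices; Robot B leaves indices 4 and 6 unused.
robot_a = {0: 0.12, 1: -0.40, 2: 0.31, 3: 1.05, 4: -0.22, 5: 0.77, 6: 0.05, 7: 0.90}
robot_b = {0: 0.10, 1: -0.35, 2: 0.28, 3: 0.98, 5: 0.70, 7: 0.80}

def to_fixed_width(values: dict[int, float], width: int) -> list[float]:
    """Place each value at its global index; fill missing dimensions with 0.0."""
    return [values.get(i, 0.0) for i in range(width)]

# Both robots now align to the same 8-wide tensor layout.
print(to_fixed_width(robot_a, 8))
print(to_fixed_width(robot_b, 8))  # indices 4 and 6 are zero-filled
```

Because every value stays at its global index, a model trained on one robot's records can consume another robot's records without any per-robot reordering logic.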
Contributor

To be more precise, it is `0.0`, a float.

gtiboni-neuraco and others added 8 commits February 19, 2026 11:25
* refactor: encapsulate logic in main function

* chore: example script renamed and docs updated

---------

Co-authored-by: Gabriele Tiboni <gabriele.tiboni@neuraco.com>
Co-authored-by: Gabriele Tiboni <gabriele.tiboni@neuraco.com>
Co-authored-by: StevenJacobs61 <stevenjacobs61@gamil.com>

At first, the solution is incremental. You reorder tensors. You insert padding. You add a mapping layer. You write conditionals in the inference code. The pipeline still runs. But now structure has started leaking. The model input definition is no longer cleanly separated from embodiment logic.
Contributor

conditions?

@stepjam stepjam force-pushed the develop branch 3 times, most recently from 3982231 to e37ad91 Compare March 5, 2026 10:49
@stepjam stepjam deleted the branch main March 5, 2026 16:19
@stepjam stepjam closed this Mar 5, 2026
@stepjam stepjam reopened this Mar 5, 2026
@stepjam stepjam changed the base branch from develop to main March 5, 2026 16:31