Rename to OpenClaw and update image tags #5

Open
TimPietruskyRunPod wants to merge 101 commits into main from openclaw-rename

Conversation

@TimPietruskyRunPod

Summary

  • rename images, docs, scripts, and configs to OpenClaw naming
  • update env vars and paths to OPENCLAW_*
  • refresh templates and CI tags for OpenClaw images

Test plan

  • Not run (docs/config/workflow changes only)

Refresh images, docs, and scripts to use Moltbot naming and env vars. Update Docker build workflow to tag images with branch names.
Clarify that branch builds publish tags using the branch name with slashes normalized.
Push images on branch and PR builds using the source branch name and allow all branches/tags to trigger builds.
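The slash normalization mentioned above can be sketched as a small helper. This is a minimal sketch, assuming slashes become dashes and that other characters invalid in a Docker tag are stripped; the workflow's actual normalization rule may differ.

```python
import re

def branch_to_tag(branch: str) -> str:
    """Turn a git branch name into a valid Docker image tag (sketch).

    Assumption: slashes are replaced with dashes; any other character
    outside Docker's allowed tag set is also replaced with a dash.
    """
    tag = branch.replace("/", "-")
    # Docker tags may only contain [a-zA-Z0-9_.-] and must not start with . or -
    tag = re.sub(r"[^a-zA-Z0-9_.-]", "-", tag)
    return tag.lstrip(".-")[:128]  # tags are limited to 128 characters
```

For example, a PR built from `feature/openclaw-rename` would publish the tag `feature-openclaw-rename`.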
Fail fast when moltbot is missing so the rename does not silently fall back.
Trigger image builds on pull requests (branch tag) and release tags only, with documentation to match.
Trigger builds on main pushes so :latest is published while keeping PR builds for branches.
Pin to the beta tag so the image gets the moltbot binary.
Use the supported clawdbot package and provide a moltbot symlink.
Ensure clawdbot reads the intended state directory in the gguf entrypoint.
Create required state directories and lock down permissions after doctor.
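The fail-fast check and state-directory setup described in the two commits above can be sketched together. The function name and arguments here are hypothetical; only the behavior (abort when the binary is missing, create the directory, restrict it to the owner) comes from the commit messages.

```python
import os
import shutil
import stat

def prepare_state(state_dir: str, binary: str = "moltbot") -> None:
    """Fail fast if the expected binary is absent, then create the state
    directory and lock down its permissions (hypothetical helper)."""
    if shutil.which(binary) is None:
        # Abort instead of silently falling back to the old name.
        raise SystemExit(f"{binary} not found on PATH; refusing to continue")
    os.makedirs(state_dir, exist_ok=True)
    os.chmod(state_dir, stat.S_IRWXU)  # 0o700: owner-only access
```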
Rewrite the root README to focus on Moltbot images, context sizes, and status summary.
Align images, configs, and entrypoints with OpenClaw branding and paths.
Update docs and templates to drop Moltbot/Clawdbot references.
Centralize web UI and SSH log output across entrypoints.
Adjust build contexts to include shared scripts and document builds.
Document the tokenized Web UI URL and device pairing approval commands.
Add an OpenClaw skill and CLI wrapper for FLUX.2 SDNQ image generation.
Wire skills loading and install dependencies in images.
- PyTorch cu128 required for Blackwell sm_120 GPU support
- Diffusers from git required for Flux2KleinPipeline (not in stable 0.36.0)
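The two pins above translate to install lines roughly like the following. This is an illustrative Dockerfile fragment, not the image's actual build steps; the cu128 index URL and the git install of diffusers match the stated constraints, but exact pins may differ.

```dockerfile
# Hypothetical install lines matching the constraints above.
RUN pip install torch --index-url https://download.pytorch.org/whl/cu128
RUN pip install "git+https://github.com/huggingface/diffusers"
```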
- New root AGENTS.md with architecture, model variants, skills, quick commands
- CLAUDE.md now references AGENTS.md for agents/devs, focusing on build/test commands, code style, and testing instructions
- Codebase structure with purpose of each folder
- Key architectural decisions (llama.cpp for 5090, cu128, etc.)
- Where to make changes table
- Build, test, and debugging commands
Add speech-to-text and text-to-speech capabilities using LiquidAI's
LFM2.5-Audio-1.5B model with GPU acceleration on RTX 5090.

Changes:
- Build audio runners from llama.cpp PR #18641 with CUDA SM120 support
- Add openclaw-tts script with voice selection (US/UK male/female)
- Add openclaw-stt script for audio transcription
- Add skills/tts and skills/stt for OpenClaw integration
- ~80x speedup vs CPU-only prebuilt runners (2s vs 15s)

Performance on RTX 5090:
- TTS: ~965 tokens/sec, ~2.3s for short sentences
- STT: ~688 tokens/sec, ~2.0s for short clips
- Audio decode: 4ms (vs 1296ms on CPU)

Model files downloaded at runtime to /workspace/models/LFM2.5-Audio-GGUF/
Replace per-request model loading with persistent audio server:
- Scripts now use streaming API to audio server on port 8001
- TTS: 0.8s vs 2.5s (3x faster)
- STT: 0.3s vs 2.0s (7x faster)
- Model stays loaded in VRAM (~845 MiB)

Changes:
- Rewrite openclaw-tts/stt as Python scripts using server API
- Add -ngl 99 to entrypoint for GPU-accelerated audio server
- Server auto-starts with container on port 8001
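A client of the persistent audio server might build its request like this. The endpoint path and payload shape are assumptions (the commits only say the scripts use a streaming API to a server on port 8001); the helper below constructs the request without sending it.

```python
import json
import urllib.request

AUDIO_SERVER = "http://127.0.0.1:8001"

def build_tts_request(text: str, voice: str = "us-female") -> urllib.request.Request:
    """Build (but do not send) a TTS request to the persistent audio server.

    Hypothetical sketch: the endpoint path and JSON fields are assumed,
    since the PR only documents the port and the voice options.
    """
    payload = json.dumps({"input": text, "voice": voice}).encode()
    return urllib.request.Request(
        f"{AUDIO_SERVER}/v1/audio/speech",  # assumed endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# Sending would then be: urllib.request.urlopen(build_tts_request("hello"))
```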
add persistent flux.2 klein image generation server on port 8002
for instant inference with pre-loaded model in vram

- add openclaw-image-server http server that loads model at startup
- refactor openclaw-image-gen to use server api instead of loading per request
- reduce llm context from 200k to 100k tokens to free vram for image server
- update entrypoint to start image server alongside llm and audio servers
- update openclaw config contextTokens to match reduced context
- add image server to cleanup function and startup messages
fix image server to work alongside llm and audio servers by optimizing
vram usage and fixing sdnq quantizer registration

- register sdnq quantizer with diffusers to fix model loading errors
- disable torch compile/inductor to reduce vram pressure
- enable attention/vae slicing and tiling for lower memory usage
- restore llm context to 200k (was reduced to 100k)
- add llama_parallel=1 config for single slot (no concurrency)
- add llama_gpu_layers=44 config to free vram for image server
- update agents.md with vram usage table and binary separation docs
- document critical requirement: llm and audio binaries must be separate
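Putting the keys named above together, the relevant slice of the config might look like this. The key names (`llama_parallel`, `llama_gpu_layers`, `contextTokens`) come from the commit messages; the surrounding structure is an assumption.

```json
{
  "llama_parallel": 1,
  "llama_gpu_layers": 44,
  "contextTokens": 200000
}
```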
copy openclaw-image-server to docker image and expose port 8002
for persistent image generation server
set speed-first defaults and align openclaw context limits
ensure audio server loads its shared libs via LD_LIBRARY_PATH
persist generated images and expose /latest and /images endpoints
ensure media output dirs exist and surface public/proxy urls
include flux2-klein-1024 and test-robot examples
add a lightweight media proxy + static ui on port 8080
bundle a tool_result hook to render image urls inline
enable toolresult hook in entrypoint so chat surfaces audio links.
add proxy audio endpoints and ui controls for tts and stt.
Adds ik_llama.cpp as a new engine to support custom quant types (type
139+) incompatible with standard llama.cpp. Unlocks MiniMax M2.5 IQ2_KS
(~70 GiB) on A100/H100 80GB and smol-IQ3_KS (~87 GiB) on B200 180GB.
Models with --fit enabled need gpuLayers=auto so llama.cpp can determine
the optimal split of layers between GPU and CPU. Explicitly setting -ngl 999
prevents --fit from offloading layers to system RAM.
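As a sketch, the affected model configs would carry an entry like the following. `gpuLayers` is named in the commit; spelling the fit flag as a `"fit": "on"` field is an assumption about the schema.

```json
{
  "fit": "on",
  "gpuLayers": "auto"
}
```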
Actual KV cache: 3500 MiB for 98304 context = ~36 MB/1k tokens (with
q8_0 quantization and kv_unified=true). Original estimate of 10 MB/1k
was too optimistic for MLA with unified KV.
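The corrected rate follows directly from the measured numbers quoted above:

```python
def kv_mb_per_1k(total_mib: float, context_tokens: int) -> float:
    """KV-cache footprint per 1k tokens of context (units as quoted above)."""
    return total_mib / (context_tokens / 1000)

# 3500 MiB across a 98304-token context:
rate = kv_mb_per_1k(3500, 98304)
assert 35 < rate < 37  # ~36 MB/1k, well above the original 10 MB/1k estimate
```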
- strip pytorch test/debug artifacts from image-gen venv (~200-500 MB)
- remove unused apt packages: sudo, unzip, python3-venv (~30-50 MB)
- add --omit=dev to npm install, strip docs from node_modules (~10-30 MB)
- remove all vllm code paths, model configs, and engine entry
- create .dockerignore for faster build context
- delete empty plugins/disable-external-image-tools/ directory
consolidate 4 engine builds into 2 via openclaw2go-llamacpp fork:
- openclaw2go-llamacpp: unified fork with audio, outetss, eagle-3
- ik-llamacpp: custom quants (unchanged)

add 4 new task roles: vision (8003), embedding (8004), reranking (8005), tts (8006)
add 7 new model configs: qwen2.5-vl-7b, qwen3-embedding-0.6b,
  jina-reranker-v3, outetss-0.2-500m, arcee-trinity-mini,
  olmo-3.1-think-32b, glm-4.7-full

rename all references from llamacpp-openclaw to openclaw2go-llamacpp.
pr #19460 (glm-5 dsa) already merged upstream, removed from cherry-pick list.
- **Registry consolidation**: Integrate registry into main repo instead of separate openclaw2go-registry repo
- **Web configurator**: Add React + Vite + TypeScript UI (site/) for VRAM-first GPU pod configuration
- **Schema validation**: Add JSON schemas for models, GPUs, and profiles with validation scripts
- **New models**: Add MLX-based models (glm47-flash, gpt-oss-20b, nemotron3-nano)
- **CI/CD**: Add workflows for building catalog and validating PRs
- **Documentation**: Add contributing guide and design brief
- Update registry fetch URL to point to consolidated registry in main repo
- Add support for new engines (openclaw2go-llamacpp, mlx, vllm) in allowed engines list
…ethods

Replace 2-card grid (local/cloud) with tabbed interface supporting
cli-local, cli-cloud, docker, and mlx deploy methods. Each tab generates
correct commands with copy button and contextual hints. Tab visibility
adapts to selected OS (docker hidden on mac, mlx hidden on linux/windows).

Add isDefault and mlx fields to CatalogModel for proper command generation.
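The tab-visibility rule stated above can be expressed as a small lookup. This is a Python sketch of the logic (the actual site is TypeScript); the tab identifiers are taken from the commit message.

```python
def visible_deploy_tabs(os_name: str) -> list[str]:
    """Deploy-method tabs shown for a selected OS (sketch).

    Mirrors the rule above: docker hidden on mac, mlx hidden on
    linux/windows.
    """
    tabs = ["cli-local", "cli-cloud", "docker", "mlx"]
    if os_name == "mac":
        tabs.remove("docker")
    else:  # linux / windows
        tabs.remove("mlx")
    return tabs
```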
Tested on RTX 5090 (32GB): tool calling works, kvCache calibrated to
70 MB/1k, 18311/32607 MiB VRAM, ~200 tok/s, OpenClaw gateway integration
verified with proper device pairing.
Verified on RTX 5090 (Docker/llama.cpp Q8_0) and Apple M3 Pro
(MLX 4-bit). Tool calling works on both platforms.
Models with an mlx field (multi-platform) were only tagged as
linux/windows. Now they show on all three platforms.
GGUF models with embedded mlx fields were getting ['linux', 'windows', 'mac']
as their OS array, causing Mac to appear in the Linux/Windows platform tab.
Now that separate MLX model files exist with platform: 'mlx', OS assignment
only checks m.platform. Also auto-populates mlx field for platform MLX models
so DeployOutput still generates correct Mac deploy commands.
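The fixed OS assignment described above reduces to checking only the platform field. A minimal Python sketch (the real code is TypeScript, and the exact schema is assumed from the commit message):

```python
def model_os_list(model: dict) -> list[str]:
    """Assign OS tabs from the model's platform field only (sketch).

    GGUF models no longer inherit mac from an embedded mlx field;
    only platform == 'mlx' maps to mac.
    """
    if model.get("platform") == "mlx":
        return ["mac"]
    return ["linux", "windows"]
```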
- Remove kv rate row from info table (not useful to end users)
- Swap VRAM bar above info table so bars align across cards
- Remove fixed min-height on info table
- Always render detail row in segment labels to prevent layout jumps
  when switching between platforms with different segment counts
- Wire platform tab clicks to sync all selected models via onVariantSwitch
Models with the same base name (e.g., GLM-5, MiniMax-M2.5) are now
ordered by quantization level: 1-bit first, then 2-bit, 3-bit, etc.
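A sort key for this ordering only needs to pull the bit level out of the quant name. Hypothetical sketch; the real implementation may parse quant names differently.

```python
import re

def quant_bits(quant: str) -> int:
    """Extract the bit level from a quant name like 'IQ2_KS' or 'Q4_K_M'.

    Assumption: the first digit in the name is the bit level; names
    without a digit sort last.
    """
    m = re.search(r"(\d+)", quant)
    return int(m.group(1)) if m else 99

variants = ["Q4_K_M", "IQ2_KS", "TQ1_0", "IQ3_KS"]
variants.sort(key=quant_bits)
# → ['TQ1_0', 'IQ2_KS', 'IQ3_KS', 'Q4_K_M']
```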
…ovements

- add qwen3.5-397b-a17b model (2-bit, 137gb, 1m context on b200)
- use --override-kv to bypass llama-server slot capping at n_ctx_train
- format context display as "1m" for million-scale tokens
- migrate registry url and site base path to openclaw2go.io
- rewrite catalog mlx model separation using flatmap for correct os tabs
- add variant switch handler for cross-platform model selection
- auto-detect mmproj for any llm with vision projection file
- add vision type and mmproj field to model schema
- update flux2-klein-mlx repo and vram
- fix duplicate model count caused by gguf+mlx entries sharing same id
- make platform tabs local state so switching doesn't affect global os
- deduplicate variant groups by os to prevent duplicate macos tabs
- remove unused onVariantSwitch prop chain
- move accent border from left to top on selected model cards
- remove vram label from memory presets
- add vram presets: 141 (h200), 256 (m4 ultra), 288 (b300)
- add m4 ultra 256gb to mac gpu list
- add tts type, vision badges, model filters, hasVision field
- fix qwen3.5 overhead and step3.5 context length
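The "1m" context formatting mentioned in the list above can be sketched as follows. The thousands branch is an assumption; the commit only mentions "m" formatting for million-scale contexts.

```python
def format_context(tokens: int) -> str:
    """Compact context-length display: millions as 'Nm', thousands as 'Nk'.

    Only the million-scale case is confirmed by the commit; the 'k'
    fallback is assumed for symmetry.
    """
    if tokens >= 1_000_000:
        return f"{tokens // 1_000_000}m"
    if tokens >= 1_000:
        return f"{tokens // 1_000}k"
    return str(tokens)
```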
ik_llama.cpp only sets CMAKE_RUNTIME_OUTPUT_DIRECTORY (executables)
but not CMAKE_LIBRARY_OUTPUT_DIRECTORY, so .so files end up in
build/src/ instead of build/bin/ where the COPY glob expects them.
our openclaw2go-llamacpp fork sets both to build/bin/.

this caused /opt/engines/ik-llamacpp/lib/ to be empty, breaking
minimax m2.5 iq2_ks and iq3_ks models.
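For contrast, the relevant CMake settings in the openclaw2go-llamacpp fork look roughly like this (a sketch of the two variables named above; ik_llama.cpp sets only the first):

```cmake
# Executables and shared libraries both land in build/bin/,
# where the Dockerfile's COPY glob expects them.
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)  # executables
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)  # .so files
```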
add tps column to model picker list with column header, display tps
as first info block in selected model cards (reordered to tps → quant
→ engine), and plumb tps data from registry model json through catalog
and group-models to all ui components. add per-gpu tps benchmarks to
13 model configs.
- step-3.5-flash: 69 t/s on a100
- qwen3-coder-next: 118 t/s on l40
- kimi-k2.5-tq1: 9 t/s on b200 (with offload)
- fix glm-4.7-full file paths: repo reorganized to Q4_K_M/ subdir
  with 5 splits instead of 4
arrow up/down moves focus between model rows across sections.
auto-swaps selection only within a type that already has an active
model, preserving cross-type independence.
… determine layer placement"

This reverts commit c8a786d.
GLM-4.7 Full Q4_K_M needs ~228 GB VRAM, which exceeds even the B200 (182 GB).
GLM-4.7 Flash (17GB, 179 t/s) and GLM-5 TQ1 (176GB, 27 t/s on B200)
cover the same use cases better.