@elvatis_com/openclaw-gpu-bridge

OpenClaw plugin to offload ML tasks (BERTScore + embeddings) to one or many remote GPU hosts.

v0.2 Highlights

  • Multi-GPU host pool (hosts[]) with:
    • round-robin or least-busy load balancing
    • automatic failover
    • periodic host health checks
  • Backward compatibility with v0.1 (serviceUrl / url)
  • Flexible model selection per request (model / model_type)
  • GPU service model caching (on-demand loading)
  • Optional transfer visibility via the /status endpoint and batch progress logs

Tools

  • gpu_health
  • gpu_info
  • gpu_status (new in v0.2)
  • gpu_bertscore
  • gpu_embed

OpenClaw Plugin Config

v0.2 (recommended)

{
  "plugins": {
    "@elvatis_com/openclaw-gpu-bridge": {
      "hosts": [
        {
          "name": "rtx-2080ti",
          "url": "http://your-gpu-host:8765",
          "apiKey": "gpu-key-1"
        },
        {
          "name": "rtx-3090",
          "url": "http://your-second-gpu-host:8765",
          "apiKey": "gpu-key-2"
        }
      ],
      "loadBalancing": "least-busy",
      "healthCheckIntervalSeconds": 30,
      "timeout": 45,
      "models": {
        "embed": "all-MiniLM-L6-v2",
        "bertscore": "microsoft/deberta-xlarge-mnli"
      }
    }
  }
}

v0.1 compatibility

{
  "plugins": {
    "@elvatis_com/openclaw-gpu-bridge": {
      "serviceUrl": "http://your-gpu-host:8765",
      "apiKey": "gpu-key",
      "timeout": 45
    }
  }
}

Config reference

  • hosts: array of GPU hosts (v0.2)
  • serviceUrl / url: legacy single-host config
  • loadBalancing: round-robin or least-busy
  • healthCheckIntervalSeconds: host health polling interval
  • timeout: request timeout for compute endpoints
  • apiKey: fallback API key for hosts that do not define a per-host key
  • models.embed, models.bertscore: plugin-side default models
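
The two loadBalancing strategies differ in how the next host is picked: round-robin cycles through healthy hosts in config order, while least-busy routes to the host reporting the fewest active jobs. The Python sketch below illustrates the idea only; the active_jobs field name and the /status payload shape are assumptions, not the plugin's actual code.

import itertools
import requests

HOSTS = [
    {"name": "rtx-2080ti", "url": "http://your-gpu-host:8765"},
    {"name": "rtx-3090", "url": "http://your-second-gpu-host:8765"},
]

_rr = itertools.cycle(HOSTS)

def pick_round_robin():
    # Hand out hosts in config order, wrapping around.
    return next(_rr)

def pick_least_busy():
    # Poll each host's /status and choose the one with the fewest
    # active jobs; unreachable hosts sort last, which doubles as failover.
    def load(host):
        try:
            r = requests.get(host["url"] + "/status", timeout=5)
            return r.json().get("active_jobs", 0)  # illustrative field name
        except requests.RequestException:
            return float("inf")
    return min(HOSTS, key=load)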

GPU Service (Python) Setup

cd gpu-service
pip install -r requirements.txt
uvicorn gpu_service:app --host 0.0.0.0 --port 8765
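
Once the service is up, you can smoke-test it from Python (the response fields are whatever the service returns; this shows the pattern only):

import requests

# /health is the one endpoint that never requires X-API-Key.
r = requests.get("http://your-gpu-host:8765/health", timeout=5)
r.raise_for_status()
print(r.json())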

Default models are warmed on startup:

  • Embed: all-MiniLM-L6-v2
  • BERTScore: microsoft/deberta-xlarge-mnli

Additional models are loaded on-demand and cached in memory.
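
Conceptually the cache is a dict keyed by model name, along these lines (an illustrative sketch, not the actual service code):

from sentence_transformers import SentenceTransformer

_MODEL_CACHE: dict = {}

def get_embed_model(name: str) -> SentenceTransformer:
    # The first request for a given model pays the download/load cost;
    # subsequent requests reuse the in-memory instance.
    if name not in _MODEL_CACHE:
        _MODEL_CACHE[name] = SentenceTransformer(name)
    return _MODEL_CACHE[name]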

Environment variables

  • API_KEY: require X-API-Key for all endpoints except /health
  • GPU_MAX_CONCURRENT: max parallel jobs (default 2)
  • GPU_EMBED_BATCH: embedding chunk size for progress logging (default 32)
  • GPU_MAX_BATCH_SIZE: max items per batch (default 100)
  • GPU_MAX_TEXT_LENGTH: max character length per text (default 10000)
  • MODEL_BERTSCORE: default warm model for BERTScore
  • MODEL_EMBED: default warm model for embeddings
  • TORCH_DEVICE: force device (cuda, cpu, cuda:1)
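
These follow the usual environment-variable pattern with the defaults listed above; roughly (illustrative, not the actual service code):

import os

API_KEY = os.environ.get("API_KEY")            # optional; when set, X-API-Key is enforced
MAX_CONCURRENT = int(os.environ.get("GPU_MAX_CONCURRENT", "2"))
EMBED_BATCH = int(os.environ.get("GPU_EMBED_BATCH", "32"))
MAX_BATCH_SIZE = int(os.environ.get("GPU_MAX_BATCH_SIZE", "100"))
MAX_TEXT_LENGTH = int(os.environ.get("GPU_MAX_TEXT_LENGTH", "10000"))
MODEL_EMBED = os.environ.get("MODEL_EMBED", "all-MiniLM-L6-v2")
MODEL_BERTSCORE = os.environ.get("MODEL_BERTSCORE", "microsoft/deberta-xlarge-mnli")
TORCH_DEVICE = os.environ.get("TORCH_DEVICE")  # e.g. "cuda", "cpu", "cuda:1"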

API Endpoints (GPU Service)

  • GET /health
  • GET /info
  • GET /status (queue + active jobs + progress)
  • POST /bertscore
  • POST /embed
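
/status is handy for watching long-running transfers; for example (the exact response fields depend on the service version):

import requests

headers = {"X-API-Key": "gpu-key-1"}  # needed whenever API_KEY is set on the service

status = requests.get("http://your-gpu-host:8765/status",
                      headers=headers, timeout=5).json()
# Expect queue depth, active jobs, and per-job progress in some form.
print(status)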

Request-level model override

/bertscore:

{
  "candidates": ["a"],
  "references": ["b"],
  "model_type": "microsoft/deberta-xlarge-mnli"
}

/embed:

{
  "texts": ["hello world"],
  "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
}
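
A full request wraps one of these bodies in a POST with the X-API-Key header; the same pattern applies to /bertscore. For example:

import requests

resp = requests.post(
    "http://your-gpu-host:8765/embed",
    headers={"X-API-Key": "gpu-key-1"},
    json={
        "texts": ["hello world"],
        "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    },
    timeout=45,  # mirror the plugin's "timeout" setting
)
resp.raise_for_status()
print(resp.json())  # shape depends on the service version

Note that a non-default model named here is loaded on demand, so the first call may be noticeably slower than later ones.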

Exposing to the Internet

If you expose your GPU service beyond your LAN, use defense in depth:

  1. Pre-shared key auth (required)

    • Set API_KEY on service
    • Configure same key in plugin host config (apiKey)
    • Requests must include X-API-Key
  2. TLS/HTTPS (required on public internet)

    • Recommended: nginx reverse proxy with Let’s Encrypt certs
    • Alternative: run uvicorn with SSL cert/key directly

nginx reverse proxy example

server {
  listen 443 ssl http2;
  server_name gpu.example.com;

  ssl_certificate /etc/letsencrypt/live/gpu.example.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/gpu.example.com/privkey.pem;

  location / {
    proxy_pass http://127.0.0.1:8765;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}

uvicorn SSL example

uvicorn gpu_service:app --host 0.0.0.0 --port 8765 \
  --ssl-keyfile /path/key.pem \
  --ssl-certfile /path/cert.pem

  3. Optional: WireGuard VPN instead of public exposure

    • Keep service private behind VPN
    • Prefer private WireGuard IPs in plugin hosts[].url
  4. Operational hardening

    • Firewall: allowlist only the OpenClaw server IP
    • Rate limiting at the reverse proxy
    • Monitor logs and rotate keys periodically

Development

npm run build
npm test

TypeScript runs in strict mode.

Shared Template

For automation that creates GitHub issues, use src/templates/github-issue-helper.ts. It provides isValidIssueRepoSlug(), resolveIssueRepo(), and buildGhIssueCreateCommand().

License

MIT
