This guide covers how to run ProcessAce with Ollama for local artifact generation, how to choose a deployment mode based on your operating system and hardware, and what is currently supported.
ProcessAce supports Ollama as a first-class provider for artifact generation.
Current scope:
- Supported: local generation models for BPMN, SIPOC, RACI, and narrative artifacts
- Supported: bundled Docker Ollama, host-native Ollama, and Linux AMD ROCm Docker mode
- Not supported: Ollama as the transcription backend
Transcription remains on OpenAI-compatible speech-to-text providers. Ollama's current OpenAI compatibility layer does not provide the audio transcription endpoint that ProcessAce would need for the existing STT runtime.
Cloud-only (no local Ollama): use this when you do not want local models and only plan to use cloud providers.
Before starting the stack, configure the required base variables in .env:
- JWT_SECRET
- ENCRYPTION_KEY
- SQLITE_ENCRYPTION_KEY
- CORS_ALLOWED_ORIGINS
- REDIS_PASSWORD
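For orientation, a minimal `.env` sketch with placeholder values (illustrative only, generate your own secrets):

```
# Placeholders only - replace with your own strong secrets
JWT_SECRET=change-me-long-random-string
ENCRYPTION_KEY=change-me-long-random-string
SQLITE_ENCRYPTION_KEY=change-me-long-random-string
CORS_ALLOWED_ORIGINS=http://localhost:3000
REDIS_PASSWORD=change-me
```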
```
docker compose up -d --build
```

This starts only the application and Redis. No bundled Ollama container is created.
If you use Linux bind mounts for ./data and ./uploads, make sure those host directories are writable by the container UID because the app runs as a non-root user.
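For example, if those directories were created by root, ownership can be handed to the container user. The `1000:1000` owner below is only an illustration; substitute the UID:GID the ProcessAce image actually runs as:

```
# Hypothetical UID:GID - check the image's configured user and adjust
mkdir -p ./data ./uploads
sudo chown -R 1000:1000 ./data ./uploads
```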
Bundled Docker Ollama: use this when you want local generation models through a bundled Ollama container.
```
docker compose -f docker-compose.yml -f docker-compose.ollama.yml up -d --build
```

Default container-to-container routing:

```
OLLAMA_BASE_URL_DEFAULT=http://ollama:11434/v1
OLLAMA_PULL_HOST=http://ollama:11434
```
This mode works on any machine that can run the ProcessAce Docker stack, but generation speed depends entirely on CPU performance and available RAM.
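If you prefer the CLI over the settings page, models can also be pulled directly into the bundled container; `llama3.2` below is only an example model tag, not a statement about the ProcessAce catalog:

```
# Example only - pull any Ollama model tag into the bundled container
docker compose exec ollama ollama pull llama3.2
```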
Host-native Ollama (Windows): use this when you run Docker Desktop on Windows and want Ollama to use host hardware directly.
Recommended for:
- Windows users in general
- Windows + AMD GPU hosts
- cases where Docker GPU passthrough is unavailable or unreliable
Steps:
- Install and start Ollama on the Windows host.
- Set the ProcessAce `.env` values:

```
CORS_ALLOWED_ORIGINS=http://localhost:3000
OLLAMA_BASE_URL_DEFAULT=http://host.docker.internal:11434/v1
OLLAMA_PULL_HOST=http://host.docker.internal:11434
```

- Start ProcessAce normally:
```
docker compose up -d --build
```

In this mode, the app container talks to the host Ollama instance through `host.docker.internal`.
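Before relying on that route, it is worth confirming the host Ollama instance is actually listening. Run on the Windows host, a check like this should return JSON listing the installed models:

```
# Run on the Windows host - confirms Ollama is serving on its default port
curl http://localhost:11434/api/tags
```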
Linux AMD ROCm Docker mode: use this when the Docker host is Linux and the machine has a ROCm-capable AMD GPU.
Start with:
```
docker compose -f docker-compose.yml -f docker-compose.ollama.yml -f docker-compose.ollama-amd.yml up -d --build
```

This switches the Ollama service to `ollama/ollama:rocm` and passes through:

- `/dev/kfd`
- `/dev/dri`
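For orientation only, the override is conceptually equivalent to a compose fragment like the sketch below; use the `docker-compose.ollama-amd.yml` that ships with ProcessAce rather than this snippet:

```yaml
# Sketch of what the ROCm override does - not the shipped file
services:
  ollama:
    image: ollama/ollama:rocm
    devices:
      - /dev/kfd
      - /dev/dri
```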
Host prerequisites:
- Linux host with Docker Engine
- ROCm-capable AMD GPU
- working AMD/ROCm driver stack on the host
- Docker access to `/dev/kfd` and `/dev/dri`
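A quick way to check the last two prerequisites on the host before starting the stack:

```
# On the Linux host: the ROCm device nodes must exist and be accessible to Docker
ls -l /dev/kfd /dev/dri
```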
The App Settings page exposes a curated Ollama catalog in the 2.1 Local Model Manager section.
The catalog includes:
- model size
- parameter size when known
- context window when known
- recommended hardware guidance
Use those hints as practical guidance, not hard limits. CPU-only execution is possible for smaller models, but latency may be high on weaker machines.
For CPU-only systems, prefer smaller models first. Larger models may still load, but generation latency can become impractical.
Local model usability is constrained more by available RAM and model size than by raw CPU clock speed alone. If the machine is close to its memory limit, expect swapping, slower startup, or model eviction.
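As a rough rule of thumb (not a ProcessAce-specific figure), a 4-bit quantized model needs on the order of half its parameter count in gigabytes just for weights, so roughly 4-5 GB for a 7B model, plus working memory for the context window. Before pulling a larger model, check what the machine actually has free:

```
# On a Linux host: check memory headroom and current per-container usage
free -h
docker stats --no-stream
```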
- Windows + AMD: prefer host Ollama
- Linux + AMD: use the ROCm Docker override
- No supported GPU path: use bundled CPU Ollama
- Open `/app-settings.html`.
- In `2. Default Model Selection`, choose `Ollama (Local)` as the LLM provider.
- Confirm the Ollama base URL for your deployment mode.
- Use `Load Models` to refresh installed Ollama models.
- Use `2.1 Local Model Manager` to:
  - check installed status
  - download curated models
  - uninstall unused models
  - set an installed model as the active default
The `Use Model` action saves the selection immediately, so the chosen Ollama model becomes active without a separate save step.
- Open App Settings
- Select `Ollama (Local)`
- Use `Load Models`
- Confirm the installed model list loads successfully
For the bundled Ollama container:

```
docker compose exec ollama ollama list
docker compose exec ollama ollama ps
```

For the AMD ROCm override, also confirm the device nodes are visible inside the container:

```
docker compose exec ollama ls /dev/kfd /dev/dri
docker compose exec ollama ollama ps
```

For host-native Ollama:

- confirm the settings page connects through `http://host.docker.internal:11434/v1`
- verify host CPU/GPU activity while Ollama is serving requests
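On the Windows host, `ollama ps` covers the last check: it lists the loaded models and reports whether each is running on GPU or CPU:

```
# Run on the Windows host - shows loaded models and GPU vs CPU placement
ollama ps
```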
A base URL of `http://ollama:11434/v1` is expected only when the Ollama Docker override is enabled. If `.env` defines `OLLAMA_BASE_URL_DEFAULT`, the settings page should show that value after the app container is rebuilt. Rebuild with:
```
docker compose up -d --build
```

If you want bundled Ollama instead, start the stack with:
```
docker compose -f docker-compose.yml -f docker-compose.ollama.yml up -d --build
```

Slow generation usually means the model is running on CPU, or the selected model is too large for the machine's practical RAM/VRAM budget. Try a smaller model first.
A failed generation request usually means it never reached a runnable Ollama inference path. Check:
- the configured base URL
- whether the selected model is installed
- whether the request is generation, not transcription
Ollama is not currently supported as the transcription backend in ProcessAce. Keep transcription on OpenAI-compatible speech-to-text providers.