[BUG] Default docker /dev/shm size too small for tensor parallelism to work #387

@robfuscator

Description

OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Describe the bug

You guys are awesome, thank you so much for your work!

When using the docker compose file provided in the repo, loading a model with tensor parallelism fails with an [Errno 28] No space left on device error (see logs below). I think I've pinpointed it: the error refers to the space available at /dev/shm.

Increasing the shm_size of the compose service to a larger value allowed me to load the model successfully. The default is 64 MB if my research is correct. Setting it to 2 GB did not help, but setting it to 16 GB did work. I assume it has to be large enough to fit a whole model layer.
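
For reference, this is roughly the change I made to my compose file. It's only a sketch: the service name is assumed to match the tabbyapi-1 container in the logs, and 16gb is simply the first value that worked for me.

  services:
    tabbyapi:
      # ... existing image, ports, volumes, deploy settings, etc. ...
      shm_size: 16gb   # Docker's default /dev/shm is 64 MB; 2gb still failed here, 16gb worked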

Reproduction steps

  1. Spin up the API using the docker-compose.yml provided in this repo
  2. Load a model via the API with tensor parallelism enabled (I've tried with Doctor-Shotgun/GLM-4.5-Air-exl3_3.14bpw-h6)

Expected behavior

I'd expect the default docker-compose.yml file to work out of the box, so I suggest adding a shm_size that works for most setups; maybe you have some insight into how much space is required. I'd be willing to create a PR to adjust the docker-compose file (along the lines of the snippet below) and maybe add a note to the wiki if you'd like.
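
Concretely, the PR would amount to a one-line addition to the service in docker-compose.yml, something like this (service name as above; the exact default value is open for discussion, 16gb is just what worked for me with this model):

  services:
    tabbyapi:
      # ... existing configuration unchanged ...
      shm_size: 16gb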

Logs

tabbyapi-1  | 2025-10-11 09:52:28.381 INFO:     127.0.0.1:51336 - "POST /v1/model/load HTTP/1.1" 200
tabbyapi-1  | 2025-10-11 09:52:28.811 INFO:     Using backend exllamav3
tabbyapi-1  | 2025-10-11 09:52:28.815 INFO:     exllamav3 version: 0.0.7
tabbyapi-1  | 2025-10-11 09:52:28.816 WARNING:  ExllamaV3 is currently in an alpha state. Please note that all config options may not work.
tabbyapi-1  | 2025-10-11 09:52:31.175 WARNING:  The provided model does not have vision capabilities that are supported by ExllamaV3. Vision input is disabled.
tabbyapi-1  | 2025-10-11 09:52:31.176 WARNING:  Draft model is disabled because a model name wasn't provided. Please check your config.yml!
tabbyapi-1  | 2025-10-11 09:52:31.176 WARNING:  The given cache size (86000) is not a multiple of 256.
tabbyapi-1  | 2025-10-11 09:52:31.176 WARNING:  Overriding cache_size with an overestimated value of 86016 tokens.
tabbyapi-1  | 2025-10-11 09:52:31.177 WARNING:  The given cache_size (86016) is less than 2 * max_seq_len and may be too small for requests using CFG.
tabbyapi-1  | 2025-10-11 09:52:31.177 WARNING:  Ignore this warning if you do not plan on using CFG.
tabbyapi-1  | 2025-10-11 09:52:31.185 INFO:     Attempting to load a prompt template if present.
tabbyapi-1  | 2025-10-11 09:52:31.211 INFO:     Using template "chat_template" for chat completions.
tabbyapi-1  | 2025-10-11 09:52:31.213 INFO:     Loading model: /app/models/GLM-4.5-Air-exl3_3.14bpw-h6
tabbyapi-1  | 2025-10-11 09:52:31.213 INFO:     Loading with tensor parallel
tabbyapi-1  | /opt/venv/lib/python3.12/site-packages/joblib/_multiprocessing_helpers.py:44: UserWarning: [Errno 28] No space left on device.  joblib will operate in serial mode
tabbyapi-1  |   warnings.warn("%s.  joblib will operate in serial mode" % (e,))
tabbyapi-1  | /opt/venv/lib/python3.12/site-packages/joblib/_multiprocessing_helpers.py:44: UserWarning: [Errno 28] No space left on device.  joblib will operate in serial mode
tabbyapi-1  |   warnings.warn("%s.  joblib will operate in serial mode" % (e,))

Additional context

No response

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
