LocalAI version:
image: localai/localai:latest-gpu-vulkan (doesn't see the integrated GPU) or image: localai/localai:latest-gpu-hipblas.
LocalAI v4.5.5 (d11b202)
Environment, CPU architecture, OS, and Version:
Linux with AMD, AMD Ryzen 7 3700X 8-Core Processor. 96 GB of unified memory, 48 GB set for framebuffer in UEFI.
alexander@minisforum:/srv$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
Runtime Ext Version: 1.14
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen AI 9 HX 370 w/ Radeon 890M
Uuid: CPU-XX
Marketing Name: AMD Ryzen AI 9 HX 370 w/ Radeon 890M
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 49152(0xc000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5157
BDFID: 0
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 47876700(0x2da8a5c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 47876700(0x2da8a5c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 47876700(0x2da8a5c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 47876700(0x2da8a5c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1150
Uuid: GPU-XX
Marketing Name: AMD Radeon 890M Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 2048(0x800) KB
Chip ID: 5390(0x150e)
ASIC Revision: 4(0x4)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2900
BDFID: 50688
Internal Node ID: 1
Compute Unit: 16
SIMDs per CU: 2
Shader Engines: 1
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties: APU
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 34
SDMA engine uCode:: 15
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 50331648(0x3000000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 50331648(0x3000000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1150
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
*** Done ***
docker-compose.yml:
localai:
  # Whichever:
  # image: localai/localai:latest-gpu-hipblas #-- until ollama finds all available memory.
  image: localai/localai:latest-gpu-vulkan
  restart: unless-stopped
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
    interval: 1m
    timeout: 20m
    retries: 5
  ports:
    - 8080:8080
  networks:
    - webui
    - flowise
  extra_hosts:
    - "host.docker.internal:host-gateway"
    - "nas:192.168.1.104"
  environment:
    AMD_VISIBLE_DEVICES: all
    DEBUG: true
    LOCALAI_FORCE_META_BACKEND_CAPABILITY: amd
    BUILD_TYPE: hipblas
    # REBUILD: true
    GGML_CUDA_ENABLE_UNIFIED_MEMORY: 1
    LOCALAI_AGENT_POOL_DEFAULT_MODEL: hermes-3-llama3.1-8b
    LOCALAI_AGENT_POOL_EMBEDDING_MODEL: granite-embedding-107m-multilingual
    LOCALAI_AGENT_POOL_ENABLE_SKILLS: true
    LOCALAI_AGENT_POOL_ENABLE_LOGS: true
    LOCALAI_AGENT_POOL_VECTOR_ENGINE: postgres
    LOCALAI_AGENT_POOL_DATABASE_URL: [redacted]
    LOCALAI_API_KEY: [redacted]
  # user: "${UID}:${GID}"
  volumes:
    - models:/models:rw
    - backends:/backends:rw
    - data:/data:rw
    - config:/etc/localai:rw
    - configuration:/configuration:rw
    - content:/tmp/generated/content:rw
  group_add:
    - video
    # - render
  runtime: amd
  devices:
    - /dev/kfd:/dev/kfd
    - /dev/dri:/dev/dri
Describe the bug
An attempt to generate an image (in this case with the Chroma1-HD model, but also with some others) results in "failed to load model with internal loader: could not load model: rpc error: code = Canceled desc = context canceled", with "/opt/amdgpu/share/libdrm/amdgpu.ids: No such file or directory" in the backend process output.
/opt/amdgpu/share/libdrm/amdgpu.ids is absent, but there is /usr/share/libdrm/amdgpu.ids in the container.
If a symlink is made in the container, the model load still fails, for an unknown reason:
Jul 01 12:23:29 ERROR Failed to load model modelID="chroma1-hd" error=failed to load model with internal loader: could not load model: rpc error: code = Canceled desc = context canceled backend="diffusers" caller={caller.file="/build/pkg/model/initializers.go" caller.L=263 }
Jul 01 12:23:29 INFO HTTP request method="POST" path="/v1/images/generations" status=500 caller={caller.file="/build/core/http/app.go" caller.L=217 }
Jul 01 12:23:30 WARN Backend process exited unexpectedly id="chroma1-hd" address="127.0.0.1:46655" process="run.sh" exitCode="-1" caller={caller.file="/build/pkg/model/process.go" caller.L=255 }
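For anyone trying to reproduce this, the symlink step mentioned above can be sketched roughly as follows. This is an illustration, not the exact commands that were run: the helper name `link_amdgpu_ids` is hypothetical, and its default paths are simply the two paths named in this report.

```shell
# Hypothetical helper: point the path the backend looks for at the
# amdgpu.ids file that actually exists in the image.
# Defaults are the two paths mentioned in this report; both can be overridden.
link_amdgpu_ids() {
  src="${1:-/usr/share/libdrm/amdgpu.ids}"
  dst="${2:-/opt/amdgpu/share/libdrm/amdgpu.ids}"

  # Refuse to create a dangling link if the source file is missing.
  if [ ! -e "$src" ]; then
    echo "source file not found: $src" >&2
    return 1
  fi

  # Create the parent directory if needed, then (re)create the symlink.
  mkdir -p "$(dirname "$dst")" && ln -sfn "$src" "$dst"
}
```

It would be run in a shell inside the container (for example one opened with `docker compose exec localai sh`), by defining the function and calling `link_amdgpu_ids` with no arguments. Since `/opt` is not on any of the mounted volumes, the link is lost whenever the container is recreated.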
In the container logs:
Jul 01 19:04:40 DEBUG GRPC stdout id="chroma1-hd-127.0.0.1:33513" line="Received termination signal. Shutting down..." caller={caller.file="/build/pkg/model/process.go" caller.L=220 }
dmesg shows:
…
amdxdna 0000:c7:00.1: [drm] *ERROR* amdxdna_drm_open: SVA bind device failed, ret -19