model not loading on GPU #212

@kot197

Description

Hi, I just want to run this simple code on GPU...

from langchain_community.llms import CTransformers

llm = CTransformers(model="./airoboros-mistral2.2-7b.Q4_K_S.gguf", model_type="mistral", gpu_layers=32, verbose=True)

print(llm.invoke('AI is going to'))

[Screenshot: Task Manager showing 0% GPU usage while the model is running]

As you can see, GPU usage stays at 0%, and the response took about a minute, which is very long for such a short prompt.

Is there anything I can do to see what's going on, like printing something to the terminal? How do I even know whether it's actually running on the GPU? I can't fully trust Task Manager.

I'm a newcomer, please help me
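
For reference, here is a minimal sketch of the usual GPU setup, assuming the CUDA build of ctransformers is installed (pip install ctransformers[cuda]) and that the LangChain wrapper reads gpu_layers from its config dict rather than as a top-level keyword argument (both are assumptions, not confirmed in this issue):

# Sketch: pass gpu_layers via the wrapper's config dict (assumption)
from langchain_community.llms import CTransformers

llm = CTransformers(
    model="./airoboros-mistral2.2-7b.Q4_K_S.gguf",
    model_type="mistral",
    config={"gpu_layers": 32},  # ask for 32 transformer layers to be offloaded to the GPU
    verbose=True,
)
print(llm.invoke("AI is going to"))

While it runs, checking nvidia-smi in another terminal (or switching one of Task Manager's GPU graphs to the "Cuda" view) gives a more reliable picture of GPU memory and utilization than the default 3D graph.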
