feat: update example script for 72B model inference with disk KV cache support
- Changed model from Qwen2.5-0.5B to Qwen2.5-72B for enhanced performance.
- Implemented disk-based KV cache to avoid out-of-memory (OOM) errors on GPUs with 8 GB of VRAM.
- Updated user prompt in the example to reflect a new question.
- Removed outdated comments and added new ones for clarity.
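
For readers unfamiliar with the idea, a disk-based KV cache can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the `DiskKVCache` class, its parameters, and the file layout are all hypothetical, and it uses NumPy memory-mapped files to stand in for whatever storage the example script actually uses.

```python
import os
import tempfile

import numpy as np


class DiskKVCache:
    """Hypothetical per-layer key/value store backed by memory-mapped files."""

    def __init__(self, cache_dir, num_layers, max_tokens, num_heads, head_dim,
                 dtype=np.float16):
        self.max_tokens = max_tokens
        self.length = [0] * num_layers  # tokens stored so far, per layer
        shape = (max_tokens, num_heads, head_dim)
        self.keys = []
        self.values = []
        for layer in range(num_layers):
            # mode="w+" creates (or overwrites) a file of the full size on disk;
            # the OS pages data in and out, so it need not all sit in RAM/VRAM.
            self.keys.append(np.memmap(
                os.path.join(cache_dir, f"k_{layer}.bin"),
                dtype=dtype, mode="w+", shape=shape))
            self.values.append(np.memmap(
                os.path.join(cache_dir, f"v_{layer}.bin"),
                dtype=dtype, mode="w+", shape=shape))

    def append(self, layer, k, v):
        """Append keys/values of shape (new_tokens, num_heads, head_dim)."""
        start = self.length[layer]
        end = start + k.shape[0]
        if end > self.max_tokens:
            raise ValueError("KV cache is full")
        self.keys[layer][start:end] = k
        self.values[layer][start:end] = v
        self.length[layer] = end

    def get(self, layer):
        """Return in-memory copies of all keys/values stored for one layer."""
        n = self.length[layer]
        return np.array(self.keys[layer][:n]), np.array(self.values[layer][:n])


if __name__ == "__main__":
    cache_dir = tempfile.mkdtemp()
    cache = DiskKVCache(cache_dir, num_layers=2, max_tokens=16,
                        num_heads=2, head_dim=4)
    k = np.ones((3, 2, 4), dtype=np.float16)
    v = np.zeros((3, 2, 4), dtype=np.float16)
    cache.append(0, k, v)
    keys, values = cache.get(0)
    print(keys.shape, values.shape)
```

In this sketch only the layer currently being computed needs its keys and values loaded, which is the general reason offloading the cache to disk can help on a small GPU, at the cost of extra I/O per generated token.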