Commit 5c0d564

Merge pull request #10 from GetSoloTech/container
Container
2 parents c1ffa9d + cc90f60 commit 5c0d564

12 files changed: 454 additions & 220 deletions


README.md

Lines changed: 67 additions & 11 deletions

````diff
@@ -20,13 +20,13 @@ Solo Server is a lightweight platform that enables users to manage and monitor A
 ## Features
 
 - **Seamless Setup:** Manage your on device AI with a simple CLI and HTTP servers
-- **Open Model Registry:** Pull models from registries like Hugging Face and Ollama
+- **Open Model Registry:** Pull models from registries like Ollama & Hugging Face
 - **Lean Load Testing:** Built-in commands to benchmark endpoints
 - **Cross-Platform Compatibility:** Deploy AI models effortlessly on your hardware
 - **Configurable Framework:** Auto-detect hardware (CPU, GPU, RAM) and sets configs
 
 ## Supported Models
-Solo Server supports **multiple model sources**, including **Ollama, Hugging Face, and Ramalama**.
+Solo Server supports **multiple model sources**, including **Ollama & Hugging Face**.
 
 | **Model Name** | **Source** |
 |------------------------|----------------------------------------------------------|
@@ -39,7 +39,7 @@ Solo Server supports **multiple model sources**, including **Ollama, Hugging Fac
 | **Mistral 7B v3** | `hf://MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF` |
 | **Hermes 2 Pro** | `hf://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF` |
 | **Cerebrum 1.0 7B** | `hf://froggeric/Cerebrum-1.0-7b-GGUF` |
-| **Dragon Mistral 7B** | `hf://llmware/dragon-mistral-7b-v0`
+| **Dragon Mistral 7B** | `hf://llmware/dragon-mistral-7b-v0` |
 
 ## Table of Contents
 
@@ -52,6 +52,12 @@ Solo Server supports **multiple model sources**, including **Ollama, Hugging Fac
 
 ## Installation
 
+### **🔹Prerequisites**
+
+- **🐋 Docker:** Required for containerization
+  - [Install Docker](https://docs.docker.com/get-docker/)
+  - Ensure Docker daemon is running
+
 ### **🔹 Install via PyPI**
 ```sh
 pip install solo-server
@@ -65,22 +71,39 @@ Creates an isolated environment using `uv` for performance and stability.
 
 Run the **interactive setup** to configure Solo Server:
 ```sh
-solo setup
+solo start
 ```
 ### **🔹 Setup Features**
 ✔️ **Detects CPU, GPU, RAM** for **hardware-optimized execution**
 ✔️ **Auto-configures `solo.conf` with optimal settings**
-✔️ **Requests API keys for Ngrok and Replicatea**
+✔️ **Requests API keys for Ngrok and Replicate**
 ✔️ **Recommends the compute backend OCI (CUDA, HIP, SYCL, Vulkan, CPU, Metal)**
 
 ---
 
+**Example Output:**
+```sh
+🖥️ System Information
+Operating System: Windows
+CPU: AMD64 Family 23 Model 96 Stepping 1, AuthenticAMD
+CPU Cores: 8
+Memory: 15.42GB
+GPU: NVIDIA
+GPU Model: NVIDIA GeForce GTX 1660 Ti
+GPU Memory: 6144.0GB
+Compute Backend: CUDA
+
+🚀 Setting up Solo Server...
+✅ Solo server is ready!
+```
+
+---
+
 ## **Commands**
-### **1️⃣ Pull a Model**
+### **1️⃣ Pull & Run a Model**
 ```sh
-solo pull llama3
+solo run llama3.2
 ```
-
 
 ---
 
@@ -96,6 +119,39 @@ http://127.0.0.1:5070 #SOLO_SERVER_PORT
 
 ---
 
+## Diagram
+
+```
+      +-------------------+
+      |                   |
+      | solo run llama3.2 |
+      |                   |
+      +---------+---------+
+                |
+                |
+                |    +------------------+          +----------------------+
+                |    | Pull inferencing |          | Pull model layer     |
+                +----| runtime (cuda)   |--------->| llama3.2             |
+                     +------------------+          +----------------------+
+                                                   | Repo options         |
+                                                   ++-----------+--------++
+                                                    |           |        |
+                                                    v           v        v
+                                              +----------+ +----------+ +-------------+
+                                              | Ollama   | | vLLM     | | HuggingFace |
+                                              | Registry | | registry | | Registry    |
+                                              +-----+----+-+----+-----+-++------------+
+                                                    |           |        |
+                                                    v           v        v
+                                                   +---------------------+
+                                                   |     Start with      |
+                                                   |     cuda runtime    |
+                                                   |     and             |
+                                                   |     llama3.2        |
+                                                   +---------------------+
+```
+---
+
 ### **3️⃣ Benchmark a Model**
 ```sh
 solo benchmark llama3
@@ -148,12 +204,12 @@ solo status
 
 ### **5️⃣ Stop a Model**
 ```sh
-solo stop llama3
+solo stop
 ```
 **Example Output:**
 ```sh
-Stopping llama3...
-llama3 stopped successfully.
+🛑 Stopping Solo Server...
+✅ Solo server stopped successfully.
 ```
 
 ---
````

setup.py

Lines changed: 6 additions & 2 deletions

```diff
@@ -11,19 +11,23 @@
     description="AIOps for the Physical World.",
     long_description=long_description,
     long_description_content_type="text/markdown",
-    url="https://github.com/AIEngineersDev/solo-server",
+    url="https://github.com/GetSoloTech/solo-server",
     packages=find_packages(include=["solo_server", "solo_server.*"]),
     include_package_data=True,
     install_requires=[
         "typer",
+        "GPUtil",
+        "psutil",
+        "requests",
+        "tabulate",
     ],
     extras_require={
         "dev": ["pytest", "black", "isort"],
     },
     python_requires=">=3.8",
     entry_points={
         "console_scripts": [
-            "solo-server=solo_server.cli:app",
+            "solo=solo_server.cli:app",
         ],
     },
 )
```
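The `console_scripts` change above is what renames the installed command from `solo-server` to `solo`: setuptools reads each `name=module:attr` spec and generates a launcher that imports the module and calls the attribute. A minimal sketch of that spec parsing, for illustration only (`parse_entry_point` is a hypothetical helper, not part of the commit or of setuptools):

```python
def parse_entry_point(spec: str):
    """Split a console_scripts spec 'name=module:attr' into its three parts."""
    name, target = spec.split("=", 1)      # command name vs. import target
    module, attr = target.split(":", 1)    # dotted module path vs. callable name
    return name.strip(), module.strip(), attr.strip()

# The spec added by this commit:
print(parse_entry_point("solo=solo_server.cli:app"))  # ('solo', 'solo_server.cli', 'app')
```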

solo_server/cli.py

Lines changed: 4 additions & 6 deletions

```diff
@@ -1,15 +1,13 @@
 import typer
-from .commands import pull, serve, stop, status, benchmark
-from .setup import interactive_setup
+from .commands import run, stop, status
+from .start import start
 app = typer.Typer()
 
 # Commands
-app.command()(pull.pull)
-app.command()(serve.serve)
+app.command()(run.run)
 app.command()(stop.stop)
 app.command()(status.status)
-app.command()(benchmark.benchmark)
-app.command()(interactive_setup)
+app.command()(start)
 
 if __name__ == "__main__":
     app()
```

solo_server/commands/pull.py

Lines changed: 0 additions & 26 deletions

This file was deleted.

solo_server/commands/run.py

Lines changed: 26 additions & 0 deletions

```diff
@@ -0,0 +1,26 @@
+import typer
+import subprocess
+
+def run(model: str):
+    """
+    Serves a model using Ollama and enables interactive chat.
+    """
+    typer.echo(f"🚀 Starting model {model}...")
+
+    # Check if Docker container is running
+    try:
+        check_cmd = ["docker", "ps", "-q", "-f", "name=solo"]
+        if not subprocess.run(check_cmd, capture_output=True, text=True).stdout:
+            typer.echo("❌ Solo server is not active. Please start solo server first.", err=True)
+            return
+
+        command = ["docker", "exec", "-it", "solo", "ollama", "run", model]
+
+        # Use subprocess.run with shell=True for interactive terminal
+        process = subprocess.run(
+            " ".join(command),
+            shell=True,
+            text=True
+        )
+    except subprocess.CalledProcessError as e:
+        typer.echo(f"❌ An error occurred: {e}", err=True)
```
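The `docker ps -q -f name=solo` check above relies on `-q` printing one matching container ID per line and nothing when no container matches. A standalone sketch of that decision, factored into a pure function so it can be exercised without Docker (`is_container_running` is a hypothetical name, not part of the commit):

```python
def is_container_running(ps_stdout: str) -> bool:
    """True if `docker ps -q` printed at least one container ID."""
    return bool(ps_stdout.strip())

print(is_container_running("3f2a9c1b\n"))  # True: a (made-up) container ID matched
print(is_container_running(""))            # False: start the server first
```

In the commit itself the command list is joined into a string and run with `shell=True`, per its comment, to keep the `-it` session interactive; for non-interactive calls, passing the argv list directly avoids shell quoting issues.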

solo_server/commands/serve.py

Lines changed: 41 additions & 15 deletions

```diff
@@ -1,20 +1,46 @@
+import requests
+import json
 import typer
-import subprocess
 
-def serve(name: str, model: str):
-    """
-    Serves a model using Ramalama.
-    """
-    typer.echo(f"🚀 Starting model {model} as {name}...")
+def serve(
+    model: str = typer.Option("llama3.2", "--model", "-m", help="Model to use"),
+    input: str = typer.Option("Hello", "--input", "-i", help="Input text for inference"),
+    stream: bool = typer.Option(False, "--stream", "-s", help="Enable streaming mode")
+):
+    # API Endpoint
+    url = "http://localhost:11434/api/chat"
 
-    try:
-        command = ["ramalama", "serve", model]
-        process = subprocess.run(command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
+    # Chat request payload
+    data = {
+        "model": model,
+        "messages": [
+            {
+                "role": "user",
+                "content": input
+            }
+        ],
+        "stream": stream  # Set to True for streaming
+    }
 
-        typer.echo(f"✅ Model {model} is now running as {name}.")
-        typer.echo(f"🌐 Access the UI at: http://127.0.0.1:5070")
+    if data["stream"] == False:
+        # Sending POST request
+        response = requests.post(url, json=data)
+        # Check if response is valid JSON
+        try:
+            response_json = response.json()
+            if "message" in response_json and "content" in response_json["message"]:
+                print("Assistant Response:", response_json["message"]["content"])
+            else:
+                print("Unexpected Response:", json.dumps(response_json, indent=2))
+        except json.JSONDecodeError:
+            print("Error: API did not return valid JSON.")
+            print("Raw Response:", response.text)
 
-    except subprocess.CalledProcessError as e:
-        typer.echo(f"❌ Failed to serve model {model}: {e.stderr}", err=True)
-    except Exception as e:
-        typer.echo(f"⚠️ Unexpected error: {e}", err=True)
+
+    else:
+        with requests.post(url, json=data, stream=True) as response:
+            for line in response.iter_lines():
+                if line:
+                    json_obj = json.loads(line)
+                    if "message" in json_obj and "content" in json_obj["message"]:
+                        print(json_obj["message"]["content"], end="", flush=True)  # Streaming output
```
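The streaming branch above consumes the chat endpoint's newline-delimited JSON response one line at a time. A self-contained sketch of that parsing loop, run against invented sample chunks instead of a live server (`stream_chat_content` is a hypothetical helper, not part of the commit):

```python
import json
from typing import Iterable, Iterator

def stream_chat_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the message.content fragment from each NDJSON line, skipping blanks."""
    for line in lines:
        if not line:
            continue  # iter_lines() can yield empty keep-alive lines
        obj = json.loads(line)
        message = obj.get("message", {})
        if "content" in message:
            yield message["content"]

# Invented chunks in the shape the streaming branch expects:
chunks = [
    b'{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    b"",
    b'{"message": {"role": "assistant", "content": "lo!"}, "done": true}',
]
print("".join(stream_chat_content(chunks)))  # Hello!
```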

solo_server/commands/status.py

Lines changed: 39 additions & 3 deletions

```diff
@@ -1,10 +1,46 @@
 import typer
 import subprocess
+from solo_server.utils.hardware import display_hardware_info
+from tabulate import tabulate
+import json
 
 app = typer.Typer()
 
 @app.command()
 def status():
-    """Check running models."""
-    typer.echo("Checking running model containers...")
-    subprocess.run(["podman", "ps", "--filter", "name=solo-container"], check=True)
+    """Check running models and system status."""
+    display_hardware_info(typer)
+
+    # Check for running solo container
+    container_result = subprocess.run(["docker", "ps", "-f", "name=solo", "--format", "{{json .}}"],
+                                      capture_output=True, text=True, check=True)
+
+    if container_result.stdout.strip():
+        # Container is running, show available models
+        typer.echo("\n🔍 Available Models:")
+        models_result = subprocess.run(["docker", "exec", "solo", "ollama", "list"],
+                                       capture_output=True, text=True, check=True)
+        models = []
+        for line in models_result.stdout.strip().split('\n'):
+            parts = line.split()
+            if len(parts) >= 7:
+                size = f"{parts[2]} {parts[3]}"
+                modified = f"{parts[4]} {parts[5]} {parts[6]}"
+                models.append([parts[0], parts[1], size, modified])
+
+        if models:
+            print(tabulate(models, headers=['NAME', 'ID', 'SIZE', 'MODIFIED'], tablefmt='grid'))
+
+    # Show running containers section (will be empty if none running)
+    typer.echo("\n🔍 Running Containers:")
+    containers = []
+    if container_result.stdout.strip():
+        for line in container_result.stdout.strip().split('\n'):
+            container = json.loads(line)
+            containers.append([
+                container['Names'],
+                container['Status'],
+                container['Ports']
+            ])
+
+    print(tabulate(containers, headers=['NAME', 'STATUS', 'PORTS'], tablefmt='grid'))
```
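The `ollama list` parsing above splits each row on whitespace, skips short lines such as the header, and rejoins the size and modified columns. The same logic as a pure function, exercised on invented sample output (`parse_ollama_list` is a hypothetical name; the model ID and timings are made up):

```python
def parse_ollama_list(output: str):
    """Split `ollama list` rows into [name, id, size, modified] lists."""
    rows = []
    for line in output.strip().split("\n"):
        parts = line.split()
        if len(parts) >= 7:  # the header row has only 4 fields and is skipped
            size = f"{parts[2]} {parts[3]}"
            modified = " ".join(parts[4:7])
            rows.append([parts[0], parts[1], size, modified])
    return rows

# Invented sample in the shape `ollama list` prints:
sample = """NAME            ID            SIZE    MODIFIED
llama3.2:latest a80c4f17acd5  2.0 GB  2 hours ago"""
print(parse_ollama_list(sample))
```

Note the `>= 7` guard silently drops rows whose "modified" column has fewer than three words; a column-aware parse would be more robust, but this mirrors the commit's behavior.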

solo_server/commands/stop.py

Lines changed: 23 additions & 6 deletions

```diff
@@ -1,17 +1,34 @@
 import typer
 import subprocess
 
-def stop(name: str):
+def stop(name: str = ""):
     """
-    Stops a running model container using Ramalama.
+    Stops the Ollama Docker container and any running models.
     """
-    typer.echo(f"🛑 Stopping {name} using Ramalama...")
+    typer.echo("🛑 Stopping Solo Server...")
 
     try:
-        subprocess.run(["ramalama", "stop", name], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
-        typer.echo(f"✅ {name} stopped successfully.")
+        # Stop the Docker container
+        subprocess.run(
+            ["docker", "stop", "solo"],
+            check=True,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+            text=True
+        )
+        typer.echo("✅ Solo server stopped successfully.")
+
+        # # Remove the container
+        # subprocess.run(
+        #     ["docker", "rm", "ollama"],
+        #     check=True,
+        #     stdout=subprocess.PIPE,
+        #     stderr=subprocess.PIPE,
+        #     text=True
+        # )
+        # typer.echo("🗑️ Ollama container removed.")
 
     except subprocess.CalledProcessError as e:
-        typer.echo(f"❌ Failed to stop {name}: {e.stderr}", err=True)
+        typer.echo(f"❌ Failed to stop Solo Server: {e.stderr}", err=True)
     except Exception as e:
         typer.echo(f"⚠️ Unexpected error: {e}", err=True)
```
