This repository was archived by the owner on Jul 4, 2025. It is now read-only.
README.md: 41 additions & 43 deletions
@@ -17,11 +17,9 @@
 - Quick Setup: Approximately 10-second initialization for swift deployment.
 - Enhanced Web Framework: Incorporates drogon cpp to boost web service efficiency.
 
-## Documentation
-
 ## About Nitro
 
-Nitro is a light-weight integration layer (and soon to be inference engine) for cutting edge inference engine, make deployment of AI models easier than ever before!
+Nitro is a high-efficiency C++ inference engine for edge computing, powering [Jan](https://jan.ai/). It is lightweight and embeddable, ideal for product integration.
 
 The binary of nitro after zipped is only ~3mb in size with none to minimal dependencies (if you use a GPU need CUDA for example) make it desirable for any edge/server deployment 👍.
 
@@ -40,37 +38,57 @@ The binary of nitro after zipped is only ~3mb in size with none to minimal depen
 
 ## Quickstart
 
-**Step 1: Download Nitro**
+**Step 1: Install Nitro**
 
-To use Nitro, download the released binaries from the release page below:
 Double-click on Nitro to run it. After downloading your model, make sure it's saved to a specific path. Then, make an API call to load your model into Nitro.
+```bash title="Run Nitro server"
+nitro
+```
 
+**Step 4: Load model**
 
-```zsh
-curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \
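The `loadmodel` curl command above is cut off mid-command in this diff, so the request body is not shown. As an illustration only, here is a minimal Python sketch of a request to that endpoint. The JSON field names (`llama_model_path`, `ctx_len`, `ngl`) are assumptions based on common Nitro usage, not taken from this diff; check them against your Nitro version's docs.

```python
import json
import urllib.request

# Hypothetical sketch of the loadmodel request the truncated curl above would
# send. Field names are assumed, not confirmed by this diff.
def build_loadmodel_request(model_path: str, ctx_len: int = 2048, ngl: int = 32):
    body = json.dumps({
        "llama_model_path": model_path,  # path where you saved your model
        "ctx_len": ctx_len,              # context window size
        "ngl": ngl,                      # GPU layers to offload (0 = CPU only)
    }).encode("utf-8")
    return urllib.request.Request(
        "http://localhost:3928/inferences/llamacpp/loadmodel",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_loadmodel_request("/path/to/model.gguf")
# urllib.request.urlopen(req) would send it once the Nitro server is running.
```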
docs/docs/examples/chatbox.md: 44 additions & 4 deletions
@@ -2,10 +2,50 @@
 title: Nitro with Chatbox
 ---
 
-:::info COMING SOON
-:::
+This guide demonstrates how to integrate Nitro with Chatbox, showcasing the compatibility of Nitro with various platforms.
 
-<!--
 ## What is Chatbox?
+
+Chatbox is a versatile desktop client that supports multiple cutting-edge Large Language Models (LLMs). It is available for Windows, Mac, and Linux operating systems.
 
-## How to use Nitro as backend -->
+For more information, please visit the [Chatbox official GitHub page](https://github.com/Bin-Huang/chatbox).
+
+## Downloading and Installing Chatbox
+
+To download and install Chatbox, follow the instructions available at this [link](https://github.com/Bin-Huang/chatbox#download).
+
+## Using Nitro as a Backend
+
+1. Start Nitro server
+
+Open your command line tool and enter:
+```
+nitro
+```
+
+> Ensure you are using the latest version of [Nitro](new/install.md)
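Before pointing Chatbox at Nitro, it can help to confirm the server actually came up. A small sketch, assuming Nitro exposes a health-check endpoint at `/healthz` on port 3928 (an assumption to verify against your Nitro version):

```python
import urllib.request
import urllib.error

# Assumed health-check path (/healthz) and default port (3928); both are
# assumptions, not confirmed by this diff.
def nitro_is_up(base_url: str = "http://localhost:3928") -> bool:
    try:
        with urllib.request.urlopen(base_url + "/healthz", timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns `False`, start the server with `nitro` and retry before configuring Chatbox.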
docs/docs/new/about.md: 1 addition & 5 deletions
@@ -1,6 +1,6 @@
 ---
 title: About Nitro
-slug: /docs
+slug: /about
 ---
 
 Nitro is a high-efficiency C++ inference engine for edge computing, powering [Jan](https://jan.ai/). It is lightweight and embeddable, ideal for product integration.
@@ -119,7 +119,3 @@ Nitro welcomes contributions in various forms, not just coding. Here are some wa
 
-[drogon](https://github.com/drogonframework/drogon): The fast C++ web framework
-[llama.cpp](https://github.com/ggerganov/llama.cpp): Inference of LLaMA model in pure C/C++
+### 1. Is Nitro the same as Llama.cpp with an API server?
+
+Yes. However, Nitro isn't limited to just Llama.cpp; it will soon integrate multiple other models like Whisper, Bark, and Stable Diffusion, all in a single binary. This eliminates the need for you to develop a separate API server on top of AI models. Nitro is a comprehensive solution, designed for ease of use and efficiency.
+
+### 2. Is Nitro simply Llama-cpp-python?
+
+No. Nitro isn't bound to Python, which allows you to leverage high-performance software that fully utilizes your system's capabilities. With Nitro, learning how to deploy a Python web server or use FastAPI isn't necessary. The Nitro web server is already fully optimized.
+
+### 3. Why should I switch to Nitro over Ollama?
+
+While Ollama provides similar functionality, it serves a different purpose and has a larger distribution size (around 200MB) compared to Nitro's 3MB. Nitro's compact size allows for easy embedding into subprocesses, so package size is a minimal concern for your application. This makes Nitro a better fit where efficiency and minimal resource usage are key.
+
+### 4. Why is the model named "chat-gpt-3.5"?
+
+Many applications implement the OpenAI ChatGPT API, and we want Nitro to be versatile for any AI client. While you can use any model name, we've ensured that if you're already using the ChatGPT API, switching to Nitro is seamless: just replace api.openai.com with localhost:3928 in your client settings (such as Chatbox, SillyTavern, Oobabooga, etc.), and it will work smoothly with Nitro.
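The base-URL swap described in FAQ 4 can be sketched concretely. The snippet below builds an OpenAI-style chat request and only changes the host; the `/v1/chat/completions` path mirrors the OpenAI API, and whether your Nitro build serves that exact path is an assumption to verify against its docs.

```python
import json
import urllib.request

OPENAI_BASE = "https://api.openai.com"
NITRO_BASE = "http://localhost:3928"  # point your client here instead

# Sketch of an OpenAI-compatible chat request; the endpoint path and model
# name follow the OpenAI convention described in the FAQ, not this diff.
def chat_request(base_url: str, messages: list) -> urllib.request.Request:
    body = json.dumps({"model": "chat-gpt-3.5", "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Same request shape for both backends; only the host differs.
req = chat_request(NITRO_BASE, [{"role": "user", "content": "Hello"}])
```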