Personal AI Server is a project for serving an AI server behind an authenticated proxy.
It uses ollama as the AI server, placed behind a proxy that requires an `Authorization` header, but ollama can be swapped for any other AI server that exposes HTTP endpoints.
- Ollama server accessible through a proxy
- Dockerised setup for simple deployment
Before setting up the project, ensure you have the following installed:
- Docker
- Docker Compose
- A host computer that can run Docker (e.g. macOS, Linux, a public server, etc.)
- Fork/Copy/Clone this repository
- Update `/api-keys` and add the keys that you allow to access your endpoint. For example:

```
12312-323214-432324-23432
aslkdnk-321kn-321nkasd-321
```
- (Optional) Generate keys using `openssl`:

```shell
echo "ak-auth-$(openssl rand -hex 16)"
```

and use this value as your API key.
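If you need several keys at once, the same command can be looped; a minimal sketch, assuming the `api-keys` file sits at the repository root:

```shell
# Generate three keys in the ak-auth-<32 hex chars> format and append
# them to the api-keys file (the ./api-keys path is an assumption).
for _ in 1 2 3; do
  echo "ak-auth-$(openssl rand -hex 16)"
done >> api-keys
```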
- Run the server by starting the Docker containers (ollama, auth-server, caddy) using `docker compose`:

```shell
docker compose up -d
```

This uses `docker-compose.yml` and runs your server on port 9000 by default.
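The three containers fit together roughly as sketched below. This is only an illustration of what `docker-compose.yml` may contain — the image tags, build paths, internal ports, and volume names here are assumptions, not the project's actual values:

```yaml
services:
  ollama:
    image: ollama/ollama        # AI server, only reachable inside the compose network
    volumes:
      - ollama-data:/root/.ollama
  auth-server:
    build: ./auth-server        # checks the Authorization header against /api-keys
  caddy:
    image: caddy:2
    ports:
      - "9000:9000"             # the only port exposed to the host
    depends_on:
      - ollama
      - auth-server
volumes:
  ollama-data:
```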
Since this server hides the ollama server behind an authenticated proxy, the endpoints and requests are the same as when talking to a plain ollama server.
The difference is that every request now requires an `Authorization` header, which prevents others from using your AI server, especially if you are planning to host it publicly.
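One way such a proxy can be wired up in Caddy is with its `forward_auth` directive; a hedged sketch of what the Caddyfile might look like (the auth-server address and verification path are assumptions, and 11434 is ollama's default port):

```
:9000 {
	# Every request is first sent to the auth-server; only requests it
	# approves (valid Authorization header) are proxied on to ollama.
	forward_auth auth-server:8080 {
		uri /verify
	}
	reverse_proxy ollama:11434
}
```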
An example cURL request for your server would look something like this:
```shell
curl -X POST http://localhost:9000/api/generate \
  -H "Authorization: your-secret-api-key" \
  -d '{"model": "llama3.2", "prompt": "Why is the sky blue? Answer in 1 sentence", "options": {"temperature": 0.7}}'
```

If you're running this project on a Mac, the ollama server running inside the Docker container won't be able to utilise your Mac's GPU.
Whilst the default setup works, ollama inside the Docker container will only use the Mac's CPU rather than its GPU, making inference slower.
For better performance, it is advised to run ollama natively on your Mac instead of inside a Docker container.
- Download ollama or install it using `brew`:

```shell
brew install ollama
```
- Run ollama using the app shortcut or the command line:

```shell
ollama serve
```
- Pull a model:

```shell
ollama run llama3.2
```
- Run the personal-ai-server using `docker-compose-mac-native.yml`:

```shell
docker compose -f docker-compose-mac-native.yml up -d
```
This uses a different configuration in which the caddy proxy redirects authenticated requests to the ollama server running natively on your Mac, so no ollama container is started.
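On macOS, containers can reach services on the host via `host.docker.internal`, so the Mac-native compose file plausibly drops the ollama service and points caddy at the host instead; a sketch under that assumption (the actual file may differ):

```yaml
services:
  auth-server:
    build: ./auth-server        # build path is an assumption
  caddy:
    image: caddy:2
    ports:
      - "9000:9000"
# Caddy would then proxy authenticated requests to the native ollama at
# http://host.docker.internal:11434 (ollama's default port); Docker Desktop
# on macOS resolves host.docker.internal automatically.
```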
One of the aims of this project is to make your own AI server publicly accessible.
You can use zrok to host your server and expose it to the internet via a public URL.
- Sign up for a zrok account
- Log in to zrok and obtain your secret account token
- Come up with a unique address token. This will be useful to have a consistent URL to access your server.
- Run the script found in `scripts/setup_zrok.sh`:

```shell
chmod +x scripts/setup_zrok.sh
scripts/setup_zrok.sh --token YOUR_SECRET_TOKEN uniqueaddress123
```

- Once successful, you should be able to access your AI server publicly at `https://uniqueaddress123.share.zrok.io`
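The same authenticated request works against the public URL. A small wrapper function keeps the key and URL out of the command itself; `ai_generate`, `AI_SERVER_URL`, and `AI_API_KEY` are hypothetical names for this sketch, not part of the project:

```shell
# Hypothetical helper: reads the server URL and API key from environment
# variables and POSTs a prompt to the /api/generate endpoint.
ai_generate() {
  curl -sf -X POST "${AI_SERVER_URL:-http://localhost:9000}/api/generate" \
    -H "Authorization: ${AI_API_KEY:?AI_API_KEY is not set}" \
    -d "{\"model\": \"llama3.2\", \"prompt\": \"$1\", \"stream\": false}"
}
```

For example: `AI_SERVER_URL=https://uniqueaddress123.share.zrok.io AI_API_KEY=your-secret-api-key ai_generate "Why is the sky blue?"`. The `:?` expansion makes the call fail fast when no key is set.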
Contributions are welcome! Feel free to open issues or submit pull requests.
- Make the Ollama model a variable instead of defaulting to `llama3.2`
This project is licensed under the MIT Licence.