
Commit e0b37b4

new post on local ai server - 2025-09-06 22:00:46
1 parent 372bfc9 commit e0b37b4

5 files changed

Lines changed: 119 additions & 59 deletions

File tree

  • _freeze/posts/buildingapowerfulenoughlocalaiserver/index/execute-results
  • drafts
    • buildingapowerfulenoughlocalaiserver
    • programaticallyaddingtabsetsinquarto
  • posts/buildingapowerfulenoughlocalaiserver

_freeze/posts/buildingapowerfulenoughlocalaiserver/index/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,8 +1,8 @@
11
{
2-
"hash": "7292ea7890b433f95c7ed85a563373b3",
2+
"hash": "2910e3d5e25105865252e02476a5f3c0",
33
"result": {
44
"engine": "jupyter",
5-
"markdown": "---\ntitle: Building a local AI server\nformat: html\ntoc: true\nlang: en\njupyter: python3\nipynb-shell-interactivity: all\nexecute:\n echo: false\ndate: 2025-06-03 06:50 +0200\ncategories: [\"ai\"]\ntags: [\"ollama\", \"aihardware\"]\ncomments:\n giscus:\n repo: jeev20/jeev20.github.io\n---\n\n\n## Requirements\n\n### Hardware\nMy requirements were quite basic:\n\n* A minimum of 16GB VRAM (preferably Nvidia)\n* A minimum of 16GB RAM\n* A minimum of 6 cores / 12 threads Ryzen CPU\n* A motherboard with 2 PCI GPU slots (does not need to be 16 lanes PCI)\n* A minimum of 600 watt power supply\n* A wi-fi smart-plug with scheduling capabilities\n* A motherboard which support power-on after power restoration\n\n\n### Software\n* Linux OS with a long-term support and used by a lot of users\n* Linux OS with Tailscale native support (systemd)\n* Linux OS with easy installation of Docker, Nvidia Container Toolkit and Cuda Toolkit\n* Linux OS with OpenSSH server to manage server remotely\n* Linux OS with crontab to schedule running of scripts\n \n\n## Server \nI ended up purchasing a second-hand PC with all the above requirements and then purchased two RTX 3060 GPUs with 12 GB Vram each. The RAM capacity is somewhat low for this use-case and I plan to upgrade it in the future. \n\n\n## Scheduling\n\n## Updates\nAs everything runs on an Ubuntu LTS 24.04. \n\n## Docker\n\n## Ollama and OpenWebUi Bundle\n\n## Docker exec\n\n## Power automation\n\n",
5+
    "markdown": "---\ntitle: Building a local AI server\nformat: html\ntoc: true\nlang: en\njupyter: python3\nipynb-shell-interactivity: all\nexecute:\n echo: false\ndate: 2025-05-15 06:50 +0200\ncategories: [\"ai server\"]\ntags: [\"ollama\", \"aihardware\", \"machine learning hardware\"]\ncomments:\n giscus:\n repo: jeev20/jeev20.github.io\n---\n\nI have been an avid tinkerer with home PCs. I enjoy building and configuring them. Naturally, as LLMs become a developer necessity, I wanted to build an economical yet performant LLM server for my home lab.\n\n## Requirements\n\n### Hardware\nMy requirements were quite basic:\n\n* A minimum of 16GB VRAM (preferably Nvidia)\n* A minimum of 16GB RAM\n* A minimum of 6 cores / 12 threads Ryzen CPU\n* A motherboard with 2 PCIe GPU slots (does not need to be 16 lanes)\n* A minimum of 600 watt power supply\n* A Wi-Fi smart plug with scheduling capabilities\n* A motherboard which supports power-on after power restoration\n\n\n### Software\n* Linux OS with long-term support and a large user base\n* Linux OS with Tailscale native support (systemd)\n* Linux OS with easy installation of Docker, Nvidia Container Toolkit and CUDA Toolkit\n* Linux OS with an OpenSSH server to manage the server remotely\n* Linux OS with crontab to schedule scripts\n \n\n## Server \nI ended up purchasing a second-hand PC meeting all the above requirements and then bought two used RTX 3060 GPUs with 12 GB of VRAM each. The combined 24 GB of VRAM is plenty for my use cases. In the future, this may be upgraded to 32 GB or more. \n\nThe RAM capacity (16 GB) is somewhat low for this use case and I plan to upgrade it in the future. \n\nThe PC cost me 1500 NOK and the two GPUs 5200 NOK. The total cost was 6700 NOK, which is equivalent to about 666 USD. I am sure it is a fantastic price for the performance it offers.\n\nTo ensure stable internet connectivity, I chose to connect the server directly to a UniFi router, which is part of a mesh network, via an Ethernet cable. \n\n\n![LLM Server - a sleeper build](images/LLMServer.jpg)\n\nAlthough it would have been great to connect to the main router, it is placed up in the attic, which can get quite cold in the winters. Also, I do not wish to go up there whenever I have to access this server. \n\n### Scheduling\nI have a cron job set up to shut down the server at around 01:00 every night. Simultaneously, a smart Wi-Fi plug switches off the power socket. To wake the server in the morning, the smart plug powers the socket back on, and in the server BIOS I have enabled the \"wake on power up\" setting. \n\nThis ensures that when the smart plug is in the on position, the server starts up. \n\nThis method of optimizing power usage has been working without any hiccups for several months now, and I recommend it to anyone looking to get into home-labbing. \n\n### Updates\nAs everything runs on Ubuntu 24.04 LTS, I have very few updates to make manually. \n\n### Combining Ollama and Open WebUI\n\nFrom the Open WebUI repository, I chose to run this configuration: \n\n```{.bash}\nsudo docker run -d -p 3000:8080 -p 11434:11434 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always -e OLLAMA_HOST=0.0.0.0 ghcr.io/open-webui/open-webui:ollama\n```\n\nTo keep this installation up to date, I use `watchtower`:\n```{.bash}\nsudo docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui\n```\n### Docker exec\nOf course, an LLM server should be able to fetch the newest open-source models. I do this with a simple Ollama command. \n```{.bash}\nsudo docker exec open-webui ollama pull gpt-oss:latest\n```\n\n\n### Remote access\nTo access my LLM server from outside my home network, I use [Tailscale](https://tailscale.com/). Tailscale is an amazing technology which creates encrypted tunnels across devices. \n\nMy phone is also part of this `tailnet`, which allows it to connect to my LLM server from any public network. Tailscale automatically updates to the latest version when I run `apt update && apt upgrade`, ensuring minimal update overhead. \n\n## Usage and experience\n\nOverall, I am quite happy with how this server performs. I use it for both machine learning tasks and LLM experiments. Since GitHub Copilot has an agent mode which can be configured with Ollama, this server serves the models to VS Code when I need it. \n\nI have also configured this server as the LLM provider in my [Marimo](https://marimo.io/) notebooks. \n\nI am looking forward to many wonderful evenings of exploring LLMs and machine learning on this nifty little server. \n\n",
66
"supporting": [
77
"index_files"
88
],

drafts/buildingapowerfulenoughlocalaiserver/index.qmd

Lines changed: 0 additions & 57 deletions
This file was deleted.
Lines changed: 20 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,20 @@
1+
---
2+
title: Programmatically adding tabsets in Quarto
3+
description: A walkthrough of using the embed option to create dashboards
4+
author: "Jeevith Hegde"
5+
date: 2025-08-25 15:33 +0200
6+
format: html
7+
toc: true
8+
lang: en
9+
jupyter: python3
10+
ipynb-shell-interactivity: all
11+
execute:
12+
echo: false
13+
categories: ["quarto"]
14+
tags: ["tutorial"]
15+
comments:
16+
giscus:
17+
repo: jeev20/jeev20.github.io
18+
---
19+
20+
Test
Lines changed: 97 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,97 @@
1+
---
2+
title: Building a local AI server
3+
format: html
4+
toc: true
5+
lang: en
6+
jupyter: python3
7+
ipynb-shell-interactivity: all
8+
execute:
9+
echo: false
10+
date: 2025-05-15 06:50 +0200
11+
categories: ["ai server"]
12+
tags: ["ollama", "aihardware", "machine learning hardware"]
13+
comments:
14+
giscus:
15+
repo: jeev20/jeev20.github.io
16+
---
17+
18+
I have been an avid tinkerer with home PCs. I enjoy building and configuring them. Naturally, as LLMs become a developer necessity, I wanted to build an economical yet performant LLM server for my home lab.
19+
20+
## Requirements
21+
22+
### Hardware
23+
My requirements were quite basic:
24+
25+
* A minimum of 16GB VRAM (preferably Nvidia)
26+
* A minimum of 16GB RAM
27+
* A minimum of 6 cores / 12 threads Ryzen CPU
28+
* A motherboard with 2 PCIe GPU slots (does not need to be 16 lanes)
29+
* A minimum of 600 watt power supply
30+
* A Wi-Fi smart plug with scheduling capabilities
31+
* A motherboard which supports power-on after power restoration
32+
33+
34+
### Software
35+
* Linux OS with long-term support and a large user base
36+
* Linux OS with Tailscale native support (systemd)
37+
* Linux OS with easy installation of Docker, Nvidia Container Toolkit and CUDA Toolkit
38+
* Linux OS with an OpenSSH server to manage the server remotely
39+
* Linux OS with crontab to schedule scripts
40+
41+
42+
## Server
43+
I ended up purchasing a second-hand PC meeting all the above requirements and then bought two used RTX 3060 GPUs with 12 GB of VRAM each. The combined 24 GB of VRAM is plenty for my use cases. In the future, this may be upgraded to 32 GB or more.
44+
45+
The RAM capacity (16 GB) is somewhat low for this use case and I plan to upgrade it in the future.
46+
47+
The PC cost me 1500 NOK and the two GPUs 5200 NOK. The total cost was 6700 NOK, which is equivalent to about 666 USD. I am sure it is a fantastic price for the performance it offers.
48+
49+
To ensure stable internet connectivity, I chose to connect the server directly to a UniFi router, which is part of a mesh network, via an Ethernet cable.
50+
51+
52+
![LLM Server - a sleeper build](images/LLMServer.jpg)
53+
54+
Although it would have been great to connect to the main router, it is placed up in the attic, which can get quite cold in the winters. Also, I do not wish to go up there whenever I have to access this server.
55+
56+
### Scheduling
57+
I have a cron job set up to shut down the server at around 01:00 every night. Simultaneously, a smart Wi-Fi plug switches off the power socket. To wake the server in the morning, the smart plug powers the socket back on, and in the server BIOS I have enabled the "wake on power up" setting.
58+
59+
This ensures that when the smart plug is in the on position, the server starts up.
60+
61+
This method of optimizing power usage has been working without any hiccups for several months now, and I recommend it to anyone looking to get into home-labbing.
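As a sketch of this nightly shutdown, a cron entry in root's crontab could look like the following (the exact schedule and shutdown path are assumptions, not taken from the post):

```{.bash}
# Edit root's crontab with: sudo crontab -e
# m h dom mon dow  command  -- halt the machine at 01:00 every night
0 1 * * * /sbin/shutdown -h now
```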
62+
63+
### Updates
64+
As everything runs on Ubuntu 24.04 LTS, I have very few updates to make manually.
65+
66+
### Combining Ollama and Open WebUI
67+
68+
From the Open WebUI repository, I chose to run this configuration:
69+
70+
```{.bash}
71+
sudo docker run -d -p 3000:8080 -p 11434:11434 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always -e OLLAMA_HOST=0.0.0.0 ghcr.io/open-webui/open-webui:ollama
72+
```
73+
74+
To keep this installation up to date, I use `watchtower`:
75+
```{.bash}
76+
sudo docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
77+
78+
```
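The watchtower command above runs a single check; to repeat it automatically, the same command can be dropped into a root cron entry. A sketch with an arbitrary weekly schedule (the timing is my assumption, not something the post prescribes):

```{.bash}
# Hypothetical schedule: check for a new Open WebUI image every Sunday at 04:00
0 4 * * 0 docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
```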
79+
### Docker exec
80+
Of course, an LLM server should be able to fetch the newest open-source models. I do this with a simple Ollama command.
81+
```{.bash}
82+
sudo docker exec open-webui ollama pull gpt-oss:latest
83+
```
84+
85+
86+
### Remote access
87+
To access my LLM server from outside my home network, I use [Tailscale](https://tailscale.com/). Tailscale is an amazing technology which creates encrypted tunnels across devices.
88+
89+
My phone is also part of this `tailnet`, which allows it to connect to my LLM server from any public network. Tailscale automatically updates to the latest version when I run `apt update && apt upgrade`, ensuring minimal update overhead.
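As an illustration of what a client on the tailnet can do, the sketch below prepares a request for Ollama's `/api/generate` endpoint on port 11434, which the earlier `docker run` command exposes. The hostname `llm-server` is a hypothetical tailnet (MagicDNS) name, not my server's real one:

```{.python}
import json
import urllib.request

# Hypothetical tailnet hostname; replace with your server's MagicDNS name.
OLLAMA_URL = "http://llm-server:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Prepare a non-streaming generate request for Ollama's HTTP API."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )

req = build_request("gpt-oss:latest", "Say hello")
print(req.full_url)
# urllib.request.urlopen(req) would send it; run this from a device on the tailnet.
```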
90+
91+
## Usage and experience
92+
93+
Overall, I am quite happy with how this server performs. I use it for both machine learning tasks and LLM experiments. Since GitHub Copilot has an agent mode which can be configured with Ollama, this server serves the models to VS Code when I need it.
94+
95+
I have also configured this server as the LLM provider in my [Marimo](https://marimo.io/) notebooks.
96+
97+
I am looking forward to many wonderful evenings of exploring LLMs and machine learning on this nifty little server.
