---
title: Create a llamafile service in Linux
description: A walkthrough of running llamafile as a service
author: "Jeevith Hegde"
date: 2025-11-04 19:08 +0200
format: html
toc: true
lang: en
jupyter: python3
ipynb-shell-interactivity: all
execute:
  echo: false
categories: ["llm"]
tags: ["local llm", "llamafile", "self hosted", "servicefile"]
comments:
  giscus:
    repo: jeev20/jeev20.github.io
---


## Background

In order to run llamafile as a service, we have to create a system account that has access to the model folder and can run the server command on every boot. We will call this user `llamafile`.

Below I show how a fine-tuned Phi-4 model can be served.

```bash
sudo useradd -r -s /usr/sbin/nologin -U -m -d /data/LLM/phi4_finetuning llamafile
```

Let's move the llamafile binary to a better place, make it executable, and hand ownership to the new user:

```bash
sudo mv /data/LLM/phi4_finetuning/llamafile-0.9.3 /usr/local/bin/llamafile
sudo chmod +x /usr/local/bin/llamafile
sudo chown llamafile:llamafile /usr/local/bin/llamafile
```

We will also create log files for the service and make the `llamafile` user their owner:

```bash
sudo touch /var/log/llamafile.log /var/log/llamafile.err
sudo chown llamafile:llamafile /var/log/llamafile.*
```

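Since the unit will append to these files indefinitely, they will grow without bound. One way to keep them in check is a logrotate rule; below is a minimal sketch, assuming logrotate is installed and the rule is saved as e.g. `/etc/logrotate.d/llamafile` (a file name chosen here for illustration):

```
/var/log/llamafile.log /var/log/llamafile.err {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
```

`copytruncate` lets logrotate rotate the files in place, so the running service does not need to reopen its file descriptors.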
## Wrap our commands in a shell script

```bash
sudo nano /usr/local/bin/llamafile-wrapper.sh
```

Paste the following command into the shell script:

```bash
#!/bin/bash
exec /usr/local/bin/llamafile -m /data/LLM/phi4_finetuning/unsloth.Q4_K_M.gguf -ngl 9999 --gpu nvidia --server --v2 -l 0.0.0.0:8080 --temp 0
```

Make it executable so that the user we created previously can invoke this script on every boot:

```bash
sudo chmod +x /usr/local/bin/llamafile-wrapper.sh
```

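Before wiring the script into systemd, it can save a debugging round trip to run it once in the foreground as the service user. This is just a sanity check, not part of the setup itself:

```bash
# run the wrapper as the llamafile user; press Ctrl+C to stop the server
sudo -u llamafile /usr/local/bin/llamafile-wrapper.sh
```

If the model loads and the server starts here, it will also start cleanly under systemd.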
## Service file contents

Then we create a service file that runs the shell script. As we want the service to be restarted automatically if it ever exits, we set `Restart=always`.

```ini
[Unit]
Description=Llamafile v2 Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/llamafile-wrapper.sh
Restart=always
RestartSec=10
User=llamafile
WorkingDirectory=/data/LLM/phi4_finetuning
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
StandardOutput=append:/var/log/llamafile.log
StandardError=append:/var/log/llamafile.err

[Install]
WantedBy=multi-user.target
```


Now we save this content to `/etc/systemd/system/llamafile.service`.

We can create the file and paste in the above content by using

```bash
sudo nano /etc/systemd/system/llamafile.service
```

Paste the content from the clipboard, then press Ctrl+X, followed by Y and Enter to save.


Now the service is created, but we still have to enable and start it. We run the following commands:

```bash
sudo systemctl daemon-reload
sudo systemctl enable llamafile
sudo systemctl start llamafile
```

To check the status of the service:

```bash
sudo systemctl status llamafile
```

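Beyond systemd's view of the process, you can check that the server actually answers on the port configured in the wrapper script (`-l 0.0.0.0:8080`). Any HTTP status code in the response means the process is up and listening:

```bash
# probe the llamafile server; prints the HTTP status code of the response
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
```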
Stop the service by using

```bash
sudo systemctl stop llamafile
```

Disable the service

```bash
sudo systemctl disable llamafile
```

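If the service fails to start or keeps restarting, the log files we created earlier and the systemd journal are the first places to look:

```bash
# last lines of the server's error output
tail -n 50 /var/log/llamafile.err
# systemd's own record of the unit, including start/stop events
sudo journalctl -u llamafile --no-pager -n 50
```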
## Future updates

To change the model or the server flags, you now only need to update the command in the shell script `/usr/local/bin/llamafile-wrapper.sh`. Make the edits in nano or any text editor:

```bash
sudo nano /usr/local/bin/llamafile-wrapper.sh
```

Once the new command is in place, restart the service. Since the unit file itself is unchanged, a restart is enough; `daemon-reload` is only needed when you edit `/etc/systemd/system/llamafile.service`:

```bash
sudo systemctl restart llamafile
```

That is it! This post showed an example for llamafile, but this technique is useful for creating any kind of service file.