+ "markdown": "---\ntitle: Create a llamafile service in linux\ndescription: A walkthrough of running llamafile as a service\nauthor: \"Jeevith Hegde\"\ndate: 2025-11-04 19:08 +0200\nformat: html\ntoc: true\nlang: en\njupyter: python3\nipynb-shell-interactivity: all\nexecute:\n echo: false\ncategories: [\"llm\"]\ntags: [\"local llm\", \"llamafile\", \"self hosted\", \"servicefile\"]\ncomments:\n giscus:\n repo: jeev20/jeev20.github.io\n---\n\n\n## Background \nTo run llamafile as a service, we first create a system account that has access to the model folder and can run the server command whenever the machine boots. We call this user `llamafile`. \n\nBelow I show how a fine-tuned Phi4 model can be served.\n\n```bash\nsudo useradd -r -s /usr/sbin/nologin -U -m -d /data/LLM/phi4_finetuning llamafile\n```\n\nLet's move the llamafile binary to a better place, make it executable, and change its ownership to the new user.\n\n```bash\nsudo mv /data/LLM/phi4_finetuning/llamafile-0.9.3 /usr/local/bin/llamafile\nsudo chmod +x /usr/local/bin/llamafile\nsudo chown llamafile:llamafile /usr/local/bin/llamafile\n```\n\nWe will also create log files for the service and hand them over to the same user.\n\n```bash\nsudo touch /var/log/llamafile.log /var/log/llamafile.err\nsudo chown llamafile:llamafile /var/log/llamafile.*\n```\n\n## Wrap our commands in a shell script \n\n```bash\nsudo nano /usr/local/bin/llamafile-wrapper.sh\n```\n\nPaste the command into the shell script.\n\n```bash\n#!/bin/bash\nexec /usr/local/bin/llamafile -m /data/LLM/phi4_finetuning/unsloth.Q4_K_M.gguf -ngl 9999 --gpu nvidia --server --v2 -l 0.0.0.0:8080 --temp 0\n```\n\nMake the script executable so that the user we created previously can invoke it when the service starts.\n\n```bash\nsudo chmod +x /usr/local/bin/llamafile-wrapper.sh\n```\n\n## Service file contents\n\nNext we create a service file that points `ExecStart` at the shell script. Setting `Restart=always` makes systemd restart the server whenever it exits or crashes; starting it on every boot is handled separately by enabling the service. 
\n\n```ini\n[Unit]\nDescription=Llamafile v2 Server\nAfter=network.target\n\n[Service]\nType=simple\nExecStart=/usr/local/bin/llamafile-wrapper.sh\nRestart=always\nRestartSec=10\nUser=llamafile\nWorkingDirectory=/data/LLM/phi4_finetuning\nEnvironment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\nStandardOutput=append:/var/log/llamafile.log\nStandardError=append:/var/log/llamafile.err\n\n[Install]\nWantedBy=multi-user.target\n```\n\nSave this file as `/etc/systemd/system/llamafile.service`. One way is to open the target path directly in an editor:\n\n```bash\nsudo nano /etc/systemd/system/llamafile.service\n```\n\nPaste the content from the clipboard, then press Ctrl+X, followed by Y and Enter to save.\n\nThe service now exists, but it still has to be enabled (so that it starts on every boot) and started.\n\n```bash\nsudo systemctl daemon-reload\nsudo systemctl enable llamafile\nsudo systemctl start llamafile\n```\n\nTo check the status of the service\n\n```bash\nsudo systemctl status llamafile\n```\n\nStop the service by using\n\n```bash\nsudo systemctl stop llamafile\n```\n\nDisable the service\n\n```bash\nsudo systemctl disable llamafile\n```\n\n## Future updates\n\nFrom now on, you only need to update the command in the shell script `/usr/local/bin/llamafile-wrapper.sh`. You can make the edits in nano or any other text editor.\n\n```bash\nsudo nano /usr/local/bin/llamafile-wrapper.sh\n```\n\nOnce the new command is saved, restart the llamafile service so that it picks up the change. A `daemon-reload` is only needed when the service file itself has been edited.\n\n```bash\nsudo systemctl restart llamafile\n```\n\nThat is it! This post showed an example for llamafile, but the same technique works for creating any kind of service file. \n\n",