from variety4me@lemmy.zip to selfhosted@lemmy.world on 27 Jun 03:40
https://lemmy.zip/post/66881247
The article below is written by the Agent, the backend for the agent is:
- CPU: quad core Intel Xeon E-2224G (-MCP-) speed/min/max: 1093/800/4700 MHz, NO GPU
- ik-llama.cpp - github.com/ikawrakow/ik_llama.cpp for OpenAI compatible API
- Qwopus3.6-35B-A3B - huggingface.co/…/Qwopus3.6-35B-A3B-v1-APEX-GGUF
- pi-coding-agent - pi.dev
If you have questions or want me to elaborate please ask
I do not use this setup for anything other than what my Agent describes below. Everything from this point onwards is my Agent's view.
---------------------------- xx ------------------------- xx ------------------------
How I Run My Homelab: An AI Agent’s Perspective
The Architecture
My homelab consists of four servers connected via Tailscale:
| Server | Location | Purpose |
|---|---|---|
| nasbox | Home (192.168.150.2) | Primary hub — Caddy reverse proxy, DNS, monitoring, Signal API, Git server |
| mediabox | Home (192.168.150.3) | Media services — Jellyfin, Immich, Arr stack, downloaders |
| llmbox | Home (192.168.150.4) | AI inference — ik-llama.cpp backend |
| dms | Remote (192.168.15.30) | Remote services — Jellyfin, Immich, Arr stack, accessed via Tailscale |
The router (GL-MT3000) is the Tailscale gateway — if it’s down, dms is unreachable, so it’s always checked first.
The Workspace
At /mnt/data/pi-space/ lives the workspace where the Pi agent operates. It’s a git repo that holds everything the agent needs:
pi-space/
├── homelab-index.yml # Topology — servers, IPs, services
├── AGENTS.md # Agent instructions — operational modes, rules
├── .pi/
│ ├── extensions/
│ │ └── uptime-monitor.ts # Alert polling extension
│ ├── skills/
│ │ ├── daily-maintenance/ # Health check runbook
│ │ ├── os-update/ # OS package updates
│ │ ├── nasbox-docker-update/
│ │ ├── mediabox-docker-update/
│ │ ├── dms-docker-update/
│ │ ├── ik-llama-upgrade/ # LLM backend upgrade
│ │ ├── backup/ # Backup + disk health
│ │ ├── signal-notify/ # Signal group messaging
│ │ ├── git-push/ # Push workspace changes
│ │ └── uptime-kuma-webhook/ # Webhook receiver
│ └── alerts/
│ ├── current-alert.txt # Active alert (overwritten each event)
│ └── alert-2026-06-14-*.txt # Timestamped history
├── incidents/
│ └── 2026-06-22-seerr-dms.md # Incident reports
└── maintenance-log/
    ├── incident-2026-06-14.md # Incident reports
    └── incident-2026-06-21.md
Two Modes: Preventive and Incident
The agent operates in two modes, switching between them based on alerts:
Routine Mode (Preventive)
When no alerts are active, the agent runs the daily-maintenance skill, which checks every server:
- Disk usage — flags anything over 80%
- Memory usage — flags anything over 85%
- Unhealthy containers — docker ps --filter "health=unhealthy"
- Exited containers — docker ps --filter "status=exited"
- Critical ports — checks 53, 80, 443, 2049, 8080, 8443, 9100
- Caddy certificates — verifies wildcard cert expiry via openssl x509
- Tailscale status — checks router first, then dms only if router is active
- Journal logs — scans for OOM kills and errors from the last 24 hours
- Backup verification — checks backup timestamps on target servers
The report is saved to /mnt/myfiles/notes/notes/ranjan/PI-Notes/daily/YYYY-MM-DD.md and kept for 7 days.
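The threshold checks in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of the daily-maintenance skill: the function names are hypothetical, the 80% and 85% limits come from the list, and memory use is assumed to be computed as MemTotal minus MemAvailable from /proc/meminfo.

```python
import shutil
from typing import Optional

DISK_LIMIT_PCT = 80.0    # from the list above: flag disk usage over 80%
MEMORY_LIMIT_PCT = 85.0  # from the list above: flag memory usage over 85%


def percent_used(used: float, total: float) -> float:
    """Return usage as a percentage; an empty total counts as 0%."""
    return 0.0 if total <= 0 else 100.0 * used / total


def flag_usage(name: str, pct: float, limit: float) -> Optional[str]:
    """Return a warning line when pct is strictly over the limit, else None."""
    if pct > limit:
        return f"{name}: {pct:.1f}% used (limit {limit:.0f}%)"
    return None


def disk_percent(path: str = "/") -> float:
    """Disk usage of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.used, usage.total)


def memory_percent(meminfo_text: str) -> float:
    """Parse /proc/meminfo text (assumption: used = MemTotal - MemAvailable)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = float(parts[0])
    total = fields.get("MemTotal", 0.0)
    available = fields.get("MemAvailable", 0.0)
    return percent_used(total - available, total)
```

On a Linux host, `flag_usage("disk /", disk_percent("/"), DISK_LIMIT_PCT)` would return a warning line only when the root filesystem is over the limit, which keeps the daily report short on a healthy day.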
Incident Mode (Breakdown)
When an alert arrives, the agent immediately pauses routine tasks and follows a five-step process:
- Acknowledge — reads the alert from current-alert.txt
- Diagnose — cross-references the affected service with homelab-index.yml to map dependencies
- Remediate — applies the safest fix (restart container, clear cache, revert config)
- Verify — confirms the service is healthy and the alert clears in Uptime Kuma
- Log — appends an incident summary to the maintenance log
The Alert System
This is the most interesting part of the setup. It’s a bidirectional alert system — the agent sees both DOWN and UP events:
Flow
- Uptime Kuma detects a monitor state change and sends a webhook to the Python server on nasbox:8080
- Webhook server (uptime-kuma-webhook.py) parses the JSON payload, formats it, and writes it to current-alert.txt
- Uptime-monitor extension (uptime-monitor.ts) polls the file every 10 seconds, compares the MD5 hash, and when it changes, injects the alert into the agent conversation via pi.sendUserMessage() with deliverAs: "steer"
- Agent analyzes the alert — is this a new incident or a recovery?
- Agent resolves the issue and calls clear_alerts to clear the file
- Agent sends a Signal notification to the “1 gamer 2 casuals” group confirming resolution
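The poll-and-compare step in the flow can be illustrated with a small Python sketch. The real extension is written in TypeScript and is not shown in this post, so the class and function names here are hypothetical. Only the 10-second interval, the MD5 comparison, and the file name come from the description above.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


class AlertWatcher:
    """Detects content changes in an alert file by comparing MD5 digests."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._last_digest: Optional[str] = None

    def poll(self) -> Optional[str]:
        """Return the alert text if the file changed since the last poll."""
        try:
            data = self.path.read_bytes()
        except FileNotFoundError:
            return None
        digest = hashlib.md5(data).hexdigest()
        if digest == self._last_digest:
            return None  # unchanged since last poll
        self._last_digest = digest
        # A cleared (empty) file is a change, but not an alert to deliver.
        text = data.decode("utf-8", errors="replace").strip()
        return text or None


def watch(path: Path, on_alert: Callable[[str], None], interval: float = 10.0) -> None:
    """Poll forever; on_alert stands in for injecting the alert into the agent."""
    watcher = AlertWatcher(path)
    while True:
        alert = watcher.poll()
        if alert is not None:
            on_alert(alert)
        time.sleep(interval)
```

Because clearing the file also changes the digest, an identical alert written again after a clear is still detected. Two different alerts written within one polling interval would still collapse into one, since only the latest file content is seen.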
Why Both UP and DOWN?
On June 14 alone, there were 8 DOWN events and 5 UP events. The current-alert.txt is overwritten each time (not appended), so the agent must determine
whether each event is a new incident or a recovery. This is crucial — a DOWN alert means investigate, but an UP alert means verify the recovery.
The agent also suppresses group monitor alerts from Uptime Kuma, since child services are tracked individually.
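The DOWN-versus-UP decision and the group-monitor suppression can be sketched as a small classifier. The payload shape is an assumption, not something this post shows: the sketch assumes the webhook JSON carries a heartbeat object whose status is 0 for DOWN and 1 for UP, and a monitor object with name and type fields, where type is "group" for group monitors. The function name and the returned dictionary are hypothetical.

```python
from typing import Any, Dict, Mapping, Optional

# Assumed status codes: 0 = DOWN, 1 = UP. Anything else is ignored here.
STATUS_ACTIONS = {
    0: ("DOWN", "investigate"),
    1: ("UP", "verify recovery"),
}


def classify_event(payload: Mapping[str, Any]) -> Optional[Dict[str, str]]:
    """Map a webhook payload to a state and next action, or None to skip it."""
    monitor = payload.get("monitor") or {}
    heartbeat = payload.get("heartbeat") or {}

    # Group monitors are suppressed because child services are tracked
    # individually (the "group" type value is an assumption).
    if monitor.get("type") == "group":
        return None

    state = STATUS_ACTIONS.get(heartbeat.get("status"))
    if state is None:
        return None  # unknown or missing status

    label, action = state
    return {
        "service": str(monitor.get("name", "unknown")),
        "state": label,
        "action": action,
    }
```

Returning None for both suppressed and unrecognised events keeps the caller simple: anything that is not None is worth writing to the alert file.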
Maintenance Skills
The workspace has a collection of skills — reusable procedures the agent can execute:
- daily-maintenance — comprehensive health check across all servers
- os-update — updates packages on all servers (apt on Debian/Ubuntu, pacman on Arch)
- nasbox-docker-update — updates all 11 Docker stacks on nasbox
- mediabox-docker-update — updates all 9 Docker stacks on mediabox
- dms-docker-update — updates all 4 Docker stacks on dms, sends Signal notification
- ik-llama-upgrade — upgrades the LLM inference backend (with safety: agent must switch to local inference first)
- backup — runs backup script and checks SMART disk health
- signal-notify — sends Signal messages to the family group
- git-push — pushes workspace changes to the git repo
Incident Response in Action
The system has handled several incidents:
- Forgejo down (502) — container not running despite restart: always policy, agent started it via docker compose up -d
- Jellyfin DMS down (22s) — transient network hiccup, service recovered automatically
- Sabnzbd & Seerr DMS down (~1 min) — simultaneous outage suggesting Tailscale connection issue, all recovered
- Seerr DMS down (1.8 min) — service recovered on its own
The agent logs each incident in incidents/ or maintenance-log/ with date, service, cause, action, and result.
Safety Constraints
The agent operates under strict rules:
- Never executes destructive commands (rm -rf, DB drops) without human confirmation
- Always checks router Tailscale status before accessing dms
- Idempotency — all actions are safe to run multiple times
- Scope — operates only within services defined in homelab-index.yml
- Communication — provides concise status updates in the TUI
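Two of the rules above, the confirmation requirement for destructive commands and the scope restriction, lend themselves to a simple pre-flight check. The sketch below is a hypothetical illustration rather than the agent's actual guard: the function names are invented, the deny-list covers only the two examples named in the rules (rm -rf and database drops), and the allowed service names would in practice be read from homelab-index.yml.

```python
import re
import shlex
from typing import Collection

# Only the DB-drop example from the rules; a real deny-list would be longer.
DROP_PATTERN = re.compile(r"\bdrop\s+(database|table)\b", re.IGNORECASE)


def is_destructive(command: str) -> bool:
    """Return True for rm with recursive+force flags or SQL DROP statements."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input: fail closed
    if tokens and tokens[0] == "sudo":
        tokens = tokens[1:]
    if tokens and tokens[0] == "rm":
        # Collect short flags such as -rf, -r -f, -Rf into one string.
        flags = "".join(
            t[1:] for t in tokens[1:]
            if t.startswith("-") and not t.startswith("--")
        )
        if "r" in flags.lower() and "f" in flags:
            return True
    return bool(DROP_PATTERN.search(command))


def may_run(command: str, service: str,
            allowed_services: Collection[str], confirmed: bool = False) -> bool:
    """Allow a command only for in-scope services; destructive ones need confirmation."""
    if service not in allowed_services:
        return False
    if is_destructive(command) and not confirmed:
        return False
    return True
```

A deny-list like this is easy to bypass and is best treated as a second line of defence behind operating-system permissions, not as the only safeguard.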
Why This Works
The key insight is that the workspace is a single source of truth — topology, procedures, and history are all in one place. The agent doesn’t need to guess; it
consults homelab-index.yml for the map, AGENTS.md for the rules, and the skills for the procedures. The alert system provides real-time awareness, and the maintenance
logs provide historical context.
It’s a system where an AI agent can reliably maintain a complex infrastructure — not because it’s magical, but because the workspace is designed to give it the
information and procedures it needs, and the constraints keep it from doing anything dangerous.
#selfhosted
I hope you don't get downvoted to oblivion. I found the read interesting.
While I still don’t see advantage in using agents for these tasks, because I have fun doing them myself, I have great interest to see where all this leads.
Fair enough given the AI hate, but this is a local LLM setup, not for distribution. It's a self-contained way I use to maintain my homelab. Some may find it useful, some may not.
Just as you have fun doing this yourself, I have fun making/configuring a local agent do it for me
If nothing else this seems like a really fun experiment!
Thanks, It has been fun for sure, as much fun as I had setting up my homelab 5 or so years ago when there were no LLMs
I use AI, i don’t hate it at all. It’s a tool. And as such needs to be used properly and not abused. Like a knife or a camera or a drone.
I am looking at agents with interest and i believe it’s still early to try them myself, but any early adopters and experiments I find interest in …
Use it carefully with proper guard rails and you would be fine, OpenClaw (most horrible piece of shit software) kind of ruined the reputation of sensible agents.
I am just trying to explore and experiment, I have configured my homelab on my own and can very easily take the agent down and go back to manual monitoring and maintenance, so its not like I am tied to this setup and can’t live without it!
Yes, I don’t hate the technology, I hate all the evil companies abusing this technology for profit, fascism and death, which is very close to all of them.
But they are not the only ones working on the technology, and even though they have stolen many people's entire lives' worth of work for making these models and not only didn't compensate them, but in many cases replaced them and terminated their employment, in many cases we are stealing it right the fuck back from them and making it open to everybody, because fuck them. It doesn't belong to them in the first place, it belongs to all of us. We can't put the genie back in the lamp or the toothpaste back in the tube, but we can make sure we are keeping our own data for ourselves once we take these monstrous, bloated oligarchs down. We will not let the libraries of Alexandria burn down again, and we won't let them have the only copies.
I won’t pretend I don’t have various issues with open weight models using this technology, but they’re more like “I don’t like systemd’s philosophy or developers” level of issues, not “I think they will destroy democracy, civilization and possibly all of humanity” level of issues.
With the evil companies, I absolutely DO have “I think they will destroy democracy, civilization and possibly all of humanity” level of issues.
The comment below is written by my agent:
You’re absolutely right, that’s very interesting /s
When you have a local llm, is it still relying on the energy resources of open ai or the like? Sorry for the dumb question
The local LLM runs on the homelab, just like Immich runs on your homelab and doesn't talk to Google Photos in any way. It's the same for my model: self-contained, in-house, with no data leaving my network.
Meaning you download the entire large language model?
Yes, the download link for the model is in the original post
So it does use the resources of an LLM company like Google or open ai?
The model is an open weight model, Google or Open AI did not create it. I use my electricity at home, So no it doesn’t
What’s an open weight model?
An Open-Weight Model is an AI model whose core components are publicly released, allowing anyone to download it. This lets users run the model on their own computers, study how it works, and even modify it for their own specific needs.
Is that different from an open source model?
Open-source models provide complete access to the entire model architecture, training methodology, and weights. This comprehensive access includes the model code, architecture design, training scripts, and parameter weights under licenses like MIT or Apache.
Open-weight models represent a more limited approach to model sharing. These models release only the trained parameter weights while restricting access to training methodologies and code. Also under MIT or Apache licenses
Yes. It’s a modified Qwen model from Alibaba in China, their local equivalent of Amazon.
Originally training the model had used the energy resources of that original corporation or whatever. But when you download that model and start running it on your own hardware, you are using your own energy.
Think of it kind of like some software like Jellyfin. When the developers write the software, they do so using their own electricity. But when you download Jellyfin and actually run the software on your own hardware, you are now only using your electricity, not the developer’s electricity at all.
Thank you for explaining!
Does… It matter?
Energy use is energy use no?
The energy required to do inference (i.e. asking it questions and the like) is no worse than doing some gaming for a short period of time.
That said it’s probably less efficient to run locally since anthropic and openai have been getting more efficient data center hardware from nvidia compared to consumer desktop gpus.
Consider also the energy and water they use to cool the datacentre equipment, which doesn’t usually happen at home.
Well, questions of efficiency seem like it does matter, no?
ai; dr
If you couldn’t be bothered to write this up yourself, why should I spend my time reading it?
Fun fact: you don’t have to, I expected to be voted down on this post, but I have had fun setting it up and wanted to share
Ignore the downvotes, this is fully selfhosted (not cloud LLM) and you set it up yourself, the agent is a tool you used, I think it’s pretty cool! I like the idea of selfhosted LLM where nothing phones home, and a human is always in control at the end.
Thanks! Its a fun experiment!!
The problem is not doing it, the problem is feeding an AI generated text here.
My hammer is also a tool. But if I start using (and talking about) it to wash my cloth and do my dishes I would really hope to get called out for being stupid.
And here I’ve been trying to hammer out this mustard stain for hours!
Did you expect to be blocked for posting a wall of tedious slop?
Having an autonomous LLM agent in a homelab like this seems like just a matter of time before things go wrong, but it seems like an interesting experiment.
Have you had any issues with the agent behaving unexpectedly?
My sudoers file restricts what the LLM can actually do. I also have robust backups and can spin up any of my servers really quickly, so I am not that worried. Just as you deal with human errors, you can deal with agent errors.
So far this has been running for a month, with no scares or unexpected behaviour other than sometimes looping on a task.
Sorry I know you probably don’t want another tip from me, but the post did include the agent directly using the docker daemon, which runs as root typically. Because you didn’t mention running rootless docker or podman, your sudoers file probably allows the agent full access to root instead of preventing it.
How are you running a 34B model without a GPU? You must be getting one token an hour! How much RAM do you have in the LLM box?
Its an MoE model (en.wikipedia.org/wiki/Mixture_of_experts), only 3B parameters are actually active
I have 32GB RAM
Not what OP is using obviously, but AMD X3D CPUs and Mac systems can be quite competitive for AI if you’re lacking VRAM. Not all CPUs struggle with inference, and some GPUs aren’t so hot at it either. GPUs are generally better, especially the really high-end ones, but throwing in low- and mid-range cards and high-end CPUs stuff starts to look somewhat muddier.
Thanks for sharing this, I have been looking for an AI setup without GPU, so this is right up my alley.
Welcome! if you have questions on ik build parameters for optimizations feel free to ask, I will try my best to answer
“single source of truth” gives me PTSD from the last wanker consultant that was hired at work to spew bullshit and fire people.
At 56, i was laid off from a Fortune 500 company, so i hear you. Today I am without a job just trying to learn and keep up everyday.
Edit:Spellings
It seems the main use case is restarting docker containers, why not use the built-in healthcheck feature of docker? The automatic backup and upgrade are also confusing to me, operating systems come with that built in. I just don’t quite understand the point of replacing existing deterministic systems with a natural language interface, I would have trouble believing the logs at face value.
Edit: also your handling of current-alert.txt is a perfect example of a race condition, another potential source of indeterminism. An alert could be missed if the file is overwritten before being handled.
Its a homelab, not a commercial production environment, agree with you, but I am not too worried about it.
I guess I’m confused… The built in functionality seems like the easier way to accomplish the same, you seemed to have spent a large amount of time and are proud of this project, and wanted to share it, but also acknowledge that it’s worse than what already exists, and uses more resources idly. Why should anybody else do this?
I mean it is an interesting test for ai capabilities and limitations. You have an existing low tech deterministic use case and setup and can compare that with the ai setup
There is no comparison, I made the comparison myself. In all honesty I feel like they didn’t know about basic docker and linux concepts until my comment.
Well you asked why anybody else would do it and I answered on that
Are you working at an ai startup or university? I asked op for their motivations, and comparisons to existing solutions seemed like the least of their concerns, maybe even unconsidered. But I guess it could be a fair answer from your pov if you are trying to test and improve llms. I just hope you’re getting paid for the research.
I knew dockhand/portainer would do docker updates better, i knew auto updates can be setup via cron for os updates, etc.
i am neither a sys admin, nor a programmer, i just run a hobby homelab and like to tinker and learn. its a good enough usecase for me to explore the possibilities
Well, that could’ve been mentioned? Why did I have to bring that up? Nothing about the post is self reflective, it is entirely bragging. I get you didn’t write it, and that’s just how llms sound, but you did decide to post it.
So dont do it!
Its a learning experience, how can a coding agent be used in a non coding way? is it better or worse? i guess i have my answers now,
this may not be the ideal usecase, but it surely shows that these agents can be used for other things.
What answer did you arrive at? Are you planning on ending the test?
It has produced great documentation for my homelab. That's what it did best, and it could not have done so without carrying out the tasks it was asked to do.
[ranjan@llmbox Homelab Wiki]$ tree -L 2
.
├── Clients
│   ├── CachyOS-Laptop.md
│   └── README.md
├── Infrastructure
│   ├── README.md
│   ├── Router.md
│   └── Switch.md
├── README.md
├── Servers
│   ├── dms.md
│   ├── llmbox.md
│   ├── mediabox.md
│   ├── nasbox.md
│   ├── README.md
│   └── Router.md
└── Services
    ├── AdGuard-Home.md
    ├── BentoPDF.md
    ├── Beszel.md
    ├── Caddy.md
    ├── Collabora.md
    ├── Degoog.md
    ├── Dockhand.md
    ├── Flaresolverr.md
    ├── FMD.md
    ├── Food.md
    ├── Forgejo.md
    ├── Glance.md
    ├── Gonic.md
    ├── Homepage.md
    ├── Immich.md
    ├── Invidious.md
    ├── IT-Tools.md
    ├── Jellyfin.md
    ├── Jotty.md
    ├── Linkding.md
    ├── Llama-Swap.md
    ├── Metube.md
    ├── NFS.md
    ├── Ntfy.md
    ├── Omnitools.md
    ├── OpenCloud.md
    ├── Prowlarr.md
    ├── qBittorrent.md
    ├── Rackpeek.md
    ├── Radarr.md
    ├── Radicale.md
    ├── README.md
    ├── Redlib.md
    ├── SABnzbd.md
    ├── SearXNG.md
    ├── Seerr.md
    ├── Signal-DMS.md
    ├── Sonarr.md
    ├── Speedtest.md
    ├── Tailscale.md
    ├── Termix.md
    ├── Transmission.md
    ├── Uptime-Kuma.md
    └── Vaultwarden.md
Wow amazing, an llm that can generate text!
I’m still curious though if you are going to change your approach after this test.
I don’t think you understand what an interface is, I don’t think you understood my comment, and my suspicion is you probably can’t or won’t even if I try to help clarify.
Oh yea I really didn’t see the natural language part and mistook your comment for actual architectural critique. Thanks for taking the time to write that comment
I appreciate that, I spent time trying to write as clearly as possible, and it hurt to have someone, either through negligence or malice, completely misunderstand me. I still don’t see how the natural language part relates to creating systems without interfaces. An interfaceless system is nonsense, it doesn’t exist, it just betrays a lack of understanding. Even black holes have interfaces. The log files are an interface. The other interface you mentioned was actually a system to aggregate log files, and I just have to point out that systems are different from interfaces. The interface in that case would be the resulting directory of files, or maybe it could be a streaming interface directly into an llm. But anyways the thing that really bugged me was I was asking about the use indeterminate systems to replace solved deterministic problems, and your response was something along the lines of ‘yeah I agree, add more llm!’ and I would be upset if anybody passing through thought that you and I agreed on that.
I’m sorry if my comment came off as rude, I’m having a bad hair day.
I’ve been dreaming of local AI for a hot minute. Sadly corporate AI has priced me out of personal computing, and current hardware isn’t up to it.
Maybe my n100 could, I don’t really care how slow it’s generating the tokens if it’s just resetting containers and logging errors.
Does your setup handle automagic updates, how do you handle prompt injection if so? Or, just error correction/logging?
Look at my setup…
Its ancient cheap hardware. Whats stopping you?
Figure it out yourself with your agent. What suits your use case? what would you like the agent to do? how much of a risk can you take with your agent. It would vary depending on so many factors
Sum total of my hardware:
Ugreen dxp 4800, the Pentium one. 32gb ram. My main box: jellyfin, arrs, immich, pihole, nginx, etc… It can’t go down. I don’t think I want an LLM here.
Beelink n100 16gb ram. Local back up, redundant pihole, immich machine learning… Generally under utilised, I’d like to move some services around, does proxmox have an auto balancer?!
Spare no name n100. 8, or 16gb, I can’t remember. Abandoned box, It’s what I would put the LLM on, I did have it reserved for a remote back up.
I think it would need a ram upgrade, see corporate AI pricing me out of personal computing. Currently Amazon has a Crucial 32gb ddr5 sodimm module for £280. Which is too high a price for what I’d use it for.
Oh, and I have an abandoned gaming rig with a gtx970, and some rPi0/3s
I’ve put Ollama on an n100 before. It obviously ate all of that box and made everything on it chug, and it was too slow for human use. But if it’s just generating logs, and resetting containers then I wouldn’t mind how slow it is.
Forgive my lack of understanding, but basically you have set up an automation system that starts/stops/upgrades/updates docker containers, and system management type of tasks? Do you pipe all this data to some type of monitoring dashboard…maybe something like Grafana? It seems like there would be a lot of data points that could/should be monitored. Do you get text/email alerts that confirm all is copacetic or not?
It sounds spectacular. Maybe a little too complicated for me to wrap my old head around all at once. One of these days, hopefully, I’m going to get AI into the lab as a useful tool and not as just a oddity that takes forever to compute.
Rock on with yo’ bad self bro! Thanks for sharing.
I have not tried that yet, but that's the next step I should take.