Hardware

attic-gremlin

attic-gremlin is the most general-purpose machine in the lab. It runs the monitoring stack and hosts Ollama for Home Assistant's AI features. It also serves as the off-NAS backup target for smaug's important data.

| Component | Detail |
|---|---|
| Motherboard | Gigabyte B550 EAGLE WIFI6 |
| CPU | AMD, 12 cores/threads, ~4.27 GHz boost, governor: performance |
| RAM | 32 GB (30.7 GiB usable) |
| GPU | AMD Radeon 9060 XT (amdgpu) |
| OS drive | Corsair MP600 ELITE NVMe ~916 GB (nvme0n1, mounted /) |
| Secondary SSD | Micron 512 GB (sda, barely used) |
| Backup drive | Samsung 870 EVO 1 TB (sdb, ZFS pool named backup, mounted /backup) |
| Network | enp7s0 1 Gbps ethernet, wlp6s0 WiFi |
| OS | Ubuntu 25.10 (Questing Quokka), kernel 6.17.0-14-generic |

GPU monitoring note: Temperature, clock speed, and power draw are all exposed natively through node_hwmon (chip: amdgpu). Available sensors include edge temp, junction temp, mem temp, shader clock (sclk), memory clock (mclk), and PPT power. A separate GPU exporter is only needed if you want utilisation percentage — see Future Work.

ZFS backup pool: The pool is named backup and mounts at /backup, not /mnt/backup. This matters when writing PromQL queries that filter on the mountpoint label. Pool state is exposed via node_zfs_zpool_state{state="online",zpool="backup"}, which reads 1 while the pool is online. Alert if this drops to 0.
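For a quick manual check outside Prometheus, the sketch below reads node_exporter's text output and reports which state the pool is currently in. It assumes the ZFS collector emits one node_zfs_zpool_state series per possible state, with 1 on the active state and 0 on the rest; the function name zpool_state is made up for illustration.

```python
import re

# One node_zfs_zpool_state sample line in the Prometheus text format.
_SAMPLE = re.compile(r'^node_zfs_zpool_state\{([^}]*)\}\s+(\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def zpool_state(metrics_text, pool="backup"):
    """Return the state whose series is 1 for `pool`, or None if none is."""
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line)
        if not match:
            continue  # comment lines and unrelated metrics
        labels = dict(_LABEL.findall(match.group(1)))
        if labels.get("zpool") == pool and float(match.group(2)) == 1.0:
            return labels.get("state")
    return None
```

Feed it the body of the node_exporter /metrics endpoint on attic-gremlin (port 9100 if the exporter runs on its default). Anything other than "online" is the same condition the alert is meant to catch.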


immichbox

immichbox is an HP ProDesk 600 G2 SFF repurposed as a media server. It runs the full *arr stack, Jellyfin, and also hosts the Home Assistant VM. It's the busiest machine in the lab in terms of services, and it runs close to its RAM ceiling — read the memory notes carefully.

| Component | Detail |
|---|---|
| CPU | Intel Skylake, 4 cores / 8 threads, ~4 GHz, governor: powersave |
| RAM | 16 GB (runs tight, see memory notes below) |
| GPU | Intel ARC A310 (i915); fan, temp, energy via node_hwmon |
| OS drive | 512 GB MKNSSDS2512GB (sdc, mounted /, ~41% used) |
| Jellyfin TV | Samsung 870 EVO 1 TB (sda, mounted /mnt/jellyfin/tv) |
| Jellyfin Movies | Samsung 870 EVO 1 TB (sdb, mounted /mnt/jellyfin/movies) |
| NFS from smaug | 192.168.4.50:/mnt/smaug/immichbox/jellyfin, mounted at /mnt/smaug_jellyfin (~8.6 TB available) |
| OS | Ubuntu 24.04 LTS (Noble Numbat), kernel 6.17.0-14-generic |

GPU monitoring note: Like attic-gremlin, the Intel ARC A310 exposes thermals via node_hwmon out of the box. At idle: fan ~3400 RPM, GPU temp ~49°C. No separate GPU exporter needed for thermals.

Memory (important): Under normal full load (HAOS VM + media services), available RAM should sit around 6–7 GB. If it drops significantly below that, something is wrong. The HAOS VM alone consumes 4 GB. In a known incident (2026-03-15), gnome-system-monitor, triggered by an iOS monitoring app that kept a remote desktop session open, leaked memory until it reached 8 GB over ~9 days. It exhausted all 4 GB of swap and nearly caused an OOM event. Killing the process recovered ~6 GB immediately. Swap drains back slowly after recovery; that's normal Linux behaviour.

The Prometheus alert to have in place: node_memory_MemAvailable_bytes{job="immichbox"} < 500000000 (500 MB threshold).
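The same threshold can be checked locally on immichbox. node_exporter derives node_memory_MemAvailable_bytes from the MemAvailable line of /proc/meminfo, which the kernel reports in kB. The sketch below repeats that conversion and applies the 500 MB threshold; the function names are illustrative only.

```python
THRESHOLD_BYTES = 500_000_000  # same 500 MB threshold as the alert expression


def mem_available_bytes(meminfo_text):
    """Parse the MemAvailable line of /proc/meminfo (reported in kB) into bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def would_alert(meminfo_text, threshold=THRESHOLD_BYTES):
    """True when available memory is below the alert threshold."""
    return mem_available_bytes(meminfo_text) < threshold
```

On the host itself, would_alert(open("/proc/meminfo").read()) gives the answer directly. Under normal load the parsed value should be in the 6–7 GB range, far above the threshold.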


smaug

smaug is the NAS. It runs TrueNAS SCALE and is treated as a high-value storage appliance — no third-party software is installed directly on it. Everything it does beyond storage (Immich, metrics) runs either as a TrueNAS app or via built-in TrueNAS features.

| Component | Detail |
|---|---|
| OS | TrueNAS SCALE 25.10.1 (Goldeye) |
| Storage | Mirror RAID: 2× 14 TB HDDs + SSD cache |
| Network | 192.168.4.50 / smaug.lan |

The philosophy here is intentional: smaug stores things that are hard to replace (photos, documents, media libraries). Keeping the software surface minimal reduces the risk of something breaking the array. TrueNAS's built-in Graphite reporting exporter handles metrics delivery — no Node Exporter, no agent software.

Graphite metrics — what's flowing: To verify at any time what categories are being received:

curl -s http://localhost:9108/metrics | grep "^smaug" | grep -v "^#" | sed 's/ .*//' | sort -u | sed 's/smaug_truenas_//' | cut -d_ -f1 | sort -u

| Category | Key metrics available |
|---|---|
| truenas | CPU usage per core (cpu0–cpu7), memory total/available, disk reads/writes/busy per drive by serial, ZFS ARC size/free/available, ARC hit/miss ratios |
| cputemp | CPU temperature overall + per core (cpu0–cpu7) |
| system | Uptime, load average (1/5/15m), net received/sent, clock sync status |
| services | CPU, memory, I/O per TrueNAS service (middlewared, docker, nfs, nginx, etc.) |
| cgroup | Per-container CPU/memory/IO; container hash 2ae96f9fa3e6 is the Immich container |
| nfsd | NFS server statistics |
| net | Per-interface network stats |
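If the category check needs to run from a script rather than a shell, the curl pipeline translates to a few lines of Python. This is a sketch of the same steps (keep lines starting with smaug, drop the value, strip the smaug_truenas_ prefix, keep the text before the first underscore); the function name metric_categories is made up.

```python
def metric_categories(metrics_text, prefix="smaug_truenas_"):
    """Return the sorted set of category names found in exporter output."""
    categories = set()
    for line in metrics_text.splitlines():
        if not line.startswith("smaug"):
            continue  # also skips "# HELP" / "# TYPE" comment lines
        name = line.split(" ", 1)[0]        # drop the sample value
        name = name.replace(prefix, "", 1)  # strip the first prefix occurrence
        categories.add(name.split("_", 1)[0])  # text before the first "_"
    return sorted(categories)
```

Pass it the response body from http://localhost:9108/metrics. A category that disappears from the result means that group of metrics has stopped arriving.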

Disk serial numbers: Disk metrics use serial numbers in the metric name. Reference:

| Serial | Identity | Metric suffix |
|---|---|---|
| ZR606EQS | 14 TB HDD (mirror member 1) | _ZR606EQS_5000c500e325cd04_ |
| ZR705SWK | 14 TB HDD (mirror member 2) | _ZR705SWK_5000c500e325cd5f_ |
| UGXWR01J7BDAZE | SSD cache | _UGXWR01J7BDAZE_500a07511ef813be_ |
| 25356C800602 | 4th drive (identity unknown) | _25356C800602_e8238fa6bf530001001b448b42f6cd96_ |

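Full metric name pattern: smaug_truenas_truenas_disk_stats__SERIAL_reads (likewise _writes and _busy), where SERIAL stands for the serial-plus-identifier string from the Metric suffix column.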
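To avoid typing these names by hand, a small lookup can assemble them. The sketch below assumes the Metric suffix value slots in whole between smaug_truenas_truenas_disk_stats_ and the stat name, which is how the pattern reads; confirm against the exporter output before relying on it. The names DISK_SUFFIXES and disk_metric are illustrative.

```python
# Serial -> metric suffix, copied from the disk serial reference table.
DISK_SUFFIXES = {
    "ZR606EQS": "_ZR606EQS_5000c500e325cd04_",
    "ZR705SWK": "_ZR705SWK_5000c500e325cd5f_",
    "UGXWR01J7BDAZE": "_UGXWR01J7BDAZE_500a07511ef813be_",
    "25356C800602": "_25356C800602_e8238fa6bf530001001b448b42f6cd96_",
}

PREFIX = "smaug_truenas_truenas_disk_stats_"
STATS = ("reads", "writes", "busy")


def disk_metric(serial, stat):
    """Build the full metric name for one drive and one statistic."""
    if stat not in STATS:
        raise ValueError(f"unknown stat: {stat!r}")
    return f"{PREFIX}{DISK_SUFFIXES[serial]}{stat}"
```

For example, disk_metric("ZR606EQS", "busy") yields the busy metric for mirror member 1, ready to paste into a PromQL query.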

Pool health — important gap: TrueNAS SCALE 25.10 does not push ZFS pool state via Graphite. Pool health must be checked via the TrueNAS UI (Storage → Pools). A future option is to poll the TrueNAS REST API at /api/v2.0/pool from attic-gremlin using a textfile collector script exposed via node_exporter's --collector.textfile.directory flag — see Future Work.
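A minimal sketch of the output half of such a textfile collector script follows. It assumes each pool object returned by /api/v2.0/pool carries a name field and a boolean healthy field; check the real response before using it. The metric name truenas_pool_healthy and both function names are made up. Fetching the JSON and authenticating against the API are left out.

```python
import os
import tempfile


def pools_to_textfile(pools):
    """Render pool objects as Prometheus text-format gauge lines."""
    lines = [
        "# HELP truenas_pool_healthy 1 if TrueNAS reports the pool healthy, else 0.",
        "# TYPE truenas_pool_healthy gauge",
    ]
    for pool in pools:
        # Escape backslashes and quotes so the label value stays valid.
        name = str(pool["name"]).replace("\\", "\\\\").replace('"', '\\"')
        value = 1 if pool.get("healthy") else 0
        lines.append(f'truenas_pool_healthy{{pool="{name}"}} {value}')
    return "\n".join(lines) + "\n"


def write_atomically(path, text):
    """Write via a temp file and rename so a scrape never sees a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; the exporter must be able to read it
    os.replace(tmp_path, path)
```

The target file needs a .prom extension and must live in the directory given to --collector.textfile.directory. Running the script from cron or a systemd timer on attic-gremlin would keep the value fresh, and an alert on truenas_pool_healthy == 0 would then close the gap.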


HAOS (Home Assistant OS)

HAOS runs as a VM on immichbox using libvirt/virt-manager. Its VM name on the host is arx-domus. It has a physical NIC passed through via VFIO, which is why it gets its own IP rather than sharing immichbox's. This makes it behave like a standalone device on the network despite being a VM.

| Detail | Value |
|---|---|
| VM host | immichbox |
| VM name | arx-domus |
| IP | 192.168.4.40 / homeass.lan |
| Port | 8123 |
| QCOW2 image | /home/immichbox/Desktop/haos_ova-16.3.qcow2 |
| RAM allocated | 4 GB |
| vCPUs | 2 |

bazzite

Primary gaming PC. Two IPs due to separate ethernet and WiFi interfaces.

| Detail | Value |
|---|---|
| IP (ethernet) | 192.168.4.100 |
| IP (wifi) | 192.168.4.101 |
| OS | Bazzite |

ender

Creality Ender-3 V3 KE running Klipper firmware with Mainsail as the web UI. It is reachable on the LAN only: external access via mainsail.yooshtek.com is currently broken (see Known Gotchas).

| Detail | Value |
|---|---|
| IP | 192.168.4.200 / ender.lan |
| UI | Mainsail on port 4409 (http://ender.lan:4409) |
| Firmware | Klipper (stock Creality board) |