A running log of decisions, gotchas, and insights for turning this build into blog posts. Written as things happen — rough notes to be shaped into posts later.
- Architecture & Network Design — why this layout, what each tier catches
- Firewall VM — Alpine + nftables, why not OPNsense
- NIDS: Suricata Dual-Tier — perimeter + internal, EVE JSON, rsyslog forwarding
- SIEM: Wazuh — autoinstall, EVE decoding, syslog receiver
- Log Analysis: Splunk — setup, EVE ingestion, SPL queries on Suricata data
- Windows AD + Detection — DC, Sysmon, Atomic Red Team
- Putting It Together — ART runs, detections across all tiers
- Forces all lab traffic through a single choke point — realistic enterprise topology
- nftables runs entirely inside an Alpine VM; easy to snapshot, rebuild, script
- Separates firewall concerns from the hypervisor — closer to how real environments work
Most home SOC labs put Suricata on one box and call it done. The problem: intra-subnet VM-to-VM traffic never crosses the router — it's switched at L2 by the hypervisor bridge and is invisible to a perimeter sensor.
Solution: two Suricata instances.
- Tier 1 on fw-router eth1 (lab-net facing) — catches north-south traffic: C2 callbacks, exfil, inbound scans, internet-bound anomalies
- Tier 2 on the aurora host itself, listening on virbr-lab — catches east-west lateral movement between VMs that never touches the router
This mirrors an enterprise setup where you'd have a perimeter IDS and a core switch SPAN port feeding an internal sensor.
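For orientation, both tiers boil down to the same binary pointed at different interfaces. A minimal sketch using the stock Suricata CLI (the lab's actual service and container definitions differ; interface names follow the layout above):

# Tier 1: runs on fw-router, watching the lab-facing interface (north-south)
suricata -c /etc/suricata/suricata.yaml -i eth1 -D

# Tier 2: runs on the aurora host, watching the libvirt lab bridge (east-west)
suricata -c /etc/suricata/suricata.yaml -i virbr-lab -D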
- Original plan was OPNsense, switched after realising the web UI / interactive console made scripted deployment painful via virsh send-key
- Alpine boots in ~512MB RAM, nftables config is a single readable text file, setup-alpine is fully scriptable
- Downside: no pre-built Wazuh agent (glibc vs musl) — solved with rsyslog forwarding
- Aurora OS is image-based (Universal Blue / Fedora Kinoite). Adding packages via rpm-ostree breaks the clean image update chain — you'd need to build a custom base image to keep getting updates properly
- Podman + Quadlet systemd units give the same result without touching the base OS (see the Quadlet sketch after this list)
- Everything is reproducible from scripts in the repo
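A rough shape of a Quadlet unit for the Tier 2 sensor. This is a sketch only: the image tag, capability set, and entrypoint behaviour are assumptions, not the exact units in the repo.

# /etc/containers/systemd/suricata-tier2.container
[Unit]
Description=Suricata Tier 2 sensor on virbr-lab
After=network-online.target

[Container]
Image=docker.io/jasonish/suricata:latest
# Host networking so the container can sniff the libvirt bridge directly
Network=host
AddCapability=NET_ADMIN NET_RAW SYS_NICE
Volume=/var/log/suricata:/var/log/suricata:z
Volume=/var/lib/suricata:/var/lib/suricata:z
# Arguments passed to the image's entrypoint (assumed to exec suricata)
Exec=-i virbr-lab

[Install]
WantedBy=multi-user.target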
apk add suricata fails on Alpine 3.23 — the package only exists in edge/community. Must use:

apk add --repository http://dl-cdn.alpinelinux.org/alpine/edge/community suricata

The postinstall script auto-runs suricata-update and fetches ET Open rules (~48k rules, ~48k enabled) — no manual rule download step needed.
community-id is a hash of the 5-tuple (src/dst IP+port, proto) that's the same across Suricata, Zeek, and other tools. Enables correlation between Suricata alerts and flow records even when session IDs differ. Worth enabling — zero cost.
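Enabling it is one key in suricata.yaml's EVE output section. A trimmed sketch of the stock config, with only the community-id lines as the point:

outputs:
  - eve-log:
      enabled: yes
      filetype: regular
      filename: eve.json
      # Adds a community_id field to every EVE record
      community-id: true
      community-id-seed: 0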
Wazuh agent binaries are compiled against glibc. Alpine Linux uses musl libc. They're binary-incompatible. Options considered:
- Build Wazuh agent from source on Alpine (painful, fragile)
- Run agent in a container on fw-router (possible but adds complexity to a VM that's already resource-constrained at 1GB RAM)
- Forward via syslog — Wazuh has a built-in syslog receiver and Suricata decoder
Syslog forwarding won. rsyslog's imfile module tails the EVE JSON file and
ships each line as a syslog message to Wazuh port 514/TCP.
When fw-router connects to Wazuh at 192.168.10.10, the source IP is 192.168.10.1
(eth1 — same subnet). The WAN IP (192.168.122.10) is never used for this connection.
Wazuh's <allowed-ips> must list 192.168.10.1, not 192.168.122.10.
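The receiver side on the Wazuh manager looks roughly like this (standard ossec.conf syslog remote block; the IP is the eth1 source address noted above):

<!-- /var/ossec/etc/ossec.conf on the Wazuh manager -->
<remote>
  <connection>syslog</connection>
  <port>514</port>
  <protocol>tcp</protocol>
  <!-- Must match the source IP the connection actually arrives from -->
  <allowed-ips>192.168.10.1</allowed-ips>
</remote>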
The jasonish/suricata container image includes suricata-update. Running it to
fetch ET Open rules requires internet access. Without --network host, the
container's DNS fails:
Error: [Errno -2] Name or service not known
Add --network host to the suricata-update run. Also: /var/lib/suricata must
be a separate mounted volume — that's where suricata-update stores the enabled
sources list. If it's not persisted, enable-source et/open is lost between runs.
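Something like this for the rule-update run. A sketch, not the exact command in the repo: --entrypoint sidesteps any assumption about the image's default entrypoint, and the image tag is illustrative.

podman run --rm --network host \
  -v /var/lib/suricata:/var/lib/suricata:z \
  --entrypoint suricata-update \
  docker.io/jasonish/suricata:latest
# --network host: container DNS works, so the ET Open download succeeds
# /var/lib/suricata volume: keeps the enabled-sources state between runs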
Status: Ready to write. All configs working, data verified in Splunk.
Audience: Security engineers / blue teamers who need to get structured logs into Splunk from a host that can't run a Splunk UF (musl libc, embedded device, container, etc.)
Origin: Someone at a conference asked about this exact pattern.
Splunk's Universal Forwarder requires glibc. Alpine Linux uses musl — they're binary incompatible. Same issue applies to embedded devices, minimal containers, etc. The UF is also heavier than necessary when you only need to forward one log file.
rsyslog's imfile module tails a file and ships each line as a syslog message.
Splunk has a built-in TCP input that receives those messages. A transforms.conf
stanza strips the syslog header, leaving clean JSON as _raw.
module(load="imfile" PollingInterval="5")
input(type="imfile"
File="/var/log/suricata/eve.json"
Tag="suricata-eve"
Severity="info"
Facility="local3"
PersistStateInterval="10"
ReadMode="0"
FreshStartTail="on"
StateFile="suricata-eve")
if $syslogfacility-text == 'local3' then {
action(type="omfwd"
Target="192.168.10.10"
Port="514"
Protocol="tcp"
Template="RSYSLOG_SyslogProtocol23Format")
action(type="omfwd"
Target="192.168.10.40"
Port="5514"
Protocol="tcp"
Template="RSYSLOG_SyslogProtocol23Format")
stop
}
props.conf (/opt/splunk/etc/apps/search/local/props.conf):
[suricata:eve]
SHOULD_LINEMERGE = false
KV_MODE = json
TIME_PREFIX = "timestamp":"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6N%z
TRANSFORMS-strip_syslog_header = strip_syslog_header

transforms.conf (/opt/splunk/etc/apps/search/local/transforms.conf):
[strip_syslog_header]
REGEX = ^[^{]*(\{.+\})$
FORMAT = $1
DEST_KEY = _raw

Splunk inputs.conf (TCP input, port 5514, index=suricata, sourcetype=suricata:eve) — created via Splunk web UI: Settings → Data Inputs → TCP
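The file-based equivalent of that TCP input, if you'd rather keep it in config (a sketch; the web UI writes essentially the same stanza under an app's local/inputs.conf):

[tcp://5514]
index = suricata
sourcetype = suricata:eve
connection_host = ip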
- imfile tails any file — not just syslog-format logs; works with EVE JSON, audit logs, etc.
- Two action() blocks in one if = dual-forward to Wazuh AND Splunk simultaneously; one config, two SIEMs
- Port 5514 (not 514) avoids collision with other syslog receivers; TCP not UDP for reliability
- The regex in transforms.conf, ^[^{]*(\{.+\})$, skips everything before the first { — this strips the syslog header (timestamp, hostname, tag), leaving only the JSON payload as _raw
- KV_MODE = json does all field extraction automatically — no field aliases needed
- FreshStartTail = on means rsyslog only ships new lines after startup — avoids replaying the whole file on restart
index=suricata | stats count by event_type
index=suricata event_type=alert | table _time, src_ip, dest_ip, alert.signature
Status: Ready to write. Process completed and verified.
Audience: Homelabbers / security folks who want a reproducible, cattle-not-pets hypervisor
Hook: One USB drive, three reboots, and a fully configured KVM hypervisor appears — no clicking, no forgetting configs.
Every time you reinstall a hypervisor host you go through the same ritual: click through the installer, set the hostname, create the user, install packages, copy configs, set up libvirt networks, enable services. It works, but it's manual and error-prone. Six months later you rebuild and realise you forgot a config you didn't write down.
Fedora CoreOS + Ignition flips this. The machine definition lives in a file in your git repo. The USB drive carries that definition. Boot, install, walk away.
Ignition is a first-boot provisioning system built into Fedora CoreOS. Unlike cloud-init (which runs on every boot), Ignition runs exactly once — during the initial install — and applies your config atomically before the machine is ever handed to you. By the time you can SSH in, everything is already in place: users, SSH keys, files, systemd units, the works.
The human-editable format is Butane (.bu) — a clean YAML dialect. You compile it
to the actual JSON Ignition format (.ign) before provisioning.
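For a feel of the format, a minimal Butane file (not the lab's ucore-hci.bu, just the shape, with a placeholder key). Variant fcos 1.5.0 compiles to Ignition spec 3.4.0:

variant: fcos
version: 1.5.0
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 AAAA...placeholder... you@laptop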
ucore-hci is a Universal Blue image — a custom Fedora CoreOS image that ships with KVM/libvirt, Podman, and Cockpit pre-installed. It's purpose-built for bare-metal hypervisor duty. No extra packages needed; you boot into a machine that already knows how to run VMs.
Because CoreOS uses rpm-ostree (immutable base image), you can't just dnf install
things at will. ucore-hci solves this by baking the right packages into the image itself.
For anything else — host-side services like Suricata — you run Podman containers instead
of layering packages.
ucore-hci doesn't have its own ISO. You boot the standard Fedora CoreOS live ISO and let two systemd oneshot services handle the image swap:
- Boot 1 (vanilla CoreOS): ucore-unsigned-autorebase.service fires, rebases to ostree-unverified-registry:ghcr.io/ublue-os/ucore-hci:stable, creates a sentinel file, disables itself, reboots.
- Boot 2 (unsigned ucore): ucore-signed-autorebase.service fires, rebases to ostree-image-signed:docker://ghcr.io/ublue-os/ucore-hci:stable (cosign-verified), creates its sentinel, disables itself, reboots.
- Boot 3+: both sentinel files exist, neither service fires — normal operation.
The sentinel pattern (ConditionPathExists=!/etc/ucore-autorebase/unverified etc.)
comes directly from the official ublue-os/ucore examples. Don't invent your own
guard logic here — the official pattern is what it is for good reason.
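For reference, the unsigned-stage unit is roughly the following, paraphrased from the ublue-os/ucore examples (check upstream for the current exact version):

[Unit]
Description=uCore autorebase to unsigned ucore-hci, then reboot
ConditionPathExists=!/etc/ucore-autorebase/unverified
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
StandardOutput=journal+console
ExecStart=/usr/bin/mkdir -p /etc/ucore-autorebase
ExecStart=/usr/bin/rpm-ostree rebase --bypass-driver ostree-unverified-registry:ghcr.io/ublue-os/ucore-hci:stable
ExecStart=/usr/bin/touch /etc/ucore-autorebase/unverified
ExecStart=/usr/bin/systemctl disable ucore-unsigned-autorebase.service
ExecStart=/usr/bin/systemctl reboot

[Install]
WantedBy=multi-user.target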
The full config lives at ignition/ucore-hci.bu in this repo. Key sections:
- Users + SSH key — blyons account, wheel/libvirt/kvm groups, passwordless sudo
- NetworkManager unmanaged config — prevents NM from grabbing virbr* at boot (a real gotcha: NM will steal virbr0 before libvirt can claim it, breaking the default NAT network on every reboot — learned the hard way on aurora, baked in here)
- systemd-resolved stub zone — routes *.lab.local queries to fw-router dnsmasq
- Kerberos LAB realm config — so xfreerdp NLA works with domain accounts out of the box
- libvirt network XML — lab-net definition dropped into /etc/soc-lab/ for first-boot script
- Podman Quadlet units — Suricata Tier 2 + rsyslog EVE forwarder, auto-start via systemd
- soc-lab-first-boot.service — one-shot: defines libvirt networks, DHCP reservation, storage pool
- soc-lab-routes.service — adds 192.168.10.0/24 + 192.168.40.0/24 routes via fw-router after virtnetworkd starts
- Nightly shutdown timer — powers off at 21:00, rtcwake programs RTC alarm for 08:00
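The shutdown/wake pair looks roughly like this. A sketch: unit names and the exact rtcwake invocation are assumptions; the 21:00/08:00 times come from the list above.

# nightly-shutdown.timer
[Unit]
Description=Nightly lab power-off

[Timer]
OnCalendar=*-*-* 21:00:00

[Install]
WantedBy=timers.target

# nightly-shutdown.service
[Unit]
Description=Program RTC wake alarm, then power off

[Service]
Type=oneshot
# -m no: don't suspend now, just set the wake alarm (%% escapes % for systemd)
ExecStart=/bin/sh -c '/usr/sbin/rtcwake -m no -t "$(date -d "tomorrow 08:00" +%%s)"'
ExecStart=/usr/bin/systemctl poweroff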
There are two approaches, depending on how much you want the USB to do on its own.
coreos-installer iso customize lets you bake both the ignition config and the
target install disk into the ISO. Booting the USB installs CoreOS automatically and
powers off — no manual steps, no network required.
# 1. Compile Butane → Ignition
podman run --rm -i quay.io/coreos/butane:release \
--strict < ignition/ucore-hci.bu > ignition/ucore-hci.ign
# 2. Customize the ISO — bakes in config + target disk
cp ~/Downloads/fedora-coreos-*-live.x86_64.iso ~/Downloads/fedora-coreos-lefthand.iso
podman run --rm \
-v ~/Downloads:/data:z \
-v $(pwd)/ignition:/ign:z \
quay.io/coreos/coreos-installer:release \
iso customize \
--dest-ignition /ign/ucore-hci.ign \
--dest-device /dev/nvme0n1 \
/data/fedora-coreos-lefthand.iso
# 3. Write to USB
sudo dd if=~/Downloads/fedora-coreos-lefthand.iso of=/dev/sdX bs=4M status=progress oflag=sync

Boot the USB → CoreOS installs itself to /dev/nvme0n1 with your config → machine
powers off. Remove USB, power on, wait for the three ucore-hci autorebase boots.
Done.
Caveat: --dest-device must match the actual disk on the target machine. Verify
with lsblk if unsure. Wrong device = wrong disk gets wiped.
Use this if you need to confirm the disk device first, or want to run the installer interactively.
The live ISO requires some Ignition config to boot — without one,
ignition-fetch-offline.service fails and pulls the system into emergency mode.
Use a minimal live-boot.ign (just your SSH key) for the live environment, then
pass the real config to coreos-installer install.
Step 1 — serve two configs from aurora:
cd /var/home/blyons/workspace/soc-lab/ignition
python3 -m http.server 8080

live-boot.ign — minimal config, just adds your SSH key to the core user:

{"ignition":{"version":"3.4.0"},"passwd":{"users":[{"name":"core","sshAuthorizedKeys":["YOUR_SSH_PUBKEY"]}]}}

Step 2 — at the GRUB menu on the target machine, press e and add:
ignition.config.url=http://<aurora-ip>:8080/live-boot.ign
Ctrl+X to boot. The live environment comes up and your SSH key is authorised.
Step 3 — SSH in from aurora and run the installer:
ssh core@<lefthand-ip>
lsblk # confirm disk name
sudo coreos-installer install /dev/nvme0n1 \
--ignition-url http://<aurora-ip>:8080/ucore-hci.ign \
  --insecure-ignition

--insecure-ignition is required when fetching over plain HTTP (not HTTPS).
Step 4 — reboot, remove USB.
Note: both iso ignition embed and iso customize modify a file, not a block
device. Always work on the ISO file, verify, then dd to USB — not the other way
around.
Three boots later: fully provisioned ucore-hci hypervisor, all lab services running.
The hypervisor is the most tedious machine to rebuild. Every config detail that lives only in your head is technical debt. Ignition moves that debt into a git repo. When (not if) the hardware dies or gets wiped, recovery is: compile, embed, boot. The VMs come back via rsync from a backup. No tribal knowledge required.
- .ign files are JSON and contain your SSH pubkey in plaintext — gitignore them if your repo is public. The .bu source (no secrets) is what you commit.
- SecureBoot: if the machine has it enabled, you'll need to enroll the ublue-os MOK key after the first successful boot (sudo mokutil --import /etc/pki/akmods/certs/akmods-ublue.der). Easiest to just disable SecureBoot in BIOS on a lab machine.
- --bypass-driver flag in the rebase commands: required because rpm-ostree's default driver detection can fail on first boot before the image is fully settled.
- Don't specify uid for users. The core user already owns UID 1000 on Fedora CoreOS. Specifying uid: 1000 for a second user fails with useradd: UID 1000 is not unique (exit status 4). Remove the uid field entirely — let the system assign one.
- Don't add groups that don't exist on vanilla CoreOS. Ignition runs on first boot before the autorebase to ucore-hci. If your user definition includes groups like libvirt or kvm, useradd will fail with exit status 6 ("group does not exist") and Ignition aborts. Only include groups that exist on base CoreOS (e.g. wheel). The ucore-hci-specific groups can be added after the autorebase completes if needed.
- Don't use 2>&1 when compiling Butane. podman run ... > ucore-hci.ign 2>&1 mixes podman's image pull progress messages into the output file. The .ign ends up starting with "Trying to pull..." instead of {, and Ignition fails with "invalid character T at line 1 col 2". Always redirect only stdout: > ucore-hci.ign with no 2>&1.
- The sleep 8 in soc-lab-routes.service: virtnetworkd.service reports active before virbr0 is actually up. Without the sleep, ip route replace races and loses. Ugly but reliable.
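That routes unit, roughly. A sketch: the gateway is fw-router's WAN address from the lab layout, and the exact unit in the repo may differ.

# soc-lab-routes.service
[Unit]
Description=Static routes to lab subnets via fw-router
After=virtnetworkd.service
Wants=virtnetworkd.service

[Service]
Type=oneshot
RemainAfterExit=yes
# virtnetworkd reports active before virbr0 exists; give it a moment
ExecStartPre=/usr/bin/sleep 8
ExecStart=/usr/sbin/ip route replace 192.168.10.0/24 via 192.168.122.10
ExecStart=/usr/sbin/ip route replace 192.168.40.0/24 via 192.168.122.10

[Install]
WantedBy=multi-user.target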
Version: Splunk Enterprise 10.2.1
Install method: .deb package, manual download from splunk.com (requires free account)
- URL format: https://download.splunk.com/products/splunk/releases/<ver>/linux/splunk-<ver>-<hash>-linux-amd64.deb
- The build hash in the filename is version-specific — get the exact URL from the download page
Running as root is deprecated in 10.x.
Create a dedicated splunk system user and start with -u splunk:
sudo useradd -r -m -s /bin/bash splunk
sudo chown -R splunk:splunk /opt/splunk
sudo -u splunk /opt/splunk/bin/splunk start --accept-license --answer-yes --no-prompt

Setting admin password non-interactively:
The --seed-passwd flag didn't work reliably in testing. Use user-seed.conf instead:
# /opt/splunk/etc/system/local/user-seed.conf
[user_info]
USERNAME = admin
PASSWORD = YourPasswordHere

Create this file before first start. Splunk reads it on startup and removes it.
Boot-start:
sudo /opt/splunk/bin/splunk enable boot-start -user splunk

Installs an init.d script. Splunk starts as the splunk user on boot.
UF receiver:
sudo -u splunk /opt/splunk/bin/splunk enable listen 9997 -auth admin:password

rsyslog supports multiple action() blocks in a single if block. Adding Splunk
as a second target is as simple as adding a second omfwd action before the stop:
if $syslogfacility-text == 'local3' then {
action(type="omfwd" Target="192.168.10.10" Port="514" ...) # Wazuh
action(type="omfwd" Target="192.168.10.40" Port="5514" ...) # Splunk
stop
}
Both Tier 1 (fw-router rsyslog) and Tier 2 (rsyslog container on aurora) use this pattern. No agent needed on either host.
Splunk input config:
- Index: suricata
- TCP input: port 5514
- Sourcetype: suricata:eve
- props.conf: KV_MODE = json — Splunk auto-extracts all EVE JSON fields
- transforms.conf: regex to strip syslog header, leaving clean JSON as _raw
Splunk port choice: Used 5514 (not 514) to avoid collision with Wazuh's syslog receiver. Both listen on their respective VMs so there's no actual conflict, but different ports make firewall rules and troubleshooting cleaner.
- Confirm data is flowing: index=suricata | stats count by event_type
- Recent alerts only: index=suricata event_type=alert | table _time, src_ip, dest_ip, dest_port, alert.signature, alert.severity
- Top talkers by destination: index=suricata event_type=flow | stats sum(flow.bytes_toserver) as bytes by dest_ip | sort -bytes
- DNS queries seen by Suricata: index=suricata event_type=dns dns.type=query | table _time, src_ip, dns.rrname, dns.rcode
- TLS connections with SNI: index=suricata event_type=tls | table _time, src_ip, dest_ip, tls.sni, tls.version