What you will find here
This section is a living technical handbook. It contains the same internal standards we use to run real production systems.
Infrastructure Automation
Reference designs and standards for automating Linux infrastructure using Ansible.
Monitoring & Early Warning
Monitoring-as-code standards for reliable alerts and ownership.
Documentation Standards
Runbooks, structures, and operational documentation patterns.
- automation-first
- documentation-driven
- reproducible systems
- vendor-neutral
Everything here is designed to be automation-friendly, production-tested, vendor-neutral, and documentation-driven.
Infrastructure Automation (Ansible)
Topics covered
- Repository structure and naming conventions
- Inventory design (single site / multi-site)
- Role design (idempotent, reusable, composable)
- Playbook layering (bootstrap → services → hardening)
- Variable hierarchy and secrets management
- Safe rollout strategies (serial, canary, check mode)
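The canary and serial rollout idea from the list above can be sketched in a few lines of Python. This is a minimal illustration, not our production tooling: the function name `rollout_batches`, the host names, and the default sizes are assumptions made for the example. In a real play, Ansible's `serial` keyword with a list of batch sizes (for example `[1, 2]`) gives comparable behaviour.

```python
from typing import List


def rollout_batches(hosts: List[str], canary: int = 1, batch_size: int = 2) -> List[List[str]]:
    """Split hosts into one canary batch followed by fixed-size batches.

    Hypothetical helper for illustration; sizes are assumptions.
    """
    if canary < 0 or batch_size < 1:
        raise ValueError("canary must be >= 0 and batch_size >= 1")
    batches: List[List[str]] = []
    # The canary batch goes first so a bad change hits few hosts.
    if canary and hosts:
        batches.append(hosts[:canary])
    rest = hosts[canary:]
    # Remaining hosts are rolled out in equal-sized groups.
    for start in range(0, len(rest), batch_size):
        batches.append(rest[start:start + batch_size])
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(rollout_batches(["web1", "web2", "web3", "web4", "web5"])):
        print(number, batch)
```

Stopping after the first batch and checking service health before continuing is the part that makes the canary useful; the sketch only computes the order.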
Example use cases
- Server bootstrap (users, SSH, sudo, updates)
- Service deployment (GLPI, Zabbix, BookStack, Nextcloud)
- Configuration drift prevention
- Disaster recovery automation
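Configuration drift prevention comes down to comparing the declared state with what a host actually reports. The sketch below shows that comparison in plain Python; the function name `find_drift` and the example keys are hypothetical. With Ansible itself, running a playbook with `--check --diff` serves the same purpose.

```python
from typing import Any, Dict, Tuple


def find_drift(desired: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (desired_value, actual_value)} for every managed key that differs.

    Keys present only in `actual` are treated as unmanaged and ignored.
    """
    drift: Dict[str, Tuple[Any, Any]] = {}
    for key, want in desired.items():
        have = actual.get(key)  # None means the key is absent on the host
        if have != want:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    # Example values are made up for illustration.
    desired = {"timezone": "UTC", "ntp_server": "time.example.internal"}
    actual = {"timezone": "Europe/Berlin", "ntp_server": "time.example.internal", "motd": "hello"}
    for key, (want, have) in find_drift(desired, actual).items():
        print(f"{key}: want {want!r}, have {have!r}")
```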
Monitoring & Alerting Playbooks
Topics covered
- Host and template design
- Proxy-based monitoring architectures
- Active vs passive checks
- Network, service, and application monitoring
- Log-based monitoring (Loki, Vector)
- Alert routing and escalation logic
- Maintenance and change windows
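Alert routing, escalation, and maintenance windows interact, so a small model helps. The following Python sketch is one possible shape under stated assumptions: the `Alert` fields, the target naming scheme (`<owner>-oncall` and so on), and the 15-minute escalation threshold are all invented for the example and are not taken from any monitoring product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Alert:
    host: str
    severity: str        # "warning" or "critical" in this sketch
    owner: str           # team that owns the host
    raised_at: datetime


# (host, start, end); the end of a window is exclusive
Window = Tuple[str, datetime, datetime]


def route(alert: Alert, windows: List[Window], minutes_unacked: int = 0) -> Optional[str]:
    """Return the notification target, or None if the alert is suppressed."""
    # Alerts raised inside a maintenance window for the same host are dropped.
    for host, start, end in windows:
        if host == alert.host and start <= alert.raised_at < end:
            return None
    if alert.severity == "critical":
        # Assumed policy: escalate if nobody acknowledged within 15 minutes.
        if minutes_unacked >= 15:
            return f"{alert.owner}-manager"
        return f"{alert.owner}-oncall"
    # Everything else goes to the owning team's queue, not to a pager.
    return f"{alert.owner}-queue"
```

Because every alert carries an owner, each notification resolves to exactly one target or is suppressed for a documented reason.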
Outcome
- Predictable alerts
- Low alert noise
- Clear ownership
- Actionable notifications
Documentation Standards
Standards defined
- BookStack structure (Book → Chapter → Page)
- Command documentation format
- Runbook templates
- Incident documentation
- Architecture decision records (ADR)
- Change and rollback procedures
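A runbook template is easier to enforce when a script can check it. The sketch below, in Python, reports which required sections a runbook is missing. The section titles in `REQUIRED_SECTIONS` are placeholders chosen for illustration, not the actual template, and the check assumes each section title sits on a line of its own.

```python
from typing import List, Sequence

# Hypothetical section titles; replace with the titles of the real template.
REQUIRED_SECTIONS = ("Purpose", "Prerequisites", "Procedure", "Verification", "Rollback")


def missing_sections(runbook_text: str, required: Sequence[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return required titles that do not appear as a line of their own."""
    # Comparison is case-insensitive and ignores surrounding whitespace.
    lines = {line.strip().lower() for line in runbook_text.splitlines()}
    return [title for title in required if title.lower() not in lines]


if __name__ == "__main__":
    draft = "Purpose\nRestart the web service.\nProcedure\n1. Drain the node.\nRollback\nRe-enable the node."
    print(missing_sections(draft))
```

A check like this can run in CI against exported pages, so an incomplete runbook is caught before it is needed during an incident.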
Goal
Any sysadmin can take over a system without relying on tribal knowledge.
Reference Architectures
Included architectures
- Single-node production servers
- HA web stacks
- Monitoring and logging stacks
- Inventory and discovery networks
- Backup and disaster recovery layouts
- Secure remote access architectures
Each architecture includes
- Network diagrams
- VM layout
- Storage design
- Security boundaries
- Automation entry points
Security Baselines
Covers
- SSH hardening
- Firewall policy
- Update strategy
- Service isolation
- Secrets handling
- Audit logging
All baselines are designed to be enforced automatically.
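As an illustration of automatic enforcement, the Python sketch below audits the text of an `sshd_config` file against a small baseline. The three baseline values are an assumed policy for the example, not our full standard. The parser is deliberately simple: it stops at the first `Match` block and treats an unset keyword as a violation, because the baseline here requires explicit settings.

```python
from typing import Dict, List

# Assumed baseline for illustration; keys are lowercase sshd_config keywords.
SSH_BASELINE: Dict[str, str] = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "x11forwarding": "no",
}


def parse_sshd_config(text: str) -> Dict[str, str]:
    """Parse global 'Keyword value' lines; the first occurrence of a keyword wins."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        settings.setdefault(key, value)
    return settings


def violations(text: str, baseline: Dict[str, str] = SSH_BASELINE) -> List[str]:
    """List every baseline keyword whose configured value is missing or different."""
    settings = parse_sshd_config(text)
    return [
        f"{key}: expected {want!r}, found {settings.get(key)!r}"
        for key, want in baseline.items()
        if settings.get(key, "").lower() != want
    ]


if __name__ == "__main__":
    sample = "PermitRootLogin no\n# comment\nPasswordAuthentication yes\n"
    for problem in violations(sample):
        print(problem)
```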
Intended audience
This section is for engineers who:
- Run Linux in production
- Want reproducible infrastructure
- Care about documentation quality
- Automate instead of clicking
- Need clarity, not magic
How to use this section
- Use it as a reference
- Use it to write playbooks
- Use it to onboard engineers
- Use it to standardize operations
All content is structured so it can be directly translated into Ansible roles and playbooks.
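To make that translation concrete, here is a minimal Python sketch that turns a layered standard (bootstrap → services → hardening) into a list of plays, one per layer. The role names in `LAYERS` are hypothetical. The result is printed as JSON, which YAML parsers generally accept, so the output has the shape of a playbook without this example depending on a YAML library.

```python
import json
from typing import Any, Dict, List

# Hypothetical role names, grouped by layer in rollout order.
LAYERS: Dict[str, List[str]] = {
    "bootstrap": ["users", "ssh", "sudo", "updates"],
    "services": ["zabbix_agent"],
    "hardening": ["firewall", "audit_logging"],
}


def build_playbook(layers: Dict[str, List[str]], hosts: str = "all") -> List[Dict[str, Any]]:
    """Return one play per layer, preserving the order in which layers are declared."""
    return [
        {"name": f"Layer: {layer}", "hosts": hosts, "become": True, "roles": list(roles)}
        for layer, roles in layers.items()
    ]


if __name__ == "__main__":
    print(json.dumps(build_playbook(LAYERS), indent=2))
```

Keeping one play per layer means a layer can be run or skipped on its own, which matches the layering described under Infrastructure Automation.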
Status
This section is actively evolving. Content is added as architectures and standards are validated in production.
If you want access to implementation playbooks or reviewed architectures, use the request form on the website.