What drift looks like in practice
Obsidian Dynamics
When teams say they have a drift problem, they often picture an attacker pivoting through SSH or a rootkit on a database host. That happens. It is also not the usual Tuesday.
Most drift is quieter: a package installed during an incident fix, a firewall rule opened for a one-off test, a sudoers line added for a contractor, a cron job that outlived the project it supported. None of it reads like a breach in the moment. All of it changes the shape of the system you thought you were running.
The changes that show up first
On Linux fleets we see the same categories repeat:
- Remote access posture. PermitRootLogin, password authentication, new keys in authorized_keys, or a jump host that suddenly allows a wider CIDR than the baseline.
- Listening services. A debug port left open after a deploy, Docker publishing a container port to 0.0.0.0, or a forgotten systemd unit binding on a non-standard interface.
- Package and user churn. Tools installed ad hoc on production because someone needed them once. Shared accounts that never got removed. UID gaps that make audits painful.
- Scheduled work. Cron and timer units are drift factories. The job made sense in March. Nobody remembers it in November.
Each item alone might be defensible. The problem is accumulation without a record. You cannot triage what you never captured as a change.
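As an illustration of the remote access category, here is a minimal Python sketch that compares the global settings in an sshd_config file against an expected baseline. The baseline values and the function names (parse_sshd_config, sshd_drift) are assumptions made for this example, not Blackglass internals. The parser is deliberately simplified: it stops at the first Match block and does not follow Include directives.

```python
# Hypothetical baseline values; real policy will differ per fleet.
SSHD_BASELINE = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
}


def parse_sshd_config(text):
    """Parse global sshd_config keywords into a dict of lowercase key -> value.

    Simplifications (assumptions of this sketch): stops at the first Match
    block, ignores Include directives, and expects whitespace-separated
    keyword/value pairs.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if parts[0].lower() == "match":
            break
        if len(parts) == 2:
            # sshd uses the first value it sees for a keyword, so keep that one.
            settings.setdefault(parts[0].lower(), parts[1].strip().lower())
    return settings


def sshd_drift(text, baseline=SSHD_BASELINE):
    """Return {keyword: (expected, actual)} for settings that differ.

    A keyword that is absent from the file is reported with actual=None,
    since the effective value then depends on the sshd default.
    """
    actual = parse_sshd_config(text)
    return {
        key: (expected, actual.get(key))
        for key, expected in baseline.items()
        if actual.get(key) != expected
    }
```

Calling sshd_drift on the contents of /etc/ssh/sshd_config returns an empty dict when the file matches the baseline. Anything else is a change worth recording, even if it turns out to be approved.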
Why severity beats raw diffs
A full configuration diff on a busy host is unreadable. Operators need ranking: which changes widen blast radius, which are cosmetic, which need a human before close of business.
That is why Blackglass classifies findings rather than dumping bytes. A new listening port on a database host is not the same severity as a documentation package update on a bastion. Both are drift. Only one should wake someone up.
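To make the ranking idea concrete, here is a small Python sketch of rule-based severity classification. The Finding fields, the severity levels, and the rules themselves are hypothetical and chosen only to mirror the database-versus-bastion example above. They are not the rules Blackglass actually applies.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Finding:
    host_role: str  # e.g. "database" or "bastion" (illustrative labels)
    kind: str       # e.g. "listening_port" or "package"
    detail: str


def classify(finding):
    """Assign a severity using a few hand-written, illustrative rules."""
    if finding.kind == "listening_port" and finding.host_role == "database":
        return Severity.HIGH
    if finding.kind in {"listening_port", "sshd_setting", "sudoers"}:
        return Severity.MEDIUM
    return Severity.LOW


def rank(findings):
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=classify, reverse=True)
```

The point of the sketch is the shape, not the rules: severity depends on both what changed and where it changed, so the same kind of finding can rank differently on different hosts.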
What good response looks like
Good drift handling is boring on purpose:
- Capture an approved baseline when the host is in a known-good state.
- Compare on a cadence that matches risk — daily for production, weekly for lower tiers.
- Triage by severity, assign an owner, export evidence if compliance is in scope.
- Either revert, approve, or update the baseline — but do not leave the finding in limbo.
The failure mode is not missing a scanner. It is having output nobody acts on because every alert looks equally urgent.
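The capture, compare, and resolve steps above can be sketched in Python as follows. This assumes a baseline is simply a mapping of file path to SHA-256 digest, which is a simplification for illustration. The function names and the cadence table are hypothetical, with the daily and weekly intervals taken from the list above.

```python
import hashlib
from datetime import timedelta
from pathlib import Path

# Compare cadence by tier: daily for production, weekly otherwise.
CADENCE = {"production": timedelta(days=1)}
DEFAULT_CADENCE = timedelta(weeks=1)


def capture(paths):
    """Map each existing file path to the SHA-256 digest of its contents."""
    snapshot = {}
    for p in paths:
        path = Path(p)
        if path.is_file():
            snapshot[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return snapshot


def compare(baseline, current):
    """Report paths that were added, removed, or changed since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "changed": sorted(k for k in shared if baseline[k] != current[k]),
    }


def approve(baseline, current, path):
    """Return a new baseline that accepts the current state of one path."""
    updated = dict(baseline)
    if path in current:
        updated[path] = current[path]
    else:
        updated.pop(path, None)  # the path was removed and that is accepted
    return updated


def is_due(tier, last_compared, now):
    """True when the tier's cadence has elapsed since the last compare."""
    return now - last_compared >= CADENCE.get(tier, DEFAULT_CADENCE)
```

Reverting is the other way out of limbo: restore the file on the host and leave the baseline untouched, so the next compare comes back clean.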
Drift and incidents
During incidents, drift accelerates. Teams open paths that would never pass change review under normal conditions. That is rational in the moment and expensive later if nobody reconciles the host back to policy.
The useful question after an incident is not only "what broke?" but "what did we change while fixing it?" Baselines make that answer searchable instead of tribal.
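If changes are recorded with a detection time, "what did we change while fixing it?" becomes a simple filter. The Python sketch below assumes each change record is a dict with a detected_at timestamp. That record shape, the function name, and the sample data are all made up for illustration.

```python
from datetime import datetime, timezone


def changes_during(changes, start, end):
    """Return change records detected between start and end, oldest first."""
    hits = [c for c in changes if start <= c["detected_at"] <= end]
    return sorted(hits, key=lambda c: c["detected_at"])


# Illustrative data only: an incident window and three recorded changes.
incident_start = datetime(2024, 3, 4, 14, 0, tzinfo=timezone.utc)
incident_end = datetime(2024, 3, 4, 18, 0, tzinfo=timezone.utc)
recorded = [
    {"host": "db-1", "what": "firewall rule opened",
     "detected_at": datetime(2024, 3, 4, 16, 30, tzinfo=timezone.utc)},
    {"host": "db-1", "what": "package installed",
     "detected_at": datetime(2024, 3, 4, 14, 45, tzinfo=timezone.utc)},
    {"host": "web-2", "what": "cron job added",
     "detected_at": datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)},
]
to_reconcile = changes_during(recorded, incident_start, incident_end)
```

Here to_reconcile holds the two db-1 changes in the order they were detected. That list is the starting point for reconciling the host back to policy.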
If you want to see severity-ranked drift on your own fleet, Blackglass is built for exactly this. There is a free Lab tier for homelabs and a 14-day trial on paid plans.
If we missed a drift pattern you see often, tell us. We update posts.