APPLICATION · NO. 15 · JUNE 12, 2026

Numbers Were the Best in the Company

A site can be spotless on the injuries it counts and saturated with the kind it does not. The literature has known where the next fire sits for forty years, and known why nobody wrote it down.

HECTOR BENITEZ VENTURA · LATENT VARIABLES

The bypass that never came back out

I have stood in a control room the morning after a flash fire and watched a board operator say, almost bored, that everyone on days knew that unit would bite someone eventually. An interlock had gone into bypass during a startup so the unit could come up on schedule, and never came back in, and nobody logged it. When I asked why, he gave the answer that explains most of this field in one sentence: because then you own it. Upstairs the fire read as bad luck. The recordable injury rate was among the best in the company, the audits were green, and the one fact that predicted the event sat in the heads of half a dozen operators who had every reason to keep it there.

That gap, between a clean number and a known hazard, is the whole subject, and it is the oldest finding in safety research and the most ignored. The literature has said for forty years that personal-injury statistics and major-hazard exposure are different animals, and that the people who can see the next event coming are the least likely to say so. None of it is new; it just refuses to stick.

Counting the wrong injuries

The most expensive version of this mistake has a name and a number. After the 2005 BP Texas City explosion killed fifteen people and injured 180^[1], the independent panel chaired by James Baker found a refinery that had confused good personal-injury rates with good process safety. The slips-and-trips numbers were fine. The thing that blew up was a different category of risk, governed by maintenance backlogs, fatigued operators, and a leadership that did not hear bad news. Reading the earlier signals as noise cost more than two billion dollars in civil settlements. Five years later the same company lost the Deepwater Horizon and absorbed something near sixty-five billion dollars. The buyer of a culture diagnosis is paying to learn which of those worlds they are in before the second event tells them for free.

Figure: Which world were they in: the published cost band, same company. Sources: CSB final report on BP Texas City (2007); BP Deepwater Horizon total cost (Maritime Executive).

The reason a clean recordable rate is not reassurance is mechanical, and the Krause Bell Group made it precise^[2]. Their multi-company work established that serious injuries and fatalities have different precursors than minor injuries, which is why fatality rates stayed flat for years while recordable rates fell. Roughly a fifth of the recordables in their data carried serious-injury-or-fatality (SIF) potential; the rest did not. So a site that aims its spending at recordable counts is mostly buying down sprained ankles while its fatality exposure sits untouched. A SIF precursor is simply high energy in the presence of a weak control, and the people who can see it are the ones doing the task. That explains how a site can be genuinely proud of its numbers and genuinely on fire at the same time.
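The arithmetic behind that split is small enough to sketch. What follows is a minimal illustration, not the Krause Bell Group's method: the `Recordable` fields, the two boolean flags, and all five events are invented for the example. It applies the two-part test described above (high energy present, control weak) to a list of recordables and reports the share carrying SIF potential.

```python
from dataclasses import dataclass


@dataclass
class Recordable:
    """One recordable injury. All fields are hypothetical, for illustration only."""
    description: str
    high_energy: bool       # assumed flag: was a high-energy source present?
    control_adequate: bool  # assumed flag: was a working control in place?


def is_sif_precursor(event):
    """High energy present and a weak control: the two-part test from the text."""
    return event.high_energy and not event.control_adequate


def sif_potential_share(events):
    """Fraction of recordables that carry SIF potential (0.0 for an empty list)."""
    if not events:
        return 0.0
    flagged = sum(1 for e in events if is_sif_precursor(e))
    return flagged / len(events)


# Invented sample: four low-energy injuries and one high-energy event
# with a defeated control. None of this is real site data.
events = [
    Recordable("sprained ankle on a stairway", False, True),
    Recordable("cut finger opening a crate", False, False),
    Recordable("strained back lifting a valve", False, True),
    Recordable("dust in eye, goggles worn", False, True),
    Recordable("minor burn during startup, interlock in bypass", True, False),
]

share = sif_potential_share(events)
print(f"{share:.0%} of recordables carry SIF potential")
```

In this made-up sample, a program that eliminated the four low-energy injuries would cut the recordable count by most of its value and leave the one event with fatality potential exactly where it was.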

Work as imagined, work as done

If you want to understand why the procedure and the practice diverge, the cleanest account is Sidney Dekker's distinction between work as imagined and work as done^[3]. Work as imagined is the procedure and the plan; work as done is what the crew actually does to get the job out the door. Dekker's point, correct and underused, is that the gap between the two is permanent and often necessary, not a discipline problem. The procedure cannot anticipate every degraded pump and missing part, so the floor improvises, and the improvisation becomes the real operating method that nobody wrote down. Treat the worker as a problem to correct and you lose the one person who can map it.

Todd Conklin built a practical instrument out of the same insight. His learning teams^[4] ban the two questions that kill candor, what rule was broken and who is at fault, and keep the ones I would put on the wall of every incident room. What surprised you. What did the procedure assume that was not true. What do you manage around to get the job done.

“Workers are not the problem to be controlled. They are the solution to be harnessed.”

TODD CONKLIN, ON LEARNING TEAMS AND PRE-ACCIDENT INVESTIGATION

Behind both of them sits James Reason, who died in 2025^[5] and remains the most cited theorist in the field. His Swiss cheese model says accidents happen when holes in successive layers of defense line up, and the holes at the sharp end, the unsafe acts, are created upstream by latent conditions: decisions about design, staffing, budget, and schedule made far from the event. The failure type I think about most is incompatible goals, the honest word for what the floor lives every shift: the production target and the safety rule collide, somebody chooses in real time, and the choice is never recorded. Reason's discipline is to never stop at the unsafe act, but to trace the latent condition and the goal conflict that made it rational.

The same plant, described twice

Here is where the diagnostic move gets concrete. The Baker Panel visited all five BP US refineries, interviewed more than 700 employees, surveyed about 7,500, and reviewed over 340,000 pages of documents^[6], checking testimony against physical reality. The signature instrument was simple: ask the same culture questions of executives, managers, and hourly workers, and report the divergence by level. The dss+ Bradley Curve, backed by a database of over 5.5 million survey responses^[7], makes the same move its headline finding. What matters is not where a site lands on the maturity curve in the absolute, but the gap between how the hourly workers score the place and how the managers score it. When the plant manager and the operators describe two different plants, you have learned more than any score could tell you.
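The divergence-by-level reading can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the Baker Panel's or dss+'s actual scoring: the 1-to-5 scale, the level labels, the function names, and every response below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_level(responses):
    """Average culture score per organizational level.

    `responses` is a list of (level, score) pairs; the 1-to-5 scale
    and the level names are assumptions made for this example.
    """
    by_level = defaultdict(list)
    for level, score in responses:
        by_level[level].append(score)
    return {level: mean(scores) for level, scores in by_level.items()}


def perception_gap(responses, upper="manager", lower="hourly"):
    """How much higher the upper level scores the site than the lower level."""
    means = mean_score_by_level(responses)
    return means[upper] - means[lower]


# Invented answers to the same culture question, asked at two levels.
responses = [
    ("manager", 4), ("manager", 5),
    ("hourly", 3), ("hourly", 2), ("hourly", 3), ("hourly", 2),
]

print(mean_score_by_level(responses))
print("manager-hourly gap:", perception_gap(responses))
```

The site-wide average of these six invented answers looks unremarkable. The number worth reporting is the gap between the two levels, which is the point of asking the same question at every altitude.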

Figure: Below the line or above it: Bradley Curve stages, centered on the ladder midpoint. Each stage shown as its distance in points above or below the ladder midpoint. Source: dss+ Bradley Curve maturity stages; database of 5.5M+ responses.

I want to push back a little, because the survey instruments can flatter themselves. A perception survey locates a sick unit; it does not tell you why. Dov Zohar, who invented the safety climate construct in 1980^[8], drew the line that matters: climate is workers' shared read on which behaviors actually get rewarded under pressure, and it forms at the work-group level, not the company level. A single site can hold an excellent climate and a terrible one a shift over. The diagnostic is not a slogan audit. It is a question: what did the supervisor do the last time production and safety collided. The truth is in the last collision, not the values statement.

Drift, and the things that used to stop the job

The slowest and most dangerous failure has no single moment you can point to. Diane Vaughan called it the normalization of deviance^[9] in her reconstruction of the Challenger decision. Each flight that came back with O-ring damage made the next one's damage acceptable, the baseline shifted by degrees, and the whole thing drifted toward disaster while everyone believed they were being reasonable. Drift is invisible from inside the system that is drifting. The only way to measure it is to ask the long-tenured people what would have stopped the job ten years ago that does not stop it today, and to notice when they laugh before they answer. The laugh is the data: the thing that should still be alarming has become a Tuesday.

Andrew Hopkins, reading the inquiry records of single disasters, turned this into an attention rule I use. His Failure to Learn shows Texas City repeating the lessons of Longford almost exactly^[10]: warning signs visible and ignored, attention fixed on lost-time injuries while the major-hazard indicators went unmanaged. Ask what the site measures and pays bonuses on, because that is where attention goes, and major-hazard risk is almost never in the bonus. Weick and Sutcliffe^[11] sharpen the instinct into a test of episodes, not survey items: the last time a junior person disagreed with a senior call, what happened.

Why it stays in their heads

Pull these threads together and they converge on one mechanism, the thing I actually believe after standing in enough of these rooms. The information that predicts the next event almost never lives in the data systems. It lives in the heads of operators, mechanics, nurses, drivers, and foremen, and stays there because every channel built to collect it has historically punished honesty. The operator knows which near miss never became a report. The EHS manager knows which unit they would not let their own kid work in.

And it gets quieter every level it climbs, until the board hears a green status color. A chilled environment, the worst case, is defined entirely by the reports that were not filed, and you cannot see the absence of a report on a chart.

Figure: What the floor knows, and how little of it reaches the board. Share of frontline-known signal surviving to each level. Sources: Baker Panel (leadership did not hear bad news); the EHS 'honest denominator', under which real near misses run at 5x or more the reported count.

The grim part is that the answer is not unknowable. It is unasked. Every expert above is describing the same recovery move: get a neutral party to ask a specific person about a specific recent moment, somewhere the answer cannot be used against them. The DOE Office of Enterprise Assessments wrote down the most explicit version^[12], a protocol of structured interviews mapped to behavioral anchors so testimony becomes comparable across assessors. The NRC's standard for a safety-conscious work environment^[13] is the same thing in regulatory language. None of it is exotic. The Baker Panel, the DOE protocol, the nuclear assessments, the learning teams, every one is at bottom a structured program of confidential interviews. Organizations do not lack the answer because it is hidden. They lack it because the only channels they built are the ones the floor long ago learned to lie into.

Which points at the kind of instrument this calls for. Not another perception survey, which finds the sick unit and stops there, and not another audit, which the site grooms for the week before. Something closer to what the canon has described for forty years: a neutral, confidential conversation, anchored to a real recent shift, run for enough people that no answer traces back to one person, and read at the altitude where each of them actually knows something. That is the instrument we are building at Latent Variables. The literature already told us where the next fire sits, and why nobody wrote it down; the open problem was only ever reaching the people who know in time.

REFERENCES

1. U.S. Chemical Safety Board, Investigation Report: Refinery Explosion and Fire, BP Texas City (2007); 15 killed, 180 injured; ~$2.1B civil settlements, $87.4M OSHA fine. www.csb.gov/assets/1/20/csbfinalreportbp.pdf
2. Krause Bell Group, serious injury and fatality (SIF) precursor research: SIFs have different precursors than minor injuries, and roughly a fifth of recordables carry SIF potential. krausebellgroup.com/what-is-a-sif-precursor
3. Sidney Dekker, Safety Differently and The Field Guide to Understanding Human Error: work as imagined versus work as done. sidneydekker.com
4. Todd Conklin, Pre-Accident Investigations: Better Questions (Routledge); learning teams and operational learning. www.routledge.com/Pre-Accident-Investigations-Better-Questions---An-Applied-Approach-to-Operational-Learning/Conklin/p/book/9781472486134
5. James Reason, organizational accidents, latent conditions, and the Swiss cheese model; with Hudson, Tripod Delta for Shell. FONCSI obituary, 2025. www.foncsi.org/en/news/james-reason-passed-away
6. BP US Refineries Independent Safety Review Panel (Baker Panel), January 2007: 700+ interviews, ~7,500 surveyed, 340,000+ pages reviewed; see Rodriguez et al., Journal of Safety Research (2011). safetyclimate.sites.tamu.edu/wp-content/uploads/sites/96/2016/05/Rodriguez-et-al.-2011-Impact-of-BP-Baker-Report.pdf
7. dss+ (formerly DuPont Sustainable Solutions), Bradley Curve and Safety Perception Survey; benchmark database of 5.5M+ responses across 45 countries. www.consultdss.com/transform-culture/dss-bradley-curve
8. Dov Zohar and Gil Luria, multilevel model of safety climate, Journal of Applied Psychology (2005); group-level climate and microaccidents (2000). pubmed.ncbi.nlm.nih.gov/16060782
9. Diane Vaughan, The Challenger Launch Decision (University of Chicago Press, 1996): the normalization of deviance. en.wikipedia.org/wiki/Normalization_of_deviance
10. Andrew Hopkins, Failure to Learn: The BP Texas City Refinery Disaster and Lessons from Longford. www.processsafety.com.au/books/failure-to-learn
11. Karl Weick and Kathleen Sutcliffe, Managing the Unexpected: high reliability organizing and the five principles of collective mindfulness. www.oreilly.com/library/view/managing-the-unexpected/9780787996499
12. U.S. DOE Office of Enterprise Assessments, independent assessments of safety culture (Hanford WTP, 2015 and 2023); structured interviews with behaviorally anchored rating scales. www.energy.gov/documents/safety-culture-survey-methods-hanford-june-2023pdf
13. U.S. NRC, Safety Conscious Work Environment (SCWE); INPO 12-012, Traits of a Healthy Nuclear Safety Culture; NUREG-2165. www.nrc.gov/about-nrc/safety-culture/scwe