This study sought to compare ED doctors’ gestalt with screening tools for identifying sepsis. The tools included SIRS, qSOFA, SOFA, MEWS, and a machine learning model based on LASSO regression.
We would intuitively believe that these screening tools are an improvement over gestalt. Why would we use them if they weren't better? Oddly enough, the screening tools have never been compared to gestalt. What if they are worse?
This study took place in a single well-functioning ED
in the USA with over 100,000 patient visits per year. They included
undifferentiated, critically ill adults who went to the resuscitation area of their ED. They excluded trauma, patients with an obvious cause of illness, STEMI, stroke, and transfers.
At 15 and 60 minutes after presentation to the ED, a trained independent researcher asked the ED consultant a single question: “What is the likelihood that this patient has sepsis?” They rated their answer on an iPad displaying a slider from 0 (no infection) to 100 (infection).
The gold standard diagnosis of sepsis was determined by an ICD-10 code at hospital discharge. Gestalt and screening tool performance were compared, appropriately, using the area under the receiver operating characteristic curve (AUC).
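For readers who want to see what this kind of head-to-head comparison actually involves, here is a minimal sketch of an AUC calculation in Python. The labels and scores are entirely invented for illustration (they are not study data), and it assumes scikit-learn is available; roc_auc_score does the work.

# Toy AUC comparison; all numbers below are hypothetical, not from the study.
from sklearn.metrics import roc_auc_score

# 1 = discharge diagnosis of sepsis, 0 = no sepsis (hypothetical labels)
sepsis = [1, 0, 0, 1, 0, 1, 0, 0]

# Physician gestalt recorded on the 0-100 likelihood slider (hypothetical values)
gestalt = [85, 10, 30, 70, 5, 90, 20, 15]

# qSOFA points for the same patients (hypothetical values)
qsofa = [2, 1, 0, 1, 0, 2, 1, 1]

# AUC only cares about how well each score ranks septic above non-septic
# patients, so a 0-100 slider and a 0-3 point score can be compared directly.
print("Gestalt AUC:", roc_auc_score(sepsis, gestalt))
print("qSOFA AUC:  ", roc_auc_score(sepsis, qsofa))

The same idea scales to the real study: each tool produces a score per patient, the discharge diagnosis provides the label, and the AUCs are compared.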
Results?
2,484 patients met eligibility. 275 (11%) had sepsis.
Which was better? Gestalt or screening tools?
Gestalt was crazy good, with an AUC of 0.9. This is really exceptional. The other tools performed worse: the machine learning model had an AUC of 0.84 (pretty good), MEWS 0.72 (good), and qSOFA, SIRS, and SOFA all sat around 0.67 (not great).
Don’t forget, an AUC of 0.5 is considered random… like flipping a coin.
Although I agree with the overall message of this paper,
there are some important limitations. I’ll only mention a few.
They included only very sick patients who went to the resuscitation area. Sepsis should be more obvious in this cohort, and I’m guessing this is why gestalt performed so well. What I am really interested in are the subtle cases that may still be sitting in the waiting room. One could argue that they studied the wrong patient population.
This was a single centre that used the gestalt of a highly experienced consultant. In addition, they were very quick and efficient at getting diagnostic tests back, including x-rays, bloods, vital signs, and POCUS.
The paper relied on a hospital discharge diagnosis of sepsis as the gold standard. It is very likely that some patients who presented to the ED without sepsis subsequently developed it while in hospital. This would bias the results against gestalt.
What is my take home message?
Before we rely upon screening tools or clinical decision instruments, we must know that they are better than what we are currently doing. As demonstrated in this limited study, tools may perform worse!
Covering: Knack SK, Scott N, Driver B, et al. Early Physician Gestalt vs Usual Screening Tools for the Prediction of Sepsis in Critically Ill Emergency Patients. Ann Emerg Med. 2024;84:246-258. [Link to article]