When we feel sick, or something seems wrong, most of us want an answer reasonably fast. Certainly, you can make an appointment with your primary care doctor or go to urgent care. But what if you feel too lousy to see a doctor in person? Or, what if you think your condition might be too minor to bother with a doctor’s appointment? Can you trust symptom checker apps or websites to provide a proper diagnosis? And how do these apps and sites measure up against a diagnosis from a doctor? Moreover, if you use an app or website to check your symptoms, do you also need to see a doctor?
Symptom checker apps and websites – what are they?
Symptom checker apps and websites use artificial intelligence (AI) to assess your symptoms, suggest possible causes, and indicate how urgently each suggested condition may need attention. If you visit your phone’s app store, or search on your computer, you will find numerous symptom checkers.
Can you trust symptom checker apps to diagnose you?
Babylon Health stated that its symptom checker app can diagnose medical conditions better than humans, yet this bold assertion appears to be untrue. In fact, researchers found that doctors were much more likely than symptom checker apps to make a correct diagnosis. In this study, published in 2020, three experienced doctors identified the “gold-standard” diagnosis and urgency level for 200 clinical vignettes. The researchers then evaluated the diagnosis and urgency advice of 8 apps and 7 doctors based on the vignettes. The doctors listed the correct diagnosis in their top three suggestions significantly more often than the apps did. However, the results for diagnostic accuracy and urgency advice ranged widely among the apps. Moreover, the researchers found significant variation in the number of conditions the apps could even assess.
Importantly, this study was funded by Ada Health, a company with its own symptom checker app. However, before you assume bias, note that the study was peer-reviewed and published in the reputable journal BMJ Open.
The accuracy of the diagnoses.
Based on the clinical vignettes, doctors listed the correct diagnosis among their top three suggestions 82% of the time. In comparison, the apps listed the “required answer” diagnosis among their top three suggestions significantly less often, as follows:
- Ada – 70.5%
- Buoy – 43%
- K Health – 36%
- Mediktor – 36%
- WebMD – 35.5%
- Babylon – 32%
- Symptomate – 27.5%
- Your.MD – 23.5%
Importantly, the study’s authors note that some of these apps do not provide diagnoses for certain user demographics or conditions (such as children or pregnant women), leading to lower scores. In these cases, the apps’ overall performance was generally better when the vignettes for which they offered no diagnosis were omitted.
Why such a large gap in diagnostic accuracy?
Importantly, the scope of possible conditions among the apps was highly variable, with some apps not offering any suggestions at all for many users, which led to lower overall scores. For example, Babylon, which scored only 32% on diagnostic accuracy, didn’t offer a suggestion for roughly half of the cases in the study.
Specifically, some symptom checkers should not be used for children, while others do not diagnose those with mental health conditions and/or pregnant women. In other cases, apps didn’t recognize symptoms presented, or would not suggest a condition for users with severe symptoms.
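To see how much coverage gaps can drag down an overall score, here is a rough back-of-the-envelope sketch in Python using the Babylon figures above. It treats “roughly half” as exactly 50% coverage, which is an assumption made only for illustration, and the function name is hypothetical. The result is not a figure reported by the study.

```python
def accuracy_when_answered(overall_accuracy, coverage):
    """Share of correct top-three answers among only the cases the app attempted.

    overall_accuracy: fraction of ALL cases with a correct top-three answer.
    coverage: fraction of all cases for which the app offered any suggestion.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be greater than 0 and at most 1")
    return overall_accuracy / coverage


# From the text: Babylon scored 32% overall.
babylon_overall = 0.32
# Assumption: "roughly half" is treated as exactly 50% for illustration.
babylon_coverage = 0.50

adjusted = accuracy_when_answered(babylon_overall, babylon_coverage)
print(f"Accuracy on the cases Babylon attempted: {adjusted:.0%}")
```

Under that assumption, Babylon’s accuracy on the cases it actually attempted would be about double its overall score. This illustrates why the apps generally looked better once the vignettes they could not assess were left out.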
The accuracy of urgency advice.
The doctors and the apps were also evaluated for safety based on their recommendations regarding the urgency of each diagnosis, such as whether a patient needed to be seen within the next day. Most apps erred on the side of caution, but in a few cases, an app’s suggestions were potentially unsafe.
Interestingly, the difference regarding urgency between the doctors and the apps was narrow. The urgency advice from doctors was correct 97% of the time. In comparison, the percentage of correct urgency advice from the apps was as follows:
- Symptomate – 97.8%
- Ada – 97%
- Babylon – 95.5%
- Your.MD – 92.6%
- Mediktor – 87.3%
- K Health – 81.3%
- Buoy – 80%
Other researchers find problems with symptom checker apps.
Researchers evaluated the accuracy of 12 publicly available symptom checkers using 50 clinical vignettes. Their report, published in July 2021, found the mean diagnostic accuracy of the symptom checker systems was poor. In fact, they found the correct diagnosis listed as a top five diagnostic option only 51% of the time. Although they found a wide variation in performance among the 12 tested platforms, they concluded that the overall performance is significantly below what would be accepted in any other medical field. They state that external validation and regulation are urgently required to ensure these tools are safe.
Symptom checker apps are improving. But is it enough?
Although the percentage of accurate diagnoses leaves much room for improvement, the quality of symptom checker apps has improved. A report published in 2015 showed that 23 symptom checkers listed the correct diagnosis first only 34% of the time. Additionally, the advice regarding urgency was correct only 57% of the time. Even with the gains since then, these tools are certainly not yet reliable.
Can you trust symptom checker apps for COVID-19?
A recent study evaluated the safety and efficacy of COVID-19 symptom checker sites to determine if they accurately identified potential COVID-19 cases and subsequently directed the patient to seek the proper level of care. The study assessed the government-run sites used by four countries: Singapore, Japan, the US, and the UK. For each country’s site, the researchers tested the same 52 simulated cases covering a range of COVID-19 presentations (mild, moderate, severe and critical), as well as conditions that resemble COVID-19, such as sepsis and bacterial pneumonia. Unfortunately, the results were dismal.
Both the US and UK symptom checkers consistently failed to identify severe COVID-19, bacterial pneumonia and sepsis, recommending that these patients stay home instead of seeking medical care. In fact, the US checker was correct only 38% of the time. In contrast, the symptom checkers for Singapore and Japan fared much better, properly identifying 88% and 77% of the simulated cases, respectively.
The study’s authors conclude there is potential for these symptom checker tools to cause worse outcomes by causing patients to delay seeking appropriate medical care.
Should you trust symptom checker apps?
Personally, I think it’s evident that you should not rely on these tools to provide a definitive diagnosis. If the findings from these recent studies don’t convince you, consider that a recent article in MedCity News states that Babylon emphasizes that its symptom checker “app is not intended to be used as a diagnostic tool”.
So, if you can’t rely on these apps and websites to provide a definitive diagnosis, should you use them at all? In my opinion, NO. But if you can’t resist the temptation to use these symptom checker apps and websites, you should proceed with caution. Although you may receive a possible diagnosis from these apps, I strongly recommend you also get medical advice from a doctor.
A word of caution regarding diagnostic errors.
Even the best doctors occasionally misdiagnose patients. In fact, according to the Society to Improve Diagnosis in Medicine, “diagnostic error is one of the most important safety problems in health care today, and inflicts the most harm. Major diagnostic errors are found in 10% to 20% of autopsies, suggesting that 40,000 to 80,000 patients die annually in the US from diagnostic errors”.
Since knowledge empowers, read these blog posts to learn more about reducing your risk of diagnostic error – even if you always choose human doctors over symptom checker apps and websites:
- 10 Steps to Reduce Your Risk of Diagnostic Error.
- Radiology Diagnostic Errors Are Surprisingly High.
- Should you Speak Up if You Think Your Doctor is Wrong? YES!
- Learn a Lesson From Serena Williams: Trust Your Instincts When it Comes to Your Health.
NOTE: I updated this post on 7-19-21.