Illustration = ChatGPT DALL·E 3

Can artificial intelligence (AI) replace doctors? The possibility is growing. AI has already acquired medical knowledge and passed the medical licensing examination. In hospitals, it is helping doctors analyze medical images and test results. Now it has even developed the ability to converse directly with patients and infer diseases. AI doctors developed by major corporations such as Microsoft (MS) and Google have outperformed human doctors in both diagnostic accuracy and diagnostic cost.

Academics expect that if AI doctors acquire reasoning abilities like human doctors, they could revolutionize the medical system. It is said that AI doctors can significantly reduce unnecessary medical expenditures by assisting in complex cases where it is difficult for doctors to make judgments alone. U.S. medical expenditures already approach 20% of the gross domestic product (GDP), with up to 25% of this estimated to be wasted on treatments that have little impact on outcomes.

◇"My throat hurts..." Consult an AI doctor

MS unveiled its disease-diagnosing AI, the "Microsoft AI Diagnostic Orchestrator (MAI-DxO)," on the 30th of last month. The AI diagnoses diseases by conversing with patients through a chat window, just as a doctor would in an examination room.

For example, if a patient types "I have a sore throat and my tonsils are swollen; I have been hospitalized. Antibiotics haven't helped" into the chat window, the AI follows up with "Do you have fever, weight loss, or fatigue?" The AI then synthesizes the conversation and concludes that there is a tumor on the right tonsil, catching something a doctor might have overlooked.

MAI-DxO is built on generative AI, which learns from large datasets, identifies patterns, and produces text, images, and video. Earlier generative AI learned from medical information and the myriad diagnostic cases available online, and scored nearly perfectly on the U.S. Medical Licensing Examination. However, it had never actually seen patients the way a real doctor does. MS stated, "While previous AI passed the medical examination, it was limited to solving multiple-choice questions; this time, we have developed AI to the level of inferring patient conditions."

MS tested MAI-DxO on 304 patient cases from Massachusetts General Hospital (MGH) published in the international journal "The New England Journal of Medicine (NEJM)," and compared its diagnoses with those of 21 doctors from the U.S. and the U.K. who had 5 to 20 years of experience. The AI's diagnostic accuracy was 85.5%, about four times the doctors' average of roughly 20%. MS's AI doctor also outperformed other AI models, including Gemini and DeepSeek. The results were posted on the preprint site arXiv the same day.

Dr. Eric Topol, director of the Scripps Research Translational Institute in the U.S., told Time magazine on the 2nd, "A diagnostic accuracy four times that of humans is a very large figure compared with previous cases. Most AI-human differences were about 10%, making this a truly big leap."

Dr. Topol particularly highlighted the cost. He said, "AI was not only more accurate but also much cheaper." Doctors listen to a patient's symptoms, decide which tests are necessary, and then use the results to verify whether their inference is correct. Every test and medical image incurs a cost. According to MS, the AI doctor made more accurate diagnoses at an average cost 20% lower than that of human doctors.

Comparison of the diagnostic accuracy and average diagnostic testing cost of various AI doctors. The horizontal axis represents the cost per diagnosis, and the vertical axis represents diagnostic accuracy. MS's AI doctor MAI-DxO surpasses not only humans but also other AIs such as Gemini, Grok, and DeepSeek. The red cross at the bottom shows the average diagnostic performance of 21 practicing physicians./MS

◇Seems likely to reduce medical expenditures... but caution is needed regarding hallucinations

Other major corporations are also developing AI doctors. Apple is preparing what it calls an "AI doctor." The Apple Watch, a smart device worn on the wrist, will collect information such as heart rate and integrate it with health apps to provide customized health management. The Apple Watch has already been qualified by the U.S. Food and Drug Administration (FDA) as a medical device development tool for its ability to detect irregular heartbeats. The U.S. business magazine Forbes predicts that Apple may reveal related details as early as next year.

Google is also developing a conversational AI diagnostic system that mimics the conversation between doctor and patient. Named the "Articulate Medical Intelligence Explorer (AMIE)," this AI likewise collects information from patients and interprets their symptoms, imitating the reasoning process of human doctors. According to initial test results released by Google in January last year, it outperformed doctors: the initial version of Google's system diagnosed 59% of the presented cases accurately, while human doctors diagnosed only 33% correctly.

Google also revealed the research process for its medical generative AI, Med-Gemini, last August. It instructed Med-Gemini to write a diagnosis after viewing a chest X-ray. When doctors were shown the results without being told they came from an AI, 72% judged Med-Gemini's diagnosis to be similar to or better than that of a doctor. Google is also preparing technology that analyzes patients' cough sounds to identify potential issues.

AI is more accurate and makes quicker decisions than humans, so waiting times for patient treatment are expected to shorten. The National Bureau of Economic Research (NBER) projected in a report in March last year that if medical AI resolves the problem of over-treatment, it could reduce U.S. medical expenditures by $200 billion to $360 billion (278 trillion to 500 trillion won) annually. This represents 5% to 10% of U.S. medical expenditures.
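As a rough consistency check on these figures, the short sketch below back-calculates the total U.S. medical spending and the won-dollar exchange rate that the quoted numbers imply. It assumes that the low savings figure pairs with the low percentage and the high with the high, which the article does not state. The function and variable names are illustrative only and do not come from the NBER report.

```python
# Figures quoted in the article: savings in billions of USD,
# the same savings in trillions of won, and the share of spending.
savings_usd_bn = (200, 360)
savings_krw_tn = (278, 500)
share_of_spending = (0.05, 0.10)


def implied_total_spending_bn(savings_bn, share):
    """Total spending (USD billions) implied by savings = share * total."""
    return savings_bn / share


def implied_won_per_dollar(krw_trillion, usd_billion):
    """Exchange rate implied by a (trillion won, billion USD) pair."""
    return (krw_trillion * 1e12) / (usd_billion * 1e9)


# Assumption: low pairs with low, high pairs with high.
totals = [
    implied_total_spending_bn(s, p)
    for s, p in zip(savings_usd_bn, share_of_spending)
]
rates = [
    implied_won_per_dollar(k, u)
    for k, u in zip(savings_krw_tn, savings_usd_bn)
]

print("Implied total spending (USD bn):", [round(t) for t in totals])
print("Implied won per dollar:", [round(r) for r in rates])
```

Under that pairing, both ends of the range imply total spending of a few trillion dollars a year and a consistent exchange rate, so the quoted figures agree with one another.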

However, caution is needed regarding AI hallucinations. A hallucination occurs when an AI that does not know the correct answer gives a plausible but false one. Blindly trusting AI doctors could have irreversible effects on patients' lives. Time stated, "Approval from regulatory agencies is needed to ensure that AI-based decision-making does not harm patients."

The U.S. Food and Drug Administration (FDA) has not yet clarified its stance on whether AI diagnostic systems are medical devices. MS stated that its AI doctor MAI-DxO has likewise not yet received clinical trial approval from the FDA.

References

arXiv (2025), DOI: https://doi.org/10.48550/arXiv.2506.2240

Google (2024), https://research.google/blog/amie-a-research-ai-system-for-diagnostic-medical-reasoning-and-conversations/

※ This article has been translated by AI.