Monday, January 15, 2024

AI Medical Diagnosis

Google has just published a paper on a major trial of their new medical diagnosis AI, which they call AMIE. As the graph above shows, AMIE scored significantly better than primary care physicians on pretty much every metric. (This Google blog post is the best summary.)

This is not especially surprising; one of AI's most famous early achievements was getting very good at predicting the 5-year survival chances for heart disease patients. One analysis I read of that study said that maybe AI did better because it was focused only on quantifiable data, whereas the human physicians were distracted by other factors. Anyway, results like this have been around for twenty years.

In the current study, the investigators took a standard LLM and trained it using transcripts of recorded doctor-patient conversations. But just as chess programs learn by playing against themselves, AMIE honed its skills via "a novel self-play based simulated diagnostic dialogue environment with automated feedback mechanisms to enrich and accelerate its learning process." That is, the AI simulated both the doctor and the patient, holding thousands of conversations with itself.

The investigators then had a bunch of physicians hold conversations with patients via texting, with no face-to-face interactions, while AMIE had similar conversations with other patients. The physicians and AMIE both then issued diagnoses. Both the conversations and the diagnoses were then evaluated by "specialist physicians" who scored them for quality and accuracy without knowing which came from human doctors and which from the AI.
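For the programmers in the audience, here is roughly what that kind of blinded scoring looks like in code. This is my own minimal Python sketch, not anything taken from the Google paper: the function names, the "item-N" IDs, and the numeric scores are all made up for illustration. The idea is just to pool the transcripts, hide who wrote each one, let the raters score them, and only then match the scores back to their sources.

```python
import random
from collections import defaultdict
from statistics import mean


def blind(transcripts, seed=0):
    """Shuffle transcripts and replace source labels with opaque IDs.

    transcripts: list of (source, text) pairs, e.g. ("physician", "...").
    Returns (blinded, key). `blinded` is a list of (item_id, text) pairs
    shown to the raters; `key` maps item_id back to the source and is
    withheld from the raters until scoring is finished.
    """
    rng = random.Random(seed)  # seeded so the shuffle is reproducible
    order = list(transcripts)
    rng.shuffle(order)
    blinded = []
    key = {}
    for i, (source, text) in enumerate(order):
        item_id = f"item-{i}"  # hypothetical ID scheme
        blinded.append((item_id, text))
        key[item_id] = source
    return blinded, key


def unblind(scores, key):
    """Average the raters' scores per source after rating is complete."""
    by_source = defaultdict(list)
    for item_id, score in scores.items():
        by_source[key[item_id]].append(score)
    return {source: mean(vals) for source, vals in by_source.items()}
```

The raters only ever see the output of `blind`; the `key` stays with the investigators until every item has a score, at which point `unblind` produces one average per source.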

Two things to note about this: 1) the physicians did not have face-to-face interactions with the patients, which at least some doctors think makes a big difference (which is why telemedicine is controversial), and 2) the scoring was still done by humans. Since we have other data (from the cardiology studies) showing that in some cases AI can be better than any human, relying on human scorers limits how accurate the whole study can be.

On another note, some people think the future will be doctors using AI assistants, but Google has data showing that AMIE is more accurate than a doctor/AI combo.

Google tries to soften the blow by saying this is all about supplying medical expertise in parts of the world where doctors are in short supply, but I think this points to a vast array of professions in which AI already is or soon will be better than people.

I, for one, would be happy to see investment bankers be entirely replaced by machines.

4 comments:

David said...

It would be interesting to know if there's any pattern in where the human doctors go wrong. For example, is there a suggestion that humans tend, as a human proclivity, to predict better or worse outcomes? Do they have a bias toward whatever they were taught/was current/what they remember from medical school, vs. looking at the whole most up-to-date understanding (if the latter is what the AI in fact does)?

Here's a good side to AI instead of human doctors: infinite patience with hypochondria.

Here's two bad sides (this goes for replacing investment bankers with AI too): all those fired people feeling bad about themselves, and all those saved salaries going to the .1%.

Here's a nice idea: UBI.

Here's a fun idea: a grand Sino-American alliance to crush all tax havens, starting with Peter Thiel's private island.

G. Verloren said...

"A computer can never be held accountable - therefore, a computer must never make a management decision."

- IBM Training Manual, 1970s

Susi said...

I haven’t seen many humans held accountable for their decisions. I’d prefer a decision made by myself based on an AI interaction. I have found few medical personnel, especially doctors, who are capable of putting together multiple symptoms for a diagnosis. It’s been indicated by studies that time of day impacts the ability of medical personnel to provide care. The problematic time is when the patient can’t give input.

David said...

It would be interesting if drug or medical insurance companies get input on the algorithms the AI uses.