GPT-4 performance comparable with physicians on official medical board residency examinations. Model performance near or above official passing rate in all medical specialties tested

cyu@sh.itjust.works to Technology@lemmy.ml – 58 points –
ai.nejm.org
42

You are viewing a single comment

It's just a multiple choice test with question prompts. This is the exact sort of thing an LLM should be very good at. This isn't chat gpt trying to do the job of an actual doctor, it would be quite abysmal at that. And even this multiple choice test had to be stacked in favor of chat gpt.

Because GPT models cannot interpret images, questions including imaging analysis, such as those related to ultrasound, electrocardiography, x-ray, magnetic resonance, computed tomography, and positron emission tomography/computed tomography imaging, were excluded.

Don't get me wrong though, I think there's some interesting ways AI can provide some useful assistive tools in medicine, especially tasks involving integrating large amounts of data. I think the authors use some misleading language though, saying things like AI "are performing at the standard we require from physicians," which would only be true if the job of a physician was filling out multiple choice tests.

I, too, can pass the Boards if you remove all the questions I don't understand.

I’d be fine with LLMs being a supplementary aid for medical professionals, but not with them doing the whole thing.