Deepfake images can mislead viewers, upend elections, and instigate violence. They can also, researchers say, disrupt medical care.
In a study published Tuesday in Radiology, an international team of researchers tested whether 17 radiologists could tell the difference between real X-rays and those generated by ChatGPT. Only 41% noticed that anything was awry when they were initially asked to diagnose patients based on the synthetic images. Even once they knew to look out for deepfake X-rays, they correctly distinguished them only 75% of the time.
These images aren’t hard to generate: The researchers used simple prompts to get ChatGPT to spit out X-rays with a specified anatomical location, disorder, and level of noise. But as easy as it is for the models to create convincing radiographs, they can’t reliably detect them. Four multimodal models, including the one that generated the images, accurately distinguished deepfakes just 57% to 85% of the time.
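To give a sense of how low the barrier is, here is a minimal sketch of the kind of request involved, using OpenAI's Python SDK and its image-generation model. The study's actual prompts, model version, and tooling are not specified in this excerpt, so the model name, prompt wording, and file handling below are illustrative assumptions, not the researchers' method.

```python
# Illustrative sketch only: generating a synthetic radiograph from a text
# prompt. Assumes the OpenAI Python SDK (`pip install openai`) and an API
# key in the OPENAI_API_KEY environment variable. The prompt mirrors the
# article's description (anatomical location, disorder, noise level); the
# study's real prompts and model are not given here.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "A frontal chest X-ray showing a right lower lobe pneumonia, "
    "with moderate film grain and typical radiographic noise."
)

result = client.images.generate(model="gpt-image-1", prompt=prompt)

# The images endpoint returns base64-encoded image data.
with open("synthetic_xray.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```

A plain-language prompt like this is all the specification required; no medical imaging expertise or specialized software enters the loop.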