There’s a lot of noise out there in the machine-learning modeling world, but this demo is really impressive — or scary, if you’re inclined to climb into MRI scanners recreationally.
The new research is presented in a paper titled “High-Resolution Image Reconstruction with Latent Diffusion Models from Human Brain Activity,” co-authored by Professor Shinji Nishimoto and Assistant Professor Yu Takagi of the Graduate School of Frontier Biosciences (FBS) at Osaka University. What the techs have done is find a way to feed fMRI brain scans into the open source Stable Diffusion latent diffusion model, created by billion-dollar startup unicorn Stability AI.
The results are surprising to say the least. Presented with the output of an fMRI brain scan – which appears very close to random noise to our eyes – the researchers’ constrained diffusion model can, in their words:
reconstruct high-resolution images with high fidelity in straightforward fashion, without the need for any additional training and fine-tuning of complex deep-learning models.
The preprint shows five recovered images: a teddy bear, complete with bow tie; an avenue of trees; a jet plane landing (or possibly taking off); a snowboarder on the slopes; and a tapered bell tower. The level of agreement varies, and a sixth image, of a steam locomotive, is less clear, but overall the match is remarkably good, even if the researchers have chosen the best of their results, as we suspect they would naturally be inclined to do.
The techs say the source code for their model will be “available soon”. Their input data came from four of the eight volunteers whose scans are in the University of Minnesota’s public Natural Scenes Dataset, or NSD, and the sample images in the paper are from one person.
Stable Diffusion itself has become famous for taking textual descriptions and generating sometimes very realistic images from just a handful of words – and, if the words are chosen carefully enough, the text can call up the original images used to train the model.
So while this isn’t exactly a computer that reads someone’s mind, it produces significantly better results than, say, previous attempts in this direction that we reported on in 2021. If we follow the paper correctly, the researchers use Stable Diffusion to fill in the recovered images with elements from its training data. For comparison, about 12 years ago, a comparable paper [PDF] that used Bayesian statistics and modeling did produce some recognizable images, but of considerably lower quality.
As we’ve reported in the past, the claims of fMRI research have long been controversial, but this is the kind of area where machine learning and neural network algorithms can be most useful: finding, correlating, and matching very weak signals against their vast libraries of images to produce easily recognizable results.
Functional MRI is a subset of magnetic resonance imaging, or nuclear magnetic resonance imaging as it used to be called before the techies realized the “n-word” scared people off. The scanners involved are extremely large (and, as this Vulture can attest from having been in more than one, extremely noisy) machines. No one is going to point a parabolic dish at your head from across the street and read what’s on your mind. But if you sign some waivers first, then lie with your head clamped still for an hour while a huge doughnut-shaped magnet revolves around it, yes, this kind of technique might be able to tell which photo you’re looking at.
The two profs have a page about their work, and you can read the abstract or the entire 11-page paper [PDF] on the bioRxiv preprint server. They will present their findings at this year’s CVPR in Vancouver in June. ®