The technology to decode our thoughts is getting closer and closer. Neuroscientists at the University of Texas have for the first time decoded data from noninvasive brain scans and used it to reconstruct language and meaning from stories people hear, see or even imagine.
In a new study published in Nature Neuroscience, Alexander Huth and his colleagues managed to recover most of the language, and sometimes exact sentences, from functional magnetic resonance imaging (fMRI) brain recordings of three participants.
Technology that can create language from brain signals could be extremely useful for people who cannot speak due to conditions such as motor neuron disease. At the same time, it raises concerns about the future privacy of our thoughts.
Language decoders, also called “speech decoders”, aim to use recordings of a person’s brain activity to discover the words they hear, imagine or speak.
Until now, speech decoders have only been used with data from devices surgically implanted in the brain, which limits their usefulness. Other decoders that used noninvasive recordings of brain activity were able to decode single words or short sentences, but not continuous language.
The new research used the blood oxygen level-dependent signal from fMRI scans, which shows changes in blood flow and oxygenation in different parts of the brain. By focusing on patterns of activity in brain regions and networks that process language, the researchers found that their decoder could be trained to reconstruct continuous language (including some specific words and the general meaning of sentences).
Specifically, the decoder took three participants’ brain responses as they listened to stories and generated word sequences that could have produced those brain responses. These word sequences did a good job of capturing the gist of the stories, and in some cases included exact words and phrases.
The researchers also asked participants to watch silent movies and imagine stories while being scanned. In both cases, the decoder was often successful in predicting the gist of the stories.
For example, a user thought “I don’t have my driver’s license yet”, and the decoder predicted “she hasn’t even started learning to drive yet”.
Additionally, when participants actively listened to a story while ignoring another simultaneously played story, the decoder could identify the meaning of the actively listened story.
How does it work?
The researchers started by asking each participant to lie inside an fMRI scanner and listen to 16 hours of storytelling while their brain responses were recorded.
These brain responses were then used to train an encoder – a computer model that attempts to predict how the brain will respond to words a person hears. After training, the encoder could predict quite accurately how each participant’s brain signals would respond to hearing a given string of words.
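The encoder’s job – mapping word features to predicted brain responses – can be illustrated with a toy regression model. This is a minimal sketch, not the authors’ actual system: the feature dimensions, voxel counts and the use of simple ridge regression here are illustrative assumptions.

```python
import numpy as np

# Hypothetical sizes: time points in the training stories, semantic
# features per time point, and recorded fMRI voxels.
rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 200, 16, 50

X = rng.normal(size=(n_timepoints, n_features))   # semantic features of heard words
true_W = rng.normal(size=(n_features, n_voxels))
Y = X @ true_W + 0.1 * rng.normal(size=(n_timepoints, n_voxels))  # simulated voxel responses

# Ridge regression: learn weights that map semantic features to voxel responses.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

def predict_brain_response(features: np.ndarray) -> np.ndarray:
    """Trained encoder: predict voxel responses for new semantic features."""
    return features @ W

# On the training data the encoder should closely track the recordings.
pred = predict_brain_response(X)
corr = np.corrcoef(pred.ravel(), Y.ravel())[0, 1]
```

The key property, as in the study, is that the learned model runs forward only: given words, it predicts brain activity, not the other way round.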
However, going in the opposite direction – from recorded brain responses to words – is trickier.
The encoder model is designed to relate brain responses to “semantic features” or the broad meaning of words and phrases. To do this, the system uses the original GPT language model, which is the precursor to the current GPT-4 model. The decoder then generates word sequences that could have produced the observed brain responses.
The accuracy of each “guess” is then checked by using it to predict previously recorded brain activity, with the prediction then compared to the actual recorded activity.
During this resource-intensive process, multiple guesses are generated at once and ranked in order of accuracy. Bad guesses are discarded and good ones retained. The process continues by guessing the next word in sequence, and so on until the most accurate sequence is determined.
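The guess-and-rank procedure described above resembles a beam search, which can be sketched as follows. This is a toy stand-in, not the published system: in the real study a GPT language model proposes plausible continuations, whereas here a tiny fixed vocabulary and a made-up encoder play that role.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB = ["the", "dog", "ran", "home", "fast"]  # hypothetical tiny vocabulary
N_VOXELS = 8
BEAM_WIDTH = 3  # number of good guesses retained at each step

# Stand-in encoder: a fixed random "brain signature" per word; a sequence's
# predicted response is the mean of its words' signatures.
EMB = {w: rng.normal(size=N_VOXELS) for w in VOCAB}

def predict_response(words):
    """Hypothetical encoder: predicted brain activity for a word sequence."""
    return np.mean([EMB[w] for w in words], axis=0)

def score(words, recorded):
    """Accuracy of a guess: closeness of its predicted response to the recording."""
    return -np.linalg.norm(predict_response(words) - recorded)

def decode(recorded, length):
    """Extend candidate sequences one word at a time, keeping the best guesses."""
    beams = [[w] for w in VOCAB]
    for _ in range(length - 1):
        candidates = [b + [w] for b in beams for w in VOCAB]
        candidates.sort(key=lambda c: score(c, recorded), reverse=True)
        beams = candidates[:BEAM_WIDTH]  # discard bad guesses, retain good ones
    return beams[0]  # most accurate sequence found

# Simulate a recording produced by a "true" sentence, then decode it.
target = ["the", "dog", "ran", "home"]
recorded = predict_response(target)
best = decode(recorded, len(target))
```

Even this toy version shows why the process is resource-intensive: the pool of candidate sequences must be repeatedly expanded, scored against the recorded activity, and pruned.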
Words and meanings
The study found that data from several specific brain regions – including the speech network, the parietal-temporal-occipital association region, and the prefrontal cortex – were needed for the most accurate predictions.
One of the main differences between this work and previous efforts is the kind of data being decoded. Most decoding systems relate brain data to motor features, or to activity recorded in brain regions involved in the final stage of speech production: mouth and tongue movement. This decoder instead works at the level of ideas and meanings.
A limitation of using fMRI data is its low “temporal resolution”. The blood oxygen level-dependent signal rises and falls over a period of about 10 seconds, during which time a person may have heard 20 or more words. Therefore, this technique cannot detect individual words, only the potential meanings of sequences of words.
No need to panic about privacy (yet)
The idea of technology that can “read minds” raises concerns about mental privacy. The researchers conducted additional experiments to address some of these concerns.
These experiments showed that we need not worry just yet about having our thoughts decoded while we walk down the street, or indeed without our extended cooperation.
A decoder trained on one person’s thoughts performed poorly when predicting semantic detail from another participant’s data. Additionally, participants could disrupt decoding by diverting their attention to a different task, such as naming animals or telling a different story.
Movement in the scanner can also disrupt the decoder, because fMRI is very sensitive to motion, so the cooperation of participants is essential. Given these requirements and the need for powerful computing resources, it is highly unlikely that anyone’s thoughts could be decoded against their will at this point.
Finally, the decoder does not currently work on data other than fMRI, which is an expensive and often impractical procedure. The group plans to test their approach on other non-invasive brain data in the future.
This article is republished from The Conversation under a Creative Commons license. Read the original article.