Size does matter

There are plenty of linguistic variables to consider when designing experiments on language processing, and the list is growing so swiftly that it’s difficult to keep up. We’ve recently conducted a neuroimaging pilot study looking at a relatively new such variable – semantic size, i.e., the real-world size of the object a given word refers to. Here is what drove us to do that.

In 2009, Sereno, O’Donnell and Sereno came up with the concept of semantic size and first examined its effect on word recognition in a lexical decision task. They compared RT and accuracy between ‘big’ words (e.g., ‘jungle’) and ‘small’ words (e.g., ‘needle’), and found that semantic size accounted for the behavioural responses over and above well-established variables (word length, frequency, etc.). Sereno et al. were the first to show that ‘big’ words (513 ms) were recognised faster than words denoting small entities (528 ms). However, soon after that study, Kang, Yap, Tse, and Kurby (2011) used the same stimuli and task and found no latency advantage for bigger words. Not only did they test a larger sample than Sereno’s team (80 vs 24 Ss), but their power to detect the reported effect was also far better (.81 vs .32). To support the null hypothesis, Kang et al. additionally analysed latency data from two word recognition megastudies. This is where the story ends.
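Just to put that power gap in perspective, here’s a minimal sketch of the kind of calculation involved, using statsmodels and treating the big/small comparison, for simplicity, as a within-subjects t-test. The effect size plugged in is purely hypothetical and not a value reported in either paper.

```python
# Rough illustration of how sample size drives power for a within-subjects
# t-test; the effect size below is an assumption, not a reported value.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
hypothetical_d = 0.35  # assumed effect size, for illustration only

for n in (24, 80):  # the two sample sizes mentioned above
    power = analysis.power(effect_size=hypothetical_d, nobs=n,
                           alpha=0.05, alternative='two-sided')
    print(f"n = {n}: power = {power:.2f}")
```

The point is simply that, for the same underlying effect, the larger sample buys a much better chance of detecting it, which is why Kang et al.’s null result carries weight.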

It does make sense. Why would the size of a word’s referent facilitate lexical decision (distinguishing words from pseudowords)? We were, on the other hand, interested in whether the brain processes semantic size at all. So we conducted a simple fMRI pilot study on five folks, using the stimuli from Sereno et al. in a silent word reading paradigm (adding a fixation cross and verbs as low- and high-level baselines, respectively). The results were far more interesting than we had expected. Reading big nouns increased the BOLD signal in the left middle occipital gyrus [-36 -88 31], whilst small nouns correlated with a cluster in the right cuneus [3 -82 28]. A more lenient threshold for that contrast also revealed a four-voxel activation within the right lingual gyrus [15 -55 -2].
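For anyone curious what a contrast like ‘big nouns > small nouns’ looks like in practice, here is a minimal first-level GLM sketch in nilearn. The file names, TR, and event labels are placeholders for illustration; this is not our actual preprocessing or analysis pipeline.

```python
# Sketch of a single-subject (first-level) contrast in nilearn.
# Paths, TR, and condition labels are hypothetical placeholders.
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# events.tsv with columns: onset, duration, trial_type
# trial_type would contain labels such as 'big_noun', 'small_noun', 'verb', 'fixation'
events = pd.read_csv("sub-01_events.tsv", sep="\t")

model = FirstLevelModel(t_r=2.0, hrf_model="spm", smoothing_fwhm=6)
model = model.fit("sub-01_bold.nii.gz", events=events)

# Big vs small nouns; the fixation and verb baselines stay implicit here
z_map = model.compute_contrast("big_noun - small_noun", output_type="z_score")
z_map.to_filename("big_vs_small_zmap.nii.gz")
```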

While the precise interpretation of the results awaits further scrutiny, we propose that semantic size can modulate brain activity through processes related to mental imagery. Given that much of our conceptual knowledge is represented in the perceptual system, it’s plausible that word recognition, and not exclusively in the visual modality, can automatically initiate mental imagery of the perceptual properties (such as size) of the word’s referent. The neural mechanisms that seem to support this process are located in the primary visual system and nearby cortical regions (BA 17, 18, 19). For instance, BA 17 has been shown to be more active during mental imagery of letters (Kosslyn et al., 1993) and of object size (Kosslyn, Thompson, & Alpert, 1997). Activation within BA 19 (visual association cortex) is, on the other hand, a correlate of mental imagery of shape (Knauff, Kassubek, Mulack, & Greenlee, 2000). Finally, a neuroimaging study by Ganis, Thompson, and Kosslyn (2004) demonstrated that BA 17, 18, and 19 all underlie not only visual perception but also visual mental imagery.

Although all we’ve got is data from 5 subjects, possibly with some false positives, the results are quite cool – semantic size may modulate brain activity in a manner conceptually similar to the way object size affects visual perception. When objects are viewed at the same distance, information about bigger ones is transmitted far more quickly than about smaller ones through the magnocellular pathway of the visual system. More importantly, semantic size modulation would suggest that words trigger a perceptual representation of the object they refer to, thus somewhat bridging word meaning and object recognition. For now, we’ve stopped right there. We’ve decided to run a couple of behavioural studies first, to find out what is actually going on in the scanner. We hope to learn a bit more about the processing of semantic size through a Stroop-like paradigm, and later using dual-task methods (word processing vs mental imagery); a purely illustrative sketch of the former follows below.
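To make the Stroop idea a bit more concrete: one common way of pitting semantic size against a perceptual dimension is to vary the font size of the printed word, so that display size is either congruent or incongruent with the size of the referent. The snippet below only builds such a trial list; the words and font sizes are made-up examples, not our actual stimuli or design.

```python
# Purely illustrative trial list for a semantic-size Stroop-like manipulation:
# display (font) size crossed with the referent's real-world size.
import itertools
import random

big_words = ["jungle", "ocean", "mountain"]      # hypothetical 'big' items
small_words = ["needle", "button", "pebble"]     # hypothetical 'small' items
font_sizes = {"large": 48, "small": 18}          # arbitrary point sizes

trials = []
for word, display in itertools.product(big_words + small_words, font_sizes):
    semantic = "big" if word in big_words else "small"
    congruent = (semantic == "big") == (display == "large")
    trials.append({"word": word,
                   "semantic_size": semantic,
                   "font_pt": font_sizes[display],
                   "condition": "congruent" if congruent else "incongruent"})

random.shuffle(trials)
print(trials[:4])
```

If semantic size is processed automatically, one would expect slower responses on incongruent trials, which is exactly the kind of behavioural signature we’d like to see before going back into the scanner.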

How much do we already know about speech comprehension? A tiny review of fMRI studies

The functional and neural architecture of speech comprehension has been under scrutiny for more than a hundred years. Wernicke started it all in the late nineteenth century. His intuitive hypothesis came from the observation that his patients with lesions to the left STG were unable to understand spoken language. It wasn’t long before Wernicke’s view on the cortical organisation of speech comprehension proved to be over-simplified. The role of the STG was mainly disputed on the basis of neuropsychological findings: damage to the STG in fact results in impaired speech production rather than comprehension. Nobody is saying that the STG doesn’t support speech comprehension, but it does seem that our ability to understand each other is far more complex in its cortical organisation. Is that the case?

What I’m about to do now is summarise literally a hundred fMRI studies on speech comprehension. More specifically, I’m interested in whether neuroimaging findings suggest that speech comprehension is instantiated in the brain as a uniform process, or whether it relies on multiple regions with different specialisations. We should definitely find some answers given the number of papers on the topic. A simple Google Scholar search gives me 19,000 results for ‘speech comprehension and fMRI’, 45,900 for ‘speech processing and fMRI’, and surprisingly only 2,000 additional results come up if you type in ‘neuroimaging’ instead of just one of the methods – fMRI. It came as an even bigger shock to find Cathy Price‘s review of 100 fMRI papers on speech comprehension and production published in 2009 alone! 100 papers on the same topic in one year! And this excludes TMS, EEG, MEG, and PET studies, papers from 2008 or earlier, and studies on acquired/developmental language disorders. I should probably cross out ‘tiny’ from the title now. OK, let’s see what they’ve got.

  • Prelexical auditory processing (extraction of meaningful units from acoustic input, analysing the frequency spectrum, phonemic categorisation, etc.) has been consistently shown to correlate with increased signal in the vicinity of Heschl’s gyrus (BA 41). However, much research shows that even the first phase of speech comprehension, prelexical analysis, is a more complex neural process than we might have assumed. It appears that multiple subregions are responsible for different aspects of prelexical processing, proximal to Heschl’s gyrus but extending in different directions (anterior, ventral, posterior). I wonder whether these activation loci really reflect the cortical organisation of prelexical analysis, or whether this is all due to other stuff, say, task demands, differences in stimuli and paradigms, top-down modulation, etc.
  • Word recognition (matching speech units to lexical entries within 100-150 ms of the speech signal, plus meaning retrieval) activates areas proximal to the anterior, ventral and posterior borders of the perisylvian area related to prelexical analysis. This extension of the signal, characteristic of lexical-semantic processing at the word level, suggests that we have distinct systems for speech perception and speech comprehension.
  • Much of the existing research indicates that sentence comprehension is associated with the anterior part of the left MTG (BA 21). Other activation loci include the left angular gyrus (BA 39), the temporal pole (BA 38) and the posterior cingulate/precuneus. Unlike the MTG, these regions are reported less often in the literature, which may suggest that each of them is specialised in a different aspect of sentence comprehension. What is interesting about these findings is that most of the signals related to sentence comprehension lie within the same sphere of activation as word processing. Price (2010) suggests that the increased activation in the STS during sentence processing may in fact reflect a general, integrated concept consolidated from many single concepts. For instance, each word in the sentence “He sees the world through rose-tinted glasses” is associated with a rather different semantic concept, whereas the sentence as a whole carries a single concept, i.e., optimism. Although still unsupported, this hypothesis would explain why semantic processing of single words produces extensive activation in the STS, which then converges on its ventral part when multiple concepts are integrated into one.

It’s no exaggeration to say that fMRI studies have advanced our knowledge of speech comprehension in a way that has been nothing short of revolutionary. We’ve taken a major step from intuitive inferences based on rare lesion studies to complex research on the functional organisation of speech processes. However, I’d like other neuroboffins to integrate what we have so far with findings from single, sometimes overlooked, studies on other computations involved in speech comprehension – syntactic processing and syntactic ambiguity, semantic ambiguity, and prosody, to name but a few.

We understand speech quickly and effortlessly, yet successful spoken language comprehension involves processing linguistic information of different types (phonological, semantic, syntactic, prosodic, etc.), all of which must be accessed and coordinated within milliseconds. It shouldn’t then come as a surprise that our brain relies on separate neural subsystems to execute the complex and rapid computations needed to make sense of what we hear. Neuroimaging findings are in line with my claim.

First, sound enters the ear canal and is analysed as single tone frequencies in the primary auditory cortex. Speech and speech-like sounds are then processed in regions surrounding Heschl’s gyrus and the planum temporale. Activation near the STS is in turn characteristic of the early information processing stages, such as auditory analysis and the mapping of sound onto specific speech units (phonemes, syllables, vowels). The functional localisation of more complex computations is best described as a temporofrontal network in which the middle and superior temporal lobes support the extraction of phonological, lexical, and semantic information from the incoming input. This information is then combined and encoded for later use within the anterior temporal lobe. The IFG also constitutes an important part of the speech comprehension system, executing complex semantic and syntactic operations when speech complexity and ambiguity increase. I’ve mapped these processes onto the brain image below (squares – prelexical analysis; circles – word recognition; triangles – sentence comprehension).

 

[Brain image: prelexical analysis (squares), word recognition (circles), and sentence comprehension (triangles) mapped onto the cortex.]