Spoken Word Recognition for Listeners

(First published in IATEFL Conference Selections 2023; this article is a written summary of the conference presentation)

Unlike the written word, the spoken word is different every time you hear it – think of all the different voices and accents in the world. How do listeners ever recognise these various versions as being the same word? This crucial aspect of the listening skill is known as ‘spoken word recognition’. In this presentation, we look at some of the difficulties involved in this, and some of the things we can do in the language class. The analysis is divided into four parts, which I call spelling, storing, priming and processing.


Spelling

According to John Field, the written forms of words tend to stick in the memory more strongly than the spoken forms. Unfortunately, in English the written form is often misleading and can lead to mispronunciation. For example, many learners pronounce ‘comfortable’ like ‘come for table’.

But what about the consequences for listening? The problem is this: if the learner expects words to sound like their written form, they may not recognise them in speech. We should bear this in mind when teaching. Encourage learners to make some kind of note of words which are pronounced very differently from their spelling. If possible, give them guidance about spelling rules. For example, make sure they are aware of vowel reduction: the letter ‘a’ in ‘comfortable’ is not the same as in ‘table’!


Storing

When we hear a word, we compare it to words which are stored in our memory and look for a match. However, words don’t have only one form. John Field gives the example of ‘actually’. If you’re speaking really carefully, this may have four syllables, but said quickly it may come out as only two, like ‘ashley’. Field proposes that instead of storing a single form of a word, the listener stores multiple versions (or ‘exemplars’) of it. To help learners build up their repertoire of stored exemplars of a word, we need to expose them to more variety. We can assemble multiple examples of the same word in different contexts, and for this purpose the online tool ‘YouGlish’ is very useful. It works like a search engine for video material: you type in the word (or phrase) you want to hear, and it gives you thousands of examples in different voices, accents and speeds. To give learners a sense of how the spoken form of a word like this varies, we can use YouGlish in class, or else encourage them to make regular use of it at home.


Priming

Listeners do not hear neutrally. We are primed to pay attention to features which are important in our language, while ignoring features which are not. For example, if word stress is important in your first language, you tend to notice it; if it is not, then you tend to be what Anne Cutler calls ‘stress deaf’. We somehow need to prime our learners to pay attention to features which may not be common in their L1, but which are common in English. One approach is to use texts which have a high density of certain common English patterns, such as word endings. For example, I have designed this rhyme to draw attention to the ending ‘able/ible’:

They’re comfortable and durable

They’re lovable, adorable

Fashionable but sensible

To me they’re indispensable

You can make short texts like this, with lots of examples, for yourself. Try using ChatGPT: instruct it to write a brief text containing… and then give a list of words with the suffix you want to focus on.


Processing

Listeners have to process what they’re hearing in real time. According to John Field, ‘listeners may need to form tentative matches on the basis of the available evidence and to confirm or change them as they hear more and more of the utterance’. In the examples below, after hearing the first part of a sentence, the listener understands ‘a’, but then after hearing the ending, they must change their interpretation to ‘b’:

a. It’s a fish… b. It’s official.

a. Pay a ten… b. Pay attention.

a. It’s a nun… b. It’s an onion.

Expert listeners do this all the time; learners, on the other hand, tend to stick with the first interpretation, no matter how bizarre. We can use dictations like the examples above in class to raise awareness of this. Read out ‘a’ first and ask learners to write what they hear. Then read out ‘b’ and ask them to correct and complete what they wrote.


References

Cutler, A. (2012). Native Listening: Language Experience and the Recognition of Spoken Words. MIT Press.

Field, J. (2008). Listening in the Language Classroom. Cambridge University Press.
