Introduction
It is an unquestionable fact that English has become the language of communication worldwide (Sharifian, 2017). Every day, millions of interactions are conducted in English in contexts where the participants' L1 backgrounds are of languages other than English. The rise of English as an international language has had a great impact on fields such as second language acquisition, applied linguistics, and English language teaching (ELT). Even though the teaching of second language pronunciation has received less attention than other fields within second and foreign language acquisition, such as syntax or morphology (Al-Azzawi & Barany, 2016; Foote, Trofimovich, Collins, & Urzúa, 2016; Koike, 2016; Pourhosein Gilakjani & Sabouri, 2016), it has still received some focus. Unfortunately, that focus is sometimes accidental rather than planned (Al-Azzawi & Barany, 2016).
In the teaching of any language, be it as a second or foreign language, a key point lies in deciding what accent is to be adopted as the model for the learners (Carrie, 2013; Moedjito, 2015). In the case of English, the general tendency is in the direction of a native accent, that is, the type of accent spoken by an individual "... usually of an inner circle English and largely based on monolingual language practices and norms" (Hansen-Edwards, 2016, p. 1). Thus, the choice is usually between two commonly adopted native-speaker varieties (Carrie, 2013; Moedjito, 2015). These are Received Pronunciation (RP), which is the accent taught to L2 English language learners who aim at a British model of pronunciation, and General American (GA) for those who prefer the American accent.
It is also unquestionable that, in spite of the efforts that L2 English language learners make, they rarely attain near-native or native-like pronunciation (Chan, 2018). Empirical research has attempted to determine the factors that prevent these learners from achieving native-like pronunciation by examining a number of variables that are thought to cause this failure. A review by Mackay, Piske and Flege (2001) examines the factors that have been studied as predictors of the degree of foreign accent in an L2. These factors include age of L2 learning, length of residence in a country where the L2 is the language of communication, gender, years of formal instruction, motivation, language learning aptitude, and amount of L1 and L2 use. The findings seem to agree that the best predictor of L2 foreign accent is age of learning (Mackay et al., 2001; Oyama, 1976; Patkowski, 1990). This evidence seems to support the Critical Period Hypothesis (CPH), which claims that near-native speaking attainment in a second language is biologically determined. Thus, the L2 learner will not always be able to achieve native-like proficiency in a second language due to age constraints (Ghazi-Saidi, Dash & Ansaldo, 2015; Szyszka, 2015). This seems to hold especially for the level of L2 pronunciation attained.
Empirical evidence thus seems to indicate that all efforts to adopt a native speaker model of pronunciation in language teaching, after a certain age, are hopeless and theoretically unfounded (Susan, Suzanne & Carter, 2018). This may be because most language-acquisition cognitive functions operate during childhood. These functions seem to disappear once the L1 has been settled; therefore, they are no longer available for L2 learning (Schmid, Gilbers, & Nota, 2014). This obliges any ELT professional, and anyone else involved in the area, to pose the following questions: are native speaker accents a valid model for L2 learners to imitate? If not, what L2 pronunciation model is to be adopted? The answers to these questions are not straightforward and are the cause of vigorous debate among scholars and phoneticians worldwide. However, in relation to the first question, there seems to be a consensus that L2 English pronunciation teaching should aim at intelligibility rather than native speaker mastery (Susan, Suzanne, & Carter, 2018; Pourhosein, 2016).
I shall now offer an examination of the two ends of the continuum on the above-mentioned issue. First, I review the approach proposed by Jennifer Jenkins (1998; 2000) termed "The Lingua Franca Core." Then, I present an examination of two of its detractors, namely Trudgill (2005) and Wells (2005), who claim that the L2 English learner will not communicate exclusively with other non-native speakers. It seems impossible, they claim, to predict who those L2 speakers will interact with in English in the future. Neither teachers nor students can foretell what contexts, whether EIL or EFL, they will take part in. Finally, a discussion on the contribution of each approach is offered.
The Lingua Franca Core
Jenkins (1998) claims that, as learners generally fail to acquire native-like pronunciation of English, the direction of English language pronunciation teaching should be changed. This direction should no longer aim to get the learner to achieve a native accent of English as their target model. Instead, the English language teacher should aim at "comfortable intelligibility." She argues that, contrary to what has widely been assumed and accepted, native speakers' pronunciation is not the most intelligible model to adopt. In addition, she gives an account of the large and rapid rise of the number of non-native speakers of English and the number of interactions that occur among them, and states that in most of these interactions the participants are non-native speakers of English.
On the basis of evidence in support of the CPH and the international status of English, Jenkins devised what she calls "The Lingua Franca Core," which consists of a set of features that she considers essential for mutual intelligibility, as reported by her own empirical evidence on interactions conducted in English in international contexts (Jenkins, 2002). These include the areas concerned with the production of segmentals, placement of nuclear stress, and articulatory setting (Jenkins, 1998). Additionally, a set of non-core features is provided. These non-core features are not essential since their absence does not affect intelligibility. Among these are word stress, features of connected speech (elision, assimilation, linking, and weak forms), and rhythm. Finally, Jenkins (1998, 2000) claims that any trace of L1 transfer is not to be considered an error since under the LFC it would be labeled as a regional variant.
Concerning English language teachers' role, Walker (2001) emphasizes their responsibility to reformulate priorities regarding the choice of a model that allows L2 English learners to achieve an acceptable level of intelligibility. Walker also supports Jenkins's contribution to the development of the LFC, and highlights that this is the only approach based on findings drawn from empirical evidence. According to Walker (2001), a possible solution in setting the priorities in the teaching of pronunciation lies in having recourse to contrastive analysis between the phonological systems of the L1 and the L2. Additionally, Walker (2001, p. 2) claims that the adoption of the LFC, as informed by contrastive analysis, results in a positive effect for two reasons: "a) the total workload required of teacher and learner is now greatly reduced; b) the new goals are more achievable both in terms of teaching and learning." This practice has been adopted by authors such as Zoghbor (2018), who examined the differences and similarities of Modern Standard Arabic and the LFC in order to identify possible communication breakdowns. Finally, Walker (2001) suggests that adopting the LFC would imply a lower psychological burden on the learner by emphasizing what she/he can do rather than setting unrealistic goals that she/he cannot achieve, such as the insurmountable objective of native speaker pronunciation.
The Other Side of the Coin
The first objection to Jenkins's work is that raised by Trudgill (2005). This author suggests that the numerical superiority of non-native speakers of English over native speakers has been distorted. Trudgill states that the number of non-native speakers is much smaller than the one that Jenkins (2000) asserts, as reported by Crystal (as cited in Trudgill, 2005). This may be because, in her account, Jenkins includes the figures corresponding to speakers of English from an ESL background and those whose level of proficiency is not high enough for them to be considered real speakers of the language. Moreover, Trudgill (2005) adds that the number of interactions in English is far larger among native speakers of English than among L2 English speakers. This would, to an extent, discredit the notion of English as an International Language, which is used by Jenkins (2000) to address the interaction of non-native speakers with other non-native speakers in English. However, the current total number of English language speakers worldwide is estimated at 1,132,366,680, out of whom 379,007,140 are L1 speakers and 753,359,540 are L2 English speakers (Eberhard, Simons & Fennig, 2019). These figures are in line with those claimed by Jenkins.
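The proportions implied by the figures quoted above can be checked with a short calculation. The following is only an illustrative sketch: the three numbers are taken from the paragraph above, while the function and variable names are my own and are not drawn from any of the cited works.

```python
# Speaker counts as quoted in the text (Eberhard, Simons & Fennig, 2019).
L1_SPEAKERS = 379_007_140
L2_SPEAKERS = 753_359_540
TOTAL_SPEAKERS = 1_132_366_680


def speaker_shares(l1, l2):
    """Return the L1 and L2 shares of all speakers as fractions of the total."""
    total = l1 + l2
    return l1 / total, l2 / total


l1_share, l2_share = speaker_shares(L1_SPEAKERS, L2_SPEAKERS)
l2_per_l1 = L2_SPEAKERS / L1_SPEAKERS  # L2 speakers for every L1 speaker

# The two subtotals should add up to the quoted total.
assert L1_SPEAKERS + L2_SPEAKERS == TOTAL_SPEAKERS

print(f"L1 share: {l1_share:.1%}")
print(f"L2 share: {l2_share:.1%}")
print(f"L2 speakers per L1 speaker: {l2_per_l1:.2f}")
```

On these figures, L2 speakers make up roughly two thirds of all speakers of English, that is, about two L2 speakers for every L1 speaker. The calculation concerns numbers of speakers only and says nothing about the number of interactions, which is the point that Trudgill disputes.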
Trudgill (2005) then points out that non-native English language speakers will not only wish to communicate with other non-native English language speakers, but also with native speakers of English. Moreover, some will even aim to attain native speaker pronunciation. Thus, there is no point in having English language learners choose who they want to interact with, for the range of their potential interlocutors includes speakers from all backgrounds: ESL, EFL, or native English language speakers (Wells, 2005). Additionally, English language learners may want to use English with a variety of interlocutors and not exclusively with one single group of speakers. Thus, there is no way for English language teachers to predict with whom their learners are going to use English. In other words, these teachers should be able to cater for a range of learner preferences. The same issue is raised by Wells (2005).
Trudgill (2005) then proceeds to refute Jenkins's (2000) claim that non-native English language speakers are a more intelligible model. He does this by citing different studies which conclude that non-natives find only a slight advantage in the speech produced by other non-native speakers at their initial stages of L2 English learning. For instance, he mentions a study conducted by Wijngaarden (as cited in Trudgill, 2005) in which he found that trilingual Dutch-L1 speakers (L2 English, L3 German) found non-native speakers more intelligible than native speakers. However, this occurred when they listened to the speakers using their second foreign language (German, their lower-proficiency second language). On the contrary, when they listened to the speakers in their primary second language (English), they found native speakers of English more intelligible than other non-native speakers.
With respect to the LFC, Trudgill (2005) suggests that Jenkins’s proposal in phonological terms is insufficient. He claims that the LFC focuses on the phonetic level. Moreover, he claims that the LFC still poses a huge burden on the language learner. Finally, he claims that it is riddled with vagueness and lacks detail concerning the number of segments and phonetic information of the vowel system. For instance, there is no account of the number of pure vowels, diphthongs, and triphthongs. Furthermore, RP and GA, on which the LFC is grounded, do not have identical vowel systems. In addition, in these two native-speaker accents, there are a large number of words which are pronounced with a different vowel phoneme. For example, the word “got” is pronounced /gɑːt/ in GA, whereas RP has /gɒt/. The lack of phonological and phonetic detail about vowels that the LFC presents allows for an endless number of confusing realizations for this and other similar words which, rather than facilitating communication and intelligibility, could hinder them.
Another argument, offered by Dauer (2005), relates to phonological and phonetic observations on changes that could have been proposed in the LFC. For instance, she criticizes the fact that the LFC is a rhotic model grounded on a non-rhotic accent (RP). First, Jenkins does not provide a justification for the rhoticity of the LFC. If rhoticity is to be used to help distinguish pairs of words such as /pɒt/ and /pɔː(r)t/, then vowel length does not seem to be as important as the LFC suggests, or at least this item has not been properly accounted for. Additionally, a word like “fire” is pronounced as /faɪr/ in GA and as /faɪə/ in RP. However, the LFC does not specify whether words like this should be pronounced as /faɪr/ (with a diphthong) or /faɪər/ (with a triphthong) as a result of rhoticity. Thus, the LFC does not account for the treatment of diphthongs and triphthongs, an omission which generates doubt about the sequence of segments in “fire” and other similar words which have a different phonemic sequence in GA and RP. Again, it is worth remembering that the LFC is based on these two accents.
Dauer (2005) also disagrees with Jenkins (2000) about the teachability of word stress. Concerning this issue, the LFC does not consider this feature an essential one. However, Dauer (2005) emphasizes that the LFC considers aspiration of /p, t, k/ in initial position in stressed syllables to be crucial for intelligibility. This is a fundamental contradiction, for it is impossible to use aspiration accurately without being able to properly stress words. Hence, the treatment of word stress has also been neglected.
In summary, there is no consensus as to the number of L1 and L2 interactions in English. Thus, it is difficult to determine which group of English users is predominant, native speakers or non-native speakers. Besides that, it is argued that there is no way of predicting the potential interlocutors that learners of English will have in their future interactions. Moreover, the advantage of non-native speakers over native speakers in terms of intelligibility is questioned in light of empirical evidence. Finally, not only is the LFC full of vagueness and imprecisions in terms of its phonological system and phonetic details, but it also presents a paucity of essential information on important features such as stress, aspiration, and their relationship.
I shall now offer a discussion of the issues that have been raised by both sides in the debate. For this, I shall provide examples and insights drawn from my own experience both as a learner of English and as an English language teacher to learners whose first language is Spanish.
Discussion
Jenkins (2000) claims that native English language speakers' pronunciation is not the most intelligible. However, non-native English language speakers' deviations from the native standard form make non-native speech even more difficult (Lev-Ari, van Heugten & Peperkamp, 2016). This is especially true when the non-native listener is not familiar with another non-native speaker's accent. For instance, in a study by White, Treenate, Kiatgungwalgrai, Somnuk, & Chaloemchatvarakorn (2016), the results suggested significant differences between accent familiarity and listener comprehension. The study included audios recorded by speakers with eight different accents. These included Thai, Irish, British, Spanish, Korean, Indian, Croatian, and Nigerian English. The accents were assigned to four groups according to the listeners' level of familiarity with such accents. The results revealed significant differences across all groups. The group with the highest test scores was that with the most familiar accents. This group included the Thai, the British and the Irish accents.
As an EFL speaker of English, I have myself experienced the difficulty of understanding both native speakers and non-native speakers of English on many occasions. To illustrate this, I shall mention an anecdote from my stay in Australia, when I was at a baker’s shop and was served by an Asian man. It is worth mentioning that at the time I was not familiar with any Asian accents in L2 English. After I had ordered some bread rolls, the attendant asked “soft or crunchy?” However, he produced the utterance as [ˈsɒft ɔː ˈkʌntɹi], which I interpreted as “soft or country?” I was not able to understand what he meant until a few minutes later, with the help of context, as I began thinking of phonetically similar words which could apply to the situation. This is a clear example of how unintelligible pronunciation can lead to miscommunication. To this, I have to add the many other occasions on which I have experienced a similar situation, not only as a foreigner overseas, but also as an English language teacher.
Regarding the issue described in the preceding paragraph, I agree with Trudgill (2005) that non-native speakers of English do not understand other non-native speakers better simply because the latter produce fewer phonological contrasts. Actually, reducing the number of these contrasts may result in misunderstanding; let us just consider the case in which L1 Spanish speakers generally tend to collapse /æ, ʌ, ɑː/ into [a], thus producing cat, cut, and cart as homophones.
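The effect of such a merger can be made concrete with a small sketch. It is a deliberately simplified illustration and not a model of real speech: the transcriptions are broad, non-rhotic, RP-style forms, the merger table encodes only the three-way collapse described above, and all names in the code are hypothetical.

```python
# Broad, non-rhotic target vowels for the three words discussed in the text.
TARGET_VOWELS = {"cat": "æ", "cut": "ʌ", "cart": "ɑː"}

# Simplified consonant frame (onset, coda); with no /r/, all three share /k_t/.
FRAMES = {"cat": ("k", "t"), "cut": ("k", "t"), "cart": ("k", "t")}

# Assumed transfer rule: the three English vowels all surface as [a].
L1_SPANISH_MERGER = {"æ": "a", "ʌ": "a", "ɑː": "a"}


def realise(word, merger=None):
    """Return a broad transcription of the word, applying a vowel merger if given."""
    onset, coda = FRAMES[word]
    vowel = TARGET_VOWELS[word]
    if merger is not None:
        vowel = merger.get(vowel, vowel)
    return onset + vowel + coda


def homophone_sets(words, merger=None):
    """Group words by transcription and keep only groups with more than one word."""
    groups = {}
    for word in words:
        groups.setdefault(realise(word, merger), []).append(word)
    return [group for group in groups.values() if len(group) > 1]


words = ["cat", "cut", "cart"]
print(homophone_sets(words))                     # no merger: three distinct forms
print(homophone_sets(words, L1_SPANISH_MERGER))  # merger: one three-way homophone set
```

Without the merger, the three words remain distinct; with it, all three are realised as [kat], so the listener must rely on context alone to tell them apart.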
Unlike phonological contrasts, a feature that does seem to play a crucial role in understanding other speakers is that of speech rate (Chang, 2018). In general, non-native speakers of English tend to present a much slower speech pace than native speakers of English (Baese-Berk & Morrill, 2015). This does, to some extent, aid intelligibility. However, slower speech tempo does not guarantee intelligibility. On the contrary, on many occasions non-native English language speakers are still found to be unintelligible regardless of their slower speech rate.
Within the notion of English as an International Language (EIL), Jenkins (1998) suggests that the LFC grows out of the need to adapt to the change of direction of English as a result of the non-native-to-non-native interactions in English. The aim of the LFC is then to facilitate the learning of the pronunciation of English in the EIL context. Thus, a question emerges, as Wells (2005) poses it, "Do you and your students want to be able to interact with native speakers? Or only with non-native speakers?" (p. 1). Or put differently, do you discard or discriminate against a particular group of speakers of a language when you embark upon the task of learning it?
In terms of vowel quantity, Jenkins (as cited in Dauer, 2005) recommends that these should be clipped before voiceless consonants and lengthened before voiced consonants, e.g. sat, sad [sæt, sæːd]. In terms of quality, Jenkins suggests that any trace of a foreign accent is permissible as long as vowel quality is consistent. However, by bearing in mind these two recommendations about vowel quantity and quality, consider an L1 Spanish speaker, who is an L2 English learner, whose /æ/, due to L1 transfer, goes in the direction of cardinal vowel [a], which is allowed by the LFC, and who, again by means of the LFC, was taught to lengthen this vowel before a voiced consonant. A learner of this kind would eventually end up producing something similar to [haːd] for “had”, in its strong form according to the rules of the LFC. Hence, a native speaker of English, or even a non-native speaker of English, would most likely decode this as hard rather than had.
It can be concluded that speakers whose pronunciation is based on the LFC will eventually be understood by other non-LFC-pronunciation-based speakers, regardless of the type of pronunciation these potential interlocutors might have. In such a case, extreme freedom in the quality and quantity of vowels might lead to misunderstanding if interaction with native speakers were to take place, for different native speakers might rely on these two features differently to decode the meaning of some words. A native speaker of RP and a native speaker of, say, Australian English might focus on different parameters to distinguish between pairs such as cut and cart as pronounced by a non-native speaker of English. The RP speaker would most probably focus on vowel quality rather than quantity, whereas the Australian speaker would do the opposite, due to the phonetic characteristics of each speaker's system. This is because RP separates the vowel qualities of /ʌ/ and /ɑː/ (Bjelaković, 2016), whereas Australian English uses the same quality for both (Andreu Nadal, 2016).
In accordance with the LFC, the processes involved in connected speech such as weak forms, elision, and assimilation are to be avoided. This contradicts Jenkins’s claim (as cited in Trudgill, 2005) that the LFC would “Drastically reduce the pronunciation teaching load” (p. 79). If avoidance of these features were to facilitate the pronunciation of English, then how could it be easier for the learner to pronounce phonological sequences such as that encountered in phrases like /henri ðə sɪks θrəʊn/, which even native speakers tend to avoid by means of the elision of some segments? Undoubtedly, the teaching of these features of connected speech is an aid for English language learners to overcome such difficulties and achieve comfortable levels of intelligibility when interacting with a native speaker of English (Moedjito, 2015).
The LFC claims that vowel epenthesis is preferable to the elision of consonants. Thus, words like “McDonald’s” would sound better and more intelligible as “Macudonaludo”, presumably pronounced as something in the direction of [mækʊdɒnæluːdəʊ], or “product” as [pərɒdʌkʊtə] rather than [pɒdʌk]. This results in words containing a much larger number of syllables (twice as many in the case of “McDonald’s”, and even more than twice the number of syllables for “product”), leading to a potentially higher degree of unintelligibility than those realizations that contain elision of consonants.
As Wells (2005) states, the irregular spelling system of English is one of the sources of the many difficulties that the English language learner faces. This is especially true for those learners whose L1 has a high level of correspondence between its spelling system and its pronunciation, e.g. Spanish and Italian. These difficulties can be overcome with appropriate instruction aided by adequate techniques. In this respect, the use of phonetic transcription plays a paramount role. As Wells (2002) claims:
The principal reason for using phonetic transcription is easily stated. When we transcribe a word or an utterance, we give a direct specification of its pronunciation. If ordinary spelling reliably indicated actual pronunciation, phonetic transcription might be unnecessary; but often it does not. (Wells, 2002, para. 2).
The LFC includes the deletion of /ð, θ/ from the phonemic inventory of English as an International Language (EIL). In this respect, I think it is a much better decision to encourage English language learners to acquire these segments, if they are not already part of their L1 inventory. Practicing these segments in order to master them seems sensible; not learning them at all does not. The latter scenario leaves the learner at risk of misunderstandings.
After all, evidence seems to indicate that L2 English learners can actually benefit from pronunciation instruction (Barrera-Pardo, 2004). In addition to the segments mentioned above, as Dauer (2005) points out, there is no mention of /ʒ/, whose distribution is limited and which could well be dealt with by substitution with other phonemes. For instance, it could be replaced by /ʃ/ in intervocalic position, by /ʃ/ or /ʤ/ in final position as in “beige”, and by /ʤ/ in its extremely rare initial position as in “genre”.
The LFC is supposed to facilitate the teaching load and goals for the language teacher (Jenkins, 1998, 2000). Concerning this matter, I must state I strongly disagree. In spite of the effort that any English language teacher may make, it seems impractical for them to survey their learners on what type of pronunciation model they would like to learn and then cater for all tastes together in one class. This would evidently result in educational chaos and a much larger workload for English language teachers. Besides, not all English language teachers can count on the necessary expertise or confidence (Pourhosein Gilakjani & Sabouri, 2016) to teach pronunciation, which adds extra difficulty to the situation.
I do agree though, just as Wells (2005) does, that English language learners’ goals should be considered as well as their L1 background. This necessarily leads to rethinking the L2 pronunciation models. Thus, the issue is, as Trudgill (2005) poses it, not whether to adopt a native model of L2 pronunciation but to what extent it is adopted. English language teachers should then ask themselves: what is the best L2 English pronunciation model available that suits my students’ needs and L1 background? In this way, the possibilities go beyond RP or GA. In the case of Chilean learners of English, who happen to be those whom I teach and for the majority of whom RP has been the model for decades, is RP still a valid and suitable option? Considering that Spanish-speaking learners of English generally produce /r/ in post-vocalic position, should General American or any other rhotic accent then be adopted as the model to aim at? Of course, the answer to this question is not straightforward. Decisions made in this regard can benefit to a great extent from work informed by contrastive analysis.
Conclusion
In terms of adopting models for the teaching of English pronunciation, no one owns the truth. As can be seen from the arguments presented above, scholars and academics from all over the world have made their contributions. Although I do not consider the LFC a feasible option, Jenkins has raised the issue and made an attempt to offer a solution to the difficulties and challenges that millions of English language learners face every day. Still, Trudgill offers a more sensible approach by claiming that it is important to bear in mind the extent to which native speaker models of English language pronunciation are aimed at.
Finally, other issues such as learners' goals, and their L1 backgrounds, should also be considered when choosing an L2 English pronunciation model. The teaching and learning of English language pronunciation should be guided and aided by useful tools such as phonetic transcription, for the more advanced learners, and other resources which are now more accessible such as computer software, mobile apps, and internet tools (Buss, 2016).