Monday, October 10, 2011


The Ohio State University
This paper addresses three key observations relating to crosslinguistic patterns of metathesis.
First, the order of sounds resulting from metathesis can differ from language to language such
that a similar combination of sounds can be realized in one order in one language, but in the
reverse order in another language. Second, for some sound combinations, only one order is commonly
attested as the result of metathesis, while for other combinations, either order can be
observed. Third, the acoustic/auditory cues to the identification of the sequence resulting from
metathesis are often better than those of the expected, yet nonoccurring, order. These patterns
receive a straightforward explanation when we consider the phonetic nature of the sounds involved
as well as the speaker/hearer’s knowledge of native sound patterns and their frequency of occurrence.
Neither factor alone is sufficient to provide a predictive account of metathesis. This study
shows, however, that by taking into account both factors, we are able to understand why certain
sound combinations tend to undergo metathesis, why others are common results of metathesis,
why patterns of metathesis differ across languages, and, importantly, why metathesis occurs in
the first place.*
1. INTRODUCTION. Metathesis is the process whereby in certain languages the expected
linear ordering of sounds is reversed under certain conditions. Thus, in a string
of sounds where we would expect the ordering to be . . . xy . . . , we find instead . . .
yx . . . . In the verbal system of the Dravidian language Kui (Winfield 1928), for example,
the expected ordering of a stem-final consonant and suffix-initial labial is reversed
just in case the stem ends in a velar stop (e.g. /bluk + pa/ [blupka] ‘to break’, cf.
/gas + pa/ [gaspa] ‘to hang’). While variation in the linear ordering of elements is
typical in the domain of syntax, it is comparatively striking in phonology and differs
in nature from most other processes which are typically defined in terms of a single
sound, or target, that undergoes a change in a specified context.
The apparently distinct nature of metathesis has resulted in the perpetuation of what
one might refer to as the METATHESIS MYTH, the commonly held view of metathesis as
sporadic and irregular, restricted to performance errors, child language, or sound change.
This view is regularly expressed in the linguistic literature, including the most up-to-date
instructional texts and dictionaries (e.g. Crystal 1997, Spencer 1996).
* This study has benefited from the input of many individuals and I am very grateful to them for their
suggestions and criticisms. I would like to thank Mary Beckman, Mary Bradshaw, Nick Clements, Peter
Culicover, Keith Johnson, Ilse Lehiste, Hélène Loevenbruck, Matt Makashay, Jeff Mielke, Jennifer Muller,
David Odden, John Ohala, Sharon Peperkamp, Mark Pitt, Keren Rice, Anton Rytting, Shari Speer, Donca
Steriade, Steve Winters, Richard Wright, and Draga Zec. I would also like to acknowledge the many linguists
who generously shared their knowledge of specific languages with me. In this regard I would like to thank
Outi Bat-El (Modern Hebrew), Marika Butskhrikidze (Georgian), Catherine Callaghan (Mutsun), Sheldon
Harrison (Mokilese), Larry Hyman (Basaa), Hjalmar Peterson (Faroese), Miklós Törkenczy (Hungarian),
Krisztina Polgardi (Hungarian), Wolfgang Schulze (Udi), Grover Hudson (Sidamo and other Ethiopian languages),
Deborah Schmidt (Basaa), John Wolff (Tagalog, Cebuano), and Fernando Martinez-Gil (Old Spanish).
My thanks also go to Jeff Mielke, Jennifer Muller, and Misun Seo for their valuable research assistance.
Finally, I would like to thank the editor of this journal, Brian Joseph, associate editor Eugene Buckley,
editorial assistant Hope Dawson, and two anonymous referees for their very helpful input. My sincere
apologies to those whom I have inadvertently forgotten to acknowledge. This research was supported by a
grant from the National Science Foundation: SBR-9809732.
204 LANGUAGE, VOLUME 80, NUMBER 2 (2004)
An important factor underlying this view relates to data.1 Despite the fact that numerous
cases of metathesis are reported in the literature, basic knowledge has been lacking
concerning the full range of metatheses that are possible in language, under what conditions
metathesis applies, why metathesis happens, and how metathesis interacts with
other processes affecting sound structure. This information is critical to providing an
accurate picture of the nature of metathesis. It is also of crucial importance for advancing
our knowledge of language since developing an explanatory model of language is
impossible without a clear understanding of the fundamental processes possible.
It is therefore significant that more recent studies are bolstering the previously existing
literature to create a solid empirical foundation for the study of metathesis.2 These
works include crosslinguistic surveys (e.g. Bailey 1970, Blevins & Garrett 1998, Grammont
1933, Hock 1985, Hume 1998, 2001, Janda 1984, Langdon 1976, McCarthy 1989,
Mielke & Hume 2001, Semiloff-Zelasko 1973, Silva 1973, Ultan 1978, Wanner 1989),
in-depth studies of metathesis in individual languages or language families (e.g. Alexander
1985, Besnier 1987, Black 1974, Butskhrikidze & van de Weijer 2001, Delancey
1989, Duménil 1983, 1987, Hume 1997b, Hume & Seo 2004, Isebaert 1988, Keyser
1975, Laycock 1982, Lyche 1995, Malone 1971, 1985, McCarthy 2000, Montreuil
1981, Powell 1985, Shaver & Shaver 1989, Smith 1984, Sohn 1980, Thompson &
Thompson 1969, Timberlake 1985), and experimental work exploring psycholinguistic
influences on metathesis (e.g. Fay 1966, Makashay 2001, Winters 2001). An online
database of metathesis cases is also being developed and will ultimately contain information
on all reported cases of metathesis.3 It is clear from these studies that while
metathesis is less common than processes such as assimilation and deletion, it can
nonetheless occur productively in a wide range of languages.
Yet, despite this large body of literature, to date there is no unified, explanatory
account of why metathesis occurs, why it favors certain sound combinations, and why
we obtain the output that we do. One reason for this relates to the observation that
crosslinguistically, metathesis can appear to be random due to the fact that a string of
sounds can be realized in one order in one language but in the opposite order in another
language, as pointed out by Blevins and Garrett (1998). Consider metathesis involving
a glottal consonant in Balangao, Hungarian, Pawnee, and Basaa. As I outline below,
in the first two languages, the glottal is realized after a consonant, while it surfaces
before a consonant in the last two.
In Balangao, vowel deletion leads to the juxtaposition of two consonants (e.g.
/baŋad-an/ [baŋdan] ‘time of returning’). When this would give rise to the glottals
1 An additional factor relating to the perpetuation of the metathesis myth concerns the nature of phonological
theories. In linear and nonlinear phonological theories, there is a principled reason to resist recognizing
metathesis as a legitimate phonological process of segment reversal: extending the theory to account for the
inherently distinct nature of metathesis has the potential of opening ‘a Pandora’s box of implausible-seeming
. . . processes’ (Janda 1984:92). Indeed, Webb (1974) claims that metathesis does not exist as a regular
phonological process in synchronic grammar. For additional discussion, see Hume 2001.
2 A more comprehensive listing of references for metathesis can be found at http://www.ling.ohiostate.
edu/ ehume/metathesis/bibliography/.
3 The complete database is housed in the Department of Linguistics at the Ohio State University. For a
listing of metatheses in the database as well as information on each case, see the metathesis website: http:// ehume/metathesis/.
[ʔ, h] occurring before a nasal or oral plosive, the expected ordering of the sound
combination is reversed, and the glottal occurs after the plosive, as in 1 (Shetler 1976).
(1) Balangao
ʔi-hégép ʔighép *ʔihgép ‘bring in’
pRhéd-én pRdhén *pRhdén ‘allow, accept’
géhéb-én gébhén *géhbén ‘burn it’
ma-hédém madhém *mahdém ‘night’
CV-ʔopat ʔopʔat *ʔoʔpat ‘four each’
CV-ʔéném ʔénʔém *ʔéʔném ‘six each’
A similar pattern is observed in Hungarian, though forms that undergo metathesis
are limited to stems with [h] and an approximant. The relevant forms are a subclass
of morphemes whose last nucleus alternates with ∅; the vowel has traditionally been
analyzed as epenthetic and is subject to vowel harmony (e.g. /bokr/ [bokor], [bokr-ot]
‘bush.NOM/ACC’, /term/ [terem], [term-et] ‘hall.NOM/ACC’). Given the order of the glottal
before the approximant in the dative forms in 2, the expected linear order of the sounds
in the plural is glottal + approximant. The order is reversed, however, with the glottal
consistently occurring after the consonant (Vago 1980, Kenesei et al. 1998, Siptár &
Törkenczy 2000).
(2) Hungarian
tehernek terhek *tehrek ‘load’
peheynek peyhek *pehyek ‘fluff’
keheynek keyhek *kehyek ‘chalice’
Metathesis in Pawnee is almost the mirror image of that observed in Hungarian. In
this case, the expected sequence /r + h/ is reversed so that the glottal is positioned
before the consonant, as shown in 3 (Parks 1976).
(3) Pawnee
ti-ir-hisask-hus tihrisasku ‘he is called’
ti-a-hura r-hata tahurahrata ‘it’s a hole in the ground’
ti-ur-ha k-ca tuhrakca ‘a tree is standing’
A glottal is also realized postvocalically in Basaa, a more general case of the metathesis
observed in Pawnee. In Basaa, metathesis involves the glottal fricative of the indirective
causative morpheme and a stem-final consonant (Schmidt 1994). The causative
marker is analyzed as /-aha/, with the initial vowel alternating with ∅. The full morpheme
is realized following a CVCVC stem, as in 4a. After a CV or CVC stem, the initial
vowel of the marker is absent, shown in 4b and 4c respectively. Metathesis can be seen
in the set of examples in 4c where the expected ordering of the consonant and glottal
fricative is reversed. Other forms with consonant clusters in 4a show that metathesis
does not affect all consonant types (tones have been omitted for simplicity).
(4) Basaa
a. kobol koblaha koblak ‘peel’
pidip pid1aha pid1ak ‘hate’
aŋal eŋlaha aŋlak ‘tell’
b. ce ciha cek ‘destroy’
l: loha l:k ‘arrive’
ha heha hak ‘put’
c. lεl lehla lεlεk ‘cross’
teŋ tihŋa teŋek ‘tie’
1on 1uhna 1onok ‘promise’
1:l 1ohla 1:l:k ‘burst’
at ehda adak ‘unite’
The sample of metathesis cases just presented illustrates the challenge for any explanation
of metathesis: metathesis can appear to be random crosslinguistically since two
elements can surface in one order in one language but in precisely the opposite order
in another language.
A predictive theory of metathesis must also account for the observation that within
the set of attested metatheses, sound combinations can differ with respect to whether
one or both orders of the sounds is generally attested as the result of metathesis crosslinguistically.
As we just saw, for instance, when one of two consonants involved is a
glottal, neither order of the sounds seems to be favored crosslinguistically; rather, both
orders are observed emerging as the result of metathesis, depending on the language.
Metathesis involving homorganic liquids and nasals also seems to fall into this category;
the result of metathesis may be nasal + liquid or liquid + nasal, again depending on
the language. Old Spanish and Chawchila serve to illustrate.
Metathesis in Old Spanish, shown in 5a, was conditioned by vowel syncope in the
future and conditional formations of the verb which resulted in the contiguity of /nr/
(examples are given for future forms of the verb). The metathesized form competed
with one in which an obstruent stop was realized between the nasal and liquid. While
both coexisted in all forms of the future and conditional in Old Spanish, only the variant
with the intrusive consonant has survived in Modern Spanish (Wanner 1989, Martinez-
Gil 1990). In Chawchila (5b), metathesis is attested in the intensive possessor suffix
which displays two alternants, [-ilin] and [-inl-] (Newman 1944; see related discussion
in Stonham 1990). The VCVC alternant occurs word-finally while the VCC variant is
realized before a vowel-initial suffix. Newman (1944) reports that the same process
takes place within the unanalyzable noun theme, although no examples are provided.
While the linear ordering of similar consonants in the two languages changes by metathesis,
the order of the output differs: in Old Spanish, the nasal is prevocalic and the
liquid preconsonantal, while the reverse order is found in Chawchila.
(5) a. Old Spanish
poner porné, pondré *ponré ‘to put’
tener terné, tendré *tenré ‘to have’
venir verné, vendré *venré ‘to come’
b. Chawchila
tihthilin ‘one with many head lice’
patthilin ‘body-louse’
cawa ʔan patthinli ‘[he] shouted at the one with many body-lice’
There are also sound combinations where only one order is typically observed as
the result of metathesis. For example, when an intervocalic stop and fricative are involved,
the stop consistently surfaces before a vowel (Steriade 2001, cf. Silva 1973,
Makashay 2001). Consider Udi metathesis in 6. When a coronal plosive (oral stop or
affricate) would be expected to precede a coronal fricative or affricate through morpheme
concatenation, the stop consistently surfaces instead after the strident consonant.
Examples from the language’s verbal morphology illustrate. The last four forms demonstrate
that a noncoronal consonant does not metathesize with a following sibilant.
(6) Udi (Schulze 2002)
tad-esun tast’un4 ‘to give’
t’it’-esun t’ist’un ‘to run’
et+-esun e+t+’un ‘to bring’
bafd-sa bafst’a ‘falling into’
bot’-sa bost’a ‘cutting’
et+-sa e+t+’a ‘bringing’
tat’und-e+a tat’un+t’a ‘they (let) bring’
cf. ak’-esun ak’sun ‘to see’
aq’-esun aq’sun ‘to take’
lap-esun lapsun ‘to put on’
t+alx-sa t+alxsa ‘knowing’
A similar pattern is exemplified by metathesis in Faroese and Lithuanian. Where a
postvocalic fricative followed by two stop consonants would be expected, we find
instead the fricative flanked by the two stops. This can be seen for Faroese in 7, where
a velar stop metathesizes with an adjacent coronal fricative just in case it is followed
by another stop consonant (Lockwood 1955, Jacobson & Matras 1961, Rischel 1972,
Hume & Seo 2004).
(7) Faroese
baisk baiskυr baikst *baiskt ‘bitter’
fεsk fεskυr fεkst *fεskt ‘fresh’
rask raskυr rakst *raskt ‘energetic’
A similar pattern can be observed in Lithuanian, illustrated in 8 by a comparison of
the third person singular past imperfective verb forms with those of the imperative and
infinitive. In the former the order is fricative + stop, while in the latter the order is
reversed, giving stop + fricative (Kenstowicz 1972, Ambrazas 1997, Hume & Seo 2004).
(8) Lithuanian
plyeske plyeksk plyeksti *plyeskti ‘flash intensely’
tvyeske tvyeksk tvyeksti *tvyeskti ‘flash briefly’
bre+ko brek+k brek+ti *bre+kti ‘break (of dawn)’
brizto briksk briksti *briskti ‘fray’
4 There are two additional facts about the data that should be noted: first, syncope of unstressed /e/
provides the context for metathesis, and second, the coronal stop is realized as voiceless and glottalized after
a consonant.
In both Faroese and Lithuanian, the expected order is altered so that the fricative is
positioned between the two stops. To my knowledge, there are no attested cases of
metathesis where a stop shifts from postvocalic to interconsonantal position (e.g. /Vkst/
→ *[Vskt]).
The direction of metathesis involving intervocalic sonorant and stop consonants also
favors one order: given a sequence of two intervocalic consonants, VC1C2V, the obstruent
typically occurs in C2 position, the sonorant in C1.5 For example, in Elmolo, a
lowland East Cushitic language, metathesis occurs in the plural formation of nouns
(Zaborsky 1986). One type of plural is formed by the addition of the suffix /-o/, as in
9a. When the medial or final consonant of a bisyllabic noun is a (nongeminate) liquid,
the vowel of the last syllable elides, as in 9b. But metathesis occurs when the obstruent
stop would otherwise occur in C1 position, as shown in 9c.
(9) Elmolo
a. karris karriso ‘cheek’
ēk ēko ‘fire’
nān nāno ‘harpoon’
or oro ‘tree’
b. ilik ilko ‘tooth’
t+ilik t+ilko ‘foot’
elem elmo ‘sheep’
elon-te elno ‘cowry shell’
c. tikir tirko ‘catfish’
deker derko ‘horn’
mukul mulko ‘iron’
A similar pattern is observed for Sidamo in 10, where a root-final obstruent and a
following nasal metathesize; the nasal is realized as homorganic with the adjacent
obstruent. Metathesis systematically occurs before suffixes beginning with /n/, the only
suffix-initial sonorant in the language (Hudson 1975). A similar process is observed
in other Ethiopian languages such as Darasa, Gedeo, Hadiyya, and Kambata (Hudson
1975, 1995).
(10) Sidamo
hab-nemmo hambemmo ‘we forget’
gud-nonni gundonni ‘they finished’
dod-nanni dondanni ‘he will run’
it-noommo intoommo ‘we have eaten’
has-nemmo hansemmo ‘we look for’
duk-nanni duŋkanni ‘they carry’
ag-nummo aŋgummo ‘we drank’
ag-no aŋgo ‘let’s drink’
ag-ni aŋgi ‘he drank’
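As an informal illustration, the Sidamo alternation in 10 can be sketched as a short Python function. This is only a sketch under stated assumptions: the function name sidamo_surface, the segment table, and the treatment of each character as one segment are hypothetical conveniences and are not part of Hudson's (1975) description. The velar nasal is written ŋ.

```python
# Hypothetical sketch of the Sidamo pattern in (10): a root-final obstruent and a
# suffix-initial /n/ surface in the opposite order, and the nasal takes the place
# of articulation of the obstruent. The segment table is an illustrative
# assumption covering only the obstruents that appear in (10).
OBSTRUENT_TO_HOMORGANIC_NASAL = {
    "b": "m",
    "d": "n", "t": "n", "s": "n",
    "k": "ŋ", "g": "ŋ",
}


def sidamo_surface(root: str, suffix: str) -> str:
    """Concatenate root and suffix, applying obstruent + n metathesis."""
    final = root[-1] if root else ""
    if suffix.startswith("n") and final in OBSTRUENT_TO_HOMORGANIC_NASAL:
        nasal = OBSTRUENT_TO_HOMORGANIC_NASAL[final]
        # Expected order: obstruent + n. Surface order: homorganic nasal + obstruent.
        return root[:-1] + nasal + final + suffix[1:]
    # No obstruent + n sequence: plain concatenation.
    return root + suffix


if __name__ == "__main__":
    for root, suffix in [("hab", "nemmo"), ("gud", "nonni"), ("duk", "nanni")]:
        print(f"{root}-{suffix} -> {sidamo_surface(root, suffix)}")
```

The function reproduces the forms listed in 10 only; it makes no claim about roots ending in segments outside the table, which are simply concatenated.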
In the examples just cited, an obstruent occurs by metathesis between a sonorant consonant
and a vowel (VSonObstV). Cases in which the result of metathesis has the obstruent
5 This type of metathesis has been used as evidence for the SYLLABLE CONTACT LAW, which requires the
sonority of the coda to be greater than that of the following onset (Vennemann 1988). See Hume 1998, 2001
and Hume & Seo 2004 for arguments against this analysis of metathesis and Seo 2003 for arguments against
the syllable contact law more generally.
positioned before the sonorant consonant (VObstSonV) are less common, though not unattested.
Of the thirty-seven cases of consonant/consonant metathesis examined that involve
morphophonemic alternations in synchronic grammars, eleven involve obstruent/
sonorant combinations. In ten of these cases, the obstruent occurs by metathesis prevocalically
after the sonorant consonant (VSonObstV), while in one, the Costanoan language
Mutsun, the obstruent occurs postvocalically before the sonorant (see the discussion in
It has also been observed that acoustic/auditory cues to the identification of the
sequence resulting from metathesis are often better, or optimized, compared to those
of the expected, yet nonoccurring, order (Hume 1998, Steriade 2001). Cases involving
obstruent stops provide a clear example of this type of pattern. It is well established
that obstruent stops depend heavily on contextual cues such as release burst and formant
transitions for the identification of their manner and place of articulation (see §2.3).
Thus, the occurrence of stops in a context in which these cues are present should be
preferred to their occurrence in a context in which the cues are absent or partially
masked. In this view then, prevocalic position can be considered preferable to preconsonantal
position since both release burst and transition cues are present in the former
context. Similarly, adjacency to a vowel is preferable to nonadjacency to a vowel. Of
interest here is the observation that these are precisely the contexts to which stops
commonly shift by metathesis. As just seen for stop/fricative metathesis in Faroese and
Lithuanian, the stop surfaces after a vowel instead of between two consonants. A stop/
fricative sequence is also involved in Udi, illustrated in 6, with the stop occurring in
prevocalic position. Similar patterns are observed in Elmolo (9) and Sidamo (10), where
a stop shifts to prevocalic position at the expense of nasals and liquids. These cases
represent a recurring pattern in metathesis: a consonant with potentially weak phonetic
cues often emerges in a context in which the cues are more robust than they would
have been in the expected, yet nonoccurring, order. An explanatory account of metathesis
must also account for this observation.
In the preceding discussion, I have illustrated a number of interrelated observations
concerning metathesis. First, the direction of change in metathesis can differ from
language to language. Thus, a similar sound combination can be realized in one order
in language A, but in the reverse order in language B. Second, for some sound combinations,
only one order of the sounds is generally observed crosslinguistically as the result
of metathesis, while either order seems just as likely to be attested for other combinations.
Third, the acoustic/auditory cues to the identification of the sequence resulting
from metathesis are frequently better, or optimized, compared to those of the expected,
yet nonoccurring, order. A successful model of metathesis should be able to provide
an explanatory account of each of these observations.
As I show in §§3 and 4, these patterns receive a straightforward account when we
consider two important factors: (i) the nature of the sounds involved, and (ii) the influence
of existing patterns in the language. To anticipate this discussion, I suggest that
for metathesis to occur, two conditions must be met. First, there must be indeterminacy
in the signal, with indeterminacy defined as a function of: (i) the listener’s experience
with the elements involved (e.g. sounds, sound sequences, morphemes, words, etc.),
and (ii) the quality of information in the signal as determined by the types of sounds
involved, the context in which they appear, the phonetic cues present, and so on. Second,
the order of elements opposite to that occurring in the input must be an attested structure
in the language. Indeterminacy sets the stage for metathesis, and the knowledge of the
sound patterns of one’s language influences how the signal is processed and, thus, the
order in which the sounds are parsed. To be specific, the order inferred from the signal
is consistent with that which occurs most frequently in the language. This proposal is
consistent with Fay’s (1966:88) earlier speculations regarding metathesis: ‘when listeners
hear speech that is expected to be in the native language, their perceptual identifications
are directed by their knowledge of sequential probabilities in the language as
well as by the acoustic stimulus’.
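The two conditions just outlined can be made concrete with a small sketch in Python. It is a deliberate simplification and not a formalization proposed in this paper: indeterminacy is reduced to a yes/no flag, although it is defined above as a function of listener experience and signal quality; the function name parse_order and the frequency counts in the usage example are invented for illustration; and the decision to keep the input order when the two orders are equally frequent is an added assumption.

```python
from typing import Dict, Tuple

Sequence = Tuple[str, str]


def parse_order(
    x: str,
    y: str,
    indeterminate: bool,
    sequence_counts: Dict[Sequence, int],
) -> Sequence:
    """Return the order in which an input sequence x + y is parsed.

    sequence_counts maps ordered sound pairs to how often the listener has
    encountered them (hypothetical numbers standing in for language experience).
    """
    expected = (x, y)
    reverse = (y, x)

    # Condition 1: without indeterminacy in the signal, the input order is recovered.
    if not indeterminate:
        return expected

    # Condition 2: the opposite order must be an attested structure in the language.
    reverse_count = sequence_counts.get(reverse, 0)
    if reverse_count == 0:
        return expected

    # The inferred order is the one that occurs most frequently in the language.
    # Assumption: a tie leaves the input order unchanged.
    if reverse_count > sequence_counts.get(expected, 0):
        return reverse
    return expected


if __name__ == "__main__":
    # Invented counts for a language in which [hr] is far more common than [rh].
    counts = {("r", "h"): 3, ("h", "r"): 40}
    print(parse_order("r", "h", indeterminate=True, sequence_counts=counts))
```

With the same input and a language whose counts favor the other order, the function returns the other order, which is the sense in which a similar sound combination can surface differently from language to language.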
Support for this approach comes first from the metathesis patterns themselves, but
also from a large body of research in phonetics, phonology, historical linguistics, and
psycholinguistics. For example, at the heart of the proposed account is the assumption
that an individual’s knowledge of his/her language, including its patterns of usage, is
an effective predictor of the direction of metathesis. Support for this proposal comes
from extensive research demonstrating that listeners are sensitive to the frequency of
the words, sounds, and sound combinations of their language (see, among others, Bybee
1985, 2001, Frisch 1996, Frisch et al. 2000, Lindblom 1990, Luce 1986, Makashay
2001, Pierrehumbert 1994, Pitt & McQueen 1998, Saffran, Aslin & Newport 1996,
Saffran, Newport & Aslin 1996, Vitevitch & Luce 1999). This approach also builds
on earlier, primarily diachronic, studies of metathesis that point to the influence of a
language’s sound patterns on metathesis (Grammont 1933, Hock 1985, Ultan 1978).
The proposal that indeterminacy is a necessary prerequisite for metathesis draws on a
large body of research in phonetics, phonology, and historical linguistics showing the
importance of acoustic and auditory cues in shaping phonological systems (see, among
others, Bladon 1986, Blevins & Garrett 1998, Côté 1997, Fay 1966, Flemming 1995,
Hume 1998, Hume & Johnson 2001a,b,c, Jun 1995, Liljencrants & Lindblom 1972,
Lindblom 1990, Mielke 2002, 2003, Ohala 1981, 1993, 1996, Padgett 2001, Silverman
1995, Steriade 1995, 1997, 2001, Winters 2001, Wright 1996, 2001). A key aspect of
the proposal developed in this paper, however, is that neither the phonetic nature of the
sounds involved nor one’s familiarity with native sound sequences ALONE is sufficient to
provide a fully predictive account of metathesis. Rather, it is by taking into account
BOTH factors that we are able to understand why certain sound combinations tend to
undergo metathesis, why others are favored as the result of metathesis, why patterns
of metathesis differ across languages, and, importantly, why metathesis occurs in the
first place.
The data used in this study are drawn from a database of thirty-seven cases of
consonant/consonant metathesis, supplemented by cases of consonant/vowel metathesis
when relevant (note that some languages have more than one metathesis). While the
proposal in this paper is intended to extend to all types of metathesis, the emphasis is on
consonant/consonant metathesis due largely to the fact that while there is considerable
documentation regarding metatheses involving a consonant and vocoid (see e.g. Blevins
& Garrett 1998, Hume 1997b, McCarthy 2000), less is known about the general
patterns of consonant/consonant metathesis. Most of the cases discussed involve metatheses
that can be observed as (morpho)phonological alternations or as variable realizations
of a particular order of sounds in the synchronic grammar of a language. In some
instances metathesis is observed with great regularity throughout a language, while in
others metathesis may involve only a handful of words. I consider both types of data
as valid for the present study since my interest is in understanding the factors underlying
why and how metathesis occurs. Questions relating to how metathesis is to be formalized
within phonological theory or how a single occurrence of metathesis generalizes and
spreads throughout a language are all important, yet are not ones that I specifically
address here. I refer the reader to the references cited throughout this article for relevant
discussion of these issues (see also n. 2).
2. BACKGROUND. Speech processing plays a key role in the explanation of metathesis
developed in this article and, in this regard, two important factors that bear on the
processing of speech sounds figure centrally. The first relates to the knowledge that
we have of our native language, that is, our language experience. Processing speech
is facilitated by our experience with, among other things, the words, morphemes,
sounds, and sound sequences that make up our native language, as well as the frequency
with which these elements occur. The second factor concerns the quality of the information
that occurs in the speech signal as determined by the types of sounds involved,
the phonetic cues available for the identification of the sounds, and so forth. As I discuss
in §2.1 below, given the inescapable influence of one’s language knowledge, a sequence
of sounds with identical phonetic cues may be parsed differently by different listeners
(of different languages or even of the same language). Each of these points is developed
in greater detail below, thereby providing a basis for the discussion of metathesis to
2.1. LANGUAGE EXPERIENCE AND LANGUAGE USAGE. How a particular auditory speech
signal is parsed by a hearer is influenced not only by the acoustic/auditory information
present, but also by the knowledge that the individual has of his or her language (Lindblom
1990, Luce 1986). Strong evidence in support of this view comes from psycholinguistic
research in first and second language acquisition, and speech and word
It is well established that infants are born with the ability to discriminate sounds that
contrast in their native language as well as those that do not (Aslin et al. 1981, Best
et al. 1988, Polka & Werker 1997, Streeter 1976, Trehub 1976, Werker et al. 1981,
Werker & Tees 1984, 1999). Shortly thereafter, however, the effects of native-language
influence can be observed.7 For example, research shows that by four months of age,
infants being brought up in a monolingual environment are able to distinguish their
native language from a similar yet unfamiliar language (see Werker & Tees 1999). By
six months, infants show sensitivity to language-specific grammatical information (Shi
et al. 1998) and a preference for the prosodic system of their native language (Jusczyk
1997). The early influence of language experience on the parsing of the speech signal
is also observed by a decline in the infants’ ability to discriminate between sounds that
do not serve a contrastive function in their language. For example, the well-known
studies of Werker and colleagues (Werker et al. 1981, Werker & Tees 1984) reveal
that six-to-eight-month-old English-learning infants were able to distinguish place of
articulation contrasts in Hindi, just like Hindi adults. English-speaking adults, however,
had particular difficulty distinguishing the retroflex/dental place contrast. Of significance
is the finding that by ten to twelve months of age, English-learning infants were
no longer able to discriminate the nonnative contrast, thus behaving in a manner similar
6 It is beyond the scope of this paper to discuss the many principles involved in the processing of speech
sounds. I refer the reader to works such as Bladon 1986, Bregman 1990, and Johnson 1997 for in-depth
discussion of the principles of auditory and acoustic phonetics and how these principles relate to parsing
information in the speech signal.
7 Native-language influence may occur even in utero or shortly after birth. Research by Moon and colleagues
(1993) shows that infants as young as two days old display a preference for listening to their native
language, be it English or Spanish.
to English adults.8 The influence of language experience is also revealed by the infant’s
ability to distinguish familiar vs. unfamiliar elements within their own language. Jusczyk
and Aslin (1995) show that by seven and a half months of age, infants are able to
show a preference for familiar vs. unfamiliar words. Similarly, by nine months, infants
can discriminate speech sound sequences that occur more frequently from those that
occur less frequently within their own language (Jusczyk & Luce 1994).
It is important to note, however, that the decline in perceptual abilities due to native-language
experience is not equally pronounced for all consonantal distinctions; infants
and adults have been shown to maintain their ability to discriminate some nonnative
phonetic contrasts (Best 1994, Best et al. 1988, Werker & Tees 1984). The results from
Best et al. 1988 reveal that the distinction between apical and lateral Zulu clicks remains
distinguishable to ten-to-twelve-month-old infants as well as to adults. These findings
are taken to indicate that the decline in perceptual sensitivities is limited to sounds that
are similar to the sounds of the infant’s native language (Best 1994). Thus, the evidence
from first language acquisition shows clearly that native-language familiarity enables
us to fine-tune our ability to process the words and sounds of our language. One
consequence of this fine-tuning for second language acquisition is that listeners are
more adept at perceiving sounds of their native language than those of a second language
acquired later in life (e.g. Dupoux et al. 1997, Francis & Nusbaum 2002, Polka &
Werker 1994).
Psycholinguistic research in speech and word processing also shows that the ability
to process speech is facilitated by a listener’s familiarity with various dimensions of
the native language’s phonological system. This includes the language’s sounds (Pitt &
Samuel 1990), phonotactics (Hallé et al. 1998, Massaro & Cohen 1983, Pitt 1998,
Pitt & McQueen 1998), patterns of contrast (Dupoux et al. 1997, Harnsberger 2001,
Hume & Johnson 2003, Lahiri & Marslen-Wilson 1991, Otake et al. 1996), and syllable
structure (Cutler & Norris 1988, Pallier et al. 1993, Pitt et al. 1998, Treiman & Danis
1988). For example, listeners are biased to parse consonant clusters that are phonotactically
impermissible into permissible sequences (Hallé et al. 1998, Massaro & Cohen
1983, Pitt 1998). Pitt (1998) found that an epenthetic schwa is more likely to be perceived
between the consonants of phonotactically illegal consonant clusters (e.g. [tl+]
→ [təl+]) than legal clusters (e.g. [tr+] → [tər+]).
Phonological contrast also impacts speech processing by influencing the amount of
attention paid to the cues of sounds that occur in the language. Otake and colleagues
(1996) investigated the role of nasal place of articulation on the processing of place in
a following stop consonant by Japanese and Dutch subjects. They found that Japanese
listeners made use of place cues in a nasal consonant to obtain information about the
place of articulation of a following stop. Dutch listeners, in contrast, ignored place
information in a preceding nasal when processing the place identity of a following stop
consonant. As the authors point out, these findings reflect the different phonological
status accorded place of articulation in preconsonantal nasals in the two languages.
In Japanese, a nasal is obligatorily homorganic with a following stop (Vance 1987).
Conversely, while place assimilation does occur in nasal-stop sequences in
Dutch, it does not have the obligatory status it has in Japanese; assimilation can fail to
occur both within and across word and syllable boundaries. As the authors conclude,
‘place of articulation in a nasal is a reliable source of information about a following
stop for Japanese listeners, and they make use of it; it is less reliable for Dutch listeners,
and it is not used’ (Otake et al. 1996:3841). Similarly, Hume and colleagues (1999)
tested the perceptual salience of stop place of articulation in the context CV for native
speakers of Korean and American English. The results revealed that listeners’ sensitivity
to information contained in the vowel transition following the consonant was significantly
greater for Korean listeners than it was for American English listeners. The
explanation offered for this finding relates to differences in the system of phonological
contrasts in each language. Unlike English, Korean contrasts tense, lax, and aspirated
stops, a contrast that is cued in part by the amplitude of aspiration (Kim 1994). Due
to these phonological contrasts, Korean listeners learn to focus greater attention on the
interval of time following the stop-release burst, that is, on the vowel transitions.
8 Infants’ decline in sensitivity to native-language contrast can occur even earlier for vowels
(six months of age) than for consonants (Kuhl et al. 1992).
These studies underscore the important fact that since languages differ in terms of
their lexicons and phonologies, the influence of linguistic knowledge on the way that
a speech signal is parsed is necessarily language specific. We learn to focus our attention
on the phonetic cues that are important for distinguishing the meaningful elements of
our language while ignoring those that are not. This can then yield a considerable
degree of language specificity when it comes to processing speech sounds. Thus, when
presented with identical sound stimuli, speakers/hearers with different linguistic experiences
can process stimuli in different ways. The language-specific bias in speech processing
has important implications for understanding metathesis, as I discuss in more
detail in §§3 and 4, since it means that a signal may be parsed in different ways
depending on the native-language experience of the speaker/hearer.
Familiarity with the USAGE of elements that make up one’s language also influences
speech processing. How words are parsed is affected by factors such as their frequency
of occurrence, the number of neighboring words that are phonetically similar to them,
the predictability of the sequences of sounds in the word, and how familiar they are
to the listener (Frisch et al. 2000, Luce 1986, Luce & Pisoni 1998, Pitt & McQueen
1998, Pollack et al. 1959, Savin 1963, Vitevitch & Luce 1999). With respect to word
frequency, for example, the higher the frequency of a word, the higher its probability
of being correctly recognized (Luce 1986). Bybee (2001) claims that this is because
high-frequency words have increased lexical strength due to repetition; little-used items
will tend to fade in strength and grow more difficult to access. She also argues, based on
a range of experimental evidence, that type frequency (as opposed to token frequency) is
an important determinant of productivity (Baayen & Lieber 1991, Bybee 1985, 1995,
Moder 1992, Wang & Derwing 1994). Frequency of individual sounds and sound
sequences also impacts recognition in infants and adults (Bush 2001, Coleman &
Pierrehumbert 1997, Makashay 2001, Pierrehumbert 1994, Pitt & McQueen 1998,
Pitt & Samuel 1990, Saffran, Aslin & Newport 1996, Vitevitch & Luce 1999). The
relative acceptability of nonsense words with occurring and nonoccurring phonotactic
patterns has been shown to be based on the distribution of the patterns in the lexicon;
patterns with high type frequency are judged by listeners to be more acceptable (Bybee
2001, Pierrehumbert 1994, Vitevitch & Luce 1999).
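The difference between token frequency and type frequency discussed above can be made concrete with a small sketch. The toy lexicon, its counts, and the function name `biphone_frequencies` are illustrative assumptions rather than data or methods from the studies cited, and letters stand in for segments. The sketch shows only how the two measures can diverge for the same sound sequence.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> token count in some corpus (invented numbers).
TOY_LEXICON = {
    "stop": 50,
    "step": 30,
    "mist": 5,
    "task": 2,
    "asks": 1,
}

def biphone_frequencies(lexicon):
    """Return (type_freq, token_freq) Counters over two-segment sequences.

    Type frequency counts each word containing the sequence once;
    token frequency weights each word by how often it is used.
    """
    type_freq, token_freq = Counter(), Counter()
    for word, count in lexicon.items():
        # A set ensures a sequence is counted once per word.
        biphones = {word[i:i + 2] for i in range(len(word) - 1)}
        for bp in biphones:
            type_freq[bp] += 1
            token_freq[bp] += count
    return type_freq, token_freq

type_freq, token_freq = biphone_frequencies(TOY_LEXICON)
print(type_freq["st"], token_freq["st"])  # 3 85
print(type_freq["sk"], token_freq["sk"])  # 2 3
```

In this invented lexicon the sequences "st" and "sk" differ only modestly in type frequency but sharply in token frequency, which is why the two measures need to be kept apart when assessing familiarity.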
2.2. THE NATURE OF SPEECH SOUNDS. As discussed in §2.1, how one parses an auditory
signal is strongly influenced by native-language experience. One’s ability to discriminate
sounds is also dependent on the speech sounds in question. Recall Best’s (1994)
conclusion that one reason Zulu clicks remain distinguishable to older infants and
adults is that they are dissimilar to sounds in the native language; a decline in
perceptual sensitivities is limited to sounds that are similar to those of the native
language. The findings also suggest that some speech sounds are simply more salient than
others. That is, the inherent quality of the phonetic cues of some sounds makes them
easier to identify than others, with clicks being examples of the former and retroflex
stops being examples of the latter. This underscores the importance of the nature of
speech sounds to the parsing of an auditory signal, as I outline below.
2.3. PERCEPTUAL SALIENCE. As is evident, the presence of phonetic cues is crucial
for the identification of a speech sound. The better the cue package, the more information
there is about the sound, and the easier the sound is for a listener to identify (for
related discussion, see Steriade 1995, 1997, Wright 1996). Phonetic cues are determined
by two principal factors: the nature of the sound in question and the context in which
it occurs. Since sounds that differ articulatorily can have different acoustic/auditory
cues, the precise nature of the sound in question is crucial.
Also critical to determining the quality and quantity of a sound’s phonetic cues is
context, such as position in a word or phrase, neighboring sounds, prosodic prominence,
and so on. Context can determine whether a cue is present or absent, as well as the
degree to which a particular cue is manifested. Consider burst release in stops, for
example. Prevocalically, the burst release of a stop is always present, regardless of
language. Phrase-finally, on the other hand, a stop may or may not be released, the
choice of which is determined on a language-by-language basis. In Korean, for example,
stops are unreleased in this position, while in English the burst is optional. Context
can also determine the degree to which a particular cue is present. A cue may be
diminished as a consequence of masking from adjacent sounds (Byrd 1994). For example,
the release burst of a stop may be masked by the frication of a following consonant,
or the frication of [h] may be masked by that of an adjacent fricative (Mielke 2003).
The occurrence of a sound in an unstressed, as opposed to stressed, syllable can also
result in weak cues due to, among other factors, compressed duration of formant transitions
and segment-internal cues. Compressed duration may also be relevant to the
occurrence of a consonant in preconsonantal, as opposed to word-final, position. Beckman
and Edwards (1990), for example, found that segments in word-final position are
generally longer than those in word-medial position, with lengthening being even more
evident at the end of an intonational phrase. If greater duration is at issue, it is reasonable
to assume that a consonant’s perceptual cues carry more information and are thus more
salient in word-final than in word-medial coda position.
As detailed in Wright 1996, some sounds are more dependent than others on contextual
cues to their identification. To illustrate, compare the perceptual cues to place and
manner of articulation for stops and fricatives.
(11) Perceptual cues to obstruent stops
     manner:  silence              internal
              release burst        contextual: consonant release
              transition duration  contextual: VC, CV transitions
     place:   F2 transition        contextual: VC, CV transitions
              burst spectrum       contextual: consonant release
As shown in 11, stop consonants are heavily dependent on contextual cues for their
identification, in particular, release burst and vowel formant transitions. Note that place
of articulation is entirely dependent on contextual cues. As Blumstein and Stevens
(1979) point out, when both vowel transition and burst are present, the spectral characteristics
for a particular place of articulation are enhanced relative to the characteristics
that exist for either one of the components separately. Further, identification of place
of articulation is less accurate in unreleased stop consonants than in released ones
(Blumstein & Stevens 1979, Halle et al. 1957, Malécot 1956, Stevens & Blumstein
1978, Wang 1959). Since release bursts are always present for stop consonants at the
onset to a vowel, prevocalic position is favorable for the perceptibility of a stop. In
preconsonantal position, in contrast, bursts are frequently lacking. Prevocalic position
is also favored for stops since CV transitions provide better cues than VC transitions
(Fujimura et al. 1978). Further, since, from an auditory perspective, auditory nerve
fibers show a greater response at the onset of a stimulus signal than at the offset, a
prevocalic stop is expected to be more salient than a postvocalic one (Bladon 1986,
Mielke 2002, Wright 1996, though see Steriade 1995 on retroflexion).
Compared to stop consonants, fricatives have stronger internal cues to both place
and manner of articulation, as displayed in 12. They are therefore less dependent on
context for information regarding their identity and, as a result, they generally fare
better in poorer contexts (Wright 1996).
(12) Perceptual cues to fricatives
     manner:  frication noise      internal
              noise duration       internal
     place:   frication spectrum   internal
              frication amplitude  internal
              F2 transition        contextual: VC, CV transitions
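The cue inventories in 11 and 12 can be sketched as a simple lookup that reports which cues survive in a given context. This is a minimal illustration under stated assumptions: the names `STOP_CUES`, `FRICATIVE_CUES`, `available_cues`, and `place_cue_count` are hypothetical, and counting every cue equally is a simplification, since nothing in the text assigns weights to cues.

```python
# Cue inventories transcribed from (11) and (12). None marks an internal cue;
# otherwise the tuple lists the contextual sources that can supply the cue.
STOP_CUES = {
    ("manner", "silence"): None,
    ("manner", "release burst"): ("consonant release",),
    ("manner", "transition duration"): ("VC transitions", "CV transitions"),
    ("place", "F2 transition"): ("VC transitions", "CV transitions"),
    ("place", "burst spectrum"): ("consonant release",),
}
FRICATIVE_CUES = {
    ("manner", "frication noise"): None,
    ("manner", "noise duration"): None,
    ("place", "frication spectrum"): None,
    ("place", "frication amplitude"): None,
    ("place", "F2 transition"): ("VC transitions", "CV transitions"),
}

def available_cues(cues, context):
    """Return the cues present given the set of contextual sources available."""
    return [
        key for key, sources in cues.items()
        if sources is None or any(s in context for s in sources)
    ]

def place_cue_count(cues, context):
    """Count the place cues that remain available in the given context."""
    return sum(1 for kind, _ in available_cues(cues, context) if kind == "place")

# Assumed contexts: no contextual support at all (e.g. an unreleased consonant
# flanked by consonants) versus released prevocalic position.
no_context = set()
prevocalic = {"consonant release", "CV transitions"}

print(place_cue_count(STOP_CUES, no_context),
      place_cue_count(FRICATIVE_CUES, no_context))   # 0 2
print(place_cue_count(STOP_CUES, prevocalic),
      place_cue_count(FRICATIVE_CUES, prevocalic))   # 2 3
```

With no contextual support the stop retains no place cues while the fricative retains two, which is the sense in which fricatives are less dependent on context than stops.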
Context can also provide modulation in the signal, thus facilitating the identification
of speech sounds. Kawasaki (1982) and Ohala (e.g. 1992, 1993) propose that sharper
changes in the speech signal serve to increase the salience of cues in the portion of the
signal where the modulation takes place: the greater the magnitude of the modulation,
the better a given signal is detected. Consequently, larger modulations survive better
than smaller ones since, as Kawasaki points out, if two sounds in a sequence are
acoustically and auditorily similar, they would be subject to confusion.
The acoustic and auditory cues of a given speech sound are thus determined both
by the nature of the sound in question and by the context in which it appears. As a
result, two speech sounds occurring in exactly the same environment, produced in an
identical manner by speakers with identical vocal tracts, can be expected to generate
the same acoustic and auditory cues. Yet, as already noted, we are all familiar with
some language and, as Lindblom (1990:408) states, ‘if we know a certain language,
we can not help imposing that knowledge on the signal’. We can therefore conclude
that how an individual parses an auditory signal is a function of the quality of the speech
sounds involved as well as, importantly, the individual’s native-language experience.
2.4. SUMMARY. A key factor influencing how speech is processed is the knowledge
that an individual has of his or her language. This naturally includes familiarity with
the elements that make up the language as well as their patterns of usage, including
frequency of occurrence. Speech processing is also dependent on the nature of the
sounds involved and the context in which the sounds occur. As I show, each of these
factors plays a key role in explaining observed patterns of metathesis.
3. INDETERMINACY. As just discussed, the way that a speech signal is parsed is
strongly influenced by one’s native language. Particularly important for our understanding
of metathesis is the finding that this influence is strongest when information specifying
a sound or sound sequence is indeterminate (Pitt & McQueen 1998).9 Indeterminacy
in this context relates to one’s ability to parse a given speech signal which, as we have
seen, is determined both by the listener’s native-language experience and by the nature
of the speech sounds involved. I define indeterminacy concerning the order of speech
sounds in 13:
(13) INDETERMINACY OF ORDER describes a state in which there is insufficient
information concerning the linear ordering of the elements involved. Indeterminacy
is a function of two factors:
a. the listener’s experience with the elements involved (e.g. sounds, sound
sequences, morphemes, words, etc.);
b. the quality of information occurring in the speech signal (e.g. the types
of sounds involved, the context in which the sounds occur, the phonetic
cues available, etc.).
When there is indeterminacy, a listener is biased to parse the signal in a manner
consistent with the attested patterns of his/her language. Pitt and McQueen (1998), for
example, found that the transitional probabilities of voiceless alveolar and postalveolar
fricatives at the end of nonwords influenced listeners’ identification of an ambiguous
fricative as well as that of the following stop consonant. This is consistent with the
findings of Vitevitch and Luce (1999), which reveal segment and sound sequence
probabilities to be most influential when listeners are presented with unfamiliar words.
Recognition of familiar words, on the other hand, tends to be influenced more by
competition with similar sounding words in the language, rather than by sublexical
(e.g. phonotactic) patterns. Indeterminacy thus forces the listener to rely on language
experience to parse the signal.
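The notion of transitional probability invoked here can be illustrated with a short sketch. The word list is invented, "S" is used as an ASCII stand-in for a postalveolar fricative, and the function names are hypothetical. The sketch is not a model of the experimental procedure in the studies cited; it shows only how an ambiguous segment could be resolved in favor of the more probable continuation.

```python
from collections import Counter

def transitional_probabilities(words):
    """Estimate P(next segment | current segment) from a list of words."""
    pair_counts, first_counts = Counter(), Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def resolve_ambiguous(prev, candidates, probs):
    """Pick the candidate segment that is most probable after `prev`."""
    return max(candidates, key=lambda c: probs.get((prev, c), 0.0))

# Invented toy lexicon: "s" is favored after "i", "S" after "u".
TOY_WORDS = ["mis", "kis", "tis", "fiS", "duS", "luS", "puS"]
probs = transitional_probabilities(TOY_WORDS)

print(probs[("i", "s")])                          # 0.75
print(resolve_ambiguous("i", ["s", "S"], probs))  # s
print(resolve_ambiguous("u", ["s", "S"], probs))  # S
```

The same ambiguous fricative is thus assigned to different categories depending on the preceding vowel, purely as a consequence of the distribution of sequences in the toy lexicon.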
Significantly, similar conclusions can be drawn regarding the role of indeterminacy
in the identification of linear order. In Broadbent and Ladefoged’s (1959) investigation
into the perception of order, they found that experience with sounds and sound sequences
facilitates identification of the order in which sounds occur. Warren (1982:
119) notes that ‘perception of speech and music seems to involve initial recognition
of groupings consisting of several sounds. If required, component sounds and their
orders may be inferred from these familiar sequences, even though they cannot be
perceived directly’ (see also Makashay 2001 on the perception of obstruent order in
These studies provide insight into why metathesis occurs and why languages differ
with regard to the sounds that undergo metathesis and the sequences that emerge as a
result. First, a listener makes use of his/her knowledge of native-language patterns to
facilitate the identification of the order of sounds. Second, this influence is strongest
when information about ordering is indeterminate, in which case the order is inferred.
It is thus reasonable to conclude that for metathesis to occur, there must be indeterminacy
in the speech signal (Fay 1966). Further, how the signal is parsed depends on the
listener’s native-language experience; specifically, the sequence resulting from metathesis
corresponds to one with which the listener has had the most experience.
9 John Ohala (e.g. 1981, 1993) has pointed to the importance of indeterminacy (or, in his terms, AMBIGUITY)
as a key factor in a listener’s misapprehension of the speech signal, the basis of sound change in his view.
The ideas in this article build on his important work in this area. Importantly, however, my approach differs
in that an individual’s knowledge of the elements of his/her language and their usage is given a central
role. Thus, in my view, how a listener interprets, or parses, a speech signal is language specific rather than
universal, as Ohala assumes.
Blevins and Garrett (1998) and Steriade (2001) also note that indeterminacy in the input is involved in
metathesis (they use the term ambiguity). They follow Ohala’s approach and, consequently, language usage
does not play a key role in their proposals.
The impact of native-language familiarity on speech processing is strongly supported
by crosslinguistic research, which in turn gives us insight into some of the metathesis
patterns seen in §1. For example, Mielke’s 2003 study of the perception of [h] by
listeners with different native-language backgrounds underscores language specificity
in processing speech. His results show that in both prevocalic and postvocalic position,
/h/ is significantly more perceptible to Turkish and Arabic listeners than to English
and French subjects (p 0.001). Further, prevocalic /h/ is significantly more perceptible
to English subjects than to French subjects (p 0.009). These results reflect the sound
patterns in the languages and, consequently, the degree of familiarity that the listeners
have with the sequences in question. Turkish and Arabic listeners have the highest
degree of familiarity with sequences involving /h/ given that the glottal occurs both
before and after consonants in those languages. English listeners have less experience
in this regard since /h/ is limited to prevocalic position, while French subjects are least
familiar since /h/ does not occur in the language at all.
Mielke’s results concerning the crosslinguistic perception of /h/ provide insight into
observed patterns of metathesis involving glottal consonants. In some languages, as
was shown for Hungarian and Cebuano, the temporal organization of an intervocalic
glottal/consonant input is resolved with the glottal being realized in C2 position. In
Pawnee and Basaa, however, the mirror-image is found. Of particular interest is the
observation that the input order in each case is a nonoccurring or infrequent sequence
in the language. I would suggest that the listener’s sensitivity to the sequence is weak
in these cases due to the listener’s low degree of familiarity with the input. Listeners
learn to focus attention on meaningful cues in the signal, and to ignore others. Consequently,
if the order of sounds in the input is unfamiliar to the listener, he/she may not
be tuned to the cues that can aid in identifying the sound combination. We then correctly
predict that listeners with different native-language backgrounds will process sound
combinations differently if in one language the sequence occurs while in the other it does
not. Yet, familiarity need not be considered all or none. Consistent with psycholinguistic
studies, the listener is biased to parse the signal in a manner consistent with the most
robust or frequent pattern in cases in which both orders of a given sequence occur in
a language. This claim is developed in more detail in §4.
3.1. QUALITY OF INFORMATION IN THE SPEECH SIGNAL. That indeterminacy is a factor
in metathesis is also evidenced by observations concerning the types of sounds that
metathesize. They fall into two general, yet overlapping, categories. The first is characterized
by diminished perceptual salience, while the second involves temporal resolution.
The defining phonetic characteristics of the sounds in these two categories are
key sources of indeterminacy in the temporal organization of the sounds in question.10
3.2. DIMINISHED PERCEPTUAL SALIENCE. Indeterminate sound sequences resulting
from diminished perceptual salience involve similar sounds and/or sounds where
the phonetic cues to the identification of at least one of the sounds are masked.
10 My focus regarding indeterminacy relates to an auditory signal, though I speculate that indeterminacy
in a visual signal could also result in parsing symbols in a way other than is presented. Factors contributing
to indeterminacy in a visual domain include, among others, reading rate, visual quality of the text, visual
capabilities of the reader, the reader’s familiarity with the word or sound sequence, and so on.
218 LANGUAGE, VOLUME 80, NUMBER 2 (2004)
I begin with similarity. Since modulation in the speech signal contributes to the
salience of a sound’s phonetic cues and hence to the identification of the sound itself
(Kawasaki 1982, Ohala 1992, 1993), acoustic/auditory similarity between sounds can
have the effect of diminishing the degree of distinctiveness of the sounds, thereby
making them and their order less easily identifiable (Hume 1998).
Evidence that acoustic/auditory similarity is an important conditioning factor in metathesis
comes from the observation that of the thirty-seven cases of consonant/consonant metathesis
examined in this study, 35% involve sounds that are highly similar acoustically
and auditorily. In the majority of cases, the two sounds agree in sonorancy, differing
only in place and/or manner. The importance of shared values for sonorancy in perceived
similarity of sounds is consistent with Mohr and Wang’s (1968) study of consonant
similarity in English. Their findings reveal that the pairs of consonants judged to be
most similar were those that shared the major class feature [sonorant], differing only
in the value for voicing, place, or continuancy. Fay (1966) also found temporal discrimination
between segments to be poorest in sequences of two nasals or two liquids, a
finding he attributes in part to similarity in the resonant frequencies of the sounds in
each pair.
Metathesis involving two sonorant consonants is not uncommon, being attested in
Georgian (Butskhrikidze & van de Weijer 2001, Hewitt 1995), Chawchila (Newman
1944; see related discussion in Stonham 1990), Old Spanish (5a), Deg (Crouch 1994,
Hume 1997b), Aymara (Davidson 1977), and Turkana (Dimmendaal 1983), among
other languages. Ordering reversals involving two fricatives occur in, for example,
Hixkaryana (14), and involving two stops in Kui (Winfield 1928), Kuvi (Israel 1979),
Mokilese (Harrison 1976), and Classical Greek (Lejeune 1972). Homorganicity is a
condition on metathesis in Modern Hebrew (15) and Udi (6), among others. Identity
in place is also crucial in Rendille (Heine 1976, 1978, Hume 1998, Oomen 1981, Sim
1981, Zaborsky 1986) and Bedouin Arabic (Al-Mozainy 1981, Al-Mozainy et al. 1985).
In both cases, metathesis involving a pharyngeal consonant is restricted to words in
which the consonant is adjacent to a pharyngeal vowel. Metathesis in Turkana is especially
interesting since conditions on similarity extend beyond the consonants involved:
in addition to the metathesizing sounds having the same value of sonorancy, the relevant
consonants must also be adjacent to identical vowels, as in [√akεmεra] [√akεrεma]
‘mole’, [√ikwa√:r:m:ka] [√ikwa√:m:r:ka] ‘kind of tree’ (Dimmendaal 1983).
Diminished perceptual distinctiveness can also result from the masking of meaningful
phonetic cues that listeners could use to identify the sounds involved. Given the discussion
in §2.3 concerning the dependence of stop consonants on contextual cues for the
identification of place and manner, it is not surprising that over one-third of the
consonant/consonant metathesis cases examined involve a stop consonant. Recall that
stop consonants are heavily dependent on release burst and vowel formant transitions
as cues to their place and manner. In fact, place is entirely dependent on these contextual
cues. Since release bursts are always present for stop consonants at the onset to a
vowel, prevocalic position is a favorable position for the perceptibility of a stop. In
preconsonantal position, in contrast, bursts are frequently masked. The observation that
a stop/consonant sequence is reordered so that the stop emerges instead before a vowel
is thus to be expected. Representative cases occur in Elmolo (9), Fur (Jakobi 1990,
Mielke & Hume 2001), Modern Hebrew (15), Oromo (Lloret-Romanyach 1988), Sidamo
(10), and Udi (6). Given the importance of vowel transitions for the identification
of a stop’s place of articulation, the patterns observed in Faroese (7) and Lithuanian
(8) are also unremarkable. Recall that the stop is expected to be sandwiched between
two consonants, yet surfaces adjacent to a vowel in the output. In each of these cases, the
masking of important phonetic cues to the manner and especially place of articulation of
a stop consonant contributes to indeterminacy in the signal, thus creating a favorable
context for metathesis to occur.
3.3. TEMPORAL RESOLUTION. Blevins and Garrett (1998) observe that glottals, liquids,
and glides are commonly involved in metathesis. While their study focuses largely on
consonant/vowel metathesis, their claim is well supported by data from cases of
consonant/consonant metathesis. At least one of the consonants is glottal in Balangao
(1), Basaa (4), Cebuano (Bunye & Yap 1971, Wolff 1972), Cherokee (Foley 1980),
Estonian (Kiparsky 1967), Hanunoo (Conklin 1953, Mielke & Hume 2001), Harari
(Leslau 1963, Semiloff-Zelasko 1973), Hixkaryana (14), Hungarian (2), Mandaic (Macuch
1965, Malone 1971, 1985), Pawnee (3), and Twana (Semiloff-Zelasko 1973). A
glide metathesizes with a consonant in Chawchila (Newman 1944), Cherokee (Foley
1980), Kota (Emeneau 1967, 1970, Semiloff-Zelasko 1973), and Yagua (Powlison
1962, Semiloff-Zelasko 1973). A liquid is involved in Chawchila (Newman 1944), Deg
(Crouch 1994, Hume 1997a), Elmolo (9), Gidole (Black 1974), Hungarian (2), Mandaic
(Macuch 1965, Malone 1971, 1985), Pawnee (3), and Rendille (Heine 1976, 1978,
Hume 1998, Oomen 1981, Sim 1981, Zaborsky 1986), among others. Note that in some
cases more than one type of consonant is involved. Drawing on Ohala’s research on
dissimilation, Blevins and Garrett’s account incorporates the insight that glottals, liquids,
and glides have cues of relatively long duration or, as Ohala (1993:251) calls
them, ‘stretched out’ features. The burst release of a stop is a good example of a cue
that would NOT fit in this category. Since stretched out cues tend to extend over a
domain which may encompass adjacent sounds, this can result in the overlap of important
phonetic cues, potentially creating indeterminacy about the onset and offset of the
sounds involved. An example of this type of overlap can be seen in Figure 1 in the
spectrogram of an /h/-vocoid cluster, drawn from the ViC corpus of spontaneous American
English speech.11 Both the vocoid and glottal fricative have stretched out features
(frication for /h/ and formant structure for the vocoid), which results in the overlapping
of acoustic cues.
FIGURE 1. Spectrogram of a portion of the phrase ‘being home’, with the overlapping acoustic cues for the
glottal fricative and vowel encircled.
Unlike the cases of metathesis conditioned by cue masking or acoustic/auditory
similarity, diminished perceptual salience seems less of an issue in the case of the
glottal fricative and vocoid in Fig. 1. Cues to both segment types are present in the
signal (frication on the part of /h/, and formant structure for the vocoid). Furthermore,
the cues are qualitatively different. In cases of this type, indeterminacy is an issue of
temporal resolution relating to the onset and offset of the respective consonants.12
If we are on the right track in assuming that metatheses conditioned by lack of
temporal resolution involve sounds with qualitatively different cues, this would underscore
once again the importance of context in understanding the relevant conditions
underlying the indeterminacy. For example, despite the fact that /h/ is produced with
a stretched out feature, the case of Hixkaryana ‘h fricative’ metathesis seen in 14
may be best classified under the category of diminished cue perceptibility, given that
both sound types exploit frication as the key perceptual cue to their identification. In
this language, metathesis occurs when a morpheme-final /s/ or / / would be expected
to be followed by morpheme-initial /h/ due to vowel loss. (The bilabial fricative, the
only other fricative in the language, does not cooccur with /h/.)
(14) Hixkaryana (Derbyshire 1979, 1985)
ahosé-hira ahohséra *ahoshéra ‘not catching it’
w-ama-+e-haka wamah+aka *wama+haka ‘let me cut it down’
This, of course, does not rule out the possibility that more than one factor may
contribute to indeterminacy in the signal. In Modern Hebrew and Udi metatheses, for
example, both sounds are obstruents and the burst release of the stop would be potentially
masked in the input. In Chawchila and Old Spanish nasal/liquid metathesis, both
consonants are sonorants, and of these, the liquid can be said to have a stretched out
cue. Though many more cases could be cited, the point is that the less easily identifiable
the sounds and their order are, the greater the possibility that the sequence will be
inferred based on the listener’s knowledge of the elements of his/her language. Of
course, indeterminacy will be even greater when, in addition, the elements in the signal
(e.g. sound sequences, words) are less familiar to the listener.
3.4. SUMMARY. In the preceding discussion I have argued that indeterminacy in the
speech signal provides a favorable context for metathesis to occur. A listener’s experience
with the elements heard, as well as the nature of the sounds involved, contributes
to his or her ability to extract information regarding the order of the sounds from the
4. ATTESTATION. In this section, I focus on the sequences resulting from metathesis.
As I show, native sound patterns exert a strong influence on the direction of metathesis.
In fact, a second condition on metathesis is that the structure resulting from metathesis
be attested in the language. As outlined in §3, this proposal is strongly supported by
experimental studies investigating the perception of order. Crucially, it is also supported
by observed patterns of metathesis. It is worth noting that this approach is consistent
to an extent with earlier proposals suggesting that by metathesis, uncommon language
structures are replaced by more common ones (see e.g. Grammont 1933, Ultan 1978,
Hock 1985). Importantly for the proposal developed here, however, the (un)commonness
of a given structure is determined on a language-specific basis.13
12 Temporal indeterminacy between sounds in which the relevant cues are qualitatively different may
relate to the concept of auditory-stream segregation (Bregman 1990). This refers to the phenomenon whereby
separate auditory continua or streams are created among similar auditory cues and remain perceptually
separated without temporal cross-linking (see Warren 1982 for related discussion). Bregman and Campbell
(1971) found auditory-stream segregation for sequences of six tones made up of two clusters of three low-frequency
and three high-frequency tones. Their findings suggest that subjects had more difficulty identifying
the order of tones across clusters than within a cluster.
Attested cases of metathesis strongly support this conclusion. To my knowledge, the
output of metathesis consistently conforms to an existing structure in the language.
Note, however, that the level of generalization over which the relevant structure is
defined may differ. This is consistent with Dell and colleagues’ claim that ‘patterns in
language occur at many levels of generality and . . . the processing system is sensitive
to all of these levels . . . . The claim that language processing is sensitive to patterns
at many levels of generality is hardly controversial’ (Dell et al. 2000:1356). With respect
to metathesis, the relevant structure may be defined in terms of specific qualities of
sounds or be larger in scope. In Lithuanian and Faroese, for example, the relevant
generalization involves stops and coronal fricatives (Hume & Seo 2003), while in
Mutsun the specific sequence involving /k/ and /m/ is relevant (see 16). In Kuvi, by
contrast, the level of generalization refers both to the place of articulation of the consonant
and, more generally, to the classes of consonants and vowels (alternatively defined
as the prosodic level). The direction of metathesis is thus constrained by the sound
system of the language in question. While this finding may not seem surprising, it is
significant in that it means that the direction of metathesis is not arbitrary. It also
suggests that, in principle, any order of two segments is a potential output of metathesis,
provided that the reordered sequence forms an attested structure in the language.
The view just outlined makes strong predictions about what a preferred metathesis
output can be. When only a single order of some combination of sounds is attested in
a language, the predicted output will be consistent with that order. For example, in a
language with the sequence [VhCV] but not [VChV], a listener will be biased to parse
a signal containing a temporally ambiguous intervocalic consonant/glottal combination
as [VhCV]. Thus, if metathesis occurs, the preferred output will be [VhCV], as in
Pawnee (3) and Basaa (4). Conversely, with only [VChV] as the attested order, the
prediction is that a speech signal with an intervocalic consonant/glottal combination
will be parsed as [VChV], as in Balangao (1), Hungarian (2), and Cebuano (Wolff
1972). Similar observations hold for many other languages with metathesis: only a
single order of a given sequence of sounds is attested and this corresponds to the order
observed as the result of metathesis (e.g. Hanunoo, Hixkaryana, Lithuanian, Faroese,
Udi, Sidamo, Elmolo, Georgian, Rendille, and Chawchila).
13 One might take this a step further and suggest that markedness is language specific, a position I support.
For new evidence in favor of this view, see Hume & Tserdanelis 2002 and Hume 2002, 2003.
222 LANGUAGE, VOLUME 80, NUMBER 2 (2004)
This approach also makes clear predictions regarding the preferred sequence when
both orders of a combination of sounds are attested in the language. Drawing on the
finding that listeners are sensitive to the transitional probabilities of sounds in their
language, the prediction is that when both sequences are attested in a language, the
listener will be biased toward the most robust sequence, that is, the one with the highest
frequency. The more a sequence occurs, the more the speaker/hearer is exposed to it
and the more fine-tuned the processing system becomes with regard to that sequence.
Evidence for this claim comes from metathesis in, among other languages, Balangao,
Cebuano, Modern Hebrew, Mutsun, and Kuvi.14
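The two predictions just outlined (a single attested order, or two attested orders of unequal frequency) can be stated procedurally. The Python fragment below is only an illustrative sketch and not part of the original study: the function name `predicted_parse` and every count in the usage note are hypothetical, and the sketch assumes that the signal is already indeterminate with respect to order, so that native-language patterns decide the parse.

```python
from typing import Mapping, Tuple

Order = Tuple[str, str]  # a two-sound sequence, e.g. ("h", "C")


def predicted_parse(heard: Order, counts: Mapping[Order, int]) -> Order:
    """Predict how an indeterminate two-sound sequence is parsed.

    heard  -- the expected (input) order of the two sounds
    counts -- hypothetical frequency of each order in the listener's language
    """
    reverse = (heard[1], heard[0])
    same_freq = counts.get(heard, 0)
    rev_freq = counts.get(reverse, 0)

    # Attestation condition: an unattested order is never the output.
    if rev_freq == 0:
        return heard

    # Frequency bias: when the reversed order is attested, it wins only if
    # it is the more robust of the two; ties keep the input order.
    return reverse if rev_freq > same_freq else heard
```

With hypothetical counts such as `{("h", "C"): 8, ("C", "h"): 132}`, an input `("h", "C")` is parsed as `("C", "h")`, while an input `("C", "h")` is left unchanged. With only `("h", "C")` attested, an input `("C", "h")` is parsed as `("h", "C")`.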
Recall from 1 that in Balangao metathesis, the expected ordering of a glottal [ʔ, h]
before a nasal or oral plosive is reversed, thus yielding a plosive-glottal output (Shetler
1976), for example, /pRhéd-én/ [pRdhén] *[pRhdén] ‘allow, accept’, /CV-ʔéném/ [ʔénʔém]
*[ʔéʔném] ‘six each’. While glottals are not strictly excluded from preconsonantal position,
the overwhelming tendency is for them to occur prevocalically, as in [heet], [manhamal],
[lehet], [qaho], and [bRtʔont]. Preconsonantal glottals, however, are highly
restricted, occurring only at morpheme boundaries (e.g. /manoʔ-na/ [manoʔna]
‘chicken, his’), as geminates (e.g. [ahhahayat] ‘just returned home’), or in reduplicated
forms (e.g. [pahpah] ‘hit to knock something down’, [paʔpaʔ] ‘touch, as of sugar, and
then touch something else, leaving some’).
In the Cebu City dialect of Cebuano, the sequences /ʔC/ and /hC/ are also realized
as [Cʔ] and [Ch] (e.g. /káʔun-a/ [kanʔa] ‘eat it’, /luhúd-an/ [ludhan] ‘kneel on’). In
other dialects, the glottal remains in preconsonantal position. As predicted, the shift
observed in the Cebu City language variety is consistent with observed patterns in the
dialect. In Wolff’s (1972) lexicon of approximately 7,500 words, preconsonantal [ʔ]
occurs in only eight lexical items, and [h] occurs in none. The postconsonantal glottal
fricative is rather common morpheme-internally, occurring in 132 forms, but the glottal
stop is rare in this position. The postconsonantal glottal stop does frequently occur in
polymorphemic words, however (John Wolff, p.c.). Thus, the shift of the glottal from
preconsonantal to postconsonantal position is consistent with the most robust glottal/
consonant pattern in the dialect.
The direction of change observed in the well-known metathesis of Modern Hebrew
is also consistent with this approach. In the language, binyan 5 of perfective verbs
typically has the form [hit]-verb, as shown in 15a (the prefix /t/ agrees in voicing with
an adjacent obstruent); /h-/ is a perfective prefix, /-t-/ is the binyan 5 morpheme, and
/i/ is epenthetic (Bat-El 1989, 1992). When the stem-initial consonant is a strident
coronal (/c, s, z, ʃ/), however, the /t/ of the prefix occurs to its right, as in 15b (Bat-El
1988, 1989). While the sequence [t] [strident] occurs in the language, Bat-El
(1988, p.c.) reports that it is considerably less common than the opposite order; it is
restricted to tautomorphemic forms such as [hi-tsis] ‘he fermented’ and nonverbal forms
like [tʃuva] ‘reply’.
14 I am basing my claims regarding the robustness of a given sequence in each of the languages discussed
in this section on the state of the language at the present time (or at the time the description of the language
was written). An obvious critique of this methodology is that the current state of the language may not be
identical to the way it was when metathesis was first triggered in the system. Admittedly, this is not the
ideal situation but one must make do with the resources available. This should not undermine the validity
of the claims being made for any of the languages, however, given the robustness of each of the patterns
(15) Modern Hebrew
a. hi-t-nakem     hitnakem     ‘he took revenge’
   hi-t-raxec     hitraxec     ‘he washed himself’
   hi-t-balet     hidbalet     ‘he became prominent’
   hi-t-darder    hiddarder    ‘he declined, rolled down’
   hi-t-kabel     hitkabel     ‘it was accepted’
b. hi-t-sader     histader     ‘he got organized’
   hi-t-zaken     hizdaken     ‘he grew old’
   hi-t-calem     hictalem     ‘he took pictures of himself’
   hi-t-ʃamer     hiʃtamer     ‘he preserved himself’
The influence of native-language patterns on metathesis can also be heard in some
varieties of American English in the variable pronunciation of t-l in the word chipotle,
the Latin American name for a particular kind of pepper and, recently, for a chain of
Mexican restaurants. Both orders of the final two consonants can be heard, even in the
speech of the same American English speaker: chipotle (the original order) or chipolte
(the innovative order). This pattern is consistent with the claims made in this paper.
The two sounds involved are archetypical ‘metathesis sounds’ and thus contribute to
indeterminacy: /t/ with perceptually vulnerable cues and /l/ with stretched out features.
Notice also that in the original form of the word, chipotle, the stop occurs in preconsonantal
position, a perceptually weak context for a stop. Another factor that may contribute
to indeterminacy is unfamiliarity with the borrowed word. Recall that sound sequence
probabilities are most influential when listeners are presented with unfamiliar words
(Vitevitch & Luce 1999). With indeterminacy, the order of sounds is inferred based
on experience, with the bias towards the most robust order. As predicted, although both
/tl/ and /lt/ occur intervocalically in English, an examination of their text frequencies
in the online MRC Psycholinguistic database of English (Wilson 1988) reveals that
/tl/, the original form, occurs in sixty-seven words, while the innovative /lt/ sequence
occurs in 356 words.15
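The size of the asymmetry between the two orders can be made concrete with a short calculation. The sketch below uses only the two word counts cited above; treating each order's share of the combined count as a rough proxy for listener bias is a simplifying assumption made here for illustration, and the helper name `relative_share` is hypothetical.

```python
# Word counts cited in the text for intervocalic /tl/ and /lt/
# in the MRC Psycholinguistic Database.
counts = {"tl": 67, "lt": 356}


def relative_share(counts):
    """Return each sequence's share of the combined count (assumed proxy for bias)."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


shares = relative_share(counts)
for seq, share in shares.items():
    print(f"/{seq}/: {share:.2f}")
# prints "/tl/: 0.16" and "/lt/: 0.84"
```

On these counts the innovative /lt/ order accounts for roughly five of every six relevant words, which is the direction of bias the account predicts for chipolte.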
In the examples just presented, the sequence of sounds resulting from metathesis not
only corresponds to the more frequent order of some combination of sounds, but the
sequence also arguably has more salient acoustic/auditory cues than the original order.
In each case a stop or glottal consonant surfaces in prevocalic rather than preconsonantal
position, which, all else being equal, can be deemed the more perceptually favorable
context. The observation that the sequences that emerge by metathesis are frequently
better perceptually than their unmetathesized counterparts should not be surprising since
clusters with better cues are generally more stable, thus occurring in more lexical entries
in a language than clusters with poorer cues (Makashay 2001).
Crucial for the approach to metathesis developed in this paper, however, is the view
that speech processing is not universal; listeners with different language experiences
may parse the same sequence of sounds in different ways. Thus, we would expect to
find so-called ‘non-optimal’ sequences resulting from metathesis as well. Metathesis
in Mutsun and Kuvi serves to illustrate.
Metathesis in the Costanoan language Mutsun, as shown in 16, involves the commonly
occurring nominal thematic plural suffix, which has two alternants: [-mak] and
[-kma] (Okrand 1977).
15 The MRC Database is a machine-usable dictionary containing over 150,000 words with up to twenty-six
linguistic and psycholinguistic attributes (e.g. pronunciation, part of speech, word frequency; Wilson
1988).
(16) Mutsun
ru k       ru k-mak       ‘string’
hu s       hu s-mak       ‘nose’
wimmah     wimmah-mak     ‘wing’
kahhay     kahhay-mak     ‘head louse’
ʔinnis     ʔinnis-mak     ‘son’
rukka      rukka-kma      ‘house’
tʃiri      tʃiri-kma      ‘paternal aunt’
rumme      rumme-kma      ‘rivulet’
sinni      sinni-kma      ‘child’
relo       relo-kma       ‘clock’
huttu      huttu-kma      ‘belly’
sipruna    sipruna-kma    ‘tule root’
The locative suffix displays a similar pattern, with the alternants [-tak] and [-tka].
(17) ʔurkan     ʔurkan-tak     ‘mortar’
     lo t       lo t-tak       ‘mud’
     pappel     pappel-tak     ‘paper’
     pire       pire-tka       ‘world, land’
     rukka      rukka-tka      ‘house’
     si         si -tka        ‘water’
     pa rani    pa rani-tka    ‘hill’
In both the nominal thematic and locative suffixes, the final consonant and vowel
metathesize; the CCV alternant occurs after vowel-final stems while the CVC alternant
occurs after consonant-final stems. By comparing the two morphemes in 16 and 17, it
can be seen that a further change is involved in the nominal thematic suffix. Not only
do the variants differ with regard to whether they end in an open or closed syllable,
but the linear ordering of the consonants changes as well: in the allomorph [-kma], [k]
precedes [m], while in [-mak], [m] precedes [k]. If we were to follow the pattern of
the locative [Ctak, Vtka], we would expect the nominal thematic suffixes to be [Cmak,
*Vmka]. Instead, we find [Cmak, Vkma].
There is no absolute prohibition against clusters consisting of nasals and heterorganic
obstruents in the language (e.g. [ʔamʃi] ‘so that’, [namti] ‘to hear’, [janpu] ‘to praise
oneself’), although some are more common than others. Clusters of [mk] are very rare,
occurring in only a few words including [omkon] ‘maggot’ and [hemkon] ‘to set (sun)’.
The opposite order [km], however, is attested in many words in addition to those seen
in 16 (e.g. [sukmu] ‘smokes, verb’, [ya kmun] ‘in the east’, [wakmenne] ‘grandmother’).
These findings are consistent with the view that the direction of change in
metathesis is influenced by the patterns of usage in the language.
Mutsun labial/velar metathesis is of particular interest since it provides a case opposite
to what would be expected from a strictly phonetic cue-based approach (cf. Hume
1998, 2001, Steriade 2001). Recall from §2.3 that the most favorable position for a
stop consonant is before a vowel, since auditory nerve fibers show a greater response
in this position, CV transitions provide better cues than VC transitions, and the release
burst of the stop is consistently present. Yet, in Mutsun metathesis, unlike cases such
as Sidamo, Elmolo, and others, the stop surfaces preconsonantally. Accounting for
these differences is straightforward when the sound patterns of each language are taken
into account. In Sidamo, for example, sequences of V[stop][nasal]V are nonoccurring,
while in Mutsun they are common. Conversely, sequences of V[nasal][stop]V are robust
in Sidamo, while restricted in Mutsun. This strongly suggests that the goal of metathesis
is not to improve the overall psychoacoustic (i.e. universal) cues of a sequence; rather,
what is key is conformity to the patterns of usage of a given language.
Consider next metathesis in the Dravidian language Kuvi, where the sequence
CVCCV is realized as CCVCV (Israel 1979); metathesis thus results in a change in
the prosodic structure of words. Krishnamurti (1978) points out that this metathesis
goes back at least two millennia and affected languages of the south central branch of
Dravidian including Telugu, Gondi, Konda, Kui, Kuvi, Pengo, and Manda. The change
is most widely attested in Kui (ninety-eight forms) and Kuvi (fifty-nine forms). (In
some of these forms, the initial consonant was also deleted.) Krishnamurti notes that
there are an additional nineteen forms in Kui and twenty-three in Kuvi with lexical
free variation, which he states represents subdialectal variation within that language.
According to Krishnamurti the sound change spread gradually, lexically and geographically,
over the centuries and continues to spread in Kui and Kuvi. In several words
that underwent metathesis the effects persist as morphophonemic alternations, or as
variable realizations of a word, as in the Kuvi forms shown in 18. I focus my discussion
on Kuvi primarily because metathesis continues to affect new words in the language,
though similar conclusions can be assumed to hold for Kui (and the other languages
(18) Kuvi
doıva      dıova                 ‘basket’
paıka      pıaka                 ‘armpit’
pa≈va      p≈ava                 ‘firstborn’
mu≈ka      m≈uka                 ‘urine’
ōri-ka     rōka      *[ōrka]     ‘ropes’
mī≈u-ka    m≈īka     *[mī≈ka]    ‘fish’
pē≈u-ka    p≈ēka     *[pē≈ka]    ‘lice’
Kuvi metathesis is of particular interest since the structure resulting from metathesis
would appear to be less optimal than the nonmetathesized structure. Of relevance is
the observation that the consonant involved in metathesis is a sonorant apical (dental/
alveolar: /n r l/; retroflex: /≈ ı/).16 Steriade (1995) argues that the most salient cue to
apicality is in the V-C transition (specifically the locus of F3) and that postvocalic
position is more favorable than prevocalic position for the perception of apicals. Phonological
evidence supporting this claim comes from languages in which place neutralization
involving apicals occurs in prevocalic position but place is preserved in postvocalic
position where the phonetic cues are more salient. The observation that Kuvi metathesis
involves the shift of an apical from postvocalic to postconsonantal, prevocalic position
runs counter to this view. The pattern is consistent, however, with the claims made
here where indeterminacy and sequence attestation play a key role.
16 In other languages, the context has been generalized to include all apicals (see Krishnamurti 1978). For
this reason, traditional descriptions of the metathesis often refer to the process as ‘apical displacement’.
Krishnamurti (1978), for example, assumes that it is the quality of being apical that is the most important
factor underlying the metathesis discussed here. He suggests that the shift of the apical away from preconsonantal
position is to avoid assimilation to the following consonant. I argue, on the contrary, that when the
sound change originated it was the property of being a liquid that was crucial. The fact that all liquids were
apicals was perhaps incidental. Further, I do not assume that metathesis is teleological in nature, as discussed
most specifically in §5. Independent evidence against the view that apicality is the relevant property comes
from the observation that stems with a final apical nasal did not originally undergo metathesis. In fact, in
most of the Dravidian languages where this metathesis occurred, nasals were completely excluded even as
more words underwent metathesis.
Recall that indeterminacy is a function both of the nature of the sounds involved
and of the frequency with which a structure occurs in the language. With regard to the
first point, it is significant that only liquids, all of which were apical (dental/alveolar,
retroflex), initially participated in metathesis (Krishnamurti 1978:n. 4). The consonants
undergoing metathesis in Kuvi have since been generalized to include the nasal as well.
Recall from §3.3 that liquids and vocoids both involve so-called stretched out acoustic
cues (Blevins & Garrett 1998, Ohala 1993). We may assume that this property, in
addition to the overall similarity of these two classes of sounds, contributes to indeterminacy
regarding the onset and offset of a liquid and vowel. Although the context for
metathesis now includes nasal apicals, the similar sonorant properties of these consonants
can also contribute to indeterminacy.
Evidence also suggests that the structure that underwent (or undergoes) metathesis
is less frequent than that resulting from the change. Krishnamurti (1978) notes that
sequences undergoing metathesis were all followed by a consonant, as in the examples
in 18, meaning that the metathesized consonant was in a closed syllable. While both
VCCV and VCV sequences occurred in the language, there was and continues to be
a clear predominance of word-internal open syllables. Based on Israel 1979, the text
frequency for word-initial open syllables in Kuvi is 1,535 while for closed it is 736.
Thus, closed syllables are not disallowed; they are simply dispreferred. We also know
that the sequence liquid-vowel was already attested and was therefore in competition
with the unmetathesized vowel-liquid sequence since, according to Krishnamurti
and as pointed out by a referee, a previous sound change positioning an intervocalic
liquid in initial position had already occurred: VLV > LV, as in Proto-Dravidian *uru >
*ru ‘to burn’ (Kondi-Kui-Kuvi-Pengo-Manda). Krishnamurti argues that this change
resulted from the weakening of an unstressed initial vowel (see Blevins & Garrett 1998
for discussion of similar cases). The order liquid vowel also conforms to the general
preference for consonant-initial words in the languages.
The facts surrounding Kuvi metathesis strongly support the model of metathesis
developed in this article. The sounds involved and the patterns of usage provide a
favorable context for metathesis to occur. That the resultant sequences are resolved as
the order CCVCV is consistent with generalizations about the frequency of sequences
involving apical sounds, in particular, and of consonants and vowels in open vs. closed
syllables, more generally. Of course, each time metathesis affects a word in Kuvi, the
frequency of open syllables increases, thus further strengthening the bias towards this
pattern of sounds (for related discussion see Bybee 2000).
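The self-reinforcing character of this bias can be illustrated with the two counts cited above from Israel 1979. The following sketch is a deliberately simplified assumption rather than a model proposed in this article: it treats each metathesis (CVCCV to CCVCV) as converting exactly one closed word-initial syllable into an open one, and the function names and the numbers of changes are hypothetical.

```python
# Text frequencies for word-initial syllables in Kuvi, as cited from Israel 1979.
OPEN, CLOSED = 1535, 736


def open_share(n_open, n_closed):
    """Proportion of open syllables among all counted syllables."""
    return n_open / (n_open + n_closed)


def after_metathesis(n_open, n_closed, k):
    """Counts after k hypothetical CVCCV -> CCVCV changes.

    Assumption: each change turns one closed initial syllable (CVC)
    into an open one (CCV), leaving the total unchanged.
    """
    if not 0 <= k <= n_closed:
        raise ValueError("k must lie between 0 and the number of closed syllables")
    return n_open + k, n_closed - k


# Hypothetical numbers of additional changes, for illustration only.
for k in (0, 50, 100):
    n_open, n_closed = after_metathesis(OPEN, CLOSED, k)
    print(k, round(open_share(n_open, n_closed), 3))
```

Under these assumptions the share of open syllables starts at about two thirds and rises with every additional change, which is the sense in which each metathesis further strengthens the bias.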
Before concluding the discussion of Kuvi, it is worth commenting on Israel’s (1979:
14) observation that new word-initial consonant clusters are entering the language as
a result of metathesis. In other words, the dispreference for the sequence CVCCV as
opposed to CCVCV (or more accurately: VCC as opposed to VCV) results in the
creation of previously unattested consonant sequences at the beginning of the word
(CCVCV). It is important to point out that this observation is not inconsistent with the
claims made here. It does, however, underscore the fact that a listener’s sensitivity or
bias towards one structure may have consequences that affect the structure of the language
more generally. Informally speaking, in the present case, the bias towards an
open syllable (VCV) outweighs the bias towards words beginning with a single consonant
(see the literature on optimality theory, e.g. Prince & Smolensky 1993, for relevant
discussion regarding the formal representation of competition within a language
system). While I leave further discussion of the underlying factors influencing a listener’s
bias towards one pattern or another for a future time, it seems reasonable to assume that
the same factors that underlie indeterminacy (information quality, pattern frequency) are
also relevant in this case.
key predictors of metathesis are part universal, part language specific. The universal
component draws on the raw psychoacoustics of the sound combination at issue, while
language specificity is brought in by the influence of a speaker/hearer’s native language.
By taking into account both factors, straightforward answers to the observations raised
at the beginning of this article, repeated in 19, can be provided.
(19) Observations:
a. The acoustic/auditory cues to the identification of a sound sequence are
frequently improved by metathesis.
b. For some sound combinations, one order is favored crosslinguistically as
the result of metathesis.
c. The direction of metathesis can differ from language to language such
that either order of a given sound combination can emerge as the result
of metathesis in some language.
Consider first the observation that by metathesis, the acoustic/auditory cues to the
identification of a sound sequence are frequently improved. For example, many cases
of metathesis are attested in which a sound that depends heavily on contextual cues,
such as a stop, is realized in a position where the cues to its identification are strengthened.
This observation receives a straightforward explanation when we take into account
the nature of a metathesis input, on the one hand, and the types of sequence most apt
to influence the processing of segmental order, on the other. Considering the input,
recall that the most important predictor of metathesis is indeterminacy, which is a
function of information in the signal and language experience. Indeterminacy may thus
result from diminished perceptual salience due to similarity between the sounds, or the
masking of perceptual cues. Temporal indeterminacy can also result from lack of clarity
about the onset or offset of sounds due to stretched out cues and/or a low degree of
familiarity with the word, morpheme, or sound sequence in the signal. Each of these
scenarios can produce a context favorable for metathesis. But what kinds of sound
combinations are NOT likely to undergo metathesis? Clearly, the answer must be those
sequences that lack indeterminacy, in other words, those with sufficient information
about the identity of the sounds and their order.
By taking into account the nature of the input, it then becomes clear that one reason
why many metatheses involve an improvement in perceptual salience is because the
candidate most likely to undergo metathesis is one with weaker cues. A second reason
relates to the observation that clusters with poorer cues also tend to be less stable in
a language system and occur in fewer words than those with stronger cues (Makashay
2001). Given the view that the output of metathesis corresponds to the sequence with
the highest frequency, it then follows that if sequences with poorer cues are less frequent,
the observed cases of metathesis with non-optimal outputs will also be less commonly
attested.
The first point in 19 thus finds its explanation in the following observations. First,
sounds with robust cues are not good candidates for metathesis though ones with poorer
cues are. Second, sounds with robust cues tend to be more frequent in a system and
thus will have a greater impact on the processing of the speech signal. The infrequency
of ‘non-optimized’ metatheses then stems from the fact that the phonetic cues of the
input signal would need to be better than the output. However, clusters with good cue
packages are less likely to be indeterminate and less likely to undergo metathesis. This
then suggests that the reason that improved perceptual salience is a characteristic of
so many cases of metathesis is simply an artifact of the nature of sequences that undergo
metathesis and those that influence how an indeterminate speech signal is parsed (cf.
Hume 1998, Steriade 2001). In other words, metathesis is not teleological.
This view of metathesis also provides a straightforward explanation of the second
observation: for some sound combinations, one order is favored crosslinguistically as
the result of metathesis. This is exemplified by cases involving plosives/fricatives. As
we have seen, the stop typically metathesizes to a context with better perceptual cues,
resulting in a perceptually optimized sequence. Metatheses involving these same kinds
of sounds but where the stop shifts from an arguably better context to a worse one are
less common. The reason for this should be clear: sounds occurring in a context with
strong cues tend to provide sufficient information to allow the sounds and their ordering
to be identified. Thus, the reason one order is favored in such sequences is that, all
else being equal, only one order of the sounds generally displays indeterminacy. For
other types of sound combinations, indeterminacy may arise regardless of the order in
which the sounds occur, due to the nature of the sounds and/or the context in which
they occur. Common examples include sounds sharing the same manner and/or place
features, and sounds with phonetic cues of long duration, as we have seen.
The third observation concerns the apparent randomness of metathesis: the direction
of metathesis can differ from language to language such that either order of a given
sound combination can emerge as the result of metathesis in some language. As we
have seen, however, the direction of metathesis is not arbitrary when two important
factors are taken into account: indeterminacy in the signal and the influence of native
sound patterns on speech processing. Recall that this influence is strongest when information
specifying a sound or sound sequence is indeterminate. Greater crosslinguistic
variability is then correctly predicted in those cases where indeterminacy arises regardless
of the order of the sounds. In addition, since languages differ both in terms of their
constituent parts as well as in terms of their patterns of usage, it is also correctly
predicted that the output of metathesis will differ according to the impact that the
sound patterns have on the way speakers/hearers of different languages process an
indeterminate speech signal.17
17 A referee suggests Ossetic metathesis as a potential problem for the proposals made here. The relevant
data is laid out in Hock 1985 (see also Cheung 2002), where it is pointed out that metathesis of an obstruent-sonorant
occurred both word-finally (e.g. *(ha-)abra- arv ‘sky’, *t axra- tsalx ‘wheel’, *agra- al˛
‘extremity, tip’) and word-initially (e.g. *dr&u- Urdu ‘hair’, *br&¯ t&¯ Urv&d ‘brother’, *bru¯ka- Urfut
‘eyebrow’). The word-initial outputs of metathesis were then, it is assumed, repaired by vowel epenthesis.
The referee notes, following Hock (1985), that metathesis in word-initial position is highly marked, as it
would create an initial sonorant-obstruent sequence. It is also assumed that regular metatheses are structurally
motivated, for example, by a sonority-based requirement. Since the assumption that metathesis gave rise to
a word-initial sonorant-obstruent consonant cluster was problematic for the sonority-based view, an additional
mechanism was invoked to account for this case, such as generality of application.
It is important to point out that markedness, in the sense of a universal principle, is not crucial to the
proposals made in the present article. The relative commonness of a given pattern is, in my view, determined
on a language-specific basis (see Hume & Tserdanelis 2002, Hume 2002, 2003 for related discussion). What
is important for the account of metathesis developed here is whether or not the innovative structure is attested
in the language. The observation that obstruent-sonorant sequences metathesized in Ossetic, even in absolute
word-initial position, is consistent with this view since sonorant-obstruent sequences were existing articulatory
routines in the language.
6. CONCLUSION. In this article, I have focused on one aspect of the study of metathesis:
the factors that favor and disfavor its occurrence. As I have shown, a unified and
predictive account is viable when both universal and language-specific factors are taken
into account. The universal component draws on the psychoacoustics of the sound
combination at issue and the context in which it occurs, while language specificity
results from the influence of a speaker/hearer’s knowledge of sound patterns in the
native language. I have also argued that two conditions are necessary for metathesis
to occur: first, there must be indeterminacy in the signal, and second, the structure that
would result from metathesis must already be attested in the system. Indeterminacy
sets the stage for metathesis, and a speaker/hearer’s knowledge of the sound system
and its patterns of usage influences how the signal is processed and, thus, the order in
which the sounds are parsed. The greater the indeterminacy, the more the speaker/
hearer must rely on native-language knowledge to infer the temporal ordering of the
sounds.
An important assumption in this paper is that metathesis has its roots in speech
processing. Support for this view comes in part from the observation that the sequence
resulting from metathesis conforms to an existing pattern in the language. Additional
evidence comes from the findings of Mielke and Hume (2001) concerning the influence
of word recognition on crosslinguistic patterns of metathesis. The findings of that study
confirm the view that ordering reversals are dispreferred at the beginning of a word or
root, and that metathesis overwhelmingly involves adjacent sounds. Both word position
and proximity have been shown to be significant factors conditioning speech processing
(Connine et al. 1993, Cutler et al. 1985, Hall 1992, Marslen-Wilson 1989, Marslen-
Wilson & Zwitserlood 1989).
While it is hoped that this study advances our knowledge of metathesis, it nonetheless
goes without saying that many issues remain to be addressed. For example, it is clear
that experience plays a key role in predicting metathesis, given its influence
on speech processing. Yet, language use involves production as well; considering the
potential interplay of language experience and production on metathesis may also prove
fruitful. That this is an area worth investigating is suggested by Dell and colleagues
(2000:1365) who confirm that:
each utterance of a syllable tunes the language production system to favor the production of that and
similar syllables. The effect of this tuning endures longer than a single trial, and it accumulates with
the tuning associated with other utterances. The overall effect is to adapt the production system to recent
experience . . . The phonology is projected preferentially from those parts of the lexicon that are most
accessible, such as recently experienced sound forms.
It is thus reasonable to assume that a less practiced articulatory routine, whether it
involves coordinating the elements of a single sound or of a sequence of sounds, is
less precise and perhaps more difficult. The result is a bias towards more practiced
articulatory routines. With respect to metathesis, this would suggest that low-frequency
or nonoccurring sound sequences would tend to lose out to more practiced sequences.
I leave the implications of this topic for our understanding of metathesis open for future
research.

The additional assumption that obstruent-sonorant sequences metathesized in absolute word-initial position
in Ossetic is not required, however. According to Cheung (2002), at the time when metathesis took place,
Ossetic did not tolerate forms beginning with a consonant cluster. Vowel epenthesis was one strategy used
to repair such ill-formed structures. Importantly, epenthesis occurred with ALL initial clusters, not just those
that were subject to metathesis, for example, *x &p&¯ UxsavU/UxsUvU ‘night’, *tdz&r-ja- &˛dzUlyn ‘to
pour down, drip’. This then means that it is entirely reasonable to assume that the metathesizing sequence
was postvocalic when metathesis occurred. Thus, the change from obstruent-sonorant to sonorant-obstruent
consistently occurred postvocalically whether finally or medially within the word.

230 LANGUAGE, VOLUME 80, NUMBER 2 (2004)
REFERENCES
ALEXANDER, JAMES. 1985. R-metathesis in English: A diachronic account. Journal of English
Linguistics 18.33–40.
AL-MOZAINY, HAMZA QUBLAN. 1981. Vowel alternations in a Bedouin Hijazi Arabic dialect:
Abstractness and stress. Austin: University of Texas, Austin dissertation.
Stress shift and metrical structure. Linguistic Inquiry 16.1.135–44.
AMBRAZAS, VYTAUTAS. 1997. Lithuanian grammar. Lithuania: Institute of the Lithuanian
of voice onset time by human infants: New findings and implications for the effects
of early experience. Child Development 52.1135–45.
BAAYEN, R. HARALD, and ROCHELLE LIEBER. 1991. Productivity and English derivation: A
corpus-based study. Linguistics 29.801–43.
BAILEY, CHARLES-JAMES. 1970. Toward specifying constraints on phonological metathesis.
Linguistic Inquiry 1.3.347–49.
BAT-EL, OUTI. 1988. Remarks on tier conflation. Linguistic Inquiry 19.3.477–85.
BAT-EL, OUTI. 1989. Phonology and word structure in Modern Hebrew. Los Angeles: University
of California, Los Angeles dissertation.
BAT-EL, OUTI. 1992. Stem modification and cluster transfer in Modern Hebrew. Tel-Aviv:
Tel-Aviv University, MS.
BECKMAN, MARY, and JAN EDWARDS. 1990. Lengthenings and shortenings and the nature
of prosodic constituency. Papers in laboratory phonology 1: Between the grammar and
physics of speech, ed. by John Kingston and Mary Beckman, 152–78. New York:
Cambridge University Press.
BESNIER, NIKO. 1987. An autosegmental approach to metathesis in Rotuman. Lingua
BEST, CATHERINE. 1994. The emergence of native-language phonological influences in infants:
A perceptual assimilation model. The development of speech perception: The
transition from speech sounds to spoken words, ed. by Howard C. Nusbaum and Judith
Goodman, 167–224. Cambridge, MA: MIT Press.
perceptual reorganization for nonnative speech contrasts: Zulu click discrimination
by English-speaking adults and infants. Journal of Experimental Psychology: Human
Perception and Performance 14.345–60.
BLACK, PAUL. 1974. Regular metathesis in Gidole. Folia Orientalia 15.47–54.
BLADON, ANTHONY. 1986. Phonetics for hearers. Language for hearers, ed. by Graham
McGregor, 1–24. Oxford: Pergamon.
BLEVINS, JULIETTE, and ANDREW GARRETT. 1998. The origin of consonant-vowel metathesis.
Language 74.3.508–55.
BLUMSTEIN, SHEILA, and KENNETH STEVENS. 1979. Acoustic invariance in speech production:
Evidence from measurements of the spectral characteristics of stop consonants. Journal
of the Acoustical Society of America 66.1001–17.
BREGMAN, ALBERT. 1990. Auditory scene analysis. Cambridge, MA: MIT Press.
BREGMAN, ALBERT, and JEFFREY CAMPBELL. 1971. Primary auditory stream segregation and
perception of order in rapid sequences of tones. Journal of Experimental Psychology
BROADBENT, D. E., and PETER LADEFOGED. 1959. Auditory perception of temporal order.
Journal of the Acoustical Society of America 31.1539.
BUNYE, MARIA VICTORIA R., and ELSA PAULA YAP. 1971. Cebuano grammar notes. Honolulu:
University of Hawaii Press.
BUSH, NATHAN. 2001. Frequency effects and word-boundary palatalization in English. Frequency
and the emergence of linguistic structure, ed. by Joan Bybee and Paul Hopper,
255–80. Amsterdam: John Benjamins.
BUTSKHRIKIDZE, MARIKA, and JEROEN VAN DE WEIJER. 2001. On v-metathesis in Modern
Georgian. In Hume et al. 2001, 91–101.
BYBEE, JOAN. 1985. Morphology: A study of the relation between meaning and form. Philadelphia:
John Benjamins.
BYBEE, JOAN. 1995. Regular morphology and the lexicon. Language and Cognitive Processes
BYBEE, JOAN. 2000. The phonology of the lexicon: Evidence from lexical diffusion. Usage-based
models of language, ed. by Michael Barlow and Suzanne Kemmer, 65–86. Stanford,
CA: CSLI Publications.
BYBEE, JOAN. 2001. Phonology and language use. Cambridge: Cambridge University Press.
BYRD, DANI. 1994. Articulatory timing in English consonant sequences. Los Angeles: University
of California, Los Angeles dissertation.
CHEUNG, JOHNNY. 2002. Studies in the historical development of the Ossetic vocalism.
Wiesbaden: Reichert.
COLEMAN, JOHN, and JANET PIERREHUMBERT. 1997. Stochastic phonological grammars and
acceptability. Computational phonology: Proceedings of the 3rd Meeting of the ACL
special interest group in computational phonology, 49–56. Somerset, NJ: Association
for Computational Linguistics.
CONKLIN, HAROLD. 1953. Hanunóo-English vocabulary. (University of California publications
in linguistics 9.) Berkeley: University of California Press.
CONNINE, CYNTHIA M.; DAWN G. BLASKO; and DEBRA TITONE. 1993. Do the beginnings of
spoken words have special status in auditory word recognition? Journal of Memory
and Language 32.193–210.
CÔTÉ, MARIE-HÉLÈNE. 1997. Phonetic salience and consonant cluster simplification. MIT
Working Papers in Linguistics 29.229–62.
CROUCH, MARJORIE. 1994. The phonology of Deg. Ghana: Ghana Institute of Linguistics,
Literacy and Bible Translation, MS.
CRYSTAL, DAVID. 1997. A dictionary of linguistics and phonetics. Oxford: Blackwell.
CUTLER, ANNE, and DENNIS NORRIS. 1988. The role of strong syllables in segmentation for
lexical access. Journal of Experimental Psychology: Human Perception and Performance
CUTLER, ANNE; JOHN A. HAWKINS; and GARY GILLIGAN. 1985. The suffixing preference: A
processing explanation. Linguistics 23.723–58.
DAVIDSON, JOSEPH ORVILLE, JR. 1977. A contrastive study of the grammatical structures of
Aymara and Cuzco Kechua. Berkeley: University of California, Berkeley dissertation.
DELANCEY, SCOTT. 1989. Tibetan evidence for Nungish metathesis. Linguistics of the Tibeto-
Burman Area 12.25–31.
errors, phonotactic constraints, and implicit learning: A study of the role of experience
in language production. Journal of Experimental Psychology: Learning, Memory and
Cognition 26.6.1355–67.
DERBYSHIRE, DESMOND C. 1979. Hixkaryana. (Lingua descriptive studies 1.) Amsterdam:
DERBYSHIRE, DESMOND C. 1985. Hixkaryana and linguistic typology. (SIL/UTA publications
in linguistics 76.) Dallas: Summer Institute of Linguistics.
DIMMENDAAL, GERRIT JAN. 1983. The Turkana language. Dordrecht: Foris.
DUMÉNIL, ANNIE. 1983. A rule account of metathesis in Gascon. Columbia: University of
South Carolina dissertation.
DUMÉNIL, ANNIE. 1987. A rule account of metathesis in Gascon. Linguisticae Investigationes
A destressing ‘deafness’ in French? Journal of Memory and Language 36.406–21.
EMENEAU, MURRAY BARNSON. 1967. The South Dravidian languages. Journal of the American
Oriental Society 87.365–412.
EMENEAU, MURRAY BARNSON. 1970. Dravidian comparative phonology: A sketch. (Annamalai
University publications in linguistics 22.) Tamil Nadu, India: Annamalainagar.
FAY, WILLIAM. 1966. Temporal sequence in the perception of speech. The Hague: Mouton.
FLEMMING, EDWARD. 1995. Auditory features in phonology. Los Angeles: University of
California, Los Angeles dissertation.
FOLEY, LAWRENCE. 1980. Phonological variation in Western Cherokee. London: Garland.
FRANCIS, ALEXANDER, and HOWARD C. NUSBAUM. 2002. Selective attention and the acquisition
of new phonetic categories. Journal of Experimental Psychology: Human Perception
and Performance 28.349–66.
FRISCH, STEFAN. 1996. Similarity and frequency in phonology. Evanston, IL: Northwestern
University dissertation.
FRISCH, STEFAN; NATHAN LARGE; and DAVID PISONI. 2000. Perception of wordlikeness:
Effects of segment probability and length on the processing of nonwords. Journal of
Memory and Language 42.481–96.
FUJIMURA, OSAMU; M. J. MACCHI; and L. A. STREETER. 1978. Perception of stop consonants
with conflicting transitional cues: A crosslinguistic study. Language and Speech
GRAMMONT, MAURICE. 1933. Traité de phonétique. Paris: Librairie Delagrave.
HALL, CHRISTOPHER. 1992. Integrating diachronic and processing principles in explaining
the suffixing preference. Morphology and mind: A unified approach to explanations
in linguistics, ed. by Christopher Hall, 321–49. New York: Routledge.
HALLE, M.; G. W. HUGHES; and J.-P. RADLEY. 1957. Acoustic properties of stop consonants.
Journal of the Acoustical Society of America 29.107–16.
of illegal consonant clusters: A case of perceptual assimilation? Journal of Experimental
Psychology: Human Perception and Performance 24.2.592–608.
HARNSBERGER, JAMES. 2001. The perception of Malayalam nasal consonants by Marathi,
Punjabi, Tamil, Oriya, Bengali, and American English listeners: A multidimensional
scaling analysis. Journal of Phonetics 29.303–27.
HARRISON, SHELDON. 1976. Mokilese reference grammar. Honolulu: The University of Hawaii
HEINE, BERND. 1976. Notes on the Rendille language. Afrika und Übersee 59.176–223.
HEINE, BERND. 1978. The Sam languages: A history of Rendille, Boni and Somali. Afroasiatic
Linguistics 6.2.1–92.
HEWITT, BRIAN GEORGE. 1995. Georgian: A structural reference grammar. Amsterdam: John
HOCK, HANS HENRICH. 1985. Regular metathesis. Linguistics 23.529–46.
HUDSON, GROVER. 1975. Suppletion in the representation of alternations. Los Angeles:
University of California, Los Angeles dissertation.
HUDSON, GROVER. 1995. Phonology of Ethiopian languages. Handbook of phonological
theory, ed. by John Goldsmith, 782–97. Oxford: Blackwell.
HUME, ELIZABETH. 1997a. Consonant clusters and articulatory timing in Deg. Columbus:
The Ohio State University, MS.
HUME, ELIZABETH. 1997b. Metathesis in phonological theory: The case of Leti. Lingua
HUME, ELIZABETH. 1998. The role of perceptibility in consonant/consonant metathesis. Proceedings
of the West Coast Conference on Formal Linguistics 17.293–307.
HUME, ELIZABETH. 2001. Metathesis: Formal and functional considerations. In Hume et al.
2001, 1–25.
HUME, ELIZABETH. 2002. Reconsidering the concept of markedness. Paper presented at the
4th International Phonology Meeting of the GDR, Grenoble, France, June 2002.
HUME, ELIZABETH. 2003. Language specific markedness: The case of place of articulation.
Studies in Phonetics, Phonology and Morphology 19.2.295–310.
HUME, ELIZABETH, and KEITH JOHNSON. 2001a. A model of the interplay of speech perception
and phonology. In Hume & Johnson 2001b, 3–26.
HUME, ELIZABETH, and KEITH JOHNSON (eds.) 2001b. The role of speech perception in phonology.
New York: Academic Press.
HUME, ELIZABETH, and KEITH JOHNSON (eds.) 2001c. Studies on the interplay of speech
perception and phonology. (Ohio State University Working Papers in Linguistics 55.)
Columbus: The Ohio State University.
HUME, ELIZABETH, and KEITH JOHNSON. 2003. The impact of partial phonological contrast
on speech perception. Proceedings of the International Congress of Phonetic Sciences
1999. A crosslinguistic study of stop place perception. Proceedings of the International
Congress of Phonetic Sciences 14.2069–72.
HUME, ELIZABETH, and MISUN SEO. 2004. Metathesis in Faroese and Lithuanian: From
speech perception to optimality theory. Nordic Journal of Linguistics 27.1.1–26.
HUME, ELIZABETH; NORVAL SMITH; and JEROEN VAN DE WEIJER (eds.) 2001. Surface syllable
structure and segment sequencing. Leiden: Holland Institute of Linguistics.
HUME, ELIZABETH, and GEORGIOS TSERDANELIS. 2002. Labial unmarkedness in Sri Lankan
Portuguese Creole. Phonology 19.1.441–58.
ISEBAERT, LAMBERT. 1988. Tocharian evidence for laryngeal metathesis in Indo-European.
Belgian Journal of Linguistics 3.39–46.
ISRAEL, M. 1979. A grammar of the Kuvi language. (Dravidian Linguistics Association 27.)
Trivandrum, India: Dravidian Linguistics Association.
JAKOBI, ANGELIKA. 1990. A Fur grammar. Hamburg: Helmut Buske Verlag.
JACOBSON, MADS ANDREAS, and CHRISTIAN MATRAS. 1961. Føroysk-Donsk Ordabók. Tórshavn:
Føroya Fródskaparfelag.
JANDA, RICHARD. 1984. Why morphological metathesis rules are rare: On the possibility of
historical explanation in linguistics. Berkeley Linguistics Society 10.87–103.
JOHNSON, KEITH. 1997. Acoustic and auditory phonetics. Oxford: Blackwell.
JUN, JONGHO. 1995. Place assimilation as the result of conflicting perceptual and articulatory
constraints. Proceedings of the West Coast Conference on Formal Linguistics
JUSCZYK, PETER. 1997. The discovery of spoken language. Cambridge, MA: MIT Press.
JUSCZYK, PETER, and RICHARD ASLIN. 1995. Infants’ detection of the sound patterns of words
in fluent speech. Cognitive Psychology 29.1–23.
JUSCZYK, PETER, and PAUL LUCE. 1994. Infants’ sensitivity to phonotactic patterns in the
native language. Journal of Memory and Language 33.630–45.
KAWASAKI, HARUKO. 1982. An acoustical basis for the universal constraints on sound sequences.
Berkeley: University of California, Berkeley dissertation.
KENESEI, ISTVÁN; ROBERT VAGO; and ANNA FENYVESI. 1998. Hungarian. London: Routledge.
KENSTOWICZ, MICHAEL. 1972. Lithuanian phonology. Urbana-Champaign: University of
Illinois, Urbana-Champaign dissertation.
KEYSER, SAMUEL J. 1975. Metathesis and Old English phonology. Linguistic Inquiry
KIM, MI-RAN CHO. 1994. Acoustic characteristics of Korean stops and perception of English
stop consonants. Madison: University of Wisconsin, Madison dissertation.
KIPARSKY, PAUL. 1967. Sonorant clusters in Greek. Language 43.619–35.
KRISHNAMURTI, BHADRIRAJU. 1978. Areal and lexical diffusion of sound change: Evidence
from Dravidian. Language 54.1.1–20.
BJÖRN LINDBLOM. 1992. Linguistic experience alters phonetic perception in infants by
6 months of age. Science 255.606–8.
LAHIRI, ADITI, and WILLIAM MARSLEN-WILSON. 1991. The mental representation of lexical
form: A phonological approach to the recognition lexicon. Cognition 38.245–94.
LANGDON, MARGARET. 1976. Metathesis in Yuman languages. Language 52.4.866–82.
LAYCOCK, DON. 1982. Metathesis in Austronesian: Ririo and other cases. Papers from the
Third International Conference on Austronesian Linguistics 1: Currents in Oceanic
(Pacific Linguistics C-74), ed. by Amran Halim, Lois Harrington, and S. A. Wurm,
269–81. Canberra: Australian National University.
LEJEUNE, MICHEL. 1972. Phonétique historique du mycénien et du grec ancien. Paris: Klincksieck.
LESLAU, WOLF. 1963. Etymological dictionary of Harari. (University of California publications
in Near Eastern studies 1.) Berkeley: University of California Press.
LILJENCRANTS, JOHAN, and BJÖRN LINDBLOM. 1972. Numerical simulation of vowel quality
systems: The role of perceptual contrast. Language 48.4.839–62.
LINDBLOM, BJÖRN. 1990. Explaining phonetic variation: A sketch of the H and H theory.
Speech production and speech modeling, ed. by William Hardcastle and Alain Marchal,
403–39. Dordrecht: Kluwer.
LLORET-ROMANYACH, MARIA-ROSA. 1988. Gemination and vowel length in Oromo morphophonology.
Bloomington: Indiana University dissertation.
LOCKWOOD, WILLIAM BURLEY. 1955. An introduction to Modern Faroese. Copenhagen:
Ejnar Munksgaard.
LUCE, PAUL. 1986. Neighborhoods of words in the mental lexicon. (Research on speech
perception technical report 6.) Bloomington: Indiana University.
LUCE, PAUL, and DAVID PISONI. 1998. Recognizing spoken words: The neighborhood activation
model. Ear and Hearing 19.1–36.
LYCHE, CHANTAL. 1995. Schwa metathesis in Cajun French. Folia Linguistica 29.369–93.
MACUCH, RUDOLF. 1965. Handbook of classical and modern Mandaic. Berlin: Walter de
MAKASHAY, MATTHEW. 2001. Lexical effects in the perception of obstruent ordering. In
Hume & Johnson 2001c, 88–116.
MALÉCOT, ANDRÉ. 1956. Acoustic cues for nasal consonants: An experimental study involving
a tape-slicing technique. Language 32.274–84.
MALONE, JOSEPH. 1971. Systematic metathesis in Mandaic. Language 47.394–415.
MALONE, JOSEPH. 1985. Classical Mandaic radical metathesis, radical assimilation and the
devil’s advocate. General Linguistics 25.92–121.
MARSLEN-WILSON, WILLIAM. 1989. Access and integration: Projecting sound onto meaning.
Lexical representation and process, ed. by William Marslen-Wilson, 3–24. Cambridge,
MA: MIT Press.
MARSLEN-WILSON, WILLIAM, and PIENIE ZWITSERLOOD. 1989. Accessing spoken words: The
importance of word onsets. Journal of Experimental Psychology: Human Perception
and Performance 15.576–85.
MARTINEZ-GIL, FERNANDO. 1990. Topics in Spanish historical phonology. Los Angeles:
University of Southern California dissertation.
MASSARO, DOMINIC, and MICHAEL COHEN. 1983. Phonological constraints in speech perception.
Perception and Psychophysics 34.338–48.
MCCARTHY, JOHN. 1989. Linear order in phonological representation. Linguistic Inquiry
MCCARTHY, JOHN. 2000. The prosody of phase in Rotuman. Natural Language and Linguistic
Theory 18.147–97.
MIELKE, JEFF. 2002. Turkish /h/ deletion: Evidence for the interplay of speech perception
and phonology. North Eastern Linguistic Society 32.383–402.
MIELKE, JEFF. 2003. The interplay of speech perception and phonology: Experimental evidence
from Turkish. Phonetica 60.3.208–29.
MIELKE, JEFF, and ELIZABETH HUME. 2001. Consequences of word recognition for metathesis.
In Hume et al. 2001, 135–58.
MODER, CAROL. 1992. Productivity and categorization in morphological classes. Buffalo:
State University of New York, Buffalo dissertation.
MOHR, B., and W. S.-Y. WANG. 1968. Perceptual distance and specification of phonological
features. Phonetica 18.31–45.
MONTREUIL, JEAN-PIERRE. 1981. The Romansch ‘brat’. Papers in Romance 3.1.67–76.
MOON, CHRISTINE; ROBIN COOPER; and WILLIAM FIFER. 1993. Two-day old infants prefer
native language. Infant Behavior and Development 16.495–500.
NEWMAN, STANLEY. 1944. Yokuts language of California. (Viking Fund publication in anthropology
2.) New York: Viking Fund.
OHALA, JOHN J. 1981. The listener as a source of sound change. Chicago Linguistic Society
OHALA, JOHN J. 1992. Alternatives to the sonority hierarchy for explaining segmental sequential
constraints. Chicago Linguistic Society 26.319–38.
OHALA, JOHN J. 1993. Sound change as nature’s speech perception experiment. Speech
Communication 13.155–61.
OHALA, JOHN J. 1996. Speech perception is hearing sounds, not tongues. Journal of the
Acoustical Society of America 99.3.1718–25.
OKRAND, MARC. 1977. Mutsun grammar. Berkeley: University of California, Berkeley dissertation.
OOMEN, ANTOINETTE. 1981. Gender and plurality in Rendille. Afroasiatic Linguistics
representation of Japanese moraic nasals. Journal of the Acoustical Society of America
PADGETT, JAYE. 2001. Contrast dispersion and Russian palatalization. In Hume & Johnson
2001b, 187–279.
and JACQUES MEHLER. 1993. Attentional allocation within the syllabic structure of
spoken words. Journal of Memory and Language 32.373–89.
PARKS, DOUGLAS R. 1976. A grammar of Pawnee. New York: Garland.
PIERREHUMBERT, JANET. 1994. Knowledge of variation. Chicago Linguistic Society
PITT, MARK. 1998. Phonological processes and the perception of phonotactically illegal
consonant clusters. Perception and Psychophysics 60.941–51.
PITT, MARK, and JAMES MCQUEEN. 1998. Is compensation for coarticulation mediated by
the lexicon? Journal of Memory and Language 39.347–70.
PITT, MARK, and ARTHUR SAMUEL. 1990. Attentional allocation during speech perception:
How fine is the focus? Journal of Memory and Language 29.611–32.
PITT, MARK; KATHERINE SMITH; and JAMES KLEIN. 1998. Syllabic effects in auditory word
recognition: Evidence from the structural induction paradigm. Journal of Experimental
Psychology: Human Perception and Performance 24.1596–1611.
POLKA, LINDA, and JANET WERKER. 1994. Developmental changes in perception of nonnative
vowel contrasts. Journal of Experimental Psychology: Human Perception and
Performance 20.2.421–35.
POLKA, LINDA, and JANET WERKER. 1997. Adult and infant perception of two English phones.
Journal of the Acoustical Society of America 102.3742–53.
POLLACK, IRWIN; HERBERT RUBENSTEIN; and LOUIS DECKER. 1959. Intelligibility of known
and unknown message sets. Journal of the Acoustical Society of America 31.273–79.
POWELL, J. V. 1985. An occurrence of metathesis in Chimakuan. For Gordon H. Fairbanks,
ed. by Veneeta Z. Acson and Richard L. Leed, 105–10. Honolulu: University of Hawaii
POWLISON, PAUL. 1962. Palatalization portmanteaus in Yagua (Peba-Yaguan). Word
PRINCE, ALAN, and PAUL SMOLENSKY. 1993. Optimality theory. New Brunswick, NJ: Rutgers
University, and Boulder: University of Colorado at Boulder, MS.
RISCHEL, JØRGEN. 1972. Consonant reduction in Faroese noncompound wordforms. Studies
for Einar Haugen presented by friends and colleagues, ed. by Evelyn Scherabon Firchow,
Kaaren Grimstad, Nils Hasselmo, and Wayne O’Neil, 482–97. The Hague:
SAFFRAN, JENNY R.; RICHARD N. ASLIN; and ELISSA L. NEWPORT. 1996. Statistical learning
by 8-month-old infants. Science 274.1926–28.
SAFFRAN, JENNY R.; ELISSA L. NEWPORT; and RICHARD N. ASLIN. 1996. Word segmentation:
The role of distributional cues. Journal of Memory and Language 35.606–21.
SAVIN, HARRIS B. 1963. Word-frequency effect and errors in the perception of speech.
Journal of the Acoustical Society of America 35.200–206.
SCHMIDT, DEBORAH. 1994. Phantom consonants in Basaa. Phonology 11.149–78.
SCHULZE, WOLFGANG. 2002. Functional grammar of Udi. Munich: University of Munich,
SEMILOFF-ZELASKO, HOLLY. 1973. Glide metathesis. Ohio State University Working Papers
in Linguistics 14.66–76.
SEO, MISUN. 2003. A segment contact account of the patterning of sonorants in consonant
clusters. Columbus: The Ohio State University dissertation.
SHAVER, DWIGHT, and GWYNNE SHAVER. 1989. Un bosquejo de la metatesis en el Quechua
de Incahuasi. Lima, Peru: Instituto Linguistico de Verano y el Ministerio de Educacion.
SHETLER, JOANNE. 1976. Notes on Balangao grammar. (Language data, Asian-Pacific series
9.) Huntington Beach, CA: Summer Institute of Linguistics.
SHI, RUSHIN; JAMES MORGAN; and PAUL ALLOPENNA. 1998. Phonological and acoustic bases
for earliest grammatical category assignment: A cross-linguistic perspective. Journal
of Child Language 25.169–201.
SILVA, CLARE. 1973. Metathesis of obstruent clusters. Ohio State University Working Papers
in Linguistics 14.77–84.
SILVERMAN, DANIEL. 1995. Phasing and recoverability. Los Angeles: University of California,
Los Angeles dissertation.
SIM, RONALD J. 1981. Morphophonemics of the verb in Rendille. Afroasiatic Linguistics
SIPTÁR, PÉTER, and MIKLÓS TÖRKENCZY. 2000. The phonology of Hungarian. Oxford: Oxford
University Press.
SMITH, NORVAL. 1984. All change on CV-tier: Developments in the history on A√t im and
Anut ii. Linguistics in the Netherlands 1984, ed. by Hans Bennis and W. U. S. van
Lessen Kloeke, 169–78. Dordrecht: Foris.
SOHN, HO-MIN. 1980. Metathesis in Kwara’ae. Lingua 52.305–23.
SPENCER, ANDREW. 1996. Phonology. Oxford: Blackwell.
STERIADE, DONCA. 1995. Licensing retroflexion. Los Angeles: University of California, Los
Angeles, MS.
STERIADE, DONCA. 1997. Phonetics in phonology: The case of laryngeal neutralization. Los
Angeles: University of California, Los Angeles, MS.
STERIADE, DONCA. 2001. Directional asymmetries in assimilation: A perceptual account. In
Hume & Johnson 2001b, 219–78.
STEVENS, KENNETH N., and SHEILA E. BLUMSTEIN. 1978. Invariant cues for place of articulation
in stop consonants. Journal of the Acoustical Society of America 64.1358–68.
STONHAM, JOHN. 1990. Current issues in morphological theory. Stanford, CA: Stanford
University dissertation.
STREETER, L. 1976. Language perception of 2-month old infants shows effects of both innate
mechanisms and experience. Nature 259.39–41.
THOMPSON, LAURENCE, and TERRY THOMPSON. 1969. Metathesis as a grammatical device.
International Journal of American Linguistics 35.213–18.
TIMBERLAKE, ALAN. 1985. The metathesis of liquid diphthongs in Upper Sorbian. International
Journal of Slavic Linguistics and Poetics 31–32.417–30.
TREHUB, S. 1976. The discrimination of foreign speech contrasts by infants and adults. Child
Development 47.466–72.
TREIMAN, REBECCA, and CATALINA DANIS. 1988. Syllabification of intervocalic consonants.
Journal of Memory and Language 27.87–104.
ULTAN, RUSSELL. 1978. A typological view of metathesis. Universals of human language
2, ed. by Joseph Greenberg, 368–99. Stanford, CA: Stanford University Press.
VAGO, ROBERT. 1980. The sound pattern of Hungarian. Washington, DC: Georgetown University
VANCE, T. 1987. An introduction to Japanese phonology. Albany, NY: State University of
New York Press.
VENNEMANN, THEO. 1988. Preference laws for syllable structure. Berlin: Mouton de Gruyter.
VITEVITCH, MICHAEL S., and PAUL A. LUCE. 1999. Probabilistic phonotactics and neighborhood
activation in spoken word recognition. Journal of Memory and Language
WANG, H. S., and BRUCE DERWING. 1994. Some vowel schemas in three English morphological
classes: Experimental evidence. In honor of Professor William S.-Y. Wang: Interdisciplinary
studies on language and language change, ed. by M. Chen and O. Tzeng,
561–75. Taipei: Pyramid Press.
WANG, WILLIAM S.-Y. 1959. Transition and release as perceptual cues for final plosives.
Journal of Speech and Hearing Research 3.66–73.
WANNER, DIETER. 1989. On metathesis in diachrony. Chicago Linguistic Society 25.434–50.
WARREN, RICHARD M. 1982. Auditory perception: A new synthesis. New York: Pergamon
WEBB, CHARLOTTE. 1974. Metathesis. Austin: University of Texas, Austin dissertation.
aspects of cross-language speech perception. Child Development 52.349–55.
WERKER, JANET, and RICHARD TEES. 1984. Cross-language speech perception: Evidence for
perceptual reorganization during the first year of life. Infant Behavior and Development
WERKER, JANET, and RICHARD TEES. 1999. Influences on infant speech processing: Towards
a new synthesis. Annual Review of Psychology 50.509–35.
WILSON, M. D. 1988. The MRC psycholinguistic database: Machine readable dictionary,
Version 2. Behavioural Research Methods, Instruments and Computers 20.1.6–11.
WINFIELD, W. W. 1928. A grammar of the Kui language. Calcutta: The Asiatic Society of
WINTERS, STEPHEN. 2001. VCCV perception: Putting place in its place. In Hume et al. 2001,
WOLFF, JOHN U. 1972. A dictionary of Cebuano Visayan. Ithaca, NY: Cornell University,
Southeast Asia Program, and Linguistic Society of the Philippines.
WRIGHT, RICHARD. 1996. Consonant clusters and cue preservation in Tsou. Los Angeles:
University of California, Los Angeles dissertation.
WRIGHT, RICHARD. 2001. Perceptual cues in contrast maintenance. In Hume & Johnson
2001b, 251–77.
ZABORSKY, ANDRZEJ. 1986. The morphology of nominal plural in the Cushitic languages.
Vienna: Institute für Afrikanistik und Ägyptologie der Universität Wien.
Department of Linguistics
222 Oxley Hall
The Ohio State University
Columbus, OH 43210
[Received 23 September 2002; accepted 9 June 2003]
