Optimality Theory

This is an introduction to Optimality Theory, the central idea of which is that surface forms of language reflect resolutions of conflicts between competing constraints. A surface form is ‘optimal’ if it incurs the least serious violations of a set of constraints, taking into account their hierarchical ranking. Languages differ in the ranking of constraints, and any violations must be minimal. The book does not limit its empirical scope to phonological phenomena, but also contains chapters on the learnability of OT grammars, OT’s implications for syntax, and other issues such as opacity. It also reviews in detail a selection of the considerable research output which OT has already produced. Exercises accompany chapters 1–7, and there are sections on further reading. Optimality Theory will be welcomed by any linguist with a basic knowledge of derivational Generative Phonology.

RENÉ KAGER teaches linguistics at Utrecht University, the Netherlands.

Preface

1 Conflicts in grammars
1.1 Introduction: goals of linguistic theory
1.2 Basic concepts of OT
1.3 Examples of constraint interaction
1.4 The architecture of an OT grammar
1.5 Interactions of markedness and faithfulness
1.6 Lexicon Optimization
1.7 A factorial typology of markedness and faithfulness
1.8 On defining segment inventories
1.9 Conclusion

2 The typology of structural changes
2.1 Introduction
2.2 Nasal substitution and related effects
2.3 The typology of *Nt effects
2.4 Conspiracies of nasal substitution and other processes
2.5 Conclusion: a comparison with rule-based theory

3 Syllable structure and economy
3.1 Introduction
3.2 The basic syllable typology
3.3 Epenthesis and the conflict of well-formedness and faithfulness
3.4 Generalized Alignment
3.5 The quality of epenthetic segments
3.6 Coda conditions
3.7 Conclusion

4 Metrical structure and parallelism
4.1 Introduction
4.2 Word stress: general background
4.3 Case-study: rhythmic lengthening in Hixkaryana
4.4 A set of metrical constraints
4.5 Case-study: rhythmic syncope in

5 Correspondence in reduplication
5.1 Introduction
5.2 Reduplicative identity: the constraints
5.3 From classical templates to generalized templates
5.4 From circumscription to alignment
5.5 'Classical' versus OT-based prosodic morphology: conclusions
5.6 Overapplication and underapplication in reduplication
5.7 Summary of Correspondence Theory

6 Output-to-output correspondence
6.1 Introduction
6.2 Identity effects in truncation
6.3 Identity effects in stem-based affixation
6.4 The cycle versus base-identity
6.5 Output-to-output correspondence: conclusions

7 Learning OT grammars
7.1 Introduction
7.2 Learning constraint rankings
7.3 Learning the Pintupi grammar of stress
7.4 The learning algorithm: discussion
7.5 Learning alternations and input representations

References
Index of languages
Index of subjects
Index of constraints

This book presents an introduction to Optimality Theory, a grammatical framework of recent origin (Prince and Smolensky 1993, McCarthy and Prince 1993a, b). The central idea of Optimality Theory (OT) is that surface forms of language reflect resolutions of conflicts between competing demands or constraints. A surface form is ‘optimal’ in the sense that it incurs the least serious violations of a set of violable constraints, ranked in a language-specific hierarchy. Constraints are universal, and directly encode markedness statements and principles enforcing the preservation of contrasts. Languages differ in the ranking of constraints, giving priorities to some constraints over others. Such rankings are based on ‘strict’ domination: if one constraint outranks another, the higher-ranked constraint has priority, regardless of violations of the lower-ranked one. However, such violation must be minimal, which predicts the economy property of grammatical processes. OT’s basic assumptions and the architecture of OT grammars will be dealt with in chapters 1 and 2.
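Strict domination, as described in the paragraph above, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the constraint names `C1` and `C2`, the candidates `cand_a` and `cand_b`, and their violation counts are placeholders rather than constraints or data from this book. The sketch relies only on the fact that Python compares tuples lexicographically, which behaves like strict domination once violation counts are listed in ranking order.

```python
from typing import Dict, List, Tuple

# Hypothetical violation counts: candidate -> constraint -> number of violations.
# C1, C2, cand_a and cand_b are placeholder names, not taken from the text.
Tableau = Dict[str, Dict[str, int]]

TABLEAU: Tableau = {
    "cand_a": {"C1": 0, "C2": 3},
    "cand_b": {"C1": 1, "C2": 0},
}


def optimal(tableau: Tableau, ranking: List[str]) -> str:
    """Return the candidate with the least serious violations under `ranking`.

    `ranking` lists constraints from highest- to lowest-ranked. Tuples are
    compared lexicographically, so a lower-ranked constraint only decides
    between candidates that tie on every higher-ranked constraint.
    """
    def profile(candidate: str) -> Tuple[int, ...]:
        return tuple(tableau[candidate][c] for c in ranking)

    return min(tableau, key=profile)


print(optimal(TABLEAU, ["C1", "C2"]))  # C1 dominates C2
print(optimal(TABLEAU, ["C2", "C1"]))  # C2 dominates C1
```

With `C1` ranked above `C2`, `cand_a` is selected even though it incurs three violations of `C2`, because its single rival violates the higher-ranked constraint. Reversing the ranking selects `cand_b`, which is how the same constraints can yield different optimal forms in different languages.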

Optimality Theory is a development of Generative Grammar, a theory sharing its focus on formal description and quest for universal principles, on the basis of empirical research of linguistic typology and (first) language acquisition. However, OT radically differs from earlier generative models in various ways. To accommodate cross-linguistic variation within a theory of Universal Grammar, OT assumes that universal constraints are violable, while earlier models assumed ‘parametric’ variation of inviolate principles. Moreover, OT is surface-based in the sense that well-formedness constraints evaluate surface forms only – no structural conditions are placed on lexical forms. Earlier models had assumed Morpheme Structure Constraints, resulting in the duplication of static and dynamic rules in phonotactics. In contrast, OT entirely abandons the notion of rewrite rule, dissociating ‘triggers’ and ‘repairs’. This serves to explain conspiracies: multiple processes triggered by a single output-oriented goal. Finally, OT also eliminates derivations, replacing these by parallelism: all constraints pertaining to some type of structure are evaluated within a single hierarchy. The comparison of OT and its generative ancestors will be the topic of chapter 2, although the issue will reoccur in later chapters (specifically 4, 5, and 9).

Optimality Theory is not a theory of representations, but a theory of interactions of grammatical principles. More accurately, the issue of representations is orthogonal to that of constraint interaction. Therefore the divergence from earlier generative models is less clear-cut in this respect. Most OT literature on phonology, for example, assumes the representational alphabet of non-linear (metrical and autosegmental) phonology. In this book, the emphasis will be on prosodic phenomena, partly reflecting a tendency in the field, and partly the research interests of the author. Some of OT’s most striking results have been reached in the domain of prosodically governed phenomena, such as syllable-dependent epenthesis (chapter 3), interactions of syllable weight and metrical structure (chapter 4), and prosodic targets in reduplication (chapter 5). However, our discussion of these phenomena serves to highlight results of OT that are relevant beyond prosody. To support this point, a range of segmental phenomena will be analysed throughout the book. Finally, OT has consequences for representational issues which are more closely connected with grammatical interactions, in particular for featural underspecification, as will be shown in chapters 1, 3, and 9.

Optimality Theory is a general theory of grammar, rather than one of phonology. Therefore this book is not limited in its empirical scope to phonological phenomena, but it also contains chapters on the learnability of OT grammars (chapter 7) and extensions to syntax (chapter 8). Finally, chapter 9 will address a number of important residual issues in OT, focussing on opacity, and discussing current developments in assumptions on lexical representations (versus allomorphy), optionality, absolute ungrammaticality, and various functionally oriented approaches to phonology.

During its brief period of existence, OT has sparked off a large output of articles, PhD dissertations, and volumes. Here we will review a selection of this research output, in a way that maximally highlights the theory’s contribution to insights into language. In chapters 2 and 5–8, one particular piece of research will be focussed on, while placing it against a broad theoretical background. Chapter 2 focusses on the analysis of post-nasal-obstruent-voicing effects by Pater (forthcoming), and serves to highlight factorial typology, OT’s explanation of conspiracies, and to introduce Correspondence Theory. Chapter 5 is devoted to the Correspondence Theory of reduplication by McCarthy and Prince (1995a, forthcoming), emphasizing ‘the emergence of the unmarked’ and parallelism of evaluation, and also extending the notion of ‘correspondence’ to relations between outputs. Chapter 6 discusses Benua’s (1995) paper on output-to-output correspondence in truncation, and its extensions to stem-based affixation, while comparing OT and derivational theory for ‘cyclic’ phenomena. Chapter 7 discusses work by Tesar and Smolensky (1993, 1998) on the learnability of OT grammars, and its dependence on basic OT notions, such as strict domination, minimal violation, and assumptions on lexical forms. Chapter 8 is devoted to the analysis of Wh-movement and its relation with auxiliary inversion and do-support in English by Grimshaw (1997), pointing out the relevance of OT outside phonology.

This book is not a general introduction to phonology, and the reader should come equipped with a basic knowledge of derivational Generative Phonology, including rules and representations, and some knowledge of Minimalist Syntax for chapter 8. Exercises have been added to chapters 1–7 to increase analytic skills and reflection on theoretical issues. Moreover, each chapter contains a list of suggestions for further reading.

The idea for this book arose during a course I taught at the LOT summer school at the University of Amsterdam in 1995. Stephen Anderson, who was present at this course, suggested basing an OT textbook on its contents. For his role in originating this book, I owe him special thanks.

Parts of this book are based on research reported on earlier occasions. Chapter 4 is partly based on Kager (1997a), first presented at the workshop on Derivations and Constraints in Phonology, held at the University of Essex, September 1995. Chapter 6 contains results from Kager (forthcoming), presented at the conference on the Derivational Residue in Phonology, Tilburg University, October 1995. I wish to thank the organizers of these events: Iggy Roca, Ben Hermans, and Marc van Oostendorp. Research for this book was partly sponsored by the Dutch Royal Academy of Sciences (KNAW), whose support is gratefully acknowledged.

For their comments on earlier versions of chapters I wish to thank Peter Ackema, Stephen Anderson, Roger Billerey, Gabriel Drachman, Nine Elenbaas, Bruce Hayes, Claartje Levelt, Ad Neeleman, Joe Pater, Bruce Tesar, Wim Zonneveld, and an anonymous reviewer. These comments have led to a number of substantial improvements. Needless to say, I take the blame for any mistakes in content or my presentation of other researchers’ ideas. Thanks to Martin Everaert for supplying the child language data discussed in chapter 7.

Finally, this book would not have been finished without the encouragement and patience of my colleagues, friends, and family. Jacqueline, this book is dedicated to you.

1 Conflicts in grammars

1.1 Introduction: goals of linguistic theory

1.1.1 Universality

The central goal of linguistic theory is to shed light on the core of grammatical principles that is common to all languages. Evidence for the assumption that there should be such a core of principles comes from two domains: language typology and language acquisition. Over the past decades our knowledge of linguistic typology has become more and more detailed, due to extensive fieldwork and fine-grained analysis of data from languages of different families. From this large body of research a broad picture emerges of ‘unity in variety’: core properties of grammars (with respect to the subsystems of sounds, words, phrases, and meaning) instantiate a set of universal properties. Grammars of individual languages draw their basic options from this limited set, which many researchers identify as Universal Grammar (UG). Each language thus reflects, in a specific way, the structure of ‘language’. A second source of evidence for universal grammatical principles comes from the universally recurring patterns of first language acquisition. It is well known that children acquiring their first language proceed in remarkably similar ways, going through developmental stages that are (to a large extent) independent of the language being learnt. By hypothesis, the innateness of UG is what makes grammars so much alike in their basic designs, and what causes the observed developmental similarities.

The approach to universality sketched above implies that linguistic theory should narrow down the class of universally possible grammars by imposing restrictions on the notions of ‘possible grammatical process’ and ‘possible interaction of processes’. In early Generative Grammar (Chomsky 1965, Chomsky and Halle 1968), processes took the shape of rewrite rules, while the major mode of interaction was linear ordering. Rewrite rules take as their input a linguistic representation, part of which is modified in the output. Rules apply one after another, where one rule’s output is the next rule’s input. It was soon found that this rule-based theory hardly imposes any limits on the notion of ‘possible rule’, nor on the notion of ‘possible rule interaction’. In the late 1970s and early 1980s, considerable efforts were put into constraining both rule typology and interactions. The broad idea was to factor out universal properties of rules in the form of conditions.¹ While rules themselves may differ between languages, they must always respect a fixed set of universal principles. Gradually more and more properties were factored out of rules and attributed to universal conditions on rules and representations. Developments came to their logical conclusion in Principles-and-Parameters Theory (Chomsky 1981b, Hayes 1980), which has as its central claim that grammars of individual languages are built on a central core of fixed universal properties (principles), plus a specification of a limited number of universal binary choices (parameters). Examples of parameters are the side of the ‘head’ (left or right) in syntactic phrases, or the obligatoriness (yes/no) of an onset in a syllable. At the same time, considerable interest developed in representations, as a way of constraining rule application, mainly with respect to locality (examples are trace theory in syntax, and underspecification theory in phonology). Much attention was also devoted to constraining rule interactions, resulting in sophisticated theories of the architecture of UG (the ‘T’-model) and its components (e.g. Lexical Phonology, Kiparsky 1982b).
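The linear ordering of rewrite rules in early Generative Grammar, where one rule's output is the next rule's input, can be sketched as follows. This is a toy illustration under stated assumptions: the two rules, their names, and the form `bade` are invented for the example and are not analyses from this book. Each rule is modelled as a string-to-string function using a regular expression.

```python
import re
from typing import Callable, List

# A rewrite rule maps one representation to another (here: plain strings).
Rule = Callable[[str], str]


def final_devoicing(form: str) -> str:
    """Toy rule (assumption): word-final d becomes t."""
    return re.sub(r"d$", "t", form)


def final_vowel_deletion(form: str) -> str:
    """Toy rule (assumption): word-final e is deleted."""
    return re.sub(r"e$", "", form)


def derive(underlying: str, rules: List[Rule]) -> str:
    """Apply the rules in the given linear order, feeding each output onward."""
    form = underlying
    for rule in rules:
        form = rule(form)
    return form


# The same two rules give different surface forms under different orderings.
print(derive("bade", [final_devoicing, final_vowel_deletion]))
print(derive("bade", [final_vowel_deletion, final_devoicing]))
```

In the first ordering devoicing finds no final `d` to apply to, so only deletion has an effect; in the second, deletion creates the final `d` that devoicing then changes. Nothing in the formalism itself restricts which rules or orderings a grammar may use, which is the lack of limits the paragraph above refers to.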

1.1.2 Markedness

What all these efforts to constrain rules and rule interactions share, either implicitly or explicitly, is the assumption that universal principles can only be universal if they are actually inviolate in every language. This interpretation of ‘universality’ leads to a sharp increase in the abstractness of both linguistic representations and rule interactions. When some universal principle is violated in the output of the grammar, then the characteristic way of explaining this was to set up an intermediate level of representation at which it is actually satisfied. Each grammatical principle thus holds at a specific level of description, and may be switched off at other levels.

This absolute interpretation of universality is not the only one possible, however. In structuralist linguistics (Hjelmslev 1935, Trubetzkoy 1939, Jakobson 1941; cf. Anderson 1985), but also in Generative Phonology (Chomsky and Halle 1968, Kean 1975, Kiparsky 1985) and Natural Phonology (Stampe 1972, Hooper 1976), a notion of markedness plays a key role, which embodies universality in a ‘soft’ sense. The idea is that all types of linguistic structure have two values, one of which is ‘marked’, the other ‘unmarked’. Unmarked values are cross-linguistically preferred and basic in all grammars, while marked values are cross-linguistically avoided and used by grammars only to create contrast. For example,

¹ For example, Subjacency was proposed as a universal condition on syntactic movement rules and the Obligatory Contour Principle as a universal condition on phonological rules.

