Part I
Foundations
1 What’s in a word
1.0 Introduction: a form of words
Not only every language but every lexeme of a language is an entire world in itself.
(Mel’čuk, 1981, p. 570)
A main aim of this chapter is to introduce some basic terms and concepts in the analysis of vocabulary. The emphasis is on an exploration of what constitutes a word. There is an extensive literature on this topic stretching back over at least twenty years. The area of linguistics which covers the topic is generally known as lexical semantics and is most clearly represented in John Lyons’s two-volume study (Lyons, 1977). In the next three chapters an introduction is given to work in this area; but an introduction to this highly developed field is bound to involve some degree of oversimplification. Word-level semantic analysis features in almost all elementary courses in linguistics and it is probable that some readers will already be acquainted with the field.
However, this is not a book in theoretical linguistics; it is not even an introduction to theoretical linguistics. Instead, a selection is made of those features of lexical semantics which seem most relevant to an understanding of some selected contexts of language use. For example, one focus in this first chapter is on some applications of aspects of lexical semantics to dictionary use, and to an evaluation of pedagogical word lists. These and other features are also selected for their further usefulness to us in subsequent chapters on vocabulary teaching, stylistics and English as a Foreign Language lexicography. This kind of selection runs a further risk that the field of lexical semantics is misrepresented or at best oversimplified. An applied linguistic perspective cannot always avoid such risks. However, its strengths lie, I hope, in the ways in which some practical problems of language use are addressed and discussed (note that bold italics are used when a technical term is introduced or discussed).
1.1 Some definitions
Everyone knows what a word is. And it may therefore appear unnecessary to devote several pages of discussion to its definition, even in a book on vocabulary. Indeed, closer examination reveals the usefulness of everyday common-sense notions of a word; it also reveals, however, some limitations which have a bearing on the ways in which words are used and understood in some specialized applied linguistic contexts.
An orthographic definition of a word is a practical common-sense definition. It says, quite simply, that a word is any sequence of letters (and a limited number of other characters such as hyphen and apostrophe) bounded on either side by a space or punctuation mark. It can be seen that this definition is at the basis of such activities as counting the number of words needed for an essay, a competition or a telegram, playing ‘Scrabble’ and writing a shopping list. There are, of course, irregularities. For example, we write will not as two words but cannot as one word; instead of is two words, but in place of is three; postbox can also appear as post box or post-box. But, generally, the notion of an orthographic word has considerable practical validity.
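The orthographic definition is mechanical enough to be sketched as a short program. The following Python fragment is only an illustration under stated assumptions: the pattern WORD and the function orthographic_words are names invented here, a ‘letter’ is taken to be any Unicode letter, and hyphens and apostrophes are allowed only between letters.

```python
import re

# Assumption: a word is a run of letters, optionally joined internally by
# a hyphen or an apostrophe; anything else (space, punctuation) is a boundary.
WORD = re.compile(r"[^\W\d_]+(?:[-'’][^\W\d_]+)*")

def orthographic_words(text):
    """Return the orthographic words of text, in order of appearance."""
    return WORD.findall(text)

sample = "I cannot find the post-box, so I will not post it."
words = orthographic_words(sample)
print(words)
print(len(words))

# The three spellings of one item are counted differently.
for spelling in ("postbox", "post box", "post-box"):
    print(spelling, len(orthographic_words(spelling)))
```

On this sketch cannot counts as one word and will not as two, and post box counts as two words where postbox and post-box each count as one, which is exactly the kind of irregularity noted above.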
Orthography refers, of course, to a medium of written language. And although this issue is not explicitly dealt with at this stage (see Section 4.5), we should note that spoken discourse does not generally allow of such a clear perception of a word. The issue of word stress is significant and is explored in this section, but where stress, ‘spaces’ or pauses occur in speech, it may be for reasons other than to differentiate one single word unit from another. It can be for purposes of emphasis, seeking the right expression, checking on an interlocutor’s understanding, or even as a result of forgetting or rephrasing what you were going to say. In such circumstances, the divisibility of a word is less clear-cut; in fact, spaces here can occur in the middle of the orthographically defined word unit. And we should, in any case, remember that not all languages mark word boundaries, the most prominent of these being Chinese.
However, even in written contexts, there are potential theoretical and practical problems with an orthographic definition. For example, if bring, brings, brought and bringing, or long, length and lengthen, or, less obviously, good, better and best are separate words, would we expect to find each word from the set listed separately in a dictionary? If so, why and if not, why not? And what about words which have the same form but different meanings; for example, line in the sense of railway line, fishing line or straight line? Are these one word or several? Others have more extended meanings and even embrace different grammatical categories; examples of such polysemic words are: fair, pick, air, flight, mouth. Knowing a word involves, presumably, knowing the different meanings carried by a single form. An orthographic definition is one which is formalistic in the sense of being bound to the form of a word in a particular medium. It is not sensitive to distinctions of meaning or grammatical function. To this extent it is not complete.
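The form-bound nature of the orthographic definition can be made concrete with a small sketch. The table HEADWORD below is hypothetical and hand-made for illustration; it represents only one possible answer to the question of which forms a dictionary might list together, and the function name group_by_headword is likewise invented here.

```python
from collections import defaultdict

# Hypothetical table (an assumption, not a real dictionary): each written
# form is mapped to the headword under which it might be listed.
HEADWORD = {
    "bring": "bring", "brings": "bring", "brought": "bring", "bringing": "bring",
    "good": "good", "better": "good", "best": "good",
}

def group_by_headword(tokens):
    """Group written forms under a headword; unknown forms stand alone."""
    groups = defaultdict(set)
    for token in tokens:
        groups[HEADWORD.get(token, token)].add(token)
    return dict(groups)

# Three occurrences of line stand for railway line, fishing line and
# straight line; on the page they are the same form.
tokens = ["bring", "brings", "brought", "bringing", "line", "line", "line"]

distinct_forms = set(tokens)
groups = group_by_headword(tokens)
print(len(distinct_forms))  # counted by form alone
print(groups)               # counted by headword
```

Counting by form alone treats the four forms of bring as four separate items while collapsing the three senses of line into one; grouping by headword repairs the first problem but still cannot separate the senses of line, since nothing in the written form distinguishes them.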
It may be more accurate to define a word as the minimum meaningful unit of language. This allows us to differentiate the separate meanings contained in the word fair in so far as they can be said to be different semantic units. However, this definition presupposes clear relations between single words and the notion of ‘meaning’. For example, there are single units of meaning which are conveyed by more than one word: bus conductor, train driver, school teacher, model railway. And if they are compound words do they count as one word or two? There are also different boundaries of meaning generated by ‘words’ which can be read in more than one way. For example, police investigation is read more normally as an investigation by the police but its appearance in a recent headline fronting a police bribery case enables us to read it as an investigation of the police. More problematically still, to what extent can ‘meaning’ be said to be transmitted by the following words: if, by, but, my, could, because, indeed, them? Such items can serve to structure or otherwise organize how information is received, but on their own they are not semantic units in the sense intended above. The presence of such words in the lexicon also undermines another possible definition of a word, namely, that a word is a ‘minimal free form’.
This definition, which derives from Bloomfield (1933, pp. 178 ff.), is a useful working definition and, like that of the orthographic word, has a certain intuitive validity. The idea here is to stress the basic stability of a word. This comes from the fact that a ‘word’ is a word if it can stand on its own as a reply to a question or as a statement or exclamation. It is not too difficult to imagine contexts in which each of the following words could exist independently:
Shoot! Goal! Yes. There. Up. Taxi!
And it is only by stretching the imagination that the word shoot could be reduced further to, say, Sh. . . Goal!, where it would, anyway, be dependent on the other word for its sense. By this definition, then, a word has the kind of stability which does not allow of further reduction in form. It is stable and free enough to stand on its own. It cannot be subdivided. We should note here, though, that a number of words do not pass this minimal free form test. Although we can imagine grammar lessons in which words like my or because appeared independently, it is unlikely that such items could occur on their own without being contextually attached to other words. And we should also recognize that there are idioms such as to rain cats and dogs (to rain heavily) or to kick the bucket (to die) which involve three orthographic words which cannot be further reduced without loss of meaning, which can be substituted by a single word and yet which can stand on their own. For example:
Q: Is it raining hard?
A: Cats and dogs.
Here the reply serves more or less as a substitute form for the single item hard.
Another possible definition of a word is that it will not have more than one stressed syllable. Thus, cats, shoot, veterinary, immobilize, are unambiguously ‘words’. Again, however, we should note that some of the forms designated above as not transmitting meaning (e.g. if, but, by, them) do not normally receive stress, except when a particular expressive effect is required. Also, some of our two-word orthographic units such as bus conductor would be defined as single words according to this test.
It is clear that there are problems in trying to define a word. Common-sense definitions do not get us very far; but neither does a series of more technical tests. The discussion in this section has served, however, to highlight problems and in the following sections these problems will be discussed further with the aim of at least trying to identify the basic, prototypical properties of a word. Let us first summarize the main problems we have already encountered:
1 Intuitively, orthographic, free-form or stress-based definitions of a word make sense. But there are many words which do not fit these categories.
2 Intuitively, words are units of meaning but the definition of a word having a clear-cut ‘meaning’ creates numerous exceptions and emerges as vague and asymmetrical.
3 Words have different forms. But the different forms do not necessarily count as different words.
4 Words can have the same forms but also different and, in some cases, completely unrelated meanings.
5 The existence of idioms seems to upset attempts to define words in any neat formal way.
1.2 Lexemes and words
One theoretical notion which may help us to resolve some of the above problems is that of the lexeme. A lexeme is the abstract unit which underlies some of the variants we have observed in connect...