Happy MMX, or ˌβί, or 二千十, or if your character set will handle it, ፳፻፲. Those are the Roman, Greek, Chinese, and Ethiopic numerals for 2010. And a Happy New Year to those of you who feel like being happy. My apologies for the lengthy absence here – rest assured that I have not given up the ghost, and have a backlog of interesting posts and articles for this time of new beginning.
Looking at the numerals above, note that each has only three graphemes (characters), as opposed to the four that Western numerals require. While the Roman numerals are generally less concise for writing numbers than most other systems – one of the reasons they are stigmatized in Western thought – for round and nearly-round numbers, Western numerals generally require more symbols than other systems, because of the requirement that unused place-values have zeroes to occupy the space. Yet while each of the other four systems requires only three graphemes, they are not at all the same, but express 2010 in four distinct ways:
Roman MMX = 1000 + 1000 + 10
Greek ˌβί = (1000×2) + 10
Chinese 二千十 = (2×1000) + (1×)10
Ethiopic ፳፻፲ = (20×100) + 10
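For anyone who wants to check the arithmetic, here is a minimal Python sketch. The term lists are transcribed by hand from the four decompositions above; the names DECOMPOSITIONS and value are made up for this illustration and do not come from any numeral library.

```python
# Each numeral is written as a list of (multiplier, base) terms,
# transcribed by hand from the four decompositions listed above.
DECOMPOSITIONS = {
    "Roman MMX": [(1, 1000), (1, 1000), (1, 10)],
    "Greek ˌβί": [(1000, 2), (1, 10)],
    "Chinese 二千十": [(2, 1000), (1, 10)],
    "Ethiopic ፳፻፲": [(20, 100), (1, 10)],
}

def value(terms):
    """Sum the products of each (multiplier, base) pair."""
    return sum(multiplier * base for multiplier, base in terms)

for name, terms in DECOMPOSITIONS.items():
    print(name, "=", value(terms))
```

All four term lists sum to 2010, even though no two of them (apart from the Greek and Chinese pair, which differ only in the order of the factors) are built from the same pieces.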
But enough of this variability – you can read all about it later this month when my book finally emerges from the depths. I want to talk about another, equally interesting form of numerical variability: How do you say the number 2010?
The great numerical question of the last decade was whether the millennium began in 2000 or 2001. (For the record, I’m very strongly in the 2000 camp, on the basis that the turning over of the calendrical odometer was by far more culturally significant.) A close second is what we ought to call the decade as a whole: the naughts, or the noughts, or the aughts, or the aught-noughts, or the naughties, or any number of more ridiculous and facetious answers. And now that the decade is done, we seem to have done quite well without an agreed-upon term – although it will probably be more important to have one when the decade is being considered retrospectively – no one knew what eighties music was until 1990 at the earliest. Similarly, whether this new decade will be the tens or the teens, as discussed at David Crystal’s blog yesterday, isn’t an issue of present concern. But what to call this very year 2010 certainly is!
If you are a speaker of English, you have several options:
a) two thousand ten
b) two thousand and ten
c) twenty hundred ten
d) twenty hundred and ten
e) twenty ten
One of the not-so-dark secrets of the English language is that virtually any number above 100 has multiple, grammatically valid readings. If I were to be extremely radical I could contend that English has two parallel lexical numeral systems – but let’s not go that far quite yet. Let’s start by reducing the five variants to three by noting that the difference between a and b, and between c and d, is only the presence/absence of the word ‘and’. Crystal asserts that the versions with and are characteristically British, while the and-less versions are American, but this runs counter to my ethnographic experience working here in Detroit, where students from an ordinary public school education are brought into a mathematics program where and is strongly stigmatized, and need to learn not to use and in contexts where it would be natural for them. I’m not denying that this national pattern may have held true at one time, or in particular contexts (e.g., radio/TV broadcasts), but it surely is not as clear-cut as Crystal makes it sound.
In particular, year-names ending in -0x tend to take and for the very sensible reason that two thousand and eight clearly delineates the end of a numeral-phrase whereas two thousand eight invites the possibility that the speaker is about to continue – e.g. two thousand eight hundred. Although I should add that the potential for real confusion is quite low, on contextual grounds, and that one does occasionally hear and-less readings of 2001-2009. But let’s admit that one can use and in these phrases, or not, as one wishes.
Let’s reduce the variability further by noting the extremely unusual nature of the numeral-phrase *twenty hundred. Many English numerals ending in 00 between 1100 and 9900 are conventionally expressed in hundreds, not thousands – eleven hundred, sixty-eight hundred, etc. This is probably because they are more compact than one thousand one hundred, six thousand eight hundred, etc. That brevity is the relevant cognitive criterion is demonstrated by the exceptions, the thousands from 2000 through 9000. Two thousand is shorter (3 syllables) than *twenty hundred (4 syllables) – remember, we are talking about verbal expressions here, so syllable count is the relevant criterion. So even though we could say nineteen hundred (and) ten, *twenty hundred (and) ten sounds decidedly odd. So let’s just forget about variants c and d.
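The syllable comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the syllable counts in SYLLABLES are my own hand counts for careful speech (your dialect may differ), and the function names are hypothetical.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hand-counted syllables per word (an assumption, not measured data).
SYLLABLES = {
    "one": 1, "two": 1, "three": 1, "four": 1, "five": 1, "six": 1,
    "seven": 2, "eight": 1, "nine": 1, "ten": 1, "eleven": 3, "twelve": 1,
    "thirteen": 2, "fourteen": 2, "fifteen": 2, "sixteen": 2,
    "seventeen": 3, "eighteen": 2, "nineteen": 2, "twenty": 2, "thirty": 2,
    "forty": 2, "fifty": 2, "sixty": 2, "seventy": 3, "eighty": 2,
    "ninety": 2, "hundred": 2, "thousand": 2,
}

def two_digit(n):
    """Words for 1..99."""
    if n < 20:
        return [ONES[n]]
    words = [TENS[n // 10]]
    if n % 10:
        words.append(ONES[n % 10])
    return words

def hundreds_reading(n):
    """E.g. 6800 -> sixty eight hundred (n is a multiple of 100, 1100..9900)."""
    return two_digit(n // 100) + ["hundred"]

def thousands_reading(n):
    """E.g. 6800 -> six thousand eight hundred."""
    words = [ONES[n // 1000], "thousand"]
    hundreds_digit = (n // 100) % 10
    if hundreds_digit:
        words += [ONES[hundreds_digit], "hundred"]
    return words

def syllables(words):
    return sum(SYLLABLES[w] for w in words)

for n in (1100, 6800, 2000):
    h, t = hundreds_reading(n), thousands_reading(n)
    print(n, " ".join(h), syllables(h), "|", " ".join(t), syllables(t))
```

With these counts, the hundreds reading comes out shorter for every multiple of 100 from 1100 to 9900 except the round thousands 2000 through 9000, where the thousands reading is shorter by one syllable.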
But wait! What if we omit the word hundred entirely – as it is possible to do in almost any context in English? So, for instance, I can say, “I get paid eight twenty five a week” and “I get paid eight twenty five an hour” and be understood as saying that I make $825.00 a week but $8.25 an hour, solely from contextual information. I am presumably in this case working 100 hours a week, which may be a slight exaggeration. In any event, when talking about years there is rarely even the slightest chance of lexical ambiguity, and so nineteen hundred seventy-four is almost always reduced to nineteen seventy four – in fact, the only place where the full expression is encountered is in extremely formal or prestigious contexts such as official proclamations, diplomas, etc.
Note, also, that only the and-less versions of these phrases can be so reduced: *nineteen and seventy-four or *twenty and ten are unacceptable (both in Britain and the US). The simplest explanation (though not the only one) for this phenomenon is that when one is abbreviating, it makes sense that one would abbreviate maximally, rather than adding the unnecessary and. Another potential factor is that phrases like twenty and ten are found in English sentences such as, “The first-class and regular seats cost twenty and ten dollars, respectively,” although I wouldn’t stake my professional reputation on this potential ambiguity being all that important.
So that leaves us with two thousand (and) ten or twenty ten, which of course is not a surprise. On the grounds of compactness, twenty ten clearly wins out; on the grounds of avoiding ambiguity, two thousand and ten seems preferable, and two thousand ten might represent a good compromise. But these are not the only factors to consider – also relevant is how we say (or expect to say) surrounding year-names; if we say two thousand eight and two thousand nine, then twenty ten is potentially jarring. And we decidedly do not say twenty eight and twenty nine for 2008 and 2009, for the obvious reason that they can readily be confused with twenty-eight (28) and twenty-nine (29)!
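The collision is easy to see if the hundred-less reading is treated as a mechanical rule: split the year into two two-digit halves and read each half. The Python sketch below does exactly that and nothing more. The function names are hypothetical, and the rule is deliberately naive (it has no "oh" for a zero in the tens place, for instance).

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def two_digit(n):
    """Words for 0..99; zero is read as nothing at all."""
    if n < 20:
        return ONES[n]
    return TENS[n // 10] + (" " + ONES[n % 10] if n % 10 else "")

def pair_reading(year):
    """Read a four-digit year as two two-digit halves, with no 'hundred'."""
    high, low = divmod(year, 100)
    return " ".join(part for part in (two_digit(high), two_digit(low)) if part)

for year in (1974, 2010, 2008):
    print(year, "->", pair_reading(year))

# The naive reading of 2008 is word-for-word the reading of 28.
print(pair_reading(2008) == two_digit(28))
```

Under this rule 1974 and 2010 come out as nineteen seventy four and twenty ten, while 2008 comes out identical to the ordinary reading of 28, which is the confusion described above.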
The lesson: Whatever reading you might choose will be on the basis of one or more criteria, and there is no ultimate good or ‘proper’ choice – every choice will be less than maximally preferable on at least one of those criteria. So do what you like!
And the problem gets even worse, because depending on the context in which numerals are used, they may have additional valid readings. For instance, nominal number representations – ones that serve as labels rather than as counts of things – are frequently (though not always) read digitally rather than lexically. I’m working on a paper on the reading of phone numbers, showing the ways in which digital and lexical representations of numbers interface (and interfere) with one another (see pilot data here). To illustrate the point: my paper is tentatively entitled ‘Jenny’s Revenge: Eight Hundred Sixty Seven, Five Thousand Three Hundred (and) Nine’, which will make sense largely to those born before one thousand nine hundred (and) seventy five. But year numbers are not just labels, but rather denote a place in a series – 2010 is defined as the year after 2009 and before 2011 – and are less readily digitized. Consider the following:
two zero one zero
two oh one oh
twenty one zero
twenty one oh
None of these is even remotely acceptable as a reading of the year number 2010. More strikingly, even though nineteen oh nine is the preferred pronunciation of 1909 for most people, 2010 cannot be acceptably read as *twenty one oh, I suspect, by anyone. But if my phone number were 867-2010, at least the first two variants would be acceptable, and in fact might be preferred, because they are clear representations of each digit. When one is speaking on the phone, for instance, one tends towards maximally clear and distinct representations of each digit in the expectation that one’s listener may be writing the number down – and no one writes phone numbers out lexically. With year numbers, this expectation rarely if ever holds true. There has been no scholarship to date on the question of which numerical representations are acceptable (or not), preferred (or not) in various contexts.
To return to our beginnings, the problem of multiple representations for the same number also arises in numerical notation, although due to the structure of Western numerals, less so than in other representations. In Chinese, for instance, there is both an informal (二千十 = 2 1000 10) and formal (二千一十 = 2 1000 1 10) variant – the difference being the deletion of the morpheme/grapheme for one in the tens position. Similarly, in the classical Roman numerals one could use subtractive notation far more widely than is presently taught in schools – for instance, check out this inscription where 88 is written as XXCIIX instead of the expected (modern) LXXXVIII. And even though ‘we all know’ that the Roman numeral for 4 is IV, Roman numeral clocks even to this day read IIII (but IX for 9)!
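To see how XXCIIX and LXXXVIII can both come out to 88, here is a small Python sketch of one permissive way to evaluate Roman numerals. The rule it uses (scan from right to left and subtract any symbol smaller than the largest symbol seen so far) is my own assumption for illustration, chosen because it lets a whole run such as XX or II be subtracted; it is not a claim about how Roman writers themselves reasoned.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Evaluate a Roman numeral, allowing runs of subtracted symbols."""
    total = 0
    largest_seen = 0
    # Scan right to left: a symbol smaller than the largest one already
    # seen to its right is subtracted, otherwise it is added.
    for symbol in reversed(numeral.upper()):
        v = VALUES[symbol]
        if v < largest_seen:
            total -= v
        else:
            total += v
            largest_seen = v
    return total

for numeral in ("XXCIIX", "LXXXVIII", "IIII", "IV", "IX", "MMX"):
    print(numeral, "=", roman_to_int(numeral))
```

The same function accepts the additive clock-face IIII and the subtractive IV, so both spellings of 4 evaluate to the same number.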
So in summation, may you have a lexically ambiguous, but nonetheless pleasant, two-oh-one-oh through two-oh-one-nine.
Edit to add: Shortly after I posted, Mark Liberman over at Language Log offered his own mirthful take on the issue in his post, ‘2010’, which you should all go read right now, if you haven’t already. I shall have to register a complaint, however, in that his post is (ordinally) #2012 over at the Log (see the URL) – they surely should have instructed their bloggers properly on the importance of this numerical correlation!