Numerical copyediting follies

A short time ago, I was reading an otherwise pretty good article when I encountered a sentence that was confusing at first:

Large numbers are written in a variety of formats: In English, numbers may be represented as numerals (5,000), as numbers words (5,000), or in what we might call the hybrid system (5,000).

It took me a moment to realize, with a profound sadness that only a professional number guy like myself can appreciate, that what was meant was:

Large numbers are written in a variety of formats: In English, numbers may be represented as numerals (5,000), as numbers words (five thousand), or in what we might call the hybrid system (5 thousand).

No doubt the sentence read properly at first, but then an overeager copyeditor (or perhaps an automated copyediting system?) got hold of the sentence and converted it into house style for the journal, which rendered the entire thing completely meaningless.  Probably the authors should have caught it at proofs, but they didn’t, and there you go.  (I also think numbers words should be number words, but who am I to complain?  Oh, right, almost forgot there for a moment.)

One of the perils of working in a field like numerals is that every journal and publisher has a house style, and numerals are one of those things that authors often throw about casually, thus requiring some attention from editorial staff.  The problem is that when the subject of your research is numerals, you can’t rely on a house style or intuition to figure out what to do.   So, for instance, just to take a very basic example, it’s weird to say that V is the Roman numeral equivalent of five; it is the equivalent of 5.  But almost no style guide permits free-standing numerals less than 10 to be written in numerical notation.

Now, I should say that I have had very positive experiences with copyeditors in general. The copyeditors who worked on Numerical Notation, in particular, were absolutely superb. I did, however, write an extensive page-long memo to my main copyeditor, complete with acceptable and unacceptable sample sentences, covering all of the little exceptions and pedantries that, had they been ‘corrected’ to house style, would have taken hours to undo. I like to think that they appreciated it, but I leave open the possibility that they thought I was an entitled prick. I also had to contend with around 30 fonts, most of my own creation, with various numerical signs. I could write a long post about my process for creating numerical fonts, but I fear it would be even more boring than a post about copyediting goofs. Anyway, the result is fantastic and I have found very, very few editorial issues in the book.

But the authors above should take heart – it can happen to anyone. Around 2002, when I was finishing up my PhD, I worked as a research assistant for my supervisor, Bruce Trigger. One day Bruce recounted to me the most remarkable thing about the book he was working on at the time. Wherever he had written ‘one million’, he got back a version that said ‘ten lacks’, and wherever he had ‘two million’, it said ‘twenty lacks’, and so on. At first he was extremely confused, and when he got to the bottom of it, it turned out that, like so many things, the copyediting for the book had been outsourced to an Indian firm. In Indian English, the word ‘lack’ or ‘lakh’ (borrowed from Hindi) is frequently employed to mean ‘100,000’, because, as in Hindi, Indian English has a special power term for every other power of ten above 1,000 (thousand, lack = 100K, crore = 10 million), where American English has one for every third power (thousand, million, billion), and of course British English is even more irregular. In this case the error was caught and fixed. It serves, though, as an object lesson that I tell to students today, about mixed languages and Global English and weird numerical anomalies. It’s a reminder that in the world-system, sometimes the periphery strikes back.
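
For the arithmetically inclined, the two grouping schemes just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term tables and the function name `in_power_terms` are hypothetical, the ‘lakh’ spelling is used throughout, and the output imitates the hybrid numeral-plus-word format (‘5 thousand’) rather than anything a real copyediting system does.

```python
# Power terms for each variety of English, largest first.
# These tables and the function name are illustrative assumptions.
INDIAN = [(10**7, "crore"), (10**5, "lakh"), (10**3, "thousand")]
AMERICAN = [(10**9, "billion"), (10**6, "million"), (10**3, "thousand")]


def in_power_terms(n, terms):
    """Express a positive integer as a multiple of the largest power term
    that divides it exactly, in hybrid format (e.g. '10 lakh')."""
    for value, name in terms:
        if n >= value and n % value == 0:
            return f"{n // value} {name}"
    # No power term divides n exactly, so fall back to plain numerals.
    return str(n)


print(in_power_terms(1_000_000, AMERICAN))  # 1 million
print(in_power_terms(1_000_000, INDIAN))    # 10 lakh
print(in_power_terms(2_000_000, INDIAN))    # 20 lakh
```

Seen this way, ‘one million’ coming back as ‘ten lacks’ is less a blunder than a perfectly regular conversion between two sets of power terms.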




  1. An Irish journalist told me that her newspaper once treated the phrase “$64 million question” to a house-style addition, in parentheses, of the figure’s equivalent in euro.

