News roundup

Well there certainly has been a lot of action here since my post about the Embuggerance and Feisty fiasco. Alas, no word on any action on the part of the great Googly deity. Greetings to all newcomers arrived from Language Log, Language Hat, The Volokh Conspiracy, and parts a-Twitter. In lieu of thoughtful content, here are some things that have amused me over the past week:

Various blogs have noted (with various ranges of dismay) a new pop-sci volume entitled Manthropology by Peter McAllister, which takes the well-known fact that there is a decline in both male and female skeletal robusticity associated with industrialism and turns it into such gender-essentialist nonsense as “If you’re reading this then you — or the male you have bought it for — are the worst man in history”. As far as I can tell the author has no advanced degree in anthropology and has never published any peer-reviewed work in support of his rather extreme claims.

There’s a curious blog post over at the NYT by Olivia Judson on the relationship between facial expression and the phonetic inventory of languages. She asks whether speakers of languages in which certain vowel sounds (like [i]) are common are more prone to smile on that basis. Perhaps not, but there’s an abundant literature on the relationship of speech and facial expression, much of which is found in the notes below the post. Hat tip to Julien at A Very Remote Period Indeed for alerting me to it.

Lastly, for any of my students who may be reading and were paying attention last week, when we discussed George Lakoff’s NATION AS FAMILY metaphor, or for any of you from the true north strong and free, I give you this amusement from the webcomic Toothpaste for Dinner. I do want to register a complaint that my part of Canada (south-southwestern Ontario) seems to have already made its escape – or perhaps is the insane relative abandoned in the basement? You decide.


Mandarin vs. Cantonese in America

There’s an interesting article in the New York Times today about the increase in the use of Mandarin among Chinese-Americans, to the detriment of the formerly more common Cantonese. When we think of language loss in the US we rightly think of situations where English replaces the languages of more recent immigrants (or of Native Americans), but here we have an interesting case where two languages, each vital in China and sharing a common script, come to be in competition here due to the nature of social ties in American Chinatowns. It’s not just that more Chinese immigrants are coming from Mandarin-speaking areas today (although that’s true); because Mandarin is an international language of commerce, there is perceived economic value for Cantonese-American families in having their children become trilingual in Cantonese, Mandarin, and English. It would be interesting to know whether some Chinatowns are less prone to Mandarin-ization than others, and why.

A feisty embuggerance

When I grade my students’ paper proposals, I make a point of doing a brief Google Scholar search for each student’s proposal, which a) helps me evaluate how thorough they have been, and b) helps me help them find additional material (I then give them the sources I found, along with the keywords I used to find them). One of my students in my introductory linguistic anthropology course this term is doing a paper on linguistic aspects of laughter and humor. During my search, I encountered the following citation (direct from Google Scholar to you):

Embuggerance, E., and H. Feisty. 2008. The linguistics of laughter. English Today 1, no. 04: 47-47.

After I stopped laughing, I set to figuring out what was going on.

1) I quickly discarded the theory that an unlikely duo of scholars actually had this pair of names – although that would have been too awesome for words. In fact, no other article listed in Google Scholar has an author named ‘Embuggerance’ (although there are a couple other Feistys).

2) I also considered the possibility that this was one of the many metadata errors in Google Scholar; for instance, there are thousands of articles whose purported authors are named Citations or Introduction or Methods, due to errors where Google Scholar interprets a heading like “IV. Methods” as the name “Dr. I.V. Methods”. But this seemed unlikely in the extreme in this case.

3) This left the possibility that these were pseudonyms adopted by particularly amusing authors as part of a parody article.

In this case the article is in fact a book review (which I could tell because it’s all on one page), so I didn’t recommend it to the student, but I did request it for my own edification. Lo and behold, it arrived today as a PDF.

‘The linguistics of laughter’ is a book review of The Language of Humour by Walter Nash. It’s perfectly ordinary and non-satirical, and it does not contain the words Embuggerance or Feisty. But next to it is another book review, entitled ‘Concise and human’, which contains the following passage (emphasis added):

Silverlight’s concise and human reports cover a surprising range of curious items, from Acid Rain through Bottom Line, Catch 22, Dinner/Supper, Embuggerance, Escalate, Feisty, Holistic, Krasis, Ms, Naff, Quorate, Shambles and Viable to Yomping.

The four bolded words (Embuggerance, Escalate, Feisty, Holistic) appear on a single line, and the fact that the Google Scholar metadata thinks that the initials of the ‘authors’ are Dr. E. Embuggerance and Dr. H. Feisty seals the deal. This is the source, and so something like option 2 above is correct. But this is really weird. Not only do the pseudo-authors appear in the middle of a contextualized sentence (not in headings), but the sentence is in the wrong review – a review that itself is found (mostly correctly) in Google Scholar!

To make matters even worse, at the end of the reviews section the phrase ‘Reviews by Tom McArthur’ appears – an attribution which is found in the metadata for ‘Concise and human’ but not for ‘The linguistics of laughter’. And, as if this were not bad enough, even though both reviews are listed as being from 2008, the PDF clearly shows them as being from 1985. If I were a gambling man, I’d wager that 2008 is the year when the metadata was added and/or the file was scanned.

Now, mostly this is just a humorous anecdote; I don’t mean this as an indictment of Google Scholar, which I consider to be the most useful way for most scholars to find academic literature, and which I use virtually every day. But one has to wonder at the process (automated or otherwise) that leads to this comedy of errors. A great deal of virtual ink has been spilled over at Language Log (here and here, for instance) on the metadata problems with Google Books / Google Scholar and their implications for linguistic research, for tenure cases that rest on faulty citation records, and other potential problems. Until there is a way for these sorts of errors to be corrected by end users, we may all be well and truly embuggered.

Google Street View, maple leaf edition

Turning from ancient epigraphy to contemporary epigraphy: Today, Google Street View went live in many Canadian cities, including Montreal. As I’m currently putting together a book prospectus for Stop: Toutes Directions, this is of great interest to me. Google’s images aren’t of high enough quality to evaluate damage, wear, and vandalism, much less actually photograph and read the vandalism. On the other hand, it does allow me to easily identify new (currently un-surveyed) areas where there is a lot of linguistic variability. It took me about two minutes, for instance, to find this intersection at the corner of Churchill and Cornwall in Ste-Anne-de-Bellevue, a bilingual community at the western tip of the island of Montreal, where there are two ARRETs, one STOP, and one ARRET/STOP at a four-way intersection. We only have a handful of intersections with all three sign types in our database currently. Alternatively, one of our pet theories is that airports and border crossings tend to have greater numbers of bilingual stop signs, and this could be checked out rapidly without needing a road trip. Just as Google Earth allows archaeologists to find new sites online, but requires a lot of ground-truthing, Google Street View is a handy tool but doesn’t let you skip the hard part. For any of my co-authors who may be reading, though, rest easy: I’m not about to freak out and ask you to start collecting new data online, although I did think about sending you a prank email to that effect, before I thought better of it.

Variant Roman numerals: a project

Yesterday I thought of a great new project that could be a nice little article, or, if I had a grad student with a background in classical archaeology, a nice little thesis, or, if someone else wants to work with me, a co-authored paper. Heck, if you scam my idea, more power to you – I will cite you widely if it’s good, and mock you widely if not! You see, the Epigraphische Datenbank Clauss-Slaby is a searchable full-text database with over 350,000 Latin inscriptions (including over 20,000 images). You can enter a word (e.g., Germaniae) and it returns all the inscriptions that have that word. Nifty, huh? Just in mucking about with EDCS today I discovered two or three things that will be coming out in my book that are in need of revision, which makes me only a little bitter.

Now of course I’m not a classicist (I have three terms of Latin under my belt, but that’s hardly enough to make me an expert), but I do know a thing or three about Roman numerals. The study of Roman numerals is sorely neglected in modern epigraphy, which is a shame because there are some really interesting social questions to be asked relating to regional identity and literacy (the sort of stuff, e.g., that Greg Woolf does). We think that we know Roman numerals: just take I, V, X, L, C, D, and M, string them together in groups of no more than three, use subtractive notation for numbers like 9 and 44, and you’re done. But it isn’t so simple.

The Roman numerals are not a static and unified system; there are various expressions for the same number (e.g. XVIII vs. XIIX for 18, or XXXXX vs. L for 50). Back in the 1950s, Arthur and Joyce Gordon did some interesting statistical analysis, indicating some potential sources of this variability (chronological, regional, and textual), but they didn’t have the sort of massive resources that the EDCS provides. So, for instance, it is often said that IIIII for 5, XXXXX for 50, and CCCCC for 500 (i.e., not using the sub-base signs V, L, and D) are particularly found in African inscriptions. Well, a quick search for ‘CCCCC’ and ‘XXXXX’ suggests to me that this isn’t a full explanation. Are certain types of inscription more likely to contain these variants? Could we be dealing with a chronological difference? Could we be dealing with a variant typical of minimally literate writers, or writers of informal texts? Or could it be that the shorter forms are used when there’s less room on the medium, with longer variants used when space is not at a premium? I have no idea, but the only way to find out would be to build a list of inscriptions that use these variants, map them in time and space, and evaluate them in terms of the texts in which they occur.
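To make the idea of a ‘variant’ concrete, here is a minimal Python sketch of how one might flag them once the numerals have been pulled out of the inscriptions. It is only an illustration: the function names are my own inventions, it has nothing to do with how the EDCS itself works, and ‘textbook’ here just means the modern schoolroom convention described above. It reads a numeral leniently (so that XIIX and XXXXX both get a value) and then compares the attested spelling with the textbook one.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Textbook (modern) spelling: largest signs first, subtractive pairs for 4s and 9s.
CANONICAL_PARTS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def roman_value(numeral: str) -> int:
    """Read a numeral leniently, scanning right to left.

    A sign is subtracted when a larger sign stands somewhere to its right
    (so XIIX reads as 10 + (10 - 2) = 18); otherwise it is added.
    """
    total = 0
    largest_seen = 0
    for sign in reversed(numeral.upper()):
        value = VALUES[sign]  # KeyError on anything that is not a numeral sign
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


def canonical_form(number: int) -> str:
    """Spell a positive integer the way modern textbooks do."""
    pieces = []
    for value, signs in CANONICAL_PARTS:
        count, number = divmod(number, value)
        pieces.append(signs * count)
    return "".join(pieces)


def classify(numeral: str) -> tuple[int, str, bool]:
    """Return (value, textbook spelling, whether the attested form differs)."""
    value = roman_value(numeral)
    textbook = canonical_form(value)
    return value, textbook, numeral.upper() != textbook


# The forms mentioned above; only XVIII and L match the textbook spelling.
for attested in ["XVIII", "XIIX", "XXXXX", "L", "IIIII", "CCCCC"]:
    print(attested, classify(attested))
```

A flag like this only says that a spelling departs from the modern convention; the interesting work of mapping those departures by date, region, and text type would still have to be done by hand.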

Now, there are some methodological complexities: some of the interesting variation is between different forms for the same character, and there is no way to search for that. Some of the Roman numeral forms (the use of a horizontal bar or vinculum over a numeral to indicate multiplication by 1000) aren’t represented consistently, or at all, so one would just need to rely on other published material to find the relevant inscriptions. And quite a lot of the project would require taking the database results and then referring to the Corpus Inscriptionum Latinarum. But ultimately it would be taking what seems to be a rather dry subject (variability in Roman numerals) and potentially correlating it with variability in social identities (class, ethnic, professional). Well, I think it’s cool, anyway.

Ig Nobel 2009

The annual Ig Nobel awards “for achievements that first make people laugh, then make them think” were given out last night, and once again, anthropology has been well represented. Catherine Bertenshaw Douglas and Peter Rowlinson won the award for veterinary medicine for their demonstration that cows that are humanized by giving them names produce more milk than those that remain, uh, anonymous. Although they are veterinary scientists, their work appears in the interdisciplinary anthropological journal Anthrozoös. Meanwhile, the Ig Nobel for physics went to the biological anthropologists Katherine Whitcome, Liza Shapiro and Daniel Lieberman for their work (which appeared in Nature a couple of years ago) explaining why pregnant women don’t tip over. This is extremely important as it bears directly on the evolutionary costs and benefits of bipedalism, among other issues.

See the full list of winners here.

Bertenshaw, Catherine and Peter Rowlinson. 2009. Exploring Stock Managers’ Perceptions of the Human-Animal Relationship on Dairy Farms and an Association with Milk Production. Anthrozoös, vol. 22, no. 1, pp. 59-69.
Whitcome, Katherine, Liza J. Shapiro and Daniel E. Lieberman. 2007. Fetal Load and the Evolution of Lumbar Lordosis in Bipedal Hominins. Nature, vol. 450, pp. 1075-1078.