Like many people, I’ve long been fascinated by the concept of “dreamtime” (which I was probably introduced to by the notoriously unreliable Bruce Chatwin); I’ve also been uneasy about depending on vague thirdhand understanding of what I was aware must be an incredibly complicated cultural complex of ideas. If you’re like me, you will welcome as I did the chance to improve your understanding at least a bit by reading Christine Judith Nicholls’s series of three posts from 2014, ‘Dreamtime’ and ‘The Dreaming’ – an introduction. It begins with a good quote by Jeannie Herbert Nungarrayi, formerly a Warlpiri teacher at the Lajamanu School in the Tanami Desert of the Northern Territory:

To get an insight into us – [the Warlpiri people of the Tanami Desert] – it is necessary to understand something about our major religious belief, the Jukurrpa. The Jukurrpa is an all-embracing concept that provides rules for living, a moral code, as well as rules for interacting with the natural environment.

The philosophy behind it is holistic – the Jukurrpa provides for a total, integrated way of life. It is important to understand that, for Warlpiri and other Aboriginal people living in remote Aboriginal settlements, The Dreaming isn’t something that has been consigned to the past but is a lived daily reality. We, the Warlpiri people, believe in the Jukurrpa to this day.

Nicholls writes:

In this succinct statement Nungarrayi touched on the subtlety, complexity and all-encompassing, non-finite nature of the Jukurrpa.

The concept is mostly known in grossly inadequate English translation as “The Dreamtime” or “The Dreaming”. The Jukurrpa can be mapped onto micro-environments in specific tracts of land that Aboriginal people call “country”.

As a religion grounded in the land itself, it incorporates creation and other land-based narratives, social processes including kinship regulations, morality and ethics. This complex concept informs people’s economic, cognitive, affective and spiritual lives.

She later adds that “words from many different languages have been squished into a couple of sleep-related English words – words that come with significantly different connotations – or baggage – in comparison with the originals”:

As noted earlier, the Warlpiri people of the Tanami Desert describe their complex of religious beliefs as the Jukurrpa.

Further south-east, the Arrerntic peoples call the word-concept the Altyerrenge or Altyerr (in earlier orthography spelled Altjira and Alcheringa and in other ways, too).

The Kija people of the East Kimberley use the term Ngarrankarni (sometimes spelled Ngarrarngkarni); while the Ngarinyin people (previously spelled Ungarinjin, inter alia) people speak of the Ungud (or Wungud).

“Dreaming” is called Manguny in Martu Wangka, a Western Desert language spoken in the Pilbara region of Western Australia; and some North-East Arnhem Landers refer to the same core concept as Wongar – to name but a handful.

Part two asks the question “who dreamed up these terms?” (it started with Francis Gillen, an Arrernte speaker and keen ethnologist in the late 19th century); part three is “‘Dreamings’ and dreaming narratives: what’s the relationship?” There’s lots of food for thought, and the illustrations are gorgeous.

David Crisp, "Gianforte: Congress’ newest misdemeanor", Last Best News 6/25/2017:

In case you were wondering whether Greg Gianforte will ever live down his body slam of a reporter for the Guardian, here’s a clue.

The Associated Press reported last week that Gianforte drew boos from the Republican side of the aisle during his brief speech following his swearing in as Montana’s representative in the U.S. House. The murmurs apparently had nothing to do with misdemeanor assault but came in response to Gianforte’s call to “drain the swamp” and for a bill denying pay to members of Congress if they fail to balance the budget.

But what’s really interesting is the C-SPAN transcript of Gianforte’s swearing in. The transcripts, according to a FAQ at the C-SPAN website, are drawn from the closed captioning that scrolls on the screen during sessions of Congress. The transcripts are included on the website to help visitors find the video they want, not to provide an accurate record of the actual speeches.

But they can nevertheless be revealing. On the tape, House Speaker Paul Ryan swears in Gianforte, then says, “Congratulations, you are now a member of the 115th Congress.” On the transcript, Ryan says, “Congratulations, you are now misdemeanor of the 115th Congress.”

Here's the audio:

And here's a screenshot of the relevant segment of the captioning, which actually says "CONGRATULATIONS, ARE YOU NOW MISDEMEANOR OF THE 115TH CONGRESS":


"The Real Threat of AI"

Jun. 26th, 2017 12:39 pm
Kai-Fu Lee has an interesting opinion piece in yesterday's NYT, "The Real Threat of Artificial Intelligence":

What worries you about the coming world of artificial intelligence?

Too often the answer to this question resembles the plot of a sci-fi thriller. People worry that developments in A.I. will bring about the “singularity” — that point in history when A.I. surpasses human intelligence, leading to an unimaginable revolution in human affairs. Or they wonder whether instead of our controlling artificial intelligence, it will control us, turning us, in effect, into cyborgs.

These are interesting issues to contemplate, but they are not pressing. They concern situations that may not arise for hundreds of years, if ever. […]

This doesn’t mean we have nothing to worry about. On the contrary, the A.I. products that now exist are improving faster than most people realize and promise to radically transform our world, not always for the better. They are only tools, not a competing form of intelligence. But they will reshape what work means and how wealth is created, leading to unprecedented economic inequalities and even altering the global balance of power.

Read the whole thing — and then compare it to Norbert Wiener's expression of very similar concerns in 1950, discussed in "AI panics", 11/27/2016, and "Intellectual automation", 3/7/2011.

Wiener's warnings were certainly premature. Kai-Fu Lee has a more plausible case to make, though it's possible that the climb is going to be somewhat steeper and slower than he suggests it will be.

But as both Wiener and Lee explain, the eventual social and political consequences will be profound. Kai-Fu is conditionally optimistic, though his meta-Keynesian prescriptions may strike some as naive:

One way or another, we are going to have to start thinking about how to minimize the looming A.I.-fueled gap between the haves and the have-nots, both within and between nations. Or to put the matter more optimistically: A.I. is presenting us with an opportunity to rethink economic inequality on a global scale. These challenges are too far-ranging in their effects for any nation to isolate itself from the rest of the world.

It's not clear why the global 0.01% should be any more benevolent than it is now — Norbert Wiener put it this way:

Let us remember that the automatic machine, whatever we think of any feelings it may have or may not have, is the precise economic equivalent of slave labor. Any labor which competes with slave labor must accept the economic conditions of slave labor. It is perfectly clear that this will produce an unemployment situation, in comparison with which the present recession and even the depression of the thirties will seem a pleasant joke. This depression will ruin many industries-possibly even the industries which have taken advantage of the new potentialities. However, there is nothing in the industrial tradition which forbids an industrialist to make a sure and quick profit, and to get out before the crash touches him personally.

Thus the new industrial revolution is a two-edged sword. It may be used for the benefit of humanity, but only if humanity survives long enough to enter a period in which such a benefit is possible. It may also be used to destroy humanity, and if it is not used intelligently it can go very far in that direction.

And both current events and recent history suggest that large groups of motivated people with simple weapons are not easily overcome by superior technology. So whatever the ultimate outcome, we may get there after we relive years like 1811, 1848, 1871, 1905, 1917, 1949, 1965, …

The main page of The Australian National Dictionary says:

In the tradition of the Oxford English Dictionary, the Australian National Dictionary Centre – a joint initiative of the Australian National University and Oxford University Press – published The Australian National Dictionary: A Dictionary of Australianisms on Historical Principles in 1988.

Oxford University Press has been publishing in Australia since 1908 and, in recognition of this milestone and as a symbol of gratitude to the Australian people, The Australian National Dictionary has been made available online, free.

What a splendid thing to do! I posted about essays by Bruce Moore, the main editor, here and here. And you can read a very personal review, “Up a wombat’s freckle,” by Dame Edna Everage’s alter ego Barry Humphries at the TLS:

This scholarly two-volume work contains a generous entry under the word “chunder”, a word unknown in my youth outside the Geelong and Ballarat Grammar Schools, until I relentlessly promulgated it in the comic strip of Barry McKenzie in Private Eye. There the eponymous hero regularly and compulsively regurgitated. This expressive, even onomatopoeic, term took off in trendy London circles and is now in universal, colloquial use. […]

Needless to say, there are innumerable expressions to describe thirst and drunkenness, but there are some I have noted that have eluded the lexicographer and not yet found their way into any dictionary. They illustrate Australian verbal ingenuity, and in stretching the expressive possibilities of the English language, they often possess a kind of sardonic poetry. A thirsty man might therefore say “I’m as dry as a Pommy’s bathmat”, which incorporates a reference to the well-known English aversion to bathing. I’ve also heard an inebriated man employ what must be the most offensive rhyming slang for intoxication when he declared, “Sorry mate, I’m a bit Schindlers”.

Both offensive and hilarious: Oz at its best!

"One big Donald Trump AIDS"

Jun. 25th, 2017 02:34 pm

Posted by Mark Liberman

As I've observed several times over the years, automatic speech recognition is getting better and better, to the point where some experts can plausibly advance claims of "achieving human parity". It's not hard to create material where humans still win, but in a lot of ordinary-life recordings, the machines do an excellent job.

Just like human listeners, computer ASR algorithms combine "bottom-up" information about the audio with "top-down" information about the context — both the local word-sequence context and various layers of broader context. In general, the machines are more dependent than humans are on the top-down information, in the sense that their performance on (even carefully-pronounced) jabberwocky or word salad is generally rather poor.

But recently I've been noting some cases where an ASR system unexpectedly fails to take account of what seem like some obvious local word-sequence likelihoods. To check my impression that such events are fairly common, I picked a random video from YouTube's welcome page — Bill Maher's 6/23/2017 monologue — and fetched the "auto-generated" closed captions.

Here's an example that combines impressive overall performance with one weird mistake:

5:07 Mitch McConnell says he wants a vote
5:10 before the 4th of July when Trump voters
5:13 traditionally blow their hands off
5:19 oh the fourth of July hey summers here
5:24 boy it was real Beach weather in Phoenix
5:26 the other day did you see that it was
5:28 122 122 plains could not take off hey
5:34 climate deniers
5:36 if melting IceCaps and rising oceans and
5:40 pandemics aren't enough to scare you not
5:42 being able to leave Phoenix that should
5:50 work

I'll give the machine a pass on "summers" instead of "summer's", and we can ignore the issue of "oh" vs. "ah", and forgive the hallucinated "work" at the end — but "plains could not take off"? In Psalm 114:4 the mountains skipped like rams, but not even then did the plains take off.

A bit later:

6:32 but speaking of solar Donald Trump broke
6:36 some news at the rally that the wall you
6:39 know the wall between us and Mexico it's
6:41 going to have solar panels on he said it
6:43 was his idea solar battles okay so the
6:47 wall which is never going to be built
6:49 which Mexico is never going to be paying
6:52 for which now has imaginary so propels
6:56 on because if it's one big Donald Trump
6:59 AIDS it's fake news

So the system got "solar panels" right the first time, but then heard "solar battles" and "so propels". In fairness, Maher kind of garbles the last one into something like "solar pels":

But still, I don't think anyone in the audience heard "so propels".

And then at the end, "if it's one thing Donald Trump hates it's fake news" gets turned into "if it's one big Donald Trump AIDS it's fake news":

In that case, I don't hear any acoustic phonetic excuses. And surely "one thing Donald Trump hates" is a priori a more probable word string than "one big Donald Trump AIDS"…
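The a priori comparison can be made concrete with a toy bigram language model. This is only a minimal sketch: the four-sentence corpus is invented for illustration, add-one smoothing is a simplifying assumption, and nothing here reflects how YouTube's captioning system actually scores word strings.

```python
import math
from collections import Counter

# Hypothetical toy corpus (an assumption for illustration only);
# a real ASR language model is trained on vastly more text.
CORPUS = [
    "if it's one thing donald trump hates it's fake news",
    "one thing he hates is losing",
    "the one thing donald trump wants is a wall",
    "it was one big mistake",
]

def train_bigrams(sentences):
    """Count bigrams and their left-hand contexts, with sentence boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab

def log_prob(sentence, contexts, bigrams, vocab):
    """Add-one-smoothed bigram log probability of a whole word string."""
    size = len(vocab) + 1  # one extra slot reserved for unseen words
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + size))
        for prev, cur in zip(tokens, tokens[1:])
    )

contexts, bigrams, vocab = train_bigrams(CORPUS)
spoken = "if it's one thing donald trump hates it's fake news"
captioned = "if it's one big donald trump aids it's fake news"
lp_spoken = log_prob(spoken, contexts, bigrams, vocab)
lp_captioned = log_prob(captioned, contexts, bigrams, vocab)
print(f"spoken:    {lp_spoken:.2f}")
print(f"captioned: {lp_captioned:.2f}")
```

Even this crude model prefers the string that was actually spoken, because "one thing" and "trump hates" occur in its training text and "big donald" and "trump aids" do not. The puzzle is why a far better model, combined with the acoustic evidence, ended up preferring the other one.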

I don't know which generation of ASR Google is using to generate YouTube captions. But it's possible that this sort of thing is an example of the sometimes-peculiar behavior of RNN language models.

Renewal of the race / nation

Jun. 25th, 2017 02:53 am

Posted by Victor Mair

Jamil Anderlini in the Financial Times (6/21/17), "The dark side of China’s national renewal", writes:

To an English-speaking ear, rejuvenation has positive connotations and all nations have the right to rejuvenate themselves through peaceful efforts.

But the official translation of this crucial slogan is deeply misleading. In Chinese it is “Zhonghua minzu weida fuxing” and the important part of the phrase is “Zhonghua minzu” — the “Chinese nation” according to party propaganda. A more accurate, although not perfect, translation would be the “Chinese race”.

That is certainly how it is interpreted in China. The concept technically includes all 56 official ethnicities, including Tibetans, Muslim Uighurs and ethnic Koreans, but is almost universally understood to mean the majority Han ethnic group, who make up more than 90 per cent of the population.

The most interesting thing about Zhonghua minzu is that it very deliberately and specifically incorporates anyone with Chinese blood anywhere in the world, no matter how long ago their ancestors left the Chinese mainland.

“The Chinese race is a big family and feelings of love for the motherland, passion for the homeland, are infused in the blood of every single person with Chinese ancestry,” asserted Chinese premier Li Keqiang in a recent speech.

This is a highly perceptive, and troubling, article that merits reading in its entirety.

In this post, I will focus on some key terms.

First of all, front and center, what is this mínzú 民族?  It can mean lots of things:  nation, nationality, people, ethnic group, race, volk.  This is not the first time that mínzú 民族 has erupted on the international stage.  One of the most notable instances was four years ago, emanating right here from the University of Pennsylvania.  The incident is well recounted by R.L.G. in "Johnson" at The Economist (5/21/13), "Of nations, peoples, countries and mínzú:  Differing terms for ethnicity, citizenship and group belonging ruffle feathers":

DID Joe Biden insult China?  The American vice-president has a habit of sticking his foot into his mouth, and in this case, the recent graduation speech he gave at the University of Pennsylvania inspired a viral rant by a "disappointed" Chinese student at Penn, Zhang Tianpu. What was Mr Biden's sin? Was it Mr Biden's suggestion that creative thought is stifled in China?

You cannot think different in a nation where you cannot breathe free. You cannot think different in a nation where you aren't able to challenge orthodoxy, because change only comes from challenging orthodoxy.

No, that wasn't it.

The source of the insult is a surprising one: Mr Biden called China a "great nation", and a "nation" repeatedly after that. Victor Mair, the resident sinologist at the Language Log blog, translates Mr Zhang's complaint.

In this sentence, "You CANNOT think different in a nation where you aren't able to challenge orthodoxy", he used the word "nation". This is what really infuriated me, because in English "nation" indicates "race, ethnicity", which is different from "country, state". "Country, state" perhaps places more emphasis on the notion of the entirety of the country, even to the point of referring to the idea of government.

Mr Mair explains:

The weakness in Zhang's reasoning lies mainly in his confusion over the multiple meanings of the word mínzú 民族…. [M]ínzú 民族 can mean "ethnic group; race; nationality; people; nation".  Coming from the English side, we must keep in mind that "nation" can be translated into Chinese as guó 国 ("country"), guójiā 国家 ("country"), guódù 国度 ("country; state"), bāng 邦 ("state"), and, yes, mínzú 民族 ("ethnic group; race; nationality; people; nation").

It is clear that, when Biden said "China is a great nation", he was respectfully referring to the country as a whole.  Yet the sensitivity to questions of ethnicity in China, especially with regard to the shǎoshù mínzú 少数民族 ("ethnic / national minorities"), e.g., Uyghurs, Tibetans, and scores of others, caused Zhang to take umbrage over something that the Vice President never intended.

In a later post about smartphone zombies, Cant. dai1tau4 zuk6 / MSM dītóu zú 低頭族 (“head-down tribe”), "Tribes" (3/10/15), I wrote:

The first word I think of when I see 族 as a suffix is Mandarin mínzú, Japanese minzoku 民族 (“nation; nationality; people”), which is formed from 民 (“people; subjects; civilians”) + 族 (“family clan; ethnic group; tribe”).  The term is a neologism coined in the late 19th century by Japanese thinkers to match the Western (especially German) concept of “nation”.

… I have assembled a large amount of material concerning the absence of mínzú / minzoku 民族 as a lexical item corresponding to “nation” in China before it was introduced from Meiji [1868-1912] Japan.

When we prefix mínzú 民族 with shǎoshù 少数 ("few; small number; minority"), we have shǎoshù mínzú 少数民族 ("minority; national minority; ethnic minority").  Here it gets really tricky, because, as Anderlini points out in his article, there are officially 56 ethnic groups (mínzú 民族) in China, of which 55 are shǎoshù mínzú 少数民族 ("minorities; national minorities; ethnic minorities; ethnic groups"), with the 56th being the dominant, majority (over 90%) Hàn mínzú 汉民族 ("Han nationality; Han ethnic group").  Consequently, when Chinese politicians talk about the blood of the Chinese race, it's important to know whether they are referring to Hàn mínzú 汉民族 ("Han nationality; Han ethnic group"), Zhōnghuá mínzú 中华民族 ("Chinese nation / people", where Zhōnghuá 中华 is understood as "Central cultural florescence"), or something else.  In each case, we need to judge carefully whether they meant to include all the ethnicities within the sovereign territory of the PRC or in the whole world, or whether they were referring specifically to individuals of Han ethnicity within the sovereign territory of the PRC or in the whole world.  Often, for politicians, as for poets, ambiguity is desirable, or at least convenient.

There are no less than half a dozen other words for "(the) people" that are in common use in Mandarin.  I won't go into all of them here, but will mention only one:  rénmín 人民, as in rénmínbì 人民币 ("RMB; people's currency") and Rénmín rìbào 人民日报 ("People's Daily").  This term, rénmín 人民, does not get involved with race, ethnicity, nation, and so on, but emphasizes the population as a whole.

As for "Zhongguo / China", that too is a huge can of worms, for which see this incisive paper by Arif Dirlik:

"Born in Translation: 'China' in the Making of 'Zhongguo'"

[h.t. John Rohsenow, Bill Bishop]

Nahuatl in LA.

Posted by languagehat

Peggy McInerny writes about a Nahuatl program for the Latin American Institute:

The language of the Aztecs, Nahuatl [pronounced na’ wat], is alive and well today in Los Angeles. Beginning and intermediate classes in modern Nahuatl are offered at UCLA, with an advanced class slated to launch next year.

A few miles due north at the Getty Museum, historians and art experts are collaborating with Italy’s Laurentian Library on a long-term project to create an online, annotated version of one of the greatest works ever written in Nahuatl: the Florentine Codex. A virtual encyclopedia of Nahua culture compiled by a dedicated Franciscan friar in the mid-16th century, the work has never been accessible to the general public — much less to descendants of the Aztecs living in Mexico.

Last fall, an entire scene of a U.S. television show was shot in Spanish and modern Nahuatl, marking the first time that the Aztec language had ever been heard on an American broadcast. This coming September, a charter school in Lynwood will offer Nahuatl classes to its middle school students, courtesy of a UCLA graduate student. And that’s not to mention a dedicated native speaker who has been teaching Nahuatl classes for 26 years in a local church in Santa Ana (see KPCC story).

Standing at the confluence of most of these linguistic streams is UCLA historian Kevin Terraciano, director of the Latin American Institute. A genial professor with a dry sense of humor, Terraciano was instrumental in making Nahuatl available at UCLA, beginning in fall 2015. It was Terraciano who translated English dialogue into Nahuatl for an American Crime episode during the show’s current season. (He later coached the actors, who had to learn their parts phonetically, at the actual shoot.)

There’s some interesting stuff there about the history of the language (“Nahuatl was still the majority-spoken language in the Valley of Mexico at the end of the colonial period […] Despite the fact that 90 percent of the population died over a 100-year period as a result of one epidemic after another, indigenous peoples were still the majority of the population of Mexico by the end of that period”). Thanks, Trevor!

Bruria Kaufman

Jun. 24th, 2017 04:53 pm

Posted by Mark Liberman

The Annual Reviews have a tradition of featuring retrospective articles by or about senior figures, and the Annual Review of Linguistics has followed this pattern with pieces featuring Morris Halle in the 2016 volume and Bill Labov in 2017. For 2018, we'll be featuring Lila Gleitman.

As background, Barbara Partee, Cynthia McLemore and I spent the last couple of days interviewing Lila about her life and work. We've got more than 7.5 hours of recordings, which is more like a book than an article — and it may very well turn into a book as well, with edited interview material interspersed with reprints of Lila's papers. But what I want to post about today is one of the many things that I learned in the course of the discussions. This was just a footnote in Lila's life story, but it has its own intrinsic interest, and I'm hoping that some readers will be able to provide more information.

I learned that the founder of the Penn Linguistics Department, Zellig Harris, was married to a mathematical physicist named Bruria Kaufman. She worked with John von Neumann, wrote some widely-cited papers on crystal statistics in the late 1940s, published with Albert Einstein (Albert Einstein and Bruria Kaufman. "A new form of the general relativistic field equations", Annals of Mathematics, 1955), and later wrote papers like "Unitary symmetry of oscillators and the Talmi transformation", Journal of Mathematical Physics 1965, and "Special functions of mathematical physics from the viewpoint of Lie algebra", Journal of Mathematical Physics 1966.

The thing that interested me most was that Bruria Kaufman also worked for a while in the 1950s with Harris at Penn, at the same time as others including Lila Gleitman, Aravind Joshi, R.B. Lees, Naomi Sager, Zeno Vendler, and Noam Chomsky. And according to this 1961 NSF report, her contributions included Transformations and Discourse Analysis Papers (TDAP) numbers 19 and 20:

19. Higher-order Substrings and Well-formedness, Bruria Kaufman.
20. Iterative Computation of String Nesting (Fortran Code), Bruria Kaufman.

I've found a couple of citations to these works, but so far not the works themselves.

The 1961 NSF report says that

Paper 15 gives an information [sic — should be informal?] presentation of a general theory and method for syntactic recognition. Papers 16-19 give the actual flow charts of each section of the syntactic analysis program.

where 15-19 are

15. Computable Syntactic Analysis, Zellig S. Harris. (Revised version published as PoFL I, above)
16. Word and Word-Complex Dictionaries, Lila Gleitman.
17. Elimination of Alternative Classifications, Naomi Sager.
18. Recognition of Local Substrings, Aravind K. Joshi.
19. Higher-order Substrings and Well-formedness, Bruria Kaufman.

and "PoFL I" is Harris's String Analysis and Sentence Structure, 1962.

Aravind Joshi and Phil Hopely, "A parser from antiquity", Natural Language Engineering 1996, explains that

A parsing program was designed and implemented at the University of Pennsylvania during the period from June 1958 to July 1959. This program was part of the Transformations and Discourse Analysis Project (TDAP) directed by Zellig S. Harris. The techniques used in this program, besides being influenced by the particular linguistic theory, arose out of the need to deal with the extremely limited computational resources available at that time. The program was essentially a cascade of finite state transducers (FSTs).
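For readers unfamiliar with the term, here is a minimal sketch of what a "cascade of finite state transducers" means in general. It is emphatically not a reconstruction of the 1958-59 program: the `FST` class, the tiny lexicon, the tag names, and the two stages are all hypothetical, chosen only to show the output of one transducer feeding the next.

```python
class FST:
    """Deterministic finite-state transducer.

    transitions maps (state, input symbol) -> (next state, list of output symbols).
    """

    def __init__(self, transitions, start=0, finals=(0,)):
        self.transitions = transitions
        self.start = start
        self.finals = set(finals)

    def run(self, symbols):
        state, output = self.start, []
        for symbol in symbols:
            if (state, symbol) not in self.transitions:
                raise ValueError(f"no transition from state {state} on {symbol!r}")
            state, emitted = self.transitions[(state, symbol)]
            output.extend(emitted)
        if state not in self.finals:
            raise ValueError("input ended in a non-final state")
        return output

# Stage 1 (hypothetical): a one-state transducer that maps words to word classes.
LEXICON = {"the": "DET", "a": "DET", "old": "ADJ", "dog": "N", "cat": "N", "chased": "V"}
tagger = FST({(0, word): (0, [tag]) for word, tag in LEXICON.items()})

# Stage 2 (hypothetical): collapse DET/ADJ ... N sequences into a single NP symbol.
chunker = FST({
    (0, "DET"): (1, []),
    (0, "ADJ"): (1, []),
    (0, "N"): (0, ["NP"]),
    (0, "V"): (0, ["V"]),
    (1, "ADJ"): (1, []),
    (1, "N"): (0, ["NP"]),
})

def cascade(symbols, stages):
    """Feed the output of each transducer into the next one."""
    for stage in stages:
        symbols = stage.run(symbols)
    return symbols

result = cascade("the old dog chased a cat".split(), [tagger, chunker])
print(result)  # ['NP', 'V', 'NP']
```

Each stage needs only a small transition table and a single current state, which is one reason such a design suits a machine with very little memory.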

More on the history from that source:

The original program was implemented in the assembly language on Univac 1, a single user machine. The machine had acoustic (mercury) delay line memory of 1000 words. Each word was 12 characters/digits, each character/digit was 6 bits. Lila Gleitman, Aravind Joshi, Bruria Kauffman, and Naomi Sager and a little later, Carol Chomsky were involved in the development and implementation of this program. A brief description of the program appears in Joshi 1961 and a somewhat generalized description of the grammar appears in Harris 1962.  This program is the precursor of the string grammar program of Naomi Sager at NYU, leading up to the current parsers of Ralph Grishman (NYU) and Lynette Hirschman (formerly at UNISYS, now at Mitre Corporation). Carol Chomsky took the program to MIT and it was used in the question-answer program of Green, BASEBALL (1961). At Penn, it led to a program for transformational analysis (kernels and transformations) (1963) and, in many ways, influenced the formal work on string adjunction (1972) and later tree-adjunction (1975).
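To put those memory figures in modern terms, here is the arithmetic, using only the numbers quoted above (1000 words, 12 characters per word, 6 bits per character). The conversion to 8-bit bytes is my own framing, not something the paper states.

```python
# Figures as quoted from Joshi and Hopely (1996).
WORDS = 1000
CHARS_PER_WORD = 12
BITS_PER_CHAR = 6

total_chars = WORDS * CHARS_PER_WORD       # 12,000 characters
total_bits = total_chars * BITS_PER_CHAR   # 72,000 bits
total_bytes = total_bits // 8              # 9,000 eight-bit bytes (a modern unit, for comparison)

print(f"{total_chars:,} characters = {total_bits:,} bits = {total_bytes:,} bytes")
```

So the whole working memory held 12,000 characters, or about 9 kilobytes by today's reckoning.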

The paper's bibliography cites

Transformations and Discourse Analysis Project (TDAP) Reports, University of Pennsylvania, Reports #15 through #19, 1959-60. Available in the Library of the National Institute of Science and Technology (NIST) (formerly known as the National Bureau of Standards (NBS)), Bethesda, MD.

So I'll ask my friends at NIST if these works are still there.



Posted by languagehat

Timur Baytukalov has created what looks like a useful site for language learners, EasyPronunciation.com; he says:

I created this website with phonetic transcription converters – https://easypronunciation.com/en/. They can convert text into IPA phonetic transcription. I already support seven languages (English, Russian, French, Spanish, Chinese, Japanese, Italian). Russian and French converters have embedded audio recordings.

Some levels are for paid subscribers, but basic levels are free; it looks worth checking out.

Chinglish with tones

Jun. 23rd, 2017 07:57 pm

Posted by Victor Mair

4th tone – 3rd tone, it would appear:

Well, maybe not; the diacritics are probably meant to indicate vowel quality, but I don't know what system (if any) they are using.

Ben Zimmer writes:

The diacritics may be intended to evoke pinyin tone marks, but they're also reminiscent of dictionary-style phonetic respelling and stress marking. The grave accent on "ì" could be intended as an indicator of primary stress, though that's more typically marked with an acute accent. And the breve on the "ĭ" is a common enough way to represent /ɪ/ (the macron is used for long vowels and the breve for short vowels — see, e.g., Phonics on the Web). But this use of diacritics as typographical ornamentation is never very consistent — recall the styling of the play Chinglish as "Ch’ing·lish".

The illustration appears at the top of this article:

It turns out that the image used by the People's Daily originally appeared as a promotion for the play Chinglish that Ben mentioned, specifically for its performance by the Singaporean theater company Pangdemonium in 2015. See the Pangdemonium website, as well as local coverage by PopSpoken and Today. So the People's Daily may have searched for a "Chinglish" image online and borrowed this one, without giving proper credit. (Credit should go to Olivier Henry of MILK Photographie.)

The six individuals in the picture seem to be aspiring to some idealized form of Chinglish in the sky above, overlying the cloud-shrouded five-star design of the Chinese flag, leading them on.  The thrust of the People's Daily article, however, is anything but adulatory of Chinglish:

Chinese authorities on June 20 issued a national standard for the use of English in the public domain, eradicating poor translations that damage the country’s image.

The standard, jointly issued by China’s Standardization Administration and General Administration of Quality Supervision, Inspection and Quarantine, aims to improve the quality of English translations in 13 public arenas, including transportation, entertainment, medicine and financial services. It will take effect on Dec. 1, 2017.

According to the standard, English translations should prioritize correct grammar and a proper register, while rare expressions and vocabulary words should be avoided. The standard requires that English not be overused in public sectors, and that translations not contain content that damages the images of China or other countries. Discriminatory and hurtful words have also been banned. The standard provided sample translations for reference, and warned against direct translation.

There are perpetual plans for eliminating Chinglish in China, but they are unlikely ever to materialize unless professional translators are sought after for their expertise and paid accordingly.

Earlier calls for the elimination of English more generally are no longer heard from responsible persons:

Now the goal is more reasonably just to get rid of Chinglish, but that will not happen on December 1, 2017 when the new standards go into effect.  Although it will take many years for their full implementation and realization, the standards are admirable goals to aim for.


[h.t. Jim Fanell, Toni Tan]

Ask Language Log: "assuage"

Jun. 23rd, 2017 11:41 am

Posted by Mark Liberman

Query from a reader:

Is it correct to use the word assuage to indicate a lessening of something? That is, it is often used in the realm of feelings, i.e. assuage hunger, assuage grief, etc. But would it be acceptable to use to indicate the lessening of something more tangible, such as assuage criminality, assuage the flow of water, assuage drug use.

I probably wouldn't use assuage to describe the lowering of flood waters or the amelioration of traffic jams. But I don't have any special standing to rule on such matters, so as usual, let's look at how others use the word.

The OED's entry for assuage, which is flagged as "not yet … fully updated (first published 1885)", has several senses marked as "arch. or Obs." that don't involve "angry or excited feelings", or beings in such a state.

There's the transitive form glossed "To abate, lessen, diminish (esp. anything swollen)", with examples like

1774   J. Bryant New Syst. II. 284   The Dove..brought the first tidings that the waters of the deep were asswaged.

There's the intransitive inchoative version of the same, glossed "To grow less, diminish, decrease, fall off, die away; to abate, subside", with examples like

1611   Bible (King James) Gen. viii. 1   And the waters asswaged.

COCA has 509 instances of "assuage", 134 of "assuaged", 46 of "assuaging", and 17 of "assuages". Looking at a random sample of 100, we find that all 100 are transitive, and that in 98 of them, what's assuaged is a negatively evaluated emotion or feeling or concern ("the community's grief", "his guilt", "such mortal concerns", "the twitchy sensation in my cells", "white opposition to slave conversion", "my hunger", "Democratic anxieties", "India's complaints", "feelings of humiliation", "the monarch's fears", "his own damaged pride", "the egos of movie stars", "my curiosity", …), or a person or group of people subject to such emotions or feelings or concerns ("his uneasy party", "the academic intellectual community", "the larger man", "international critics of the war", "his jittery passenger", "the chiefs", "the dealers", …).

The two exceptions in the sample are these:

In The Efficiency Trap, Steve Hallett claims that we will exhaust many of our resources by the 2030s, and violence and chaos will erupt as a result. Hallett proposes recycling and growing food locally as possible means of assuaging the damage.

The measure, which awaits Senate approval of a minor amendment next week, can not assuage the impending disaster that will kill virtually all the fish in the Dolores River this summer.

With respect to the specific examples in the query, Google finds

"assuage criminality": one example [link] Please reconsider your gig – don't play for a segregated audience in Israel and make of yourself a balm to assuage criminality.

"assuage the flow of water": no examples (though see biblical examples cited by the OED)

"assuage drug use": one example [link] Becker's neoliberal drug policy presumes to assuage drug use and addiction by the instantiation of a highly regulated market as a system of control.

So the verdict of norma loquendi seems to be that applying assuage to things other than people and their feelings is out of fashion and currently marginal.


My own investigations on the Bronze Age and Early Iron Age peoples of Eastern Central Asia (ECA) began essentially as a genetics cum linguistics project back in the early 90s.  That was not long after the extraction of mtDNA (mitochondrial DNA) from ancient human tissues and its amplification by means of PCR (polymerase chain reaction) became possible.

By the mid-90s I had grown somewhat disenchanted with ancient DNA (aDNA) studies because the data were insufficient to determine the origins and affiliations of various early groups with satisfactory precision, either spatially or temporally.  Around the same time, I began to realize that other types of materials, such as textiles and metals, provided powerful diagnostic evidence.

By the late 90s, combining findings from all of these fields and others, I was willing to advance the hypothesis that some of the mummies of ECA, especially the earliest ones dating to around 1800 BC, may have spoken a pre-proto-form of Tocharian when they were alive (some people think it's funny or scary to imagine that mummies once could speak).  This hypothesis was presented at an international conference held at the University of Pennsylvania in April, 1996, which was attended by more than a hundred archeologists, linguists, geneticists, physical anthropologists, textile specialists, metallurgists, geographers, climatologists, historians, mythologists, and ethnologists — including more than half a dozen of the world's most distinguished Tocharianists.  It was most decidedly a multidisciplinary conference before it became fashionable to call academic endeavors by such terms (see "Xdisciplinary" [6/14/17]).  The papers from the conference were collected in this publication:

Victor H. Mair, The Bronze Age and Early Iron Age Peoples of Eastern Central Asia (Washington, D.C.: Institute for the Study of Man Inc. in collaboration with the University of Pennsylvania Museum Publications, 1998).  2 vols.

See also:

J. P. Mallory and Victor H. Mair, The Tarim Mummies: Ancient China and the Mystery of the Earliest Peoples from the West. (2000). Thames & Hudson. London.

"Early Indo-Europeans in Xinjiang" (11/19/08)

It is only very recently, within the last ten years or so, that Y-chromosome analysis has been brought into play for the study of ancient DNA.  See Toomas Kivisild, "The study of human Y chromosome variation through ancient DNA", Human Genetics, 2017; 136(5): 529–546; published online 2017 Mar 4. doi: 10.1007/s00439-017-1773-z.*  Since only males carry the Y-chromosome, this has made it possible to trace the patriline of individuals.  This, coupled with the massive accumulation and detailed analysis of modern DNA with increasing sophistication and the rise of the interdisciplinary (!) field referred to as genomics, has made studies on the genetics of premodern people, including their origins, migrations, and affinities, far more exacting than they were during the 90s when I did the bulk of my investigations on the early inhabitants of the Tarim Basin.

Now it is possible to draw on the results of genetics research to frame and more reliably solve questions about the development of languages from their homeland to the far-flung places where they subsequently came to be spoken.  One such inquiry is described in this article:

Tony Joseph, "How genetics is settling the Aryan migration debate", The Hindu (6/16/17).

It is significant that this substantial article appeared in The Hindu, since there is a strong bias against such conclusions among Indian nationalists (see "Indigenous Aryans").  It begins thus:

New DNA evidence is solving the most fought-over question in Indian history. And you will be surprised at how sure-footed the answer is, writes Tony Joseph

The thorniest, most fought-over question in Indian history is slowly but surely getting answered: did Indo-European language speakers, who called themselves Aryans, stream into India sometime around 2,000 BC – 1,500 BC when the Indus Valley civilisation came to an end, bringing with them Sanskrit and a distinctive set of cultural practices? Genetic research based on an avalanche of new DNA evidence is making scientists around the world converge on an unambiguous answer: yes, they did.

Joseph's paper is informed, sensitive, balanced, and nuanced.  This is responsible science journalism.

The scientific paper itself, “A Genetic Chronology for the Indian Subcontinent Points to Heavily Sex-biased Dispersals” by Marina Silva, Marisa Oliveira, Daniel Vieira, Andreia Brandão, Teresa Rito, Joana B. Pereira, Ross M. Fraser, Bob Hudson, Francesca Gandini, Ceiridwen Edwards, Maria Pala, John Koch, James F. Wilson, Luísa Pereira, Martin B. Richards, and Pedro Soares, was published in BMC Evolutionary Biology (3/23/17) (DOI: 10.1186/s12862-017-0936-9).

I'm skeptical of many of the claims put forward by geneticists concerning origins and dispersals, not just about humans, but also about horses, dogs, cats, plants, and so forth.  This study, however, is both cautious and solid.  Moreover, it fits well with the archeological evidence (more below).

Here are two key paragraphs from the scientific paper (numbers in square brackets are to accessible references):

Although some have argued for co-dispersal of the Indo-Aryan languages with the earliest Neolithic from the Fertile Crescent [88, 89], others have argued that, if any language family dispersed with the Neolithic into South Asia, it was more likely to have been the Dravidian family now spoken across much of central and southern India [12]. Moreover, despite a largely imported suite of Near Eastern domesticates, there was also an indigenous component at Mehrgarh, including zebu cattle [85, 86, 90]. The more widely accepted “Steppe hypothesis” [91, 92] for the origins of Indo-European has recently received powerful support from aDNA evidence. Genome-wide, Y-chromosome and mtDNA analyses all suggest Late Neolithic dispersals into Europe, potentially originating amongst Indo-European-speaking Yamnaya pastoralists that arose in the Pontic-Caspian Steppe by ~5 ka, with expansions east and later south into Central Asia in the Bronze Age [53, 76, 93, 94, 95]. Given the difficulties with deriving the European Corded Ware directly from the Yamnaya [96], a plausible alternative (yet to be directly tested with genetic evidence) is an earlier Steppe origin amongst Copper Age Khavlyn, Srednij Stog and Skelya pastoralists, ~7-5.5 ka, with an infiltration of southeast European Chalcolithic Tripolye communities ~6.4 ka, giving rise to both the Corded Ware and Yamnaya when it broke up ~5.4 ka [12].

An influx of such migrants into South Asia would likely have contributed to the CHG component in the GW [VHM:  genome-wide] analysis found across the Subcontinent, as this is seen at a high rate amongst samples from the putative Yamnaya source pool and descendant Central Asian Bronze Age groups. Archaeological evidence suggests that Middle Bronze Age Andronovo descendants of the Early Bronze Age horse-based, pastoralist and chariot-using Sintashta culture, located in the grasslands and river valleys to the east of the Southern Ural Mountains and likely speaking a proto-Indo-Iranian language, probably expanded east and south into Central Asia by ~3.8 ka. Andronovo groups, and potentially Sintashta groups before them, are thought to have infiltrated and dominated the soma-using Bactrian Margiana Archaeological Complex (BMAC) in Turkmenistan/northern Afghanistan by 3.5 ka and possibly as early as 4 ka. The BMAC came into contact with the Indus Valley civilisation in Baluchistan from ~4 ka onwards, around the beginning of the Indus Valley decline, with pastoralist dominated groups dispersing further into South Asia by ~3.5 ka, as well as westwards across northern Iran into Syria (which came under the sway of the Indo-Iranian-speaking Mitanni) and Anatolia [12, 95, 97, 98].

The spread of R1a into South Asia had earlier been securely documented in Peter A. Underhill, et al., "The phylogenetic and geographic structure of Y-chromosome haplogroup R1a", European Journal of Human Genetics (2015) 23, 124–131; doi:10.1038/ejhg.2014.50; published online 26 March 2014.

The precise coalescence of R1a within South Asia was identified in Monika Karmin, et al., "A recent bottleneck of Y chromosome diversity coincides with a global change in culture", Genome Research (2015);

This kind of male migration theory is proposed with arguments based on archeological evidence in the last pages of H.-P. Francfort, “La civilisation de l'Oxus et les Indo-Iraniens et Indo-Aryens”, in: Aryas, Aryens et Iraniens en Asie Centrale (Collège de France. Publications de l'Institut de Civilisation Indienne, vol. 72), G. Fussman, J. Kellens, H.-P. Francfort, et X. Tremblay (eds.) (Paris:  Diffusion de Boccard, 2005) pp. 253-328.  The complete paper is available on the Academia website.

Michael Witzel has favored this, the (Indo-)Aryan Migration view, on linguistic and textual grounds since at least 1995 and was constantly criticized for saying so. See his papers of 1995, 2001:

"Autochthonous Aryans? The Evidence from Old Indian and Iranian Texts."  EJVS (May 2001) pdf.

"Early Indian History: Linguistic and Textual Parameters."  In: Language, Material Culture and Ethnicity: The Indo-Aryans of Ancient South Asia. Ed. G. Erdosy (Berlin/New York: de Gruyter 1995), 85-125; —  Rgvedic history: poets, chieftains and politics, loc. cit. 307-352 combined pdf (uncorrected).

and the substrate paper of 1999:

"Early Sources for South Asian Substrate Languages." Mother Tongue (1999, extra number) pdf

Some relevant Language Log posts:

"Dating Indo-European" (12/10/03)

"The Linguistic Diversity of Aboriginal Europe" (1/6/09)

"Horse and wheel in the early history of Indo-European" (1/10/09)

"More on IE wheels and horses" (1/10/09)

"Inheritance versus lexical borrowing: a case with decisive sound-change evidence" (1/13/09)

"The place and time of Proto-Indo-European: Another round" (8/24/12)

"Irish DNA and Indo-European origins" (12/31/15)

*For those who are interested in the development of aDNA Y-chromosome studies beginning in the 2000s, I have some additional documentation and several relevant papers that I can send to you.

[Thanks to Richard Villems, Toomas Kivisild, and Peter Underhill]

[syndicated profile] languagehat_feed

Posted by languagehat

Over at the Log, Victor Mair posted about the latest silly governmental attempt to control language, in this case Erdoğan’s campaign against foreign influences in Turkish; he quotes an article in The Economist:

Mr Erdogan started by ordering the word “arena”, which reminded him of ancient Roman depravity, removed from sports venues across the country. Turkey’s biggest teams complied overnight. Vodafone Arena, home of the Besiktas football club, woke up as “Vodafone Stadyumu”. Critics wondered what the Turkish language had gained by replacing one foreign-derived word with another. […]

Because so much abstract vocabulary had come from Arabic and Persian, this in effect created a new language. From one generation to the next, the country’s cultural history was cut off. Mr Erdogan seems to want to turn the clock back, complete with imperial nostalgia and resentment towards the West. In 2014 he proposed introducing mandatory high-school classes in Ottoman Turkish, which survives today only among linguists, historians and clerics. The plan was shelved after a popular backlash.

The offensive against Western loanwords will probably meet a similar fate. In an interview, the [Turkish Language Institute]’s head, Mustafa Kacalin, clarified that it would apply only to “bizarre” foreign words incomprehensible to most Turks. The limits became clear in Mr Erdogan’s own speech on May 23rd, in which he denounced loanwords by using a loanword. They were not, he said, “sik” (“chic”). Many Turks no doubt consider the whole thing a load of bosh—from the Turkish bos, “nonsense”.

As Thomas Shaw says in the comments, the quotation from The Economist misspells the Turkish: “it should be şık and boş. Also Erdoğan, of course […].” (In the following comments, Y thought for a moment he was at the Hattery, which was amusing.) We discussed Atatürk’s original Turkish language reform back in 2012.

My summer

Jun. 22nd, 2017 11:37 am
[syndicated profile] languagelog_feed

Posted by Mark Liberman

.. or at least six weeks of it, will be spent at the 2017 Jelinek Summer Workshop on Speech and Language Technology (JSALT) at CMU in Pittsburgh. As the link explains, this

… is a continuation of the Johns Hopkins University CLSP summer workshop series from 1995-2016. It consists of a two-week summer school, followed by a six-week workshop. Notable researchers and students come together to collaborate on selected research topics. The Workshop is named after the late Fred Jelinek, its former director and head of the Center for Speech and Language Processing.

I took part in the first of these annual summer workshops, back in 1995, as a member of the team focused on "Language Modeling for Conversational Speech Recognition".

This summer, I'll be part of a group whose theme is described as "Enhancement and Analysis of Conversational Speech".

One of the group's goals is to do a better job of "diarization", i.e. keeping track of who spoke when in conversations. Existing systems do an especially bad job with overlapping speech, which can be extremely common.
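To make the overlap problem concrete, here is a minimal Python sketch that measures how much of a conversation consists of overlapping speech, given a diarization expressed as (speaker, start, end) segments. The segment format, the function name `overlap_seconds`, and the sample timings are all assumptions made for illustration; they are not taken from the workshop's tools or data. The sketch also assumes that a single speaker's own segments never overlap one another.

```python
from typing import List, Tuple

# Hypothetical segment format: (speaker label, start time in s, end time in s)
Segment = Tuple[str, float, float]

def overlap_seconds(segments: List[Segment]) -> float:
    """Total time during which two or more segments are active at once."""
    events = []
    for _speaker, start, end in segments:
        events.append((start, 1))   # a segment begins
        events.append((end, -1))    # a segment ends
    # Sort by time; at equal times, ends (-1) are processed before starts (+1),
    # so segments that merely touch are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    active = 0          # number of segments currently active
    overlap = 0.0
    prev_time = None
    for time, delta in events:
        if prev_time is not None and active >= 2:
            overlap += time - prev_time
        active += delta
        prev_time = time
    return overlap

# Invented example: Blue starts before Red finishes, and Red cuts back in.
segments = [
    ("Red", 0.0, 4.0),
    ("Blue", 3.0, 6.0),
    ("Red", 5.5, 8.0),
]
print(overlap_seconds(segments))  # 1.5
```

Dividing the result by the total duration gives a rough overlap rate, which is one simple way to quantify how common overlapping speech is in a given recording.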

Here's a graphical representation of (accurate) diarization in a (real) conversation between Red and Blue:

And the same thing continued for a while (though not to the end of the conversation):

As discussed here, turn-taking overlaps are often cooperative rather than competitive — and it would be good to be able to supplement robust diarization with a functional analysis of conversational flow.

As the workshop progresses, I'll post some updates.


[syndicated profile] languagehat_feed

Posted by languagehat

Dauvit Horsbroch, of the Scots Language Centre, has a video lecture (just under 20 minutes) on the Scots language (“leid” in Scots) that’s a fascinating experience for an English speaker; the more you listen the more you understand, and it’s a linguistically informed talk about language — what’s not to like?

Via MetaFilter, where Happy Dave (“I’m Scottish, speak Scottish English day-to-day, occasionally dot my sentences with Scots words and have academic connections to the Scots leid folks through my wife”) has the following informative comment, responding to someone else saying “I don’t know anything about this guy, but I knew people who spoke Scots and they didn’t sound much like that”:

Just a note on this – this fella is a Scots language (leid) specialist, so he’s speaking a pretty formalised form of Scots with deliberate substitution of words, including some that are pretty much archaic/extinct in everyday speech. There’s an attempt to document and make consistent some of the spellings etc and I believe this is the form of Scots the Scottish Parliament uses when producing documents in Scots.

However, a lot of people slide between broad Scots (and its sub-dialects like Doric) and Scottish English, sometimes in the space of a sentence, all day every day. And there are not, so far as I know, any Scots speakers (even those who speak it to the exclusion of all else) who do not speak Scottish English. If you speak Scots, you also are capable of speaking Scottish English, although the reverse is not always true.

The people you knew may have been speaking a different regional sub dialect of Scots, or a less formalised version with less archaisms, or Scottish English with a smattering of Scots words.

Paul Zukofsky

Jun. 21st, 2017 11:45 pm
[syndicated profile] languagelog_feed

Posted by Mark Liberman

This strikes me as an unusual obituary: Margalit Fox, "Paul Zukofsky, Prodigy Who Became, Uneasily, a Virtuoso Violinist, Dies at 73", NYT 6/20/2017. It massively violates the precept de mortuis nil nisi bonum, describing its subject at great length as an "automaton" who was "deeply ill at ease with world"; an "arch-bridge troll", full of "unbridled hubris", "disdain for those less gifted than he", and "an ample sense of self-worth"; "swift to run to judgment", "meanspirited, sarcastic, rather bitter"; someone who would "look at [his audience] with utter contempt", and on and on.

Margalit Fox certainly found plenty of sources for these judgments. But this litany of bitter score-settling is completely at odds with my own experience of Paul Zukofsky.

I first met Paul around 1976, when I was employed at Bell Labs in Murray Hill NJ, and he was the music director of the Colonial Symphony in Madison, a few miles west. He was planning to present Bach's Fourth Brandenburg Concerto, and he needed a continuo player. I owned a harpsichord, had once taken a conservatory course in figured bass realization, and occasionally performed with professional and semi-professional chamber groups in the area, so Joan Miller recommended me to him.

Paul was then teaching at Stony Brook, so I trekked out there to audition. That was an amazing experience — while I played the continuo part, Paul, with an occasional glance at the score, played the parts of all three soloists and the rest of the orchestra all at once on the violin. It was amazing. I had never seen anything like it. I managed to keep my jaw off the floor, and made my way through the audition well enough to get the part.

This situation was inherently intimidating, and my own musical gifts were far below Paul's. But he was charming and friendly, interested in talking about Bach's music, and about music theory and the psychology of music, and he left me with a positive feeling about the whole experience.

For a while around that time, Paul became a regular visitor at Bell Labs, where he contributed to some interesting work, including these publications:

Ronald Knoll, Saul Sternberg, and Paul Zukofsky, "Subdivision of the beat: Estimation and production of time ratio by skilled musicians", JASA 1976.
Mark Liberman, Joseph Olive, and Paul Zukofsky, "Studies of metric patterns", JASA 1977.
Saul Sternberg, Ronald Knoll, and Paul Zukofsky, "Timing by Skilled Musicians", in Diana Deutsch, Ed., Psychology of Music, 1982.

Throughout those interactions, I never met the cold, mean, unpleasant man depicted in the NYT obituary. On the contrary, Paul was always smart, engaged, friendly, and even convivial.

Maybe I have a thicker skin than the people who supplied Margalit Fox with so much bile. Or maybe Paul was different in later life than he was when I knew him.

But looking over the obituary, I see two other factors that might be relevant. One is Paul's role as executor of his father's estate — that's a side of him that I never saw, and one that would not have been relevant before Louis Zukofsky died in 1978, which was after most of my interactions with Paul.

And the other factor might be his apparent reluctance to take up the standard role of a violin virtuoso, or at least to limit himself to playing that part. Perhaps he saw me and others at Bell Labs as part of his self-liberation from that role, rather than as part of the world that he needed to escape, and perhaps he therefore interacted differently with us.

Still, I have a feeling that most people could be unlucky enough to be treated to an obituary like the one under discussion. The recipe is clear:  find people with a grudge, people on the other side of arguments, people who were offended on purpose or by accident, people who were disappointed, people with relevant prejudices, and select your quotes to play up the negatives and minimize the positives. The Paul Zukofsky I knew deserves better.

Update — a letter sent by Saul Sternberg to the New York Times:

I believe that this obituary gives a false impression of Zukofsky's personality.  The only indication that he could be a sweet, loving, caring person is the one quote (Kalish) "to those who understood him deeply…"  If you look at the comments on slippedisc.com/2017/06/death-of-an-important-american-violinist-73/ you'll find many who loved him, and some for whom he was a kind and caring mentor. Surely they didn't all "understand him deeply".

It is as if, rather than providing a balanced description, the writer emphasized those aspects of his personality that would fit with her beliefs about his early life and her claims about his "emotional development" having been "sacrificed to professional prowess".

I've known Paul Zukofsky for the past forty years, and although the names of many people have come up in our conversations and correspondence, I've seen no evidence of "his disdain for people less gifted than he".

Also, the obit fails to mention the existence of the Zukofsky Quartet, named in his honor.

Update #2 — from Joshua Gordon:

It was good to read your commentary on the NYTimes obituary for Paul Zukofsky, and I am sympathetic to your experience with him (he was an important mentor to me at Juilliard and beyond). I posted a new Facebook page for anybody who wants to share thoughts or materials on him called "In Memory of Paul Zukofsky", I hope you'll want to contribute to it.

