Accidental Blogger

A general interest blog

Are scientists on the path to deciphering the Indus Valley script?  The story here and here.

An ancient script that's defied generations of archaeologists has yielded some of its secrets to artificially intelligent computers.

Computational analysis of symbols used 4,000 years ago by a long-lost Indus Valley civilization suggests they represent a spoken language. Some frustrated linguists thought the symbols were merely pretty pictures.

"The underlying grammatical structure seems similar to what's found in many languages," said University of Washington computer scientist Rajesh Rao.

The Indus script, used between 2600 and 1900 B.C. in what is now eastern Pakistan and northwest India, belonged to a civilization as sophisticated as its Mesopotamian and Egyptian contemporaries. However, it left fewer linguistic remains. Archaeologists have uncovered about 1,500 unique inscriptions from fragments of pottery, tablets and seals. The longest inscription is just 27 signs long.

In 1877, British archaeologist Alexander Cunningham hypothesized that the Indus script was a forerunner of modern-day Brahmic scripts, used from Central to Southeast Asia. Other researchers disagreed. Fueled by scores of competing and ultimately unsuccessful attempts to decipher the script, that contentious state of affairs has persisted to the present.

Among the languages linked to the mysterious script are Chinese Lolo, Sumerian, Egyptian, Dravidian, Indo-Aryan, Old Slavic, even Easter Island — and, finally, no language at all. In 2004, linguist Steve Farmer published a paper asserting that the Indus script was nothing more than political and religious symbols. It was a controversial notion, but not an unpopular one.


Rao, a machine learning specialist who read about the Indus script in high school and decided to apply his expertise to the script while on sabbatical in India, may have solved the language-versus-symbol question, if not the script itself.

"One of the main questions in machine learning is how to generalize rules from a limited amount of data," said Rao. "Even though we can't read it, we can look at the patterns and get the underlying grammatical structure."

Rao's team used pattern-analyzing software running what's known as a Markov model, a computational tool used to map system dynamics.

They fed the program sequences of four spoken languages: ancient Sumerian, Sanskrit and Old Tamil, as well as modern English. Then they gave it samples of four non-spoken communication systems: human DNA, Fortran, bacterial protein sequences and an artificial language.

The program calculated the level of order present in each language. Non-spoken languages were either highly ordered, with symbols and structures following each other in unvarying ways, or utterly chaotic. Spoken languages fell in the middle.

When they seeded the program with fragments of Indus script, it returned with grammatical rules based on patterns of symbol arrangement. These proved to be moderately ordered, just like spoken languages.
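For readers wondering what "level of order" might mean in practice, here is a rough sketch. It assumes the measure is the conditional entropy of a first-order Markov (bigram) model, which is one common choice; the article does not say exactly what the team computed, and the function name and toy strings are made up for illustration.

```python
from collections import Counter
from math import log2


def conditional_entropy(sequences):
    """Estimate, in bits, how unpredictable the next symbol is given the
    previous one, using bigram counts pooled over all sequences.
    (Hypothetical helper; an assumption about the measure, not Rao's code.)"""
    pair_counts = Counter()   # how often each (previous, next) pair occurs
    prev_counts = Counter()   # how often each symbol occurs as "previous"
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1

    total = sum(pair_counts.values())
    if total == 0:
        return 0.0  # no bigrams at all, so nothing to measure

    entropy = 0.0
    for (prev, nxt), n in pair_counts.items():
        p_pair = n / total                      # P(prev, next)
        p_next_given_prev = n / prev_counts[prev]  # P(next | prev)
        entropy -= p_pair * log2(p_next_given_prev)
    return entropy


# Toy two-symbol "texts", invented purely for illustration.
rigid = conditional_entropy(["ABABABAB"])  # next symbol fully determined
mixed = conditional_entropy(["AABAB"])     # some structure, some freedom
loose = conditional_entropy(["AABBA"])     # every pair equally likely
```

On these toy inputs the rigid sequence scores lowest, the loose one highest, and the mixed one falls in between, which is the kind of middle ground the article says spoken languages occupy.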

As for the meaning of the script, the program remained silent.

"It's a useful paper," said University of Helsinki archaeologist Asko Parpola, an authority on Indus scripts, "but it doesn't really further our understanding of the script."

Parpola said the primary obstacle confronting decipherers of fragmentary Indus scripts — the difficulty of testing their hypotheses — remains unchanged.

But according to Rao, this early analysis provides a foundation for a more comprehensive understanding of Indus script grammar, and ultimately its meaning.

"The next step is to create a grammar from the data that we have," he said. "Then we can ask, is this grammar similar to those of the Sanskrit or Indo-European or Dravidian languages? This will give us a language to compare it to."

"It's only recently that archaeologists have started to apply computational approaches in a rigid manner," said Rao. "The time is ripe."

13 responses to “Cracking a 4000 year old linguistic mystery?”

  1. A little background first. After decades of failed attempts to decipher the Indus Valley script, a significant new theory was proposed in 2004 by Farmer, Sproat, and Witzel (FSW), which said that the Indus Valley symbols do not represent a linguistic script at all. This goaded nationalistic Indians who champion the “out-of-India” theory and other academics who have a lot invested in it being a script. Rao’s claim is another attempt to refute FSW, to which FSW have provided a response, to my mind quite convincing.
    The article above is just plain silly and full of lies. I can’t take seriously any writer who opens with a sophomoric sentence like this (not to mention the stupid title):

    An ancient script that’s defied generations of archaeologists has yielded some of its secrets to artificially intelligent computers.

    The last I checked there were no artificially intelligent computers; the field of AI is moribund. He then adds, “Some frustrated linguists thought the symbols were merely pretty pictures.” What? “Frustrated” or not, this is not the contention of FSW at all. Further down, he unilaterally embellishes the claims of Rao’s research: “[Rao] may have solved the language-versus-symbol question, if not the script itself.” Solved the script? This is bizarre, since he then quotes Parpola, an Indus script researcher, as saying, “it doesn’t really further our understanding of the script.” If you ask me, Brandon Keim is one amateurish science writer, utterly out of his depth here.

  2. Dean C. Rowan

    I share Namit’s skepticism and derision for the “Golly, gee!” tone of the story. It’s typically bathetic popular science reporting. But I differ on a couple points. First, I kind of liked the phrase “artificially intelligent”; somehow, the adverb puts the emphasis where it has always belonged, namely, on the artifice, not on the intelligence. Second, I don’t think the discipline of AI is moribund. Rather, the hype has petered out. Only wackos like Ray Kurzweil persist in depicting it in sci-fi tones. Good for you, Ray. Enjoy the next Star Trek convention! I recall seeing a story recently regarding this very point, but I can’t find it. (So much for Google.) It reported something to the effect that AI was being pursued along much narrower avenues than before, without all of the grand theorizing over Turing effects and such.

  3. I too found the article quite vague. What is the claim here? That the symbols “may be” a language and not just pretty pictures? Where does that leave us vis-à-vis decoding the actual language riddle? As usual, another bombastic claim based on a trivial observation. It will be interesting though if one day someone does figure out what was imprinted on those Harappan seals.

  4. Dean,
    I’d say that the Turing test was quite central to the vision of AI, i.e., mimicking full human intelligence. Abandoning it is a significant departure and a defeat for AI, and it is this original vision — not just by fringe AI folks but by the MIT AI lab, Marvin Minsky, et al — that is moribund. We now speak of “expert systems” that process predetermined inputs/stimuli, do pattern matching and database lookups, and adapt their outputs algorithmically or via “training”. Examples include chess software, search engines, speech recognition, industrial and service robots, and traffic and weather forecasting systems. However, even rudimentary intelligence—as in adapting to entirely new external variables, deciding which new inputs are relevant (and how) and which not, establishing new relationships as a human child does—is nowhere evident. A good guide to the history of AI and the current state of affairs is—you may know him—Berkeley professor Hubert Dreyfus (he has a terrific book on this, and an online article).

  5. Dean C. Rowan

    Namit,
    I agree, and I think you’ve captured what I meant by “Turing effect.” I’ve never exactly figured out the attraction of so silly a heuristic as the Turing test, anyway. That’s because I haven’t explored it to any real extent, but also because it just seems patently ridiculous, yet another instance of “What if George Washington had been born a horse?” I am mildly familiar with Dreyfus (he’s here on campus, but I don’t know him) and his work regarding AI, not to mention other topics, but I suppose I should read what appears to be a fascinating article to which you’ve referred us. I recall reading a lengthy piece in the New Yorker decades ago, a profile of Minsky that I at first found intriguing. Perhaps a decade on, however, I realized how empty were his ambitions. Expert system research has been around for some time, hasn’t it? I encountered it in library school, I kid you not, during the late ’80s. Imagine expert reference systems, for which the input is your typical reference question and subsequent responses to prompts for clarifications or proffered choices. Nifty. So was Kubrick’s 2001. So what?

  6. Dean C. Rowan

    Namit,
    This Dreyfus article on Heideggerian AI is a marvel, not only an interesting account of the evolution of AI, but a useful inroad to Heidegger. Thank you for the tip!

  7. Dean,
    I’m very glad to hear. Dreyfus is an outstanding guide to Heidegger (as is Bill Blattner). You might also like this conversation with Dreyfus, titled “Meaning, Relevance, and the Limits of Technology”. Additional resources are on a short blog post I wrote a while back, including an 80s conversation with Bryan Magee.

  8. Typepad swallowed my embedded links in the previous post (will they reappear later?). So here they are in full; not pretty looking but hopefully useful.
    -Bill Blattner’s book: http://www.amazon.com/Heideggers-Being-Time-Readers-Guides/dp/0826486096
    -Dreyfus conversation: http://www.youtube.com/watch?v=-CHgt2Szk-I
    -My blog post: http://blog.shunya.net/shunyas_blog/2009/02/dreyfus-on-heidegger.html

  9. banerjee

    Farmer and Witzel’s attribution of unholy motives to Indian authors in general, and engineers dabbling in the social sciences in particular, turns me off. So here’s a less polemical analysis of the technical problems with the Science paper (via Cosma Shalizi)
    Conditional entropy and the Indus Script

  10. banerjee

    Looks like the link was eaten. So here it is again

    Case Closed

  11. Dean, I think you will find interesting this new article I wrote for 3QD: “The Dearth of Artificial Intelligence”.

  12. There is much evidence that before our recorded history other beings were here, some giant, some average, some small. The fact is that the evidence is there; too bad governments are trying to withhold the information from humanity.

  13. Sarkany

    VERY incomplete analysis.
    And the actual method seems dubious. They “fed” it samples of 2 spoken languages that no one alive can replicate precisely (does anyone have any Sumerian or Old Tamil recordings?), and Sanskrit and Modern English are too related. Seems to me that they should have opted for 3x that number of samples of living or at least properly recorded dead languages, all from different families if possible; preferably isolates (Basque, Korean, Khoi, Navajo, Tamil, Hmong, Gaelic, for example).
    The non-spoken communication listing was WTH-worthy, too. Fortran IS an artificial language! I’m not sure if it’s possible, but a natural sign language would be optimal to use here, as would varied musical sequences (not just Western 8-tone scale based music). I’d suggest animal communications, too, but we’re still in toddler stage per that discipline.
