Monday, March 12, 2012

Google Glasses & The Stupidity of Computers

By the end of this year we will have Google glasses that act like a smartphone and look like Oakleys. They will also have a unique navigation system. "The navigation system currently used is a head tilting to scroll and click," [blogger Seth Weintraub] wrote this month. "We are told it is very quick to learn and once the user is adept at navigation, it becomes second nature and almost indistinguishable to outside users." ... The glasses will have a low-resolution built-in camera that will be able to monitor the world in real time and overlay information about locations, surrounding buildings and friends who might be nearby, according to the Google employees.

From n+1 Magazine:

Dumb

Computers are near-omnipotent cauldrons of processing power, but they’re also stupid. They are the undisputed chess champions of the world, but they can’t understand a simple English conversation. IBM’s Watson supercomputer defeated two top Jeopardy! players last year, but for the clue “What grasshoppers eat,” Watson answered: “Kosher.” For all the data he could access within a fraction of a second—one of the greatest corpuses ever assembled—Watson looked awfully dumb.

Hard Labor

Consider how difficult it is to get a computer to do anything. To take a simple example, let’s say we would like to ask a computer to find the most commonly occurring word on a web page, perhaps as a hint to what the page might be about. Here is an algorithm in pseudocode (more or less plain English describing the basic outline of a program), in which we have a table, WordCount, that contains the number of occurrences of each word on the page:

FOREACH CurrentWord IN WordsOnPage
    WordCount[CurrentWord] = WordCount[CurrentWord] + 1
MostFrequentlyOccurringWord = “????”
FOREACH CurrentWord IN WordCount.keys()
    IF (WordCount[CurrentWord] > WordCount[MostFrequentlyOccurringWord])
        MostFrequentlyOccurringWord = CurrentWord
PRINT MostFrequentlyOccurringWord, WordCount[MostFrequentlyOccurringWord]

The first FOREACH loop counts the number of times each word occurs on the page. The second FOREACH loop goes through that list of unique words, looking for the one that has the largest count. After determining the most commonly occurring word, it prints the word and the number of occurrences.
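The two loops above translate almost line for line into Python. This is a sketch, not the essay's code; the tokenization rule (lowercasing and splitting on runs of letters) is our own assumption, since the pseudocode leaves it unspecified:

```python
import re
from collections import Counter

def most_frequent_word(page_text):
    """Return the most common word on a page and its count,
    mirroring the two-loop pseudocode above."""
    # First loop: count each word (lowercased, split on non-letters).
    words = re.findall(r"[a-z]+", page_text.lower())
    word_count = Counter(words)
    # Second loop: scan the table for the largest count.
    best_word, best_count = None, 0
    for word, count in word_count.items():
        if count > best_count:
            best_word, best_count = word, count
    return best_word, best_count

print(most_frequent_word("the cat sat on the mat because the cat was tired"))
# → ('the', 3)
```

As the essay notes next, the winner is almost always a word like "the."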

The result, however, will probably be a word like the. In fact, the least commonly occurring words on a page are frequently more interesting: words like myxomatosis or hermeneutics. To be more precise, what you really want to know is what uncommon words appear on this page more commonly than they do on other pages. The uncommon words are more likely to tell you what the page is about. So it becomes necessary to track and accumulate this sort of data on vast scales.
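The refinement described here, favoring words common on this page but rare across pages, is essentially what information retrieval calls tf-idf weighting. A toy sketch of the idea (the three-document "corpus" is invented for illustration, and real systems use more refined weighting formulas):

```python
import math
import re
from collections import Counter

def tf_idf_scores(page, corpus):
    """Score each word on `page` by
    (count on this page) * log(total pages / pages containing the word)."""
    tokenize = lambda text: re.findall(r"[a-z]+", text.lower())
    page_counts = Counter(tokenize(page))
    n_docs = len(corpus)
    scores = {}
    for word, count in page_counts.items():
        docs_with_word = sum(1 for doc in corpus if word in tokenize(doc))
        scores[word] = count * math.log(n_docs / docs_with_word)
    return scores

corpus = [
    "the rabbits died of myxomatosis",
    "the shop sells the shoes",
    "the train was late",
]
scores = tf_idf_scores(corpus[0], corpus)
# "the" appears in every document, so log(3/3) = 0 cancels it out;
# "myxomatosis" appears only on this page, so it scores highest.
```

Tracking `docs_with_word` across the whole web is exactly the "vast scales" problem the essay points to.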

Any instructions to a computer must be as laboriously precise as these. Some of the towering achievements in computer science have been in the creation of brilliantly clever, efficient, and useful algorithms such as Quicksort, Huffman Compression, the Fast Fourier Transform, and the Monte Carlo method, all reasonably simple (but not obvious) methods of accomplishing precisely specified tasks on potentially huge amounts of precisely specified data. Alongside such computational challenges there has been the dream of artificial intelligence: to get computers to think.

Search

Almost two decades on, it’s easy to forget the mess that resulted when you tried to use the early search engines—Lycos, AltaVista, Northern Light. In addition to spotty coverage of the nascent web, none of them had any particularly skillful way of ordering their results. Yahoo, which presented users with a search box, was in fact not a search engine at all, but a manually updated, far-from-universal directory of web pages. Either way, users frequently had to go through pages and pages of results, often with completely incoherent descriptions or summaries, in order to find something even resembling the desired information. As spam pages increased on the web, this problem grew worse. But how was a computer to know what a user was looking for, just on the basis of a list of search words?

One early search engine attempted to mask its inadequacy by achieving a semantic understanding of the queries being entered in its search boxes. Ask Jeeves (now known simply as Ask) encouraged users to type actual questions rather than keywords: “Where can I buy shoes?” rather than “shoe shops.” Its front page featured a tuxedoed English butler, the eponymous Jeeves, who was unfortunately more like Bertie Wooster.

Jeeves did not handle the queries very well. Where other search engines searched for particular words on a page, the more of them the better, Jeeves interpreted questions heuristically, looking for interrogative keywords such as "where," "who," and "how," then transforming these into a more standard query, possibly with some special handling. "How old is President Clinton?" could become "President Clinton's age" as a query, after isolating "How old" and applying a hard-coded rule that equated the answer to that question with a plausible number, say between 0 and 100. Jeeves then searched its archive of the internet for appearances of "President Clinton's age" or instances of "President Clinton" near a two-digit number (it certainly didn't know that "President Clinton" was equivalent to "Bill Clinton"). If it found a few websites with the phrase "President Clinton's age is 52" or "President Clinton is 52 years old" or even "President Clinton is 52," it could be reasonably confident that this was correct. But it only knew to look for those patterns because humans had hard-coded them. Other cases required similar handcrafted code. It was a cumbersome process and it didn't "scale." The particular case of asking for a person's age was covered, but anything more complicated—"How old is the New York subway system?" or "How old is primogeniture?" much less "How old is 12 in dog years?"—would confuse poor Jeeves.
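The kind of hard-coded pattern the essay describes can be mimicked in a few lines. This is our own illustrative sketch, not Jeeves's actual code; the archive and patterns are invented:

```python
import re

# A stand-in for the engine's crawled archive of the web.
ARCHIVE = [
    "President Clinton is 52 years old and enjoys golf.",
    "The weather today is mild.",
]

def answer_how_old(question, archive):
    """Toy Jeeves: rewrite 'How old is X?' into a search pattern,
    then accept any plausible age found next to X."""
    match = re.match(r"How old is (.+)\?", question)
    if not match:
        return None  # anything outside the hard-coded case confuses us
    subject = re.escape(match.group(1))
    for page in archive:
        hit = re.search(subject + r"(?:'s age)? is (\d{1,3})", page)
        if hit and 0 <= int(hit.group(1)) <= 100:
            return int(hit.group(1))
    return None

print(answer_how_old("How old is President Clinton?", ARCHIVE))  # → 52
```

Note how the rule breaks immediately on the essay's counterexamples: "How old is primogeniture?" matches the question pattern but finds no "primogeniture is NN" in the archive, and nothing outside the "How old is…" template is handled at all.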

Eliza and Shrdlu

Jeeves wasn’t exactly faking it—he really did try to “understand” the question, but his ability to do so was shockingly narrow. To computer programmers, he was just an updated version of Joseph Weizenbaum’s ELIZA, a 1966 program that parodically simulated a Rogerian therapist, mostly by rephrasing its interlocutor’s statements as questions. For example:
How are you today? What would you like to discuss?

>> I’m having a nervous breakdown.
Is it because you are having a nervous breakdown that you came to me?
>> Yes.
You seem to be quite positive.
>> My father hates me.
Tell me more about your family.
>> I’m a mess.
Did you come to me because you are a mess?
>> No.
You are being a bit negative.

ELIZA could recognize keywords like yes, no, and father to deliver particular responses. But that was about as far as it went.
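ELIZA's keyword trick fits in a dozen lines. This sketch paraphrases the responses from the transcript above; it is not Weizenbaum's actual script, whose rules and reassembly patterns were more elaborate:

```python
import re

# Ordered keyword rules: first match wins. A response of None marks
# the echo rule, which rephrases the user's statement as a question.
RULES = [
    (r"\byes\b", "You seem to be quite positive."),
    (r"\bno\b", "You are being a bit negative."),
    (r"\b(father|mother|family)\b", "Tell me more about your family."),
    (r"I'?m (.+)", None),
]

def eliza_reply(statement):
    """Return a canned reply keyed off the first matching keyword."""
    for pattern, response in RULES:
        match = re.search(pattern, statement, re.IGNORECASE)
        if match:
            if response is None:
                rest = match.group(1).rstrip(".")
                return f"Did you come to me because you are {rest}?"
            return response
    return "Please go on."  # stock fallback when no keyword matches

print(eliza_reply("My father hates me."))  # → Tell me more about your family.
print(eliza_reply("I'm a mess."))  # → Did you come to me because you are a mess?
```

The program never represents what "father" or "mess" means; it only pattern-matches, which is exactly the essay's point.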

A few years later came SHRDLU, developed at MIT by the artificial intelligence researcher Terry Winograd. SHRDLU was a genuine computing achievement. It allowed for natural language interaction with a computer under narrowly circumscribed conditions. Differently shaped blocks were displayed on a screen, and the computer could manipulate these in response to typed instructions from a person:

person: Pick up a big red block.
computer: OK. [does it]
person: Grasp the pyramid.
computer: I don’t understand which pyramid you mean.
person: Find a block which is taller than the one you are holding and put it into the box.
computer: By “it,” I assume you mean the block which is taller than the one I am holding.
computer: OK. [does it]
person: What does the box contain?
computer: The blue pyramid and the blue box.
person: What is the pyramid supported by?
computer: The box.
person: How many blocks are not in the box?
computer: Four of them.
person: Is at least one of them narrower than the one which I told you to pick up?
computer: Yes, the red cube.
person: Is it supported?
computer: Yes, by the table.
person: Can the table pick up blocks?
computer: No.
person: Can a pyramid be supported by a block?
computer: Yes.
[…]
person: A “steeple” is a stack which contains two green cubes and a pyramid.
computer: I understand.
person: Are there any steeples now?
computer: No.
person: Build one.
computer: OK. [does it]
person: Call the biggest block “superblock.”
computer: OK.
person: Have you picked superblock up since we began?
computer: Yes.
person: Why did you drop it?
computer: By “it,” I assume you mean the superblock. I can’t explain a non-existent event.

Every human sentence had to be transformed into a logical syntax that represented its meaning. The sentence “Pick up a big red block” was represented as the following:

(((PICK UP A BIG RED BLOCK)
    (CLAUSE MAJOR TOPLEVEL IMPER ACTV PRT TRANS)
    (RSS1)
    (((PICK) (VG IMPER) NIL ((PICK (VPRT VB INF TRANS MVB))))
     (UP (PRT))
    ((A BIG RED BLOCK)
     (NG OBJ OBJ1 DET NS INDEF)
      (OSS4)
     ((A (DET NS INDEF)) (BIG (ADJ)) (RED (ADJ)) (BLOCK (NOUN NS))))))

 NIL)

Such programs seemed to hold promise for more generalized communication, but the secret of SHRDLU’s success lay precisely in the very narrow domain it worked within. It could understand English, but only as long as you were talking about colored blocks and the movements that could be performed on them.
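That trade-off, full understanding inside a tiny world, nothing outside it, can be illustrated with a toy parser. The grammar here is our own invention and bears no resemblance to Winograd's actual system, which built full parse trees like the one shown above:

```python
import re

# The entire vocabulary of our "world": step outside it and parsing fails.
SIZES = {"big", "small"}
COLORS = {"red", "green", "blue"}

def parse_command(sentence):
    """Parse 'pick up a <size> <color> block' into a structured action,
    refusing anything outside the tiny grammar."""
    match = re.match(r"pick up a (\w+) (\w+) block", sentence.lower())
    if not match or match.group(1) not in SIZES or match.group(2) not in COLORS:
        return None  # outside the blocks world, we understand nothing
    return {"action": "PICKUP", "size": match.group(1), "color": match.group(2)}

print(parse_command("Pick up a big red block."))
# → {'action': 'PICKUP', 'size': 'big', 'color': 'red'}
print(parse_command("Pick up the telephone."))  # → None
```

Within the fixed vocabulary the parse is trivially reliable; the moment the domain widens, the same approach collapses, which is why SHRDLU's success never generalized.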

Alan Turing, the analytical genius who broke the German ENIGMA code during World War II and formulated some of the fundamental principles of computer science, famously proposed a “test” for whether a computer was intelligent: could it, in text-only conversation, convince a person that it was human? Turing predicted in 1950 that a computer would have at least 128 megabytes of memory and be able to pass his test with reasonable frequency by the year 2000. He died in 1954, and although personal computers hit that memory mark ten years ago, no computer has come close to passing his test.

The Gonz

Still, computers could score an occasional victory by cheating. In 1989, the computer scientist Mark Humphrys wrote an absurdly primitive chatbot, MGonz, which not only convinced another user (at Drake University, reached in an early online chat room) that it was human, but browbeat this confession out of him:

mgonz: when was the last time you had sex
someone at drake: yesterday.
mgonz: ok honestly when was the last time you got laid
someone at drake: i told you it was yesterday.
mgonz: you are obviously an asshole
mgonz: ok honestly when was the last time you got laid
someone at drake: ok ok it was over 24 hours ago. for you it must have been 20 years

MGonz had no understanding of what the other user was saying, but succeeded, as Humphrys explained, through “profanity, relentless aggression, prurient queries about the user, and implying he was a liar when he made responses to these.”
