Idea Man


by Paul Allen


  Handling pervasive uncertainty and vagueness. Much of our knowledge is uncertain, vague, and approximate, yet we have a remarkable ability to draw conclusions and act. After listening to the weather forecast, a person who knows that it might rain can make contingency plans. People can read vague statements (“John is fairly tall”), approximations (“The human genome contains about 23,000 genes”), or statements with exceptions (“All birds can fly”) and still draw useful conclusions despite the imprecision. Classic techniques of statistics can already address these issues in selected domains, and may yet work in general, but progress has been slow. Project Halo has made some advances in this area, which remains an important focus of our research.
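  As a rough illustration of what reasoning with "statements with exceptions" can involve, the sketch below treats "All birds can fly" as a default that a more specific entry may override. It is not Project Halo's method. The tables, the penguin and sparrow entries, and the function name `lookup` are all hypothetical, chosen only to make the idea concrete.

```python
# Hypothetical toy knowledge base (not from the source text).
KINDS = {"penguin": "bird", "sparrow": "bird"}   # kind -> more general kind
DEFAULTS = {"bird": {"can_fly": True}}           # general rules of thumb
EXCEPTIONS = {"penguin": {"can_fly": False}}     # specific overrides


def lookup(kind, prop):
    """Return the most specific known value of prop for kind, or None."""
    current = kind
    while current is not None:
        # At each level, an explicit exception wins over a default.
        for table in (EXCEPTIONS, DEFAULTS):
            if prop in table.get(current, {}):
                return table[current][prop]
        # Nothing known at this level: climb to the more general kind.
        current = KINDS.get(current)
    return None


sparrow_flies = lookup("sparrow", "can_fly")   # inherited from the bird default
penguin_flies = lookup("penguin", "can_fly")   # overridden by the exception
rock_flies = lookup("rock", "can_fly")         # nothing known
```

  Even this small case hints at the difficulty: every exception has to be anticipated and written down by hand, which does not scale to knowledge in general.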

  II. DIFFICULTY TIER 2: RESEARCH THAT IS STILL PRELIMINARY AND EXPERIMENTAL

  Unstated and implicit knowledge in language. Human language is full of ambiguity and gaps in knowledge that a reader or listener must interpret correctly. Take, for example, the statement “A teaspoon of salt is dissolved in water.” Is it the teaspoon or the salt that is dissolved? Is the teaspoon made of salt? Humans use knowledge to instantly resolve such ambiguities, while machines struggle. If we read that “acids can cause some dyes to change color,” we immediately assume that the acid and dye must be in contact, although it’s not explicitly stated. To accurately understand statements like these, our brains make use of a rich interplay between textual and background knowledge. For a computer to have full language understanding, it needs to overcome this critical problem.

  Evolving knowledge. Acquiring new knowledge is not simply a matter of memorization. New knowledge always needs to be “fitted in” with existing knowledge in a way that is coherent. For example, if you learn a simple model of how cells divide, and then come across a more complex description, you recognize that you need to align the two, which modifies your original understanding. Perhaps what you originally thought of as a single process now needs to be revised and conceptualized as two linked ones. This process of maintaining, revising, and expanding existing knowledge is critical for large-scale systems like the Digital Aristotle. Simple, specialized techniques for doing this exist, but a fully automated solution seems decades away.

  Contradictions, fragility, and handling messy knowledge. While knowledge bases for small- and medium-scale artificial intelligence systems can be fully debugged, knowledge bases above a certain size inevitably become “messy” with errors, inconsistencies, gaps, and contradictions. As the volume of available data and knowledge grows, AI systems need both to debug artifacts effectively and to continue to reason in a robust, sensible way. This challenge becomes particularly significant in Web-scale systems, where sources of knowledge and data may be geographically, culturally, and temporally diverse. A variety of new techniques exist here, from fancy new logics to systems inspired by Web search technology, but they are still experimental. Project Halo is actively working on reasoning techniques that will handle several specific kinds of conflict and contradiction, but general solutions have been elusive.

  Commonsense reasoning. A vast amount of our understanding draws on general, commonsense knowledge and rules of thumb. For example, if you are told that “carbon dioxide is a raw material for photosynthesis,” you readily infer that carbon dioxide is used in photosynthesis, that it is required and also consumed. You can draw these inferences because you understand these general notions (“raw material,” “require,” “consume”) and the relationships between them. Commonsense knowledge provides great flexibility in human question-answering and reasoning, but correctly applying it in machines is a major challenge. A range of systems now exist, from those that attempt to use the Web to systems like Cyc (www.cyc.com), which are mostly human-authored. But while computers can demonstrate examples of commonsense reasoning, their ability to reliably acquire and use this type of knowledge at the scale required for a Digital Aristotle remains unproven. Project Halo is working to find solutions.
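  The carbon dioxide example can be sketched as a tiny rule-based inference. This is only a minimal illustration, not a description of how Cyc or Project Halo works. The relation names (`raw_material_for`, `used_in`, `required_by`, `consumed_by`) and the function `infer` are hypothetical.

```python
# Hypothetical rule table: a relation and the relations it implies.
IMPLIES = {
    "raw_material_for": ["used_in", "required_by", "consumed_by"],
}


def infer(facts):
    """Return the given facts plus everything the rules imply (forward chaining)."""
    known = set(facts)
    frontier = list(facts)
    while frontier:
        subject, relation, obj = frontier.pop()
        for implied in IMPLIES.get(relation, []):
            new_fact = (subject, implied, obj)
            if new_fact not in known:
                known.add(new_fact)
                frontier.append(new_fact)  # newly derived facts may trigger more rules
    return known


facts = {("carbon dioxide", "raw_material_for", "photosynthesis")}
derived = infer(facts)
```

  Here a single hand-written rule yields the three inferences named above. The hard part the passage points to is acquiring millions of such rules and applying them reliably, not running any one of them.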

  III. DIFFICULTY TIER 3: SOME OF THE TOUGHEST REMAINING CHALLENGES IN AI

  Applying knowledge in new contexts. Humans apply their knowledge in new contexts, constructing innovative and often novel ideas. For example, when a high school student designs an experiment to validate a chemical principle, she is capable of managing her existing knowledge about actions and objects to assemble it into a suitable sequence. We do the same thing when we imagine fictional situations, using what we know in new ways and applying it to new contexts. This ability to manipulate existing knowledge in complex and original ways remains a major challenge for computers. Very little exists in this area beyond preliminary research.

  Metaphor and analogy. When confronted with something new, people frequently draw on and adapt what they already know. For example, one biology text states, “Microtubules in the cell are like miniature springs.” The analogy prompts a reader to draw on existing knowledge to understand how microtubules expand and contract, yet avoid the conclusion that microtubules are likely made of metal. This skill requires identifying, mapping, and selectively adapting existing mental models to new tasks for which the model was never intended. Such a process remains almost impossible to automate. Very little exists in this area beyond preliminary research.

  For each of these types of knowledge representation (and several more), Project Halo is actively seeking solutions worldwide. If you have serious technical ideas in these areas, please contact us at [email protected].

  *Little did I know that the “machine” on the cover was in fact a hollow mock-up, subbed in at the last minute after the genuine Altair prototype was delayed in shipping by a Railway Express strike.

  *Between the paper tape era and the popularization of floppy disks, audiocassettes had a brief run in the midseventies as a leading storage device for microcomputers.

  *Scientific notation is a simplified way to handle very small or very large numbers using coefficients and exponents. For example: 83,700,000 = 8.37 × 10⁷; 0.0072 = 7.2 × 10⁻³.

  *The opening credits embedded in our BASIC were as follows: “Paul Allen wrote the non-runtime stuff. Bill Gates wrote the runtime stuff. Monte Davidoff wrote the math package.”

  *The theft of Altair BASIC foreshadowed the wholesale piracy of copyrighted material that plagues the entertainment industry today. Once a song or movie or piece of software was reduced to binary bits, it became easy to copy, even more so with the ascendance of the Internet.

  *Earlier that summer, when we were still licensing 86-DOS under a nonexclusive contract, Brock was approached by Eddie Currie on behalf of Lifeboat Associates. As Currie tells the story, he offered Brock $250,000 for any rights to 86-DOS that Microsoft didn’t control. Brock chose instead to stay with us. He didn’t want to antagonize Bill or lose his long-term, cut-rate access to our software.

  *As minicomputers were undercut by ever more powerful microprocessors, and by the PC in particular, DEC went into a fatal tailspin and was acquired by Compaq in 1998.

  *The public contribution would come from the interests who stood to benefit most, via an extended county hotel tax and increased parking and admission taxes at the stadium, along with new lottery games and a state sales tax credit that reflected the team’s economic value to Seattle.

 

 

 

