Chapter 18. DAPHNE KOLLER
Stopping progress by stopping technology is the wrong approach. [...] If you don’t make progress technologically, someone else will, and their intent might be considerably less beneficial than yours.
CEO AND FOUNDER, INSITRO
ADJUNCT PROFESSOR OF COMPUTER SCIENCE, STANFORD
Daphne Koller was the Rajeev Motwani Professor of Computer Science at Stanford University (where she is currently an Adjunct Professor) and is one of the founders of Coursera. She focuses on the potential benefits of AI in healthcare and previously worked as the Chief Computing Officer at Calico, an Alphabet subsidiary researching longevity. She is currently the Founder and CEO of insitro, a biotech startup using machine learning to research and develop new drugs.
MARTIN FORD: You’ve just started a new role as CEO and founder of insitro, a startup company focused on using machine learning for drug discovery. Could you tell me more about that?
DAPHNE KOLLER: We need a new solution if we are to keep driving progress in drug research forward. The problem is that developing new drugs is becoming steadily more challenging: clinical trial success rates are in the mid-single-digit percentage range, and the pre-tax R&D cost to develop a new drug (once failures are factored in) is estimated to be greater than $2.5B. The rate of return on drug development investment has been decreasing linearly year by year, and some analyses estimate that it will hit zero before 2020. One explanation for this is that drug development is now intrinsically harder: many (perhaps most) of the “low-hanging fruit”—in other words, druggable targets that have a significant effect on a large population—have already been discovered. If so, then the next phase of drug development will need to focus on drugs that are more specialized—drugs whose effects may be context-specific, and which apply only to a subset of patients. Figuring out the appropriate patient population is often hard, which makes therapeutic development more challenging and leaves many diseases without effective treatment and many patients with unmet needs. The reduced market size also forces high development costs to be amortized over a much smaller base.
Our hope at insitro is that big data and machine learning, applied to drug discovery, can help make the process faster, cheaper, and more successful. To do that, we plan to leverage both cutting-edge machine learning techniques and the latest innovations in the life sciences, which enable the creation of the large, high-quality data sets that may transform the capabilities of machine learning in this space. Seventeen years ago, when I first started to work in the area of machine learning for biology and health, a “large” dataset was a few dozen samples. Even five years ago, data sets with more than a few hundred samples were a rare exception. We now live in a different world. We have human cohort data sets (such as the UK Biobank) that contain large amounts of high-quality measurements—molecular as well as clinical—for hundreds of thousands of individuals. At the same time, a constellation of remarkable technologies allows us to construct, perturb, and observe biological model systems in the laboratory with unprecedented fidelity and throughput. Using these innovations, we plan to collect and use a range of very large data sets to train machine learning models that will help address key problems in the drug discovery and development process.
MARTIN FORD: It sounds like insitro is planning to do both wet-lab experimental work and high-end machine learning. These are not often done within a single company. Does that integration pose new challenges?
DAPHNE KOLLER: Absolutely. I think the biggest challenge is actually cultural, in getting scientists and data scientists to work together as equal partners. In many companies, one group sets the direction, and the other takes a back seat. At insitro, we really need to build a culture in which scientists, engineers, and data scientists work closely together to define problems, design experiments, analyze data, and derive insights that will lead us to new therapeutics. We believe that building this team and this culture well is as important to the success of our mission as the quality of the science or the machine learning that these different groups will create.
MARTIN FORD: How important is machine learning in the healthcare space?
DAPHNE KOLLER: When you look at the places where machine learning has made a difference, it’s really been where we have an accumulation of large amounts of data and people who can think simultaneously about the problem domain and about how machine learning can address it.
You can now get large amounts of data from resources like the UK Biobank or All of Us, which gather a lot of information about people and enable you to start thinking about the health trajectories of actual humans. On the other side, we have amazing technologies like CRISPR, DNA synthesis, next-generation sequencing, and all sorts of other things, all coming together at the same time to make it possible to create large datasets at the molecular level.
We are now in the position where we can begin to deconvolute what is to my mind the most complex system that we’ve seen: the biology of humans and other organisms. That is an unbelievable opportunity for science, and is going to require major developments on the machine learning side to figure out and create the kinds of interventions that we need to live longer, healthier lives.
MARTIN FORD: Let’s talk about your own life; how did you get started in AI?
DAPHNE KOLLER: I was a PhD student at Stanford working in the area of probabilistic modeling. Nowadays it would look like AI, but it wasn’t really known as artificial intelligence back then; in fact, probabilistic modeling was considered anathema to artificial intelligence, which was much more focused on logical reasoning at the time. Things changed, though, and AI expanded into a lot of other disciplines. In some ways, the field of AI grew to embrace my work rather than me choosing to go into AI.
I went to Berkeley as a postdoc, and there I started to really think about how what I was doing was relevant to actual problems that people cared about, as opposed to just being mathematically elegant. That was the first time I started to get into machine learning. I then returned to Stanford as faculty in 1995 where I started to work on areas relating to statistical modeling and machine learning. I began studying applied problems where machine learning could really make a difference.
I worked in computer vision, in robotics, and, starting in 2000, on biology and health data. I also had an ongoing interest in technology-enabled education, which led to a lot of experimentation at Stanford into ways we could offer an enhanced learning experience, not only for students on campus but also for people who didn’t have access to a Stanford education.
That whole process led to the launch of the first three Stanford MOOCs (Massive Open Online Courses) in 2011. That was a surprise to all of us, because we didn’t really try to market them in any concerted way; it was much more a viral spread of information about the free courses Stanford was offering. The response was unbelievable: each of those courses had an enrollment of 100,000 people or more. That really was the turning point of, “we need to do something to deliver on the promise of this opportunity,” and that’s what led to the creation of Coursera.
MARTIN FORD: Before we jump into that, I want to talk more about your research. You focused on Bayesian networks and integrating probability into machine learning. Is that something that can be integrated with deep learning neural networks, or is it a totally separate or competing approach?
DAPHNE KOLLER: The answer is subtle and has several aspects. Probabilistic models lie on a continuum between those that try to encode the domain structure in an interpretable way—a way that makes sense to humans—and those that just try to capture the statistical properties of the data. Deep learning models intersect with probabilistic models—some can be viewed as encoding a distribution. Most of them have elected to focus on maximizing the predictive accuracy of the model, often at the expense of interpretability. Interpretability and the ability to incorporate structure in the domain have a lot of advantages in cases where you really need to understand what the model does, for instance in medical applications. They are also a way of dealing effectively with scenarios where you don’t have a lot of training data and need to make up for it with prior knowledge. On the other hand, the ability to bring no prior knowledge and just let the data speak for themselves also has a lot of advantages. It’d be nice if you could merge the two somehow.
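To make the interpretable end of that continuum concrete, here is a minimal sketch of a two-node Bayesian network in Python. The structure (a disease influences a test result) and the conditional probability tables are readable by a person; the variables and numbers are invented purely for illustration, not taken from the interview.

```python
# A toy Bayesian network: Disease -> Test. Its parameters are
# human-readable conditional probability tables (all values invented).

# Prior: P(disease)
p_disease = {True: 0.01, False: 0.99}

# Likelihood: P(test positive | disease)
p_pos_given = {True: 0.95, False: 0.05}  # sensitivity / false-positive rate

def posterior_disease_given_positive():
    """Infer P(disease | positive test) by enumerating the joint distribution."""
    joint_pos = {d: p_disease[d] * p_pos_given[d] for d in (True, False)}
    evidence = sum(joint_pos.values())  # P(test positive)
    return joint_pos[True] / evidence   # Bayes' rule

print(f"P(disease | positive test) = {posterior_disease_given_positive():.3f}")
```

A deep network trained on the same prediction task might be more accurate at scale, but it would not expose its reasoning in this inspectable, tabular form, which is the trade-off Koller describes.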
MARTIN FORD: Let’s talk about Coursera. Was it a case of seeing the online classes that you and others taught at Stanford do really well, and deciding to start a company to continue that work?
DAPHNE KOLLER: We struggled to figure out the right way to take the next steps. Was it continuing this Stanford effort? Was it launching a nonprofit organization? Was it creating a company? We thought about it a fair bit and decided that creating a company was the right way to maximize the impact that we could have. So, in January of 2012, we started the company that is now called Coursera.
MARTIN FORD: Initially, there was enormous hype about MOOCs and the idea that people all over the world were going to get a Stanford education on their phones. It seems to have evolved more along the lines of people who already have a college degree going to Coursera to get extra credentials. It hasn’t disrupted undergraduate education in the way that some people predicted. Do you see that changing, going forward?
DAPHNE KOLLER: I think it’s important to recognize that we never said this was going to put universities out of business. There were other people who said that, but we never endorsed it, and we didn’t think it was a good idea. In some ways, the typical Gartner hype cycle of MOOCs was compressed. People made these extreme comments: in 2012 it was, “MOOCs are going to put universities out of business,” and then 12 months later it was, “universities are still here, so obviously MOOCs have failed.” Both of those comments are ridiculous extremes of the hype cycle.
I think that we actually have done a lot for people who don’t normally have access to that level of education. About 25% of Coursera learners don’t have degrees, and about 40% of Coursera learners are in developing economies. If you look at the percentage of learners who say that their lives were significantly transformed by access to this experience, it is disproportionately those people with low socioeconomic status or from developing economies who report that level of benefit.
The benefit is there, but you’re right that the large majority are the ones who have access to the internet and are aware that this possibility exists. I hope that over time, there is the ability to increase awareness and internet access so that larger numbers of people can get the benefit of these courses.
MARTIN FORD: There is a saying that we tend to overestimate what happens in the short term and underestimate what happens in the long term. This sounds like a classic case of that.
DAPHNE KOLLER: I think that’s exactly right. People thought we were going to transform higher education in two years. Universities have been around for 500 years, and evolve slowly. I do think, however, that even in the five years that we’ve been around there has been a fair amount of movement.
For instance, a lot of universities now have very robust online offerings, often at a considerably lower cost than on-campus courses. When we started, the very notion that a top university would have an online program of any kind was unheard of. Now, digital learning is embedded into the fabric of many top universities.
MARTIN FORD: I don’t think Stanford is going to be disrupted over the next 10 years or so, but an education at the 3,000 or so less selective (and less well-known) colleges in the US is still very expensive. If an inexpensive and effective learning platform arose that gave you access to Stanford professors, then you begin to wonder why someone would enroll in a much less prestigious college when they could go to Stanford online.
DAPHNE KOLLER: I agree. I think that transformation is going to come first in the graduate education space, specifically professional master’s degrees. There’s still an important social component to the undergraduate experience: that’s where you go to make new friends, move away from home, and possibly meet your life partner. For graduate education, however, it’s usually employed adults with commitments: a job, a spouse, and a family. For most of them, to move and do a full-time college experience is actually a negative, and so that’s where we’ll see the transformation happen first.
Down the line, I think we might see people at those smaller colleges begin to wonder whether that’s the best use of their time and money, especially those who are part-time students because they need to work for a living while they do their undergraduate degrees. I think that’s where we’re going to see an interesting transformation in a decade or so.
MARTIN FORD: How might the technology evolve? If you have huge numbers of people taking these courses, then that generates lots and lots of data. I assume that data is something that can be leveraged by machine learning and artificial intelligence. How do you see those technologies integrated into these courses in the future? Are they going to become more dynamic, more personalized, and so forth?
DAPHNE KOLLER: I think that’s exactly right. When we started Coursera, the technology limited how much we could innovate on pedagogy; it was mostly taking what was already present in standard teaching and modularizing it. We made courses more interactive, with exercises embedded in the course material, but it wasn’t a distinctly different experience. As more data is gathered and learning becomes more sophisticated, you will certainly see more personalization. I believe you will see something that looks more like a personalized tutor who keeps you motivated and helps you over the hard bits. None of these things is that difficult to do with the amount of data we have available now. That data wasn’t available when we started Coursera, when we just needed to get the platform off the ground.
MARTIN FORD: There’s enormous hype focused on deep learning at the moment and people could easily get the impression that all of artificial intelligence is nothing but deep learning. However, there have recently been suggestions that progress in deep learning may soon “hit a wall” and that it will need to be replaced with another approach. How do you feel about that?
DAPHNE KOLLER: It’s not about one silver bullet, but I don’t think deep learning needs to be thrown out. It was a very significant step forward, but is it the thing that’s going to get us to full, human-level AI? I think there is at least one big leap, probably more, that will need to occur before we get to human-level intelligence.
Partly, it has to do with end-to-end training, where you optimize the entire network for one particular task. It becomes really good at that task, but if you change the task, you have to train the network differently. In many cases, the entire architecture has to be different. Right now, we’re focused on really deep and narrow vertical tasks. Those are exceedingly difficult tasks and we’re making significant progress with them, but each vertical task doesn’t translate to the one next to it. The thing that makes humans really special is that they’re able to perform many of these tasks using the same “software,” if you will. I don’t think we’re quite there with AI.
The other place where we’re not quite there in terms of general intelligence is that the amount of data required to train one of these models is very, very large. A couple of hundred samples are usually not enough. Humans are really good at learning from very small amounts of data. I think it’s because there’s one architecture in our brain that serves all of the tasks we have to deal with, and we’re really good at transferring general skills from one task to another. For example, it probably takes five minutes to explain how to use a dishwasher to someone who’s never used one before. For a robot, it’s not going to be anywhere close to that. That’s because humans have these generally transferable skills and ways of learning that we haven’t yet been able to give to our artificial agents.
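The closest that today’s practice comes to that kind of reuse is transfer learning: keep the general-purpose features a network learned on one task and retrain only a small part of it for the next. Below is a minimal sketch in PyTorch; the framework, model, and class count are illustrative assumptions, not details from the interview.

```python
# Transfer learning sketch (illustrative): reuse features pretrained on
# ImageNet so a new task can be learned from far fewer examples.
import torch.nn as nn
from torchvision import models

# Start from a network pretrained on a large source task.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor (the "general skills").
for param in model.parameters():
    param.requires_grad = False

# Replace only the final layer for a hypothetical 5-class target task.
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head is trainable now, so a few hundred labeled examples
# can suffice where training from scratch would need vastly more.
trainable = [p for p in model.parameters() if p.requires_grad]
```

Even this transfers features within a single modality; it is still far from the broad, cross-domain skill transfer Koller is describing.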
MARTIN FORD: What other hurdles are there in the path to AGI? You’ve talked about learning in different domains and being able to cross domains, but what about things like imagination, and being able to conceive new ideas? How do we get to that?
DAPHNE KOLLER: I think those things that I mentioned earlier are really central: being able to transfer skills from one domain to the other, being able to leverage that to learn from a very limited amount of training data, and so on. There’s been some interesting progress on the path to imagination, but I think we’re fairly far away.
For instance, consider GANs (Generative Adversarial Networks). They are great at creating new images that are different from the images they’ve seen before, but those images are amalgams, if you will, of the images they were trained on. You don’t have the computer inventing Impressionism, and that would be something quite different from anything we’ve done before.
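For reference, the adversarial setup behind a GAN is compact enough to sketch. The toy below is written in PyTorch with invented shapes and hyperparameters; it illustrates the standard two-player objective, not any specific system from the conversation. The generator’s only training signal is “look like the training data,” which is exactly why its outputs recombine what it has seen rather than inventing a new style.

```python
# Minimal GAN training step on toy 2-D data (all values illustrative).
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 2, 64

generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real_batch = torch.randn(batch, data_dim) + 3.0  # stand-in for real data

# Discriminator step: label real samples 1, generated samples 0.
fake_batch = generator(torch.randn(batch, latent_dim)).detach()
d_loss = (loss_fn(discriminator(real_batch), torch.ones(batch, 1))
          + loss_fn(discriminator(fake_batch), torch.zeros(batch, 1)))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step: try to make the discriminator call fakes real.
fake_batch = generator(torch.randn(batch, latent_dim))
g_loss = loss_fn(discriminator(fake_batch), torch.ones(batch, 1))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```

Nothing in either loss rewards producing something outside the training distribution, which is the limit on machine “imagination” noted above.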