
Digital Transformation


by Thomas M. Siebel


  Machine Learning

  Machine learning—a very broad subset of AI—is the class of algorithms that learn from examples and experience (represented by input/output data sets) rather than relying on hard-coded and predefined rules that characterize traditional algorithms. An algorithm is the sequence of instructions a computer carries out to transform input data to output data. A simple example is an algorithm to sort a list of numbers from highest to lowest: The input is a list of numbers in any order, and the output is the properly sorted list. The instructions to sort such a list can be defined with a very precise set of rules—an example of a traditional algorithm.
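
  To make the distinction concrete, here is a minimal Python sketch of such a traditional sorting algorithm: every step is an explicit, predefined rule, and no learning from data is involved.

```python
def sort_descending(numbers):
    """Sort a list of numbers from highest to lowest using explicit, predefined rules
    (a simple selection sort): repeatedly pull out the largest remaining value."""
    remaining = list(numbers)  # work on a copy so the input list is left unchanged
    result = []
    while remaining:
        largest = max(remaining)
        remaining.remove(largest)
        result.append(largest)
    return result


print(sort_descending([3, 1, 4, 1, 5, 9, 2, 6]))  # -> [9, 6, 5, 4, 3, 2, 1, 1]
```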

  Computer scientists have created algorithms since the earliest days of computing. However, using traditional approaches they had been unable to develop effective algorithms for solving a wide range of problems across health care, manufacturing, aerospace, logistics, supply chains, and financial services. In contrast to the precisely defined rules of traditional algorithms, machine learning algorithms mathematically analyze any variety of data (images, text, sounds, time series, etc.) and their interrelationships in order to make inferences.

  FIGURE 3.5

  An example of machine learning is an algorithm to analyze an image (input) and classify it as an “airplane” or “not airplane” (output)—potentially useful in air traffic control and aviation safety, for example. The algorithm is “trained” by giving it thousands or millions of images labeled as “airplane” or “not airplane.” When sufficiently trained, the algorithm can then analyze an unlabeled image and infer with a high degree of precision whether it’s an airplane. Another example is an algorithm in the health care field to predict the likelihood someone will have a heart attack, based on medical records and other data inputs—age, gender, occupation, geography, diet, exercise, ethnicity, family history, health history, and so on—for hundreds of thousands of patients who have suffered heart attacks and millions who have not.
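
  To illustrate that train-then-infer workflow, the following Python sketch fits a classifier on synthetic, labeled patient records and then infers the risk for a patient it has never seen. The feature names, data, and model choice are illustrative assumptions, not the book's examples.

```python
# Illustrative only: synthetic data and hypothetical features stand in for real
# medical records labeled "had a heart attack" / "did not."
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Each row is one synthetic patient: [age, exercise hours per week, family history (0/1)].
X = rng.uniform(low=[30, 0, 0], high=[80, 10, 1], size=(5000, 3))
X[:, 2] = np.round(X[:, 2])

# Synthetic labels (1 = heart attack, 0 = none), a crude stand-in for real outcomes.
risk = 0.04 * X[:, 0] - 0.3 * X[:, 1] + 1.5 * X[:, 2] - 2.0
y = (risk + rng.normal(0, 1, 5000) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)  # "training"

new_patient = [[62, 1.0, 1.0]]                   # an unlabeled input the model has never seen
print(model.predict_proba(new_patient)[0, 1])    # inferred probability of a heart attack
```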

  The advent of machine learning combined with unlimited computational power has resulted in a whole new class of algorithms to solve previously unsolvable problems. Consider the case of assessing the risk of aircraft engine failure. By characterizing all relevant inputs (i.e., flight hours, flight conditions, maintenance records, engine temperature, oil pressure, etc.) and a sufficiently large number of engine failure cases (i.e., outputs), it is possible not only to predict whether an engine is likely to fail but also to diagnose the causes of failure. This can all be done without the need to understand materials science or thermodynamics. What it does require is useful data, and lots of it.

  Traditionally, machine learning has also required extensive “feature engineering.” (Advances in “deep learning,” discussed below, have reduced or in some cases eliminated this requirement.) Feature engineering relies on experienced data scientists working collaboratively with subject matter experts to identify the significant data and data representations or features (e.g., engine temperature differential, flight hours) that influence an outcome (in this case, engine failure). The complexity comes from choosing among the hundreds or thousands of potential features. The machine learning algorithm is trained by iterating over thousands (or millions) of historic cases while adjusting the relative importance (weights) of each of the features until it can infer the output (i.e., engine failure) as accurately as possible.8 The result of a trained machine learning algorithm is a set of weights that can be used to infer the proper output for any input.9 In this case, while the algorithm determines the weights, human analysts determine the features. We’ll see in deep learning approaches that the algorithm can determine both the relevant features and the associated weights directly from the data.
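
  A short sketch of feature engineering, continuing the engine-failure example: a derived feature is computed from raw telemetry, and the trained model reduces to a small set of learned weights. All telemetry, features, and thresholds below are synthetic placeholders.

```python
# Synthetic illustration of feature engineering and learned weights; the data,
# features, and failure rule below are invented for the example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000

flight_khours  = rng.uniform(0.1, 20.0, n)   # raw telemetry: thousands of flight hours
inlet_temp_c   = rng.normal(450, 20, n)      # raw telemetry: inlet temperature
exhaust_temp_c = rng.normal(650, 30, n)      # raw telemetry: exhaust temperature

# Feature engineering: a subject matter expert suggests that the temperature
# *differential* is what matters, not the two raw temperatures on their own.
temp_differential = exhaust_temp_c - inlet_temp_c
X = np.column_stack([flight_khours, temp_differential])

# Synthetic failure labels driven by the engineered feature plus accumulated hours.
y = ((temp_differential > 230) & (flight_khours > 12.0)).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
print("learned weights:", model.coef_[0], "bias:", model.intercept_[0])
# The trained model is, in effect, just these weights applied to the features.
```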

  The training process is computationally intensive and time consuming, while inference is lightweight and fast. Advances in the design and use of computing hardware have helped improve the performance of machine learning applications. For example, graphics processing units (GPUs) are well suited to the massively parallel computations required to train a model, whereas field-programmable gate arrays (FPGAs) can be configured for fast, power-efficient inference. The leading cloud computing platforms now provide resources optimized for AI computing that leverage these hardware innovations.

  Machine learning can be divided into “supervised” and “unsupervised” approaches. In supervised learning, the algorithm is trained using labeled training data, as in the aircraft engine failure example. This approach requires the availability of large amounts of historical data to train a machine learning algorithm.

  When labeled examples are too scarce to train a supervised model, unsupervised learning techniques can be applied, for instance by having the algorithm search for outliers or anomalies in the data. Unsupervised learning algorithms are also useful for finding meaningful patterns or clusters in a large data set. For example, a retailer could use unsupervised machine learning algorithms for cluster analysis of customer data to discover new, meaningful customer segments for marketing or product development purposes.
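
  The following sketch shows unsupervised clustering on synthetic customer data: no labels are supplied, and the algorithm discovers the segments on its own. The segment structure and feature names are invented for illustration.

```python
# Illustrative k-means clustering on synthetic customer data; no labels are used.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# Each row: [annual spend ($), purchases per year, average basket size ($)].
customers = np.vstack([
    rng.normal([300, 4, 75],    [50, 1, 10],  size=(100, 3)),  # occasional shoppers
    rng.normal([2500, 40, 60],  [300, 5, 8],  size=(100, 3)),  # frequent, small-basket shoppers
    rng.normal([3000, 10, 300], [400, 2, 40], size=(100, 3)),  # infrequent, big-ticket shoppers
])

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(segments.cluster_centers_)   # the "average customer" of each discovered segment
print(segments.labels_[:10])       # which segment each of the first ten customers falls into
```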

  Deep Learning

  Deep learning is a subset of machine learning with vast potential. As noted earlier, most traditional machine learning approaches involve extensive feature engineering that requires significant expertise. This can become a bottleneck, as data scientists are needed to classify and label the data and train the model. In deep learning, however, the important features are not predefined by data scientists but instead learned by the algorithm.

  This is an important advance, because while feature engineering can be used to solve certain artificial intelligence problems, it is not a viable approach for many other AI problems. For many tasks, it is exceedingly difficult or impossible for data scientists to determine the features that should be extracted. Consider, for instance, the problem of image recognition, such as creating an algorithm to recognize cars—a critical requirement for self-driving technology. The number of variants in how a car may appear is infinitely large—given all the possibilities of shape, size, color, lighting, distance, perspective, and so on. It would be impossible for a data scientist to extract all the relevant features to train an algorithm. For such problems, deep learning employs “neural network” technology—described below—an approach originally inspired by the human brain’s network of neurons, but in reality having little in common with how our brains work.

  Deep learning enables computers to build complex concepts out of a hierarchy of simpler, nested concepts. You can think of this as a series of chained algorithms: Each layer of the hierarchy is an algorithm that performs a piece of the inference in succession, until the final layer provides the output. In the case of the car recognition task, for instance, the neural network is trained by feeding it a large number of images (with and without cars in them). Each layer of the neural network analyzes the various components of the image data—progressively identifying abstract concepts such as edges, corners, contours, circles, and rectangles that represent a car’s wheels, body, etc.—and eventually develops the concept of a car based on that hierarchy of nested concepts.10 Once trained, the neural network can be given an image it has not seen before and determine with a high degree of precision whether it contains a car.
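
  A minimal, untrained PyTorch sketch of that layered structure follows: each convolutional layer transforms the previous layer’s output into a more abstract representation, and a final layer produces the “car” versus “not car” scores. Real networks are far deeper and become useful only after training on large labeled image sets.

```python
import torch
import torch.nn as nn

class TinyCarNet(nn.Module):
    """A toy stack of layers illustrating the hierarchy described above (untrained)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # low-level patterns (edges)
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # corners, contours
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # parts (wheels, body panels)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 2),  # final layer: raw scores for "not car" vs. "car"
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyCarNet()
image = torch.rand(1, 3, 64, 64)   # one random 64x64 RGB tensor standing in for a photo
print(model(image))                # two untrained scores; training would make them meaningful
```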

  FIGURE 3.6

  As you can imagine, deep learning has vast potential for many applications in business and government. In addition to their use in computer vision problems presented by self-driving cars and factory robots, neural networks can also be applied to tasks like voice recognition in smart devices (e.g., Amazon Echo and Google Home), automated customer service, real-time language translation, medical diagnostics, prediction and optimization of oil field production, and many others. Deep learning is especially interesting because it can potentially be applied to any task—from predicting engine failure or diabetes to identifying fraud—with far less data scientist intervention compared to other machine learning methods, due to the greatly reduced or eliminated need for feature engineering.

  AI is an enormously exciting area with endless possibilities, and the field is rapidly advancing. A number of catalysts are accelerating the use of AI, including the continuous decline in the cost of computing and storage as well as ongoing hardware improvements and innovations. As computing becomes both more powerful and less expensive, AI can be applied to ever larger and more diverse data sets to solve more problems and drive real-time decision-making.

  It is no exaggeration to say that AI will profoundly change the way we work and live. While we remain in the early stages of using AI in business and government, the AI arms race has clearly begun. Forward-thinking organizations are already actively engaged in applying AI across their value chains. They are positioning themselves to preempt competitors and thrive, as digital transformation determines which organizations will lead and which will fall by the wayside. Today’s CEOs and other senior leaders should be actively thinking about how AI will impact the landscape in which they operate—and how to take advantage of the new opportunities it will open up.

  Internet of Things (IoT)

  The fourth technology driving digital transformation is the internet of things. The basic idea of IoT is to connect any device equipped with adequate processing and communication capabilities to the internet, so it can send and receive data. It’s a very simple concept with the potential to create significant value—but the IoT story doesn’t end there.
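
  At its simplest, that idea looks something like the following Python sketch: a device periodically reads a sensor and sends the value to a cloud service. The endpoint URL, device ID, and payload fields are hypothetical, and a plain HTTP POST stands in for the messaging protocols (such as MQTT) that real deployments often use.

```python
# Hypothetical device-side loop: read a sensor, send the reading to the cloud.
import time
import random
import requests

ENDPOINT = "https://example.com/api/telemetry"   # hypothetical ingestion endpoint

def read_temperature_c():
    # Stand-in for reading a real sensor over I2C, SPI, or a GPIO pin.
    return 20.0 + random.uniform(-0.5, 0.5)

while True:
    reading = {
        "device_id": "thermostat-42",            # hypothetical device identifier
        "temperature_c": read_temperature_c(),
        "timestamp": time.time(),
    }
    requests.post(ENDPOINT, json=reading, timeout=5)   # send data over the internet
    time.sleep(60)                                     # report once a minute
```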

  The real power and potential of IoT derives from the fact that computing is rapidly becoming ubiquitous and interconnected, as microprocessors are steadily getting cheaper and more energy efficient, and networks are getting faster. Today, inexpensive AI supercomputers the size of a credit card are being deployed in more devices—such as cars, drones, industrial machinery, and buildings. As a result, cloud computing is effectively being extended to the network edge—i.e., to the devices where data are produced, consumed, and now analyzed.

  NVIDIA’s TX2 is an example of such an edge AI supercomputer. The TX2 can process streaming video in real time and use AI-based image recognition to identify people or objects. For example, it can be embedded in self-driving delivery robots to power their computer vision system for navigating city streets and sidewalks. The 2-by-3-inch TX2 incorporates advanced components, including a powerful 256-core GPU and 32 gigabytes of local storage, enough to record about an hour of video. Power consumption, an important consideration in applications such as drones, is under 8 watts.
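
  In spirit, the software running on such an edge device looks something like the sketch below: a pretrained image classifier is applied locally to each camera frame as it arrives. The model choice (MobileNetV2), camera source, and overall structure are assumptions for illustration, not the software any particular product runs.

```python
# Illustrative edge-inference loop: classify each video frame locally on the device.
import cv2
import torch
from torchvision import transforms
from torchvision.models import mobilenet_v2

model = mobilenet_v2(weights="DEFAULT").eval()   # small pretrained ImageNet classifier
preprocess = transforms.Compose([
    transforms.ToPILImage(), transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

camera = cv2.VideoCapture(0)                     # the device's attached camera
while True:
    ok, frame = camera.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # OpenCV delivers BGR; the model expects RGB
    batch = preprocess(rgb).unsqueeze(0)
    with torch.no_grad():                        # inference only: lightweight and fast
        class_id = model(batch).argmax(dim=1).item()
    print("predicted ImageNet class index:", class_id)  # a real system would act on this locally
```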

  With such advances, we are seeing an important evolution in the form factor of computing (which I will cover in more depth in chapter 7). Cars, planes, commercial buildings, factories, homes, and other infrastructure such as power grids, cities, bridges, ports, and tunnels all increasingly have thousands of powerful computers and smart cameras installed to monitor, interpret, and react to conditions and observations. In essence, everything is becoming a computing device and AI capabilities are increasingly being built into these devices.

  The technical name for IoT—cyber-physical systems—describes the convergence and control of physical infrastructure by computers. Deployed across physical systems, computers continuously monitor and effect change locally—for example, adjusting the setting of an industrial control—while communicating and coordinating across a wider area through cloud data centers.

  An example of such a system is the smart grid in the electric utility industry, which uses locally generated power when available and draws from the power grid when necessary. The potential is to make the global power infrastructure 33 to 50 percent more efficient. Enabling such a system requires AI deployed on edge computers that continuously produce real-time forecasts (or inferences) of energy demand and match that demand to the most cost-effective energy source, whether local solar, battery, wind, or the power grid. The idea of a “transactive” grid, in which individual nodes on microgrids make instantaneous energy buying and selling decisions, is coming closer to reality.
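
  A toy sketch of the edge-node logic just described: forecast near-term demand, then meet it from the cheapest available sources. The forecast, prices, and capacities below are hypothetical placeholders for real models and market data.

```python
def forecast_demand_kw(recent_demand_kw):
    # Stand-in for a real AI-based forecast: a simple average of recent demand.
    return sum(recent_demand_kw) / len(recent_demand_kw)

def dispatch(demand_kw, sources):
    """Greedily allocate the forecast demand to the cheapest sources first."""
    plan, remaining = {}, demand_kw
    for name, capacity_kw, cost_per_kwh in sorted(sources, key=lambda s: s[2]):
        used = min(capacity_kw, remaining)
        if used > 0:
            plan[name] = used
            remaining -= used
    return plan

# (source name, available capacity in kW, cost per kWh) -- hypothetical numbers
sources = [("local solar", 3.0, 0.00), ("battery", 2.0, 0.05), ("power grid", 100.0, 0.18)]
demand = forecast_demand_kw([4.2, 4.6, 5.1, 4.9])
print(dispatch(demand, sources))   # e.g. {'local solar': 3.0, 'battery': ~1.7}
```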

  Other examples of local processing combined with cloud processing include the Amazon Echo, autonomous vehicles, camera-equipped drones for surveillance or other commercial and industrial uses (such as insurance damage assessments), and robotics in manufacturing systems. Soon, even humans will have tens or hundreds of ultra-low-power computer wearables and implants continuously monitoring and regulating blood chemistry, blood pressure, pulse, temperature, and other metabolic signals. Those devices will be able to connect via the internet to cloud-based services—such as medical diagnostic services—but will also have sufficient local computing and AI capabilities to collect and analyze data and make real-time decisions.

  Today, even as we remain in the early stages of IoT, numerous IoT applications already deliver enormous value to business and government. In the energy sector, for example, utilities realize substantial value from predictive maintenance applications fed by telemetry data from sensors installed throughout the utility’s distribution network. For instance, a large European utility runs an AI-based predictive maintenance application that consumes data from sensors and smart meters across its 1.2-million-kilometer distribution network to predict equipment failure and prescribe proactive maintenance before equipment fails. The potential economic benefit to the utility exceeds €600 million per year.

  In the public sector, the U.S. Air Force deploys AI-based predictive maintenance applications to predict failure of aircraft systems and subsystems for a variety of aircraft models, enabling proactive maintenance and reducing the amount of unscheduled maintenance. The applications analyze data from numerous sensors on each aircraft, as well as other operational data, to predict when a system or subsystem will fail. Based on results of initial deployments of these applications, the USAF expects to increase aircraft availability by 40 percent across its entire fleet.

  These are just a few examples of AI-powered IoT applications deployed today that deliver significant value as part of digital transformation initiatives at large enterprises. I will discuss additional IoT use cases and examples in chapter 7.

  AI and IoT Applications Require a New Technology Stack

  We have seen that each of the four technologies driving digital transformation—elastic cloud computing, big data, AI, and IoT—presents powerful new capabilities and possibilities. But they also create significant new challenges and complexities for organizations, particularly in pulling them together into a cohesive technology platform. In fact, many organizations struggle to develop and deploy AI and IoT applications at scale and consequently never progress beyond experiments and prototypes.

  These organizations typically attempt to develop the application by stitching together numerous components from the Apache Hadoop open source software collection (and from commercial Hadoop distributors such as Cloudera and Hortonworks) on top of a public cloud platform (AWS, Microsoft Azure, IBM Cloud, or Google Cloud). This approach almost never succeeds, a conclusion often reached only after months, even years, of developers’ time and effort. The corporate landscape is littered with such failed projects. Why is that?

  Software components like those in the Hadoop collection are independently designed and developed by over 70 contributors. They use different programming languages and interface protocols with high programming-model switching costs, and they exhibit dramatically varying levels of maturity, stability, and scalability. Moreover, the number of permutations that developers must contend with—of infrastructure service calls, enterprise systems and data integrations, enterprise data objects, sensor interfaces, programming languages, and libraries to support application development—is almost infinite. Finally, most enterprises need to design, develop, and operate hundreds of enterprise applications that all require slightly different “plumbing” of the Hadoop components. The resulting complexity overwhelms even the best development teams. This stitch-it-together approach is rarely successful.

  In reality, neither the Hadoop collection nor the public clouds by themselves provide a complete platform for developing AI and IoT applications at scale. The technical requirements to enable a complete, next-generation enterprise platform that brings together cloud computing, big data, AI, and IoT are extensive. They include 10 core requirements:

  1. Data Aggregation: Ingest, integrate, and normalize any kind of data from numerous disparate sources, including internal and external systems as well as sensor networks

  2. Multi-Cloud Computing: Enable cost-effective, elastic, scale-out compute and storage on any combination of private and public clouds

  3. Edge Computing: Enable low-latency local processing and AI predictions and inferences on edge devices, enabling instantaneous decisions or actions in response to real-time data inputs (e.g., stopping a self-driving vehicle before it hits a pedestrian)

  4. Platform Services: Provide comprehensive and requisite services for continuous data processing, temporal and spatial processing, security, data persistence, and so on

  5. Enterprise Semantic Model: Provide a consistent object model across the business in order to simplify and speed application development (see the sketch following this list)

  6. Enterprise Microservices: Provide a comprehensive catalog of AI-based software services enabling developers to rapidly build applications that leverage the best components

  7. Enterprise Data Security: Provide robust encryption, user access authentication, and authorization controls

  8. System Simulation Using AI and Dynamic Optimization Algorithms: Enable full application lifecycle support including development, testing, and deployment

  9. Open Platform: Support multiple programming languages, standards-based interfaces (APIs), open source machine learning and deep learning libraries, and third-party data visualization tools

  10. Common Platform for Collaborative Development: Enable software developers, data scientists, analysts and other team members to work in a common framework, with a common set of tools, to speed application development, deployment, and operation
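
  As a small illustration of requirement 5, an enterprise semantic model amounts to one shared, consistent definition of each business entity that every application and data source maps onto. The entities and fields in this sketch are hypothetical.

```python
# Hypothetical sketch of a shared object model: every application uses these same definitions.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SmartMeter:
    """A single, consistent definition of a smart meter used across all applications."""
    meter_id: str
    latitude: float
    longitude: float
    installed_on: datetime
    firmware_version: str

@dataclass
class MeterReading:
    """A time-stamped measurement linked to its meter through the common model."""
    meter_id: str
    timestamp: datetime
    energy_kwh: float

reading = MeterReading(meter_id="meter-001", timestamp=datetime(2019, 6, 1, 12, 0), energy_kwh=1.25)
print(reading)
```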

  Model-Driven Architecture

 
