But for an application to learn, it needs data; indeed, without huge amounts of data, Machine Learning and especially Deep Learning could not have reached their current stage of development.
Which is why we collect what has become known as ‘Big Data’. So, what is Big Data and why does it matter?
Big Data is as described – huge amounts of data so voluminous that they cannot be handled using standard processing methods. An often-quoted statistic (which I suspect understates the reality) is that 90% of the data in the world has been produced in the last two years[xlvi]. Fifteen years ago, the largest datasets were input into very structured databases (including spreadsheets). Today, however, data arrives from all aspects of our lives – where we are (through the map on our phone), where we shop and what we buy, what we watch, what we search for online, our medical conditions, who we talk to and what we say online – the list goes on.
Whilst the image of a joined-up dystopian world where the authorities (or big tech companies) know everything we are doing at all times is often portrayed in the press, the reality is rather different. Each company has information related to what it does – so, for example, Amazon knows each consumer’s shopping habits, Netflix their viewing, Google their online searches and Facebook their social media activity.
But within each of these specialist areas, the data are incredibly valuable – mainly because they can be compared to other data from the whole user base. Big Data has become so important in today’s world (often described as the new gold) that the cost of holding data is now low compared to the potential value.
The ability to gather huge quantities of data opens up the opportunity to run models that can compare millions of pieces of information. By creating simulations and changing inputs, it is possible to see how results differ. It is this combination of data and processing power that increases the possibility of finding a pattern or insight, predicting behaviour, making assumptions backed up by hard evidence and recognising patterns.
But the true value of Big Data is in understanding how to mine it. Experienced data mining analysts can easily earn a six- or even seven-figure salary, which gives an indication as to the value of the analysis that can be generated from the data.
By joining data from multiple sources, the value grows exponentially. For example, a company such as Unilever will combine data from social media and focus groups with that from test markets to understand the potential success of new products.
Big Data is equally relevant to industry. Data from factories are collated to understand when machines need maintenance or are likely to break down; the machines can often be updated remotely, avoiding the vast expense of factory shutdowns and engineer call-outs.
The ability to send updates and do other things automatically is the basis of the Internet of Things (IoT) – lots of devices working autonomously. IoT devices also provide much of the Big Data mentioned above.
One of the divisions within our company provides mobile connectivity for the IoT. This allows information on devices on the move or dispersed across the world to be transmitted to a central computer (previously an engineer would have gone out to physically collect the data). The data are then analysed by programs and the information is sliced and diced to provide locations, temperatures, alarms, predictive movement, passenger information, credit card information, advertising placement, refill schedules, time to harvest, state of bee hives – the list goes on.
IoT is a ridiculously overused acronym that seems to capture anything futuristic. In reality it relates to a lot of sensors connected to processors that transmit the data to a central server. Some of the data may then be joined with other data to become Big Data.
The reasons for collecting the data vary enormously. For example, it may be necessary to monitor the location of a delivery truck to adjust delivery schedules in real time and inform customers of amended drop off times.
Sensors are used in vending machines and bars so that there is no need for daily checks to restock the machine. These and the much-quoted restocking fridge (I’ve yet to meet anybody who owns one) are examples of the most basic uses of IoT.
One of the more interesting applications that we support is the monitoring of bees. By using sensors it is possible to monitor the amount of honey produced, temperature and humidity, whether the queen is still laying and other information about the hive. Add in some intelligence and other related (Big) data, and much more can be predicted, such as colony strength, foraging activity, forage shortages and the effect of weather conditions – all compared to other hives around the world. A prediction often attributed to Albert Einstein holds that “if the bees disappeared off the face of the earth, man would only have four years left to live.” Whilst many would disagree with this assertion, the health of bees is a good predictor of the health of the world.
One of our other clients monitors ski resorts for avalanches. By monitoring snowfall and weather conditions using sensors in the mountain, the company can predict when an avalanche is likely to occur, allowing the authorities to create a controlled avalanche when nobody is on the mountain.
So how is IoT, especially combined with Big Data, Artificial Intelligence, Machine Learning and Deep Learning, going to change the future of work and obliterate a large number of jobs?
As always, the main driver is economics. The cost of basic devices which undertake limited processing activities and forward the data via mobile networks is becoming minimal. Likewise, connected sensors for basic monitoring are very low cost when purchased in big volumes.[xlvii] Low-powered networks are bringing down the operational costs of IoT by sending small amounts of data from units running for up to 20 years on standard batteries without any human interaction. Low-powered radio devices have already become so cheap that building companies are looking to embed sensors into the concrete of new buildings to monitor for future structural weaknesses.[xlviii]
So, is the IoT a threat to jobs? In short, yes.
Using technology remotely to monitor vending machines, collect meter readings, ensure successful deliveries and countless other examples has already created huge efficiencies, reducing the number of people who would have previously undertaken these tasks.
To date these job losses have been absorbed by further growth in the economy (an economist’s dream). Increased efficiency has allowed investment into new goods and services which in turn has created new jobs[xlix]. However, the present efficiencies will be minor compared to what will be achieved when IoT, Big Data and Artificial Intelligence are optimised together. IoT will allow sensors around the world to provide huge amounts of information to form Big Data; and Artificial Intelligence will then be used to process and manipulate the data and create efficiencies in every aspect of our lives.
These efficiencies will result in job losses on a scale that will be difficult to offset. We have a little breathing space to change the way that we work, because much of what we think of today as Artificial Intelligence isn’t so much intelligence as processing power. However, Machine Learning, a subset of Artificial Intelligence that combines processing power with a level of intelligence, will minimise the need for human intervention still further. To understand why we need to change our way of working to optimise the use of Machine and Deep Learning, we first need to understand how they function.
D) MACHINE LEARNING AND DEEP LEARNING
“People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.”
Pedro Domingos – Professor, Machine Learning
Machine Learning uses the key advantages that computers hold over people – speed, accuracy and a lack of bias – whilst adding in some human classification skills. Human beings store thoughts, memories and ideas in different parts of the brain. This ability to categorise information allows for easy recall and the ability to predict and weigh the probability of future outcomes. It is this that Machine Learning (and Deep Learning as a subset of Machine Learning) is trying to emulate.
Machine Learning is a key part of both Google’s Gmail spam filtering and Amazon’s shopping recommendations. Gmail’s spam filtering claims a 99.9% success rate in blocking unwanted emails.[l] But what is spam? One person’s spam is the next person’s new purchase (although the opportunity to collect $100 million from a deposed African dictator is generally spam to everybody), which is why individualised Machine Learning has value. It works by providing a computer with a base of data in which the spam (in this case) is already identified. More emails are then fed into the computer, which identifies what it thinks is spam. The third step is to correct the computer’s predictions so that it can re-categorise the email and learn for next time. Obviously, the more data the computer can learn from, the faster the Machine Learning process is – which puts Google, with its 1.2 billion Gmail users, at a considerable advantage. However, what is so powerful about Gmail is not that it blocks spam and identifies promotional messages so effectively, but rather that it does so on an individualised basis. Each time we identify an email as not being spam, or as something that we do want to buy, it learns our personal preferences.
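To make those three steps concrete, here is a minimal, hypothetical sketch of the same train, predict and correct loop in Python using scikit-learn. The example emails and labels are invented, and Gmail’s real filter is of course vastly more sophisticated, but the mechanics are the same: learn from labelled examples, predict on new ones, then fold the user’s corrections back in.

```python
# A toy illustration of the train / predict / correct loop described above.
# The emails and labels are invented; a real spam filter is far more complex.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Step 1: a base of data in which the spam is already identified (1 = spam).
emails = [
    "Collect your $100 million inheritance now",
    "Meeting moved to 3pm, agenda attached",
    "You have won a free cruise, click here",
    "Invoice for last month's consultancy work",
]
labels = [1, 0, 1, 0]

vectoriser = CountVectorizer()
X = vectoriser.fit_transform(emails)
model = MultinomialNB().fit(X, labels)

# Step 2: new emails are fed in and the model predicts which it thinks are spam.
new_email = ["Click here to claim your free consultancy invoice"]
print(model.predict(vectoriser.transform(new_email)))  # e.g. [1] -> thinks it is spam

# Step 3: the user corrects the prediction and the model learns for next time.
model.partial_fit(vectoriser.transform(new_email), [0])  # user says: not spam
```

The more corrections it receives, the better it learns that particular user’s preferences – which is exactly the individualisation described above.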
At times this Machine Learning capability may seem too intrusive in our lives, inciting a desire to escape to a remote island without electricity or internet and read a book by candlelight. However, it can be used to very good effect – especially in medicine. Some cancers, for example, are difficult to treat because they are diagnosed too late. The use of Artificial Intelligence within search engines is providing some interesting possibilities. Researchers from Microsoft looked at the search terms used by people who had searched for pancreatic cancer. They estimated that, in 5–15% of cases, they could predict pancreatic cancer substantially prior to the official diagnosis and before the search term ‘pancreatic cancer’ was used. People who go on to develop pancreatic cancer are also likely to have stomach pains and itchiness – these search terms, combined with the timing of the searches, provided the clues. With Microsoft’s predicted false positive rate of 1 in 100,000 and the possibility of knowing a crucial number of weeks earlier – which could be the difference between life and death – wouldn’t you want to know?[li]
Furthermore, no doctor has time to keep abreast of all the latest research and information in their field. Compare this with IBM Watson, which has learnt, amongst other things, the 23 million medical papers stored in Medline and can retrieve any one of them in milliseconds.[lii] This is partly why a computer is accurate 90% of the time in assessing lung cancer, compared to 50% for a human doctor.
We are still at the early stages of knowing what the future holds for Artificial Intelligence within medicine but, given that misdiagnosis is the third leading cause of death after heart disease and cancer in the United States,[liii] computer diagnostics will (and should) become more relied upon.
This success in saving lives and increasing longevity does obviously create a side effect – overpopulation. Optimists assure us that this can be resolved by making the world a more efficient place and this is where sensors around the world (IoT) creating data (Big Data) which can be analysed (Artificial Intelligence) contribute. They will identify where waste can be cut, and productivity raised.
The next chapter covers examples of where Machine Learning and Artificial Intelligence will take over some of the roles in today’s office by incorporating pre-determined decisions. However, Machine Learning requires human initiative to understand and plan the results. It is this initiative, combined with other WEIRD attributes such as Emotional Intelligence, that will be required of employees to complement the output from Machine Learning algorithms.
Even greater opportunities arise if we look at the possibilities of Deep Learning, a branch of Machine Learning that will lead to developments that scare the innocent (and sometimes those in the know as well).
To date, relatively few applications make use of Deep Learning. However, more affordable processing power and advances in Deep Learning techniques will ensure that it becomes more mainstream.
So how does Deep Learning differ from Machine Learning? Think back to how Tesla and Google developed the software for running their self-driving cars – it was primarily written by engineers. In 2016, the chipmaker Nvidia tested a self-driving car in Monmouth County, New Jersey.[liv] The difference between this and the Tesla or Google cars is that its behaviour was not hand-written by engineers: the algorithm taught itself how to drive by watching people.
Mount Sinai Hospital in New York took the hospital’s database of 700,000 patient records to train a Deep Learning program called Deep Patient.[lv] The researchers did not write a program to analyse pre-determined queries but rather wrote an algorithm and left the Deep Learning program to find patterns. It proved to be far more effective than a doctor at predicting when people were likely to get a disease, including cancer of the liver. It was also better at predicting schizophrenia. The problem with Deep Learning is that because, like us, it teaches itself, it sometimes makes predictions and decisions that cannot be explained. The instigators of Deep Patient don’t know, for example, why it is good at predicting psychiatric conditions.
We need to understand how to cope in the world of the future, and Deep Learning will have a role in this; it will destroy jobs and bring profound changes to work and society as we know them. Therefore we should take it into account when we consider how and what we need to change.
So how does Deep Learning work?
The inspiration for Deep Learning comes from the human brain[lvi] which hosts approximately 100 billion neurons. These communicate with other neurons via synaptic connections.
Neural networks in computers work in a similar way, but instead of sending an electrochemical signal they send a weighted number to signify the level of confidence. The weighted number relates to the route by which the signals have been sent, but basically it comes down to a level of certainty. Let us imagine that you live in a nature reserve and want to create a retractable fence that will only spring out of the ground if an animal comes within five metres. To create a Deep Learning algorithm to recognise people and keep out animals, you load lots of images of people (labelled ‘human’) and of animals and other things (labelled ‘not human’).
Each image would be broken down into its component parts and analysed (colour, lines, angles etc.) through an input layer. These are then passed on to hidden layers which recognise or do not recognise each attribute. This recognition or non-recognition is multiplied by the weight (certainty) of that hidden layer recognising that attribute. Finally, the information is passed to an output layer which can indicate, with a level of confidence, whether or not the image shows a human being.
On the principle that a picture speaks a thousand words, I hope the graphic below will complement the words above.
Table 2.1: Understanding Deep Learning
Source: Cosmos Magazine
In people, the older we get the harder it is to change the synaptic connections in the brain because they get stronger the more they are used. A neural network algorithm has a far more efficient feedback mechanism that allows it to change the weighting of a synaptic connection quickly if the final output (e.g. gorilla, not human) is wrong. As computers aren’t proud and don’t mind being proved wrong, it is easier to re-program the synaptic connections of a computer than those of a person.
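For readers who would like to see the mechanics rather than just the metaphor, below is a toy sketch in Python of a network with one hidden layer, including the feedback step that nudges the weights when the output is wrong. The three ‘attributes’ of the image are invented by hand purely for illustration; a real image classifier would learn its own features from raw pixels, and production networks use many more layers and far more sophisticated training.

```python
# A toy neural network: input layer -> hidden layer -> output confidence,
# with a feedback step that adjusts the weights when the answer is wrong.
# The hand-picked attributes and label are invented for illustration only.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Input layer: three made-up attributes extracted from one image,
# e.g. [stands on two legs, face-like shape, covered in fur].
x = np.array([1.0, 0.0, 1.0])   # perhaps a gorilla
y_true = 0.0                     # correct label: not human

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))     # weights from input layer to hidden layer
W2 = rng.normal(size=(4, 1))     # weights from hidden layer to output layer

for step in range(200):
    # Forward pass: weighted signals flow from input to hidden to output.
    hidden = sigmoid(x @ W1)
    confidence_human = sigmoid(hidden @ W2)[0]

    # Feedback: if the network says 'human' for a gorilla, the error is large,
    # so the weights are nudged to make the same mistake less likely next time.
    error = confidence_human - y_true
    grad_out = error * confidence_human * (1 - confidence_human)
    grad_hidden = grad_out * W2[:, 0] * hidden * (1 - hidden)
    W2 -= 0.5 * np.outer(hidden, grad_out)
    W1 -= 0.5 * np.outer(x, grad_hidden)

print(f"Confidence that the image is human: {confidence_human:.2f}")  # falls towards 0
```

This is exactly the advantage noted above: unlike a person, the network will happily have its ‘synaptic’ weights re-adjusted thousands of times a second.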
Supervised Deep Learning needs some element of human interaction because the algorithm has to be told whether the answer is correct or not.[lvii]
The alternative is Unsupervised Deep Learning. In the example above, by providing initial examples of what is and is not a human being, the Deep Learning algorithm is supervised. With Unsupervised Deep Learning you provide the input and leave the computer on its own to see what patterns the algorithm produces. In the example above, it would probably be able to categorise people, but it might also identify mammals that can stand on two legs, or that people who look older are slightly more bent over than younger people, or that those who look pale are about to fall ill (especially if an additional input of illnesses was added) – in fact it may come up with a range of interesting patterns that we would never have thought of.[lviii]
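To illustrate the difference in code, here is a minimal, hypothetical sketch of the unsupervised case: no labels are provided at all, and a clustering algorithm is simply left to find groupings in the data. The three attributes (height, stoop and pallor) and their values are invented for the sake of the example.

```python
# Unsupervised learning sketch: no labels; the algorithm finds its own groups.
# The attributes (height in metres, stoop in degrees, pallor score) are invented.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
observations = np.vstack([
    rng.normal([1.7, 5.0, 0.2], 0.1, size=(100, 3)),   # may turn out to be 'people'
    rng.normal([0.9, 60.0, 0.1], 0.1, size=(100, 3)),  # may turn out to be 'animals'
])

# No 'human' / 'not human' labels are supplied; KMeans simply looks for clusters.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(observations)
print(np.bincount(clusters))  # how many observations fell into each group
```

In practice the interesting output is not the cluster labels themselves but the unexpected groupings – the patterns we would never have thought of.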
Thankfully, unless a trigger is linked to the output of an Unsupervised Deep Learning algorithm, we still need to do something with the analysis.
One of the worrying aspects of Deep Learning is that it is not clear how the algorithm is conducting the steps between the input provided and the final output.
Therefore, wherever Deep Learning becomes prevalent, issues over responsibility and blame will arise because the decision-making process of the algorithms is not transparent. In fact, even the authors of these programs cannot always explain how the algorithm works. But then again, neither do we know how decisions are reached in the human brain.
However, until these steps do become more transparent, we are not able to understand when there are missing data that may result in a flawed decision. When decisions made by algorithms are wrong, we won’t know why or be able to take steps to resolve them. This could lead to serious consequences – think military drones or medical decisions being made based on a faulty algorithm.
The US Defence Advanced Research Projects Agency (DARPA) is currently funding 13 projects specifically around trying to find ways to understand Deep Learning algorithms. One of these, run by Professor Carlos Guestrin at the University of Washington, has instigated a method whereby an algorithm will output examples that reflect its decision-making process.[lix] For example, when scanning emails, a few keywords could be highlighted that had a large influence on the decision-making process. The problem with this approach is that it provides simplified explanations for a complex decision-making process. In September 2018, IBM announced a service that would be able to explain how a Deep Learning algorithm reached its decision and what biases had been created within the decision.[lx]
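As a rough illustration of the keyword-highlighting idea (and emphatically not the actual method used by the DARPA projects or IBM), one simple approach is to remove each word in turn and measure how much the classifier’s score changes; the words whose removal moves the score the most are flagged as influential. The function below assumes a fitted text classifier and vectoriser like those in the earlier spam sketch.

```python
# A crude perturbation-based explanation: drop each word in turn and see how much
# the spam score moves. An illustrative sketch only, not the DARPA or IBM method.
def influential_words(email_text, model, vectoriser, top_n=3):
    words = email_text.split()
    base = model.predict_proba(vectoriser.transform([email_text]))[0, 1]
    influence = {}
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])   # email with one word removed
        score = model.predict_proba(vectoriser.transform([perturbed]))[0, 1]
        influence[word] = base - score                    # how much this word mattered
    return sorted(influence.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]

# Example (assuming the fitted `model` and `vectoriser` from the spam sketch above):
# print(influential_words("You have won a free cruise, click here", model, vectoriser))
```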