Inspired by the layered structure of the brain, deep learning (DL) architectures have revolutionized the approach to data analysis [3–5]. Deep learning networks have won a large number of hard machine learning contests, from voice recognition, image classification and Natural Language Processing (NLP) to time-series prediction, sometimes by a large margin [3]. Traditionally, AI relied heavily on handcrafted features. For instance, to obtain decent results in image classification, several pre-processing techniques had to be applied, such as filters and edge detection. The beauty of DL is that most, if not all, features can be learned automatically from the data, provided that enough training examples (sometimes millions) are available.
31.3.1 Deep Neural Networks
Deep models have feature detector units at each layer (level) that gradually extract more sophisticated and invariant features from the original raw input signals. Lower layers extract simple features that are then fed into higher layers, which in turn detect more complex features. In contrast, shallow models (such as two-layer neural networks or support vector machines) have very few layers that map the original input features into a problem-specific feature space. See [3, 4] for a review and [5] for a business-oriented perspective.
Being essentially unsupervised machines, deep neural architectures can be exponentially more efficient than shallow ones. Since each element of the architecture is learned from examples, the number of computational elements one can afford is limited only by the number of training samples, which can be of the order of billions. Deep models can be trained with hundreds of millions of weights and therefore tend to outperform shallow models such as Support Vector Machines. Moreover, theoretical results suggest that deep architectures are fundamental for learning the kind of complex functions that represent high-level abstractions [4] (e.g., vision, language, semantics), which are characterized by many factors of variation that interact in non-linear ways, making the learning process difficult.
There are many DL architectures, but most DNNs can be classified into five major categories (see Fig. 31.2).
1. Networks for unsupervised learning, designed to capture high-order correlations in the data by modeling the joint statistical distribution of the data and the associated classes, when available. Bayes' rule can later be used to create a discriminative learning machine.
2. Networks for supervised learning. These networks are designed to provide maximum discriminative power in classification problems and are trained only with labeled data: all the outputs should be tagged.
3. Hybrid or semi-supervised networks, where the objective is to classify data using the outputs of a generative (unsupervised) model. Normally, data is used to pre-train the network weights to speed up the learning process prior to the supervision stage.
4. Reinforcement learning: the agent interacts with and changes the environment and receives feedback only after a set of actions has been completed. This type of learning is normally used in the fields of robotics and games.
5. Generative neural networks: deep generative models are a powerful approach to unsupervised and semi-supervised learning, where the goal is to discover the hidden structure within data without relying on labels. Since they are generative, such models can form a rich imagery of the world in which they are used: an imagination that can be harnessed to explore variations in data, to reason about the structure and behavior of the world and, ultimately, to make decisions. An example is the variational auto-encoder.
Fig. 31.2 Several types of deep neural networks. a Convolutional neural network (CNN) with several levels of convolutional and subsampling layers, optionally followed by fully connected layers with deep architecture. b Stacked autoencoder consisting of multiple sparse autoencoders. c Deep Belief Network (DBN). d Restricted Boltzmann Machine (RBM) architecture, which includes one visible layer and one layer of hidden units
DNNs have been successfully applied to several problems, ranging from natural language processing and time-series prediction to image annotation. In the context of banking and customer care, the most relevant applications are customer segmentation, recommendation algorithms, fraud detection and credit scoring.
The main characteristics that make DNNs unique can be summarized in the following points:
1. High learning capacity: they don't saturate easily, so the more data you have, the more they learn.
2. No feature engineering required: learning can be performed end-to-end.
3. High generative capability: DNNs can generate unseen, but plausible, data based on latent representations.
4. Knowledge transfer: we can train a model on one large data set and transfer what it has learned to a similar problem where less data is available.
5. Excellent unsupervised capabilities: DNNs can learn hidden statistical representations without requiring any labels.
6. Multimodal learning: DNNs can seamlessly integrate disparate sources of high-dimensional data, such as text, images, video and audio, to generate conditional probability distributions.
31.3.2 Why Are Deep Neural Networks a Game Changer?
To demonstrate why DNNs are so effective, let's consider the case of weather forecasting. It is a very complex problem that takes as inputs many measurements of previous conditions in a space-time mesh. Current prediction models are based on huge grid-based finite-element calculations: large sets of fluid dynamics differential equations are solved iteratively, and the results of each step are used as initial conditions for the next one. This is computationally extremely expensive, and the predictive accuracy is limited because errors multiply at each predictive time step.
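The way errors multiply in step-by-step forecasting can be illustrated with a short Python sketch. It rests on a deliberately crude assumption that is not taken from the forecasting models themselves: each step multiplies the accumulated error factor by the same amount. The function name `compounded_error` and the 2% figure are hypothetical.

```python
def compounded_error(per_step_error: float, steps: int) -> float:
    """Relative error after `steps` iterations, assuming every step
    multiplies the accumulated error factor by (1 + per_step_error)."""
    return (1.0 + per_step_error) ** steps - 1.0


# Hypothetical figure: a 2% relative error introduced at each time step.
for steps in (1, 12, 48):
    print(steps, round(compounded_error(0.02, steps), 3))
```

Under this assumption the error grows geometrically with the number of steps, which is why long forecasting horizons are hard for iterative schemes.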
Recently, however, a combination of 3D Convolutional Neural Networks and networks with Long Short-Term Memory (LSTM) cells made it possible to build an accurate model with up to 100 million parameters, trainable end-to-end, that predicts the weather up to two days ahead in less than 0.1 s on a laptop, with better accuracy than models that need several hours of computation on a supercomputer [10].
Deep learning has also been applied to other very challenging problems, such as image annotation, voice recognition and control, sometimes with super-human accuracy, for instance in the ImageNet competition [6].
31.4 Deploying Artificial Intelligence in Banks
In 2015, technology companies spent $8.5 billion on deals and investments in artificial intelligence, four times more than in 2010. In 2016, Deutsche Bank announced a “crowdstorming” initiative for ideas on how artificial intelligence can be used in the financial services industry, inviting people to submit their concepts for the chance to develop them at the German giant's innovation labs and win cash prizes [11]. With the likes of Google, Facebook and IBM among those pouring resources into AI, Deutsche Bank is hoping to be at the forefront of the technology's application in financial services and to wrest some of the AI talent away from the big technology firms. What is driving this change?
The goal of customer analytics is to create a deeper understanding of customers and their behavior in order to maximize their lifetime value to the company. Customer analytics has many applications, such as customer marketing, credit scoring and approval, profitable credit card customer identification, high-risk loan applicant identification, payment default prediction, fraud detection and money laundering detection. Banks are using these techniques to reduce costs and simplify customer interactions. Some early examples can be seen in recent activities at RBS and Barclays.
After falling £2 billion into the red in 2016, RBS, a UK bank, announced that it would replace staff who offer investment tips with so-called ‘robo-advisers’ in order to reduce costs. Robo-advisers have been around for a while in the US. A report from Cerulli Associates in August 2016 said that robo-advice platforms are expected to reach $489 billion (£323 billion) in assets under management by 2020, up from $18.7 billion today. Current robo-advisers are essentially online wealth management services that use algorithms to suggest automated investment portfolios based on customers' goals and attitude to risk. The RBS deployment will provide an automated system that offers customers advice based on their responses to a series of questions. At the same time, RBS will reduce internal headcount by over 550 staff [7].
With the aim of boosting its multi-channel delivery and understanding customer behavior, Barclays' South African subsidiary, Absa, announced a trial of chatbots that use artificial intelligence to answer simple customer queries posted over popular smart messaging apps. The goal of the bank is to connect with customers through their conversational channels of choice rather than through traditional, more limited options such as SMS or email. The use of two-way messaging is expected to create a feedback loop that helps the bank better understand the most pressing issues for improving customer service [12].
31.4.1 Customer Segmentation and Preference Analysis
By deploying fine-grained customer segmentations, in which customers share similar preferences for different sub-branches or market regions, banks can gain deeper insights into their customers' characteristics and preferences. This improves customer satisfaction and the precision of marketing, because banking products, services and marketing messages can be personalized.
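As an illustration of what a segmentation step can look like, here is a minimal k-means sketch in plain Python. The text above does not name a specific algorithm, so k-means is only one possible choice, and the two features (monthly branch visits and monthly mobile logins) as well as the sample values are hypothetical.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each customer to the nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical features: (monthly branch visits, monthly mobile logins).
customers = [(8, 1), (9, 2), (7, 1), (1, 30), (0, 25), (2, 28)]
labels, centroids = kmeans(customers, k=2)
print(labels)
```

In this toy data set the branch-oriented and mobile-oriented customers end up in separate segments, which could then receive different products or marketing messages.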
31.4.2 Recommendation Algorithms for Marketing
Recommendation algorithms are ubiquitous on e-commerce sites. A Recommender System (RS) is an algorithm that suggests items that may interest a user. It takes as input information on users' past preferences (transactional data) over a finite set of items, collected either explicitly (ratings) or implicitly (by monitoring user behavior, such as songs heard, applications downloaded or web sites visited), together with information about the users or the items themselves.
Deep Learning can address this problem through a hierarchical Bayesian approach. Collaborative Deep Learning (CDL) [13] jointly learns deep representations from the content of items and users while also considering the ratings matrix, in a self-consistent way and with significantly better results. It relies on tightly coupled schemes that allow two-way interaction between the rating matrix and the content: the rating information guides the learning of features and, conversely, the extracted features can improve the predictive power of the collaborative filtering (CF) models.
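The collaborative filtering side of such a system can be sketched with a plain matrix factorization trained by stochastic gradient descent. This is not CDL itself: it uses only the ratings matrix and ignores item content. All names, hyperparameters and ratings below are hypothetical.

```python
import random


def predict(users, items, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


def factorize(ratings, n_users, n_items, n_factors=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit factor vectors to (user, item, rating) triples by SGD."""
    rng = random.Random(seed)
    users = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
             for _ in range(n_users)]
    items = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
             for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(users, items, u, i)
            for f in range(n_factors):
                uf, vf = users[u][f], items[i][f]
                # Gradient step on the squared error with L2 regularization.
                users[u][f] += lr * (err * vf - reg * uf)
                items[i][f] += lr * (err * uf - reg * vf)
    return users, items


# Hypothetical ratings: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
full = [[5, 5, 1, 1], [5, 5, 1, 1], [1, 1, 5, 5], [1, 1, 5, 5]]
held_out = {(0, 3), (3, 0)}  # entries hidden from training
ratings = [(u, i, full[u][i]) for u in range(4) for i in range(4)
           if (u, i) not in held_out]
users, items = factorize(ratings, n_users=4, n_items=4)
print(round(predict(users, items, 0, 3), 2))  # estimate for a hidden rating
```

A content-aware approach such as CDL would, in addition, learn the item vectors from item content, so that items with few or no ratings can still be recommended.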
31.4.3 Credit Scoring
Credit risk analysis is a very important and timely topic. Neves and Vieira [14] pioneered the use of Deep Artificial Neural Networks for the distress prediction of SME companies. They showed that these types of ANNs substantially outperform traditional methods based on Logistic Regression or Support Vector Machines. Recently, Lopes and Ribeiro [24] also applied a model based on deep belief networks (DBN) to bankruptcy prediction. Despite using a small dataset, they showed that DBNs can achieve better accuracy than SVMs or Restricted Boltzmann Machines (RBM).
31.4.4 Churn Management
Churn is a classical, but important, problem for banks. It costs five times more to acquire a new customer than to retain an existing one. To prevent such attrition (churn), it is critical to identify the early warning signs of churn. Artificial Intelligence has been applied with success to this problem through the selection of features that work as proxies for early indicators of churn, using a semi-supervised approach. This can be done either from a more conventional transactional data perspective or by analyzing network activity: the relationships between customers and their degrees of connectivity and influence.
The Churn Score assigns a probability to each customer, indicating the predicted likelihood that the customer will churn within a predefined period of time. It is constructed from different models, the most useful being Random Forests (or an upgraded version such as gradient boosting trees) and more advanced Convolutional Neural Networks. Depending on the quality of the data and the business activity, accuracies of up to 90% are common.
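To make the idea of a churn score concrete, the following sketch trains a small logistic regression by gradient descent and outputs a churn probability per customer. Logistic regression is a deliberately simpler stand-in for the Random Forest, gradient boosting and CNN models named above, and the two features (inactivity and product usage, both scaled to the range 0 to 1) and all data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def churn_score(weights, bias, x):
    """Predicted probability that a customer with features `x` churns."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    weights, bias, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            diff = churn_score(weights, bias, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += diff * xij
            grad_b += diff
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical features: (inactivity, product usage); label 1 means churned.
X = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.1), (0.2, 0.9), (0.1, 0.8), (0.3, 0.7)]
y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic(X, y)
print(round(churn_score(weights, bias, (0.85, 0.15)), 2))
```

In practice the scores would be sorted so that retention campaigns target the customers with the highest predicted probability first.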
31.4.5 Customer Identification
This problem consists of identifying potential high-revenue or loyal customers who are likely to become profitable to the bank but are not yet on its books. These methods rely on template matching (targeting new customers based on past behavior) and allow banks to build a more complete and accurate target list of high-value customers, which can improve marketing efficiency and bring huge profits.
Another technique is customer network analysis. It consists of understanding customer and product affinity by analyzing social media networks and the relations within them through graph connectivity analysis. Customer network analysis can improve customer retention, cross-selling and up-selling.
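One simple form of graph connectivity analysis is degree centrality, which measures how many direct connections each customer has relative to the maximum possible. The sketch below is only an illustration: the customer names and relationships are hypothetical, and real analyses typically use richer measures of influence.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical customer relationships (e.g. shared accounts or referrals).
edges = [("ana", "ben"), ("ana", "carla"), ("ana", "dev"),
         ("ben", "carla"), ("dev", "eli")]
scores = degree_centrality(edges)
most_connected = max(scores, key=scores.get)
print(most_connected, scores[most_connected])
```

Highly connected customers are natural candidates for retention efforts, since their departure may influence the customers linked to them.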
Market potential analysis: Using economic, demographic and geographic data, we can generate the spatial distribution of both existing and potential customers. With the market potential distribution map, banks get a clear overview of the target customers' locations and can identify areas where customers are concentrated or lacking, in order to invest or divest, which supports the banks' customer marketing and exploration [15].
Channel allocation and operation optimization: Based on the bank's strategy and the spatial distribution of customer resources, this module optimizes the configuration (i.e., location and type) and operations of service channels, i.e., retail branches or automated teller machines [16].
31.4.6 Conversational Bots (Chatbots) for Customer Service
Probably the largest, and most immediate, impact of AI anywhere, supported by DNNs, will not be in self-driving cars or robotics but in customer service. Examples include services such as sending a targeted email, a mobile push notification or a customer pass for a specific shop or event, as well as predictive analytics to support decisions and call centers. Contact centers deal with very mundane interactions that will soon be handled through automated messaging, such as chatbots and personal assistants. AI can help suggest how to conduct a conversation, taking into account user interests and products. It can even use the data for secondary purposes, such as risk assessment based on previous interactions.
Chatbots have come a long way since Eliza, the first conversational machine, invented in the 1960s. Trained on large corpora of data, they are capable of answering almost any type of question. The technology behind chatbots is based on recurrent neural networks (RNN) for text generation that can be trained end-to-end [17]. In recent years, the demand on chatbots has changed from answering simple questions (almost as a toy) to performing smoothly in open-domain conversations, like real humans. The challenge is to model conversation within a given domain. By introducing trainable gates to recall the global domain memory, deep learning models can incorporate background knowledge to enhance the sequence semantic modeling ability of LSTMs.
Banks channel users to customer-service representatives, generally via a call or live chat, but chatbots are a new medium of communication that is rapidly making inroads at other financial institutions. Some examples of chatbots already implemented:
DBS in Singapore recently launched Mykai, a conversational bot (created by the startup Kasisto) that helps customers perform routine operations, such as payments and checking. In the future, the plan is to integrate it into messaging platforms such as Telegram or WhatsApp [18].
Digibank, recently launched in India, allows customers to open an account with a bank that is accessible only via mobile devices. It is based on chatbots intelligent enough to answer thousands of questions submitted via chat [19].
Penny is a conversational personal financial assistant [20].
Bank of America allows customers to interact with a bot on Facebook's Messenger platform [21].
31.4.7 Fighting Financial Crime and Money Laundering
IT companies are working with banks to create tools that increase robustness in the transaction monitoring process and in the detection of unusual financial activity. These systems are based on standard typologies of money laundering, such as spikes in the value or volume of transactions, monitoring high-risk jurisdictions, identifying rapid movement of funds, screening against sanctioned individuals and politically exposed persons (PEPs), and monitoring listed terrorist organizations. Today's challenges lie in setting the correct threshold levels and parameters, identifying ‘false positives’ quickly and accurately, streamlining operations to minimize costs, securing accurate data sources, and producing accurate and timely reporting. The future lies in seeing how the intrinsic benefits of DL can play a role. For example, if the IT system could learn from previous cycles and identify false positives before an alert was generated, it would be a ‘game-changing’ factor in transaction monitoring, speeding up and increasing the accuracy of identifying truly suspicious activity.
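The threshold-setting problem described above can be made concrete with a minimal rule-based sketch. It flags a transaction as a spike when its value lies more than a chosen number of standard deviations above the customer's own history. The function name `is_spike`, the default threshold of 3.0 and the amounts are hypothetical, and this is the kind of fixed rule that a learning system would aim to improve on, not a DL model.

```python
from statistics import mean, stdev


def is_spike(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations above the mean of the customer's past transactions."""
    if len(history) < 2:
        return False  # not enough history to estimate a spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount > mu
    return (amount - mu) / sigma > threshold


# Hypothetical transaction values for one customer.
history = [120.0, 95.0, 110.0, 130.0, 105.0]
print(is_spike(history, 5000.0), is_spike(history, 125.0))
```

A threshold that is too low floods analysts with false positives, while one that is too high misses suspicious activity, which is why the choice of `threshold` is the delicate part of such rules.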
Digital Marketplaces Unleashed Page 45