
Digital Marketplaces Unleashed


by Claudia Linnhoff-Popien


  Personalization is achieved by creating and maintaining a user profile. It includes user‐provided data, aggregated data from the user history, the current context, and other data relevant to the user (e.g. friends’ data and data of users with similar behavior). The data involved in creating and maintaining the user profile might disclose a lot of information about the user. Privacy concerns should therefore be of great importance in such systems.

  In this chapter, the process of creating and maintaining user profiles in a complex ecosystem with a wide range of participants is discussed. A major concern is user privacy and trust. This should include the ability of the user to influence what information about her is shared and used. She should also decide for herself when trust is established and what kind of trust is established. The discussion includes the collection of input data to the process and approaches to processing the data. The role of cryptography is also discussed.

  26.2 Personalization and User Profile

  Personalization is described as the ability to provide tailored content and services to individuals based on knowledge about their preferences and behavior [1]. With the huge amount of information and services available on the Internet, personalization has become a valuable tool for assisting users in searching, filtering and selecting items of interest. Examples of application areas include personalized search and recommendation, decision‐making, and social and collaboration networks.

  Personalization is well known in online stores and web‐based information systems, and is used in a wide range of applications and services including digital libraries, e‐commerce, e‐learning, search engines and personalized recommendations of movies, music, books and news. Health care, in particular in combination with the Internet of Things (IoT), also has great potential to improve through personalization [2]. Techniques used for such personalization include demographic filtering [3], content‐based filtering [4], collaborative filtering [5], social networks tagging [6], query and click‐through history [7], collective search trails [8], implicit relevance feedback [9, 10], and hybrid approaches [11]. Personalization generates significant revenue for the advertising industry, and is used by businesses of various types to offer tailored goods and services to their customers.

  26.2.1 User Profile

  Existing personalization strategies require construction of user profiles that identify interests, behavior and other characteristics of individual users. A user profile can be described through a number of dimensions, including personal data (such as gender, age, nationality and preferred language), cognitive style (the way in which the user processes information), device information (that may be used to personalize presentation of information or to deliver information or services to the right device), context (that describes the physical environment when the user processes information), history (the user’s past interactions), behavior (the user’s behavior pattern), interests (topics the user is interested in), intention (goals or purposes of the user), interaction experience (the user’s competence in interacting with the system) and domain knowledge (the user’s knowledge of a particular topic) [12].

  To construct a user profile, information can be collected explicitly, through direct user participation, or implicitly, through automatic monitoring of user activities [13, 14].

  Implicit gathering of user information traditionally includes systems that automatically infer user interests or behavior by keeping track of the user’s search history in terms of submitted queries, clicked results, dwell time on clicked documents, processing of stored documents, and harvesting of information from the user’s interaction with social applications [13]. Other types of user information can be obtained implicitly from interaction with sensors or through use of short‐range communication (such as NFC, BLE or LTE Direct) on a mobile device. For example, when a user touches an NFC tag to pay for a coffee in a restaurant or to obtain a description of some tourist attraction, information about this activity (including location and time) can be captured and included in a user profile [15]. Wearable sensors can provide health‐related information about a user, such as heart rate, blood pressure or activity monitoring (for example walking or sleeping patterns), while sensors monitoring the physical environment (such as temperature, noise and air pollution measurements) provide information about the user’s surroundings.

  In explicit information gathering, the users themselves provide information, for example by specifying interests or by giving positive or negative feedback on retrieved documents [13]. Implicit collection of profile information is a continuous process, where the current interests and behavior are constantly mined. The user profile can thus change over time and be updated periodically to reflect long‐lasting, short‐term and new user interests.

  A user profile can be represented by different structures. Well known representations are keyword vectors and semantic networks [13, 14]. Using a vector, the user interests are maintained as a set (vector) of weighted terms, while semantic networks represent user interests as nodes and associated nodes that capture terms and their semantically related or co‐occurring terms respectively [13]. Weights can also be assigned to nodes. The terms used in these models are either directly mined from the captured user information or some conceptual terms that are drawn from some knowledge source (based on the user information).
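
  The keyword‐vector representation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a method taken from the cited work: the function names `build_keyword_vector` and `cosine_similarity`, the whitespace tokenization, and the frequency‐based weighting are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def build_keyword_vector(documents):
    """Return a term -> weight mapping; weights are relative term frequencies.

    Assumption: plain whitespace tokenization and frequency weighting.
    A real system would likely use stemming, stop-word removal or TF-IDF.
    """
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical captured user information (e.g. submitted queries).
profile = build_keyword_vector(["jazz concert tickets", "jazz festival oslo"])

# Score two candidate items against the profile.
jazz_score = cosine_similarity(profile, {"jazz": 1.0})
football_score = cosine_similarity(profile, {"football": 1.0})
```

  An item that shares heavily weighted terms with the profile scores higher, which is the basic mechanism a content‐based filter could use to rank items for the user.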

  Vector‐based user models may consist of one or multiple vectors. One vector can for example represent short‐term user interests while a second vector represents the long‐term interests [13]. One may also consider using different vectors for different purposes, such as different user contexts or applications.
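
  One possible way to maintain separate short‐term and long‐term vectors is to apply the same update rule with different decay factors. The sketch below is only an illustration: the `update` function, the exponential decay rule and the decay values 0.5 and 0.9 are assumptions, not taken from the cited work.

```python
def update(vector, observed_terms, decay):
    """Decay all existing weights, then reinforce the observed terms.

    Assumption: a simple exponential-decay rule. A decay close to 1
    forgets slowly (long-term), a smaller decay forgets quickly (short-term).
    """
    updated = {term: weight * decay for term, weight in vector.items()}
    for term in observed_terms:
        updated[term] = updated.get(term, 0.0) + (1.0 - decay)
    return updated


short_term, long_term = {}, {}

# Hypothetical sequence of sessions: two about jazz, then one about skiing.
sessions = [["jazz"], ["jazz"], ["skiing"]]
for terms in sessions:
    short_term = update(short_term, terms, decay=0.5)
    long_term = update(long_term, terms, decay=0.9)
```

  After these three sessions the short‐term vector ranks the most recent interest ("skiing") highest, while the long‐term vector still ranks the repeated interest ("jazz") highest.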

  A user activity always happens in some context, and can thus be linked to, for example, location, time, task, project or event. To make the personalization context‐relevant, one should distinguish between different user contexts, and manage the user profile so that relevant parts of the profile can relate to the different contexts. This will support adaptive context‐aware personalization.

  26.2.2 Personalization and Privacy

  There is often a dichotomy between personalization and privacy. A user profile can contain highly sensitive information, and should, on one hand, be kept secret to guard the privacy of the user. On the other hand, a profile contains very useful information for personalizing applications and services to improve convenience, accuracy and efficiency for the user. To allow others to personalize a service for you, it is required that you give up some private and potentially sensitive information. This might take the form of shopping or movement patterns, or even health‐related information.

  Most people are happy to give up personal information as long as their perceived benefit of the services outweighs the perceived cost of giving up said information. However, it is unclear if this pragmatic approach will be possible in the general case. An empirical study [16] shows that users exhibit a preference for personalization in their search results, but are unwilling to give out private information to achieve personalization when searching for topics they deem sensitive. From a societal perspective it is preferable if people make an informed decision about the amount and type of information they are sharing, and are encouraged to give up as little information as possible, thereby ensuring that privacy is not breached and people’s integrity is maintained. From the perspective of a serious business, it is preferable to acquire as much information as possible, and still protect the customers’ privacy.

  Due to the dynamic interests of individual users and different privacy attitudes and expectations, privacy concerns when doing information retrieval or service selection are different for different users in different contexts [17]. The goal must be to achieve the appropriate level of privacy of all user data in context‐based personalized services, including user profiles, context data and other user‐related information. Also, there is a need for new approaches to client‐side personalization that balance the personalization requirements of the service providers with the privacy protection requirements of the users.

  26.3 Data Collection

  Data collection is essential in the process of creating and maintaining adaptive user profiles. This involves many participants with different interests and roles. The quality (precision) and amount of data (number of data providers and frequency) will affect the quality of the outcome of the processing. We will discuss the different sources of relevant data and how the interests and roles of the different participants might influence the collection process. Examples of such data are user selections, user activities, user context, user interaction, current user devices, and a wide range of sensor data. As part of this, we present the use of Near Field Communication (NFC), Bluetooth Low Energy (BLE) beacons, and LTE Direct.

  26.3.1 Server‐Side Data Collection

  Usage information can be obtained both from the server‐side and the client‐side of a system [13]. Examples of systems where user interactions are maintained and processed on the server‐side include Facebook, Amazon, and search engines like Google, Yahoo and Bing.

  The server‐side is a much used approach for collecting application‐specific information about users. This could be any kind of data, including user location (if the user has agreed to share this information), performed search queries, when and how often the service is used, user transactions, and much more. Since the server‐side is controlled by the service provider, there is no limit on what the service provider can collect on the server‐side. But for the user to use the service, a form of trust has to be established between the user and the service provider. The combination of the value of the service (how important is the service for me), the cost (including the risk of sharing usage data), and the trust (how is my usage data used and protected) determines whether the conditions for using the service are acceptable. Fig. 26.1 illustrates the user profile (UP) in the server‐side approach.

  Fig. 26.1 Server‐side user profiles (UP)

  However, the actual set of participants involved in accessing a web‐based service is much more complex. Most advertisement‐funded services are connected to large advertisement networks that collect and combine usage data from a huge number of such services. The user experiences this when, after searching for one thing at one service provider, she is shortly afterwards presented with an advertisement for this or a similar product on the web page of another service provider.

  The main drawback of the server‐side approach is the privacy concerns that come with the automatic collection and storing of user information outside the control of the user.

  26.3.2 Client‐Side Data Collection

  Client‐side user profiling and personalization has been proposed as a means for protecting the privacy of users. With client‐side user profiling, usage information is stored and processed on the user’s own device. This approach also allows for the collection and combination of usage information from a number of applications accessed by the user [13].

  A user normally has a number of different devices available, and client‐side user profiling will thus imply mining of user activities over multiple user‐devices. Profile information from different devices must be combined and distributed, so that each device holds a user profile reflecting relevant user interests and preferences that support client‐side personalization. Client‐side data collection should support storage of user profiles on the edge devices and provide privacy‐preserving distribution of profile information.

  The smart phone is a central device in many people’s lives, and is therefore a device of specific interest with respect to user profiling. It follows the user everywhere and is used for a whole range of activities, such as connecting with others, searching and sharing information, getting directions and recommendations, gaming, streaming videos/radio/music, buying products online, social network interaction, and much more. Recently, this has also included a large range of applications based on short‐range communication such as Near Field Communication (NFC), Bluetooth Low Energy (BLE) beacons, and LTE Direct. The importance of the mobile phone and the variety of mobile user activities make the phone the central device for client‐side data collection and a natural basis for client‐side user profiling and subsequent client‐side personalization.

  In [15], we describe how a variety of NFC applications, accessed through a single personal device (i.e. a smart phone), collectively may provide useful information concerning user activity and interests. NFC is used in mobile applications to provide easy and convenient access to information and services. Examples of NFC‐based applications include payment and loyalty card applications, access keys, ticketing, and various forms of information services. We believe that NFC‐based services are well suited for user profiling as information can be implicitly gathered, while they also exhibit some of the preciseness of explicitly provided information. The touch of an NFC tag represents an explicit action including an implicit statement of interest.

  An NFC interaction can provide information about user activity, interest (through for example the information tags we choose to touch), physical location and date/time of activity. The identity of the user is known, as the user is assumed to be the owner of the mobile device. The environment (context) of the user can be determined through the physical location, and possibly combined with sensor data in the vicinity to determine for example weather and/or pollution conditions. User context can also be described through nearby points of interest and who we interact with (through peer‐to‐peer NFC interaction).

  BLE provides another type of mobile phone interaction that could become important for personalization. A beacon is a small (battery‐powered) BLE device that can be used to detect the proximity of a mobile phone. A typical example of usage is to detect when the user (with her phone) is at or close to a store, a bus stop, or another point of interest. A service (application) on the phone can automatically present personalized offers or news related to this store when the user is close enough to the BLE beacon. BLE‐enabled data, such as physical location, time spent in a location (for example a specific store) and user interaction with offered services, can also be collected, analyzed and used in a user profile. BLE can also provide information about the user context in terms of available nearby services.

  LTE Direct is a combination of direct communication between LTE Direct enabled mobile phones (without using cell towers) and proximity detection. This technology is still at an early stage, but the combination of communication between devices and proximity detection gives room for the realization of a large range of applications and services. The common example is to detect and communicate with people with a common interest that are close by. With LTE Direct, only the mobile devices are involved and no server‐side is necessary. Data on LTE Direct proximity detection and communication could be of great interest in creating user profiles for personalization.

  26.3.3 Application‐Independent User Profiling

  A user profile can either be obtained within a single application or be application‐independent. The difference is illustrated in Fig. 26.2. With application‐independent user profiling, user profile information is collected from a variety of applications that collectively describe user interests and activities. The resulting user profile will in this case be made available as a basis for personalization within multiple applications.

  Fig. 26.2 Application‐dependent user profiles (a) and application‐independent user profiles with client‐side data collection (b)

  In vertically partitioned datasets, different types of data of each user are distributed over a number of nodes. In horizontally partitioned datasets, one type of data about users is distributed over a number of nodes. For each user we have one complete user profile, typically stored on a user device (on the mobile phone in a single security domain). Other users have separate complete user profiles accessible on their devices. These complete user profiles located at each user’s secure domain are part of a horizontally partitioned data set. The other partitions in the data set are located at other users’ secure domains. Until now, the predominant approach has been to track user interaction within a single application in order to support personalization within the same application. Each application has a partial user profile stored in a separate security domain. Such an approach is a form of vertical partitioning of user‐related data sets. To best capture a comprehensive user profile that reflects the variety of user activities and interests, we believe that a cross‐application user profiling approach is beneficial. The consequence is that user profile data will form horizontally partitioned data sets.
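
  The two partitioning schemes can be illustrated with a small sketch. The records, the field names and the `partition` helper below are hypothetical and only serve to show how the same usage data ends up grouped per application (vertical, in the sense used here) or per user (horizontal).

```python
# Hypothetical usage records collected from two applications.
records = [
    {"user": "alice", "app": "music", "data": "jazz"},
    {"user": "alice", "app": "maps", "data": "oslo"},
    {"user": "bob", "app": "music", "data": "rock"},
]


def partition(rows, key):
    """Group rows by the given key; each group models one security domain."""
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts


# Application-dependent profiling: each application holds a partial
# profile of every user (vertical partitioning of user-related data).
vertical = partition(records, "app")

# Application-independent, client-side profiling: each user's device holds
# that user's complete profile (horizontal partitioning).
horizontal = partition(records, "user")
```

  In the horizontal case, the partition for "alice" contains her data from both applications, whereas in the vertical case her data is split between the "music" and "maps" domains.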

  To achieve application‐independent user profiling, the involved application providers have to agree that the increased access to user profile data adds significant value to their applications, and that the cost and risk of sharing this data with the other providers are reasonable.

  26.3.4 User Profiling and Privacy

  Despite the obvious conflict of interest, we believe that personalization can be combined with user privacy guarantees. This can be achieved through a privacy‐preserving framework handling generation, storing and controlled access to client‐side user profiles. Such a framework must manage a user’s profile so that the user is put in charge of her own private information. It should provide privacy guarantees through use of privacy‐preserving techniques, including cryptography, while at the same time allowing personalization and controlled sharing of personal information. Fig. 26.2 illustrates this for client‐side user profile management with user‐controlled access control, where the user can decide what user profile data a given application can access in the current context.
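
  User‐controlled access to a client‐side profile could, in its simplest form, look like the sketch below. It is an illustration only and not the framework described in this chapter: the `ProfileGuard` class, its methods and the example applications and contexts are hypothetical, and the cryptographic protection mentioned above is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileGuard:
    """Holds a profile and the grants the user has given (hypothetical)."""

    profile: dict
    # Maps (application, context) to the set of profile fields allowed.
    grants: dict = field(default_factory=dict)

    def allow(self, app, context, fields):
        """Record that the user lets `app` read `fields` in `context`."""
        self.grants.setdefault((app, context), set()).update(fields)

    def read(self, app, context):
        """Return only the fields granted to `app` in the current context."""
        allowed = self.grants.get((app, context), set())
        return {k: v for k, v in self.profile.items() if k in allowed}


guard = ProfileGuard(
    profile={"interests": ["jazz"], "heart_rate": 62, "location": "Oslo"}
)
# The user grants a fitness application access to heart rate at the gym only.
guard.allow("fitness", "gym", {"heart_rate"})

at_gym = guard.read("fitness", "gym")
at_work = guard.read("fitness", "work")
```

  Here the default is to deny: an application without a grant for the current context receives an empty view of the profile.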

 
