Telecoms operators attach great importance to big-data initiatives, but most currently feel neither ready to take on the task themselves nor willing to hand it to someone else, even though they sense the potential. Pyramid Research explains how analytics can drive innovation for telecoms providers and addresses some of the challenges that must be overcome in implementation.
The most widely cited benefit of data analytics (DA) is its potential to create a unified view of a customer. Demand for personalisation is likely to be much stronger in countries where there is a strong preference for digitally shaped lifestyles. There is certainly a market for personalised services through DA applications, but the 'targeting' will not be as easy as is currently idealised. Studies often find evidence for users' preference for targeted advertising, but also their great frustration when such attempts fail and they receive irrelevant advertising.
There is a fine line between being accurate and being intrusive. DA can carry existing customer experience management programmes beyond their focus on 'touchpoints' through three main data pillars: attributes, context and actions. Knowing all the attributes of an individual is useless if they cannot be translated into that individual's actions in a particular context. Personalisation works by analysing data at the aggregate level to derive conclusions, then applying those conclusions to individuals who share the attributes or conditions of that aggregate.
Some level of simplification is also likely to be needed, which is perhaps why factor analysis tends to be so critical to DA implementations. The biggest challenge for operators is learning how to learn - how to derive conclusions. DA can spit out many interesting relationships, but selecting those relevant to personalisation requires skilful interpretation grounded in contextual and theoretical knowledge.
Managing the customer experience through personalisation will be a holistic effort, combining the three main types of data. To avoid superficial generalisations, the motivations and rationale behind any data that collectively describes an experience need to be subject to detailed examination, skilful interpretation and theoretical scrutiny.
Markets vary considerably in their appetite for and ability to digest innovations. One well-known rule of thumb for innovation management is to make the new product or service distinctly 'new' while keeping it sufficiently familiar to your customers' existing experience. From this perspective, customers can be overwhelmed if they are bombarded with new services at a frequency that feels unmanageable.
DA will put telecoms operators in an interesting dilemma: whom to serve with what? The decision will be between going after consumers and businesses directly through proprietary applications, and selling data or applications to others who will pursue the operator's users themselves.
It is possible to combine these in a B2C/B2B portfolio, but whether the 'killer apps' will benefit the operators or other stakeholders, such as retailers, is yet to be seen. One example of a proprietary B2B application comes from the UK, where Everything Everywhere, O2 and Vodafone jointly created Weve, a DA m-commerce platform for marketing and payments. On Weve, they will be running analytics and creating location-based applications that retailers will be able to plug into for promotions and easily convert to transactions.
As an example of an open B2B application, Facebook allows app developers to access data on the users of their particular apps; by doing so, it attracts developers but forgoes some of the analytical possibilities. Some telecoms operators have access to the Facebook data of the users of their applications; Facebook finds the value of creating such 'stickiness' greater than that of keeping the data proprietary. Telecoms operators have some critical pieces of data that OTT players do not, so they will be very careful about making their datasets available commercially and in partnerships.
Some critical end-user data includes: customer data, location data, call records, web logs, traffic, and a variety of data from fixed and mobile end-user devices, such as handsets and set-top boxes. Telecoms operators and OTT providers have complementary assets but competitive instincts.
DA suggests a paradigm shift in data management, but the actual implementations should be a healthy mix of the old and the new.
Databases for DA can query and manipulate data across different and seemingly incompatible formats, such as text, video and audio. They can also store and navigate incoming data without subjecting it to the strict requirements associated with relational database management systems (RDBMS).
However, DA should not be idealised as a silver bullet, because analytical operations still require the collected unstructured data to be maintained, prepared and supervised. Making any query imposes a structure on the data anyway. Different analytics require different representations of the data, making translation and organisation critical at some level. Critics of DA query methods argue that, in many instances, processing unstructured data does not produce better results than existing structured SQL queries using an RDBMS.
The most salient data management tools for the storage and processing of DA are Apache Hadoop and MapReduce, as well as NoSQL approaches that do not impose a priori relationships across data categories.
Together, these three elements provide a framework for distributed data storage and computing, as well as the software layer used for data processing and manipulation. Adoption is currently very limited, in that only the largest and most innovative companies, such as Google and Amazon, feel comfortable moving in this direction.
DA has been made possible by the commodification of storage and processing power. Advances in distributed storage and computing, as well as new algorithms for task scheduling and resource allocation, paved the way. Thanks to cheaper hardware, shared-nothing architectures that distribute data storage and computing loads across independent nodes are becoming feasible to deploy. Apache Hadoop and MapReduce, for instance, are frameworks for such storage and processing operations. Independent 'nodes' with back-ups mean operations run fast and with low failure rates.
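As a rough illustration of the pattern - not the actual Hadoop API - a word count in the MapReduce style can be sketched in Python: the map step emits key-value pairs, a shuffle groups them by key, and the reduce step aggregates each group. The log lines are invented.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit a (word, 1) pair for every word in every record.
    for record in records:
        for word in record.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values independently.
    return {key: sum(values) for key, values in groups.items()}

logs = ["call dropped", "call completed", "call dropped"]
counts = reduce_phase(shuffle(map_phase(logs)))
# counts == {"call": 3, "dropped": 2, "completed": 1}
```

In a real cluster, the map and reduce steps run in parallel on independent nodes and the framework handles the shuffle; the logic per key, however, is exactly this simple.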
Types of analysis
The magic ingredient of DA is the analytics, the methods of querying the data, regardless of whether the data is collected structured or unstructured. A distinction that a technology strategist should make when considering DA is analytics versus applications. Analytics is the variety of (often statistical) methods that can be applied to any variable. The same data appears very differently seen through different methodological lenses, and each analytical method may require slightly different representations of the same data. For instance, social media analysis methods can be used equally on user data or network data to observe a vast number of potential relationships between users or network components.
Applications are combinations of data and analytics on particular variables of interest that can be reused and translated from one context to another. Often, successful analytics cases have simply been repackaged into applications. For instance, 'recommendations' are packages that combine social network analysis and statistical methods towards the very particular end of recommending new consumption alternatives. The data to be collected and the analytical constraints are more or less predefined.
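The 'recommendations' package idea can be sketched in a few lines of Python. The viewing histories and item names below are invented, and a production recommender would combine far richer social and statistical methods; this shows only the co-occurrence core.

```python
from collections import Counter
from itertools import combinations

# Hypothetical viewing histories, one set of items per user.
histories = [
    {"news", "sport", "drama"},
    {"news", "sport"},
    {"sport", "drama"},
]

# Count how often each pair of items appears in the same history.
co_occurrence = Counter()
for history in histories:
    for a, b in combinations(sorted(history), 2):
        co_occurrence[(a, b)] += 1
        co_occurrence[(b, a)] += 1

def recommend(item, n=1):
    # Rank other items by how often they co-occur with `item`.
    scores = {b: c for (a, b), c in co_occurrence.items() if a == item}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# recommend("news") suggests "sport", its most frequent companion.
```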
Most telecoms operators have the raw materials for DA but are far from capable of the required analytics. Although off-the-shelf applications developed by third-party software companies can be an invaluable source of learning, and sometimes differentiation, operators should invest in their own analytical capabilities.
Although social network analysis has the word 'social' in its name, it does not have to deal with social phenomena. Anything can be represented as a node in a social network diagram and linked to others by a relationship - for example, people connected on a social network site through friendship links, active network components linked by data paths, movies that are connected through their common audiences, or people who call or message each other.
Networks are analysed by certain indicators, such as the centrality of a node or the density of a network, that can produce insights into real-world phenomena, such as popularity and innovation diffusion. Moreover, the nodes and links can have attributes - the age of the node, or the intensity or the direction of links, for instance - that can be brought into the analysis.
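These indicators can be sketched in Python on a toy call graph; the subscriber names are invented, and real implementations would use a graph library rather than hand-rolled dictionaries.

```python
from collections import defaultdict

# A toy undirected call graph: who has called whom.
calls = [("ana", "ben"), ("ana", "cem"), ("ana", "dua"), ("ben", "cem")]

neighbours = defaultdict(set)
for a, b in calls:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Degree centrality: the number of distinct links a node has.
degree = {node: len(links) for node, links in neighbours.items()}

# Density: the share of possible links actually present
# (undirected graph, no duplicate edges).
n = len(neighbours)
density = 2 * len(calls) / (n * (n - 1))

# 'ana' has the highest degree (3), marking her as a hub;
# density = 8/12, i.e. two-thirds of possible links exist.
```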
Automated crawlers and text coders are available to track the appearance and frequency of certain words or phrases. This is, in some contexts, also called text mining. Text-coding software can be provided with certain words and phrases to look for; some off-the-shelf software packages have predefined vocabularies for tone and sentiment analysis.
Manual content analysis methods, such as sorting and open coding, are used in the early stages of customer experience management programmes, which can later be automated through software packages.
Once frequencies of mentions are captured, statistical tests can be run on them. There are software packages available, depending on data sizes and the level of analytical sophistication. Autonomy, IBM SPSS, Atlas.ti, NVivo, EZ-Text, Clarabridge and Aerotext are but a few examples at different levels of sophistication.
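A minimal, hypothetical version of this frequency-and-sentiment counting can be sketched in Python. The feedback snippets and the two-word sentiment vocabulary are invented; commercial packages ship with far richer predefined vocabularies.

```python
import re
from collections import Counter

# Invented customer feedback snippets.
feedback = [
    "Great coverage but slow data speeds",
    "Slow customer service, great prices",
]

# A deliberately tiny, hand-made sentiment vocabulary.
positive = {"great", "good"}
negative = {"slow", "poor"}

# Count word frequencies across all snippets.
words = Counter(w for text in feedback
                for w in re.findall(r"[a-z]+", text.lower()))

# Tally mentions against each sentiment list.
sentiment = {
    "positive": sum(words[w] for w in positive),
    "negative": sum(words[w] for w in negative),
}
```

Once frequencies like these are in hand, they become ordinary numeric variables on which the statistical tests mentioned above can be run.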
Association rule learning is a data-mining method that searches large datasets for interesting co-occurrences. The method is best known for its application to 'frequent items' in consumer baskets and the design of shelf space. Items in purchase transactions are highly associated if they are often bought together - toilet paper with kitchen towels, or beverages with burgers. The method has well-developed search algorithms that can find associations with little guidance from the analyst, and it can be applied across any variables where associations are expected, not just frequent items. To keep outputs manageable, analysts set a cut-off for the strength of association before beginning.
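A stripped-down sketch of frequent-pair mining in Python, using invented baskets and an arbitrary support cut-off of 0.5; real algorithms such as Apriori add efficient candidate pruning on top of this counting idea.

```python
from collections import Counter
from itertools import combinations

# Invented purchase baskets.
baskets = [
    {"burger", "cola", "fries"},
    {"burger", "cola"},
    {"toilet paper", "kitchen towel"},
    {"burger", "fries"},
]

min_support = 0.5  # the analyst's cut-off, fixed before mining

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep pairs whose support (share of baskets) meets the cut-off.
frequent = {pair: count / len(baskets)
            for pair, count in pair_counts.items()
            if count / len(baskets) >= min_support}
# ('burger', 'cola') and ('burger', 'fries') each reach support 0.5.
```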
Multidimensional scaling (MDS) is another way of finding relationships between data without imposing expectations a priori. MDS often represents similarities and differences between things by displaying distances across categories of interest. The closer two points are on an MDS map, the more similar they are. The visual representation is indicative only. The names of the axes in an MDS map are never given - they are assigned by those who see and interpret the map.
MDS has a strong tradition in marketing departments, which use it to observe and interpret clusters of brands. Any set of variables can be mapped in terms of their 'similarities' by using correlations as a measure of closeness.
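Classical MDS can be sketched with NumPy: double-centre the squared distance matrix and take the leading eigenvectors as coordinates. The dissimilarity matrix below is invented and happens to be exactly Euclidean, so the recovered distances match it.

```python
import numpy as np

# Invented pairwise dissimilarities between three brands.
D = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 3.0],
              [4.0, 3.0, 0.0]])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
B = -0.5 * J @ (D ** 2) @ J           # double-centred squared distances

# Coordinates come from the largest eigenvalues/eigenvectors of B.
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1]
top = order[:2]
coords = eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0))
# Each row of `coords` is a brand; close rows mean similar brands.
```

The axes of the resulting map carry no names of their own, which is precisely why interpretation is left to the analyst looking at the clusters.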
Factor analysis is a method that takes myriad variables belonging to a dataset and groups them into a number of factors by computing high and low correlations between them. Those that are highly correlated can be considered to be driven by a similar latent factor. In marketing and management research, factor analysis is often used to reduce the number of variables to track in order to simplify the data requirements of complex settings. Consumer preference correlations can also be used to identify larger 'preference styles'.
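The intuition - highly correlated variables reflect one latent factor - can be sketched in Python with invented data. A real factor analysis would estimate factor loadings rather than group greedily by a correlation threshold, so treat this only as an illustration of the grouping idea.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=100)  # an unobserved "heavy data user" factor

# Two variables driven by the latent factor, one unrelated variable.
data = np.column_stack([
    latent + 0.1 * rng.normal(size=100),   # e.g. streaming hours
    latent + 0.1 * rng.normal(size=100),   # e.g. gigabytes consumed
    rng.normal(size=100),                  # unrelated variable
])

corr = np.corrcoef(data, rowvar=False)

# Greedily group variables whose correlation exceeds a threshold.
groups, assigned = [], set()
for i in range(corr.shape[0]):
    if i in assigned:
        continue
    group = [i] + [j for j in range(i + 1, corr.shape[0])
                   if j not in assigned and abs(corr[i, j]) > 0.8]
    assigned.update(group)
    groups.append(group)
# groups == [[0, 1], [2]]: the first two variables share one latent factor.
```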
Fitness landscapes are used to describe a set of problem constraints and to test various solution models in order to see which one performs the best. They can be used to identify optimal search patterns or find the shortest path, for example. Their findings are already deployed in network engineering and traffic management.
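A toy example of searching a fitness landscape: the 'ones' landscape, where the fitness of a bit string is simply its number of 1s, searched by greedy one-bit hill climbing. Everything here is illustrative; real network-engineering landscapes are far more rugged.

```python
def fitness(bits):
    # The "ones" landscape: more 1s means a fitter solution.
    return sum(bits)

def hill_climb(bits):
    # Repeatedly accept any single-bit flip that improves fitness.
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            flipped = bits[:i] + [1 - bits[i]] + bits[i + 1:]
            if fitness(flipped) > fitness(bits):
                bits, improved = flipped, True
    return bits

start = [0, 1, 0, 0, 1, 0, 1, 0]
best = hill_climb(start)
# On this smooth landscape the climber always reaches the optimum of all 1s.
```

On rugged landscapes, simple hill climbing gets trapped on local peaks, which is why the search patterns themselves become an object of study.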
The broad family of statistical tools known as multivariate statistics is behind some of the analytics already discussed, such as MDS and factor analysis, but it encompasses a much richer portfolio. More traditional statistical methods - regression, likelihood tests, principal component analysis - and an understanding of the distribution of data will not become irrelevant with DA.
One of the biggest challenges of DA implementation is the lack of professionals with the necessary skill sets. DA is not solely a matter of statistics, nor is it only for geeks. A new profession that has come to be known as data science demands a currently rare mix of:
- Solid expertise in novel methodological skills. Content analysis, social networks, genetic algorithms or association rule learning may fall outside the comfort zone of a traditional multivariate statistician.
- Strong theory-driven knowledge to identify meaningful relationships from spurious correlations. Because some of the findings are so unpredictable, interdisciplinary backgrounds are likely to be favoured. Fields as varied as sociology, marketing, innovation management and physics will be valuable.
- A simultaneous interest in pursuing creativity and solving optimisation problems. The old dichotomy of artist versus engineer merges in the data scientist: it takes an artist to want to face the messy, chaotic data influx, and an engineer to apply methodological rigour and impose structure on the data.
Not only do software and internet companies known for business analytics now have established educational programmes around DA, but highly reputable academic institutions are also embracing this new wave of information management and analytics. MIT, Columbia University, the University of Oxford, Syracuse University, the University of Virginia and many more academic institutions include DA in their educational curricula.
Applications designed with DA findings generate revenue. DA implementations in organisations, however, need to be viewed as investments in organisational learning and in R&D processes. Telecoms operators are usually known for outsourcing their R&D needs to players at different layers of the value chain, such as smartphone vendors or network equipment vendors. Because of this persistent lack of independence, some operators are today sandwiched between the more innovative layers of applications - OTT services and apps - and devices, such as smartphones. For struggling operators, the decision to invest in DA will be a painful one to make.
How should DA be positioned in the larger context of a telecoms operator? It can be viewed as either an occasional source of data and analytics or as an organisation's 'brain' that receives signals from different units and issues adequate responses.
The promise of DA lies in the brain approach, because DA aims to bridge the data of previously disconnected organisational locales. If DA is to be a central function, it may disrupt existing organisational arrangements. Delegating departmental data use and analytical authority may be seen as a loss of control. Will each department have its own data scientist? Will there be a centralised approach to DA with data scientists acting as departmental liaisons? Who will be responsible for new product creation? How will DA investments be justified when the returns are prone to being difficult to substantiate and distributed across multiple units? In addition, how will DA operations be accounted for in terms of costs, revenue and reporting? Although DA implementations eventually make it easier to deal with unstructured data, the level of cross-analytics across previously isolated data silos will pose severe initial integration challenges.
There are three progressively more complex areas where boundary management across organisational units will be essential to any successful implementation:
- Units may not wish to bridge their conflicting interests - doing so may mean relinquishing control, compromising on objectives or reallocating resources.
- Datasets may be standardised, but their contextual meanings may be entirely different across units, leading to misunderstandings.
- Different units may use different languages, data structures, standards and formats that cannot be immediately reconciled.
Regulation and privacy
Perhaps the biggest regulatory concerns around DA will not appear until its applications are implemented. Given that most DA applications are designed so that specific conditions trigger customer interactions, they can create a sense of surprise and privacy invasion on the customer side, even if they are 100% accurate. The policies around privacy are still evolving, but the areas of concern have become clear:
- Users' method of consent to personal data disclosure: opt-in versus opt-out? Ease of opting out? Clarity of conditions while opting in?
- Users' actual knowledge of how data will be used: who is a third-party partner? What anonymity rules apply?
- Education and awareness of data use: what are the responsibilities of organisations to educate data revealers? How much should be left to self-education?
- Enforcement of privacy policies: how is data collection monitored? How are user interests protected? Are there any unfair lock-ins or service discrimination due to data revelation? How do local and global differences align in privacy policies?
Basic demographic data has become a commodity, but analytical capabilities are still hard to replicate. In the US, thousands of data points sell for a few dollars; differentiating data is more valuable. Data on key events, such as pregnancies and birthdays, is worth more than basic demographics. Telecoms operators are likely to find synergies with other owners of valuable data, such as OTT service providers and mobile apps. Synergistic collaborations are difficult to evaluate up front: sharing data and analytic capabilities, and developing new services through collaboratively generated insights, can be susceptible to contractual hazards.
Applications that collect vast amounts of data about people and their behaviour, such as Foursquare and Facebook, are increasingly interested in opening parts of their data to operators that are willing to use it for their own analytics and applications. At the moment, users are loudly frustrated that service providers 'own' a lot of data about them, but the willingness and knowledge to actually change the situation is not yet there.
DA is not only about collecting and making sense of massive amounts of data; it is also about creating services in which customer-facing actions are triggered by predefined events or conditions. Recognising trigger conditions and servicing them differently is very likely to violate net neutrality, because doing so would involve packet inspection and traffic management. Net neutrality is a controversial topic still unresolved as a universal principle. Issues related to security, accountability, taxation and competitiveness force governments to take different stances towards net neutrality. The European Commission is a strong advocate of net neutrality but, in practice, appears to tolerate non-neutral behaviour among operators.
In emerging markets, OTT is seen as a threat to the telecoms operators, which are under less growth pressure than those in developed markets. Governments want to protect their growth industries, so they put different types of pressure on OTT communications, such as fees in China and application blocking in the Middle East.
DA applications will be harder to develop in emerging markets because of a lack of data and skills, but regulatory environments in these markets will be more relaxed towards telecoms operators that violate net neutrality.
Net neutrality is not expected to be an obstacle to value-creating applications generated and triggered by DA applications, but operators may need to find delivery models that evolve with shifting regulatory conditions on net neutrality.