The five key elements of big data are volume, velocity, variety, variability and complexity. Whether in a large enterprise or a small organisation, large volumes of data are generated every day. This complex data can be sorted and grouped to reveal patterns, trends, and associations, especially those relating to human behaviour and interactions. In technical language this is known as big data, and Hadoop is one of the frameworks widely used to store and process it. The data can be structured, unstructured or semi-structured.
Big data has become extremely important for mobile cellular networks. Understanding three of its key concepts shows how mobile networks can benefit.
- Big Volume: There are countless individual units of information, such as information from the user equipment side or management information for base stations. Because a network can contain tens of thousands of base stations and each of its millions of users makes calls, it is extremely difficult to keep track of the amount of data generated.
- High Velocity: This data changes rapidly over time, so it requires elaborate methods for collection, storage and processing.
- High Variety: The data is rarely of the same format and usually needs to be pre-processed. On the many different levels of a cellular network, data might be counters, sensor values or simple true or false statements describing the status of a certain indicator.
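To make the variety point concrete, here is a minimal Python sketch of the kind of pre-processing it implies: mapping counters, sensor values and true/false status flags onto one common numeric record. The `Measurement` type, the `normalise` function and the example identifiers (`bs-1`, `dropped_calls`) are hypothetical names chosen for illustration, not part of any real network management interface.

```python
from dataclasses import dataclass
from typing import Union

RawValue = Union[int, float, bool, str]


@dataclass
class Measurement:
    source: str   # hypothetical element identifier, e.g. a base station ID
    name: str     # hypothetical indicator name
    kind: str     # "flag", "counter" or "sensor"
    value: float  # common numeric representation


def normalise(source: str, name: str, raw: RawValue) -> Measurement:
    """Map a raw value of mixed type onto a common numeric record."""
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(raw, bool):
        return Measurement(source, name, "flag", 1.0 if raw else 0.0)
    if isinstance(raw, int):
        return Measurement(source, name, "counter", float(raw))
    if isinstance(raw, float):
        return Measurement(source, name, "sensor", raw)
    if isinstance(raw, str):
        text = raw.strip().lower()
        if text in ("true", "false"):
            return Measurement(source, name, "flag", 1.0 if text == "true" else 0.0)
        # Assumption: any other text is a numeric reading; float() raises
        # ValueError if it is not.
        return Measurement(source, name, "sensor", float(text))
    raise TypeError(f"unsupported raw value type: {type(raw).__name__}")


records = [
    normalise("bs-1", "dropped_calls", 42),
    normalise("bs-1", "temperature_c", 36.5),
    normalise("bs-1", "cell_available", True),
    normalise("bs-2", "cell_available", "false"),
]
```

Once every record has the same shape, later storage and analysis steps can treat the data uniformly, whatever level of the network it came from.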
The vast amount of data generated for the management of mobile cellular networks exceeds the capabilities of experts tasked with manual network operations.
The potential for companies that apply data science effectively is substantial. For instance, a service provider used analytics models to predict the periods of heaviest network usage arising from video streaming. It subsequently took targeted steps to relieve congestion during those times, reducing its planned capital expenditures by 15 percent. Another provider had a machine-learning model that combined socio-demographic data, information from customer touchpoints (such as call centers and social media), and data on network usage. It was able to identify, in real time, the customers most likely to defect or have trouble paying their bills. To achieve similar results, other telecom companies could start by mapping out the wealth of data at their disposal and their opportunities to exploit it.
A recent trend is the move towards self-organizing networks, which facilitate the shift from reactive management, where a network manager observes key performance indicators, to proactive interaction, where automated reports and notifications are created using machine learning techniques. This requires the automatic detection of network states from their performance indicators. Self-organization, as applied to cellular networks, is referred to as SON (Self-Organizing Networks), and it is a key driver for improving OAM (Operations, Administration and Maintenance) activities. SON aims to reduce the cost of installation and management by simplifying operational tasks through the network's capability to configure, optimize and heal itself. Its main objective is to reduce the costs associated with network operations, i.e., Capital Expenditure (CAPEX) and Operational Expenditure (OPEX).
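As a rough illustration of what automatically detecting a network state from a performance indicator can look like, the Python sketch below flags samples of a KPI series that deviate strongly from their recent history. It uses a simple rolling z-score, which is a stand-in chosen for brevity and not the method of any particular SON system. The function name `flag_anomalies`, the window size, the threshold and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev
from typing import List


def flag_anomalies(kpi: List[float], window: int = 5, threshold: float = 3.0) -> List[int]:
    """Return the indices of samples that deviate strongly from recent history.

    Each sample is compared with the mean and sample standard deviation of
    the `window` samples that precede it (a rolling z-score).
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    anomalies = []
    for i in range(window, len(kpi)):
        history = kpi[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # No spread in the history, so a z-score is undefined; skip.
            continue
        if abs(kpi[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical call setup success rate per reporting interval.
success_rate = [0.98, 0.99, 0.97, 0.98, 0.99, 0.98, 0.60, 0.98]
suspicious = flag_anomalies(success_rate)
```

In this made-up series only the sharp drop to 0.60 is flagged. A production system would more likely combine many indicators and use learned models, but the principle of turning raw indicators into automated notifications is the same.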
We believe that this need for automation will grow with the complexity that future 5G network management is expected to handle. Current cellular networks already generate a huge amount of data that, if properly stored and managed, could bring new insights into how the networks work. Correlating this data can also improve network management by drawing on the experience gained from it. The question that now arises is: what information lies within this unfathomable amount of data? What insights can be derived and used, in a timely manner, to assure quality of service and quality of experience?