With the advent of Long-Term Evolution (LTE) networks and the spread of a highly varied range of
services, mobile operators are increasingly aware of the need to strengthen their maintenance and
operational tasks in order to ensure a high-quality user experience. Furthermore, the coexistence
of multiple Radio Access Technologies (RATs), the increase in traffic demand and the need
to provide a great variety of services are steering the cellular network toward a new scenario where
management tasks are becoming increasingly complex. As a result, mobile operators are focusing their
efforts on maintaining their networks without increasing either operational
expenditures (OPEX) or capital expenditures (CAPEX). In this context, it is becoming necessary to
automate management tasks effectively through the concept of Self-Organizing Networks (SON).
In particular, SON functions cover three different areas: Self-Configuration, Self-Optimization and Self-
Healing. Self-Configuration automates the deployment of new network elements and their parameter
configuration. Self-Optimization is in charge of modifying the configuration of the parameters in order to
enhance user experience. Finally, Self-Healing aims to reduce the impact that failures and service
degradation have on the end user. To that end, Self-Healing (SH) systems monitor the network elements
through several alarms, measurements and indicators in order to detect cells that are in outage or
degraded, then diagnose the cause of the problem and, finally, execute compensation or recovery actions.
Even though mobile networks are becoming more prone to failures due to their huge increase in
complexity, the automation of the troubleshooting tasks through the SH functionality has not been fully
realized. Traditionally, both research and development on SON have focused on
Self-Configuration and Self-Optimization. This has been mainly due to the challenges that need to be
faced when SH systems are studied and implemented. This is especially relevant in the case of fault
diagnosis. However, mobile operators are paying increasing attention to Self-Healing systems,
which entails finding ways to overcome those challenges so that SH functions can be developed.
On the one hand, diagnosis is still performed manually, since identifying the fault cause effectively
requires considerable hard-earned experience. In particular,
troubleshooting experts thoroughly analyze the performance of the degraded network elements by
means of measurements and indicators in order to identify the cause of the detected anomalies and
symptoms. Therefore, automating the diagnosis tasks means knowing which specific performance
indicators have to be analyzed and how to map the identified symptoms to the associated fault cause.
This knowledge is acquired over time and is operator-specific, since it depends on each operator's
policies and network features. Furthermore, troubleshooting experts typically solve the failures in a
network without either documenting the troubleshooting process or recording the analyzed indicators
along with the label of the identified fault cause. In addition, because there is no specific regulation on
documentation, the few documented faults are neither properly defined nor described in a standard
way (e.g. the same fault cause may be given different labels), making it even more difficult to
automate the extraction of the expert knowledge. As a result, this lack of documentation and of
historical fault reports makes automating the diagnosis process more challenging.
On the other hand, when the exact root cause cannot be remotely identified through the statistical
information gathered at cell level, drive tests are scheduled to gather further information. These drive
tests aim to monitor mobile network performance by using vehicles to measure the radio interface
quality in the field along a predefined route. In particular, the troubleshooting experts use specialized
test equipment in order to manually collect user-level measurements. Consequently, drive tests entail a
hefty expense for mobile operators, since they involve considerable investment in time and costly resources
(such as personnel, vehicles and complex test equipment). In this context, the Third Generation
Partnership Project (3GPP) has standardized the automatic collection of field measurements (e.g.
signaling messages, radio measurements and location information) through the mobile trace feature
and its extended functionality, the Minimization of Drive Tests (MDT). In particular, these features allow
network performance to be monitored automatically and in detail, reaching areas that cannot be covered by
drive testing (e.g. indoor or private zones). Thus, mobile traces are regarded as an important enabler for
SON, since they free operators from relying on those expensive drive tests while, at the same time, providing
greater detail than the traditional cell-level indicators. As a result, enhancing the SH functionalities
through the mobile traces increases the potential cost savings and the granularity of the analysis. Hence,
in this thesis, several solutions are proposed to overcome the limitations that prevent the development
of SH, with special emphasis on the diagnosis phase. To that end, the lack of historical labeled databases
has been addressed in two main ways. First, unsupervised techniques have been used to automatically
design diagnosis systems from real data without requiring either documentation or historical reports
about fault cases. Second, a group of significant faults has been modeled and implemented in a
dynamic system level simulator in order to generate an artificial labeled database, which is extremely
important for evaluating the proposed solutions and comparing them with state-of-the-art algorithms. Then,
the diagnosis of those faults that cannot be identified through the statistical performance indicators
gathered at cell level is automated by the analysis of the mobile traces, avoiding costly drive tests. In
particular, in this thesis, the mobile traces have been used to automatically identify the cause of each
unexpected user disconnection, to geo-localize RF problems that affect the cell performance and to
identify the impact of a fault depending on the availability of legacy systems (e.g. Third Generation, 3G).
Finally, the proposed techniques have been validated using real and simulated LTE data by analyzing their
performance and comparing it with reference mechanisms.