Self-Organizing Networks (SON) add automation to the Operation and Maintenance of mobile networks. Self-healing is the SON function that performs automated troubleshooting. Among other tasks, self-healing performs automatic diagnosis (or root cause analysis), that is, the identification of the most probable fault causes in problematic cells. To train automatic diagnosis functionality based on decision-support systems, supervised learning algorithms usually extract knowledge from a training set made up of solved troubleshooting cases. However, the scarcity of such sets of real solved cases is the bottleneck in the design of realistic diagnosis systems. In this paper, the properties of such troubleshooting cases and training sets are studied. Subsequently, a method based on model fitting is proposed to extract a statistical model that can generate vectors emulating the network behavior in the presence of faults. These emulated vectors can then be used to evaluate novel diagnosis systems. To assess the feasibility of the proposed approach, an LTE fault dataset has been modeled, based on both the analysis of real cases collected over two months and a network simulator. In addition, the resulting baseline model can be useful to the research community working on automatic diagnosis.
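As a rough illustration of the model-fitting idea, the sketch below fits one distribution per fault cause and per performance indicator from a handful of solved cases, then samples emulated vectors from the fitted model. It is a minimal sketch under stated assumptions, not the method of the paper: the normal distribution family, the independence between indicators, the function names `fit_model` and `generate`, the fault-cause and indicator names, and all numeric values are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist


def fit_model(cases):
    """Fit a toy statistical model from solved troubleshooting cases.

    cases: list of (fault_cause, {indicator_name: value}) tuples.
    Assumption: each indicator is modeled as an independent normal
    distribution per fault cause; the paper may use other families.
    Each (cause, indicator) pair needs at least two samples.
    """
    samples = {}
    counts = {}
    for cause, indicators in cases:
        counts[cause] = counts.get(cause, 0) + 1
        for name, value in indicators.items():
            samples.setdefault(cause, {}).setdefault(name, []).append(value)

    total = len(cases)
    model = {}
    for cause, per_indicator in samples.items():
        model[cause] = {
            # Relative frequency of the fault cause in the training set.
            "prior": counts[cause] / total,
            "indicators": {
                name: NormalDist.from_samples(values)
                for name, values in per_indicator.items()
            },
        }
    return model


def generate(model, n, seed=None):
    """Draw n emulated (fault_cause, indicators) vectors from the model."""
    rng = random.Random(seed)
    causes = list(model)
    weights = [model[c]["prior"] for c in causes]
    vectors = []
    for _ in range(n):
        cause = rng.choices(causes, weights=weights, k=1)[0]
        indicators = {
            name: rng.gauss(dist.mean, dist.stdev)
            for name, dist in model[cause]["indicators"].items()
        }
        vectors.append((cause, indicators))
    return vectors


# Hypothetical toy cases: made-up numbers, not measurements from any network.
toy_cases = [
    ("interference", {"drop_rate": 0.04, "avg_rsrp_dbm": -95.0}),
    ("interference", {"drop_rate": 0.06, "avg_rsrp_dbm": -97.0}),
    ("interference", {"drop_rate": 0.05, "avg_rsrp_dbm": -96.0}),
    ("coverage_hole", {"drop_rate": 0.09, "avg_rsrp_dbm": -118.0}),
    ("coverage_hole", {"drop_rate": 0.11, "avg_rsrp_dbm": -121.0}),
]

model = fit_model(toy_cases)
emulated = generate(model, n=50, seed=0)
```

The emulated vectors keep the same labeled structure as the solved cases, so they could be fed to a diagnosis system under test in place of real cases. Treating indicators as independent ignores any correlation between them, which a more faithful model would need to capture.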