A model for predicting court decisions on child custody

Awarding joint or sole custody is of crucial importance for the lives of both the child and the parents. This paper first models the factors explaining a court’s decision to grant child custody and later tests the predictive capacity of the proposed model. We conducted an empirical study using data from 1,884 court rulings, identifying and labeling factual elements, legal principles, and other relevant information. We developed a neural network model that includes eight factual findings, such as the relationship between the parents and their economic resources, the child’s opinion, and the psychological report on the type of custody. We performed a temporal validation using cases later in time than those in the training sample for prediction. Our system predicted the court’s decisions with an accuracy exceeding 85%. We obtained easy-to-apply decision rules with the decision tree technique. The paper contributes by identifying the factors that best predict joint custody, which is useful for parents, lawyers, and prosecutors. Parents would do well to know these findings before venturing into a courtroom.


Introduction
The most recent data in the US accounted for 2,015,603 marriages and 746,971 divorces annually [1], while in Europe, there were 1,950,935 marriages and 834,068 divorces [2]. After separation or divorce, joint physical custody is increasingly common in many Western societies. This is a parental care arrangement in which a child lives with each parent 25-50% of the time [3]. Joint custody is likely to be beneficial to children on average, which justifies recommending it [4], although the economic repercussions are not negligible and some parents fight for custody to avoid paying child support [5]. However, before engaging in expensive litigation, it would be good for parents to have an idea of how likely it is that they will win the lawsuit. Our paper aims to identify the factual elements that determine a court's decision to choose joint or sole custody, relate them to the legal principles applied in the judgments, and develop a predictive model capable of forecasting judicial decisions from a set of factual findings.
Our first research question aims to find the factors that explain court rulings on child custody. The best interest of the child principle stands out among the legal principles that influence a court's decision [6]. Family systems theory argues that a proper custody decision requires an evaluation of the entire family and its relationships [7]. Therefore, legal principles concerning parents, such as the principle of equality between parents and proportionality in responsibility, should be relevant [8]. It is therefore important to know the parents' relationship and attitude, the parents' readiness, including the economic resources of both parents, and their previous dedication to childcare. One fact that seems crucial is the parents' agreement on the type of custody, if any. Among other theories, therapeutic justice justifies support for joint or sole custody depending on what is in the best interest of the child. This theory focuses on the impact of the law on the psychological well-being of individuals, but without privileging therapeutic outcomes over due process or other constitutional and related values [9]. Thus, it is frequent that the child is asked for his or her opinion, while the child's circumstances and the child's background are also considered as factual findings that influence the decision on the type of custody [9]. Furthermore, the judge can rely on a psychological report [10]. This research question goes beyond identifying the factual findings and tries to understand the judge's reasoning and its relationship to the facts. To this end, we developed explanatory models using linear and logistic multivariate regressions.
Already in 1897, Oliver Wendell Holmes, the father of legal realism, claimed that law must be predictive, but one may still wonder to what extent justice is predictable in practice [11]. Our second research question aims to develop a forecasting model. There is extensive literature on legal judgment prediction, for example, on forecasting criminal sentencing decisions [12] and on the prediction of decisions made by the Supreme Court of the United States [13][14][15][16]. Explanatory factors for child custody decisions were also studied [10,17,18], but as far as we know, no predictive models have yet been developed to forecast child custody decisions. We applied logistic regression models as benchmarks and other data mining tools with high predictive capacity, such as neural networks. In this research question, we also aim to obtain decisional rules through decision trees. This technique can be very useful not only in predicting court sentences but also in explaining them [15].
We conducted an empirical study with 1,884 Spanish court rulings on child custody. Our models predict the court's decision with an accuracy exceeding 85%. In the case under analysis (second instance appeals), the justice system agrees with the petitioner only 17.8% of the time. Interestingly, the decision tree detected situations with both extremely high and low probabilities of winning at trial. The latter is a sure loss of money for the litigants and a time-consuming process for the overburdened court system. Widespread use of legal decision support systems would help minimize the asymmetry of information, which is so negative for the justice system [19]. Decision systems such as the one presented in the paper could help alleviate pressure on the justice system, as many parents would avoid going to court and opt for out-of-court settlements.
This study contributes to the literature in many ways. The factors that explain child custody decisions were studied by other authors [17,18,20], but we developed a predictive model, which was tested using temporal validation. This means that the test sample comes from a period after the training sample: this is the appropriate validation method to test predictive results over time [21]. It recalls a real-world situation in which a lawyer estimates the model from the most recent information available and tests the model with new cases. We provide performance measures such as accuracy, sensitivity, and specificity. We provide heuristics in the form of easy-to-apply decision rules, which is a significant practical tool for a lawyer to use to prepare their case for trial. Although court decisions have been analyzed using decision trees [15], no previous research applied decision trees to the study of child custody to the best of our knowledge. Finally, our paper is not limited to an empirical exercise but identifies the underlying legal principles, deepening the causal reasoning behind a judicial decision.

Legal judgment prediction
Decision-making support systems in the field of justice may take many forms, including computer-assisted legal research [22], expert systems that explain court decisions based on argumentation mining [23], systems for predicting crime [24] and recidivism of juvenile offenders [25], big data tools to help regulators pass appropriate laws by predicting their outcomes [26], and systems that predict judicial decisions [27]. Legal judgment prediction tries to identify factual elements that influenced past court decisions to correctly predict the decisions of new cases for a specified legal problem. As a research field, it has been active since the late 1950s [28], and today, it has great potential due to advances in the natural language processing of judgments [27] and data mining techniques applied to court decision forecasting [15].
The predictor variables depend on the chosen theoretical framework. Up to nine judicial decision-making theories were described [34], the most extreme of which are legalism and attitudinalism. Legalism is often regarded as the 'official theory' of judicial behavior: judges make decisions exclusively applying the so-called 'rule of law' with intellectual rigor. Hence, advocates of legalism use factual findings as explanatory variables, because the judge bases the decision on them [28]. At the other extreme, advocates of the attitudinal theory argue that judges' decisions are best explained by personal factors such as emotions, opinions, and political preferences. Some empirical studies incorporated the political preferences of the court and the pressures of interest groups as predictor variables [30]. Other studies derive from psychological theories and even used emotional arousal (measured by vocal pitch) to predict a jury's vote [13]. Katz et al. [14] used as many as 240 variables, including chronological variables, case history variables, justice-specific variables, and outcome variables. These last authors used data from the United States Supreme Court Database to forecast 240,000 justice votes over nearly two centuries (1816-2015). Accuracy rates ranged from 71.9% [14] to 75.0% [15,16].
As for the techniques used to predict judicial decisions, the first studies used logistic regression and other multivariate statistical techniques [29]. More recently, studies have used neural network models [35] and techniques based on decision trees [16], including AdaBoosted decision trees [15] and random forest classifiers [14]. Ruger et al. [16] forecasted Supreme Court decisions by comparing two methods: the first was a decision tree model that relied on general case characteristics, while the second was based on a set of predictions made by legal specialists. The statistical model outperformed the experts and was particularly good at forecasting economic activity cases, while the experts did comparatively better in the judicial power cases. It is common to compare the results of the most advanced techniques with those obtained with logistic regression [15] or to compare several techniques with each other [35].

Joint custody of a child
The Declaration of the Rights of the Child [36] marked the point at which children began to be seen as holders of rights independent of those of their parents, giving substance to the principle of the best interests of the child. This principle governs decisions on custody in cases of separation or divorce, providing the solution that is least traumatic for the child and ensuring that the best conditions for its development should always be sought [6]. However, the principle of the interest of the child is highly indeterminate, and its application depends on multiple factors. Until the 1960s, the 'tender years' doctrine predominated in jurisprudence, which considered that there was a biological superiority of the mother over the father, making her preferable for the custody of children [37]. Gradually, positions in favor of shared custody gained acceptance, arguing that it was the modality that, for the child, most resembled the situation before the separation. Joint custody is now a global trend whose antecedent is gender equality, which can be seen in women's labor participation and fathers' involvement in childcare [3]. Joint custody currently represents between 10% and 30% of arrangements in Western countries, reaching 35% in Sweden [38] and 38% in Spain [39].
The effects of joint custody on children and parents have been widely studied [40,41]. A recent meta-analysis of 60 studies found that joint custody had better outcomes on all measures in 34 studies, and equal outcomes on some measures and better outcomes on others in 14 studies, with very few cases having worse outcomes [41]. However, these studies have a methodological flaw: they do not come from a randomized control trial, because there will never be an instance of judges assigning the custody of children at random [4]. The expected advantages of shared custody are explained by bonding and monitoring theories [42]. The former argue that parents allow themselves to grow more attached to their children when they do not fear a complete break with them in case of divorce. Monitoring theories hold that the parents minimize the problem of agency costs, because they know how their financial contributions are spent and assume their commitments responsibly. The type of custody not only impacts the child and the parents but also affects society as a whole, because the effects of adopting joint custody may include changes in marriage rates, overall fertility, and divorce rates [5].

Table 1 describes the variables used in the present study. The dependent variable is the decision made by the court (DEC_JOINT). This is a dummy variable that takes the value 1 if joint custody was decided and 0 in the case of sole custody. What the plaintiff asked for can be considered another dummy dependent variable (RQ_JOINT). The third dependent variable is also a dummy variable that measures whether or not the plaintiff won the lawsuit (WINNER). The independent variables are the factual elements, which deal with the circumstances of the child, the parents, and their relationships [6,8,10,43].
Among the child's circumstances, the opinions and wishes expressed by the child are usually taken into account (CHILD_OPIN), as well as the psychophysical circumstances of the child (CHILD_PSY) and the child's adjustment or background (CHILD_ROOT). Concerning the parents, we considered their relationship and attitude towards their obligations (PAR_RELAT), their availability, including schedules and financial resources (PAR_RDNS), their dedication during cohabitation (PAR_DED), and the agreements and conventions between parents (PAR_AGREEM). Finally, we considered the content of the psychosocial report, if any (PSY_REP).

Sample and data
We considered four legal principles used to understand the judge's reasoning and its relationship to the aforementioned factual elements [6,8,10,43]. The best interest of the child (BEST_INT) stands out, which, depending on the circumstances of the case, can operate in favor of both sole and joint custody [6]. Other principles include the principle of equality between parents (PAR_EQL) and the principle of proportionality in the assumption of burdens (PROP_RESP), which concerns how each parent should bear the expenses of the children in proportion to their ability to do so [8]. Finally, the principle of res judicata (RES_JUD) was included, which the judge applies when the intention is to modify previously established conditions without a substantial change in circumstances.
The sex of the plaintiff was included as an independent variable (PLAINTIFF_SEX), because this variable was found to be relevant in child custody decisions [44]. Currently, in Spain, mothers obtain the majority of sole custody arrangements, obtaining child custody 58% of the time, while men obtain sole custody 4% of the time and shared custody occurs in the other 38% of cases [39]. The sex of the judge was also included (JUDGE_SEX), because statistically significant differences were found in other contexts [45]; however, we do not expect them to occur in our study. We also studied the role of territorial legislation. Spain is divided into seventeen autonomous communities, four of which have adopted their own civil legislation on custody. We created a dummy variable (FAVOR_JOINT) that assigns a 1 to rulings issued by courts in territories that established joint custody as preferential (Aragon, Basque Country and Catalonia) and a 0 to the rest of Spain. In Aragon, Law 2/2010 established joint custody as a preferential modality, but Law 6/2019 eliminated this preference and equated both modalities. In the Basque Country, Law 7/2015 established the preference for joint custody, which is still in force. The Civil Code of Catalonia (Law 25/2010) also established the preference for shared custody, although it was more ambiguous in its pronouncements.

Table 1. Variables used in the study.

Dependent variables
DEC_JOINT: Dummy variable indicating the type of physical custody decided by the court (1 = joint custody; 0 = sole custody).
RQ_JOINT: Dummy variable indicating the type of physical custody requested by the plaintiff (1 = joint custody; 0 = sole custody).
WINNER: Dummy variable indicating whether or not the plaintiff won the lawsuit.
The court sentences were taken from the Spanish Judicial Authority Documentation Center (CENDOJ), a body that compiles and disseminates the jurisprudence of the Supreme Court and other Spanish courts. This database is freely accessible to the public. We downloaded 1,884 child custody rulings from June 2016 to June 2020, of which 1,134 (60.2%) were sole custody and 750 were joint custody (39.8%). They are second instance appeal judgments. That is, after a divorce without an agreement, or when one of the parents requested a change in the pre-existing custody modality, the first verdict issued by a judge was not accepted by one of the parties and was appealed. Fig 1 shows a flow diagram of the divorce procedure and the content of the second instance sentence, which is common in both cases.

Method and procedure
The research team read and labeled the contents of each court sentence, identifying the factual elements and legal principles. Understanding legal language requires expertise in legal matters and two researchers (law graduates) were chosen to label each of the court sentences. This task can be subjective, and the two researchers independently labeled each of the court sentences to minimize bias. Although the criteria were previously agreed upon and the degree of coincidence was high, numerous disagreements arose in the identification of the factual elements and legal principles. Therefore, a third person, the leading researcher, solved the dubious cases. This ensured the quality of the process.
The labeling process was time-consuming and difficult. On average, each court sentence had 2,093 words. On average, we identified 11.46 factual findings and 1.98 legal principles in each sentence. It took about 24 minutes to label each court sentence. Fig 1 also shows the contents of a court sentence on child custody. A court sentence is made up of four parts: (1) the header contains the details of the court, the parties involved, and the professionals who represent and defend them; (2) the factual background explains the factual basis of the decision; (3) the legal grounds contains the legal argument; and (4) the verdict contains the court's decision. It is important to note that the court may mention some facts alleged by the parties in the factual background, but they may not be taken into account by the court, and hence they must not be labeled. That is, a fact is relevant to the analysis of the argument when it is mentioned in the legal grounds and used as part of the legal argument.
A court sentence may refer to the same factual element several times, sometimes with arguments in favor and sometimes with arguments against. Table 2 shows two examples of phrases for each of the factual elements and legal principles, one favoring joint custody and the other favoring sole custody. Therefore, when labeling each phrase, the meaning was taken into account: if it spoke in favor of joint custody, it was recorded with a positive value, while if it spoke in favor of sole custody, it was recorded with a negative value. Thus, the final value of each variable was obtained by adding up the number of occurrences in favor of joint custody and subtracting the number of occurrences in favor of sole custody. Therefore, the factual findings and legal principles in Table 1 are not dummy variables but quantitative variables.

Table 2. Examples of phrases identified in the court sentences for each of the factual elements and legal principles.

PSY_REP
In favor of joint custody: The report prepared by the psychosocial technical team concludes by advising the regime of shared custody. (El informe elaborado por el equipo técnico psicosocial concluye aconsejando el régimen de custodia compartida)
In favor of sole custody: The psychosocial team's report considers it necessary and timely for the mother to have custody of her daughters. (El informe del equipo psicosocial valora necesario y oportuno que la madre tenga la guarda y custodia de sus hijas)

CHILD_OPIN
In favor of joint custody: The son has expressed his desire to be able to be equally with both parents. (El hijo ha manifestado su deseo de poder estar por igual con ambos progenitores)
In favor of sole custody: The child does not wish to have contact with her father. (La menor no desea tener contacto con su padre)

CHILD_PSY
In favor of joint custody: The upcoming age of majority of the minor advises a last effort in order to relaunch and improve the parent-child relationship through the shared custody regime. (La cercana mayoría de edad de la menor aconseja un último esfuerzo en orden a relanzar y mejorar la relación paterno-filial a través del régimen de custodia compartida)
In favor of sole custody: In short, the child is happy and perfectly integrated socially, in their family, and academically, all of which advises at this time maintaining current custody in favor of the mother. (En definitiva, el niño se encuentra feliz y perfectamente integrado social, familiar y escolarmente, todo lo cual aconseja en este momento mantener la guarda y custodia actual en favor de la madre)

CHILD_ROOT
In favor of joint custody: Both minors have an affective bond with both parents and value contact, communication, and staying with both parents positively. (Ambos menores tienen vinculación afectiva con ambos progenitores y valoran positivamente el contacto, comunicación y estancia con ambos)
In favor of sole custody: The ties established with the mother are closer than those established with the father. (Son más estrechos los vínculos establecidos con la madre que los que tiene con su padre)

PAR_RELAT
In favor of joint custody: It is ruled out that there is a level of conflict between the parents that could be an insurmountable obstacle or inconvenience, so that said shared custody regime can function properly. (Se descarta que haya un nivel de conflictividad entre los padres que pueda ser un óbice o inconveniente insalvable para que dicho régimen de guarda y custodia compartida pueda funcionar adecuadamente)
In favor of sole custody: This poor relationship that exists between parents hinders the proper development of joint custody. (Esta mala relación que existe entre los progenitores dificulta el desarrollo adecuado de la guarda y custodia compartida)

PAR_RDNS
In favor of joint custody: The proximity of the domiciles of the litigants also favors the judicial regulation of the requested shared custody. (Igualmente propicia la regulación judicial de la pretendida custodia compartida la circunstancia de la proximidad de los domicilios de los litigantes)
In favor of sole custody: The professional activity of the father is incompatible with shared custody. (La actividad profesional del padre resulta incompatible con un régimen compartido de guarda y custodia)

PAR_DED
In favor of joint custody: Both parents, depending on each parent's work schedule in each time period, devoted time to the care and attention of their daughter. (Ambos progenitores, en función de los horarios de trabajo de cada uno en cada periodo temporal, dedicaron tiempo al cuidado y atención de su hija)
In favor of sole custody: The mother has been the main caregiver and currently has a job that allows her to take care of the girl. She seems to be the most suitable parent to assign custody of the child. (Habiendo sido la madre la cuidadora principal y contando en la actualidad con un empleo que le permite hacerse cargo de la niña parece el progenitor más adecuado para asignarle la guarda de la menor)

PROP_RESP
In favor of sole custody: The amount of maintenance must be proportional to the means at the disposal of the maintainer, taking into account the financial capacity of the noncustodial parent. (La cuantía de la pensión de alimentos debe ajustarse a criterios de proporcionalidad entre los medios con los que cuenta el alimentante y las necesidades del alimentista, para lo que se hace preciso tener en cuenta la capacidad económica del progenitor no custodio)

RES_JUD
In favor of joint custody: In short, the existence of a change in circumstances that justifies the alteration of the shared custody regime established in the previous judgment cannot be admitted. (En definitiva, no puede admitirse la existencia de un cambio de circunstancias que justifique la alteración del régimen de guarda y custodia compartida fijado en la sentencia anterior)
In favor of sole custody: Submission to consideration of the opportunity for a shared custody regime cannot be reconsidered unless there is a substantial change in circumstances. (No cabe replantear el sometimiento a consideración de la oportunidad de establecimiento de un régimen de custodia compartida, si no se da una alteración sustancial de las circunstancias)

The original Spanish is given in parentheses.
https://doi.org/10.1371/journal.pone.0258993.t002

Table 3 provides an overview of the summary statistics for each independent variable for the two groups of court sentences (joint and sole custody) according to the judge's decision (DEC_JOINT). The table reports the mean, standard deviation, and minimum and maximum values for both groups. An independent samples t-test was conducted to determine if the composition of the factual elements and legal principles differed for the two possible decisions of the court (joint and sole). As expected, a court sentence that decides on joint custody includes on average more phrases that refer to factual elements that favor joint custody than phrases that favor sole custody. The differences are statistically significant for all factual elements and legal principles. Table 4 is similar to the previous table but uses the request for joint custody and sole custody as a grouping variable (RQ_JOINT). It should be remembered that these are second instance appeal proceedings; therefore, they were already subject to a judgment, and the plaintiff is asking for a new trial. The results show a negative relationship between the petition and the factual elements. That means that petitions for joint custody include on average more labels referring to factual elements that favor sole custody than labels favoring joint custody. Again, the differences are statistically significant for all variables.

Table 5 presents contingency tables that show the relationships between requests, court decisions, the sex of the plaintiff and the sex of the judge. Contingency tables were analyzed statistically using Pearson's Chi-square test statistics. The association was estimated with odds ratios (OR) and their respective 95% confidence intervals. Panel A in Table 5 presents the relationship between what was demanded and success in winning or losing the trial. Plaintiffs only win an average of 17.8% of child custody trials. Those who request joint custody are 1.74 times more likely to win than those who request sole custody. The results are statistically significant. Panel B in Table 5 presents the relationship between the sex of the appellant and the claims. Most men request joint custody, while most women request sole custody. Female plaintiffs are 29.35 times more likely to request sole custody than joint custody. The differences are very large and statistically significant. In the first instance in Spanish courts, the most frequent rulings are those that grant sole custody to the mother, which partly explains why it is men who appeal in the second instance, and why they are asking for joint physical custody. Panel C in Table 5 presents the relationship between the sex of the appellant and winning or losing the trial. Male plaintiffs are 1.5 times more likely to win than female plaintiffs. The differences are statistically significant. It is, therefore, appropriate to differentiate between the requests made by men and women, so we split the sample into two subsamples, according to the sex of the applicant.
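The net scoring rule used during labeling and the group comparisons reported in Tables 3 and 4 can be illustrated with a minimal sketch. The labels and scores below are hypothetical, the function names are ours, and Welch's variant of the t statistic is an assumption, since the text only states that an independent samples t-test was used.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def net_scores(labels):
    """Net score per variable: +1 for each phrase favoring joint custody,
    -1 for each phrase favoring sole custody."""
    scores = Counter()
    for variable, favors in labels:
        scores[variable] += 1 if favors == "joint" else -1
    return dict(scores)


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical labels extracted from the legal grounds of one ruling.
ruling_labels = [
    ("PAR_RELAT", "joint"),
    ("PAR_RELAT", "sole"),
    ("PAR_RELAT", "joint"),
    ("CHILD_OPIN", "sole"),
]
scores = net_scores(ruling_labels)

# Hypothetical PAR_RELAT net scores for rulings that ended in joint
# custody versus rulings that ended in sole custody.
joint_group = [2, 1, 3, 0, 2]
sole_group = [-1, 0, -2, 1, -1]
t_stat = welch_t(joint_group, sole_group)
print(scores, round(t_stat, 2))
```

A positive statistic here means that rulings granting joint custody carry, on average, higher net scores for the variable, which is the pattern the text describes for Table 3.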
Panel D in Table 5 presents the relationship between petitions and winning or losing the trial when the plaintiff is a man. The most common situation is for a man to ask for shared custody, which accounted for 929 cases out of the full sample of 1,884 court sentences (49.3%). It is unusual for a man to ask for sole custody; such requests accounted for only 10.7% of the full sample. Males requesting joint custody are 2.19 times more likely to win than those requesting sole custody. The differences are statistically significant. Panel E in Table 5 presents the relationship between petitions and winning or losing the trial when the plaintiff is a woman. In this case, there is no significant relationship. Panel F in Table 5 presents the relationship between the sex of the judge and court decisions. There were no statistically significant differences between court decisions made by judges of different sexes.

Relationships between requests, court decisions, sex of the plaintiff and sex of the judge
Panel G relates the courts' decision (DEC_JOINT) to the existence or not of territorial legislation in favor of joint custody (FAVOR_JOINT). There are no statistically significant differences. Panel H relates the plaintiff's winning the case (WINNER) to the existence or not of territorial legislation in favor of joint custody (FAVOR_JOINT). Plaintiffs from territories whose legislation favors joint custody are 1.54 times more likely to win the trial than those from territories whose legislation does not favor joint custody. The differences are statistically significant. The rationale is that establishing by law a preferential modality increases legal certainty, hence the chances for success are low in second appeals.
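The contingency-table statistics used in Table 5 (odds ratio, 95% confidence interval, and Pearson's Chi-square) can be sketched for a single 2x2 panel as follows. The counts are hypothetical and do not reproduce any panel of Table 5; the Wald interval on the log odds ratio and the absence of a continuity correction are assumptions, as the text does not specify them.

```python
from math import erfc, exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for the 2x2 table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def pearson_chi2(a, b, c, d):
    """Pearson's Chi-square (no continuity correction) and its p-value.

    With one degree of freedom the statistic is a squared standard normal,
    so the p-value equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical counts. Rows: joint custody requested, sole custody requested.
# Columns: plaintiff won, plaintiff lost.
won_joint, lost_joint = 40, 160
won_sole, lost_sole = 25, 175

odds_ratio, ci_low, ci_high = odds_ratio_ci(won_joint, lost_joint, won_sole, lost_sole)
chi2, p_value = pearson_chi2(won_joint, lost_joint, won_sole, lost_sole)
print(f"OR = {odds_ratio:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], "
      f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level. Note that an odds ratio compares odds, not probabilities, so "1.74 times more likely" in the text is a statement about odds.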

Relationships between legal principles and factual findings
In the following, we seek to relate the factual findings to legal principles. Legal reasoning follows several steps that can be simplified as follows: identify the issue and the applicable law; analyze and synthesize the legal rules and principles governing the issue; investigate the relevant facts and apply the rule to the facts to obtain the outcome [46]. Legal argumentation encompasses the justification of legal decisions, that is, how conclusions can be reached through logical reasoning. The doctrine of stare decisis states that cases that have similar facts should receive similar decisions; thus, legal practitioners need to identify such facts and principles in precedent cases, which is a time-consuming task [47]. Table 6 relates factual elements to legal principles through multivariate linear regressions.
The assumptions of no multicollinearity [48], linearity, no autocorrelation [49], and homoscedasticity [50] were all checked and found to be within acceptable thresholds. While there was some deviation from normality, the sample size is large enough to consider the deviation not to have a serious effect on the results [51][52][53].
The first model uses PSY_REP as a dependent variable and the four legal principles as independent variables. A significant regression equation was found (F(4,1879) = 14.390, p < 0.001), but it had a low adjusted R² of 0.028. The other models also found significant regression equations, with similar values of goodness-of-fit, ranging from 0.010 to 0.158. The only principle with significant coefficients in all eight models is the best interest of the child (BEST_INT). Let us focus on model 6. When the parents' readiness (PAR_RDNS) factual finding arises, the most frequent argumentation refers to the proportionality in the responsibilities principle (PROP_RESP) and the best interest principle (BEST_INT). Given a fact related to parents' readiness (i.e., financial resources of both parents, their previous dedication to childcare, the proximity of the domiciles of the litigants, or their professional activity) the judgment refers particularly to the proportionality in the responsibilities principle in addition to the omnipresent best interest of the child principle. Both legal principles were significant determinants of parents' readiness, but the beta coefficient for the proportionality in the responsibilities principle (1.376) was notably larger than that for the best interests of the child principle (0.413). For example, if the litigants live nearby, it seems quite reasonable for the party seeking shared custody to argue that this factual finding (on parents' readiness) favors joint custody, alluding to the principle of shared responsibility, always in the best interests of the child. Similar associations can be identified in the remaining facts and legal principles. The parents' equality principle (PAR_EQL) exhibits a significant relationship with the child's psychological circumstances (CHILD_PSY), and the res judicata principle (RES_JUD) has a significant relationship with the parents' agreements (PAR_AGREEM).
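As a rough consistency check on the first model, the reported F statistic can be related to the adjusted R² with the standard identities for an ordinary least squares regression with an intercept. This is a sketch under that assumption; the function names are ours, and because the adjusted R² is reported to three decimals, only approximate agreement with F = 14.390 should be expected.

```python
def r2_from_adjusted(adjusted_r2, n_obs, n_predictors):
    """Invert adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - adjusted_r2) * (n_obs - n_predictors - 1) / (n_obs - 1)


def f_statistic(r2, n_obs, n_predictors):
    """Overall F = (R² / k) / ((1 - R²) / (n - k - 1))."""
    residual_df = n_obs - n_predictors - 1
    return (r2 / n_predictors) / ((1 - r2) / residual_df)


n_obs = 1884        # court rulings in the sample
n_predictors = 4    # the four legal principles
residual_df = n_obs - n_predictors - 1  # matches the 1879 in F(4,1879)

r2 = r2_from_adjusted(0.028, n_obs, n_predictors)
f_value = f_statistic(r2, n_obs, n_predictors)
print(residual_df, round(r2, 4), round(f_value, 2))
```

The residual degrees of freedom reproduce the 1,879 reported in the text, and the implied F value falls between 14 and 15, in line with the reported statistic once rounding of the adjusted R² is taken into account.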

Predicting court decisions
Our second research question was about predicting court decisions. Table 7 shows the results of a logistic regression whose dependent variable is the court decision on custody type (DEC_JOINT). When developing a predictive model, it is necessary to validate it, that is, to evaluate its performance by testing how well the model predicts the dependent variable for new cases [54,55]. We applied external validation in which forecasting capacity is tested on a sample from a later time period (temporal validation). The training sample includes 942 court rulings from June 2016 to May 2019, and the test sample includes 942 court rulings from May 2019 to June 2020. We designed the research as a real-world study in which a law firm wants to develop a predictive model, taking May 2019 as a starting point. The team of lawyers would take the information available on that date, develop the model, and make predictions as new rulings appear, while being able to calculate various performance measures for the model. We used accuracy, true positive rate (sensitivity), true negative rate (specificity), and the area under the receiver operating characteristic curve (AUC) as performance measures. The first eight models are univariate logistic regressions that use each of the factual elements as an explanatory variable. Model 9 uses the sex of the plaintiff as an explanatory variable. The accuracy of the univariate models ranges from 58.8% to 75.9%. The independent variable that shows the greatest predictive ability is the relationship between parents (PAR_RELAT). Although the predictive power of this variable is remarkable, its percentages of type I and type II errors are slightly unbalanced. Model 10 is a full model with all nine independent variables and has an accuracy rate of 83.3%. The percentages of type I and type II errors are fairly balanced. Table 7 also shows the results of training two neural network models.
Model 11 is a multilayer perceptron with a hidden layer and sigmoid activation functions. Model 12 is a radial basis function network. The accuracy increased to 86.4% and 84.6%, respectively. The percentages of type I and type II errors are fully balanced. The normalized importance of each variable is shown for both neural network models, and the results coincide with those obtained in the logistic regression. Table 5 showed that the probability of losing the trial was much higher than that of winning it. This encourages research into decision rules that may be valuable, which was also the subject of the second research question. We used the Chi-squared Automatic Interaction Detection (CHAID) decision tree to obtain rules [56]. Table 8 summarizes the results. Again, we divided the sample into training data for the development of the model (n = 942) and testing data for the validation of the model (n = 942). Node 1 of Table 8 shows that 60% of the cases are men who go to trial, and the test sample indicates that they have a probability of winning of 19.6%. Specifically, there are 108 winning cases out of 942, which is 11.5% of the test sample. The probability of winning rises to 21.5% if the man requests shared custody (Node 2). Node 3 provides a winning strategy, as it represents men who request joint custody and have an excellent relationship with their partner (PAR_RELAT); the rulings refer to this positive relationship more than 4 times on average. In that case, the probability of winning rises to 89.1%. These characteristics apply to 4.9% of cases. The probability of winning increases further to 91.3% if the psychophysical circumstances of the child (CHILD_PSY) favor shared custody (Node 4). On the contrary, a man who applies for joint custody and has a bad relationship accompanied by child circumstances that are unfavorable for joint custody has a 99.4% chance of losing at trial (Node 5).
This is a frequent scenario, with 349 cases (18.5% of the sample), and only on 2 occasions did the judge grant joint custody. Women who file for joint custody have a small (16.9%) chance of getting it (Node 10), which increases to 80% if the parental relationship is good and the child's circumstances recommend it, although this only happens 0.9% of the time. If a woman requests sole custody, she will usually lose her case, with a probability of 86.7% (Node 13). This situation accounts for 34.6% of all trials.
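The temporal validation and the four performance measures reported in Table 7 (accuracy, sensitivity, specificity, and AUC) can be sketched as follows. This is a minimal illustration on toy data, not the pipeline used in the study: the record layout, the function names, and the exact cutoff day are assumptions (the text only says the split falls in May 2019), and the scores are made up.

```python
from datetime import date


def temporal_split(cases, cutoff):
    """Train on rulings dated before the cutoff; test on rulings from the cutoff on."""
    train = [c for c in cases if c["date"] < cutoff]
    test = [c for c in cases if c["date"] >= cutoff]
    return train, test


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 1/2)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy rulings: 1 = joint custody granted, 0 = not granted (hypothetical data).
cases = [
    {"date": date(2017, 3, 1), "joint": 1},
    {"date": date(2018, 11, 20), "joint": 0},
    {"date": date(2019, 9, 5), "joint": 1},
    {"date": date(2020, 2, 14), "joint": 0},
]
CUTOFF = date(2019, 5, 15)  # placeholder day; the source gives only the month
train, test = temporal_split(cases, CUTOFF)

# Made-up model scores for six test rulings, thresholded at 0.5.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Splitting by date rather than at random matters here because it mimics the law firm scenario described above: the model never sees rulings issued after the date on which it is built.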

Discussion and conclusion
This paper shows that it is possible to explain and predict court decisions on child custody from a set of factual findings, with a success rate of over 85%. The first research question studied the factors that explain the decision to grant sole or joint custody. Our empirical study was conducted with second instance appeal judgments. We found that all factual elements hypothesized in our model are considered by the judge, whether they are related to the child, the parents, or the psychological report. The relationship and attitudes of the parents and the psychophysical circumstances of the child are the two factual elements that the judge takes most into account when deciding the type of custody. This is very much in line with family systems theory [7]. The best interest of the child is the only principle with significant coefficients in all regression models. This principle stands out among the legal principles that justify the decision [44], which is supported by the theory of therapeutic justice and most of the explanatory theories of court decisions. In the cases analyzed, the sex of the applicant is very important, which is in line with previous studies [44]. Women win 14.3% of the trials, while men win 20.1%, and the differences are statistically significant. This can be explained by the fact that 82.1% of men request joint custody, while 86.5% of women request sole custody. It must also be taken into account that the courts grant the appellant sole custody 13.5% of the time and joint custody 21.4% of the time. The rulings of the Spanish Supreme Court have a significant influence on the decisions of the second instance courts, and their tendency is in favor of joint physical custody. This may explain why second instance court rulings are increasingly favorable to joint physical custody.
As expected, we found no statistically significant differences by the sex of the judge when granting sole or shared custody, which speaks for the neutrality of Spanish judges. The second research question addresses whether court decisions can be predicted with a reasonable success rate. We used temporal validation; that is, the test samples come from data collected after the training data, as in a real-world application. We developed the models using logistic regression and neural networks, achieving satisfactory performance measures. Our study highlights that justice is predictable in the case of child custody, which is a contribution because previous papers predicting judicial decisions addressed other legal concerns [13][14][15][16][30]. Holmes [11] claimed that for the rational study of the law, the man of the future must be the man of statistics, and law must be predictive. We can conclude from our analysis that it is possible to forecast court decisions in the context of child custody from a modest set of factual elements.

Practical implications
The paper provides practical implications affecting parents, lawyers, and even the judicial system. We found statistically significant differences when taking the factual elements as independent variables and the requests (joint custody vs sole custody), rather than the decisions, as the dependent variable. Interestingly, however, the facts exert their influence in the opposite direction: that is, those who complain the most have the least reason to complain. This is because the courts rule in favor of the plaintiff in second instance appeal judgments on child custody only 17.8% of the time. Perhaps many parents decide to go to trial without being aware of their chances of success, taking a leap in the dark. We obtained useful rules for decision making using the CHAID decision tree technique. Some of the rules allow the recognition of judicial patterns with an accuracy above 90%. For example, if a couple has a very bad relationship, it is almost impossible for the judge to award joint custody (less than a 0.5% chance of winning). However, this is a very common case (about a quarter of the cases), and lawyers would probably do well to advise their clients to avoid a trial under these circumstances. The practical implications of these rules are clear, as they allow the preparation and filing of a lawsuit with information on the probability of success or failure. Many parents who go to court would be willing to negotiate if they knew that their chance of success is low according to the rules obtained in our empirical study. This is very much in line with the therapeutic justice process, which aims at families resolving their own disputes and encourages the use of mediators [57]. The use of legal decision systems would make it possible to know the probabilities of success, contributing to reducing the asymmetry of information in the legal domain, which has pernicious effects [9].
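To illustrate how such rules might be consulted before filing, the sketch below encodes the node probabilities quoted in the text for Table 8 as a simple lookup. Only the percentages come from the paper; the function name, the argument labels ("good", "bad", "favorable", "unfavorable"), and the fallback to the closest reported parent node are assumptions made for illustration, and the full tree contains nodes that the text does not describe.

```python
from typing import Optional


def win_probability(sex: str,
                    request: Optional[str] = None,
                    relationship: Optional[str] = None,
                    child: Optional[str] = None) -> Optional[float]:
    """Appellant's chance of winning per the nodes quoted in the text.

    Returns None when the text reports no node for the combination.
    Falls back to the closest reported parent node when details are missing.
    """
    if sex == "man":
        if request is None:
            return 0.196  # Node 1: men who go to trial
        if request == "joint":
            if relationship == "good":
                # Node 4 if the child's circumstances favor joint custody, else Node 3
                return 0.913 if child == "favorable" else 0.891
            if relationship == "bad" and child == "unfavorable":
                return 0.006  # Node 5: 99.4% chance of losing
            return 0.215  # Node 2: men requesting joint custody
        return None  # no node quoted for men requesting sole custody
    if sex == "woman":
        if request == "joint":
            if relationship == "good" and child == "favorable":
                return 0.80  # refinement of Node 10 quoted in the text
            return 0.169  # Node 10: women requesting joint custody
        if request == "sole":
            return 0.133  # Node 13: 86.7% chance of losing
    return None


# A father seeking joint custody despite a bad relationship and unfavorable
# child circumstances would see a very low estimated chance of success.
estimate = win_probability("man", "joint", "bad", "unfavorable")
print(f"Estimated chance of winning: {estimate:.1%}")
```

A lookup of this kind is deliberately conservative: when the text reports nothing for a combination it returns no estimate rather than extrapolating.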
If the use of these predictive models becomes widespread, there may even be effects on the judicial system. It is possible that the number of trials would decrease, and the saturation and slowness of the judicial system would be alleviated. It could also affect the work of lawyers and the way they approach a trial. The lawyer's experience is a determining factor when it comes to winning lawsuits [58]. Studies such as the one presented here can supplement experience by providing insight into the factors that judges take into account in their decisions. It could also be the case that lawyers who want to improve their record by winning a high percentage of cases would accept only those that are a priori more likely to be won.

Limitations and future research
Reading and labeling 1,884 court rulings is laborious and involves a certain amount of subjectivity, which is a limitation of the work. Much progress has been made in argumentation mining in the legal domain through the application of natural language processing [59]. We made several attempts to automate labeling, using qualitative text analysis software and training a neural network model for argument mining. The results were promising, and there was a high degree of agreement with the human experts, but manual labeling was chosen because, given the objective of the study, no level of discrepancy was acceptable. Automated labeling is proposed as a future line of research because it would facilitate extending this type of study to other decisions in the legal domain. We identified a set of factual elements that explain court decisions, but other factual findings or even external variables, such as the experience of lawyers, can be relevant as factors of success in trials [58]. Because we analyzed only second instance court decisions, extending the study to first instance court decisions would increase its validity. Another limitation is that the results are valid for cases in Spain; the results could be generalized to other countries, but only those with similar development indices. To overcome these limitations, it would be necessary to extend the study to other contexts, which we propose as a future line of research.