Network Identification of Hormonal Regulation

Relations among hormone serum concentrations are complex and depend on various factors, including gender, age, body mass index, diurnal rhythms and secretion stochastics. Therefore, endocrine deviations from healthy homeostasis are not easily detected or understood. A generic method is presented for detecting regulatory relations between hormones. This is demonstrated with a cohort of obese women, who underwent blood sampling at 10 minute intervals for 24-hours. The cohort was treated with bromocriptine in an attempt to clarify how hormone relations change by treatment. The detected regulatory relations are summarized in a network graph and treatment-induced changes in the relations are determined. The proposed method identifies many relations, including well-known ones. Ultimately, the method provides ways to improve the description and understanding of normal hormonal relations and deviations caused by disease or treatment.


S1.2 Pulse estimation
The serum concentration profiles of most hormones display episodic secretion that can be estimated. The model used to estimate the secretion events is described by the following equation [4].
x i,j,k,o = x i,j,(k−1),o e −κi,j,o + φ i,j,k,o + u i,j,k,o (2) In this equation, x i,j,k,o is the hormone concentration as described, e −κ is the (estimated) decay rate, φ i,j,k,o is the (estimated) secretion, and u i,j,k,o is the residual (error) term. The estimated secretion is the basis on which further analysis is performed and the networks were inferred.

S1.3 Metrics
The metrics use are the ordinary Pearson product moment correlation and the lagged Pearson product moment correlation. The former is commonly referred to as correlation while the latter is better known as the cross correlation. As correlation statistics are skewed Gaussian the Fisher-Z transform, the area hyperbolic tangent (atanh), is used. After this transform the average can be calculated and the result is back transformed to ordinary correlation space by taking the inverse of the function, the hyperbolic tangent, which then becomes the aggregated cohort statistic. The average Fisher-Z transformed correlation and the standard error of the mean is used to calculate t-values. See, Equations 1-9.

S1.4 Static network metrics
The static metrics are based on the Pearson product moment correlation and are calculated on the estimated secretion profiles. The correlation estimates are Fisher-Z transposed to improve the asymptotic consistency of the estimates.

S1.4.1 Partialization
For the partial correlation the variation of all other variables is regressed out of the φ i,j,k,o and φ i,j ,k,o pair and the obtained residuals are correlated per individual [5]. No other changes to the algorithm are made.

S1.5 Optima selection
The optima are defined to be extremes in the averaged cross-correlation. To get some understanding of the variability in extremes a resampling scheme was formulated that performed a leave two (individuals) out resampling. The resampling is performed by randomly selecting n − 2 out of the n subjects, after which the optima calculation is repeated on that subset. This chain of actions is repeated a 1000 times. The modes of mic and mac are the robust optima, the selected optima are referred to by e. The lags at which the minima and the maxima cross correlation was found are tabulated, and the 'optimal' τ was defined to be the mode of the entire set of τ . Thec j,j ,o,τ optima were defined in Equation 4 and 5.
The optima are defined as the values of the averaged cross-correlation.

S1.6 Dynamic network metrics
The cross-correlation analysis is applied to individual subjects. The application of the Fisher Z transformed cross-correlation function [6] is shown in Equation 6, in which i is a particular subject,j and j are hormones (j = j ), τ the lag, N the number of time points (N = 145), o a particular measurement occasion, and w the total number of lags to consider. In the analysis the upper value of w was set to 24, which is equivalent to 4 hours (6 samples/hour).
The aggregation of the individual cross-correlation values is performed as shown in Equation 7, in which: n is the number of subjects (n = 18). Equations 8 and 9 show how the standard deviation and t-value are calculated. Since the optima are signed, that is, we search for the highest or lowest values, all tests using this data are single sided.
S1.6.1 False discovery rate control A false discovery rate (FDR) correction is used to control the number of false positives. Controlling the FDR at 5% means that on average 5% of the significant features are truly null (that is, not significant). The Benjamini and Hochsberg procedure [7] was used to control this rate. The default significance level was set at α = 0.05. Since the dynamic network is based on optimized values, the optima, a convervative level of α = 0.01 was used for this inference. For the treatment effect the tendency (0.05 < α < 0.10) of an effect is also reported.

S1.7 Directionality
A nonzero optimal lag estimated from the aggregated cross-correlations implies directionality, which, as effects are believed to emerge after the cause, could imply regulation. The mathematics of these conceptual chains are rooted in theory developed by Fisher [8], Granger [9], Pearl [10,11] and Dawid [12,13]. Because the hormone concentrations cannot be assumed to be autoregressive (AR), the standard tests for Granger causality cannot be applied, and a more generic test is used instead. The null hypothesis is that there is no direction, that is, the correlation values at τ = 0 are the same as the correlation values at optimal τ . Directionality is said to exist when the optimal lag is with higher correlation values than τ = 0.
The directionality statistics are provided in Table 1. Note that a positive lag indicates that the second variable follows the first with some lag, for negative lags the ordering is reversed.

S1.8 Treatment effect
The optimum lag was estimated in the before treatment situation. The treatment effect is defined as a difference between the correlation values between before and after treatment. In Equation 11, before/after is indicated by 1, 2, e is the optimum, the null hypothesis (Equation 11) is evaluated using a paired t-test.

S1.9 Software
The calculations were performed in Mathworks MATLAB 2011b, except for the calculation of the FDR q-values which was performed in R 2.13.1 using the stats package.  Figure S1: The after treatment relations between ACTH, TSH, prolactin, and cortisol are shown here in more detail. ACTH-cortisol is clearly connected. The TSH-prolactin relation is more modest but still significant. The relation ACTH-TSH is significant, but shows a wide optimum, interestingly Cortisol-TSH shows a similarly wide optimum which is shifted to the right. ACTH-Prolactin shows a narrow optimum while cortisol-prolactin is again wider and shifted to the right.  Figure S2: The lean control relations between ACTH, TSH, prolactin, and cortisol are shown here in more detail. ACTH-cortisol is clearly connected. The TSH-prolactin relation is more modest but still significant. The relation ACTH-TSH is significant, but shows a wide optimum, interestingly Cortisol-TSH shows a similarly wide optimum which is shifted to the right. ACTH-Prolactin shows a narrow optimum while cortisol-prolactin is again wider and shifted to the right.