A time-series analysis of blood-based biomarkers within a 25-year longitudinal dolphin cohort

Causal interactions and correlations between clinically-relevant biomarkers are important to understand, both for informing potential medical interventions as well as predicting the likely health trajectory of any individual as they age. These interactions and correlations can be hard to establish in humans, due to the difficulties of routine sampling and controlling for individual differences (e.g., diet, socio-economic status, medication). Because bottlenose dolphins are long-lived mammals that exhibit several age-related phenomena similar to humans, we analyzed data from a well controlled 25-year longitudinal cohort of 144 dolphins. The data from this study has been reported on earlier, and consists of 44 clinically relevant biomarkers. This time-series data exhibits three starkly different influences: (A) directed interactions between biomarkers, (B) sources of biological variation that can either correlate or decorrelate different biomarkers, and (C) random observation-noise which combines measurement error and very rapid fluctuations in the dolphin’s biomarkers. Importantly, the sources of biological variation (type-B) are large in magnitude, often comparable to the observation errors (type-C) and larger than the effect of the directed interactions (type-A). Attempting to recover the type-A interactions without accounting for the type-B and type-C variation can result in an abundance of false-positives and false-negatives. Using a generalized regression which fits the longitudinal data with a linear model accounting for all three influences, we demonstrate that the dolphins exhibit many significant directed interactions (type-A), as well as strong correlated variation (type-B), between several pairs of biomarkers. Moreover, many of these interactions are associated with advanced age, suggesting that these interactions can be monitored and/or targeted to predict and potentially affect aging.

Introduction A better understanding of the biology of aging can help us discover strategies to stay healthier longer [1,2]. Studying aging directly in humans has many advantages, and can help pinpoint some of the mechanisms responsible for predicting and controlling aging rates [3][4][5][6][7]. Unfortunately, human studies have several disadvantages, including difficulties acquiring regular samples, as well as controlling for the differences between individuals (e.g., diet, socioeconomic status and medication) [6,8,9]. Many of these limitations can be overcome by focusing on short-lived animals such as worms, flies and mice [10,11], but it is important to complement this work with studies of animals that have lifespans similar to humans (i.e., with evolutionary adaptations that allow them to live 50 or more years long).
In this manuscript we analyze data taken from a longitudinal cohort of 144 US Navy bottlenose-dolphins. These dolphins were cared for over three generations and were routinely sampled to asess the trajectory of 44 clinically relevant biomarkers over their lifespan, as described in [20,31,32]. The diet and environment of these dolphins was carefully controlled, resulting in the dolphins living an average of 32.5 years (c.f. wild dolphins live an average of 20 years) [33,34]. The uniformity of environment, diet and health-care within this cohort implies that age-related differences between dolphins might be due to inherent or genetic factors influencing senescence [35].
This data-set was used previously to demonstrate differences in aging-rates between individual dolphins [35]. Motivated by these results, we further analyze the trajectories of each of the biomarkers for each of the dolphins within this data-set, searching for hints of any causal interactions between the measured biomarkers. In this analysis it is crucial to account for the shared fluctuations (termed 'shared variation') between biomarkers, as these shared fluctuations typically dwarf the more subtle causal interactions we are trying to find. To account for this shared variation, we model the longitudinal data as a linear stochastic-differential-equation (SDE), extracting model parameters as indicators of which biomarkers might affect one another. After doing so, we are able to identify interactions that are associated with aging. These interactions may play a role in the biology of aging, and are prime candidates for further investigation.

Motivation for the model
Obviously, the body is a very complicated dynamical system [36]. Any set of biomarkers can only ever tell part of the story, and there will always be an abundance of unmeasured factors that play an important role in the trajectory of all the measured biomarkers. Given any particular subset of (observed) biomarkers, we typically expect the changes in those biomarkers to come from three different categories.

type-A: Directed interactions. Any particular biomarker may play a direct role in influencing
the dynamics of other biomarkers. For example, one biomarker might 'excite' or 'inhibit' another, with an increase in the first promoting an increase (or, respectively, decrease) in the second.
type-B: Shared biological variation. As the system evolves, we expect a large number of unmeasured biomarkers to affect each of the measured biomarkers. Generally, this multitude of unmeasured effects will accumulate, giving rise to a combined effect that is essentially unpredictable given the measurements at hand. Collectively, the influence of these unmeasured effects can seem random.
type-C: Observation-noise. To complicate things further, our measurements of each observed biomarker may not be perfectly accurate. Typically there are additional sources of 'observation-noise' which introduce random variation to our measurements. This type-C variation can include both measurement errors as well as true biological fluctuations that are much more rapid than the typical time-interval of the experimental measurements themselves (e.g,. hourly variations based on the individual's activity).
To understand how large a role each category plays, we can look at the changes in each of the biomarkers within the dolphin data-set. One example for the biomarker 'Alkaline Phosphatase' (AlkPhos) is shown in Fig 1. This figure illustrates the distribution of measured variable-increments (between one time-point and the next) for this biomarker. The large variation in these increments indicates that the type-B and type-C effects play a large role. Furthermore, there is a striking linear relationship between (i) the magnitude of the measured time-increment and (ii) the variance in the measured variable-increments. This linear relationship indicates that a significant component of the type-B variation is similar to the Brownian processes commonly seen in high-dimensional dynamical systems [37]. Such a Brownian process-also called a 'stochastic drive'-can provide a source of shared variation which correlates different biomarkers. Finally, we remark that the variation in variable-increments is still significant even when the measured time-increment is 0. In other words, multiple repeated measurements of the same dolphin may not always return the same value. This indicates that there is a significant source of type-C variation in our measurements.
The qualitative aspects of Fig 1 are typical for this dataset; Figs 2 and 3 illustrate the distribution of variable-increments for two other measured biomarkers. Note that, once again, we see that the variation in variable-increments is large, and that there is evidence for both type-B and type-C variation. For some biomarkers we expect the type-B variation to be larger than the type-C variation, whereas for other biomarkers we expect the type-C variation to dominate (e.g., compare Figs 2 and 3).
In the next section we'll introduce our model for the dynamics. Within this model we'll model the type-A directed interactions as simple linear interactions (see Eq (1) below). Motivated by the observations above, our model will also include terms to account for type-B and type-C variations. we'll model the type-B variations as an (anisotropic) Brownian-process (see Eq (1) below), and we'll model the observation-noise as an (anisotropic) Gaussian with timeindependent variance [38] (see Eq (2) below).

Model structure
Given the motivation above, we can model the evolution of any d specific variables over the time-interval [t, t 0 ] using a simple linear stochastic-differential-equation (SDE) of the following Here we illustrate the distribution of increments associated with Alkaline Phosphatase (AlkPhos). On the left we show a scatterplot of these increments. Each small black point in the background represents a pair of time-adjacent measurements in a single dolphin, with the square-root of the timeincrement (in years) along the horizontal, and the change in AlkPhos along the vertical. The time-increments are then divided into 12 bins, and the mean (white dot) plus and minus one standard-deviation (yellow dots) are displayed in the foreground. On the right we show the same data, with the 12 time-bins shown as vertical strips. Each vertical strip is further divided into 18 boxes, which are colored by the number of increments within that time-bin which fall into that box (i.e., a histogram for each time-bin, scaled to have maximum 1, colored as shown in the colorbar to the far right). Note that the distribution of increments for each time-bin becomes wider as the time-increment increases. Importantly, the standard-deviation increases roughly linearly with the square-root of the time-increment (i.e., the variance increases roughly linearly with the time-increment), as expected from a Brownian-process. Note also that the variance is nonzero even when the time-increment is 0, implying that there must be an extra source of variation in addition to this Brownian-process.
https://doi.org/10.1371/journal.pcbi.1010890.g001 form: where XðtÞ 2 R d represents the d-dimensional vector-valued solution-trajectory at the initial time t, the time-increment dt = t 0 − t represents the difference between the initial time t and the final time t 0 , and dXðtÞ ¼ Xðt 0 Þ À XðtÞ 2 R d represents the vector of variable-increments between times t and t 0 . We further assume that we do not measure X(t) directly, but rather some Y(t) which depends on X(t) and which also incorporates the type-C observation-noise: In Eqs (1) and (2) there are several terms on the right hand side which contribute to the observed dynamics.

Baseline velocity:
The vector a 2 R d corresponds to a constant 'velocity' for each of the variables. This represents the rates at which each variable would increase, were it not for the other influences within the system.

PLOS COMPUTATIONAL BIOLOGY
Directed interactions: The matrix A 2 R d�d represents the type-A interactions between variables. These directed interactions are modeled as linear effects; the value of a 'source' variable v 0 will influence the rate at which the 'target' variable v increases at time t via the factor A vv 0 X v 0 (t). Each of these type-A interactions can be thought of as part of a pathway or larger network describing how the variables influence one another. These directed interactions are often called 'deterministic interactions', as they do not explicitly involve any randomness.

Stochastic drive:
The matrix B represents the biological variation (due to an accumulation of independent unmeasured effects) expected to affect the model variables as they evolve. The source of the type-B variation is modeled by the noise dW(t) (i.e., Brownian-increments) which are each drawn independently from the Gaussian distribution N ð0; dt � I d�d Þ. The matrix B 2 R d�d controls the correlations between the noise received by the different variables [37]. Put another way, dtBB ⊺ is the covariance matrix of biological variation for this model, analogous to the variance-covariance matrix in quantitative genetics [39]. This biological variation is often called a 'stochastic drive' to disambiguate it from the type-C observation-noise mentioned below.

Observation-noise:
The matrix C in Eq (2) represents both the noise inherent to the observations, as well as the effects of very rapid unpredictable fluctuations in the biomarkers. Each vector ϵðtÞ 2 R d at each observation-time is drawn (independently) from the Gaussian distribution N ð0; I d�d Þ. The matrix C 2 R d�d controls the level of correlations within the type-C errors; the covariance of this observation-error is given by the symmetric-matrix CC ⊺ .
Note that the observation-times are not necessarily unique: t 0 could very well equal t, and dt may equal 0. In this situation dX(t) � 0, and so X(t 0 ) will equal X(t). However, because of the type-C errors, Y(t 0 ) will in general be different from Y(t).
While we could, in principle, fit an SDE to all the variables simultaneously (with a large d), we opt to consider only pairwise combinations of variables (with d = 2 in each case). Mathematically, each such pairwise fit (for variables v and v 0 ) corresponds to approximating the system's trajectory after projecting the full system onto the subspace spanned by those two variables. We make this decision for three reasons.
Increase Power: First and foremost, the number of model parameters, as well as the number of observations required to accurately estimate these parameters, both increase as d increases. We will have the most statistical power when d is low. This topic is discussed in more detail within S1 Text.
Avoid Redundancy: Second, redundancies in any subset of variables (e.g., GFR and Creatinine, or MCH and MCV) can easily create the illusion of large interactions between the redundant set and other variables. These spurious interactions will be statistically, but not biologically significant. In mathematical terms, these redundancies will result in a poorly conditioned inverse problem, which we want to avoid [40].
Easily Understand: By considering only pairwise interactions, the results can be interpreted more easily. The interactions between any pair of variables can then be considered (and used to make predictions) even in situations where the other variables aren't accessible, or haven't been measured.
We acknowledge that there are many strategies for modeling the interactions in this longitudinal data-set (see, e.g., [41][42][43][44][45]). For example, there are many similarities between our model and vector-autoregression [46]. Indeed, if the time-intervals dt in this data-set were all of equal size, then our method would be equivalent to one-step vector-autoregression; the only structural difference being that in such a scenario the type-B and type-C variation would be indistinguishable from one another. We opt for the SDE proposed above because (i) it has relatively few parameters, (ii) it is easy to understand, and (iii) it can easily accommodate the irregular time-intervals and occasional missing elements within this data-set.

Model interpretation
We use a generalized regression (see S1 Text) to fit the longitudinal data for each pair of biomarkers v and v 0 with the simple dynamical system (SDE) above. After fitting the model parameters we interpret A and BB ⊺ as having potential biological significance.
The entries A vv 0 indicate the directed interactions (i.e., the deterministic effects) from biomarker v 0 to biomarker v. If A vv 0 is positive, then we expect an increase in biomarker v 0 to typically precede a subsequent increase in biomarker v; we conclude that v 0 'excites' v. Conversely, if A vv 0 is negative, then we expect that v 0 will 'inhibit' v; an increase in v 0 will typically precede a subsequent decrease in v.
The symmetric entries [BB ⊺ ] vv 0 = [BB ⊺ ] v 0 v indicate the shared biological variation (i.e., the stochastic drive) expected to influence both biomarkers v and v 0 . If [BB ⊺ ] vv 0 is positive, then v and v 0 will experience correlated stochastic input which will give rise to correlated (but not causal) fluctuations between these two biomarkers. Conversely, if [BB ⊺ ] vv 0 is negative, then v and v 0 will experience anti-correlated stochastic input which will produce anti-correlated (but again, not causally-linked) fluctuations between the two biomarkers. Because we fit the SDE to each pair of biomarkers individually, the correlations (or anti-correlations) within this stochastic drive to the pair v, v 0 can result from the accumulated effects of the other measured biomarkers (other than v, v 0 ), as well as from biomarkers which weren't included in the study at all.
To illustrate these kinds of relationships, we fit the SDE in Eqs (1) and (2) to the two biomarkers 'Iron' and 'Mean corpuscular hemoglobin' (MCH) across all observation-times across all dolphins. Referring to Iron and MCH as v 0 and v, respectively, we find that A vv 0 is significantly positive and A v 0 v is significantly negative (with uncorrected p-values p 0 < 1e − 11 and p 0 < 1e − 4 respectively, see Methods). Moroever, we find that [BB ⊺ ] vv 0 is significantly positive (p 0 < 1e − 50). As we'll discuss below, the type-B correlations are quite a bit larger (in magnitude) than the type-A interactions for these two biomarkers (note the difference in p-values), and an accurate estimate of A vv 0 can only be obtained after estimating [BB ⊺ ] vv 0 .
The directed interactions A can be visualized using the phase-diagram of the linear differential-equation associated with A (i.e., by ignoring the effect of B in Eq (1)). This phase-diagram is shown on the left of Fig 4, with the directed interactions giving rise to the flow vectors in the background (grey). For this differential-equation a surplus of Iron will drive an increase in

PLOS COMPUTATIONAL BIOLOGY
MCH, while a surplus of MCH will drive a decrease in Iron. This combination of excitation and inhibition will tend to give rise to 'counterclockwise' motion within the Iron-MCH phaseplane (indicated by the directions of the grey arrowheads).
The subplot on the left of Fig 4 also shows a finely sampled trajectory X(t) of the SDE (which includes the effect of the nonzero B). This trajectory is colored by age, from 0yr (cyan) to 55yr (magenta) (see adjacent colorbar). Note that the sample-trajectory often deviates from the counterclockwise course suggested by the directed interactions, often meandering around due to the stochastic drive provided by B. The sample trajectory shown here is just one possible trajectory for the SDE; other sample trajectories are shown in Fig J in S1 Text. Subplots illustrating the (kalman filtered) trajectories of several individual dolphins are shown on the right of this figure, using the same colorscale. Similar to the sample trajectory from the SDE, these experimental trajectories also meander around, while exhibiting a global tendency to prefer counterclockwise loops over clockwise ones.
The type-B and type-C variations in these two biomarkers are shown in Fig 5. To illustrate these sources of variation, we take all the time-increments dt used to fit the SDE (across all pairs of adjacent observation-times and all dolphins), and divide them into 10 percentilebands. For each percentile-band of dt we construct a scatterplot of the observed variable-increments dYðtÞ 2 R 2 for these two biomarkers. We fit the distribution of variable-increments in each percentile-band with a Gaussian, and indicate the 1-and 2-standard-deviation contours of this Gaussian with bright-and pale-yellow ellipses, respectively. The covariance-matrix CC ⊺ is approximately equal to the covariance of the scatterplot shown in the upper-left, with PLOS COMPUTATIONAL BIOLOGY dt * 0. The covariance-matrix for each of the other subplots is roughly given by dtBB ⊺ + CC ⊺ ; as dt increases the variable-increments become more and more correlated, indicating that [BB ⊺ ] vv 0 is positive (i.e., these two biomarkers are influenced by a shared and correlated stochastic drive).
For these two biomarkers the type-B and type-C variations contribute heavily, dominating the evolution of the trajectory of the SDE. Specifically, the term B in Eq (1) is several times larger than the matrix A. The term C in Eq (2) is also large, contributing observation-noise which is a few times larger (on average) than the type-A interactions. This is quite common, and most biomarker-pairs exhibit a similar phenomenon.
If we were to ignore the type-B and/or type-C interactions when fitting this data (i.e., setting B � 0 in Eq (1) or C � 0 in Eq (2)), then the correlations in these variations would impact our recovery of A, giving rise to inaccurate estimates of A vv 0 . These spurious estimates for A vv 0 would, in turn, give rise to an abundance of false-positives and false-negatives when estimating the significance of the directed interactions across the biomarker-pairs within the data-set. By accounting for these type-B and type-C terms within the SDE, we believe that we can accurately estimate the size and direction of the type-A interactions.
To summarize, we expect Iron and MCH to exhibit strong correlated (but not necessarily causal) fluctuations due to shared input coming from the accumulated effect of other (potentially unmeasured) biomarkers. By taking into account these strong type-B correlations, we can attempt to tease out more subtle directed type-A interactions. For these two biomarkers we do indeed find evidence of a statistically significant directed link: an increase in Iron will tend to produce a subsequent increase in MCH, while an increase in MCH will tend to produce a subsequent reduction in Iron.

Longitudinal analysis
By fitting the longitudinal data with the SDE Eqs (1) and (2) above, we observe multiple significant (type-A) directed interactions, as well as (type-B) shared biological variation. An illustration of the type-A interactions observed across all dolphins and ages is shown in Fig 6, while the corresponding type-B correlations are shown in Fig 7. We first remark that the type-B correlations typically dominate the type-A interactions. We saw an example of this earlier when considering the case of Iron versus MCH, and the same general trend holds for most of the biomarker pairs. Many strong type-B correlations can be observed between the biomarkers relating to red blood cells (RBCs) and hemoglobin (HGB), such as Iron, RBC, HGB, HCT, MCV, MCH and MCHC. These correlations are to be expected, because many of these biomarkers are influenced by the same underlying biological factors [47,48]. Similarly, we see many strong type-B correlations between the biomarkers relating to liver function, such as LDH, AST, ALT and GGT. Once again, these are all influenced by the same factors, and strong correlations are to be expected [49,50].
Regarding the type-A interactions, we note that most of the more significant type-A interactions involve an A vv 0 and A v 0 v with opposite signs. That is, one biomarker excites the other, while being inhibited in return (see, e.g., the interaction between Iron and MCH mentioned earlier). These 'push-pull' pairs manifest when the observed data for biomarkers v and v 0 both look like noisy versions of the same rough trajectory (e.g., meandering back-and-forth), but with one of the biomarkers 'leading' the other in time. The linear SDE that best fits this scenario is usually one which gives rise to stochastic oscillations with a distribution of temporal correlations fit to what was observed in the real data.
Some of the more statistically significant push-pull interactions include: (i) Iron Ð MCH, (ii) HGB Ð SED60, (iii) HCT+MCH Ð Bilirubin, and (iv) HGB Ð MCH. These interactions suggest an intricate interplay between Iron, various RBC attributes, and inflammation (for which SED60 is a proxy). These interactions are not without precedent in humans. For example, inflammation is well known to have wide-ranging effects, and can alter red blood cell clearance as well as red blood cell counts [51,52]. The production and regulation of Heme and Hemoglobin also involves many mechanisms, and can be affected by inflammation, in turn affecting endothelial cell activation [53][54][55][56][57]. The breakdown and/or destruction of RBCs can

PLOS COMPUTATIONAL BIOLOGY
also release damage-associated molecular patterns which can activate inflammatory pathways [58], and result in increased bilirubin levels, which in turn can trigger erythrocyte death [59].
Grouping the interactions. When observing the array of type-A interactions in Fig 6, it is clear that there are an abundance of 'push-pull' biomarker-pairs (such as Iron and MCH). Further inspection reveals that these push-pull pairs are not haphazardly scattered throughout the array (i.e., they are not 'randomly distributed'). Instead, there are subsets of biomarkers which exhibit collective push-pull interactions.
For example, let's define the subsets V and V 0 as follows. Let V contain the biomarkers RBC, HGB, and HCT, and let V 0 contain the biomarkers AST, MCH, Bilirubin, ALT, Sed60 and

PLOS COMPUTATIONAL BIOLOGY
Iron. With this definition, we see that these two subsets V and V 0 form a 'block' of push-pull interactions. That is, the directed interaction between any source biomarker from set V and any target biomarker from set V 0 is excitatory, while the reciprocal interaction is inhibitory. These directed interactions may be related to the observed relationships between liver function and RBC elimination [60][61][62].
The particular block of push-pull interactions described above is quite large (comprising a set of 3 and a set of 6 biomarkers), and is substantially larger than the typical push-pull blocks one would expect if the A-type interactions were randomly reorganized (p-value < 1e − 21). There are other (less significant) push-pull blocks within the array of A-type interactions as well. To systematically search for these push-pull blocks, we use a modified version of the biclustering techniques described in [63] (see S1 Text).
We can use this strategy to rearrange the biomarkers to reveal the significant push-pull blocks. As described in S1 Text, many of the significant push-pull blocks involve overlapping subsets of biomarkers, and there is no single best strategy for presenting them all. One possible rearrangement, which isolates some of the larger disjoint push-pull blocks, is shown in Fig 8. In addition to the push-pull block described above (shown in the upper left of Fig 8), there are two other push-pull blocks that can be seen towards the center of Fig 8. The first is defined by V containing AlkPhos and InorgPhos, and V 0 containing CPK, Platelets and BUN (pvalue < 0.007). The second is defined by V containing Creatinine and Lymphs, and V 0 containing AlkPhos, InorgPhos, CPK and Platelets (p-value < 0.026).
When considering the type-B interactions the situation is simpler. The array of B-type interactions is symmetric, and is already quite structured. A simple spectral clustering (i.e., sorting the leading term in an eigendecomposition of BB ⊺ ) can be used to reveal many groups of correlated biomarkers. In Other groups can also be seen, such as the group near the center of Fig 9, including AlkPhos, InorgPhos, Lymphs, and ACLymphocytes, relating the phosphate and phosphatase concentrations to lymphocyte count.
Age-associated interactions. To search for age-associated interactions, we checked if there were significant differences between the older dolphins (older than 30) and the middleaged dolphins (between 10 and 30 years old). The results for the type-A interactions are shown in Fig 10 (with the type-B correlations shown in Fig I of S1 Text). While we saw very few significant differences in the directed interactions between the male-and female-dolphins overall in this data-set (the only exceptions being the influence of magnesium on platelets and inorganic phosphate), several previous studies have found significant differences between the sexes [26,33,34,64]. Consequently, we repeated our analysis after segregating by sex. These results are shown in Figs 11 and 12.
While our sex-segregated analysis has lower power than when the sexes were pooled, we can already see that many of the significant differences observed in the general population are in fact driven by one sex or the other. For example, the male dolphin population drives the age-related change in the push-pull directed interaction between RDW and MCV. On the other hand, the female dolphin population drives many of the other observed age-related changes, such as the interaction between MCHC and RDW, and the interaction between Magnesium and Glucose.
The age-related changes we observe between RDW and MCV could reflect a change in the regulation of bone-marrow function, or the onset of certain kinds of iron-deficiency anemias [65][66][67][68]. More generally, RDW plays a role in various cardiovascular diseases, and it is possible that RDW serves as a proxy for the body's ability (or lack thereof) to tightly regulate certain processes [69][70][71]. Moreover, the management of RDW seems to be important for aging, with at least one study reporting a larger effect-size in men [72,73].

PLOS COMPUTATIONAL BIOLOGY
Regarding the observed interaction between Magnesium and glucose, magnesium is known to affect blood glucose levels and glucose metabolism in humans, and was found to influence glucose absorption in rats [74][75][76]. Moreover, magnesium can play the role of a secondary messenger for insulin action, and magnesium accumulation is in turn influenced by insulin [77,78]. Aging is also associated with changes in magnesium level, and magnesium deficits may result from many age-related conditions [79]. As just one example, the modulation of magnesium levels in elderly patients can potentially improve both insulin response and action, affecting blood glucose levels in turn [80].

Discussion
We have used a linear SDE to model the time-series data from a longitudinal cohort of dolphins. Importantly, this model accounts for stochastic correlations between biomarkers, without which we would not have been able to accurately estimate the directed interactions.
After performing our analysis, we observed many statistically significant interactions across the population as a whole. Some of the most significant interactions involve 'push-pull' biomarker-pairs, where one biomarker excites the other while being inhibited in return. A large block of push-pull interactions involves (i) RBC, HGB and HCT interacting with (ii) AST, We also reported several significant age-related changes to the interaction patterns. Some of the most significant age-related effects involve RDW, MCV and MCHC (in males and females), and the connection between magnesium and glucose (in females).
While certainly suggestive, our a-posteriori analysis cannot be used to decide whether these directed interactions are truly causal or not. Indeed, any significant type-A relationship between any pair of biomarkers (such as Iron and MCH shown in Fig 4) might be due to an unmeasured source affecting both biomarkers, but with different latencies. In this scenario

PLOS COMPUTATIONAL BIOLOGY
both biomarkers would receive the same unmeasured signal, and would not be causally linked. However, the receiver with the shorter latency would appear to excite the receiver with the longer latency. Moreover, if the unmeasured signal was roughly oscillatory (e.g., meandering back and forth on a time-scale comparable to the difference in latencies), then the receiver with the longer latency would appear to inhibit the receiver with the shorter latency.

PLOS COMPUTATIONAL BIOLOGY
In summary, while we believe that most causal interactions would result in a signal captured by our SDE model, the reverse is certainly not true. Causality can only truly be determined after following up our analysis with a controlled study wherein one of the biomarkers is perturbed directly while measuring the response of the other biomarkers.
Finally, we remark that, while dolphins and humans share many age-related qualities [18], they are also different in important ways. For example, female dolphins do not experience menopause, and thus would not be expected to exhibit the suite of hormonal changes that human women do as they age [81]. Further work is needed to determine which of the interactions we have discovered are truly causal, which affect (or are affected by) aging, and which generalize to humans.

Study population and animal welfare
This study was limited to archived, retrospective health data and biospecimen analyses that were collected on Navy bottlenose dolphins (Tursiops truncatus) as part of their routine health care. The US Navy Marine Mammal Program is an Association for Assessment and Accreditation of Laboratory Animal Care International-certified program that adheres to animal care and welfare requirements outlined by the Department of Defense and US Navy Bureau of Medicine.

Relationships between biomarkers
Preprocessing. For two of the biomarkers we excluded measurements with atypical labcodes that lay outside the standard range of measurements seen from the two most common lab-codes. This involved removing Albumin measurements below 2, and RBC distribution width (RDW) measurements above 40. We also ignored the glomerular filtration rate (GFR) biomarker, as for this data-set the GFR measurements v are functionally related to the Creatinine measurements v 0 (up to rounding/discretization error) as: Consequently, after applying the log-transformation described below, these two biomarkers become (almost) exactly correlated with one another. Log-transform. For each biomarker, we applied a log-transformation (adding the smallest discrete increment as a psuedocount) when such a transformation would reduce the skewness of the resulting distribution (across all dolphins). For this data-set the list of biomarkers undergoing a log-transformation is: WBC, MCV, RDW, NRBC, ACNeutrophils, Lymphs, ACLymphocytes, Monocytes, EOS, ACEosinophils, Glucose, BUN, Creatinine, UricAcid, Potassium, Protein, Calcium, AlkPhos, LDH, AST, ALT, GGT, Bilirubin, Cholesterol, Triglyceride, Iron, CPK, SED60, GFR.
Normalization. For each dolphin we estimated the age-related drift for each biomarker by using linear-regression on measurements taken from age 5 onwards, excluding outliers above the 99th percentile or below the 1st percentile. We record the slope of this age-related linear drift for each biomarker for each dolphin, referring to this value later on to categorize individual dolphins as slow-or accelerated-agers (see Fig H in S1 Text). After removing this linear drift term, we normalize each biomarker (for each dolphin) to have mean 0 and variance 1.
Analysis. For any particular subset of dolphins (e.g., male dolphins between the ages of 10 and 30), we fit a linear stochastic-differential-equation (SDE) to each pair of biomarkers, pooling observations across the dolphins in the chosen subset. Each of these SDEs has 12 parameters in total, accounting for type-A directed interactions, type-B shared stochastic drive, and type-C observation-noise (see Eqs (1) and (2)). Briefly, we performed the fit by using a variation of expectation maximization to approximate the model parameters that were most likely to have produced the observed data. The details are described in S1 Text. Our a-posteriori grouping of the type-A directed interactions (shown in Fig 8) is described in S1 Text. The code for these methods can be found within the github repository 'https://github.com/adirangan/ dir_PAD'.
Null Hypothesis. To estimate significance for any subset of dolphins across any age-interval for any pair of biomarkers, we compare the estimated parameters of our SDE-model to the parameters obtained after randomly permuting the time-labels of the measurements. Each label-shuffled trial k = 1, . . ., K is associated with a permutation of the time-labels π k which randomly interchanges time-labels within each dolphin (but not across dolphins) within the given age-interval. These label-shuffled trials correspond to sampling from the null hypothesis that the true interaction coefficient A vv 0 for any pair of biomarkers is 0, while still maintaining many of the correlations between biomarkers. Note that the same set of permutations {π 1 , . . ., π K } is used for the label-shuffled trials 1, . . ., K across all biomarker-pairs v, v 0 . Consequently, we can use the correlations between different parameters estimated from the same label-shuffled trial to adjust the p-values of any given parameter (see p h below).
Estimating p-values. For each parameter we estimate the p-value 'p 0 ' numerically using K = 256 label-shuffled trials. When the parameter value for the original data is more extreme than the analogous parameter from any of the K label-shuffled trials, we estimate p 0 by first (i) estimating the z-score z 0 for that parameter using the Gaussian-distribution fit to the collection of K label-shuffled parameter-values, and then (ii) estimating the p-value p 0 by applying a 2-sided test to the z-score z 0 ; i.e., logðp 0 Þ ¼ erfclnðjz 0 j= ffi ffi ffi 2 p Þ. To correct for multiple hypotheses we calculate two adjusted p-values 'p b ' and 'p h '. The adjusted p-value p b is obtained using a standard bonferroni-correction. Thus, p b = p 0 /J, where J is the number of parameters under consideration. For the directed interactions A vv 0 we set J = N(N − 1), while for the covariances [BB ⊺ ] vv 0 we set J = N(N − 1)/2 (recall N = 43 is the total number of biomarkers analyzed). This bonferroni-corrected p-value is an overestimate (i.e., p b is too conservative). A more accurate p-value can be obtained by using an empirical version of the holm-bonferroni adjustment, as described in S1 Text. These strategies can easily be extended to estimate the significance between different groups of dolphins (see Fig G in S1 Text).
Estimating Aging-rate. To demonstrate consistency with the analysis of [35], we can estimate the aging rate of the dolphins. These results are shown in Fig H in S1 Text.  Table. Signed logarithm of the holm-bonferroni-corrected p-values for the B-type interactions associated with Fig 10 (comma-separated values). (CSV) Fig 11 (comma-separated values). (CSV) S10 Table. Signed logarithm of the holm-bonferroni-corrected p-values for the A-type interactions shown in Fig 11 (comma-separated values). (CSV) Fig 12 (comma-separated values). (CSV) S12 Table. Signed logarithm of the holm-bonferroni-corrected p-values for the A-type interactions shown in Fig 12 (comma-separated values). (CSV)