A new approach for incorporating 15N isotopic data into linear inverse ecosystem models with Markov Chain Monte Carlo sampling

Oceanographic field programs often use δ15N biogeochemical measurements and in situ rate measurements to investigate nitrogen cycling and planktonic ecosystem structure. However, integrative modeling approaches capable of synthesizing these distinct measurement types are lacking. We develop a novel approach for incorporating δ15N isotopic data into existing Markov Chain Monte Carlo (MCMC) random walk methods for solving linear inverse ecosystem models. We test the ability of this approach to recover food web indices (nitrate uptake, nitrogen fixation, zooplankton trophic level, and secondary production) derived from forward models simulating the planktonic ecosystems of the California Current and Amazon River Plume. We show that the MCMC with δ15N approach typically does a better job of recovering ecosystem structure than the standard MCMC or L2 minimum norm (L2MN) approaches, and also outperforms an L2MN with δ15N approach. Furthermore, we find that the MCMC with δ15N approach is robust to the removal of input equations and hence is well suited to typical pelagic ecosystem studies for which the system is usually vastly under-constrained. Our approach is easily extendable for use with δ13C isotopic measurements or variable carbon:nitrogen stoichiometry.

T is the temperature (°C). W 1 is the average mass of a protozoan (7.5) and W 2 is the average 928 mass of a mesozooplankton (3.8×10 6 ). where Z is an orthonormal matrix that is calculated by the singular value decomposition of the 936 matrix E and x 0 is an initial solution that solves the equality and inequality constraints. We used 937 the L2MN solution (found using the R function lsei) as our initial solution. The MCMC 938 approach then follows an iterative approach to find new solutions through a constrained random 939 walk through the solution space: 940 1) Given solution x n = x 0 + Zq n a new solution is found by a random jump such that q n+1 = q n 941 + jmp × R n , where jmp is a pre-selected jump length and R n is a vector with the same 942 length as q n that is drawn from a random normal distribution.

943
2) The mirror algorithm is used to reflect the solution q n+1 off hyper planes defined by the 944 inequalities (Eq. 3), to ensure that the solution derived from q n+1 will also satisfy the 945 inequality constraints.

946
3) x n* is then defined such that x n* = x 0 + Zq n+1 and the probability of x n* and x n with respect 947 to Eq. 2 are calculated: (p(x)=e -1/2ζ-2(Ax-b)'(Ax-b) ). 948 4) If p(x n* ) > p(x n ), then x n* is accepted as a solution and appended to the matrix of solutions 949 as x n+1 and the process is repeated from step 1. 950 5) Otherwise, a random number (r) is drawn from the uniform distribution from 0 to 1. If r 951 ≤ p(x n* )/p(x n ), then x n* is accepted as a solution and appended to the matrix of solutions 952 as x n+1 , and the process is repeated from step 1. Oevelen et al. [2]. We implemented xsample using a jump length (jmp) that was chosen 958 separately for each model run to allow acceptance rates that were typically in the range of 10%.

959
Each MCMC solution was preceded by a burn-in period equal to 20% of the total run length.

960
This burn-in period was used to ensure that the overall solution was not influenced by the 961 arbitrary choice of x 0 . To allow for rapid dispersal from the initial location (x 0 = L2MN 962 solution), for the beginning of the burn-in period we used a standard deviation (ζ) for the 963 approximate equalities (Eq. 2) equal to 10 times the ζ that would be used throughout the rest of 964 the random walk. Random walk length was at least 100 million, but was adjusted to ensure that 965 for all unknowns (flows) the difference between the mean of the first half of the random walk 966 and the mean of the second half of the random walk was less than 5%. To minimize 967 performance issues associated with memory constraints, we only stored every 10,000 th solution 968 vector. 1) Given solution x n = x 0 + Zq n a new solution is found by a random jump such that q n+1 = q n 981 + jmp × R n , where jmp is a pre-selected jump length and R n is a vector with the same 982 length as q n that is drawn from a random normal distribution.

983
2) The mirror algorithm is used to reflect the solution q n+1 off hyper planes defined by the 984 inequalities (Eq. 3), to ensure that the solution derived from q n+1 will also satisfy the 985 inequality constraints. selected by a random jump such that ∂ U,n* = q n + jmp 15 × R n , where jmp 15 is a pre-988 selected jump length and R n is a vector with the same length as ∂ n that is drawn from a 989 random normal distribution. 990 4) Since matrix A from Eq. 2 is a function of ∂ U and ∂ K (see Appendix 1), we then call a 991 function written in R (ResetRN15) to re-calculate A(∂ U,n*, ∂ K ). 992 5) x n* is then defined such that x n* = x 0 + Zq n+1 and the probability of x n* and x n with respect 993 to Eq. 2 are calculated: (p(x)=e -1/2ζ-2(Ax-b)'(Ax-b) ), where A is a function of ∂ U,n and ∂ K or 994 ∂ U,n* and ∂ K .