Modelling daily weight variation in honey bee hives

A quantitative understanding of the dynamics of bee colonies is important to support global efforts to improve bee health and enhance pollination services. Traditional approaches focus either on theoretical models or data-centred statistical analyses. Here we argue that the combination of these two approaches is essential to obtain interpretable information on the state of bee colonies and show how this can be achieved in the case of time series of intra-day weight variation. We model how the foraging and food processing activities of bees affect global hive weight through a set of ordinary differential equations and show how to estimate the parameters of this model from measurements on a single day. Our analysis of 10 hives at different times shows that the estimation of crucial indicators of the health of honey bee colonies are statistically reliable and fall in ranges compatible with previously reported results. The crucial indicators, which include the amount of food collected (foraging success) and the number of active foragers, may be used to develop early warning indicators of colony failure.


Introduction
There is growing concern that crop pollination by honey bees might no longer be a sustainable operation in the near future [1]. Single and combined stress from pesticides, parasites, and diseases have led to an increase in bee colony mortality worldwide [2][3][4][5][6][7]. Despite beekeepers' efforts to control pests and diseases, annual colony losses of about 30% are frequently reported in countries across Europe [8] and North America [9][10][11]. Automated methods to track the status of hives can play a key role in helping beekeepers prevent the death of colonies [12][13][14][15][16].
The social organization of a bee colony is complex and relies on positive and negative feedback loops to maintain homeostasis. Feedback loops can help buffer the effect of stressors or cause snowballing effects which in some cases precipitate colony death. A well-known feedback loop that has been hypothesized to accelerate the failure of the colony is the one that regulates the ontogeny of foraging [17]. In a bee colony, workers typically spend the first few days of their adult life performing various maintenance, defence, and care tasks before transitioning into foragers [18]. The presence of a sufficiently large number of foragers inhibits the transition of younger bees into foragers [17,19]. Under ideal conditions, this ensures enough bees are performing in-hive tasks, and that the colony is able to process all the food brought back by the foragers and to raise the next generation of workers. If an environmental stress reduces the number of foragers, then this same mechanism ensures the rapid replacement of the foraging force. With fewer foragers, the transition of younger bees into foragers is increased and the foragers' population is rapidly replenished. Under sustained stress, it has been hypothesized that this buffering mechanism can lead to generations of workers starting to forage at a younger age, which reduces their life expectancy and the foragers' population increasingly fast until colony death [20]. Thus, information about the number of forager bees and their activities can provide important guidance about the health of the hive.
Directly measuring the number of forager bees is difficult, but the use of new technologies provides opportunities to estimate them from more accessible data [12,21]. Recently developed affordable and precise balances can easily be fitted to hives in the field to study the longterm weight changes [22] and shorter within-day weight variations of hives [12,23]. Withinday hive weight variations are clearly visible and exhibit similar patterns that are consistent across multiple day as shown in Fig 1. In the morning, rapid weight loss occurs because foragers leave the hive [24]. For this reason, previous methods have focused on measuring the slope and low point of the morning departure segment of daily hive weight's curves [24], and have used this measure as likely representing forager activity [25]. However, these phenomenological methods provide limited information about the daily internal dynamics of the hive. Other methods perform piece-wise linear regression analysis where up to five breaking points are suggested after an abrupt change of slope [24,26]. In particular, these methods do not provide quantitative estimations of the relevant bee activities such as the number of forager bees or the amount of food brought to the hive because the different bee activities are not explicitly modeled. In this paper, we develop a mechanistic mathematical model of the daily dynamics of hives which, in combination with hive weight time series, provides quantitative estimations of key indicators of the state of the hive such as the total number of active forager bees and the average mass of food they bring to the hive in each trip. We start in Sec. 2.1 with our model presentation, its qualitative description, and its main properties. In Sec. 2.2 we present how the parameters can be inferred from data. In Sec. 2.3 we apply our methodology to time series of ten different hives measured across a year and show how in most cases the results lead to biologically meaningful estimations. Finally, in Sec. 3 we summarize our conclusions. A detailed description of our data and computational methods is presented in the Materials and Method section. The data and codes used in our analysis are available in a repository [27]. bees actively maintain the temperature of the brood at about 35˚C, as well as humidity and CO 2 concentration to optimize development. Bees actively warm the colony by shivering, or cool down and ventilate the hive by evapotranspiration and fanning. These activities require a great deal of movement and, thus, cause honey bees to consume significant amounts of honey at night. Sugars in the honey are broken down to produce energy and exhaled as CO 2 . A colony also evaporates a large amount of water, either to facilitate The main time-dependent elements we model are the number of forager bees N(t) and the food F(t) (in green) inside the hive. The foragers are represented as active bees that depart the hive (black arrow) and return to it (red arrow) with food (in orange). The ratio of forager bees inside the hive at a particular time is represented by x(t), which is constant before and after the dynamics of forager bees because the N max foragers are all inside the hive. The static weight W 0 is composed of the hive structure and bees that do not leave the hive.
https://doi.org/10.1371/journal.pcbi.1010880.g002 PLOS COMPUTATIONAL BIOLOGY biological processes, concentrate sugars in food stores, or cool down the hive. Moreover, it is likely that a small amount of the weight loss at night results from cleaning activities, with bees likely getting rid of pests, dead bees and chewed brood cell caps throughout the night. The cycle resumes the next day again.
The net outcome of the activities described above depends on the hive, the environment, and the weather conditions. In favourable environments and at its peak population in summer, a single bee colony can harvest a few kilograms of food in a single day. Most of this weight comes in the form of nectar and honeydew, which generally contains more than 50% water and diluted sugars [28]. Hives in environments with little or no resources may lose more weight than they gain in a day as a consequence of honey consumption inducing respiration and evaporation (i.e., foragers expend more resources than they are able to bring back to the colony).
Mathematical model. We consider the time dependent weight W 2 R þ (measured in g) to depend on the number of forager bees N 2 Z; 0 � N � N max (measured in bees) and on the food F 2 R þ (measured in g) as where W 0 is the constant weight of the hive and w is the (average) weight of a bee. Next we discuss the variation in N(t) and F(t).
For mathematical convenience, we model the variation in the number of foragers bees N(t) by focusing on the fraction x 2 R (nondimensional) of all N max foragers bees that are inside the hive: We write the rate of change of x as where a(t), measured in 1/s, is the arrival rate of the fraction of forager bees outside the hive and d(t), measured in 1/s, is the departure rate of the fraction of forager bees inside the hive.
The inverse of these rates can be thought of as the typical period a bee spends outside and inside the hive, respectively. These terms multiply the fraction of bees outside (1 − x) and inside (x), respectively. The rate of bees arriving is AðtÞ ¼ aðtÞðN max À NðtÞÞ ¼ aðtÞN max ð1 À xðtÞÞ and the rate of bees departing is where A and D are measured in bees/s. The food inside the hive F(t) increases due to the mass m (measured in g) of food brought in by arriving bees and decreases due to evaporation, respiration, and food consumption, which we model as a constant loss ℓ (measured in g/s) so that: Notice that F(t) is linear in x and can be computed directly from x(t) and a(t) by integrating the above equation. Substituting F(t) and x(t) in Eq (1) we obtain the time-dependent weight W(t) of the hive, our measurable quantity. This implies that the equations can be solved sequentially, starting from x(t). The final step of our model is to specify the departure d(t) and arrival a(t) rates defined in Eq (3) and then solve for x(t). We will consider a very simple choice for a(t) and d(t)-piecewise constant in t-which allows for an explicit solutions for x (t) and, thus, W(t). Note that more sophisticated models of food inside the hive F(t) could use a time-dependent loss ℓ(t) dependent on temperature, weather variations, or a term proportional to N (or x) to indicate a consumption rate proportional to the number of (forager) bees. Furthermore, more elaborate functions a(t) and d(t), such as delay equations, could be incorporated.
Day and night cycles. Without loss of generality, in our model, we consider: t = 0 to be the time in the morning (around sunrise) at which the bees start exiting the hives. Therefore, in all cases below we assume d(t) = 0 for t < 0 and t > 1. The matching of the model time t to the clock time requires the estimation from data of two clock times: t 0 (corresponding to t = 0) and t 1 (corresponding to t = 1).
Night dynamics. Let us start with the (easiest) night case t > 1, when bees do not depart d(t) = 0 and arrive at a constant rate a(t) = a. Assuming that at the beginning of this regime at t = 1 there are x 1 bees inside, the particular solution of Eq (3) is which corresponds to an exponential decay towards all the bees being inside the hive (x = 1) with a rate a. Therefore, we can consider 1/a proportional to the duration for which bees remain outside the hive. Day dynamics. Let us consider the simple case where d(t) = d and a(t) = a, both constant in time. The initial condition in this case is x = 1 at t = 0 because we assume all forager bees spent the night in the hive (i.e., 1/a � night period and all previous-day foragers return). In this case, Eq (3) leads to Explicit solutions. We are now able to combine the results above to write explicit solutions for the fraction of foragers inside the hive x(t), the weight of food inside the hive F(t), and the total weight of the hive W(t). We consider the arrival rates during the day a 1 and at dusk a 2 to be different: where we expect to retrieve a 2 > a 1 (bees spend less time out when it is getting dark).
Introducing these arrival rates and a constant departure rate d in Eqs (5) and (6) we find an explicit solution divided in three time intervals for x(t): where Þ is the value of x(t = 1) and ensures the continuity of x(t). We can now turn to the dynamics of food inside the hive F(t). From Eq (4) we obtain wherem ¼ mN max a and C is an integration constant to be fixed using the initial condition F(0) = F 0 and F(1) = F 1 . The solution for F(t) is then solved by integrating Eq (8), which can be done analytically in each of the three branches leading us to where F 0 � F(t = 0)-for this model an arbitrary constant that is incorporated into the static weight of the hive W 0 in Eq (1), consequently, without loss of generality, we take F 0 = 0-and F 1 � F(t = 1) is computed setting t = 1 in the second line of Eq (9) so that Finally, the weight of the hive W(t) is obtained from Eq (1) as the sum of Eq (8) (times wN max ), Eq (9), and the constant weight of the hive W 0 , leading to an explicit piece-wise continuous function W(t) that will be compared to the measured data. A summary of the parameters used in the model is given in Table 1.
Effective description of the model solution W(t). The mathematical model described above is divided in three regimes-t < 0, 0 � t � 1, t > 1-with similar characteristics as shown in Fig 3. The three regimes contain segments of linear dependence of W(t)-a decay in the first and last regime, and typically an increase in the intermediate regime-with a transition period (shaded region) between the regimes: t < 0 The first regime corresponds to all bees inside the hive (x = 1) and a linear decay of W on time t with rate ℓ-evaporation and respiration rate defined in Eq (4)-and with an intercept A � W(t = 0).
The second regime starts with an exponential relaxation of x(t) from 0 to the fixed point value x � � a dþa , as described in Eq (6), which leads to a sharp decay of W(t) until an inflection time t c . Shortly after t c , W(t) shows a linear dependency with rate α and constant B.
The third regime shows an exponential relaxation of x(t) from x � to x = 1 as described in Eq (5), which leads to a sharp growth in W(t) as all the bees return and before the linear decay of the first regime starts again completing the cycle.
The critical regime describing bee activities is the second regime that takes place during the day 0 � t � 1. It can be effectively described by the 4 parameters A, α, B, and t c (see Fig 3a), which can be rewritten in terms of the parameters of the model (see Table 1) as: A: The total weight of the hive at W(t = 0) defined as: A = N max w + W 0 α: The rate of weight gain due to foraging, from Eq (9): a ¼mð1 À a 1 dþa 1 Þ À ' B: The intersection point of weight gain at t = 0: The time of the minimum weight: shows an important mathematical property of our model: different choices of parameters lead to nearly identical curves W(t) making our model effectively underdetermined by W (t). Mathematically, this underdetermination happens because the four effective parameters introduced above (A, B, α, t c ), which provide a simplified quantitative description of the solution W(t), depend on five model parameters (N max , W 0 , m, a 1 , d, note thatm � mN max a and the mass of a bee w is taken as fixed). This means that there are different choices of the parameters θ of our model that lead to the same four effective parameters describing the second regime W(t) (even if the curves W(t) are mathematically distinct, their difference is indistinguishable in Fig 3 and much smaller than the experimental precision of the data). Biologically, this underdetermination appears due to our definition of which bees in the hive count as forager bees. Our model parameters are fixed and can be thought as an average over a spectrum of bees that ranges from very active to those that do only very few forager excursions in a day. Depending on whether we include the least active bees as foragers or not, we obtain different model parameters but the same effective parameters defined above. For instance, the same Table 1. The parameters θ of our model. For simplicity, the average weight of bees w is fixed based on previously reported results. The remaining parameters are inferred from data within a prior range of admissible parameters (right column). The boundaries of the range of flat priors were chosen based on the cited references or on values smaller/ larger than reasonably possible values (e.g., the maximum weight max(W) is chosen to be the maximum total weight of the hive at all times). In all cases reported below, the inferred parameters remained away from the boundaries, indicating that our choice of boundaries has no influence on our results.

Parameter Description
Prior Range   initial weight A � W(t = 0) = N max w + W 0 can be obtained by including the less active bees as foragers-thus increasing N max -or as part of the non-foragers bees that spend most of the time in the hive-thus contributing to the constant W 0 . Similarly, counting the least active bees as foragers reduces the average mass m of food brought to the hive, increases the average time they spend in the hive (i.e., decreasing d), decreases the time spent outside the hive (i.e., increase a 1 , a 2 ), and leads to a larger fraction of forager bees inside the hive (x � increases). As shown in Fig 3, solutions of W(t) with very different N max , W 0 in Fig 3b and 3d, respectivelywhich imply also different d, a 1 , a 2 , and m-lead to essentially the same curves for the food F

Connecting model and data
Here we show how to combine our model with data to extract information about the state of hives. We describe the theoretical and computational methodology used to infer the model parameters θ from time series of hive weights W 0 (t).
Theory. We assume that the measured weight W 0 deviates from the actual weight W as where ε is an independent Gaussian distributed random variable with zero mean and standard deviation σ, i.e., ε � N ð0; s 2 Þ, that accounts for measurement errors and random effects not accounted in our model. If θ are the model parameters generating W � W θ , at an arbitrary time t, as described in the model of the previous section, the probability likelihood of an observation W 0 is a Gaussian distribution centred at W θ given by Assuming that observations are performed at times q 1 , q 2 , . . .q n with q i = q 0 + nδq and are (conditionally) independent from each other, the log-likelihood of the observations is given by: We can include further information on the parameters θ by considering a prior distribution P(θ). Here we consider flat priors for all parameters, i.e., P(θ) = cst. for θ 2 [θ 1 , θ 2 ] as specified in Table 1 above. Effectively, this restricts the range of admissible parameters without changing the probability of parameters within the range. The estimation of the parameters θ can be done maximizing the posterior distribution P(θ|D) which in our case is equivalent to the maximization of the log-likelihood (10) or to the minimisation of the squares of W 0 − W as in the range of admissible parameters set by the priors.
Computation. The problem of obtaining the ten parameters introduced in our modelsee Table 1-has been thus reduced to the optimization problem of finding the minimun of L in a ten-dimensional space. The computational method we use for this optimization is centred around the Levenberg-Marquardt algorithm as implemented in the library SciPy (See Materials and methods Sec-4.1). However, a reliable estimation of the parameters cannot be obtained directly through the naïve application of this method both because of the high-dimensionality of the problem and the effective underdetermination of our model reported at the end of last section (which leads to essentially flat regions of L in the parameter space with multiple local minima). Here we use properties of bee hives and our model to simplify this problem and obtain a reliable estimation of parameters.
We start by considering four parameters: w For simplicity, we fix the average weight of bees to w = 0.113 g/bee [29,30].
t 0 , t 1 We infer the times t 0 and t 1 in a grid of plausible values (separated by five minutes). During this minimization we do not distinguish between model parameters that lead to the same effective parameters A, B, α, and t c because t 0 and t 1 are not involved in them. The estimation of t 0 and t 1 allow us to re-scale the data to the model time scale, facilitating the numerical estimation of the remaining parameters. ℓ The continuous loss can be calculated directly from the negative linear decrease when t < 0.
These parameters, and the effective parameters (A, B, α, t c ), are not strongly affected by the underdetermination problem discussed above. The remaining six parameters-N max , W 0 , m, a 1 , a 2 , d-are directly connected to the underdetermination problem and need to be estimated with care. We start by obtaining an estimation of A, B, α, and t c by direct minimization of L. The mapping of the effective parameters to the model parameters is done in two different ways: (i) We seek a range of plausible model parameters determined as the values of W 0 and N max that keep the four effective parameters constant and that maintain the parameters ℓ, m, a 1 , a 2 and d within their prior range (set in Table 1). In practice, this is done by defining different linear combinations between N max and W 0 that are introduced in the system of six equations that connect the parameters. For each possible N max , we obtain a particular a 1 , d, and m by solving the systems of equations given by A, B and t c . The second rate of arrival, a 2 , is obtained by the conservation of departed and returned bees. This method leads to the estimation of an interval of possible parameters.
(ii) Alternatively, by pre-defining one parameter, we calculate the N max value directly because the system of equations presents a unique solution. We focus on the departure rate d because of the consistency among different reported studies on the time interval between trips that a forager bee spends inside the hive [37] and because d can be directly connected to the simplifying modeling decision of dividing bees into foragers and those that do not leave the hive (i.e., we consider as foragers the most active bees so that collectively their average departure rate is d, the other bees are considered to effectively stay inside the hive, and their weight is counted as part of W 0 ). In practice, we choose d so that the average time spent by forager bees inside the hive between trips is τ d = 0.816 h (the half-life τ is derived from the arrival or departure rate, and defined in Eq (12) below) [36].
Even if the first approach does not lead to a point estimation of the parameters, in practice most solutions can be restricted to relatively narrow intervals by considering physically-based constraints. Some of these constraints are the average mass of food carried by a single bee m, the interval of time between trips τ, and the number of trips needed to satisfy food requirements. Details on the computational methods appear in Materials and Methods Sec. 4.2 and the codes are available in the repository [27].

Results from data analysis
In this section we report and discuss the results obtained by applying the model defined in Sec. 2.1 to data using the inference methods explained in Sec. 2.2.
Data. We used 24h time series of weights W 0 (t) sampled every δt = 1 minute for ten different hives on different days. The data was collected between 27 th of November 2017 and 2 nd of March 2019, from hives located at Macquarie Park, NSW, Australia (33˚46 0 06.6 00 S 151˚06 0 43.8 00 E, see [34]). We show data from the 7 th of April in Figs 4, 5a and 5b; additional dates appear in Fig 5c. In all cases, the daily maximum temperature was above 16˚C (which is  sufficient for honeybees to forage [38]), and no rain was recorded (ensuring that the recorded weights are not affected by precipitation). This dataset also contains an independent estimation of the total number of bees in each of the hives, obtained from measures of the total weight of the hive with and without bees (performed two days after the day we use to infer the parameters of our model) [34]. As a pre-processing step to the analysis of weight time series, we removed outliers that correspond to exogenous influences as described in Materials and Methods Sec. 4.1.
Representative case. Fig 4 shows the comparison between our model with inferred parameters and data for one specific hive. We see how the combination of the three piece-wise continuous curves in our model (for t < 0, 0 � t � 1, and t > 1, respectively) provides a reasonable account of the characteristic pattern of within-day hive weight variations. The main advantage of our methodology is the mechanistic generative model associated to this curve which ensures the parameters describing the curve have a relevant meaning. For instance, we estimate that on this day there were N max = 5, 830 active forager bees, that they brought on average 0.03 g of food on each trip, and that trips were on average t a 1 ¼ 0:70 h long. The difference between our model and the data is particularly pronounced at t ≳ 1, when a sharper than expected transition is observed. This suggests that modifications of our model are needed to better capture the return of the bees during the initial night period.
Activity time. As a further test of our inference method, and the meaningfulness of its outcome, we look in detail at the inferred times in which foragers start (t 0 ) and stop (t 1 ) departing the hive on different days and different hives. The results were summarized in panels of . These results confirm the success of our approach which is to treat these times as additional parameters to be inferred from the data simultaneously with the other parameters. Systematic analysis. Motivated by the success of our methodology in the cases above, we applied it systematically to 10 different hives on the same day (2018-4-7). The results obtained from our two inference methods are reported in Table 2. The comparison of the inferred parameters of these hives with expectations about the behaviour of bees and hives provides further evidence that our mathematical model is capturing meaningful information from the data. In particular: • The time forager bees start (t 0 ) and stop (t 1 ) departing the hives are aligned with sunlight.
• The time of minimum weight was inferred to be around t c = 9: 45 a.m. shortly after t 0 (for all hives except 'Hive 8', which presents a later minimum). This result is compatible with the minimum daily weights reported in experimental data [39], which reveal minimum weights around 9: 00 a.m. Recently, results from segmented linear regression showed minimum losses can take place even 4 hours after sunrise [40].
• The estimation of the number of forager bees N max is around a few thousand bees, in line with alternative methods of estimation. In particular, Fig 6a shows  • In 90% of the cases, a 1 < a 2 , which is in line with the expectation that bees return faster to the hive at dusk (after they stop leaving).
• The typical time spent by forager bees inside and outside the hive is computed through an analogy with the half-life, τ, which represents the time that half of the foragers' colony will need to go outside the hive or vice-versa. The half-life, τ depends on the departure or arrival   rate r 2 {d, a 1 , a 2 } in hours as From the calculated values of d, a 1,2 , and t 1 − t 0 , we determine that the time in the hive between trips, τ d , ranges from 0.29h to 0.85h and the typical time of a foraging trip, τ a , is in between 0.14h to 3.40h. These estimations align with the results in Refs. [36,37,42]. The only hive that shows values up to 6.25 h for trip duration is Hive 8, this hive shows behaviour outside the ones predicted by our model, as evidenced by ; in the interval estimation. However, some reports points that bees are able to be outside the hive 6.25 h at the age of 36 days [35]. Arrivals and departures. Finally, we test the predictions of our model against independent experimental measures that tracked the number of bees that arrive and depart from a hive at a time interval [37]. Fig 6b compares these measurements to the assumption in our model and reveals the limitations of our simplifying choice of constant and piece-wise linear arrivals and departures in Eq (7). While our model captures the main characteristics of the data, unsurprisingly, bee activities do not start or end abruptly and the measured data shows smoother variations than the prediction of our model. A similar behaviour can be observed for W(t)-such as in Fig 3-where the measured data do not show the abrupt changes seen in the  18.2% of all bees being foragers, constant across all hives. The total number of bees was estimated through an independent measurement performed on 2018-04-9, two days after the data used in our inference [34]. The points without bars correspond to those that had no successful interval estimation, as discussed in Table 2. In all our analysis of correlation between variables, the reported p-value p is the probability of obtaining a positive coefficient of determination R 2 equal or larger than the reported R 2 under the null hypothesis that the variables are independent from each other. (b) The plot represents the difference between the number of bees arriving and bees departing a honey-bee hive during a 5 minutes interval. The model prediction (blue line) was computed integrating A(t) − D(t) over each time interval. The real data was measured using bee-tracking methods.
https://doi.org/10.1371/journal.pcbi.1010880.g006 model curve at t = 0 and t = 1 (discontinuous derivative). This limitation can be partially overcome within our modeling framework-Eqs (1) to (4)-by changing the assumptions of constant a(t) and d(t)-e.g., in Eq (7). For instance, the next simplest alternative is to consider that the arrival remains constant a(t) = a but that the departures varies continuously from 0 at time t = 0 to a maximum value d max at time t = 0.5 and back to 0 at t = 1. The simplest polynomial with these characteristics is In this case, Eq (3) is a linear first-order equation and the solution is: where rðtÞ ¼ at þ 2d max t 2 À 4 3 d max t 3 and C is the arbitrary constant to be fixed by the initial condition x(0) = 1. This alternative model has the same number of parameters (d max instead of d), but requires additional computational resources to numerically find the solution for x(t) through the integration of Eq (14). This example illustrate the flexibility of our model to consider more realistic settings and also the trade-off between model complexity and the need of computational approaches.

Discussion
We introduced a simple mathematical model of the foraging activities of bees that reproduces the observations of within-day weight variations of hives. Simplistic assumptions of the departure and arrival of bees allow us to obtain closed form solutions of the model that result in a continuous and piece-wise differentiable curve W(t) with 10 biologically-relevant parameters. We inferred the parameters of W(t) for 10 different hives and for different days across the year. Overall, the results reveal that the parameters we infer agree with existing knowledge of bee behavior and that the inferred number of forager bees is compatible with the expectation based on the independent measure of the total number of bees in the same hives. Further work is needed to quantitatively test the accuracy of our inferred parameters and the extent of validity of our model. For instance, future studies could simultaneously collect weight data and forager activities with tracking devices for individual bees, allowing for a direct comparison between our approach and estimations based on independent measurements.
The main advantage of our approach when compared to previous works is that it combines mechanistic mathematical models with data analysis to obtain interpretable quantitative information about the condition of the hive. Additional advantages of our model approach include: (i) indications from the inference of when results cannot be trusted (either because they are not capturing meaningful information or because optimal results lie outside of the allowed error range); (ii) that our framework allows for the proposal of more sophisticated models, e.g., by considering additional terms contributing to the food dynamics in Eq (4) or more detailed modes of arrival a(t) and departure d(t) such as in Eq (13); and (iii) that our model allows for a systematic exploration of the effect of foraging activities and food processing parameters on how hive weight changes over time. An important methodological expansion of our work would be to consider and compare different type of functional forms in our model. This will require considering cases in which solutions are not obtained in closed forms and the implementation of a Bayesian-model comparison that can compare models of different complexities, in line with modern simulation-based and Bayesian inference that are increasingly applied to dynamical systems of biological interest [43][44][45].
The success of our model in extracting meaningful information about bee hives opens the possibility of using it for a systematic analysis of the health of hives across different hives and across time. Our model can be used to obtain interval estimations of the following quantities, which are of particular interest: • The number of active foragers in a bee colony N is a crucial parameter of the capacity of a colony at sustaining itself. Environmental stressors such as pesticides can cause a rapid depletion of the population of active foragers [46]. Estimations of N provide a rapid assessment of the daily foragers population and allow quick intervention and stress reduction to potentially prevent colony collapse [20].
• The time spent by foragers outside the hive τ a can be connected to the efficiency of the colony because long times outside the hive lead to a reduction of the life expectancy of foragers [37].
• The amount of food collected by bees in each trip m allows for a better assessment of foraging difficulty and thus of environments that can best support beekeeping operations [22].
• The rate of weight growth of a hive α gives a reliable indicator of whether foragers are supplying their colony properly (leading to optimal colony development) or if instead the colony is endangered (suggesting the existence of stressors or environmental limitations).
Tracking the daily evolution of these parameters opens the perspective of developing weight-based early-warning indicators of bee colony failure that could lead to new strategies for risk control.

Computational methods
Here we describe the computational methods used to estimate ranges of the parameters θ of our model presented in Table 1 from the within day time series of weights W 0 (t). In Sec. 2.2 of this manuscript this issue has been formulated as the problem of finding the parameters θ that minimize the sum of squares L in Eq (11). The ten parameters proposed in this model were reduced to nine parameters after considering the weight of a bee w as 0.113 g/bee.
The minimisation was performed in a nine-dimensional space through a least-square method composed of three different approximation levels. First, the clock times when bees started departing t 0 , and when they stopped leaving t 1 . Next, a reliable estimation of the characteristic features from the time series of the weight fluctuations was performed. Finally, the remaining seven parameters were obtained from the connection with the effective parameters.
For the estimation of clock times for t 0 and t 1 we created a squared-grid for the times t 0 2 [6:30am,9:00am] and t 1 2 [2:30pm,6:30pm] spaced every 5 minutes. These times combinations provides 1440 different scaling options. Then, we applied to each ordered pair, (t 0 , t 1 ), the Levenberg-Marquardt algorithm for square minimization. We used the python library SciPy (through "scipy.optimize.curve_fit" Pythons' function) with the inputs described in Table 3 to apply the algorithm.
We chose the combination of t 0 and t 1 that minimizes the error as the inferred scale time parameters. Then, after re-scaling our data set on the time axis, we proceeded with the estimation of effective parameters; A, B, t c , and α from the daily-hive data. We infered the remaining seven model parameters from a system of six equations, which consist of: (i) four equations presented as the "effective description of the model"; (ii) one equation from the conservation of the number of departed and returned bees; and (iii) the last equation is the rate of evaporation ℓ, which can be calculated directly from the time series after the re-scaling process when t < 0. Then, we performed a simultaneous calculation with six equations and seven variables; consequently, the system is undetermined with no unique solution due to an extra degree of freedom. We proposed that our dependent system of equations must be solved for any possible N max value. Note that the total weight of the hive at W(t = 0) can be decomposed in two terms only: the weight given by the total number of forager bees N max , and the initial constant weight W 0 . Consequently, our solution will depend on picking a value for N max and finding the corresponding W 0 value. Thus, an infinite number of solutions will be obtained and then reduced to a confidence interval as can be observed in S1 Fig. Alternatively, by pre-defining one value, such as departure rate d, we can reduce the variables of our equation's system and obtain a unique solution for the parametersŷ. The solution has extended to 10 independent and different hives during the day 2019-4-7 and is presented in Sec 2.3.

Pre-processing of data
Before any computation, we removed outliers that do not reflect actual change in hive weight and that would otherwise affect our analysis. These outliers appeared due to equipment failure, animal or human interactions with the hives, or rainfall. In practice, we removed all measurement points that corresponded to a difference of 20g or more within two consecutive minutes, a conservative threshold which ensures that strong oscillations were excluded. From the ten different hives analysed, the hive that presented the highest number of outliers was hive '10', with 89 outliers, so at most only 6.1% of the data were excluded from this hive (S2 Fig).

Bootstrapping
Bootstrapping is used for assessing the effects of data uncertainty. We applied a bootstrapping method to estimate the uncertainty around the estimated parametersŷ. For each daily data of 10 different hives, 200 re-samplings were generated, and the minimisation procedure described above was applied to them (for the fixed value of t 0 , t 1 obtained in the full dataset). Suppose the inference fails or the number of iterations exceeds 5, 000 evaluation, and the optimal parameters are still not found. In that case, the re-sample is discarded (for the 10 analysed hives listed in Table 2 only 5% of the 200 re-samplings were discarded). The inferred parameters of the accepted re-samples are used to estimate the range of plausible parameters. The uncertainty reported in the main manuscript corresponds to the standard deviation over the parameters estimated in the re-sampled data. For 10 hives reported in Table 2   Measured weight W 0 per minute over a full day.
x data Time per minute scaled for the corresponding measured weight W 0 .