Using within-day hive weight changes to measure environmental effects on honey bee colonies

Patterns in within-day hive weight data from two independent datasets in Arizona and California were modeled using piecewise regression, and analyzed with respect to honey bee colony behavior and landscape effects. The regression analysis yielded information on the start and finish of a colony’s daily activity cycle, hive weight change at night, hive weight loss due to departing foragers and weight gain due to returning foragers. Assumptions about the meaning of the timing and size of the morning weight changes were tested in a third study by delaying the forager departure times from one to three hours using screen entrance gates. A regression of planned vs. observed departure delays showed that the initial hive weight loss around dawn was largely due to foragers. In a similar experiment in Australia, hive weight loss due to departing foragers in the morning was correlated with net bee traffic (difference between the number of departing bees and the number of arriving bees) and from those data the payload of the arriving bees was estimated to be 0.02 g. The piecewise regression approach was then used to analyze a fifth study involving hives with and without access to natural forage. The analysis showed that, during a commercial pollination event, hives with previous access to forage had a significantly higher rate of weight gain as the foragers returned in the afternoon, and, in the weeks after the pollination event, a significantly higher rate of weight loss in the morning, as foragers departed. This combination of continuous weight data and piecewise regression proved effective in detecting treatment differences in foraging activity that other methods failed to detect.


Introduction
Hive weight data provide information on the interaction of a colony and its environment with little or no disturbance to the colony. Hive weight changes are a function of several factors, including colony food collection and consumption, bee development and disappearance, moisture gain and loss due to nectar inflow, ambient relative humidity, respiration and too infrequent to adequately capture colony responses to nutritional changes. More frequent checks (more than once a week) of colony functions and performance by conventional periodic hive evaluations are impractical both in terms of logistics and colony disturbance. Here, we describe how continuous weight data monitoring can be used to assess changes in colony activities associated with nutritional stress.

CAL 2014 dataset
Eight honey bee hives provided by a commercial beekeeping operation (Hiatt Honey CA LP) were installed in a commercial almond orchard, in four groups of two at regular intervals, on the periphery of a 510 ha irrigated almond orchard near Chowchilla, CA (37˚7'22.25"N, 1201 6'21.12"W) in January, 2014, as part of an unrelated study on bee foraging (see [6]). At the start of the experiment, hives consisted of a single deep Langstroth deep box (43.65 l capacity) with 8 frames and a division board feeder. Hive evaluations data (see [6]). In summary, during a hive evaluation each frame was picked up from the box, gently shaken to dislodge adult bees, then weighed and photographed on both sides using a 16.3 megapixel digital camera (Pentax K-01, Ricoh Imaging Co., Ltd.) and replaced in the hive. The area of sealed brood per frame was estimated from the photographs using ImageJ version 1.47 software (W. Rasband, National Institutes of Health, USA). The total weight of the adult bee mass was calculated by subtracting the combined weights of hive components (i.e. lid, inner cover, box, bottom board, frames, entrance reducer, internal feeder) obtained at the start of the experiment from the total hive weight recorded the midnight prior to the inspection. A second box containing 9 frames was added immediately after the hive evaluation on 28 January. Hives were placed on outdoor electronic scales (TEKFA1 model B-2418, Galten, Denmark) (max. capacity: 100 kg, precision: ±20g; operating temperature: -30˚C to 70˚C) and linked to 12-bit dataloggers (Hobo1 U-12 External Channel datalogger, Onset Computer Corporation, Bourne, MA, USA) with weight recorded every 15 minutes. Bottom board pollen traps (Sundance bottom mount pollen trap, Brushy Mountain Bee Farm, NC) were placed under each hive at the start of the experiment in January and pollen was collected nearly continuously until 25 March to monitor pollen inflow, after which pollen traps were removed, standard bottom boards installed, and hives evaluated. Hives were then moved 50 km to Madera, CA (36˚56'14.09"N, 119˚49'52.90"W) as a single group adjacent to fields planted in almond, grape and citrus, and replaced on the hive scales. Hives were evaluated again on 27 May and 20 August. Drought conditions were severe in the area, which impacted available forage.

SRER 2014 dataset
Hives monitored in the Santa Rita Experimental Range, AZ (31˚46'38.08"N, 110˚51'47.39"W) [6]. In April 2013, 4 honey bee colonies were established from package bees (approx. 1.5 kg) with Cordovan Italian queens (C.F. Koehnen & Sons, Glenn, CA). The packages were installed in painted, 10-frame, wooden Langstroth deep boxes fitted with migratory wooden lids (Mann Lake Ltd, Hackensack, MN). The apiary was provided with a permanent water source and hives were spaced 1-3 m apart. The hives were placed on scales linked to dataloggers (same models as described above) with weight recorded every 15 minutes. Weight data were collected 11 Feb. to 3 April 2014 and hives were evaluated on 11 March and 3 April 2014.

CHBRC 2015 dataset
The objectives of this experiment, preventing the departure of foragers for a known period of time at the start of the active period of the day to observe 1) how the rate of weight change during the night was affected by forager departure times; and 2) the timing of any rate change relative to dawn and to control hives. On 29 July 2015 seven honey bee colonies in 10-frame Langstroth deep boxes at the Carl Hayden Bee Research Laboratory, USDA-ARS, Tucson, AZ (32˚16'30.17"N, 110˚56'28.52"W) were evaluated and placed on hive scales linked to dataloggers as described above. The hives were placed under a shelter with a sun screen that protected the hives from both direct sun (for most of the day) and precipitation. The hives were established from packages in April, 2013, and had been treated with thymol (Apiguard) prior to start of the experiment to control Varroa mite densities. Dataloggers on scales were set to record every minute. Robbing screens (Mann Lake Ltd, cat. no. WW-176) were placed on the entrances of the hives. These screens permit closing the hive entrance without interfering with air exchange. Colonies were divided into two groups, block 1 with 3 colonies and block 2 with four colonies. From 12-28 August, the entrances of all the hives in one block were closed at 5:30AM, i.e. before daybreak and the start of flight activity. Closed entrances were opened after a pre-determined period of time: 1 h, 1.5 h, 2 h, 2.5 h or 3 h (S1 Table). Entrances for a given block were closed only on alternate days, and closures were conducted for 4-5 days per week. Resulting within-day weight data were analyzed using piecewise regression, using 100 iterations of regressions with 4 break points, to compare planned and observed delays with respect to data parameters.

MU 2017 dataset
The objective of this experiment was to determine the extent to which the slope of the piecewise regression segment associated with morning forager departure is affected by bee movement (arrivals-departures). In November 2016 16 honey bee colonies established in 10-frame Langstroth deep boxes at Macquarie University, Sydney, Australia (33˚46'27.91"S, 1516 '46.23"E) were placed on hive scales linked to dataloggers as described in the sections above. The dataloggers were set to record every minute. On 29 March and 20, 21 and 24 April hive entrances were recorded on video for one minute per hive on each of four hives. A different set of four hives was chosen each day, so each hive was monitored once. The videos were then analyzed frame by frame with the software Cowlog 3.0.2 to record all bee departures and arrivals during that minute, which occurred between 8:49 to 8:59 AM depending on the day. Piecewise regression models with 4 and 5 break points were fit to the detrended within-day weight data, and the net change in bee number was regressed on slopes of the segments, from 100 iterations of regressions with 5 break points, for that time period during which the observations were made.

CAL 2015-16 dataset
In November 2015 32 honey bee colonies that occupied one to two 10-frame Langstroth deep boxes and had marked queens were identified in apiaries in southern Arizona (see [34]. The hives were divided into four groups of 8. Six hives in each group, for a total of 24 hives, were placed on electronic scales (same scales and dataloggers as described in the previous section). All hives were given 250 g protein patty supplement at the end of November, evaluated to determine adult bee mass in the first week of December, and fed both protein patty supplement and 3 L of 1:1 sugar syrup in mid December. Hive evaluations were conducted on all hives as described in the previous section. On 30

Data analysis
Hive weight data were detrended for each day by subtracting the weight estimate of the hive between midnight (or closest time thereafter) from each subsequent weight value during the day until the last value just before midnight. This daily dataset had three main parts: the period from midnight until about dawn, when colony's interactions with the environment in terms of weight were limited to temperature effects and gas exchange (e.g., water vapor), the active period during the day from the initial departure of bees about dawn until their return about dusk, and the period from dusk to midnight when the colony was again quiescent. In the CAL 2015-16 field experiment, hive weight data were also analyzed using the method previously described (see [6]): data were detrended by subtracting raw data from the hourly 25 hour running average, sine curves were fit to the resulting data, and the amplitudes of those curves were used as a response variable (reflecting bee flight activity). Within-day weight changes were modeled using the "segmented" function in R [35] which fits a segmented line derived from linear or generalized linear model to a dependent variable. The number of breakpoints can be estimated automatically or through the use of initial values provided by the user with final determination by the function. Using the segmented function to automatically determine the number of break points tends to overestimate the number of break points so estimation via methods such as visual inspection is preferred for the final model [35]. Function outputs include the final break point values, the slopes and intercepts of the segments, and the overall model fit. An effort was made to use a minimum number of break points to ensure the model did not have an excessive number of arbitrary parameters but did reflect changes in weight with sufficient precision. Initially models with three break points were fit: break point 1-estimated sunrise, which was calculated with respect to day of year, latitude and longitude; break point 2-minimum weight value of dataset; and break point 3-estimated sunset, using the same function as for sunrise. Models with one and two additional break points were also considered, with those initial break point values estimated at regular intervals during the active part of the day. Model output was visually compared to raw data. Our intention was to determine a suitable general model, rather than the optimal model for each day. Parameters for the model with the highest adjusted r 2 value out of 100 iterations were retained.

Detrending hive weight data
Detrending data by subtracting hive weight at midnight yielded consistent patterns among hives over time (Figs 1 and 2). CAL 2014 data showed colony weight loss in mid-winter followed by a nectar flow in March with an ensuing dearth. SRER 2014 data showed colony weight gain during a late winter to early spring nectar flow. While the bee colonies themselves differed in size (Table 1), after detrending the data by subtracting the hive weight at midnight, the within-day patterns of weight changes were largely consistent over periods of time that corresponded to nectar flows and other events.

Identifying break points
Most daily activity patterns can be considered as having 4 to 6 segments (Fig 3). Two examples of daily patterns with 3 break points were shown with the following interpretations: Point A: First weight measure at midnight or shortly thereafter;

Optimizing the number of model break points
In most locations a hive is active, when ambient temperatures permit, from about dawn to about dusk. Few if any foragers depart a hive before dawn or return to the hive after dusk. An initial break point before dawn would therefore probably correspond to a weight change caused by something other than initial hive activity, such as rainfall. Similarly, a break point after dusk likely corresponded to an event other than forager return. Considering break points falling between 8PM and 4AM as errors a model with 3 break points had the fewest, followed models with 4 and 5 break points. CAL 2014 data were averaged over 8 hives and the SRER 2014 data were averaged over 3 hives. Those average values were then modeled using regression lines with 3, 4 and 5 break points. The Hiatt 2014 dataset was divided into a period of hive weight loss with comparatively little nectar (29 January to 24 March, which included almond pollination) and a period of citrus nectar flow (29 March to 26 May). Five parameters were compared among the different models: the average adjusted r 2 for each fit, the number of days in which any fits were successful, the number of successful fits out of 100 iterations for those days where fits were successful, and number of "errors" for the 1 st and last break points, with a 1 st break error defined as a break before 4AM (i.e. before bees start foraging) and a last break error defined as a break after 8PM (i.e. after bees would have returned to the hive) ( Table 2). Regressions with 3 break points had significantly lower adjusted r 2 values on average but successfully fit more iterations, for those days where any 3 break point regressions could be fit, than regressions with 5 break points, while regressions with 4 break points had r 2 values equal to or greater than models with 3 break points and to fit as many or more datasets than models with 5 break points. That hives tended to change weight at rates that were largely constant for sufficient periods of time, often up to several hours, and that the average r 2 values of successful fits equaled or exceeded 0.96, suggested that the piecewise regression approach is appropriate.
Data for April 12, 2014, in the CAL 2014 dataset, as shown in Fig 3, were fit with 3, 4 and 5 break point regression models (Fig 4).
That particular dataset illustrated another kind of error that would be difficult to correct: the function fails to detect a break point associated with early departure that can be detected through visual inspection. In this case models with 4 and 5 break points detected the departure while the model with 3 break points did not. Models with 4 break points fit to data samples showed an adequate fit (Fig 5).

Interpreting model parameters
Models with 4 break points were determined to have the best fit and used in subsequent analyses. Models fit to data from the CAL 2014 and SRER 2014 datasets, and the break point and slope values changed from day to day (Figs 6 and 7).
Forager departure and return estimates tended to diverge through the winter and spring, as would be expected as daylight hours and ambient temperatures increased. Hive weight changes during the inactive period at night were influenced by the presence of fresh nectar. A nectar flow was present throughout the SRER 2014 data set but a discrete, short-term citrus nectar flow was observed in the CAL 2014 dataset. Hives tended to lose weight at night during a nectar flow, probably due to moisture loss from drying nectar, but outside of a nectar flow hives usually maintained or even gained weight, probably because of increasing moisture content of hive wooden parts as the ambient relative humidity rose at night. Meikle et al. (2006) observed that hourly weights of empty hives varied by 50 g or more over 24 hours.

Manipulating initial bee departure
Visual examination of the data showed that while the early morning weight curves were generally maintained when the bees were prevented from leaving, some small, additional weight changes were observed, often around the time of the 1 st break point for the control colonies (S1 Fig). The exact cause of those changes was unclear. A comparison of planned vs. observed changes in the first break point showed a significant linear relationship (F 1,13 = 73.06, p<0.001, adj. r 2 = 0.85; Shapiro-Wilk normality test and constant variance test both passed) (S2 Fig). The slope of that line was 0.649; that the slope is not equal to one indicates that there were factors other than bee departure, such as moisture gain or loss due to changing ambient conditions, influencing the break points.

Correlating bee traffic with regression slopes
Piecewise regression lines with 5 break points fit data had a higher r 2 (0.987±0.002) and fit the data more often (89.6±4%) (S1 File) than those with 4 break points (0.976±0.005 and 82.5±5%, respectively). The slope of the regression segment fit to weight data collected during the time of the observation was regressed on the net movement of bees (arrivals-departures) at that time ( S3 Fig). Net movement was significantly correlated with hive weight change, as measured by the segment slope (F 1,13 = 26.66, p = 0.0002, adj. r 2 = 0.65; Shapiro-Wilk normality test and constant variance test both passed), indicating that bee traffic was a significant Within-day weight changes of honey bee hives  Within-day weight changes of honey bee hives Within-day weight changes of honey bee hives  SRER 2014 dataset). A) 1st and 4th breakpoints for piecewise regression curves. In cases where the 1st breakpoint occurred before 4AM (i.e. before dawn) the 2nd breakpoint was used. Likewise, if the 4th breakpoint occurred after 8:00PM the 3rd breakpoint was used. B) 1st and 5th slopes for piecewise regression curves fitted to average within-day hive weight changes for those same hives. C) Rainfall and minimum and maximum temperature data.
https://doi.org/10.1371/journal.pone.0197589.g007 that point in time was due to bee traffic, that change can be expressed as: where ΔW is the hive weight change (slope of the regression segment, here 0.032g/min), N D and N A are the numbers of departing and arriving bees, respectively, and M D and M A are the masses of departing (unladen) and arriving (laden) bees, respectively. Assuming M D = 0.13 g [5], using data on N D and N A , and rearranging the terms, the average mass of an arriving bee, M A would have been about 0.15 g.

Applying within-day analysis to migratory colony nutrition study
The two treatment groups were not significantly different in terms of adult bee mass or brood surface area (see Table 1) during any hive evaluations. Continuous hive weight data were divided into three parts: before, during and after almond pollination. Of the 17 colonies in the low-forage treatment just prior to shipping to almond pollination on 28 January, only 8 (47%) survived to the last evaluation on 5 April, but of the 19 colonies in the high-forage treatment, 14 (74%) were still alive for the final evaluation. Each hive weight dataset was subjected to a sine curve analysis of 3-day subsets of detrended data [6] and the resulting curve amplitudes were compared between treatments using repeated measures MANOVA. Amplitude data from every 3 rd day were used, to avoid overlap between data subsets. No treatment differences were detected with respect to sine amplitudes for any of three datasets (pre-almond pollination: P = 0.946; during almond pollination: P = 0.274; postalmond pollination: P = 0.998).
Piecewise regression analysis yielded estimates for 10 parameters: 4 break point values, 5 slope values and the adjusted r 2 . Because the data were detrended by subtracting the raw data value at midnight, daily datasets were mathematically independent. A repeated measures MANOVA was conducted on these daily parameter values of interest: 4. slope of the first segment after initial forager departure, usually the 2 nd segment (rate of weight loss largely due to forager departure); and 5. slope of the last segment before dusk, usually the 4 th segment (rate of weight gain due forager return).
For statistical analysis, if the 1 st break point occurred before 4AM, the 2 nd break point was used as the time of initial forager departure (with no restrictions placed on that second estimate) and the slope of the 3 rd , rather than 2 nd , segment was used as the rate of weight loss due to forager departure. Likewise, if the 4 th break point occurred after 8PM then the 3 rd break point was taken as the time of final forager return (with no restrictions placed on that second estimate).
No differences were observed between treatment groups with respect to any parameter before almond pollination. However, during almond pollination the slope of the last segment before dusk was significantly higher among colonies that had access to forage compared to colonies that did not (Table 3; Fig 8). The slope of that segment would correspond to the change in hive mass due to returning foragers with nectar and pollen. No other parameters were significantly different between groups during pollination. After almond pollination, only the first segment after dawn, corresponding to weight loss due to forager departure, was significant.

Discussion
Monitoring the weight honey bee hives continuously provides information, without colony disturbance, on how the hive is interacting with the environment. One of the challenges in the analysis of such data is modeling the data in a way that exploits the information in the dataset while reducing as much as possible the impact of spurious error. Running average weight data and detrended within-day data have been related to colony parameters such as total adult bee mass and foraging activity [5,6], and the methods they used to detrend and model the data (subtract raw hourly data from the 25 hour running average and fit sine curves to the result) provided robust estimates of bee flight activity. However, information was also lost due to the use of hourly averaging and of 3-day data subsets. In this study raw data were detrended by subtracting the value at midnight, rather than the running average, from each of the values of that day until the next midnight. A piecewise regression model was fit to single-day datasets in the original time scale using an R function [35] that is based on a bootstrapping method [36]. The user provided estimated break points, and the function used a random number generator to provide and test model parameters. The parameters were automatically adjusted until the fit reached a given threshold.
In order to reduce noise in the exploratory phase of model evaluation, daily detrended data were averaged overall 3-8 colonies in two independent datasets from different sites in the southwestern U.S. The resulting daily patterns were, for the most part, consistent among colonies within site. Because of the high precision and the random number algorithm in the segmented function to search for break points, parameters for a given fit were rarely identical, so 100 iterations of the function were conducted for each daily dataset and parameters for the best fit line, based on r 2 value, were retained. Models with 3, 4 and 5 break points were evaluated and those with 4 break points offered a significantly higher r 2 than those with 3 break points while often fitting more datasets than those with 5 break points. Models with 4 break points offered an alternative break point in the event the 1 st or 4 th break points were due to chance events unrelated to the study, such as a weather event. The function established break points where the change in the linear slope exceeded a predetermined threshold so providing an excessive number of break points resulted in a failure of model fit. Piecewise regression models in general fit the data well, with r 2 values for regressions with 4 break points on average exceeding 0.97.
Having established that the piecewise regression method fit the data well, two experiments were conducted to test the interpretation of two parameters: the break point and following segment associated with initial forager departure in the morning (usually the first break point and Within-day weight changes of honey bee hives second segment). The first experiment involved blocking the departure of foragers, without blocking gas or temperature exchange, just prior to dawn and keeping them blocked for fixed times. Models fit to the within-day weight data for each hive were used to determine the times of the initial forager departures. The results were consistent and the regression coefficient showed that for every hour the hive was blocked, the forager departure was delayed by about 40 minutes. The lack of a strict correspondence between planned and observed delay was likely due to factors such as the inherent variability in forager departure (particularly outside of a nectar flow) and to hive weight change resulting from moisture loss due to decreasing ambient r.h. after dawn, rather than bee movement. The study did confirm that the dawn break point resulted largely from an increased loss of hive mass due to forager departure. The second experiment involved monitoring bee traffic for one minute in the morning, during the period of time associated with the initial forager departure. Net bee loss (i.e. more departures than arrivals) during that minute was significantly related to the slope of the piecewise regression model associated with that time of the morning. The value of that slope was a function of the numbers and masses of arriving (laden) and departing (unladen) bees, and was used to estimate an average payload of arriving bees of about 0.02 g per bee. The better fit of piecewise regression lines with 5 break points to data from Australia compared to 4 break points, which fit Arizona data better, suggested that the optimal number of break points for such analyses may depend on the environment. The piecewise regression approach was then used to analyze data from an experiment on bee colony nutrition conducted in Arizona and California. Within-day data from hives from two groups, one that had been exposed to abundant forage and one that had been exposed to little forage, were compared with respect to nine regression parameters (4 break points and 5 segment slopes). Neither hive evaluation data nor sine-based models of withinday weight data revealed group differences during any of the time periods (i.e. during treatment, during almond pollination, and post pollination), and analyses of bee gut microbiota did not reveal significant differences [34]. However, the regression approach did detect a higher rate of weight gain during forager return (usually the 4 th segment) during almond pollination, indicating that the well-nourished colonies had greater foraging success even though no significant differences were observed in the mass of foragers. Similarly, during the post almond period, only the forager departure slope was significantly different between groups, indicating that the well-nourished hives likely either produced more foraging bees, or caused increased foraging effort due to factors such as increased demand, even though forage opportunities at that time and location were poor for all hives (and there was no difference between groups in foraging success, as shown by the lack of a difference in the slopes of segments associated with forager return). In the analysis presented here, no significant differences were detected with respect to break point timing, suggesting that the nutritional state did not affect timing of colony activity.
Within-day weight changes over one to 15 minutes from hives at different locations and years showed consistent patterns that had biological interpretations, that were sensitive to environmental factors such as the presence of a nectar flow, and that could be exploited using piecewise regression. Analyses of within-day data from a field experiment involving colony nutrition detected colony-level treatment differences where other methods, such as visual inspections, did not. In addition, the data were collected without colony disturbance, which can be an important, particularly during cold weather. These results demonstrate the potential of continuous monitoring of honey bee hives as a means of providing easily-interpreted colony-level response variables for longitudinal field experiments.
• The piecewise regression analysis of detrended within-day weight data were applied to data from fed and starved bee colonies during and after almond pollination in California, and the regression indicated that the fed colonies had greater foraging success during almonds, although the initial rates of mass loss due to forager departures were not different; the opposite was true during a pollen dearth after almond pollination.
Supporting information S1 Table. Entrance closing schedule to manipulate the start of flight activity.