## Abstract

Adaptive management involves learning-oriented decision making in the presence of uncertainty about the responses of a resource system to management. It is implemented through an iterative sequence of decision making, monitoring and assessment of system responses, and incorporating what is learned into future decision making. Decision making at each point is informed by a value or objective function, for example total harvest anticipated over some time frame. The value function expresses the value associated with decisions, and it is influenced by system status as updated through monitoring. Often, decision making follows shortly after a monitoring event. However, it is certainly possible for the cadence of decision making to differ from that of monitoring. In this paper we consider different combinations of annual and biennial decision making, along with annual and biennial monitoring. With biennial decision making decisions are changed only every other year; with biennial monitoring field data are collected only every other year. Different cadences of decision making combine with annual and biennial monitoring to define 4 scenarios. Under each scenario we describe optimal valuations for active and passive adaptive decision making. We highlight patterns in valuation among scenarios, depending on the occurrence of monitoring and decision making events. Differences between years are tied to the fact that every other year a new decision can be made no matter what the scenario, and state information is available to inform that decision. In the subsequent year, however, in 3 of the 4 scenarios either a decision is repeated or monitoring does not occur (or both). There are substantive differences in optimal values among the scenarios, as well as the optimal policies producing those values. Especially noteworthy is the influence of monitoring cadence on valuation in some years. 
We highlight patterns in policy and valuation among the scenarios, and discuss management implications and extensions.

**Citation:** Williams BK, Johnson FA (2017) Frequencies of decision making and monitoring in adaptive resource management. PLoS ONE 12(8): e0182934. https://doi.org/10.1371/journal.pone.0182934

**Editor:** Igor Linkov, US Army Engineer Research and Development Center, UNITED STATES

**Received:** February 14, 2017; **Accepted:** July 26, 2017; **Published:** August 11, 2017

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

**Data Availability:** No original data were collected or used in this paper.

**Funding:** This work was supported by the U.S. Geological Survey, The Wildlife Society. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

A well-known approach to learning-oriented decision making in natural resources is adaptive management, in which learning occurs through recursive management and what is learned at each time is used to guide future management actions (Williams and Brown [1–2]). Adaptive decision making is based on the recognition that resource systems are only partially understood, and there is value in tracking resource conditions and applying what is learned as the resources are being managed (Williams [3]). In the ongoing process of learning and adaptation, adjustments to decision making occur as understanding improves, with the ultimate goal of improved management (Walters [4]).

Adaptive decision making is by its nature flexible, and therefore is applicable to a wide variety of resource problems (Williams and Brown [5]). In some instances its focus is on the improvement of understanding about the role of management in influencing resource dynamics (Linkov et al. [6], Runge et al. [7]). In others it is on the social and institutional framework supporting iterative decision making (Susskind et al. [8], Convertino et al. [9]). In yet others, it is on the “architecture” of structured decision making, with the elicitation of values, objectives, decision alternatives etc. (Johnson et al. [10], Linkov et al. [11]). Even if one is primarily concerned about uncertainty and the improvement of technical understanding, the range of applicability is extremely broad. An important challenge for an adaptive framework is to cover a large number of decision problems, yet be flexible enough that it can be tailored to the details of any particular problem.

Here we take a formal, decision-theoretic approach to adaptive management (Johnson and Williams [12]), rather than more ad hoc approaches that sometimes are described in the literature (e.g., Schreiber et al. [13]). In particular, we focus on an iterative sequencing of (*i*) decision making and taking actions, (*ii*) followed by monitoring of system responses, (*iii*) followed by assessment of data, (*iv*) followed by incorporating what is learned into future decision making (Fig 1). System state is typically monitored at fixed intervals, often annually, in order to inform decisions that occur with the same frequency (Hauser et al. [14]).

Fig 1. Technical learning involves an iterative sequence of decision making, monitoring, assessment, and feedback of what is learned into decision making. Institutional learning involves periodic reconsideration of the components in decision making (Williams and Brown [5]).

Alternatives to the coincidence of decision making and monitoring involve different cadences for the 2 activities. For example, the setting of migratory bird hunting regulations often involves annual monitoring and decision making, but a different sequence was adopted for pink-footed geese (*Anser brachyrhynchus*) in Europe (Johnson and Madsen [15]). In the latter case, administrative burden is reduced by fixing harvest quotas for three years, while population monitoring occurs annually. Thus, while learning accrues annually, decisions are based on system state only every third year. In the United States, the regulations setting process for duck harvest has recently been modified so that current system state is not known (monitored) at the time a decision must be made (Johnson et al. [16]). Decisions must therefore be conditioned on the previous system state and regulatory action.

The issue of timing in decisions and monitoring has arisen in other decision processes as well. An example is the adaptive management program adopted by the Atlantic States Marine Fisheries Commission for the establishment of horseshoe crab (*Limulus polyphemus*) harvest quotas in Delaware Bay (Smith et al. [17]). State variables relevant to harvest decisions include not only the abundance of the harvested species, but also the abundance of migratory shorebirds (red knots [*Calidris canutus*]) that depend on horseshoe crab eggs as a food source at key migration stopover sites in Delaware Bay. Harvest quotas for the fishing season of June-December, year *t*+1 are established in the fall (e.g., November) of year *t*. The decisions are informed by estimates of system state variables obtained in May of year *t* (red knots) and October-November of year *t*-1 (horseshoe crabs). Debate has ensued about the feasibility of delaying the harvest decision (e.g., to January of year *t*+1) in order to make use of the previous fall’s crab survey data.

More generally, disconnecting the sequencing of decision making and monitoring is potentially advantageous, in that cost-savings often can be obtained by reducing the frequency of monitoring, or alternatively reducing the frequency of analysis and decision making. Of course, an important question concerns the effect of such asynchrony, in particular as it relates to the value produced by decisions made when there is restricted decision making or an absence of monitoring information. The issue can be framed in terms of the value produced with management policy that is informed by monitoring. One approach is to identify a value function, for example the expected accumulated harvest of a biological population, which expresses the value associated with decision making given the status of the resource being managed and some measure of understanding of it. If the resource is managed optimally based on current information about it, the question at issue is whether and to what degree the value produced through decision making is compromised by an asynchrony between decision making and monitoring.

Our objective here is to provide a framework by which to consider the question of asynchrony in monitoring and decision making, by describing and assessing valuation forms for 4 simple and plausible scenarios. These involve annual and biennial decision making in combination with annual and biennial monitoring. With biennial decision making, decisions are changed only every other year; with biennial monitoring, field data are collected only every other year. We acknowledge that other cadences are possible and could be considered. But we believe that the 4 scenarios developed here serve to highlight relevant patterns.

We first summarize the technical framework for adaptive decision making, and then describe value functions for each of 4 scenarios. For each scenario we describe optimal valuations and policies under both active and passive adaptive management.

## Decision making under structural uncertainty

A formal expression for adaptive management in the presence of structural uncertainty can be given in terms of a resource system that changes through time in response to iterative decision making, with models describing periodic change in resource status. The parameters and elements needed to characterize iterative decision making under uncertainty include:

- *t*: time index for a range of times constituting the time frame. The index is assumed here to take positive integer values, starting at some time *t*_{0} and ending at time *T*, which may be infinite. In what follows we also use *τ* as a time index, to represent forward aggregations of values conditional on some starting time *t*, as in $\sum_{\tau=t}^{T}$.
- *x*_{t}: system state (size, density, spatial coverage, etc.). Because the system is assumed to change through time, its state is time-specific. It is assumed for now that system state is fully observable. We discuss the implications of partial observability below. In what follows we will need to consider the summation of values *f*(*x*_{t}) across all system states for a given time *t*, which we abbreviate with the notation $\sum_{x_t} f(x_t)$.
- *k*: model index for *k* = 1,…,*K* models representing different hypotheses about system dynamics.
- *q*_{t}: vector (*q*_{t}(1),*q*_{t}(2),…,*q*_{t}(*K*)) of probabilities, with *q*_{t}(*k*) the probability that model *k* best represents the system at time *t*. The vector *q*_{t} is referred to as the model state, and it evolves through time as information accumulates via monitoring.
- *a*_{t}: action taken as a result of decision making. Because they are taken through time, actions are time-indexed.
- *A*_{t}: policy that specifies a particular action for each system state and model state at each time starting at time *t* in the time frame. *A*_{0} specifies actions over the full time frame {*t*_{0},…,*T*}, and *A*_{t} identifies the actions over a subset {*t*,…,*T*} of the time frame, starting at *t* ≥ *t*_{0}.

### System dynamics

Here we assume that transitions among system states at any point in time are influenced by the current state but not previous states, and by the action taken at that time. That is, state transitions can be described as Markovian (Puterman [18], Williams et al. [19]). If *x*_{t} and *a*_{t} are the state and action at a particular time *t* and *x*_{t+1} is the state at the next time, then the probability of transition from *x*_{t} to *x*_{t+1} is *P*(*x*_{t+1} | *x*_{t},*a*_{t}).

Structural uncertainty reflects an incomplete understanding of system dynamics, i.e., the transition probabilities in *P*(*x*_{t+1} | *x*_{t},*a*_{t}) are uncertain (Williams [20], Williams and Brown [2]). Different Markovian models *P*_{k}(*x*_{t+1} | *x*_{t},*a*_{t}) along with an evolving model state can be used to account for structural uncertainty. Model-specific transition probabilities can be averaged based on *q*_{t}, to produce

$$\bar{P}(x_{t+1} \mid x_t,a_t,q_t) = \sum_{k=1}^{K} q_t(k)\,P_k(x_{t+1} \mid x_t,a_t).$$
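As a minimal numerical sketch of this model averaging, the following Python fragment weights model-specific transition probabilities by the model state. The array `P`, its indexing convention `P[k][a][x][y]`, and all numbers in it are hypothetical illustrations, not values from this paper.

```python
# Hypothetical toy problem: 2 models, 2 actions, 2 system states.
# P[k][a][x][y] is the probability under model k of moving from state x
# to state y when action a is taken.
P = [
    [[[0.3, 0.7], [0.1, 0.9]], [[0.5, 0.5], [0.3, 0.7]]],  # model 0
    [[[0.4, 0.6], [0.2, 0.8]], [[0.9, 0.1], [0.7, 0.3]]],  # model 1
]


def averaged_transition(P, q, x, a):
    """Model-averaged probabilities over next states, weighting model k by q[k]."""
    n_models = len(P)
    n_states = len(P[0][a][x])
    return [
        sum(q[k] * P[k][a][x][y] for k in range(n_models))
        for y in range(n_states)
    ]


# Averaged transition probabilities from state 1 under action 1,
# with equal weight on the two models.
p_bar = averaged_transition(P, q=[0.5, 0.5], x=1, a=1)
```

When all weight sits on a single model, the average reduces to that model's own transition probabilities, which is a convenient sanity check on any implementation.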

### Decision making

In the presence of structural uncertainty, policy is a function of the state of the system at time *t* and our understanding of system dynamics (and associated uncertainty) at time *t*, such that *A*(*x*_{t},*q*_{t}) = *a*_{t}. Policy *A*_{t} over a time frame {*t*,…,*T*} can be described sequentially by actions for each system and model state at time *t*, followed thereafter by the remainder *A*_{t+1} of the policy over {*t* + 1,…,*T*}:

$$A_t = \{a_t,\,A_{t+1}\}.$$

In what follows it will be useful to consider decision making over 2 time steps, in which actions for 2 time steps are jointly determined. This situation is denoted by *A*_{t} = {*a*_{t},*a*_{t+1},*A*_{t+2}}.

### Propagating uncertainty

Just as the system state evolves through time in response to management actions, so too does the model state (Williams and Johnson [21]). The dynamics of the model state are driven by the information produced through time with ongoing management, in the spirit of adaptive management (Nichols and Williams [22]). With iterative management, decision making influences an evolving system state *x*_{t}, with transitions that are recognized through ongoing monitoring in turn influencing the level of uncertainty. Bayes’ theorem (Lee [23]) is used for updating uncertainty, based on system state transitions from *x*_{t} to *x*_{t+1}:
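$$q_{t+1}(k) = \frac{q_t(k)\,P_k(x_{t+1} \mid x_t,a_t)}{\sum_{k'=1}^{K} q_t(k')\,P_{k'}(x_{t+1} \mid x_t,a_t)} \tag{1}$$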

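Bayes’ theorem can also be used to determine the propagation of uncertainty across 2 time steps, as

$$q_{t+2}(k) = \frac{q_t(k)\,P_k(x_{t+2} \mid x_t,a_t,a_{t+1})}{\sum_{k'=1}^{K} q_t(k')\,P_{k'}(x_{t+2} \mid x_t,a_t,a_{t+1})},$$

where $P_k(x_{t+2} \mid x_t,a_t,a_{t+1})$ is the 2-step transition probability under model *k*, which can be rewritten as

$$q_{t+2}(k) = \frac{q_t(k) \sum_{x_{t+1}} P_k(x_{t+1} \mid x_t,a_t)\,P_k(x_{t+2} \mid x_{t+1},a_{t+1})}{\sum_{k'=1}^{K} q_t(k') \sum_{x_{t+1}} P_{k'}(x_{t+1} \mid x_t,a_t)\,P_{k'}(x_{t+2} \mid x_{t+1},a_{t+1})} \tag{2}$$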
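The updating of the model state can be sketched in a few lines of Python. The one-step function applies Bayes' theorem to an observed transition. The two-step function assumes that the intermediate state is not observed and therefore sums over it, which is one reading of the 2-step propagation described above. The array `P`, its indexing `P[k][a][x][y]`, and the function names are hypothetical, and the sketch assumes the observed transition has positive probability under at least one model.

```python
# Hypothetical toy problem: P[k][a][x][y] is the probability under model k
# of moving from state x to state y when action a is taken.
P = [
    [[[0.3, 0.7], [0.1, 0.9]], [[0.5, 0.5], [0.3, 0.7]]],  # model 0
    [[[0.4, 0.6], [0.2, 0.8]], [[0.9, 0.1], [0.7, 0.3]]],  # model 1
]


def normalize(weights):
    """Scale nonnegative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def bayes_update(P, q, x, a, y):
    """One-step update of the model state after observing the move x -> y."""
    return normalize([q[k] * P[k][a][x][y] for k in range(len(q))])


def two_step_likelihood(Pk, x, a, a_next, z):
    """Probability under one model of reaching z from x in 2 steps,
    summing over the unobserved intermediate state."""
    n_states = len(Pk[a][x])
    return sum(Pk[a][x][y] * Pk[a_next][y][z] for y in range(n_states))


def bayes_update_two_step(P, q, x, a, a_next, z):
    """Two-step update of the model state when only x and z are observed."""
    return normalize(
        [q[k] * two_step_likelihood(P[k], x, a, a_next, z) for k in range(len(q))]
    )


q1 = bayes_update(P, [0.5, 0.5], x=1, a=1, y=0)
q2 = bayes_update_two_step(P, [0.5, 0.5], x=1, a=1, a_next=1, z=0)
```

In both cases the posterior shifts toward the model that assigns higher probability to what was observed, and the updated probabilities still sum to 1.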

## Optimal decision making

Smart decision making requires an objective or value function to guide decisions and evaluate progress toward their achievement. Typically, valuation for adaptive management is based on the accrual of returns *R*(*x*_{t},*a*_{t}) through time, with each return incorporating costs and benefits corresponding to action *a*_{t} when the system is in state *x*_{t} (Williams et al. [19]). A value function *V*(*A*_{t} | *x*_{t},*q*_{t}) expresses the aggregation of returns associated with policy *A*_{t}, given system state *x*_{t} and model state *q*_{t}:
$$V(A_t \mid x_t,q_t) = E\left[\sum_{\tau=t}^{T} \lambda^{\tau-t} R(x_\tau,a_\tau) \,\Big|\, x_t,q_t\right] \tag{3}$$
where λ is a discount factor and the expectation accounts for stochastic transitions among states through time as well as the structural uncertainty represented by multiple Markovian models *P*_{k}(*x*_{t+1} | *x*_{t},*a*_{t}) and their evolving probabilities *q*_{t}(*k*). *V*(*A*_{t} | *x*_{t},*q*_{t}) serves as an objective or value function by which to compare and contrast the effectiveness of different management strategies.

Two important variations of adaptive decision making are active and passive adaptive management. Active adaptive management incorporates the potential for learning directly into the process of decision making (Williams [24]). Thus, optimal active adaptive management accounts for system state and structural uncertainty at each decision point, and it also accounts explicitly for learning in the choice of strategy:
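$$V[x_t,q_t] = \max_{a_t}\left\{R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V[x_{t+1},q_{t+1}]\right\} \tag{4}$$
where λ is a discount factor, $\bar{P}(x_{t+1} \mid x_t,a_t,q_t) = \sum_{k} q_t(k)\,P_k(x_{t+1} \mid x_t,a_t)$ is the model-averaged transition probability, and the updated model state *q*_{t+1} in *V*[*x*_{t+1},*q*_{t+1}] indicates the use of learning in identification of strategy. That is, the consequences of learning are anticipated in the decision making process itself. Active adaptive management via Eq (4) produces the optimal value of the function in Eq (3), i.e., maximum valuation in the face of structural uncertainty.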
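The recursion for active adaptive management can be illustrated with a brute-force sketch over a short, finite time frame. The code below assumes a hypothetical toy problem (the arrays `P` and `R`, the discount factor 0.9, and the function name are not from this paper) and follows every possible state transition exactly, updating the model state by Bayes' theorem along each branch. It is feasible only for very small problems and is not the algorithm described in the Appendix.

```python
# Hypothetical toy problem: 2 models, 2 actions, 2 system states.
# P[k][a][x][y]: probability under model k of moving from x to y under action a.
P = [
    [[[0.3, 0.7], [0.1, 0.9]], [[0.5, 0.5], [0.3, 0.7]]],  # model 0
    [[[0.4, 0.6], [0.2, 0.8]], [[0.9, 0.1], [0.7, 0.3]]],  # model 1
]
R = [[0.5, 1.0], [1.0, 3.0]]  # R[x][a]: return for action a in state x


def active_value(x, q, steps, lam=0.9):
    """Return (optimal value, optimal first action) with `steps` decisions left."""
    if steps == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for a in range(len(R[x])):
        value = R[x][a]
        for y in range(len(R)):
            like = [Pk[a][x][y] for Pk in P]
            p_bar = sum(qk * lk for qk, lk in zip(q, like))
            if p_bar > 0.0:
                # Learning is anticipated: the future value uses the
                # updated model state implied by the transition x -> y.
                q_next = [qk * lk / p_bar for qk, lk in zip(q, like)]
                value += lam * p_bar * active_value(y, q_next, steps - 1, lam)[0]
        if value > best_value:
            best_value, best_action = value, a
    return best_value, best_action


value, action = active_value(x=1, q=[0.5, 0.5], steps=4)
```

With a single remaining decision the recursion reduces to choosing the action with the largest immediate return, which provides a simple check on the code.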

Active adaptive management can also be expressed in terms of 2 successive time periods by
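$$V[x_t,q_t] = \max_{a_t}\left\{R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\left[\max_{a_{t+1}}\left\{R(x_{t+1},a_{t+1}) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_{t+1},q_{t+1})\,V[x_{t+2},q_{t+2}]\right\}\right]\right\} \tag{5}$$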
where the term in brackets in Eq (5) is simply another expression for *V*[*x*_{t+1},*q*_{t+1}]. The 2-step form for optimization in Eq (5) will prove to be especially useful in what follows for describing valuations of scenarios involving biennial patterns in decision making and monitoring.

With passive adaptive management, decision making is again based on system state and uncertainty at each decision point, but without explicitly accounting for learning in the choice of strategy (Williams [24]). The effect on valuation is seen by
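$$V[x_t,q_t] = \max_{a_t}\left\{R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V[x_{t+1},q_t]\right\} \tag{6}$$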
where the prior model state *q*_{t} in *V*[*x*_{t+1},*q*_{t}] indicates the absence of learning in the identification of decisions. The corresponding form for 2-step passive adaptive optimization is
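$$V[x_t,q_t] = \max_{a_t}\left\{R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\left[\max_{a_{t+1}}\left\{R(x_{t+1},a_{t+1}) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_{t+1},q_t)\,V[x_{t+2},q_t]\right\}\right]\right\} \tag{7}$$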

The only difference between active vs passive adaptive management as described above is the direct incorporation of learning into decision making, as indicated by an updated model state *q*_{t+1} in the valuation *V*[*x*_{t+1},*q*_{t+1}] in Eq (4). In contrast, learning in passive adaptive management factors into future decision making only after the current decision is made. The absence of anticipated learning in guiding decisions is indicated by the use of current model state *q*_{t} in the value term *V*[*x*_{t+1},*q*_{t}] in Eq (6). We note that our description of passive adaptive management extends beyond many descriptions in the literature, where passive adaptive management is held to involve actions based on the best available model, followed by post-decision monitoring to revise or replace the model (Walters and Hilborn [25], Schreiber et al. [13], Williams [24]).

While the value *V*[*x*_{t},*q*_{t}] produced by passive adaptive management can be no greater than that of active adaptive management, the passive form has the advantage of being computationally tractable for relatively large problems, specifically because only the current model state must be considered. In practice, policies and values may vary little between the active and passive forms (Johnson et al. [26], Hauser and Possingham [27]).
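The contrast between the two recursions can be made concrete with a small sketch in which a single flag controls whether the future value uses the updated or the current model state. The toy arrays `P` and `R`, the discount factor, and the function name are hypothetical. The sketch only contrasts the two recursions as written; it does not simulate the realized performance of a passive policy once monitoring data arrive.

```python
# Hypothetical toy problem: 2 models, 2 actions, 2 system states.
# P[k][a][x][y]: probability under model k of moving from x to y under action a.
P = [
    [[[0.3, 0.7], [0.1, 0.9]], [[0.5, 0.5], [0.3, 0.7]]],  # model 0
    [[[0.4, 0.6], [0.2, 0.8]], [[0.9, 0.1], [0.7, 0.3]]],  # model 1
]
R = [[0.5, 1.0], [1.0, 3.0]]  # R[x][a]: return for action a in state x


def adaptive_value(x, q, steps, learn, lam=0.9):
    """Finite-horizon value; `learn` selects the active (True) or
    passive (False) form of the recursion."""
    if steps == 0:
        return 0.0
    best = float("-inf")
    for a in range(len(R[x])):
        value = R[x][a]
        for y in range(len(R)):
            like = [Pk[a][x][y] for Pk in P]
            p_bar = sum(qk * lk for qk, lk in zip(q, like))
            if p_bar > 0.0:
                if learn:
                    # Active form: future value uses the updated model state.
                    q_future = [qk * lk / p_bar for qk, lk in zip(q, like)]
                else:
                    # Passive form: future value keeps the current model state.
                    q_future = q
                value += lam * p_bar * adaptive_value(y, q_future, steps - 1, learn, lam)
        best = max(best, value)
    return best


active = adaptive_value(1, [0.5, 0.5], steps=4, learn=True)
passive = adaptive_value(1, [0.5, 0.5], steps=4, learn=False)
certain_active = adaptive_value(1, [1.0, 0.0], steps=4, learn=True)
certain_passive = adaptive_value(1, [1.0, 0.0], steps=4, learn=False)
```

When the model state places all weight on one model there is nothing to learn, so the two forms coincide. That case offers a quick consistency check before applying either recursion to a problem with real uncertainty.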

## Valuation under different cadences of decision making and monitoring

The learning-based approach described above involves iterative decision making through time, utilizing monitoring information that is collected at each decision point. However, the selection of decisions need not coincide with the monitoring of system transitions. In what follows we consider annual and biennial decision making along with annual and biennial monitoring, where biennial decision making involves changes in decisions only every other year and biennial monitoring means the collection of field data every other year. The options for decision making combine with those for monitoring to define 4 scenarios. Here we discuss optimal valuations for each scenario, and compare/contrast the valuations among scenarios. We acknowledge that variations in timing beyond the biennial cadences considered here are possible, and we highlight other examples in the discussion below.

Given the scenarios defined by annual and biennial cadences, every 2 years a new decision can be made and state information is available to inform that decision. In the subsequent year after a new decision, however, in 3 of the 4 scenarios either the decision is repeated or monitoring does not occur (or both). The difference among scenarios becomes clear by focusing on the arguments of the value function *V*(*A*_{t} | *x*_{t},*q*_{t}).

- Scenario 1: Annual decision making and annual monitoring
  - Every year (*x*_{t},*q*_{t}) is known because of annual monitoring
  - A new action can be taken every year
- Scenario 2: Annual decision making and biennial monitoring
  - Every other year (*x*_{t},*q*_{t}) is not known because of the lack of monitoring
  - A new action can be taken every year
- Scenario 3: Biennial decision making and annual monitoring
  - Every year (*x*_{t},*q*_{t}) is known because of annual monitoring
  - The same action is taken in successive years
- Scenario 4: Biennial decision making and biennial monitoring
  - Every other year (*x*_{t},*q*_{t}) is not known because of the lack of monitoring
  - The same action is taken in successive years

The differences among scenarios are accentuated in the year after a new decision, when monitoring may not occur. In such a year scenarios 1 and 2 produce different valuations, because the state is seen via monitoring under scenario 1 but not under scenario 2. Scenarios 3 and 4 also produce different valuations, for the same reason: the state is seen via monitoring under scenario 3 but not under scenario 4. Finally, the valuations for scenarios 1 and 2 differ from those for scenarios 3 and 4, because actions in successive years are repeated in scenarios 3 and 4.

In the next sections we assume active adaptive decision making in the development of valuation forms. We then describe valuation under passive adaptive management. In both cases we use *V*(*A*_{t} | *x*_{t},*q*_{t}) as in Eq (3) to represent the aggregate value associated with policy *A*_{t} given the combination (*x*_{t},*q*_{t}) of system and model states, and use *V*[*x*_{t},*q*_{t}] as in Eqs (5) and (7) to represent the optimal valuation obtained by maximizing *V*(*A*_{t} | *x*_{t},*q*_{t}) over all available policies.

### Scenario 1: Valuation under annual decision making and monitoring

Here we describe the standard scenario for dynamic optimization (Williams and Johnson [28], Bertsekas [29]), in which decisions can be changed every year and observations about resource status are available to identify optimal actions and values (Fig 2). Thus, in any year *t* a new action *a*_{t} can be selected based on the system state *x*_{t} and model state *q*_{t}. Immediately before the next decision point in year *t*+1 the system state *x*_{t+1} is identified through monitoring, and model-specific probabilities *P*_{k}(*x*_{t+1} | *x*_{t},*a*_{t}) of transition from system state *x*_{t} to *x*_{t+1} are identified. These transition probabilities are combined with the model state *q*_{t} to produce an updated model state *q*_{t+1} by Bayes’ theorem (Eq 1). The updated system and model states are then available to inform the selection of an action *a*_{t+1} at year *t*+1.

Fig 2. Action *a*_{t} is selected based on system state *x*_{t} and model state *q*_{t}. Realized system state *x*_{t+1} is identified through monitoring in year *t*+1. Model state *q*_{t} is updated to *q*_{t+1} by Bayes’ theorem. This sequence, with actions based on current system and model state, is repeated over the remainder of the time frame.

The determination of optimal values and actions is facilitated with recursion approaches (Puterman [18]). In a given year *t* the value function can be expressed recursively as
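$$V(A_t \mid x_t,q_t) = R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1} \mid x_{t+1},q_{t+1}) \tag{8}$$
and maximization
$$V[x_t,q_t] = \max_{A_t} V(A_t \mid x_t,q_t) = \max_{a_t}\left\{R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V[x_{t+1},q_{t+1}]\right\} \tag{9}$$
over *A*_{t} = {*a*_{t},*A*_{t+1}} produces the optimal action $a_t^{*}$ and *V*[*x*_{t},*q*_{t}] for each (*x*_{t},*q*_{t}).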

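Because new actions can be taken every year and system status is always observed, valuations in successive years *t* and *t*+1 have the same form, with the value function for *t*+1 replicating Eq (8) simply by incrementing the time index by 1:
$$V(A_{t+1} \mid x_{t+1},q_{t+1}) = R(x_{t+1},a_{t+1}) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_{t+1},q_{t+1})\,V(A_{t+2} \mid x_{t+2},q_{t+2}) \tag{10}$$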

An algorithm for determining optimal values and policies with scenario 1 is discussed in the Appendix.

### Scenario 2: Valuation under annual decision making and biennial monitoring

In this scenario decisions can be changed each year as in the standard scenario 1, but the monitoring by which system and model states are recognized occurs only every other year. If system state is observed in a given year *t*, not observed in the subsequent year *t*+1, and observed again in year *t*+2, a 2-step transition process is required (Fig 3). With state information *x*_{t} and *x*_{t+2} in years *t* and *t*+2, model-specific probabilities of transition from state *x*_{t} to *x*_{t+2} can be determined. These transition probabilities can be combined with model state *q*_{t} to determine model state *q*_{t+2} by Bayes’ theorem (Eq (2)).

Fig 3. Actions *a*_{t} and *a*_{t+1} are jointly selected based on system state *x*_{t} and model state *q*_{t}. Realized system state *x*_{t+2} is identified through monitoring in year *t*+2. Model state *q*_{t} is updated to *q*_{t+2} by Bayes’ theorem. This sequence, with actions *a*_{t} and *a*_{t+1} jointly chosen for successive years, is repeated over the remainder of the time frame.

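A 2-step value function
$$V(A_t \mid x_t,q_t) = R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1} \mid x_{t+1},q_{t+1}) \tag{11}$$
with
$$V(A_{t+1} \mid x_{t+1},q_{t+1}) = R(x_{t+1},a_{t+1}) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_{t+1},q_{t+1})\,V(A_{t+2} \mid x_{t+2},q_{t+2}) \tag{12}$$
allows optimal actions for years *t* and *t*+1 to be jointly identified. Maximizing *V*(*A*_{t} | *x*_{t},*q*_{t}) over *A*_{t} = {*a*_{t},*a*_{t+1},*A*_{t+2}} (Eq (5)) produces $a_t^{*}$, $a_{t+1}^{*}$, $A_{t+2}^{*}$, and *V*[*x*_{t},*q*_{t}] for each combination (*x*_{t},*q*_{t}).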

Determining value in the subsequent year *t*+1 requires a somewhat different treatment. Because there is no monitoring in year *t*+1, the states *x*_{t+1} and *q*_{t+1} in Eq (12) are unknown. However, they are related stochastically to *x*_{t} and *q*_{t}, which are known through monitoring. Averaging over the transition probabilities produces a valuation for year *t*+1 of
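$$\bar{V}(A_{t+1} \mid x_t,q_t,a_t) = \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1} \mid x_{t+1},q_{t+1}) \tag{13}$$
and using $a_t^{*}$, $a_{t+1}^{*}$ and $A_{t+2}^{*}$ from the 2-step optimization in Eq (11) produces the optimal valuation
$$\bar{V}[x_t,q_t,a_t^{*}] = \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t^{*},q_t)\,V(A_{t+1}^{*} \mid x_{t+1},q_{t+1}) \tag{14}$$
for year *t*+1, with $A_{t+1}^{*} = \{a_{t+1}^{*},A_{t+2}^{*}\}$. Note that the function in Eq (13) describing valuation for year *t*+1 has arguments that are indexed for the previous year *t*. The triple (*x*_{t},*q*_{t},*a*_{t}) in $\bar{V}(A_{t+1} \mid x_t,q_t,a_t)$, inherited from $\bar{P}(x_{t+1} \mid x_t,a_t,q_t)$, is needed to anchor the transition from the previous year *t*, when system state *x*_{t} is known, to year *t*+1 when system state *x*_{t+1} is not known.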

Computations of values and identification of policies for scenario 2 are discussed in the Appendix.
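A brute-force sketch of the scenario 2 recursion is given below for a hypothetical toy problem. Time is handled in 2-year blocks: both actions in a block are chosen jointly from the monitored state at the start of the block, the return in the unmonitored year is averaged over the possible intermediate states, and the model state is updated from the 2-step transition likelihood. The arrays `P` and `R`, the discount factor, and the function names are assumptions for illustration, and the sketch is not the algorithm in the Appendix.

```python
# Hypothetical toy problem: 2 models, 2 actions, 2 system states.
# P[k][a][x][y]: probability under model k of moving from x to y under action a.
P = [
    [[[0.3, 0.7], [0.1, 0.9]], [[0.5, 0.5], [0.3, 0.7]]],  # model 0
    [[[0.4, 0.6], [0.2, 0.8]], [[0.9, 0.1], [0.7, 0.3]]],  # model 1
]
R = [[0.5, 1.0], [1.0, 3.0]]  # R[x][a]: return for action a in state x


def bayes(q, like):
    """Posterior model state from prior q and model-specific likelihoods."""
    weights = [qk * lk for qk, lk in zip(q, like)]
    total = sum(weights)
    return [w / total for w in weights]


def scenario2_value(x, q, blocks, lam=0.9):
    """Return (optimal value, optimal action pair) over `blocks` 2-year blocks,
    with monitoring only at the start of each block."""
    if blocks == 0:
        return 0.0, None
    models, states, actions = range(len(P)), range(len(R)), range(len(R[x]))
    best_value, best_pair = float("-inf"), None
    for a in actions:
        # Averaged distribution of the unobserved state in year t+1.
        p1 = [sum(q[k] * P[k][a][x][y] for k in models) for y in states]
        for a2 in actions:
            value = R[x][a] + lam * sum(p1[y] * R[y][a2] for y in states)
            for z in states:
                # Model-specific 2-step likelihood of observing z in year t+2.
                like = [
                    sum(P[k][a][x][y] * P[k][a2][y][z] for y in states)
                    for k in models
                ]
                p2 = sum(q[k] * like[k] for k in models)
                if p2 > 0.0:
                    future = scenario2_value(z, bayes(q, like), blocks - 1, lam)[0]
                    value += lam ** 2 * p2 * future
            if value > best_value:
                best_value, best_pair = value, (a, a2)
    return best_value, best_pair


one_block_value, one_block_pair = scenario2_value(x=1, q=[0.5, 0.5], blocks=1)
two_block_value, two_block_pair = scenario2_value(x=1, q=[0.5, 0.5], blocks=2)
```

Because the second action cannot depend on the state in the unmonitored year, the search is over pairs of actions rather than over an action followed by a state-specific response.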

### Scenario 3: Valuation under biennial decision making and annual monitoring

In this scenario monitoring occurs every year, as in the standard situation involving annual monitoring and decision making, but decisions can be changed only every other year. The sequencing of actions is as described above for scenario 1, except that every other year the action for the previous year is repeated (Fig 4). For a year *t* in which a new action can be taken, valuation that includes the repetition of actions in successive years is
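$$\begin{aligned} V(A_t' \mid x_t,q_t) &= R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1}' \mid x_{t+1},q_{t+1},a_t) \\ &= R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\left[R(x_{t+1},a_t) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_t,q_{t+1})\,V(A_{t+2}' \mid x_{t+2},q_{t+2})\right] \end{aligned} \tag{15}$$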

Fig 4. Action *a*_{t} is selected based on system state *x*_{t} and model state *q*_{t}. Realized system state *x*_{t+1} is identified through monitoring in year *t*+1. Model state *q*_{t} is updated to *q*_{t+1} with Bayes’ theorem. Action *a*_{t} is repeated in year *t*+1. This sequence, with the same action taken in successive years, is repeated over the remainder of the time frame.

The conditioning argument *a*_{t} in *V*(*A*_{t+1}′ | *x*_{t+1},*q*_{t+1},*a*_{t}) is used here to emphasize that *a*_{t+1}, the lead action in *A*_{t+1}′ = {*a*_{t+1},*A*_{t+2}′}, is predetermined to be *a*_{t+1} = *a*_{t} because of the biennial decision making. Maximizing over *A*_{t}′ = {*a*_{t},*a*_{t},*A*_{t+2}′} produces the optimal action $a_t^{*}$, the optimal continuation policy $A_{t+2}^{\prime *}$, and *V*′[*x*_{t},*q*_{t}]. The “′” symbol in *A*_{t}′ and *V*′[*x*_{t},*q*_{t}] distinguishes strategies and valuations in scenario 3 from *A*_{t} and *V*[*x*_{t},*q*_{t}] in scenario 1, where decisions can be changed annually. On inspection the only difference between the valuation *V*(*A*_{t}′ | *x*_{t},*q*_{t}) here and *V*(*A*_{t} | *x*_{t},*q*_{t}) in Eq (8) for the standard scenario 1 is that *a*_{t} in Eq (15) replaces *a*_{t+1} in the computation of future returns. Of course, that seemingly marginal policy difference can have substantive consequences for valuation, depending on the Markovian structure of the problem.

Assuming that a new action can be taken in year *t* and is repeated in the subsequent year, the value function for year *t*+1 is
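$$V(A_{t+1}' \mid x_{t+1},q_{t+1},a_t) = R(x_{t+1},a_t) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_t,q_{t+1})\,V(A_{t+2}' \mid x_{t+2},q_{t+2}) \tag{16}$$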

Policy maximization for year *t*+1 then produces
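$$V'[x_{t+1},q_{t+1},a_t^{*}] = R(x_{t+1},a_t^{*}) + \lambda \sum_{x_{t+2}} \bar{P}(x_{t+2} \mid x_{t+1},a_t^{*},q_{t+1})\,V'[x_{t+2},q_{t+2}] \tag{17}$$
where $a_t^{*}$ in $V'[x_{t+1},q_{t+1},a_t^{*}]$ is identified by optimizing the value function in Eq (15). The action corresponding to the value $V'[x_{t+1},q_{t+1},a_t^{*}]$ is of course $a_{t+1} = a_t^{*}$.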

The determination of optimal values and policies with scenario 3 is discussed in the Appendix.

### Scenario 4: Valuation under biennial decision making and monitoring

Finally, decision making and monitoring can both be biennial, with decisions repeated and monitoring conducted only every other year (Fig 5). Valuation in a year *t* where a new action can be taken and monitoring occurs is given by
$$V(A_t' \mid x_t,q_t) = R(x_t,a_t) + \lambda \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1}' \mid x_{t+1},q_{t+1},a_t) \tag{18}$$
with *a*_{t} in *V*(*A*_{t+1}′ | *x*_{t+1},*q*_{t+1},*a*_{t}) again used as a conditioning argument to emphasize that *a*_{t+1}, the lead action in *A*_{t+1}′ = {*a*_{t+1},*A*_{t+2}′}, is predetermined to be *a*_{t+1} = *a*_{t} because of biennial decision making. Maximizing *V*(*A*_{t}′ | *x*_{t},*q*_{t}) in Eq (18) over *A*_{t}′ = {*a*_{t},*a*_{t},*A*_{t+2}′} produces the optimal action $a_t^{*}$, the optimal continuation policy $A_{t+2}^{\prime *}$, and *V*′[*x*_{t},*q*_{t}]. This is the same form as for scenario 3 with biennial decision making and annual monitoring.

Fig 5. Actions *a*_{t} and *a*_{t+1} = *a*_{t} are selected based on system state *x*_{t} and model state *q*_{t}. Realized system state *x*_{t+2} is identified through monitoring in year *t*+2. Model state *q*_{t} is updated to *q*_{t+2} with Bayes’ theorem. This sequence, with the same action chosen in successive years, is repeated over the remainder of the time frame.

However, scenario 4 differs from the other scenarios in the subsequent year *t*+1, because the conditioning states *x*_{t+1} and *q*_{t+1} are not known in the absence of monitoring. But their linkage to *x*_{t} and *q*_{t} can be used with *a*_{t} and *a*_{t+1} = *a*_{t} to produce the average value
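$$\bar{V}(A_{t+1}' \mid x_t,q_t,a_t) = \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t,q_t)\,V(A_{t+1}' \mid x_{t+1},q_{t+1},a_t) \tag{19}$$
with an optimal valuation of
$$\bar{V}'[x_t,q_t,a_t^{*}] = \sum_{x_{t+1}} \bar{P}(x_{t+1} \mid x_t,a_t^{*},q_t)\,V'[x_{t+1},q_{t+1},a_t^{*}] \tag{20}$$
in year *t*+1. Eq (19) differs from the corresponding Eq (13) for scenario 2 with annual decision making and biennial monitoring, but only in that scenario 2 allows for different actions *a*_{t} and *a*_{t+1} in *A*_{t} = {*a*_{t},*a*_{t+1},*A*_{t+2}}, whereas scenario 4 uses the same actions *a*_{t} and *a*_{t+1} = *a*_{t} in *A*_{t}′ = {*a*_{t},*a*_{t},*A*_{t+2}′}.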

Computing forms for optimal values and policies with scenario 4 are discussed in the Appendix.

## Patterns in the optimal valuations

Several informative comparisons can be recognized among the scenarios, in terms of the actions to be optimized and the state information that is available when actions are to be selected.

### Comparisons of annual and biennial monitoring

For the scenarios considered here, in a year *t* when monitoring is conducted and new decisions are made the valuation of optimal policy is the same irrespective of monitoring frequency. That is, the same expression for optimal valuation obtains under both monitoring regimes. For annual decision making the optimal valuation is *V*[*x*_{t},*q*_{t}] irrespective of the cadence of monitoring. For biennial decision making the optimal valuation is *V*′[*x*_{t},*q*_{t}]. Basically, if one knows the system and model states when decisions are made, there is no additional value in collecting more information between decision points (but see below on the potential influence of partial observability). This pattern holds for annual as well as biennial decision making.

The situation is somewhat different for years when monitoring may not be conducted. Assume that monitoring occurs in year *t*, and may or may not occur in year *t*+1 depending on the scenario. With annual decision making and annual monitoring (scenario 1), in year *t*+1 one optimizes $V(A_{t+1} \mid x_{t+1},q_{t+1})$ as in Eq (10), whereas with biennial monitoring (scenario 2) one optimizes the average value $\bar{V}(A_{t+1} \mid x_t,q_t,a_t)$ in Eq (13).

A similar pattern holds for biennial decision making. Under annual monitoring (scenario 3) one optimizes $V(A_{t+1}' \mid x_{t+1},q_{t+1},a_t)$ in Eq (16), whereas under biennial monitoring (scenario 4) one optimizes the average value $\bar{V}(A_{t+1}' \mid x_t,q_t,a_t)$ in Eq (19).

On reflection, these results make sense. The averaging with biennial monitoring is essentially a way to compensate for the lack of knowledge about system and model states at *t*+1. The effect of averaging clearly distinguishes the scenarios with biennial monitoring from those with annual monitoring, in their valuations as well as their policies.

### Comparisons of annual and biennial decision making

Even for years in which system state is observed, valuation varies with the cadence of decision making. As seen above, value is optimized for biennial decision making over *A*_{t}′ = {*a*_{t},*a*_{t},*A*_{t+2}′} rather than *A*_{t} = {*a*_{t},*a*_{t+1},*A*_{t+2}} for annual decision making. The use of identical actions in successive years is definitive of biennial decision making. The same pattern holds for annual as well as biennial monitoring.

It also holds for years *t*+1 in which system state is not necessarily observed. Of course, with biennial monitoring valuation involves the averaging of value functions in the absence of monitoring information.
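The restriction to repeated actions can be illustrated with a minimal Python sketch. It assumes a single, fully known model (so the model state *q*_{t} is omitted), a small discrete state space, undiscounted returns, and a terminal value of zero. The function name `biennial_values` and the arrays `P` and `R` are hypothetical and are not taken from the paper or from any existing software.

```python
def biennial_values(P, R, T):
    """Backward induction in 2-year blocks with the action held fixed.

    P[a][x][y]: probability of moving from state x to y under action a.
    R[a][x]:    one-year return for action a in state x.
    T:          even number of years remaining; terminal value is 0.
    Returns (V, policy): V[x] is the value in the first year, and
    policy[x] is the action chosen then and repeated the next year.
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states          # value at the terminal time
    policy = [0] * n_states
    for _ in range(T // 2):       # one pass per 2-year decision block
        new_V, new_policy = [], []
        for x in range(n_states):
            candidates = []
            for a in range(n_actions):
                # Year-t return, plus the expected year t+1 return and the
                # expected value at t+2, all under the same action a.
                v = R[a][x] + sum(
                    P[a][x][y] * (
                        R[a][y]
                        + sum(P[a][y][z] * V[z] for z in range(n_states))
                    )
                    for y in range(n_states)
                )
                candidates.append((v, a))
            best_v, best_a = max(candidates, key=lambda c: c[0])
            new_V.append(best_v)
            new_policy.append(best_a)
        V, policy = new_V, new_policy
    return V, policy


# Illustrative two-state, two-action problem (made-up numbers).
P = [
    [[1.0, 0.0], [0.0, 1.0]],     # action 0: remain in the current state
    [[0.0, 1.0], [1.0, 0.0]],     # action 1: switch states
]
R = [
    [1.0, 0.0],                   # returns under action 0
    [0.0, 0.5],                   # returns under action 1
]
V_biennial, policy_biennial = biennial_values(P, R, T=2)
```

In this made-up example the optimal annual strategy from state 1 would switch states in the first year and then remain, which a repeated action cannot reproduce. That is why valuation under biennial decision making can fall below valuation under annual decision making even when system state is observed.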

## Passive adaptive management under different cadences

The value functions above are based on an active form of adaptive decision making, in which at any particular point in time the effect of learning is factored into future decision making (Eq (4)). An alternative to active adaptive management is passive adaptive management, in which learning influences future decision making only indirectly, after the current decision is made (Eq (6)).

Consider, for example, annual decision making and biennial monitoring (scenario 2) under passive adaptive management. The associated value function differs from that for active adaptive management only in the use of a stationary model state. In a year *t* when the system is observed the value function for passive adaptive management is
(21)
and in year *t*+1 when it is not the value function is
(22)

These are the same forms as for active adaptive management (Eqs (11) and (13)), except for the use of a stationary model state in the transition probabilities and future values.

An analogous pattern can be seen for passive adaptive management under biennial decision making and annual monitoring (scenario 3). The value function for scenario 3 again differs from that for active adaptive management only in the model states used in the transition probabilities and future values. Thus, for a year when a new action can be selected the value function is
(23)
where the current model state *q*_{t} is used in the transition probabilities and the future value *V*(*A*_{t+2}′ | *x*_{t+2},*q*_{t}). For a year in which the previous action is repeated the value function is
(24)

These again are the same forms as for active adaptive management (Eqs (15) and (16)), except for the use of a stationary model state in the transition probabilities and future values.

In like manner, the value functions for scenarios 1 and 4 can be described on the assumption that decision making is passive rather than active. In each of the 4 scenarios the passive adaptive management forms can be derived by simply restricting the model states in active adaptive management to be stationary. This considerably reduces the computational burden of identifying optimal policies and values.
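The difference between the two forms can be sketched in Python. The sketch assumes that model state is a vector of model weights updated by Bayes' rule from each model's probability of the observed transition. That form is a standard one and is assumed here, not copied from the equations above. The function names are hypothetical.

```python
def update_model_state(q, likelihoods):
    """Bayes' rule for model weights.

    q[k]:           weight on model k in year t.
    likelihoods[k]: probability, under model k, of the transition
                    from x_t to x_{t+1} given the action taken.
    Returns the updated weights for year t+1.
    """
    joint = [qk * lk for qk, lk in zip(q, likelihoods)]
    total = sum(joint)
    if total == 0:
        raise ValueError("transition has zero probability under every model")
    return [j / total for j in joint]


def model_state_for_future_value(q, likelihoods, active=True):
    """Model state carried into the future-value term for a candidate x_{t+1}.

    Active form:  the anticipated update of q is used, so learning is
                  factored into the current decision.
    Passive form: q is held stationary when looking ahead.
    """
    if active:
        return update_model_state(q, likelihoods)
    return list(q)


q_t = [0.5, 0.5]                  # illustrative weights on two models
likelihoods = [0.8, 0.2]          # illustrative transition probabilities
q_active = model_state_for_future_value(q_t, likelihoods, active=True)
q_passive = model_state_for_future_value(q_t, likelihoods, active=False)
```

Under the passive form the look-ahead never branches on updated model states, which is the source of the reduced computational burden. Model state is still updated once the system is actually observed.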

## Discussion

As adaptive management continues to grow in its popularity and use in natural resources, there is a trend toward being more flexible in its implementation. But greater flexibility in turn creates new challenges in capturing an appropriate decision making “architecture” for individual problems. An important example concerns the frequencies of decision making and monitoring. In particular, an accounting is needed of the effects of different cadences on both strategy and valuation. Here we have described a technical framework that allows for assessment of differing combinations of annual and biennial decision making and monitoring.

In this paper we have highlighted substantive differences in value functions for varying cadences of decision making and monitoring, recognizing that the differences are less pronounced for years when a system is observed. Indeed, for a year *t* with observed status the cadence of monitoring does not affect valuation (or policy) for either annual or biennial decision making. On the other hand, the cadence of monitoring does affect valuation and policy for year *t*+1 in which the system is not observed.

These patterns provide insight into the value of the information that is added with more frequent monitoring (Yokota and Thompson [30], Williams and Johnson [28]). For example, in a monitoring year the valuations under less frequent and more frequent monitoring are identical, so no value is added by increasing the frequency of monitoring. But there is an added value in the subsequent year, as a result of the need to average the valuations across system states with less frequent monitoring. The gain in value with more frequent monitoring is the difference between a valuation informed by knowledge of system state (annual monitoring), versus an average valuation when system state is only known stochastically (biennial monitoring). The results of such an assessment can be useful to managers as a metric in determining whether to reduce annual to biennial monitoring, or to expand biennial to annual monitoring.
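A small numerical sketch in Python may make this gain concrete. It assumes a discrete set of possible system states in year *t*+1 with known probabilities, and a table of values for each state and action. All numbers and function names are hypothetical, and model state is omitted for brevity.

```python
def informed_value(state_probs, values):
    """Expected value when x_{t+1} is observed before the action is chosen.

    values[x][a] is the value of action a in state x, so the best action
    is selected separately for each state and the results are averaged.
    """
    return sum(p * max(values[x]) for x, p in enumerate(state_probs))


def averaged_value(state_probs, values):
    """Value when x_{t+1} is not observed.

    A single action must serve every state, so the average over states
    is taken first and then maximized over actions.
    """
    n_actions = len(values[0])
    return max(
        sum(p * values[x][a] for x, p in enumerate(state_probs))
        for a in range(n_actions)
    )


def monitoring_gain(state_probs, values):
    """Gain in value from observing system state in year t+1."""
    return informed_value(state_probs, values) - averaged_value(state_probs, values)


state_probs = [0.5, 0.5]                  # illustrative probabilities of x_{t+1}
values = [[10.0, 4.0], [2.0, 8.0]]        # illustrative values[x][a]
gain = monitoring_gain(state_probs, values)   # 9.0 - 6.0 = 3.0
```

The gain is never negative, because the average of state-specific maxima is at least as large as the maximum of the averages. It is zero when a single action is best in every state.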

The assessment for annual and biennial cadences can be extended to include options for 3 or more years. Thus, one could consider the effect of decision making every 3 years rather than every year, or the effect of monitoring every 3 years rather than every year (Johnson and Madsen [15]). One way to assess such an extension would be to express returns in the value functions in terms of 3 time steps rather than 2, and compute the corresponding expectations. The practical effect would be to complicate the mathematical expressions for valuation, which would likely make the comparative interpretation of patterns more difficult.

Other variations in the cadence of monitoring and decision making are possible. In the above, biennial monitoring and decision making occur in the same years, which allows decisions to be informed by system and model states at those times. Another variation is for monitoring and decision making to occur in alternate years, for example with decision making in one year and monitoring in the subsequent year. The overall effect of this cadence is to require averaging based on prior year status each time a new decision is made. Yet another variation involves decision making prior to monitoring each year, so that the monitoring results are not available to inform the selection of actions for that year (Johnson et al. [16]). Under these conditions, possible actions must again be conditioned on the previous system and model state and the action previously taken.

The lack of additional value in collecting information between decision points that is highlighted here depends on the assumption that the resource system is fully observable. An allowance for partial observability defines a partially observable Markov decision process (POMDP), in which system status is approximated by a time-specific probability distribution or “belief state” that is updated with monitoring data through time (Kaelbling et al. [31], Littman [32]). A natural accounting of both partial observability and structural uncertainty considers the updating of belief, whenever it occurs, as a factor in the updating of model state, so that a change in belief affects the propagation of model state and in so doing may influence policy and valuation. Under these circumstances monitoring between decision times can have an effect on decision making.
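For readers unfamiliar with belief states, the standard POMDP belief update can be sketched in Python as follows. It assumes a single known transition model and omits the coupling with model state described above. The function and argument names are hypothetical.

```python
def update_belief(belief, P, obs_likelihood):
    """One step of the standard POMDP belief update.

    belief[x]:         current probability that the system is in state x.
    P[x][y]:           probability of moving from x to y under the action taken.
    obs_likelihood[y]: probability of the monitoring result actually
                       obtained, given that the true state is y.
    Returns the updated belief over states.
    """
    n = len(belief)
    # Prediction: propagate the current belief through the transition model.
    predicted = [sum(belief[x] * P[x][y] for x in range(n)) for y in range(n)]
    # Correction: weight each state by how well it explains the observation.
    weighted = [obs_likelihood[y] * predicted[y] for y in range(n)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("observation has zero probability under the belief")
    return [w / total for w in weighted]


belief_t = [0.5, 0.5]                     # illustrative prior belief
P = [[0.9, 0.1], [0.2, 0.8]]              # illustrative transition matrix
obs_likelihood = [0.8, 0.3]               # illustrative observation model
belief_next = update_belief(belief_t, P, obs_likelihood)
```

In a year without monitoring only the prediction step would apply, which corresponds to passing an uninformative `obs_likelihood` such as `[1.0, 1.0]`.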

There is a very large technical literature on theory and applications of adaptive management in natural resources for fully observable systems, and a much smaller but growing literature of POMDP methods and applications in natural resources (e.g., Lane [33], Chadès et al. [34], Haight and Polasky [35], Tomberlin [36], Chadès et al. [37], Fackler and Haight [38], Regan et al. [39], Nicol and Chadès [40]). However, there are very few expositions concerning natural resources that include both (Williams [20], Fackler and Pacifici [41], Chadès et al. [42]), even though structural uncertainty and partial observability are common in natural resources. The limited documentation is no doubt a result, at least partially, of the formidable difficulties of incorporating both factors in analytic and computational frameworks that are accessible to natural resources specialists (e.g., Jaulmes et al. [43], Williams [20], Bertsekas [29]). One rather ad hoc approach is to assume full observability, identify optimal policies and valuations as approximations to the broader problem that includes partial observability, and explore the sensitivity of the approximations to errors in state estimation.

Finally, a key determinant in the usefulness of comparative valuation with different cadences is the ability to actually compute values with the forms of the value functions discussed above. Software is currently available for active and passive adaptive management under annual decision making and monitoring (Lubow [44], Fackler [45]). It is straightforward to use this software for the case of biennial decision making and monitoring for passive adaptive management, by utilizing the 2-step transition probabilities. Further software development is necessary for the other scenarios, which involves a greater or lesser degree of difficulty and programming effort depending on the scenario.
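The 2-step quantities mentioned above can be sketched in Python as follows. The sketch assumes a one-year transition matrix and return vector for a single fixed action, for example already averaged over a stationary model state, with undiscounted returns. It does not represent the interface of any of the cited software, and the function names are hypothetical.

```python
def two_step_transitions(P):
    """Two-year transition probabilities when the same action is repeated.

    P[x][y] is the one-year probability of moving from x to y under the
    action; the result is the matrix product of P with itself.
    """
    n = len(P)
    return [
        [sum(P[x][y] * P[y][z] for y in range(n)) for z in range(n)]
        for x in range(n)
    ]


def two_step_returns(P, r):
    """Expected return accumulated over the two years of a decision block.

    r[x] is the one-year return in state x under the action.
    """
    n = len(P)
    return [r[x] + sum(P[x][y] * r[y] for y in range(n)) for x in range(n)]


P = [[0.9, 0.1], [0.2, 0.8]]      # illustrative one-year transitions
r = [1.0, 0.0]                    # illustrative one-year returns
P2 = two_step_transitions(P)      # approximately [[0.83, 0.17], [0.34, 0.66]]
r2 = two_step_returns(P, r)       # approximately [1.9, 0.2]
```

With `P2` and `r2` computed for each action, each 2-year block can be treated as a single time step in a solver built for annual decision making and monitoring.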

## Appendix

In this appendix we outline computing algorithms and forms for the 4 scenarios. In each scenario the optimal values and actions can be determined recursively.

### Scenario 1: Annual decision making and annual monitoring

In any given year *t* the value function can be expressed recursively as in Eq (9), and maximization over *A*_{t} = {*a*_{t},*A*_{t+1}} produces the optimal strategy and *V*[*x*_{t},*q*_{t}] for each (*x*_{t},*q*_{t}). This algorithm typically is applied sequentially throughout the time frame, starting at the terminal time *T* and stepping backward in single time steps (Williams et al. [19], Bertsekas [29]).
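The backward stepping can be sketched in Python for a simplified case. The sketch assumes a single known model (no model state *q*_{t}), discrete states and actions, undiscounted returns, and a terminal value of zero. It illustrates the recursion only and is not the full algorithm with structural uncertainty. The function and array names are hypothetical.

```python
def backward_induction(P, R, T):
    """Finite-horizon backward induction in single time steps.

    P[a][x][y]: probability of moving from state x to y under action a.
    R[a][x]:    one-year return for action a in state x.
    T:          number of years in the time frame.
    Returns (V, policy) with V[t][x] the optimal value and policy[t][x]
    the optimal action in year t and state x; V[T][x] is the terminal value.
    """
    n_actions, n_states = len(P), len(P[0])
    V = [[0.0] * n_states for _ in range(T + 1)]
    policy = [[0] * n_states for _ in range(T)]
    for t in range(T - 1, -1, -1):            # step backward from T - 1 to 0
        for x in range(n_states):
            best_a, best_v = 0, float("-inf")
            for a in range(n_actions):
                # Current return plus expected optimal value next year.
                v = R[a][x] + sum(
                    P[a][x][y] * V[t + 1][y] for y in range(n_states)
                )
                if v > best_v:
                    best_a, best_v = a, v
            V[t][x] = best_v
            policy[t][x] = best_a
    return V, policy


# Illustrative two-state, two-action problem (made-up numbers).
P = [
    [[1.0, 0.0], [0.0, 1.0]],     # action 0: remain in the current state
    [[0.0, 1.0], [1.0, 0.0]],     # action 1: switch states
]
R = [
    [1.0, 0.0],                   # returns under action 0
    [0.0, 0.5],                   # returns under action 1
]
V, policy = backward_induction(P, R, T=2)
```

Adding structural uncertainty would extend the state to the pair (*x*_{t},*q*_{t}), typically on a discretized grid of model states, which is where most of the computational cost arises.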

### Scenario 2: Annual decision making and biennial monitoring

A recursive algorithm for identifying optimal valuation and strategy for scenario 2 involves a 2-step backward iteration to determine *V*[*x*_{t},*q*_{t}] for a year *t* with monitoring, with the results used to determine optimal valuation for year *t*+1. In the first step, *V*[*x*_{t},*q*_{t}] is computed as above (Eq (9)), along with the corresponding optimal actions, for each combination (*x*_{t},*q*_{t}). In the second step the optimal strategy is used to compute the average valuation for year *t*+1 as shown in Eq (14).

### Scenario 3: Biennial decision making and annual monitoring

A recursive algorithm for identifying optimal valuation and strategy for scenario 3 again involves a 2-step iteration for any year *t* in which a new decision can be made: first determine *V*′[*x*_{t},*q*_{t}] and the associated optimal action, and then use the results to determine the valuation in year *t*+1. In the first step, *V*′[*x*_{t},*q*_{t}] is computed for each combination (*x*_{t},*q*_{t}) via the maximization of Eq (15), and the optimal action corresponding to (*x*_{t},*q*_{t}) is identified. In the second step that action is used to compute the valuation for year *t*+1 as in Eq (17), for each combination (*x*_{t+1},*q*_{t+1}) together with the repeated action.

### Scenario 4: Biennial decision making and biennial monitoring

A recursive algorithm for identifying optimal valuation and strategy for scenario 4 involves a 2-step backward iteration to determine *V*′[*x*_{t},*q*_{t}] for each year *t* when monitoring occurs, and then using the results to determine optimal valuation for year *t*+1. In the first step, *V*′[*x*_{t},*q*_{t}] is computed along with the corresponding optimal actions for each combination (*x*_{t},*q*_{t}), via the maximization of Eq (18). In the second step the optimal strategy is used to compute the average valuation for year *t*+1 with Eq (20).

## References

- 1. Williams BK, Brown ED. Adaptive management: From more talk to real action. Environmental Management 2014;53:465–479. pmid:24271618
- 2. Williams BK, Brown ED. Technical challenges in the application of adaptive management. Biological Conservation 2016;195:255–263.
- 3. Williams BK. Adaptive management of natural resources: Framework and issues. Journal of Environmental Management 2011;92:1346–1353.
- 4. Walters CJ. Adaptive management of renewable resources. Caldwell, NJ: Blackburn Press; 1986.
- 5. Williams BK, Brown ED. Adaptive Management: U.S. Department of the Interior Applications Guide. Washington, DC: U.S. Department of the Interior; 2012.
- 6. Linkov I, Satterstrom FK, Kiker G, Batchelor C, Bridges T, Ferguson E. From comparative risk assessment to multi-criteria decision analysis and adaptive management: Recent developments and applications. Environment International 2006; 32(8):1072–93. pmid:16905190
- 7. Runge MC, Converse SJ, Lyons JE. Which uncertainty? Using expert elicitation and expected value of information to design an adaptive program. Biological Conservation 2011;144:1214–23.
- 8. Susskind L, Camacho AE, Schenk T. Collaborative planning and adaptive management in Glen Canyon: a cautionary tale. Columbia Journal of Environmental Law 2010;35(1):1–54.
- 9. Convertino M, Foran CM, Keisler JM, Scarlett L, LoSchiavo A, Kiker GA, et al. Enhanced Adaptive Management: Integrating Decision Analysis, Scenario Analysis and Environmental Modeling for the Everglades. Scientific Reports 2013;3:2922. pmid:24113217
- 10. Johnson FA, Breininger DR, Duncan BW, Nichols JD, Runge MC, Williams BK. A Markov decision process for managing habitat for Florida scrub-jays. Journal of Fish and Wildlife Management 2011;2:234–46.
- 11. Linkov I, Satterstrom FK, Kiker G, Batchelor C, Bridges T, Benjamin SL, et al. From Optimization to adaptation: Shifting paradigms in environmental management and their application to remedial decisions. Integrated Environmental Assessment and Management 2006;2:92–98. pmid:16640324
- 12. Johnson FA, Williams BK. A decision-analytic approach to adaptive resource management. In Allen CR, Garmestani AS (eds.). Adaptive Management of Social-Ecological Systems. Houten, Netherlands: Springer; 2015.
- 13. Schreiber ESG, Bearlin AR, Nicol SJ, Todd CR. Adaptive management: a synthesis of current understanding and effective application. Ecological Management and Restoration 2004; 5:177–182.
- 14. Hauser CE, Pople AR, Possingham HP. Should managed populations be monitored every year? Ecological Applications 2006;16:807–819. pmid:16711064
- 15. Johnson F, Madsen J. Adaptive harvest management for the Svalbard population of pink-footed geese: 2015 progress summary. Aarhus University, Danish Center for Environment and Energy, Technical Report No. 64. ISSN 2245-019X. http://dce2.au.dk/pub/TR64.pdf; 2015.
- 16. Johnson FA, Fackler PL, Boomer GS, Zimmerman GS, Williams BK, Nichols JD, et al. State-Dependent Resource Harvesting with Lagged Information about System States. PLoS ONE 2016;11:e0157373. pmid:27314852
- 17. Smith DR, McGowan CP, Daily JP, Nichols JD, Sweka JA, Lyons JE. Evaluation of a multi-species adaptive management framework: must uncertainty impede effective decision-making? Journal of Applied Ecology 2013;50:1431–1440.
- 18. Puterman ML. Markov decision processes: Discrete stochastic dynamic programming. New York, USA: John Wiley and Sons; 1994.
- 19. Williams BK, Nichols JD, Conroy MJ. Analysis and Management of Animal Populations. San Diego, CA: Academic Press; 2002.
- 20. Williams BK. Markov decision processes in natural resources management: Observability and uncertainty. Ecological Modelling 2009;220:830–840.
- 21. Williams BK, Johnson FA. Confronting dynamics and uncertainty in optimal decision making for conservation. Environmental Research Letters 2013;8:025004.
- 22. Nichols JD, Williams BK. Adaptive management. In: El-Shaarawi AH, Piegorsch W (eds.). Encyclopedia of Environmetrics. New York: John Wiley and Sons; 2012.
- 23. Lee PM. Bayesian Statistics: An Introduction. London, UK: Edward Arnold Publishers; 1989.
- 24. Williams BK. Passive and active adaptive management: Approaches and an example. Journal of Environmental Management 2011;92:1371–1378. pmid:21074930
- 25. Walters CJ, Hilborn R. Ecological optimization and adaptive management. Annual Review of Ecology and Systematics 1978;9:157–188.
- 26. Johnson FA, Kendall WE, Dubovsky JA. Conditions and limitations on learning in the adaptive management of mallard harvests. Wildlife Society Bulletin 2002;30:176–185.
- 27. Hauser CE, Possingham HP. Experimental or precautionary? Adaptive management over a range of time horizons. Journal of Applied Ecology 2008;45:72–81.
- 28. Williams BK, Johnson FA. Value of information and natural resources decision making. Wildlife Society Bulletin 2015:
- 29. Bertsekas DP. Dynamic Programming and Optimal Control, vol. 1. Belmont, MA: Athena Scientific; 2017.
- 30. Yokota F, Thompson KM. Value of information literature analysis: a review of applications in health risk management. Medical Decision Making 2004;24:287–298. pmid:15155018
- 31. Kaelbling LP, Littman ML, Cassandra AR. Planning and acting in partially observable stochastic domains. Artificial Intelligence 1998;101:99–134.
- 32. Littman ML. A tutorial on partially observable Markov decision processes. Journal of Mathematical Psychology 2009;53:119–125.
- 33. Lane D. A partially observable model of decision making by fishermen. Operations Research 1989;37:240–254.
- 34. Chadès I, McDonald-Madden E, McCarthy MA, Linkie M, Possingham HP. When to stop managing or surveying cryptic threatened species. Proceedings of the National Academy of Sciences 2008;105:13936–13940.
- 35. Haight RG, Polasky S. Optimal control of an invasive species with imperfect information about the level of infestation. Resource and Energy Economics 2010; 32:519–533.
- 36. Tomberlin D. Endangered seabird habitat management as a partially observable Markov decision process. Marine Resource Economics 2010;25:93–104.
- 37. Chadès I, Martin TG, Nicol S, Burgman MA, Possingham HP, Buckley YM. General rules for managing and surveying networks of pests, diseases, and endangered species. Proceedings of the National Academy of Sciences 2011;108:8323–8328.
- 38. Fackler PL, Haight R. Monitoring as a partially observable decision problem. Resource and Energy Economics 2011;37:226–241.
- 39. Regan TJ, Chadès I, Possingham HP. Optimally managing under imperfect detection: a method for plant invasions. Journal of Applied Ecology 2011;48:76–85.
- 40. Nicol S, Chadès I. Which states matter? An application of an intelligent discretization method to solve a continuous POMDP in conservation biology. PLoS ONE 2012;7:e28993. pmid:22363398
- 41. Fackler P, Pacifici K. Addressing structural and observational uncertainty in resource management. Journal of Environmental Management 2014;133:27–36. pmid:24355689
- 42. Chadès I, Nicol S, Rout TM, Péron M, Dujardin Y, Pichancourt J-B, et al. Optimization methods to solve adaptive management problems. Theoretical Ecology 2016;10:1–20.
- 43. Jaulmes R, Pineau J, Precup D. Active learning in partially observable Markov decision processes. In: Gama J, Camacho R, Brazdil PB, Jorge AM, Torgo L (eds). Machine Learning: ECML 2005. Lecture Notes in Computer Science, vol 3720. Berlin: Springer; 2005.
- 44. Lubow BC. SDP: Generalized software for solving stochastic dynamic optimization problems. Wildlife Society Bulletin. 1995;23:738–742.
- 45. Fackler PL. MDPSolve Users Guide. 2011. https://sites.google.com/site/mdpsolve/.