How Much Is Enough? Minimal Responses of Water Quality and Stream Biota to Partial Retrofit Stormwater Management in a Suburban Neighborhood

Decentralized stormwater management approaches (e.g., biofiltration swales, pervious pavement, green roofs, rain gardens) that capture, detain, infiltrate, and filter runoff are now commonly used to minimize the impacts of stormwater runoff from impervious surfaces on aquatic ecosystems. However, there is little research on the effectiveness of retrofit, parcel-scale stormwater management practices for improving downstream aquatic ecosystem health. A reverse auction was used to encourage homeowners to mitigate stormwater on their property within the suburban, 1.8 km² Shepherd Creek catchment in Cincinnati, Ohio (USA). In 2007–2008, 165 rain barrels and 81 rain gardens were installed on 30% of the properties in four experimental (treatment) subcatchments, and two additional subcatchments were maintained as controls. At the base of the subcatchments, we sampled monthly baseflow water quality, and seasonal (5×/year) physical habitat, periphyton assemblages, and macroinvertebrate assemblages in the streams for the three years before and after treatment implementation. Given the minor reductions in directly connected impervious area from the rain barrel installations (11.6% to 10.4% in the most impaired subcatchment) and high total impervious levels (13.1% to 19.9% in experimental subcatchments), we expected minor or no responses of water quality and biota to stormwater management. There were trends of increased conductivity, iron, and sulfate for control sites, but no such contemporaneous trends for experimental sites. The minor effects of treatment on streamflow volume and water quality did not translate into changes in biotic health, and the few periphyton and macroinvertebrate responses could be explained by factors not associated with the treatment (e.g., vegetation clearing, drought conditions). Improvement of overall stream health is unlikely without additional treatment of major impervious surfaces (including roads, apartment buildings, and parking lots). Further research is needed to define the minimum effect threshold and restoration trajectories for retrofitting catchments to improve the health of stream ecosystems.


Introduction
Rapid urbanization and the ongoing conversion of landscapes from natural habitats to industrial, commercial, and residential land uses to support a growing human population remain the most salient threats to natural ecosystems [1][2][3]. Aquatic ecosystems that drain urban areas are particularly vulnerable due to their low position in the landscape [4]. In most urban and suburban areas, untreated stormwater runoff from impervious surfaces is typically routed directly into rivers, lakes, and oceans. This conventional design of urban drainage systems reflects concerns about human health and safety, but largely ignores threats to aquatic ecosystem health that stem from stormwater runoff [5,6].
The urban stream syndrome describes changes in stream ecosystems associated with urbanization, a subject that has been increasingly studied in the last few decades (see reviews by [7][8][9]). These changes primarily arise from stormwater runoff from impervious cover (particularly impervious cover that is directly connected to streams by stormwater pipes [6]), which alters stream hydrology, water chemistry, and biotic communities. High magnitude, flashy flows in urban streams can scour stream beds and erode stream banks, thus reducing habitat quality. Furthermore, the extreme high flows can wash out aquatic biota and low-lying riparian vegetation, whereas reduced base flows can reduce in-stream habitat and alter stream ecosystem function [10]. Runoff that enters urban and suburban streams often contains increased toxicants, ions, and nutrients, along with higher temperatures, reduced oxygen saturation, and organic material that can alter biotic structure and ecosystem function (production, nutrient uptake, leaf breakdown, etc.) relative to streams in natural landscapes [7]. While there are other catchment stressors (e.g., point sources, septic systems, riparian degradation) and in-stream stressors (e.g., impoundments, water withdrawals, stream burial) associated with urbanization [9], stormwater runoff is a dominant source of impairment of ecological structure and function in most urban catchments. As such, comprehensively managing stormwater runoff with the goals of mimicking pre-disturbance flow regimes, improving water quality, and ultimately improving ecosystem health is a leading approach to urban stream restoration [5,6,11,12].
There is an increasing movement throughout the world to address runoff through decentralized stormwater management [6][13][14][15]. This management approach can include small-scale tools that capture and detain (e.g., detention and retention basins), infiltrate (e.g., pervious pavements, rain gardens), and filter (e.g., biofiltration swales, wetlands) runoff on individual parcels throughout a landscape [16][17][18]. These tools are implemented through new development (often referred to as low impact development, LID) [15] or retrofitting of already-developed areas.
To date, there have been no reports of the effectiveness of decentralized stormwater management for improving stream water quality and biota in suburban catchments. Most assessments are limited to measurements of hydrology and water quality within individual treatment practices [18,19] (see also www.bmpdatabase.org) and modelled effects of installations throughout catchments [20][21][22]. There have been some catchment-scale studies comparing LID and conventional practices in new developments [23,24], but no catchment-scale retrofits of existing developments. This is due in part to the distribution of impervious surfaces within catchments, and the legal, economic, and logistical difficulty of implementing stormwater management practices at a scale appropriate for improvement [25,26]. The research presented here and another study currently underway in Australia [27] are, to our knowledge, the first attempts at assessing stream responses to retrofit stormwater management at the catchment scale.
Like many large, older US cities, the metropolitan area of Cincinnati, OH has an aging stormwater infrastructure that uses common combined sewer overflows, which lead to both an environmental and legal (e.g., consent decree) need to address stormwater management [28]. Thurston et al. [29] used Cincinnati to demonstrate that decentralized stormwater abatement was likely to be less expensive than a centralized (e.g., deep tunnel) solution to the water quality and quantity problem. Thus, a multidisciplinary study was designed that 1) assessed the legal, economic, and scientific challenges associated with decentralized stormwater management [25], and 2) developed and tested a stormwater management strategy within a small, suburban catchment. Rain barrels (up to four per property) and rain gardens (one per property) were offered to eligible residents (i.e., owner-occupied and within the experimental area) through a voluntary, reverse auction where bids consisted of the stormwater management practice(s) and a financial subsidy (if desired). The lowest cost bids at locations with the highest potential environmental benefits were prioritized. Winning homeowners received the bid amount, free stormwater management practices, installation, and maintenance for three years [30]. The voluntary nature of the auction avoided private property rights issues while providing financial incentives to property owners for installation and increasing stakeholder ownership [25]. The project placed 81 rain gardens and 165 rain barrels on ca. 30% of the eligible properties within the headwaters of the Shepherd Creek catchment in 2007 and 2008 [30]. The rain barrels resulted in an overall decrease in directly connected impervious area (DCIA) from 7.4% to 7.0% in the catchment, and 11.6% to 10.4% (Sub1), 9.0% to 8.1% (Sub2), and 7.3% to 7.1% (Sub3) in the Experimental subcatchments (Table 1). Rain gardens did not change the DCIA, but offered additional capacity to capture overland runoff.
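To put the reported DCIA changes in perspective, the short sketch below converts the before and after percentages quoted above into absolute (percentage point) and relative changes. The values come from the text; the function name `dcia_change` is ours and purely illustrative.

```python
# DCIA (%) before and after rain barrel installation, as reported in the text.
dcia = {
    "Catchment": (7.4, 7.0),
    "Sub1": (11.6, 10.4),
    "Sub2": (9.0, 8.1),
    "Sub3": (7.3, 7.1),
}


def dcia_change(before, after):
    """Return (change in percentage points, relative change in %)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return absolute, relative


for name, (before, after) in dcia.items():
    absolute, relative = dcia_change(before, after)
    print(f"{name}: {absolute:+.1f} points ({relative:+.1f}% relative)")
```

Even in Sub1, the subcatchment with the largest change, the relative reduction in DCIA is only about one tenth of the starting value.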
The objective of our study was to determine if the retrofit stormwater management imposed as a result of the economic auction would result in measurable shifts in the ecological condition of streams in the Shepherd Creek catchment. We

Study design
We used a modified before-after-control-intervention (BACI) study design, where the intervention was the installation of treatments (rain barrels and rain gardens) on select parcels within the catchment. The modified paired-catchment BACI design relied on comparison of the difference between responses of Control and Experimental subcatchments for three periods to determine if significant treatment effects were present [33][34][35]. Typically only a pre-treatment and experimental period are used in a BACI design, but because implementation of the treatments spanned 16 months, we also included a transition period. Six study sites were sampled within the Shepherd Creek catchment, including four Experimental sites (Sub1, Sub2, Sub3, Catch) and two Control sites (Sub4, Sub5; Figure 1). Access to field sites Sub1, Sub2, Sub3, and Catch was granted by private landowners. Sites Sub5 and Sub5a were in the Mt. Airy Forest, and permission was granted by the Cincinnati Park Board. Sub4 was in the road right-of-way. Multiple Experimental and Control sites were used to minimize potential confounding of location-specific differences with treatment effects [36,37]. Total impervious area (TIA) in the Shepherd Creek catchment was 13.1%, with just over half (7.4%) of the TIA directly connected to stormwater or combined sewer pipes (Table 1) [38]. The five subcatchments (24.9-68.9 ha) ranged from 11.2-19.9% TIA, with a corresponding range of 43.8-68.0% forest cover (Table 1).

Physical and chemical characteristics
Basic morphometric, geomorphic, and water quality parameters were measured five times/year within the 61-m sample reach during biotic sampling. We calculated approximate values for average width, average depth, wetted area, and surface velocity based on field measurements. Additional physical attributes included estimates of: % riffle, % pool, % run, large wood density, % small wood, % large wood, % detritus, bed texture (% bedrock, cobble, gravel, sand, silt), and % canopy cover following the EPA Rapid Bioassessment Protocols (RBP) Physical Characterization data sheet [39]. Water quality measurements were taken with a YSI 6600 data sonde (YSI, Inc., Yellow Springs, OH, USA), and included: water temperature, conductivity, dissolved oxygen, pH, oxidation-reduction potential, and turbidity. YSI data sondes were calibrated within 24 hours of use and tested immediately following return from the field. Measurements from water quality probes that did not pass post-deployment calibration checks were not used. Finally, we calculated two visual assessment habitat evaluation scores: EPA's Rapid Bioassessment Protocols Quantitative Habitat Assessment (QHEI) [39], and the Primary Headwater Habitat Evaluation Form (HHEI) [40] that is specifically designed for streams with water depths <40 cm.
Water quality sampling was also conducted monthly during baseflow conditions. In addition to measuring the water quality parameters described above, grab water samples were filtered with a 0.45-µm filter for total dissolved phosphorus (TDP), dissolved organic carbon (DOC), and dissolved metals (Al, Fe, Mn, Cu, Zn). TDP and DOC were preserved with sulfuric acid, and dissolved metals were preserved in nitric acid. Unfiltered grab samples were collected for analysis of nutrients (nitrate/nitrite nitrogen, ammonium nitrogen, dissolved inorganic nitrogen, total Kjeldahl nitrogen, total phosphorus, ortho-phosphate, TDP), total organic carbon (TOC), total recoverable metals (Al, Fe, Mn, Cu, and Zn), and base cations (Na, Mg, K, Ca). Nutrients and TOC were preserved with sulfuric acid, and metals and base cations were preserved with nitric acid. Suspended sediment concentration (SSC), anions (Cl, Br, F, SO4, NO3, and ortho-PO4), and alkalinity samples were unpreserved. For SSC, the 250-mL sample was filtered through a pre-washed and pre-weighed 1.5-µm glass-fiber filter, dried to a constant weight, and re-weighed following ASTM Method D3977-97 [41]. Analyses for nutrients, metals, anions, and cations were performed by EPA Region 5 Central Regional Laboratory (Chicago, IL) using standard EPA protocols as follows:

Periphyton
Periphyton samples were collected from submerged rocks throughout the 61-m study reach. Cobbles were removed from the stream and a 13.2-cm² area on each rock, designated with a PVC ring, was brushed with a toothbrush for 2 min. Rocks and brushes were then rinsed with stream water into a 500-mL bottle. Algae from all rocks within a reach were composited into a single bottle and placed in the dark on ice.
In the laboratory, 20-30 mL of the periphyton slurry was filtered onto each of two glass fiber filters and frozen for subsequent analysis of chlorophyll a using a multi-wavelength spectrophotometer following EPA's Method 446.0 [42]. An additional 50 mL of sample was preserved in 1% glutaraldehyde for biomass analysis. The sample was later filtered onto a pre-ashed glass fiber filter (47 mm, PALL Type A/E, 1-µm pore size). Filters were dried at 105°C to a constant weight, weighed for dry weight, ashed in a muffle furnace for 1.5 hours at 500°C, wetted, re-dried at 105°C, and re-weighed to obtain ash-free dry mass (AFDM). The remaining algal sample was preserved in 1% glutaraldehyde for identification. All algae (diatoms and soft algae) were identified and enumerated to the genus level by PhycoTech, Inc., consistent with Standard Methods 10200 and 10300 [43]. Three permanent slide mounts were made with 2-hydroxypropyl methacrylate (HPMA), and all slides were examined using a stratified counting procedure (200× and 400× for soft algae, 1000× for diatoms and picoalgae) to a minimum of 400 natural units [44]. Algal indices were calculated based on densities of cells per cm² and included total density, density of major orders (Bacillariophyta, Chlorophyta, and Cyanophyta), relative proportions of the major orders, taxa richness, Shannon diversity, and percent of sample in the dominant taxon. A periphyton index of biotic integrity (PIBI) developed for the mid-Atlantic region of the United States was calculated that incorporated nine metrics (phosphatase activity metric excluded) [45].
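The AFDM step above amounts to a mass difference scaled to the sampled rock area. The following is a minimal sketch of that arithmetic, not the authors' calculation: the total slurry volume, the number of rocks, and the example masses are hypothetical inputs, and only the 13.2-cm² ring area and the 50-mL subsample come from the text.

```python
RING_AREA_CM2 = 13.2  # area brushed on each rock (from the text)


def afdm_mg_per_cm2(dry_mg, ashed_mg, filtered_ml, slurry_ml, n_rocks):
    """Ash-free dry mass per unit rock area (mg/cm^2).

    dry_mg and ashed_mg are masses of the same filter plus sample, so the
    filter mass cancels in the difference. slurry_ml (total composite
    volume) and n_rocks are assumed to be recorded in the field.
    """
    if filtered_ml <= 0 or slurry_ml <= 0 or n_rocks <= 0:
        raise ValueError("volumes and rock count must be positive")
    afdm_on_filter = dry_mg - ashed_mg                 # organic mass lost on ashing
    afdm_in_sample = afdm_on_filter * (slurry_ml / filtered_ml)  # scale up subsample
    return afdm_in_sample / (n_rocks * RING_AREA_CM2)  # normalize to brushed area


# Hypothetical example: 50 mL filtered from a 500-mL composite of 5 rocks.
print(round(afdm_mg_per_cm2(30.0, 20.0, 50.0, 500.0, 5), 3))
```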

Macroinvertebrates
Macroinvertebrates were collected using two methods: 1) a triangular dip net (500-µm mesh) used to collect a composited, multi-habitat sample in the entire 61-m reach, and 2) a bucket sampler (0.053-m²) within three replicate depositional/riffle habitats. The net samples were considered ideal for capturing macroinvertebrate diversity and relative abundance [39], whereas the bucket samples were used to determine macroinvertebrate densities in riffle habitats that are most sensitive to disturbance [46]. Net samples were collected five times/year in conjunction with periphyton and habitat sampling, and bucket samples were collected during three sampling events per year (spring, summer, autumn). Bucket samples were taken by pushing an open bucket into the bed sediment, surrounding the bucket-sediment interface with a custom-made canvas skirt to enclose the area, and scrubbing each large rock into the water contained in the bucket. The bed sediment was then disturbed for 10 sec. using a trowel, followed by 10 sec. of sweeping with a small dip net (500-µm mesh), and repeated for a total of three times. All samples were emptied into a wash basin, elutriated, poured through a 500-µm sieve, and preserved in 70% ethanol.
Table 3 notes. 1 Lambda is the value for the exponential transformation. 2 ***P<0.001, *P<0.05. 3 Habitat variables (including some water quality variables) were sampled five times per year during biotic sampling events. 4 HHEI score from Ohio Environmental Protection Agency Primary Headwater Habitat Evaluation Index [40]. 5 QHEI score from Rapid Bioassessment Protocols Quantitative Habitat Assessment for high gradient streams, and filamentous algae score (range 0-4) is from RBP benthic macroinvertebrate field sheet [39]. 6 Water quality variables were sampled monthly during baseflow conditions. doi:10.1371/journal.pone.0085011.t003
In the lab, macroinvertebrates were subsampled to a minimum of 300 individuals and identified to lowest possible taxonomic unit (typically genus or species), enumerated, and measured (body length or shell width). All midges (Diptera: Chironomidae) and oligochaetes were slide-mounted for identification. To address differences in taxonomic resolution among three contractors, we lumped taxa to a common taxonomic level (e.g., genus or family) or assigned lower classification levels as appropriate (e.g., where there was only one genus in a family found at our sites). Macroinvertebrate biomass was calculated using published length-mass relationships (e.g., [47]) to generate AFDM for each taxon. Several macroinvertebrate indices were calculated for analysis. Abundance, relative abundance, richness, and biomass were calculated for Chironomidae, EPT (Ephemeroptera, Plecoptera, and Trichoptera), insects only, and all taxa. We calculated the abundance, relative abundance, and biomass of the isopod Asellidae (typically the most abundant taxon in each sample), the proportional abundance and biomass of the dominant taxon in each sample, and Shannon diversity.
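Two of the indices named above, Shannon diversity and the proportion of the dominant taxon, can be sketched as follows. This is an illustration rather than the study's code: the natural logarithm is an assumption (the text does not state the base), and the example counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts.

    The natural log is assumed; counts may be abundances or biomass.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one individual")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def percent_dominant(counts):
    """Percent of the sample made up by the single most abundant taxon."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one individual")
    return 100.0 * max(counts) / total


# Hypothetical sample dominated by one taxon (e.g., an isopod).
sample = [60, 30, 10]
print(round(shannon_diversity(sample), 3), percent_dominant(sample))
```

A sample dominated by one taxon yields a low Shannon value and a high percent dominant taxon, which is why the two metrics are expected to move in opposite directions under disturbance.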

Statistical analysis
Data analyses were performed using SAS version 9.2 (SAS Institute, Cary, NC, USA). Prior to any analysis, each analyte (e.g., water quality parameters, habitat measures, periphyton and macroinvertebrate indices) was screened for outliers using scatter plots and histograms; Box-Cox transformed for normality using SAS proc Transreg; normalized using SAS proc Standard; and analyzed using SAS proc Mixed and SAS proc HPmixed. Less than 10 percent of data were excluded as outliers at this stage. Each analyte was then evaluated separately using a simplified "screening" model including study period (Period as Before, During, or After treatment implementation), sample site (Site), and sample group (Group as Control or Experimental) with the "influence" option selected to identify suspicious data points. For the most part, we could not differentiate outliers due to human error from system noise. We believe that the outliers do more to obscure real signals than reflect actual conditions, and therefore we have omitted them from further analysis.
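The transformation and normalization steps can be illustrated in a few lines. The sketch below is not a port of SAS proc Transreg or proc Standard: it assumes that lambda has already been estimated (here it is simply passed in) and that normalization means rescaling to mean 0 and standard deviation 1. The example values are hypothetical.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox transform: (x**lam - 1) / lam, or ln(x) when lam == 0.

    Requires strictly positive data. lam is supplied by the caller here;
    estimating it (as proc Transreg does) is outside this sketch.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def standardize(values):
    """Rescale to mean 0 and sample standard deviation 1 (assumed target)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical conductivity readings, square-root-like transform (lam = 0.5).
readings = [650.0, 720.0, 810.0, 1200.0]
z = standardize(box_cox(readings, 0.5))
print([round(v, 2) for v in z])
```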
After data cleaning was completed, separate analyses were performed for each analyte to assess the responses of individual variables to treatment using SAS proc Mixed. Model parameters included Period, Site, Group, Group*Period interaction, and sampling round (Round as sampling dates grouped in 7-day windows) (Table 2). Period and Group were coded in the model as numerical, fixed main effects, Site was coded as a fixed effect nested within Group, and Round was coded as a random effect. The Group*Period interaction was used to assess the significance of treatment effects.
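The logic of the Group*Period interaction can be seen in a simple difference-of-differences contrast on cell means. The sketch below only computes that point estimate for the Before and After periods; it is not the mixed model described above (it ignores Site nesting, the random Round effect, and the During period), and the observations are hypothetical.

```python
from statistics import mean


def baci_contrast(observations):
    """Difference of differences between groups across periods.

    observations: iterable of (group, period, value) tuples with group in
    {"Control", "Experimental"} and period in {"Before", "After"}.
    Returns (Exp_After - Exp_Before) - (Ctrl_After - Ctrl_Before).
    """
    cells = {}
    for group, period, value in observations:
        cells.setdefault((group, period), []).append(value)
    m = {key: mean(vals) for key, vals in cells.items()}
    experimental_change = m[("Experimental", "After")] - m[("Experimental", "Before")]
    control_change = m[("Control", "After")] - m[("Control", "Before")]
    return experimental_change - control_change


# Hypothetical conductivity values: Control rises, Experimental stays flat.
data = [
    ("Control", "Before", 800.0), ("Control", "Before", 820.0),
    ("Control", "After", 900.0), ("Control", "After", 920.0),
    ("Experimental", "Before", 700.0), ("Experimental", "Before", 720.0),
    ("Experimental", "After", 705.0), ("Experimental", "After", 715.0),
]
print(baci_contrast(data))
```

A contrast near zero means both groups changed in parallel; a nonzero value is what a significant Group*Period term detects, whether it arises from change at the Experimental sites, the Control sites, or both.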
We used non-metric multidimensional scaling (NMS) to ordinate taxonomic data and express differences in assemblage structure across samples. Species-specific periphyton abundances and macroinvertebrate abundances and biomass (for bucket samples only) were log (x+1) transformed and extreme biomass outliers (3 samples) were removed prior to ordination. The NMS ordination was configured with the Sorensen distance measure and step down in dimensionality, and run in PC-ORD (Version 6, MjM Software Design, Gleneden Beach, OR, USA). The ordination axes were tested for fixed effects of Period, Group, Site (nested in Group), and Group*Period interactions, and random effect of Sample (month, year) using SAS proc Mixed. The combined axes for each analyte type (periphyton, macroinvertebrate abundance, and macroinvertebrate biomass) ANOVA included fixed effects for Period, Group, Group*Period, Axis, Axis*Period, and Axis*Group. Due to the ordination step, the statistical model was slightly altered, but equivalent to that used for non-ordinated data. In all cases, a significant Group*Period interaction indicated a significant effect of stormwater treatment installation (Table 2).
Figure 2. Water chemistry before, during, and after treatment for control and experimental sites. Mean (± SE) back-transformed values are reported. Conductivity (A) was sampled during seasonal biotic monitoring, and calcium (B), iron (C), and sulfate (D) were sampled during monthly baseflow water quality monitoring. P-values reflect results of ANOVA for Group*Period interaction for the strongest models (Table 3). doi:10.1371/journal.pone.0085011.g002
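The distance measure that feeds the ordination can be sketched as below. For abundance data the Sorensen measure is computed in its quantitative (Bray-Curtis) form; the base-10 logarithm is an assumption (the text does not give the base), the taxon vectors are hypothetical, and the NMS step itself is not reproduced.

```python
import math


def log_transform(abundances):
    """log10(x + 1) transform; base 10 is assumed here."""
    return [math.log10(x + 1) for x in abundances]


def sorensen_distance(a, b):
    """Quantitative Sorensen (Bray-Curtis) distance between two samples.

    a and b are equal-length vectors of (transformed) taxon abundances.
    Returns 0 for identical samples and 1 for samples sharing no taxa.
    """
    if len(a) != len(b):
        raise ValueError("samples must list the same taxa in the same order")
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical counts for three taxa at two sites.
site_a = log_transform([99, 9, 0])
site_b = log_transform([9, 9, 9])
print(round(sorensen_distance(site_a, site_b), 3))
```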
For all results, we used a P<0.05 cut-off for designating significance. Following the recommendations of Moran [48] for diverse ecological data and in the interest of maintaining detailed analyses, we did not correct for multiple comparisons (e.g., using sequential Bonferroni). Therefore, we caution the reader to tend toward a more conservative interpretation of tests that may be affected by Type I error, even for P-values <0.05. All raw data and SAS code are available in EPA's STORET database (http://www.epa.gov/storet/).
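For readers who want to apply the more conservative reading suggested above, the sequential Bonferroni (Holm) procedure can be sketched as follows. This correction was not applied in the study; the function name and the example P-values are hypothetical and serve only to show how the procedure works.

```python
def holm_reject(p_values, alpha=0.05):
    """Sequential Bonferroni (Holm) procedure.

    Returns a list of booleans, in the input order, marking which tests
    remain significant after controlling the family-wise error rate.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        # The smallest P-value is compared with alpha/m, the next with
        # alpha/(m-1), and so on; stop at the first non-significant test.
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break
    return reject


# Hypothetical family of four tests.
print(holm_reject([0.002, 0.04, 0.03, 0.2]))
```

With four tests, only the smallest P-value in this example survives correction, although three of the four fall below the uncorrected 0.05 cut-off.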

Landscape conditions, habitat, and water quality
Most of the headwater streams had a mix of gravel, cobble, and boulder substrate with high amounts of fine sediment deposition in the pools. The QHEI habitat scores reflected the mix of substrates and high proportion of riffle habitats, although most sites had suboptimal conditions due to high sediment deposition, poor vegetative protection, and moderately unstable banks (Table 3). Baseflow water quality varied considerably across sites and seasons, but on average streams had elevated nutrients (nitrate = 1.40 mg/L; total dissolved phosphorus = 0.620 mg/L), and high conductivity. Both natural (e.g., limey parent material as calcium = 99.4 mg/L) and anthropogenic (e.g., road salting and domestic wastewater as chloride = 147 mg/L) sources of ions likely contributed to high conductivity (Table 3).
There was a significant effect of treatment (rain garden and rain barrel installations) on several habitat and water quality parameters (Table 3). Conductivity (Fig. 2A), iron (Fig. 2C), and sulfate (Fig. 2D) increased in the Control sites through time, with no apparent change or a decrease (iron) over the corresponding periods in Experimental sites. Calcium concentrations were similar in Control sites Before and After installations, but decreased in Experimental sites through time (Fig. 2B). Canopy cover was lower in the During and After periods (vs. Before), corresponding to the vegetation removal that occurred at the Control sites in 2007 (Fig. 3A). The qualitative filamentous algae score was much higher in the During period than Before and After treatment at the Control sites, whereas the Experimental sites did not experience a similar fluctuation through time (Fig. 3B).
Analysis of the five individual periphyton metrics that met the statistical criteria revealed no significant treatment effects (Table 4). NMS ordination of cell densities of periphyton taxa revealed a visible shift between samples Before, During, and After treatment installations in the 3-dimensional solution (Fig. 4). Further analysis of the individual axes showed significant Period effects for all three axes separately, and a significant effect of Group (Control vs. Experimental) for axis 2 (Table 5). However, there were no significant effects of Group*Period (treatment) for any of the NMS axes separately or combined (Table 5).
We collected 189 unique macroinvertebrate taxa across all samples and sites. Assemblages were dominated by the asellid isopod Lirceus, which composed 60% of the abundance of all net samples and 39% of bucket samples. Oligochaeta worms (8.7% and 28.8%, respectively) and the chironomid Tanytarsus (5.1% and 12.6%, respectively) were the second and third most abundant taxa. Other common taxa (composing >2% of the abundance) included the chironomids Diamesa, Paratendipes, and Orthocladius, and Ostracoda crustaceans. On average (± SD), the bucket samples were composed of 29.6 ± 24.8% Chironomidae, 26.8 ± 24.5% Asellidae, and 5.2 ± 6.7% EPT taxa (Table 4).
There were few significant effects of treatment (Group*Period) on the individual macroinvertebrate abundance and biomass variables (Table 4). Insect richness (Fig. 5A) and total richness (Fig. 5B) from bucket samples tended to increase through time in the Control sites, whereas the Experimental sites had lower richness in the During period compared to Before and After. Shannon diversity (based on biomass from bucket samples) was highest in the During period for the Control sites and Before treatment for the Experimental sites (Fig. 5C). Percent dominant taxon based on biomass from bucket samples (a variable that should increase with disturbance) was lowest in the During period for the Control sites and highest During treatment for the Experimental sites (Fig. 5D). Macroinvertebrate assemblages collected from riffle habitats were distinct by Group (Control vs. Experimental) and Site (Table 5) in the ordinations of taxa abundance (Fig. 6) and biomass (Fig. 7). There was also a significant Period effect for abundance (combined axes) and biomass (axis 1 and combined axes; Table 5). Only macroinvertebrate abundance axis 3 revealed a significant effect of stormwater treatment (Group*Period, P = 0.002, Table 5).
Table 4 notes. PIBI is the Periphyton Index of Biotic Integrity [45]. 4 Macroinvertebrate variables were calculated separately for multi-habitat net samples (based on abundance data) and bucket samples in riffle habitats (represented as abundance and biomass). 5 EPT represents taxa in the orders Ephemeroptera, Plecoptera, and Trichoptera (considered sensitive to disturbance). doi:10.1371/journal.pone.0085011.t004

Stream responses to rain garden and barrel installations
As expected, the installation of rain barrels and rain gardens on 30% of the properties in the Experimental catchments resulted in very few responses in stream water quality, periphyton, and macroinvertebrate metrics relative to Control sites. The few significant results that were detected should be interpreted with caution, given the high number of comparisons and potential risk of Type I error. Despite the high number of samples, statistical power was low with only four Experimental and two Control sites, a common challenge of BACI designs [35]. Nonetheless, the few detected responses are notable given the study design and relatively small amount of stormwater runoff mitigated in the catchment.
There was a statistically significant effect of stormwater treatment on a few baseflow water quality variables, generally reflecting reduced water quality through time at Control sites. The small reduction in runoff volume from Before to After treatment [31] may have stabilized the water quality in the Experimental sites over a period of time when conditions in the Control sites were deteriorating. Rain gardens and barrels captured runoff, thereby reducing the likelihood of pollutant mobilization and transport, and potentially decreasing the total mass of pollutants delivered to streams during higher-flow storm conditions [49,50]. In addition to capturing and detaining stormwater, rain gardens can play a role in filtration of pollutants [51,52], and may contribute to overall reduced pollutant loading to streams, with some potential for improved baseflow water quality [49], although this mechanism was untested in our study.
Figure 4. [...] The 3D solution explained 71.5% of the variation, and the final stress was 19.3. doi:10.1371/journal.pone.0085011.g004
There was no significant effect of treatment on the algal community, as examined through individual periphyton metrics and ordination of cell densities by taxon. This is not surprising given the lack of a response of nutrients (nitrogen and phosphorus) to treatment, and that these systems are not significantly nutrient limited [53]. The qualitative filamentous algal score was much higher in the Control sites in the During period (June 2007 through September 2008), which corresponded to the removal of trees and shrubs within the riparian zones at both Control sites. Loss of riparian cover increased the duration and intensity of light reaching the stream, potentially also increasing stream temperature locally, and could have triggered the increase in the relative proportion of filamentous green algae in the Control sites [53,54]. The lower scores for the After period may be a reflection of algal sloughing during the After period, which had higher precipitation and flows compared to the During period.
The few significant treatment effects on the macroinvertebrate assemblage were not intuitive, and may be explained by multiple factors independent of the stormwater management. In the Control sites, there was an increase in richness and diversity through time, and lower percentage of dominant taxa in the During period. These patterns may reflect the periphyton responses, especially if increases in filamentous algae provided habitat, food, or increased nutrient uptake to support new taxa and higher diversity in the Control sites. In contrast, the Experimental sites demonstrated lower richness and diversity, and higher percentage of dominant taxa in the During period, compared to Before and After. It is possible that the reductions can be attributed to differences in low flow hydrology over the course of the study. Whereas high storm flows can directly alter macroinvertebrate communities through physical washout [55], small streams such as those in our study are more likely structured by seasonal variation in stream flow. The two Control streams (Sub4 and Sub5) had two of the three smallest catchment areas, and dried to pools nearly every summer, whereas the other sites remained perennial, which may explain why the Control sites had the lowest overall richness and diversity. However, in 2007 and 2008 (when the stormwater management devices were being installed), all six streams dried to pools in the summer. The lack of permanent flow and associated fluctuations in temperature likely resulted in loss of taxa in the Experimental sites that require flowing water, and these taxa may have already been missing from the Control sites [56,57].
Why were there so few responses to stormwater management?
Although the installation of rain gardens and rain barrels represented a widespread retrofit management effort, it is likely the number and capacity of installations were simply insufficient to elicit any response from the water quality or biotic measures. The management effort targeted runoff from rooftops and driveways on private properties, which comprised a majority (53.2%) of the total impervious area in the Shepherd Creek catchment. However, even if the 30% of properties that received treatments captured all of the runoff from rooftops and driveways on those properties (which we know was not the case), it would not reduce the effective impervious area (EIA) or DCIA in the subcatchments to below the threshold (1-14% EIA [6], 2% TIA [32]) of expected biotic impairment (See Fig. 2.2.2 in [58]). The range in impervious (11.2-19.9%) and forest (43.8-68.0%) cover across sites may have also masked detection of responses to treatments. Furthermore, our management approach did not address runoff from streets, which comprised 22.7% of the total impervious area, but had a proportionally higher amount of impervious cover directly connected to storm sewers [38]. Streets were therefore likely to have a disproportionate impact on streams, and their lack of treatment may have masked the benefits provided by the rain gardens and rain barrels. Walsh et al. [59] demonstrated that it is possible for streams with <10% imperviousness to have good ecological condition if stormwater is infiltrated throughout the catchment. Thus, it is conceivable to achieve in-stream improvements in the Shepherd Creek catchment if we (1) increase the number of properties with management practices, (2) ensure all impervious surfaces on the property are routed to rain gardens, (3) increase the capacity of management devices, and (4) mitigate runoff from streets with high proportions of connected impervious cover.
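The best-case argument above can be checked with a rough upper bound. The sketch below uses the catchment TIA (13.1%), the rooftop and driveway share of TIA (53.2%), and the treated fraction of properties (30%) from the text. It additionally assumes, purely for illustration, that treated properties hold a proportional 30% share of the rooftop and driveway area and that their runoff is captured completely.

```python
TIA_PERCENT = 13.1           # total impervious area of the catchment (from the text)
ROOF_DRIVEWAY_SHARE = 0.532  # fraction of TIA on rooftops and driveways (from the text)
TREATED_FRACTION = 0.30      # fraction of properties with installations (from the text)
TIA_THRESHOLD = 2.0          # % TIA threshold of expected impairment [32]


def best_case_remaining_tia(tia, roof_share, treated):
    """Upper-bound estimate of untreated impervious area (% of catchment).

    Assumes treated properties carry a proportional share of rooftop and
    driveway area and capture all of it, which overstates the real effect.
    """
    captured = tia * roof_share * treated
    return tia - captured


remaining = best_case_remaining_tia(TIA_PERCENT, ROOF_DRIVEWAY_SHARE, TREATED_FRACTION)
print(f"Best-case untreated impervious area: {remaining:.1f}%")
print("Below threshold:", remaining < TIA_THRESHOLD)
```

Even under these generous assumptions, roughly 11% of the catchment would remain untreated impervious surface, well above the 2% TIA threshold.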
In addition to the lack of hydrologic capacity of installed stormwater treatment devices, there are other possible reasons for the lack of water quality and biotic responses. First, there could have been an overwhelming influence of other stressors, despite the reduced stormwater volume. Although in-stream hydrology is tightly linked to water quality and biotic health in suburban and urban streams, water quality and other stressors (e.g., dispersal barriers, riparian forest loss, channelization) can shape biotic communities independently of stormwater runoff [9]. In the Shepherd Creek catchment, some of the properties have private septic tanks, and poorly maintained or malfunctioning systems can increase nutrients and bacteria, especially during low-flow conditions [60]. Road salt inputs were extremely high in the catchment, and although these can be partially mitigated by capturing runoff, salt concentrations and conductivity remained high despite restoration efforts. It is possible that these and other aquatic and terrestrial stressors were not mitigated by restoration efforts, thus preventing improvement in periphyton and macroinvertebrate assemblage integrity.
Over the seven-year study, there were many changes in the catchment unrelated to the project that may have masked responses associated with stormwater management. The Shepherd Creek project involved county and city organizations (e.g., Hamilton County Soil and Water Conservation District, Hamilton County Engineers Office, Cincinnati Metropolitan Sewer District, and Cincinnati Parks) in an effort to maximize the potential effectiveness of the project. Despite this, a few road maintenance, sewer maintenance, and tree removal projects occurred during the study period. These changes can increase variability in response variables and reduce the potential to detect improvements in the catchment.
Because tree removal (to improve road visibility in Sub4 and for management of invasive species in Sub5) occurred at the same time as installation of stormwater management devices, it was difficult to separate the causes of any biotic responses. In addition, landowners likely made changes in their landscaping, watering, and other practices independently of the project. Given that it is unrealistic to prevent these non-target changes in suburban catchments, future studies will likely have to implement more extensive improvements, matching the scope of management to the type and extent of disturbance, in order to detect a response.
The lack of detectable responses may also be explained by the high spatial and temporal variability of the biotic variables, which is typical in small, hydrologically complex urban catchments [61]. In the spatial dimension, there are large differences in macroinvertebrate [46] and periphyton [62] taxa found across habitats. By targeting sample collection to riffle habitats (macroinvertebrate bucket samples) and hard substrates (periphyton samples), our study minimized this within-reach variation, although habitat was still likely an important factor influencing differences across sites. We observed high intra- and inter-annual variability in both the periphyton and macroinvertebrate assemblages. Although this variability was accounted for in the statistical model (Round), the additional variable in the model can reduce the power to detect a response, a disadvantage of this hybrid designed-observational study with numerous unavoidable nuisance effects. Moreover, research suggests that weather variability (on a small scale) and climate variability (on a larger scale) may drive assemblages [56,57,63,64] and override responses to localized hydrologic management in some years. As mentioned before, we experienced drought conditions during the installation phase (late 2007 and early 2008), which may explain biotic responses in the During period. It is likely that the low flows, combined with other stressors (water quality, temperature, sedimentation) in the catchment, interacted in complex ways to control biotic assemblages and ultimately prevent any detectable response to stormwater management.
Whereas improvements in hydrology were expected almost immediately following restoration, subsequent improvements in water quality and biotic integrity may take much longer. This study included three years of post-restoration monitoring after the initial installations (Phase 1), and only two years of monitoring after all of the installations were complete (i.e., following Phase 2). Existing, sediment-bound pollutants may take several years to process before streams experience improved water quality from reduced loading [65], and contaminated groundwater reservoirs can maintain high levels of contaminants in streams decades after the pollutant source has been eliminated [66]. Although periphyton have relatively short life cycles and are more sensitive to short-term shifts in water quality than macroinvertebrates [39,62], their assemblages are structured by habitat and substrate, neither of which changed during our study. Similarly, macroinvertebrates may display a delayed response to increased detention of stormwater runoff because it takes time for critical resources (e.g., food, habitat) to improve [67]. Furthermore, the multi-year life cycles of many macroinvertebrates and their modes of dispersal suggest that recovery may take several years [39]. Even in studies where in-stream habitat enhancement has restored habitat diversity, some researchers have shown limited recovery of macroinvertebrates that can be explained by other, persistent stressors [67-69]. Even if in-stream conditions are suitable, aquatic invertebrates that have terrestrial adults need good riparian habitat and dispersal corridors across the landscape to persist [70].

Conclusions
Although this study represents a sizeable effort to control stormwater runoff on private properties throughout a catchment, the stream responses to the retrofit management were limited to localized responses of a few variables. These results are not surprising given the number of rain gardens and rain barrels and the capacity of these stormwater devices relative to the impervious surfaces in the catchment. There is an obvious need for additional controlled studies in which stormwater management practices are installed at higher densities and capture a greater volume of stormwater, in order to determine the extent of stormwater management necessary to improve ecosystem health. A large-scale stormwater restoration is currently underway in the Little Stringybark Creek catchment in Melbourne, Australia [27], and more studies of this scope are needed despite the logistical and financial challenges of implementing and monitoring catchment-scale restoration [26].
The focus of this paper was on the effects of stormwater management on stream water quality and biota. However, the success of stream restoration should be measured not only in terms of improved ecosystem health [71], but also in terms of other benefits that can be derived. This study took a multidisciplinary approach to designing and implementing catchment-scale management, providing economic and social benefits that extend beyond the ecosystem responses [28]. The auction revealed substantial landowner interest and the potential for mitigating stormwater at much lower costs than centralized options. There were additional contributions to ecosystem services, such as flood protection and water supply, that extend beyond stream ecosystem benefits. For example, the 165 rain barrels installed resulted in water savings where residents used them for outdoor watering that would otherwise have drawn on potable sources. The 81 rain gardens included only native plant taxa, so the installations contributed to increases in native flora and wildlife habitat in the neighborhood. Additional benefits included generating public awareness of stormwater issues and of the connection between human activities and environmental quality. Overall, it is clear that management efforts designed to mimic natural ecosystems will provide a variety of ecosystem and other benefits, yet the extent of retrofit stormwater management necessary to restore healthy streams remains to be determined.