Defra WQ0220: Draft Milestone 1 Model Evaluation

Catchment Modelling Strategies for Faecal Indicator Organisms:

Options Review and Recommendations

Project Code

WQ0220

Workpackage 1; Milestone 2

Interim Report on Literature Review Listing Sources Accessed

January 2011

Project Team

David Oliver* (Principal Author of this Report),

John Crowther**, Phil Haygarth***, Louise Heathwaite***,

David Kay**, Trevor Page***

* Stirling University; ** Aberystwyth University; *** Lancaster University

Correspondence to:-

David Kay

01570 423565 (Tel and Fax)

GLOSSARY

Black-box model: A model which does not explicitly represent the actual processes involved in converting the model input into a model output.

Calibration: The fitting of model predictions to measured data by adjusting model input parameters until some accepted criteria are met.

Continuous model: The application of a model to continuous data (e.g. long term) as opposed to discrete data (e.g. storm event).

Deterministic: A deterministic model is one of knowable outcome, i.e. its outcome can be predicted because all of its causes are either known or the same as those of a previous event.

Empirical model: A model developed on empirical observations of the system under study

Export coefficient model: A ‘black-box’ modelling approach whereby, for a given climatic regime, a particular land use class is determined to export characteristic quantities of a contaminant over a defined time period.

Fully-distributed: The attributes of the catchment being modelled are distributed throughout the landscape (e.g. via a grid).

Fuzzy model: Deals with information that is approximate rather than accurate

Grey-box model: Provides some physical process-representation but some of the processes are approximated

Lumped model: Simplification of a distributed physical system whereby processes are grouped into spatial units of similar functioning such as ‘hydrological response units’

Model evaluation: Assessment of the model with respect to its intended objectives; it may include some reporting on model structural and parameter uncertainties, and on parameter sensitivity. Model evaluation is often undertaken as part of validating whether the model is ‘fit for purpose’.

Model structure: The conceptualisation of the system under study into a model representation and numerical design.

Monte Carlo simulation: The use of repeated random sampling from a priori specified parameter distributions to generate results.

Parameterisation: The process of assigning values to parameters that represent particular processes or functions within a model structure. This can be undertaken using expert opinion, literature searches, via new experimentation and field studies or via calibration (see above). A lack of spatial or temporal data can inhibit parameterisation.

Physically-based: A model whereby the structure attempts to represent processes, such as those governing contaminant inputs, mobilisation and delivery, in a physically-meaningful and spatially distributed manner. The extent of process representation is dictated in part by the underlying hydrological model and process equations. Physically-based models can be deterministic or stochastic.

Probabilistic model: A statistical approach to estimate the probability of a given event based on historical data.

Process based: see physically-based

Semi-distributed: In contrast to a fully distributed model, different land use classes within a catchment, or within sub-catchment boundaries, are modelled simultaneously rather than as the explicitly defined individual units typical of a distributed model. In some ways this likens the approach to a lumped model, but similar land units may not be contiguous.

Stochastic: Non-deterministic behaviour involving a random element. Stochastic models aim to represent the likelihood of different outcomes given similar inputs

Uncertainty analysis: Model uncertainty can relate to parameter uncertainty and model structural uncertainty, as well as to the uncertainty associated with inputs and evaluation data. Ultimately, the characterisation, and where possible reduction, of uncertainty in model predictions should form part of the modelling process.

Validation: For the purposes of this review this term is used to mean that a model is tested as being fit for purpose rather than as being truly valid.

White-box model: A model representation of a system where all necessary information is known and available
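The export coefficient and Monte Carlo entries in this glossary can be illustrated together with a minimal sketch. It assumes a simple area-weighted sum as the export calculation and a lognormal a priori distribution around each coefficient; the land use classes, coefficient values, units, spread and function names are all hypothetical placeholders chosen for illustration and are not taken from FIOSA or any other model discussed in this report.

```python
import random

# Hypothetical export coefficients (organisms per hectare per year).
# Illustrative placeholders only; NOT values from FIOSA or the literature.
EXPORT_COEFFICIENTS = {
    "improved_pasture": 1.0e11,
    "rough_grazing": 1.0e10,
    "woodland": 1.0e9,
}


def catchment_export(areas_ha, coefficients):
    """Black-box step: sum (area x coefficient) over land use classes."""
    return sum(areas_ha[lu] * coefficients[lu] for lu in areas_ha)


def monte_carlo_export(areas_ha, coefficients, spread=0.5, n=1000, seed=0):
    """Repeatedly sample each coefficient from an assumed a priori lognormal
    distribution (log-space sigma = spread) and recompute the export."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(n):
        sampled = {
            lu: c * rng.lognormvariate(0.0, spread)
            for lu, c in coefficients.items()
        }
        results.append(catchment_export(areas_ha, sampled))
    return sorted(results)


# Hypothetical catchment land use areas in hectares.
areas = {"improved_pasture": 200.0, "rough_grazing": 500.0, "woodland": 300.0}

nominal = catchment_export(areas, EXPORT_COEFFICIENTS)
runs = monte_carlo_export(areas, EXPORT_COEFFICIENTS)

# Report the nominal estimate and an approximate 5th to 95th percentile range.
low, high = runs[int(0.05 * len(runs))], runs[int(0.95 * len(runs))]
print(f"Nominal export: {nominal:.3g}")
print(f"Approximate 5-95% range: {low:.3g} to {high:.3g}")
```

The sketch shows why such an approach is described as ‘black-box’: the only catchment information entering the calculation is the area of each land use class, so no pathway or process can be distinguished in the output.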

INTRODUCTION

Defra has utilised export coefficient models to characterise faecal indicator organism (FIO) flux at the catchment scale and to determine FIO source apportionment through the FIOSA project (Kay, Anthony et al. 2010). FIOSA comprises an empirically-based, but ‘black-box’, modelling approach. While useful, export coefficient approaches provide only a limited disaggregation of process understanding at the catchment scale. The growing requirement under Article 11 of the Water Framework Directive (WFD) for the design of ‘programmes of measures’ to prevent impairment of Annex IV ‘protected areas’ (including bathing and shellfish harvesting waters) is generating an imperative for the development of more ‘white-box’, or process-based, modelling capacity. This is needed in order to differentiate the specific (spatial) effects of land management practices when combined with catchment responses to hydrological drivers at relevant timescales. In turn, this will underpin the requirement for, and design of, remediation strategies (particularly in livestock farming areas) to facilitate integrated management of diffuse and point-source FIO fluxes.

The adoption or development of a modelling approach for diffuse pollution should always consider a number of critical factors. Most notably these should include: a clear statement of data needs (both for process representation and constraint, and for model evaluation purposes); an awareness of the importance of spatial and temporal scales for model predictions; and an evaluation of the uncertainty in model predictive capability. The other major consideration should be the type of modelling approach to use (Oliver, Heathwaite et al. 2009), that is, where on the scale ranging from simplistic black-box modelling through to highly complex physically-based modelling the chosen approach will need to function.

A number of process-based modelling platforms for predicting catchment contributions to watercourse pollution have been developed over the last few decades, particularly for prediction of nitrate and phosphate loading of surface waters (see review by Merritt for assessment of modelling platforms for sediment and sediment-associated nutrients). In contrast, model development for FIOs is less mature, a direct consequence of the limited extent of the scientific evidence base for, and ‘historical’ interest in, microbial contaminants relative to phosphorus (P) and nitrogen (N). That said, given the looming requirements of the WFD, the need has never been greater to consider existing modelling platforms (developed for other agricultural pollutants) in order to evaluate their suitability for accommodating a microbial sub-routine. Indeed, excellent progress has already been made in the initial development of such microbial submodels, and some prior reporting of comparative model evaluations already exists in the literature (e.g. Borah and Bera 2003; Borah and Bera 2004; Coffey, Cummins et al. 2007). Here, we build on this existing review material but, crucially, we also extend the evaluation of modelling platforms to a key cluster of tools developed specifically for UK application to agricultural pollution prediction. This does not imply a preference for UK-specific models but simply recognises: (i) that they have not previously been considered in published literature appraising microbiological modelling capability, an omission this review addresses; and (ii) that there may be more efficient uptake by UK regulators through bolting different modelling components on to existing UK models.

Importantly, the research team fully appreciates that we should not assume that existing model structures are right for UK needs with regard to prediction of microbial watercourse pollution; they may not be, and if not this raises difficult issues surrounding the ease with which existing model structures (and associated code) can be adapted. In evaluating existing model frameworks it will be paramount to keep in mind the purpose of this review. The aim is to consider the range of modelling frameworks currently available and to propose, after a thorough and balanced interrogation, a selection of the most suitable contenders for potential development to accommodate a microbial submodel. The ultimate selection must be undertaken in conjunction with the Project Advisory Group.

THE SUITABILITY OF KEY INDICATORS FOR A MODEL PLATFORM

Given that the focus of this review is to consider the extension of modelling platforms to accommodate a microbial submodel it is important to highlight the key requirements of catchment scale process-based models of agricultural pollution for FIO suitability. Each of the models evaluated in this document will be compared against the following defined ‘model needs’ to provide a yardstick for model comparison:

-Hydrological representation: The driving routine that underpins the majority of models of diffuse agricultural pollution is grounded in hydrology. A hydrological model therefore accommodates a suite of flow-governing equations that deliver distinct stages of a model simulation. For example, one of the most frequently used modules is a subroutine for calculating surface runoff. This needs to account for variation in land use type, topography, soil type, vegetation cover, precipitation and land management practice (e.g. manure applications, livestock grazing etc). Through these submodel components, process-based models attempt to represent, albeit in a simplified way, some of the physical rules and processes observed under real-world scenarios, including surface runoff, subsurface flow and channel flow. Flow routing is indisputably a critical element in models designed to predict diffuse pollution impacts on receiving waters. One of the first models claimed to have successfully integrated all submodules necessary for catchment chemical hydrology was the Stanford Watershed Model (SWM). A derivative of SWM is the Hydrological Simulation Program – FORTRAN (HSPF). European equivalents of a comprehensive catchment model include the Système Hydrologique Européen (SHE), which has been succeeded by MIKE SHE (a catchment-scale, physically based, spatially distributed model for water flow and sediment transport). In some cases many more processes are represented, and this in turn can lead to the creation of highly complex model structures whose components have no quantitative equivalent in current field measurements. Within this review we are particularly interested in assessing the potential of models to simulate the capture of faecally-derived microbial pollutants via hydrological processes and their subsequent routing through the catchment drainage network. Thus, where evident, we will make clear the potential for hydrological submodels to entrain sporadic livestock excretions (sources of FIOs) and likewise their feasibility for representing in-stream processing of FIOs.

-Time-step: The temporal resolution of the model routines is extremely important for capturing the dynamic response of event-driven pollutants such as FIOs. Logically, monthly time-steps would appear to be inappropriate for accurate capture of any water quality response to short-term rainfall events; a daily, if not sub-daily, routine would appear to be needed. The time-step for FIOs is governed by the likely exceedance periods and in our view these can be very short for bathing waters and shellfish harvesting waters; hourly resolution would therefore be the ideal. This will be explored within the body of this review.

-Spatial-scale: The over-arching remit of evaluating modelling capability to underpin prediction and design of remediation strategies (particularly in livestock farming areas), and so facilitate integrated management of diffuse and point-source FIO fluxes, dictates a catchment-scale approach. However, it will be important to weigh arbitrary 1 km² gridded distributed models against models that delineate hydrological response units (HRUs), or the equivalent, based on common landscape functionality. How important is such delineation? Such issues are discussed by Lane et al. (2009) and incorporated in the body of this review.

-Diffuse and point source contributions: In addition to diffuse-source FIO inputs to stream loadings, there will need to be some consideration of how point-source FIO inputs are accounted for within the catchment context. Point sources will be numerous and varied in type (e.g. wastewater treatment works, farmyards, leaking septic tanks). Of particular interest will be the scope for modelling platforms to account for farmyard FIO contributions, yet at the same time we need to be critical in assessing what data are actually available on farmyard contributions to constrain the model. It will be important to understand how variable such data may be. It is likely that such point sources will prove complex to accommodate within any model structure because of the variability of farmyards associated with different agricultural enterprises.

-Ability to represent lifecycle processes (such as regrowth and die-off) within model parameterisation: Many diffuse pollutants are non-conservative because of uptake by plants or their potential for degradation. For FIOs the model will need to be able to account for cell die-off and regrowth potential within different catchment matrices. Storage and release within the catchment will likewise need to be accounted for, though this is probably much more important for P than for FIOs.

-Recognition of in-stream processes: Given that the overall aim of the review is to evaluate the predictive capability of modelling platforms for FIOs, it is important to account for all of the key sources and sinks of microbial pollutants, one of which is stream-bed sediment. The ability to include in-stream processing within modelling routines will be considered. There is evidence in the literature that streambed storage is important (Muirhead, Davies-Colley et al. 2004; Cho, Pachepsky et al. 2010).

-Ability to account for mitigation impacts: Most models should be able to account for changes in the catchment that relate to management interventions through the alteration of parameter values. Any reference to trialled modelling of mitigation measures within existing modelling platforms will be highlighted within this review.

-Licensing (cost): The likelihood of licensing requirements presenting a hindrance to model development will be considered. Collaborating with those who hold the licence is clearly one option to circumvent those problems, but this does not pave the way for the open-access web-based platform that is favoured by Defra. An open-source web-based approach would encourage the development of any modelling platform and allow a more rapid evolution of the model by the research community. Clearly this preference may restrict the applicability of some of the existing models.
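Two of the model needs listed above, the time-step and the representation of die-off, can be illustrated with a minimal sketch. It assumes first-order (exponential) die-off, which is a common simplification and not a formulation prescribed by this review, and it ignores regrowth. The rate constant, initial load, hourly concentration series and exceedance threshold are all hypothetical values chosen for illustration only.

```python
import math


def surviving_fio(n0, k_per_day, days):
    """Assumed first-order die-off: N(t) = N0 * exp(-k * t). No regrowth."""
    return n0 * math.exp(-k_per_day * days)


def t90_days(k_per_day):
    """Days needed for a 90% reduction under first-order die-off."""
    return math.log(10.0) / k_per_day


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Hypothetical die-off rate (per day) and initial load; not from the review.
K = 0.5
N0 = 1.0e6
remaining = surviving_fio(N0, K, t90_days(K))  # one T90 leaves about 10% of N0

# Hypothetical hourly concentrations for one day:
# a low baseline interrupted by a two-hour storm-driven peak.
hourly = [100.0] * 10 + [5000.0] * 2 + [100.0] * 12
THRESHOLD = 1000.0  # hypothetical exceedance threshold, illustration only

# What an hourly time-step reveals versus what a daily mean reveals.
hours_exceeding = sum(1 for c in hourly if c > THRESHOLD)
daily_mean_exceeds = mean(hourly) > THRESHOLD

print(f"Remaining after one T90: {remaining:.3g}")
print(f"Hours above threshold: {hours_exceeding}")
print(f"Daily mean above threshold: {daily_mean_exceeds}")
```

In this illustrative series the hourly values exceed the threshold during the peak while the daily mean does not, which is the kind of short-lived exceedance that a coarse time-step would average away.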

LITERATURE SEARCH STRATEGY

There are a considerable number of model platforms from around the world and the following text provides a brief summary of those identified as having greatest potential for further development with regard to microbial prediction. Web of Knowledge was the principal engine for the literature search, using combinations of key words as shown in Table 1. This text summary has been condensed into two accompanying Excel summary tables for each model, and an initial reference bank has been created within EndNote for further exploitation (containing 230 FIO [or diffuse pollution related] catchment-scale modelling references). This document therefore serves as a precursor to a more detailed evaluation of the input parameters the models require in order to function (see linked Excel spreadsheet).

Table 1: Search criteria used in combination within Web of Knowledge

SUMMARY OF MODEL PLATFORMS FOR CONSIDERATION

PSYCHIC: PSYCHIC is a process-based catchment scale model of P and sediment transfers developed in the UK. Specifically, it models P and suspended sediment mobilisation in land runoff and their subsequent delivery to watercourses (Davison, Withers et al. 2008). It is packaged as a decision support system and operates through the coupling of hydrological and land management information. The PSYCHIC platform offers end-users a dual scale application allowing for catchment scale prediction using nationally available datasets, but also harnessing more detailed user-supplied information for field-scale utility. A variety of transfer pathways are accounted for and include: release of desorbable soil P, detachment of suspended sediment (SS) and associated particulate P, incidental P losses from manure and fertiliser applications, losses from hard standings, artificial drainage routings, point sources and surface runoff.

A number of caveats are apparent when PSYCHIC is operated at the catchment scale. For example, the model is not programmed to account for bank erosion as a contributor of P loading (nor for in-stream processing of P). PSYCHIC uses a monthly time-step, and the spatial scale of operation allows for the accumulation of 1 km² (or smaller where possible) spatial data that the model combines with management information derived from Agricultural Census returns and relevant survey responses.

As with any process-based model, PSYCHIC has a number of data requirements in order to function. These are outlined briefly in the following list (though a considerable number are considered practical constraints on uptake of PSYCHIC as a modelling platform of choice, largely because of licensing issues):

Manure Management Database (held by ADAS): combines agricultural census data with the survey of fertiliser practice;
MAGPIE (held by ADAS): used in PSYCHIC as a database of land use type, the number of livestock per ha of managed grass and the proportion of each crop grown per ha;
NATMAP (held by NSRI): series-based, 1 km² spatial data set of % coverage of each soil series;
National Soil Inventory (held by NSRI): physical properties associated with each of the soil series under different land use conditions;
HOST (held by CEH/NSRI): classification of the soils of the UK;
DEM;
Census data (number of people per km²);
Climate Surface (held by Climate Research Unit, UEA): climate attributes including rainfall, rain days, wind speed, sun hours, maximum temperature, minimum temperature;
Drainage density (river network; CEH);
Index of proximity to surface water (connectivity).

Clearly, a number of datasets are the property of research institutions and/or consultancies, and the model code itself is held by ADAS and is not open access. The EA do have a PSYCHIC 1 version (Neil Preedy is the contact at the EA who would know such details)[1].