Fall Run Calibration 2021 • fallRunDSM

Objective

We are using genetic algorithms, a type of stochastic optimization, to calibrate the following model coefficients:

..surv_adult_enroute_int used in submodel surv_adult_enroute
..surv_adult_prespawn_int used in submodel surv_adult_prespawn
..surv_egg_to_fry_int used in submodel surv_egg_to_fry
..surv_juv_rear_int used in submodel surv_juv_rear
..surv_juv_rear_contact_points used in submodel surv_juv_rear
..surv_juv_rear_prop_diversions used in submodel surv_juv_rear
..surv_juv_rear_total_diversions used in submodel surv_juv_rear
..surv_juv_bypass_int used in submodel surv_juv_bypass
..surv_juv_delta_int used in submodel surv_juv_delta
..surv_juv_delta_contact_points used in submodel surv_juv_delta
..surv_juv_delta_total_diverted used in submodel surv_juv_delta
..surv_juv_outmigration_sj_int used in submodel surv_juv_outmigration_san_joaquin
..ocean_entry_success_int used in submodel ocean_entry_success

A detailed description for each of these parameters can be found in the Fall Run Documentation Website

We are restricting the calibration to watersheds with Grandtab data, these are:

watershed
Upper Sacramento River
Butte Creek
Clear Creek
Deer Creek
Mill Creek
Feather River
Yuba River
American River
Cosumnes River
Mokelumne River
Merced River
Stanislaus River
Tuolumne River

Packages

library(GA)
library(fallRunDSM)
library(DSMCalibrationData)
library(tidyverse)

We are using the GA R package to implement our genetic algorithms search to calibrate the above intercepts for the fallRunDSM::fall_run_model() function of the fallRunDSM R package.

Additionally, we are using DSMCalibrationData to access calibration related data.

Calibrating Data

GrandTab

We will be minimizing the error between GrandTab estimated escapement data for the years 1998-2017 to natural spawners predicted by the DSM. We are further scaling this error by a weight representing the amount of observed data available for a given watershed. Given that GrandTab includes both hatchery and natural spawners, we will scale GrandTab escapement values using an estimated proportion natural spawners for each watershed. This will allow us to compare predicted natural spawners to psuedo GrandTab natural spawners.

For calibration, we have prepared the GrandTab data in two ways:

DSMCalibrationData::grandtab_observed - This data will be used in the fitness function to measure the difference between model predictions and observed escapement. Missing values are NA and we made all records less than 100 NA to account for a lack of confidence for counts less than 100.
DSMCalibrationData::grandtab_imputed - This data will be used to calculate the number of juveniles during the 20 year simulation. The GrandTab data is incomplete for many watersheds during the 20 year period of calibration. For watersheds with no GrandTab data, we used 40 as the default escapement value. For watersheds with incomplete data, we used the mean escapement value.

Observed Values

Imputed Values

Selected Proxy Years

The data inputs to the DSM are for years 1980-1999. We selected proxy years for 1998-2017 from the 1980-1999 model inputs by comparing the DWR water year indices in this analysis.

These inputs are generated by calling DSMCalibrationData::set_synth_years(params), with a params object from the life cycle model ex: fallRunDSM::params. See the documentation for more details on the DSMCalibrationData R package.

Spawning Inputs Proxy Years

Calibration Year	Selected Proxy Year
1997	1997
1998	1998
1999	1999
2000	2000
2001	1987
2002	1985
2003	2000
2004	1985
2005	1980
2006	1995
2007	1989
2008	1994
2009	1981
2010	1985
2011	1997
2012	1985
2013	1987
2014	1992
2015	1992
2016	1985
2017	1998

Rearing Inputs Proxy Years

Calibration Year	Selected Proxy Year
1998	1998
1999	1999
2000	2000
2001	1987
2002	1985
2003	2000
2004	1985
2005	1980
2006	1995
2007	1989
2008	1994
2009	1981
2010	1985
2011	1997
2012	1985
2013	1987
2014	1992
2015	1992
2016	1985
2017	1998

Fitness Function

We are minimizing the sum of squared errors of simulated natural adults and grandtab adults scaled using simulated proportion natural, normalized by mean escapement. The purpose of normalizing is to prevent watersheds with large escapement values from dominating the calibration. In addition to normalizing we are adding weights to each error term corresponding to the amount data available at that watershed. Doing so will penalize watersheds with more data whose error term is large, but be less aggressive with watersheds with less data.

All intercepts and watersheds that need calibration are described in the Objective section of this document.

Notes:

..surv_adult_enroute_int and ..ocean_entry_success_int are vectors and expanded within the fitness function arguments.
R integer data type is a 32 bit signed integer which can only count to ~2 billion. During calibration, some estimates of juveniles and adults will exceed this limit and cause errors. We run the prediction within the fitness function inside a tryCatch block. If an integer is produced that is too large we return a large loss value, this will ensure that this iteration will not be selected for future generations.

# Fitness Function ------------------
fall_run_fitness <- function(
  known_adults,
  seeds,
  params,
  surv_adult_enroute,
  surv_adult_prespawn,
  surv_egg_to_fry,
  bypass_surv_juv,
  upsac_surv_juv,
  butte_surv_juv,
  clear_surv_juv,
  deer_surv_juv,
  mill_surv_juv,
  sac_surv_juv,
  feather_and_bear_surv_juv,
  yuba_surv_juv,
  american_surv_juv,
  deltatribs_surv_juv,
  moke_surv_juv,
  merced_surv_juv,
  stan_surv_juv,
  tuol_surv_juv,
  sj_surv_juv,
  surv_juv_rear_contact_points,
  surv_juv_rear_prop_diversions,
  surv_juv_rear_total_diversions,
  surv_juv_bypass_int,
  surv_juv_delta_int,
  surv_juv_delta_contact_points,
  surv_juv_delta_total_diverted,
  surv_juv_outmigration_sj_int,
  default_ocean_entry_surv,
  upsac_ocean_entry_surv,
  butte_ocean_entry_surv,
  deer_ocean_entry_surv,
  mill_ocean_entry_surv,
  midsactribs_ocean_entry_surv,
  yuba_ocean_entry_surv,
  american_ocean_entry_surv,
  deltatribs_ocean_entry_surv,
  moke_ocean_entry_surv,
  merced_ocean_entry_surv,
  stan_ocean_entry_surv,
  tuol_ocean_entry_surv) {

  params_init <- params

  params_init$..surv_adult_enroute_int = surv_adult_enroute
  params_init$..surv_adult_prespawn_int = surv_adult_prespawn
  params_init$..surv_egg_to_fry_int = surv_egg_to_fry
  params_init$..surv_juv_rear_int = c(`Upper Sacramento River` = upsac_surv_juv,
                                      `Antelope Creek` = deer_surv_juv,
                                      `Battle Creek` = deer_surv_juv,
                                      `Bear Creek` = deer_surv_juv,
                                      `Big Chico Creek` = deer_surv_juv,
                                      `Butte Creek` = butte_surv_juv,
                                      `Clear Creek` = clear_surv_juv,
                                      `Cottonwood Creek` = deer_surv_juv,
                                      `Cow Creek` = deer_surv_juv,
                                      `Deer Creek` = deer_surv_juv,
                                      `Elder Creek` = deer_surv_juv,
                                      `Mill Creek` = mill_surv_juv,
                                      `Paynes Creek` = deer_surv_juv,
                                      `Stony Creek` = deer_surv_juv,
                                      `Thomes Creek` = deer_surv_juv,
                                      `Upper-mid Sacramento River` = sac_surv_juv,
                                      `Sutter Bypass` = bypass_surv_juv,
                                      `Bear River` = feather_and_bear_surv_juv,
                                      `Feather River` = feather_and_bear_surv_juv,
                                      `Yuba River` = yuba_surv_juv,
                                      `Lower-mid Sacramento River` = sac_surv_juv,
                                      `Yolo Bypass` = bypass_surv_juv,
                                      `American River` = american_surv_juv,
                                      `Lower Sacramento River` = sac_surv_juv,
                                      `Calaveras River` = deltatribs_surv_juv,
                                      `Cosumnes River` = deltatribs_surv_juv,
                                      `Mokelumne River` = moke_surv_juv,
                                      `Merced River` = merced_surv_juv,
                                      `Stanislaus River` = stan_surv_juv,
                                      `Tuolumne River` = tuol_surv_juv,
                                      `San Joaquin River` = sj_surv_juv)
  params_init$..surv_juv_rear_contact_points = surv_juv_rear_contact_points
  params_init$..surv_juv_rear_prop_diversions = surv_juv_rear_prop_diversions
  params_init$..surv_juv_rear_total_diversions = surv_juv_rear_total_diversions
  params_init$..surv_juv_bypass_int = surv_juv_bypass_int
  params_init$..surv_juv_delta_int = surv_juv_delta_int
  params_init$..surv_juv_delta_contact_points = surv_juv_delta_contact_points
  params_init$..surv_juv_delta_total_diverted = surv_juv_delta_total_diverted
  params_init$..surv_juv_outmigration_sj_int = surv_juv_outmigration_sj_int
  params_init$..ocean_entry_success_int = c(
    `Upper Sacramento River` = upsac_ocean_entry_surv,
    `Antelope Creek` = default_ocean_entry_surv,
    `Battle Creek` = default_ocean_entry_surv,
    `Bear Creek` = default_ocean_entry_surv,
    `Big Chico Creek` = default_ocean_entry_surv,
    `Butte Creek` = butte_ocean_entry_surv,
    `Clear Creek` = default_ocean_entry_surv,
    `Cottonwood Creek` = default_ocean_entry_surv,
    `Cow Creek` = default_ocean_entry_surv,
    `Deer Creek` = deer_ocean_entry_surv,
    `Elder Creek` = default_ocean_entry_surv,
    `Mill Creek` = mill_ocean_entry_surv,
    `Paynes Creek` = default_ocean_entry_surv,
    `Stony Creek` = default_ocean_entry_surv,
    `Thomes Creek` = default_ocean_entry_surv,
    `Upper-mid Sacramento River` = default_ocean_entry_surv,
    `Sutter Bypass` = default_ocean_entry_surv,
    `Bear River` = midsactribs_ocean_entry_surv,
    `Feather River` = midsactribs_ocean_entry_surv,
    `Yuba River` = yuba_ocean_entry_surv,
    `Lower-mid Sacramento River` = default_ocean_entry_surv,
    `Yolo Bypass` = default_ocean_entry_surv,
    `American River` = american_ocean_entry_surv,
    `Lower Sacramento River` = default_ocean_entry_surv,
    `Calaveras River` = deltatribs_ocean_entry_surv,
    `Cosumnes River` = deltatribs_ocean_entry_surv,
    `Mokelumne River` = moke_ocean_entry_surv,
    `Merced River` = merced_ocean_entry_surv,
    `Stanislaus River` = stan_ocean_entry_surv,
    `Tuolumne River` = tuol_ocean_entry_surv,
    `San Joaquin River` = default_ocean_entry_surv)


  keep <- c(1,6,7,10,12,19,20,23,26:30)
  num_obs <- rowSums(!is.na(known_adults[keep, 6:20]))
  total_obs <- sum(!is.na(known_adults[keep, 6:20]))
  weights <- num_obs / total_obs


  tryCatch({
    preds <- fall_run_model(mode = "calibrate",
                            seeds = seeds,
                            stochastic = FALSE,
                            ..params = params_init)

    known_nats <- known_adults[keep, 6:20] * (1 - params_init$proportion_hatchery[keep])
    mean_escapent <-rowMeans(known_nats, na.rm = TRUE)

    sse <- sum(((preds[keep,] - known_nats)^2 * weights)/mean_escapent, na.rm = TRUE)

    return(sse)
  },
  error = function(e) return(1e12),
  warning = function(w) return(1e12)
  )
}

Run GA Algorithm

params <- DSMCalibrationData::set_synth_years(fallRunDSM::params)

res <- GA::ga(type = "real-valued",
          fitness =
            function(x) -fall_run_fitness(
              known_adults = DSMCalibrationData::grandtab_observed$fall,
              seeds = DSMCalibrationData::grandtab_imputed$fall,
              params = params,
              x[1], x[2], x[3], x[4], x[5], x[6], x[7], x[8], x[9], x[10],
              x[11], x[12], x[13], x[14], x[15], x[16], x[17], x[18], x[19],
              x[20], x[21], x[22], x[23], x[24], x[25], x[26], x[27], x[28],
              x[29], x[30], x[31], x[32], x[33], x[34], x[35], x[36], x[37],
              x[38], x[39], x[40]
            ),
          lower = c(2.5, rep(-3.5, 39)),
          upper = rep(3.5, 40),
          popSize = 150,
          maxiter = 10000,
          run = 50,
          parallel = TRUE,
          pmutation = .4)

Calibration Settings

The ga function allows for several paramaters to be set before calibrating a model. Here we list the reasoning for the values passed into the ga functon above.

lower - sets the lower bound possible for each of the parameters we are trying to calibrate. All parameters used in the model exist within a inverse logit function therefore limiting the genetic algorithms space of search greatly improves its speed. We also explicitly limit ..surv_adult_enroute as a positive value to keep the GA from quickly converging to a state where all fish are dead.
upper - sets the upper bound possible for each of the parameters we are trying to calibrate. The same reasoning as lower, but we add no special limitiation to ..surv_adult_enroute
popSize - sets population size for each iteration of the GA. We found 150 to be a good balance between breadth and speed of the search.
maxiter - sets the number of iterations the GA will run for. We set this value large enough so that the run argument would be determining factor as to when the GA stopped.
run - we set this value to 50 as it allowed the algorithm to avoid local minimum but also complete the calibration within 3-4 hours.
pmutation - set the probability of mutation, we used this value to explore the solution space more aggresively once we had a starting position from a previous run.

The algorithm is always run in parallel.

Calibration Results

The results for calibration are below:

Spawning Adults

..surv_adult_enroute_int = 3.3660081
..surv_adult_prespawn_int = -1.061446

Egg to Fry

..surv_egg_to_fry_int = 3.4048167

Rearing Survival Intercepts

Note: Deer Creek survival intercept was used for all Upper Sacramento Tributaries not listed below

Intercepts listed below are supplied into the model as a vector ..surv_juv_int

Bypass (Yolo and Sutter) = -0.7230985
Upper Sacramento River = 2.9687302
Butte Creek = -0.6094086
Clear Creek = 2.8237986
Deer Creek = -0.2438628
Mill Creek = -0.2663291
Sacramento River = 1.9282397
Feather and Bear River = -0.701983
Yuba River = 0.5618062
American River = -1.4780065
Calaveras and Cosumnes River = 1.8642124
Mokelumne River = -0.4910794
Merced River = 1.885828
Stanislaus River = -0.797111
Tuolumne River = -1.8340336
San Joaquin River = -1.2950271
..surv_juv_rear_contact_points = 0.3317422
..surv_juv_rear_prop_diversions = -2.5818296
..surv_juv_rear_total_diversions = -3.1670594

Migration Intercepts

..surv_juv_bypass_int = -2.2108365
..surv_juv_delta_int = -2.0264052
..surv_juv_delta_contact_points = -1.4272058
..surv_juv_delta_total_diverted = 1.0292423
..surv_juv_outmigration_sj_int = -0.4343738

Ocean Entry Success

Intercepts listed below are supplied into the model as a vector ..ocean_entry_success_int

Default Ocean Entry = 2.6768012
Upper Sacramento River = 2.9739999
Butte Creek = -0.9551862
Deer Creek = -1.056046
Mill Creek = -0.5287509
Bear and Feather River = 0.1730387
Yuba River = -1.6229621
American River = -1.8852218
Calaveras and Cosumnes River = 1.7441138
Mokelumne River = 0.2561206
Merced River = -1.6259521
Stanislaus River = -0.2705412
Tuolumne River = -0.2540344

Default Ocean Entry is used for all watersheds not listed above