Member-By-Member postprocessor module

This module contains Data postprocessors classes applying the postprocessing algorithm detailed in

Bert Van Schaeybroeck and Stéphane Vannitsem. Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141 (688):807–818, 2015. URL: https://doi.org/10.1002/qj.2397.

Basically, it corrects a provided Data object \(\mathcal{D}_{p,n,m,v} (t)\) provided by applying the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(P\) is the total number of predictors (given by the attribute number_of_predictors of \(\mathcal{D}\)). In the formula above:

The Data object \(\mathcal{D}_{p,n,m,v} (t)\) is assumed to contain all the predictors used to correct the variable needed, the first predictor being by convention the variable itself: \(\mathcal{D}_{0,n,m,v} (t)\).
\(\mu^{\rm ens}_{p,n,v} (t)\) is the Data.ensemble_mean of the Data object \(\mathcal{D}_{p,n,m,v} (t)\).
\(\tau_{v} (t)\) is a multiplicative correction applied to each member of the first predictor of the Data.centered_ensemble \(\bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\).
\(\mathcal{D}^C_{n,m,v} (t)\) is the corrected Data object, with an ensemble member-by-member correction.

Content

Several different postprocessors are available, with different strategy to determine the coefficients \(\alpha_{v} (t)\), \(\beta_{p,v} (t)\) and \(\tau_{v}\):

BiasCorrection: A simple correction of the bias.
EnsembleMeanCorrection: A correction of the ensemble mean.
EnsembleSpreadScalingCorrection: A correction of the ensemble mean and a correction of the ensemble spread through a multiplicative scaling.
EnsembleSpreadScalingAbsCRPSCorrection: A correction of the ensemble mean and a correction of the ensemble spread, with a tuning of the parameters to minimize the Absolute norm CRPS score Data.Abs_CRPS().
EnsembleSpreadScalingNgrCRPSCorrection: A correction of the ensemble mean and a correction of the ensemble spread, with a tuning of the parameters to minimize the Non-homogeneous Gaussian Regression (NGR) CRPS score Data.Ngr_CRPS().
EnsembleAbsCRPSCorrection: A correction of the ensemble mean and a correction of the ensemble spread with spread scaling and nudging. In addition, the parameters are tuned to minimize the Absolute norm CRPS score Data.Abs_CRPS().
EnsembleNgrCRPSCorrection: A correction of the ensemble mean and a correction of the ensemble spread with spread scaling and nudging. In addition, the parameters are tuned to minimize the Non-homogeneous Gaussian Regression (NGR) CRPS score Data.Ngr_CRPS().

Postprocessors

We now detail these postprocessors:

class postprocessors.MBM.BiasCorrection[source]

Bases: postprocessors.MBM.EnsembleMeanCorrection

A postprocessor to correct the bias. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(\alpha_{v} (t) = \left\langle \mathcal{O}_{n,v} (t) - \mu^{\rm ens}_{0,n,v} (t) \right\rangle_n\) with \(\mathcal{O}_{n,v} (t)\) is the observation training set and \(\mu^{\rm ens}_{0,n,v} (t)\) is the ensemble mean of the first predictor of the training set, i.e. the ensemble_mean of the predictor to correct. \(\bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\) is the centered_ensemble of the first predictor. We have also

\(\beta_{0,v} (t) = 1\)
\(\beta_{p>0,v} (t) = 0\)
\(\tau_{v} (t) = 1\)

for all \(v\) and \(t\).

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.BiasCorrection()
>>> postprocessor.train(past_observations, reforecasts)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the bias
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters \n'+r'($\gamma_{1,0}$ is $\tau_{0}$)')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-1_00.png — (png, hires.png, pdf)

../../_images/MBM-1_01.png — (png, hires.png, pdf)

../../_images/MBM-1_02.png — (png, hires.png, pdf)

train(observations, predictors, forecast_init_time=None, **kwargs)[source]

Method to train the postprocessor with an observation and predictors training set. Once trained, the postprocessor parameters are stored in the parameter_list.

Parameters

observations (Data) – A Data object containing the observation training set.
predictors (Data) – A Data object containing the predictors training set. Must be broadcastable with the observations.
forecast_init_time (int or datetime, optional) – The time at which the predictors generation begin (the (re)forecasts initial time). If None, assume that this time is zero (00Z).

class postprocessors.MBM.EnsembleAbsCRPSCorrection[source]

Bases: postprocessors.MBM.EnsembleSpreadScalingCorrection

A postprocessor to correct the ensemble mean and scale the spread of the ensemble. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \left(\gamma_{1,v} (t) + \gamma_{2,v} (t) / \delta_{0, n, v} (t) \right) \, \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(\delta_{0, n, v} (t)\) is the average over the ensemble members of the ensemble members distance (delta). The parameters \(\alpha_{v} (t)\), \(\beta_{v} (t)\), \(\gamma_{1,v} (t)\) and \(\gamma_{2, v} (t)\) are optimized at each lead time to minimize the absolute norm CRPS (Abs_CRPS()) of \(\mathcal{D}^C_{n,m,v} (t)\) with the observations \(\mathcal{O}\). This postprocessor must thus be trained with a training set composed of past ensembles \(\mathcal{D}_{p,n,m,v}\) and observations \(\mathcal{O}\). If needed, the parameters initial conditions for the minimization are constructed using the values given by the EnsembleSpreadScalingCorrection, dependding on the minimizer and options being chosen.

The parameters \(\gamma_{1,v} (t)\) and \(\gamma_{2,v} (t)\) control respectively the scaling of the ensemble spread and its nudging toward the climatological variance for the long lead times.

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleAbsCRPSCorrection()
>>> postprocessor.train(past_observations, reforecasts, ntrial=10)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-2_00.png — (png, hires.png, pdf)

../../_images/MBM-2_01.png — (png, hires.png, pdf)

../../_images/MBM-2_02.png — (png, hires.png, pdf)

train(observations, predictors, forecast_init_time=None, num_threads=None, ntrial=1, **kwargs)[source]

Method to train the postprocessor with an observation and predictors training set. Once trained, the postprocessor parameters are stored in the parameter_list.

Parameters

observations (Data) – A Data object containing the observation training set.
predictors (Data) – A Data object containing the predictors training set. Must be broadcastable with the observations.
forecast_init_time (int or datetime, optional) – The time at which the predictors generation begin (the (re)forecasts initial time). If None, assume that this time is zero (00Z).
num_threads (None or int, optional) – The number of threads used to compute the parameters. One thread computes a chunk of the given grid points and lead times. If None, uses the maximum number of cpu cores available. Default to None.
ntrial (int) – Number of parameters initial conditions to try to find the best solution. Not used if the minimizer being chosen is a global one. Default to 1.
kwargs (dict) – Dictionary of options to pass to the scipy.optimize.minimize() function.

class postprocessors.MBM.EnsembleMeanCorrection[source]

Bases: core.postprocessor.PostProcessor

A postprocessor to correct the ensemble mean. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(\alpha_{v} (t) = \left\langle \mathcal{O}_{n,v} (t) - \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) \right\rangle_n\) with \(\mathcal{O}_{n,v} (t)\) the observation training set and \(\mu^{\rm ens}_{p,n,v} (t)\) the ensemble mean of the predictors of the training set. \(\bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\) is the centered_ensemble of the first predictor. We have also

\[\beta_{p,v} (t) = \sum_{p_1=1}^P \, {\rm Cov}^{\rm obs}_{p_1, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}] (t) \, {\rm Cov}^{\rm obs}_{p_1, p_2, v} [\mu^{\rm ens}, \mu^{\rm ens}] (t)\]

where \({\rm Cov}^{\rm obs}_{p_1, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}]\) is the ensemble_mean_observational_covariance() with the observation training set \(\mathcal{O}_{n,v} (t)\), and \({\rm Cov}^{\rm obs}_{p_1, p_2, v} [\mu^{\rm ens}, \mu^{\rm ens}]\) is the ensemble_mean_observational_self_covariance. Finally, we have \(\tau_{v} (t) = 1\) for all \(v\) and \(t\).

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleMeanCorrection()
>>> postprocessor.train(past_observations, reforecasts)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters \n'+r'($\gamma_{1,0}$ is $\tau_{0}$)')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-3_00.png — (png, hires.png, pdf)

../../_images/MBM-3_01.png — (png, hires.png, pdf)

../../_images/MBM-3_02.png — (png, hires.png, pdf)

__call__(predictors, predictor_offset=0, proc_time_offset=0, shift_parameters=0, interpolate_offset=None, init_params=None)[source]

Actual postprocessing of new predictors. Return the corrected first predictor.

Parameters

predictors (Data) – Data object with the predictors to postprocess. The method will get their timestamps and interpolate the parameters for the timestamps that are missing.
predictor_offset (int, optional) – Allow to skip the first predictor_offset values of the given predictors. Default to zero.
proc_time_offset (int, optional) – Indicate at which time in hours the processor start with respect to the start of the forecast. Default to zero.
interpolate_offset (None or int, optional) – Allow to start the interpolation before the first data point, and specify at which time (in hour after the start of the forecast). None to disable. Default to None.
init_params (None or list(int) or list(Data), optional) – Specify the initial parameters of the interpolation if interpolate_offset is set. None to disable. Must be set if interpolate_offset is set. Default to None.

get_interpolated_parameters(predictors, predictor_offset=0, proc_time_offset=0, interpolate_offset=None, init_params=None)[source]

Method to get the postprocessor parameters interpolated to fix missing timesteps. Assumes that the parameters evolve “smoothly” with the timesteps to fill the gaps.

Parameters

predictors (Data) – Data object with the predictors to postprocess. The method will get their timestamps and interpolate the parameters for the timestamps that are missing.
predictor_offset (int, optional) – Allow to skip the first predictor_offset values of the given predictors. Default to zero.
proc_time_offset (int, optional) – Indicate at which time in hours the processor start with respect to the start of the forecast. Default to zero.
interpolate_offset (None or int, optional) – Allow to start the interpolation before the first data point, and specify at which time (in hour after the start of the forecast). None to disable. Default to None.
init_params (None or list(int) or list(Data), optional) – Specify the initial parameters of the interpolation if interpolate_offset is set. None to disable. Must be set if interpolate_offset is set. Default to None.

plot_interpolated_parameters(predictors, proc_kwargs, timestamps=False, variable='all', raw_param_kwargs=None, ax=None, grid_point=None, **kwargs)[source]

Method to plot the postprocessor parameters interpolated to fix missing timesteps obtained by the method get_interpolated_parameters().

Parameters

predictors (Data) – Data object with the predictors to postprocess. The method will get their timestamps and interpolate the parameters for the timestamps that are missing.
proc_kwargs (dict) – Dictionary of arguments to pass to the get_interpolated_parameters() method.
timestamps (array_like, optional) – Timestamps to pass to the plotting routine. Supersedes the timestamps of the predictors. If False, use the timestamps of the predictors. Default to False.
variable (str, slice, list or int, optional) – Allow to select for which variable to plot the parameters. Default to ‘all’.
raw_param_kwargs (dict) – Dictionary of arguments to pass to the ~.Data.plot method for the raw (not interpolated) parameters.
ax (Axes, optional) – An axes on which to plot.
grid_point (tuple(int, int), optional) – If the data are fields, specifies the parameters of which grid point to plot.
kwargs (dict) – Dictionary of arguments to pass to the ~.Data.plot method for the interpolated parameters.

Returns

ax – An axes where the parameters were plotted.

Return type

Axes

train(observations, predictors, forecast_init_time=None, **kwargs)[source]

Method to train the postprocessor with an observation and predictors training set. Once trained, the postprocessor parameters are stored in the parameter_list.

Parameters

observations (Data) – A Data object containing the observation training set.
predictors (Data) – A Data object containing the predictors training set. Must be broadcastable with the observations.
forecast_init_time (int or datetime, optional) – The time at which the predictors generation begin (the (re)forecasts initial time). If None, assume that this time is zero (00Z).

class postprocessors.MBM.EnsembleNgrCRPSCorrection[source]

Bases: postprocessors.MBM.EnsembleAbsCRPSCorrection

A postprocessor to correct the ensemble mean and scale the spread of the ensemble. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \left(\gamma_{1,v} (t) + \gamma_{2,v} (t) / \delta_{0, n, v} (t) \right) \, \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(\delta_{0, n, v} (t)\) is the average over the ensemble members of the ensemble members distance (delta). The parameters \(\alpha_{v} (t)\), \(\beta_{v} (t)\), \(\gamma_{1,v} (t)\) and \(\gamma_{2,v} (t)\) are optimized at each lead time to minimize Non-homogeneous Gaussian Regression (NGR) CRPS (Abs_Ngr()) of \(\mathcal{D}^C_{n,m,v} (t)\) with the observations \(\mathcal{O}\). This postprocessor must thus be trained with a training set composed of past ensembles \(\mathcal{D}_{p,n,m,v}\) and observations \(\mathcal{O}\). If needed, the parameters initial conditions for the minimization are constructed using the values given by the EnsembleSpreadScalingCorrection, dependding on the minimizer and options being chosen.

The parameters \(\gamma_{1,v} (t)\) and \(\gamma_{2,v} (t)\) control respectively the scaling of the ensemble spread and its nudging toward the climatological variance for the long lead times.

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleNgrCRPSCorrection()
>>> postprocessor.train(past_observations, reforecasts, ntrial=10)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-4_00.png — (png, hires.png, pdf)

../../_images/MBM-4_01.png — (png, hires.png, pdf)

../../_images/MBM-4_02.png — (png, hires.png, pdf)

class postprocessors.MBM.EnsembleSpreadScalingAbsCRPSCorrection[source]

Bases: postprocessors.MBM.EnsembleSpreadScalingCorrection

A postprocessor to correct the ensemble mean and scale the spread of the ensemble. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

The parameters \(\alpha_{v} (t)\), \(\beta_{v} (t)\) and \(\tau_{v} (t)\) are optimized at each lead time to minimize the absolute norm CRPS (Abs_CRPS()) of \(\mathcal{D}^C_{n,m,v} (t)\) with the observations \(\mathcal{O}\). This postprocessor must thus be trained with a training set composed of past ensembles \(\mathcal{D}_{p,n,m,v}\) and observations \(\mathcal{O}\). If needed, the parameters initial conditions for the minimization are constructed using the values given by the EnsembleSpreadScalingCorrection, dependding on the minimizer and options being chosen.

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleSpreadScalingAbsCRPSCorrection()
>>> postprocessor.train(past_observations, reforecasts, ntrial=10)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters \n'+r'($\gamma_{1,0}$ is $\tau_{0}$)')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-5_00.png — (png, hires.png, pdf)

../../_images/MBM-5_01.png — (png, hires.png, pdf)

../../_images/MBM-5_02.png — (png, hires.png, pdf)

train(observations, predictors, forecast_init_time=None, num_threads=None, ntrial=1, **kwargs)[source]

Method to train the postprocessor with an observation and predictors training set. Once trained, the postprocessor parameters are stored in the parameter_list.

Parameters

observations (Data) – A Data object containing the observation training set.
predictors (Data) – A Data object containing the predictors training set. Must be broadcastable with the observations.
forecast_init_time (int or datetime, optional) – The time at which the predictors generation begin (the (re)forecasts initial time). If None, assume that this time is zero (00Z).
num_threads (None or int, optional) – The number of threads used to compute the parameters. One thread computes a chunk of the given grid points and lead times. If None, uses the maximum number of cpu cores available. Default to None.
ntrial (int) – Number of parameters initial conditions to try to find the best solution. Not used if the minimizer being chosen is a global one. Default to 1.
kwargs (dict) – Dictionary of options to pass to the scipy.optimize.minimize() function.

class postprocessors.MBM.EnsembleSpreadScalingCorrection[source]

Bases: postprocessors.MBM.EnsembleMeanCorrection

A postprocessor to correct the ensemble mean and scale the spread of the ensemble. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

where \(\alpha_{v} (t) = \left\langle \mathcal{O}_{n,v} (t) - \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) \right\rangle_n\) with \(\mathcal{O}_{n,v} (t)\) the observation training set and \(\mu^{\rm ens}_{p,n,v} (t)\) the ensemble mean of the predictors of the training set. \(\bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\) is the centered_ensemble of the first predictor. We have also

\[\beta_{p,v} (t) = \sum_{p_1=1}^P \, {\rm Cov}^{\rm obs}_{p_1, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}] (t) \, {\rm Cov}^{\rm obs}_{p_1, p_2, v} [\mu^{\rm ens}, \mu^{\rm ens}] (t)\]

where \({\rm Cov}^{\rm obs}_{p_1, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}]\) is the ensemble_mean_observational_covariance() with the observation training set \(\mathcal{O}_{n,v} (t)\), and \({\rm Cov}^{\rm obs}_{p_1, p_2, v} [\mu^{\rm ens}, \mu^{\rm ens}]\) is the ensemble_mean_observational_self_covariance. Finally, we have

\[\tau_{v} (t) = \left\langle \sigma^{\rm{ens}}_{n,v} (t)^2 \right\rangle_n^{-1} \, \left\{ \sigma^{\rm obs}_{v} (t)^2 - \sum_{p=1}^P \, \beta_{p,v} (t) \, {\rm Cov}^{\rm obs}_{p, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}] (t) \right\}\]

for all \(n\), which scales the spread of the ensemble over the lead time \(t\). In this formula, \(\sigma^{\rm{ens}}_{n,v} (t)^2\) and \(\sigma^{\rm obs}_{v} (t)^2\) stands respectively for the ensemble variance (ensemble_var) and for the observational variance (observational_var) of \(\mathcal{D}\).

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleSpreadScalingCorrection()
>>> postprocessor.train(past_observations, reforecasts)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters \n'+r'($\gamma_{1,0}$ is $\tau_{0}$)')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-6_00.png — (png, hires.png, pdf)

../../_images/MBM-6_01.png — (png, hires.png, pdf)

../../_images/MBM-6_02.png — (png, hires.png, pdf)

train(observations, predictors, forecast_init_time=None, **kwargs)[source]

Method to train the postprocessor with an observation and predictors training set. Once trained, the postprocessor parameters are stored in the parameter_list.

Parameters

observations (Data) – A Data object containing the observation training set.
predictors (Data) – A Data object containing the predictors training set. Must be broadcastable with the observations.
forecast_init_time (int or datetime, optional) – The time at which the predictors generation begin (the (re)forecasts initial time). If None, assume that this time is zero (00Z).

class postprocessors.MBM.EnsembleSpreadScalingNgrCRPSCorrection[source]

Bases: postprocessors.MBM.EnsembleSpreadScalingAbsCRPSCorrection

A postprocessor to correct the ensemble mean and scale the spread of the ensemble. Correct according to the formula:

\[\mathcal{D}^C_{n,m,v} (t) = \alpha_{v} (t) + \sum_{p=1}^P \beta_{p,v} (t) \, \mu^{\rm ens}_{p,n,v} (t) + \tau_{v} (t) \bar{\mathcal{D}}^{\rm ens}_{0,n,m,v} (t)\]

The parameters \(\alpha_{v} (t)\), \(\beta_{v} (t)\) and \(\tau_{v} (t)\) are optimized at each lead time to minimize the Non-homogeneous Gaussian Regression (NGR) CRPS (Abs_Ngr()) of \(\mathcal{D}^C_{n,m,v} (t)\) with the observations \(\mathcal{O}\). This postprocessor must thus be trained with a training set composed of past ensembles \(\mathcal{D}_{p,n,m,v}\) and observations \(\mathcal{O}\). If needed, the parameters initial conditions for the minimization are constructed using the values given by the EnsembleSpreadScalingCorrection, dependding on the minimizer and options being chosen.

parameters_list

The list of training parameters \(\alpha_{v} (t)\), \(\beta_{n,v} (t)\), \(\tau_{v} (t)\), as Data objects. This list is empty until the first train() method call.

Type: list(Data)

Examples

>>> from pp_test.test_data import test_sin
>>> import postprocessors.MBM as MBM
>>> import matplotlib.pyplot as plt
>>>
>>> # Loading random sinusoidal observations and reforecasts
>>> past_observations, reforecasts = test_sin(60., 3., number_of_observation=200)
>>> past_observations.index_shape
(1, 200, 1, 1, 21)
>>> reforecasts.index_shape
(1, 200, 10, 1, 21)
>>>
>>> # Training the postprocessor
>>> postprocessor = MBM.EnsembleSpreadScalingNgrCRPSCorrection()
>>> postprocessor.train(past_observations, reforecasts, ntrial=10)
>>>
>>> # Loading new random sinusoidal observations and forecasts
>>> observations, forecasts = test_sin(60., 3.)
>>> corrected_forecasts = postprocessor(forecasts)  # Correcting the new forecasts
>>>
>>> # Plotting the forecasts
>>> ax = observations.plot(global_label='observations', color='tab:green')
>>> ax = forecasts.plot(ax=ax, global_label='raw forecasts', mfc=None, mec='tab:blue',
... ls='', marker='o', ms=3.0)
>>> ax = corrected_forecasts.plot(ax=ax, global_label='corrected forecasts', mfc=None,
... mec='tab:orange', ls='', marker='o', ms=3.0)
>>> t = ax.set_ylabel('temperature [C°]')
>>> t = ax.set_xlabel('date')
>>> t = ax.legend()
>>> t = ax.set_title('Example of corrected forecasts')
>>>
>>> # Plotting the postprocessing parameters
>>> ax = postprocessor.plot_parameters()
>>> t = ax.set_title('Postprocessing parameters \n'+r'($\gamma_{1,0}$ is $\tau_{0}$)')
>>>
>>> # Computing the CRPS score
>>> raw_crps = forecasts.CRPS(observations)
>>> corrected_crps = corrected_forecasts.CRPS(observations)
>>>
>>> # Plotting the CRPS score
>>> ax = raw_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0],
... global_label='Raw forecast CRPS', color='tab:blue')
>>> ax = corrected_crps.plot(timestamps=postprocessor.parameters_list[0].timestamps[0, 0], ax=ax,
... global_label='Corrected forecast CRPS', color='tab:orange')
>>> t = ax.set_ylabel('CRPS [C°]')
>>> t = ax.set_xlabel('time [hour]')
>>> t = ax.set_title('CRPS score')
>>> t = ax.legend()
>>> plt.show()

(Source code)

../../_images/MBM-7_00.png — (png, hires.png, pdf)

../../_images/MBM-7_01.png — (png, hires.png, pdf)

../../_images/MBM-7_02.png — (png, hires.png, pdf)