Data object module
The Data object is the backbone of the library: it makes it easy to compute the different quantities needed to postprocess forecasts.
Basically, it stores a NumPy ndarray whose first 5 dimensions each have a dedicated and fixed meaning. These first axes of the data represent:
Axis 0th: predictor number (\(p\))
Axis 1st: observation/realization number (\(n\))
Axis 2nd: ensemble member number (\(m\))
Axis 3rd: variable or label number (\(v\)) [Not used/implemented for the moment!]
Axis 4th: lead time (\(t\))
These first 5 dimensions are called the data index (see Data.index_shape).
As such, they represent the data as a multi-dimensional array \(\mathcal{D}_{p,n,m,v} (t)\), where \(t\) is the lead time.
The extra dimensions possibly trailing in the array are the intrinsic dimensions of the data itself.
For instance, an array with 7 dimensions in total represents 2D data (e.g. fields).
If only 5 dimensions are present in total, then the data is a scalar.
The main operations of the Data object are broadcast over these extra dimensions, making the data object directly compatible with multi-dimensional forecast data.
Examples
Here is an example showing how the Data object works:
>>> import numpy as np
>>> from core.data import Data
>>> a = np.random.randn(2, 3, 10, 1, 60, 20, 20)
>>> data = Data(a)
>>> data.number_of_predictors
2
>>> data.number_of_observations
3
>>> data.number_of_members
10
>>> data.number_of_variables
1
>>> data.number_of_time_steps
60
>>> data.shape
(20, 20)
>>> data.index_shape
(2, 3, 10, 1, 60)
Notes
The methods of the Data object return another Data object whenever possible. If the output cannot be formatted according to the shape described above, an ndarray is returned. For example, matrices derived from the data are returned as NumPy arrays.
By convention, if a method or operation reduces the data over one of the index axes, the corresponding axis is kept with length 1 (its only index being zero) to preserve the index shape of the object. For example, for a Data object \(\mathcal{D}_{p,n,m,v} (t)\) of index_shape (P, N, M, V, T), the Data.ensemble_max property returns \(\max_m \mathcal{D}_{p,n,m,v} (t)\) as a Data object of index_shape (P, N, 1, V, T). E.g.:
>>> import numpy as np
>>> from core.data import Data
>>> a = np.random.randn(2, 3, 10, 1, 60, 20, 20)
>>> data = Data(a)
>>> data.index_shape
(2, 3, 10, 1, 60)
>>> maxi = data.ensemble_max
>>> maxi.index_shape
(2, 3, 1, 1, 60)
In the following, in such a case we will use the notation \(\mathcal{D}_{p,n,v} (t) \equiv \mathcal{D}_{p,n,0,v} (t)\).
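The length-1 axis convention mirrors what NumPy calls keepdims. The following is a minimal sketch using plain NumPy arrays only (it does not use the Data class, and the variable names are illustrative). It shows why keeping the reduced axis is convenient: the result still broadcasts against the original array.

```python
import numpy as np

# Plain array with index shape (P=2, N=3, M=4, V=1, T=5).
a = np.arange(2 * 3 * 4 * 1 * 5, dtype=float).reshape(2, 3, 4, 1, 5)

# Reduce over the ensemble member axis (axis 2) but keep it with length 1.
maxi = a.max(axis=2, keepdims=True)

# Because the axis was kept, the reduced array broadcasts against the original.
anomaly = a - maxi
```

Here maxi has shape (2, 3, 1, 1, 5), and anomaly has the same shape as a.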
Missing values in Data objects can be marked as numpy.nan. The various averages, summations and other reductions automatically ignore the missing values. As a consequence, these averages and summations include fewer terms. For example, if an ensemble member value is missing at one lead time, the ensemble mean at that lead time is computed over the remaining members only.
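As an illustration of this convention, the sketch below reproduces the behaviour with NumPy's NaN-aware reductions on a plain array. It is an assumption that the library relies on functions such as numpy.nanmean internally; the example only shows the arithmetic.

```python
import numpy as np

# Index shape (P=1, N=1, M=3, V=1, T=2): 3 members, 2 lead times.
# The third member is missing at the first lead time.
a = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [np.nan, 6.0]]).reshape(1, 1, 3, 1, 2)

# NaN-aware mean over the member axis (axis 2), keeping the axis.
mean = np.nanmean(a, axis=2, keepdims=True)

# Number of members actually used at each lead time.
count = np.sum(~np.isnan(a), axis=2, keepdims=True)
```

At the first lead time the mean is taken over 2 members, at the second over all 3.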
References
- DATA-GR07
Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007. URL: https://doi.org/10.1198/016214506000001437.
- DATA-GRWIG05
Tilmann Gneiting, Adrian E Raftery, Anton H Westveld III, and Tom Goldman. Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review, 133(5):1098–1118, 2005. URL: https://doi.org/10.1175/MWR2904.1.
- DATA-Her00
Hans Hersbach. Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting, 15(5):559–570, 2000. URL: https://doi.org/10.1175/1520-0434(2000)015%3C0559:DOTCRP%3E2.0.CO;2.
- DATA-VSV15
Bert Van Schaeybroeck and Stéphane Vannitsem. Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688):807–818, 2015. URL: https://doi.org/10.1002/qj.2397.
Warning
Several properties and definitions inside the Data object are not yet fixed or well-defined. Usages and standards might still evolve.
- class core.data.Data(data=None, metadata=None, timestamps=None, dtype=<class 'numpy.float64'>)[source]
Bases: object
Main data structure of the library.
- Parameters
data – The data array. If not None, should be an array whose first 5 axes follow the data index layout described above. Default to None.
timestamps (None or ndarray(datetime) or list(ndarray(datetime))) – The timestamps of the forecast data. Can be a 1D ndarray of datetime timestamps (one per lead time). In that case, the same timestamps vector is attributed to all the predictors and the observations provided by data. Can also be a list of 1D ndarray of datetime timestamps (one list entry per observation), which allows one to set different timestamps for each observation/realization. If None, no timestamp is set. Default to None.
metadata (object or ndarray(object) or list(ndarray(object))) – Object(s) describing the metadata of the data (not implemented yet). Can be an object, an ndarray of objects (one per observation/realization), or a list of 1D ndarray of objects (one list entry per predictor, one array component per observation/realization). If a single array is provided, it can be of shape (number_of_predictors, number_of_observations) to specify the metadata of each predictor and observation/realization separately. It can also be a 1D array for which each component corresponds to an observation/realization. In this case, the same metadata object is used for each predictor. Default to the None object.
dtype (dtype) – The data type of the data being stored. Default to numpy.float64.
- timestamps
The timestamps of the data, stored as an ndarray of datetime with a shape corresponding to (number_of_predictors, number_of_observations).
- Abs_CRPS(other)[source]
Return the Absolute norm CRPS scores with another Data object \(\mathcal{O}\) (typically containing observations). This score is computed with the analytical formula:
\({\rm CRPS}^{\rm Abs}_{p,v} (t) = \left\langle d^{\rm ens}_{p,n,v} [\mathcal{O}] (t) - \delta_{p,n,v} (t) /2 \right\rangle_n\)
where \(d^{\rm{ens}}_{p,n,v} [\mathcal{O}]\) is the ensemble_distance() with the observations object \(\mathcal{O}\), and \(\delta_{p,n,v}\) is delta, obtained by taking the average of the ensemble_members_distance over the ensemble members. See [DATA-VSV15] and [DATA-GR07] for more details.
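The analytical formula above can be sketched with plain NumPy arrays. This is not the library's implementation: the function name abs_crps is hypothetical, the average defining \(\delta\) is assumed to run over all pairs of members (including identical pairs), and missing values are not handled.

```python
import numpy as np

def abs_crps(ens, obs):
    """Sketch of the absolute norm CRPS (hypothetical helper).

    ens has index shape (P, N, M, V, T, ...), obs has (P, N, 1, V, T, ...).
    """
    # Ensemble distance: mean over members (axis 2) of |D - O|.
    d_ens = np.abs(ens - obs).mean(axis=2, keepdims=True)
    # Pairwise member distances |D_m1 - D_m2|, shape (P, N, M, M, V, T, ...).
    pair = np.abs(np.expand_dims(ens, 3) - np.expand_dims(ens, 2))
    # delta: average over both member indices, then restore a length-1 axis.
    delta = np.expand_dims(pair.mean(axis=(2, 3)), 2)
    # Average over the observations (axis 1).
    return (d_ens - delta / 2).mean(axis=1, keepdims=True)

# Two members (0 and 2) and a single observation equal to 1.
ens = np.array([0.0, 2.0]).reshape(1, 1, 2, 1, 1)
obs = np.ones((1, 1, 1, 1, 1))
score = abs_crps(ens, obs)
```

In this small case the ensemble distance is 1 and \(\delta\) is 1, so the score is 1 - 1/2 = 0.5.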
- CRPS(other)[source]
Return the CRPS scores with another Data object \(\mathcal{O}\) (typically containing observations). This score is computed according to [DATA-Her00] (see pp. 563-564).
- CRPS_decomposition(other)[source]
Return the decomposition of CRPS scores with another Data object \(\mathcal{O}\) (typically containing observations) according to the formula:
\({\rm CRPS}_{p,v} (t) = {\rm Reli}_{p,v} (t) - {\rm Resol}_{p,v} (t) + {\rm Unc}_{p,v} (t)\)
where \({\rm Reli}_{p,v} (t)\), \({\rm Resol}_{p,v} (t)\) and \({\rm Unc}_{p,v} (t)\) are respectively the reliability, the resolution and the uncertainty contribution to the CRPS. See [DATA-Her00], p. 565 for more details.
- CRPS_relipot(other)[source]
Return the decomposition of CRPS scores with another Data object \(\mathcal{O}\) (typically containing observations) according to the formula:
\({\rm CRPS}_{p,v} (t) = {\rm Reli}_{p,v} (t) + {\rm CRPS}^{\rm pot}_{p,v} (t)\)
where \({\rm Reli}_{p,v} (t)\) and \({\rm CRPS}^{\rm pot}_{p,v} (t)\) are respectively the reliability and potential CRPS, i.e. the CRPS one would obtain with a perfectly reliable ensemble. See [DATA-Her00], p. 564 for more details.
- Ngr_CRPS(other)[source]
Return the Non-homogeneous Gaussian Regression (NGR) CRPS scores with another Data object \(\mathcal{O}\) (typically containing observations). This score is computed with the analytical formula:
\({\rm CRPS}^{\rm Ngr}_{p,v} (t) = \left\langle\sigma^{\rm ens}_{p,n,v} (t) \left(z_{p,n,v}(t)(2\Phi(z_{p,n,v}(t)) -1) + 2\phi(z_{p,n,v}(t)) - \pi^{-1/2}\right)\right\rangle_n\)
where \(\phi\) is the standard normal probability density function, \(\Phi\) is its cumulative distribution function and \(z_{p,n,v} (t) = \left(\mathcal{O}_{p,n,v} (t) -\mu^{\rm{ens}}_{p,n,v} (t)\right)/\sigma^{\rm{ens}}_{p,n,v} (t)\) is the standardized error with respect to the other data (where \(\mu^{\rm{ens}}\) and \(\sigma^{\rm{ens}}\) are respectively the ensemble_mean and the ensemble_std). See [DATA-VSV15] and [DATA-GRWIG05] for more details.
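A minimal NumPy sketch of this formula is given below. The function name ngr_crps is hypothetical, the standard deviation is assumed to be the population one (ddof=0), which the library may or may not use, and neither missing values nor a zero ensemble spread are handled.

```python
import math
import numpy as np

# Element-wise error function built from the standard library.
_erf = np.vectorize(math.erf, otypes=[float])

def ngr_crps(ens, obs):
    """Sketch of the NGR (Gaussian) CRPS (hypothetical helper).

    ens has index shape (P, N, M, V, T, ...), obs has (P, N, 1, V, T, ...).
    """
    mu = ens.mean(axis=2, keepdims=True)       # ensemble mean
    sigma = ens.std(axis=2, keepdims=True)     # ensemble std (ddof=0 assumed)
    z = (obs - mu) / sigma                     # standardized error
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)   # phi(z)
    cdf = 0.5 * (1.0 + _erf(z / np.sqrt(2.0)))         # Phi(z)
    score = sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / np.sqrt(np.pi))
    # Average over the observations (axis 1).
    return score.mean(axis=1, keepdims=True)

# Two members (0 and 2): mean 1, std 1. Observation equal to the mean.
ens = np.array([0.0, 2.0]).reshape(1, 1, 2, 1, 1)
obs = np.ones((1, 1, 1, 1, 1))
score = ngr_crps(ens, obs)
```

With \(z = 0\) and \(\sigma^{\rm ens} = 1\), the score reduces to \(2\phi(0) - \pi^{-1/2}\).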
- append_members(data)[source]
Append the members of another Data object to the current ones (i.e. along the 2nd axis).
- Parameters
data (Data) – The data object of the members to append. Must be compatible/broadcastable. If the initial Data object is empty, simply copy the data object.
- append_observations(data)[source]
Append the observations of another Data object to the current ones (i.e. along the 1st axis). Alias for append_realizations().
- Parameters
data (Data) – The data object of the observations to append. Must be compatible/broadcastable. If the initial Data object is empty, simply copy the data object.
- append_predictors(data)[source]
Append the predictors of another Data object to the current ones (i.e. along the 0th axis).
- Parameters
data (Data) – The data object of the predictors to append. Must be compatible/broadcastable. If the initial Data object is empty, simply copy the data object.
- append_realizations(data)[source]
Append the realizations of another Data object to the current ones (i.e. along the 1st axis).
- Parameters
data (Data) – The data object of the realizations to append. Must be compatible/broadcastable. If the initial Data object is empty, simply copy the data object.
- bias(other)[source]
Return the bias \(\left\langle\mu^{\rm{ens}}_{p,n,v} (t) - \mathcal{O}_{p,n,v} (t)\right\rangle_n\) with another Data object \(\mathcal{O}\) (typically containing observations).
- property centered_ensemble
Returns an ensemble centered on its ensemble_mean: \(\bar{\mathcal{D}}^{\rm ens}_{p,n,m,v} (t) = \mathcal{D}_{p,n,m,v} (t) - \mu^{\rm ens}_{p,n,v} (t)\).
- Type
- property centered_observation
Returns an ensemble centered on its observational_mean: \(\bar{\mathcal{D}}^{\rm obs}_{p,n,m,v} (t) = \mathcal{D}_{p,n,m,v} (t) - \mu^{\rm obs}_{p,m,v} (t)\).
- Type
- copy()[source]
Return a (shallow) copy of the Data object.
- Returns
A copy of the Data object.
- Return type
- property delta
Average over the ensemble members of the ensemble_members_distance: \(\delta_{p,n,v} (t) = \left\langle d^{\rm{MBM}}_{p,n,m_1,m_2,v} (t) \right\rangle_{m_1, m_2}\)
- Type
- ensemble_distance(other)[source]
Return the averaged distance between the ensemble members and another Data object: \(d^{\rm{ens}}_{p,n,v} [\mathcal{O}] (t) = \langle|\mathcal{D}_{p,n,m,v} (t)- \mathcal{O}_{p,n,m,v} (t)|\rangle_m\) where \(\mathcal{O}\) is the other Data object.
- property ensemble_max
Ensemble maximum over the ensemble index \(m\): \(\max_m \mathcal{D}_{p,n,m,v} (t)\).
- Type
- property ensemble_mean
Ensemble mean. Mean over the ensemble index \(m\): \(\mu^{\rm{ens}}_{p,n,v} (t) = \langle \mathcal{D}_{p,n,m,v} (t) \rangle_m\).
- Type
- ensemble_mean_MSE(other)[source]
Return the Mean Square Error of the ensemble mean \(\left\langle\left(\mu^{\rm{ens}}_{p,n,v} (t) - \mathcal{O}_{p,n,v} (t)\right)^2\right\rangle_n\) with another Data object \(\mathcal{O}\) (typically containing observations).
- ensemble_mean_RMSE(other)[source]
Return the Root Mean Square Error of the ensemble mean \(\sqrt{\left\langle\left(\mu^{\rm{ens}}_{p,n,v} (t) - \mathcal{O}_{p,n,v} (t)\right)^2\right\rangle_n}\) with another Data object \(\mathcal{O}\) (typically containing observations).
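The bias, MSE and RMSE of the ensemble mean share the same building blocks: an ensemble mean over axis 2 followed by an average over the observations (axis 1). The sketch below shows this with plain NumPy arrays. The function name is hypothetical and missing values are not handled.

```python
import numpy as np

def ensemble_mean_scores(ens, obs):
    """Sketch returning (bias, MSE, RMSE) of the ensemble mean.

    ens has index shape (P, N, M, V, T, ...), obs has (P, N, 1, V, T, ...).
    """
    mu = ens.mean(axis=2, keepdims=True)          # ensemble mean
    err = mu - obs                                # error of the ensemble mean
    bias = err.mean(axis=1, keepdims=True)        # average over observations
    mse = (err**2).mean(axis=1, keepdims=True)
    return bias, mse, np.sqrt(mse)

# Two observations with two members each: ensemble means are 2 and 4.
ens = np.array([[1.0, 3.0],
                [2.0, 6.0]]).reshape(1, 2, 2, 1, 1)
obs = np.ones((1, 2, 1, 1, 1))
bias, mse, rmse = ensemble_mean_scores(ens, obs)
```

The errors of the ensemble mean are 1 and 3, giving a bias of 2 and an MSE of 5.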
- ensemble_mean_observational_covariance(other)[source]
Observational covariance matrix of the ensemble mean with another Data object \(\mathcal{O}\): \({\rm Cov}^{\rm obs}_{p_1, p_2, v} [\bar{\mathcal{O}}^{\rm obs}, \mu^{\rm ens}] (t)= \left\langle \left\langle\bar{\mathcal{O}}^{\rm obs}_{p_1,n,m,v} (t)\right\rangle_m \, \, \bar{\mu}^{\rm ens}_{p_2,n,v} (t) \right\rangle_n\)
where \(\bar{\mu}^{\rm ens}_{p,n,v} (t) = \mu^{\rm ens}_{p,n,v}(t) - \langle \mu^{\rm ens}_{p,n',v}(t) \rangle_{n'}\) and where \(\mu^{\rm ens}_{p,n,v}(t)\) is the ensemble_mean. \(\bar{\mathcal{O}}^{\rm obs}_{p,n,m,v} (t)\) is the centered_observation of the other Data object.
- property ensemble_mean_observational_self_covariance
Ensemble mean observational covariance matrix: \({\rm Cov}^{\rm obs}_{p_1, p_2, v} [\mu^{\rm ens}, \mu^{\rm ens}] (t)= \left\langle \bar{\mu}^{\rm ens}_{p_1,n,v} (t) \, \bar{\mu}^{\rm ens}_{p_2,n,v} (t) \right\rangle_n\)
where \(\bar{\mu}^{\rm ens}_{p,n,v} (t) = \mu^{\rm ens}_{p,n,v}(t) - \langle \mu^{\rm ens}_{p,n',v}(t) \rangle_{n'}\) and where \(\mu^{\rm ens}_{p,n,v}(t)\) is the ensemble_mean.
- Type
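The covariance matrix over the predictor indices \(p_1, p_2\) can be sketched with numpy.einsum. This is an illustration on plain arrays under the assumption that the average over \(n\) is a simple mean (division by N). The function name and the output layout (P, P, V, T, ...) are hypothetical.

```python
import numpy as np

def ensemble_mean_self_covariance(ens):
    """Sketch of the observational covariance of the ensemble mean.

    ens has index shape (P, N, M, V, T, ...); returns (P, P, V, T, ...).
    """
    mu = ens.mean(axis=2)                          # (P, N, V, T, ...)
    mu_c = mu - mu.mean(axis=1, keepdims=True)     # centered over observations
    n_obs = mu.shape[1]
    # Sum over the observation index n for every pair of predictors (i, j).
    return np.einsum('inv...,jnv...->ijv...', mu_c, mu_c) / n_obs

# Two predictors, two observations, one member.
ens = np.array([0.0, 2.0, 1.0, 5.0]).reshape(2, 2, 1, 1, 1)
cov = ensemble_mean_self_covariance(ens)
```

The centered ensemble means are (-1, 1) for the first predictor and (-2, 2) for the second, so the matrix is [[1, 2], [2, 4]].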
- property ensemble_members_distance
Distance between ensemble members: \(d^{\rm{MBM}}_{p,n,m_1,m_2,v} (t) = |\mathcal{D}_{p,n,m_1,v} (t)- \mathcal{D}_{p,n,m_2,v} (t) |\)
- Type
- property ensemble_min
Ensemble minimum over the ensemble index \(m\): \(\min_m \mathcal{D}_{p,n,m,v} (t)\).
- Type
- ensemble_quantiles(q, interpolation='linear')[source]
Return the ensemble quantiles of the data.
- Parameters
q (array_like(float)) – Quantile or sequence of quantiles to compute, which must be between 0 and 1 inclusive.
interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired quantile lies between two data points. See
numpy.quantile()
for more information.
- Returns
The ensemble quantiles, stored along the ensemble member number axis (2nd axis).
- Return type
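As a rough illustration of what storing the quantiles along the member axis means, the sketch below computes quantiles with numpy.quantile on a plain array and moves the resulting quantile axis into the member position. It is an assumption that the library proceeds this way. The interpolation argument is left at its default here, since recent NumPy versions expose it under the keyword method instead.

```python
import numpy as np

# Index shape (P=1, N=1, M=5, V=1, T=1): five members with values 0..4.
ens = np.arange(5, dtype=float).reshape(1, 1, 5, 1, 1)
q = [0.25, 0.5, 0.75]

# numpy.quantile puts the quantile axis first: shape (3, 1, 1, 1, 1).
quant = np.quantile(ens, q, axis=2)

# Move it to the member position: shape (1, 1, 3, 1, 1).
quant = np.moveaxis(quant, 0, 2)
```

The member axis of the result then has one entry per requested quantile.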
- property ensemble_std
Ensemble standard deviation over the ensemble index \(m\): \(\sigma^{\rm{ens}}_{p,n,v} (t)\).
- Type
- property ensemble_var
Ensemble variance over the ensemble index \(m\): \(\sigma^{\rm{ens}}_{p,n,v} (t)^2 = \left\langle \left( \mathcal{D}_{p,n,m,v} (t) - \mu^{\rm{ens}}_{p,n,v} (t) \right)^2 \right\rangle_m\).
- Type
- full_like(value, **kwargs)[source]
Like numpy.full_like(), returns a full Data object with the same index_shape, shape and type as the initial one.
- Parameters
value – The fill value to use.
kwargs (dict) – The arguments to pass to numpy.full_like().
- Returns
The full Data object.
- Return type
- load_from_file(filename, **kwargs)[source]
Load data previously saved with the save_to_file() method.
- load_scalars(data, metadata=None, timestamps=None, load_axis=1, concat_axis=1, columns=0, replace_timestamps=False)[source]
Load scalar data in the Data object. For the moment, only Pandas dataframe and NumPy arrays are accepted.
- Parameters
data (DataFrame or ndarray or list(DataFrame) or list(ndarray)) – The data to load in the object, packed along the load_axis. If ndarray are provided, they can be at most 2-dimensional and their last axis is always identified with the lead time. The remaining axis is identified with the axis of the Data object given by load_axis. If DataFrame are provided, their row axis is expected to be identified with the lead time, while the columns axis is identified with the axis of the Data object given by load_axis. In both cases, a list of data can be provided instead; the list items are then loaded along the axis given by the first element of load_axis, which must thus be a 2-tuple. If the list elements are 2D, their first axis is loaded along the axis of the Data object corresponding to the second element of load_axis, the last axis being loaded along the lead time axis. If the list elements are 1D, they are loaded along the lead time axis. Finally, if there are already data inside the object, the data provided are appended to them along the concat_axis.
metadata (object or list(object)) – The metadata of the provided data. If a list of data is provided, then a list of metadata objects can be provided. Otherwise, the same metadata object will be used for all the data items in the list.
timestamps (ndarray(datetime) or list(ndarray(datetime))) – The timestamps array(s) of the provided data, as datetime objects. If a list of data is provided, then a list of timestamps arrays can be provided. Otherwise, the same timestamps array will be used for all the data items in the list.
load_axis (int or str or tuple(int) or tuple(str)) – Axis over which the provided data are loaded in the Data object. Equal to 1 by default to match the observation index. Can be a number if data is an ndarray or a DataFrame. Has to be a 2-tuple if data is a list (see above). Can also be a string such as ‘obs’ to load along the observation index, or ‘members’ to load along the ensemble member index.
concat_axis (int or str) – Axis over which the data have to be concatenated. Can be a number or a string such as ‘obs’ to concatenate along the observation index, or ‘members’ to concatenate along the ensemble member index.
columns (int or str or list(int) or list(str), optional) – Specifies the column(s) of the DataFrame to load along load_axis. Only works with pandas DataFrame.
replace_timestamps (bool) – Replace the timestamps possibly already present in the Data object if concat_axis is not 1. Default to False.
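To make the role of load_axis more concrete, here is a minimal NumPy-only sketch of how a 2D array could be placed into the 5-axis index layout. The helper to_index_layout is hypothetical and is not the library's implementation; it ignores DataFrame inputs, lists, metadata, timestamps and concatenation.

```python
import numpy as np

LEAD_TIME_AXIS = 4  # lead time is the last of the 5 index axes

def to_index_layout(table, load_axis=1):
    """Hypothetical helper: map a 2D array onto (load_axis, lead time)."""
    table = np.asarray(table)
    if table.ndim != 2:
        raise ValueError("this sketch only handles 2D arrays")
    if not 0 <= load_axis < LEAD_TIME_AXIS:
        raise ValueError("load_axis must be one of 0, 1, 2, 3")
    # Insert length-1 axes everywhere except at load_axis and lead time.
    index = [np.newaxis] * 5
    index[load_axis] = slice(None)
    index[LEAD_TIME_AXIS] = slice(None)
    return table[tuple(index)]

# 3 rows (observations or members) and 2 lead times.
obs_table = np.arange(6, dtype=float).reshape(3, 2)
as_observations = to_index_layout(obs_table, load_axis=1)
as_members = to_index_layout(obs_table, load_axis=2)
```

With load_axis=1 the rows become observations, giving an index shape of (1, 3, 1, 1, 2); with load_axis=2 they become ensemble members, giving (1, 1, 3, 1, 2).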
- load_timestamps(timestamps)[source]
Load timestamps data.
- Parameters
timestamps (ndarray(datetime) or list(ndarray(datetime))) – The timestamps of the forecast data. Can be a 1D ndarray of datetime timestamps (one per lead time). In that case, the same timestamps vector is attributed to all the predictors and observations. Can also be a list of 1D ndarray of datetime timestamps (one list entry per observation), which allows one to set different timestamps for each observation/realization.
- property observational_distance
Distance between observations: \(d^{\rm{obs}}_{p,n_1,n_2,m,v} (t) = |\mathcal{D}_{p,n_1,m,v} (t)- \mathcal{D}_{p,n_2,m,v} (t)|\)
- Type
- property observational_max
Observational maximum over the observation index \(n\): \(\max_n \mathcal{D}_{p,n,m,v} (t)\).
- Type
- property observational_mean
Mean over the observation index \(n\): \(\mu^{\rm obs}_{p,m,v} (t) = \langle \mathcal{D}_{p,n,m,v} (t) \rangle_n\).
- Type
- property observational_min
Observational minimum over the observation index \(n\): \(\min_n \mathcal{D}_{p,n,m,v} (t)\).
- Type
- observational_quantiles(q, interpolation='linear')[source]
Return the observational quantiles of the data.
- Parameters
q (array_like(float)) – Quantile or sequence of quantiles to compute, which must be between 0 and 1 inclusive.
interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired quantile lies between two data points. See numpy.quantile() for more information.
- Returns
The observational quantiles, stored along the observation axis (1st axis).
- Return type
- property observational_std
Standard deviation over the observation index \(n\): \(\sigma^{\rm obs}_{p,m,v} (t)\).
- Type
- property observational_var
Variance over the observation index \(n\): \(\sigma^{\rm obs}_{p,m,v} (t)^2 = \left\langle (\mathcal{D}_{p,n,m,v} (t) - \mu^{\rm obs}_{p,m,v} (t))^2 \right\rangle_n\).
- Type
- plot(predictor=0, variable=0, ax=None, timestamps=None, global_label=None, grid_point=None, **kwargs)[source]
Plot the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
global_label (None or str or list(str), optional) – Label to represent all the data (str), or all the data of one observation (list of str) in the legend.
grid_point (tuple(int, int), optional) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_Abs_CRPS(other, predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the Absolute norm CRPS score (Abs_CRPS()) of the data with respect to the observation data (other) as a function of time.
- Parameters
other (Data) – Another data structure holding the observations.
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the CRPS data were plotted.
- Return type
- plot_CRPS(other, predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the CRPS score (CRPS()) of the data with respect to the observation data (other) as a function of time.
- Parameters
other (Data) – Another data structure holding the observations.
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the CRPS data were plotted.
- Return type
- plot_Ngr_CRPS(other, predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the Non-homogeneous Gaussian Regression (NGR) CRPS score (Ngr_CRPS()) of the data with respect to the observation data (other) as a function of time.
- Parameters
other (Data) – Another data structure holding the observations.
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
ax (Axes, optional) – An axes on which to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the CRPS data were plotted.
- Return type
- plot_ensemble_mean(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the ensemble mean (ensemble_mean) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_ensemble_median(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the ensemble median (ensemble_median) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_ensemble_minmax(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the ensemble minimum (ensemble_min) and maximum (ensemble_max) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
Axes, optional
- plot_ensemble_quantiles(q, low_interpolation='linear', high_interpolation='linear', predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, alpha=0.1, **kwargs)[source]
Plot the ensemble quantiles (ensemble_quantiles) of the data as a function of time.
- Parameters
q (array_like(float)) – Quantile or sequence of quantiles to compute, which must be between 0 and 0.5 exclusive. A symmetric quantile with respect to 0.5 will also be computed.
low_interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired lower quantile (q<0.5) lies between two data points. See numpy.quantile() for more information.
high_interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired higher quantile (q>0.5) lies between two data points. See numpy.quantile() for more information.
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
alpha (float) – Base level of transparency for the highest and lowest quantiles.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_ensemble_std(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the ensemble standard deviation (ensemble_std) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_observational_mean(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the observational mean (observational_mean) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_observational_median(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the observational median (observational_median) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_observational_minmax(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the observational minimum (observational_min) and maximum (observational_max) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
Axes, optional
- plot_observational_quantiles(q, low_interpolation='linear', high_interpolation='linear', predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, alpha=0.1, **kwargs)[source]
Plot the observational quantiles (observational_quantiles) of the data as a function of time.
- Parameters
q (array_like(float)) – Quantile or sequence of quantiles to compute, which must be between 0 and 0.5 exclusive. A symmetric quantile with respect to 0.5 will also be computed.
low_interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired lower quantile (q<0.5) lies between two data points. See numpy.quantile() for more information.
high_interpolation (str, optional) – This optional parameter specifies the interpolation method to use when the desired higher quantile (q>0.5) lies between two data points. See numpy.quantile() for more information.
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
alpha (float) – Base level of transparency for the highest and lowest quantiles.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- plot_observational_std(predictor=0, variable=0, ax=None, timestamps=None, grid_point=None, **kwargs)[source]
Plot the observational standard deviation (observational_std) of the data as a function of time.
- Parameters
predictor (int, optional) – The predictor index to use. Default is 0.
variable (int, optional) – The variable index to use.
ax (Axes, optional) – An axes on which to plot.
timestamps (None or ndarray(datetime), optional) – An array containing the timestamp of the data. If None, try to use the data timestamps and in last resort a numbered time index. Default to None.
grid_point (tuple(int, int)) – If the data are fields, specifies which grid point to plot.
kwargs (dict) – Argument to be passed to the plotting routine.
- Returns
ax – An axes where the data were plotted.
- Return type
- save_to_file(filename, **kwargs)[source]
Save the data to a file with the pickle module.
- set_dtype(dtype)[source]
Set the data type.
- Parameters
dtype (dtype) – The Numpy data type of the data.
- property uncertainty
Average over the observations of the observational_distance divided by 2: \(\langle d^{\rm{obs}}_{p,n_1,n_2,m,v} (t)\rangle_{n_1,n_2} / 2\). Sometimes called the uncertainty contribution of the CRPS. See [DATA-Her00] for more details.
- Type
- zeros_like(**kwargs)[source]
Like numpy.zeros_like(), returns a Data object with the same index_shape, shape and type as the initial one, but filled with zeros.
- Parameters
kwargs (dict) – The arguments to pass to numpy.zeros_like().
- Returns
The zeros Data object.
- Return type