Prediction Classes Docs
At the moment there are three types of prediction objects implemented:
Prediction: an object with limited and general properties that is designed to fit any prediction.
BinaryPrediction: an object with properties specific for cases where fitted and real data attain only two values.
NumericPrediction: an object with properties specific for cases where fitted and real values are numeric.
The sections below document each class in turn.
Prediction
Contains the generic Prediction. This class represents any kind of prediction interpreted as fitted array Y’ attempting to be close to real array Y.
The Prediction class makes it possible to compute accuracy metrics without knowing how the prediction was computed.
The subclasses allow for metrics that are relevant for just specific types of predictions.
- class easypred.base_prediction.Prediction
Bases:
object
Class to represent a generic prediction.
- fitted_values
The array-like object of length N containing the fitted values.
- Type
np.ndarray | pd.Series
- real_values
The array-like object containing the N real values.
- Type
np.ndarray | pd.Series
- __init__(real_values, fitted_values)
Class to represent a generic prediction.
- Parameters
real_values (np.ndarray | pd.Series | list | tuple) – The array-like object of length N containing the real values. If not pd.Series or np.array, it will be coerced into np.array.
fitted_values (np.ndarray | pd.Series | list | tuple) – The array-like object containing the fitted values. It must have the same length as real_values. If not pd.Series or np.array, it will be coerced into np.array.
Examples
>>> from easypred import Prediction
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
- property accuracy_score: float
Return a float between 0 and 1 representing the share of items that are equal between the real and the fitted values.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.accuracy_score
0.5
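As a rough illustration of what this property measures, the sketch below recomputes the score in plain Python. The helper name `accuracy` is hypothetical and not part of easypred; the library's own implementation may differ.

```python
def accuracy(real_values, fitted_values):
    """Share of positions where the fitted value equals the real value.

    Hypothetical helper for illustration only, not easypred's code.
    """
    if len(real_values) != len(fitted_values):
        raise ValueError("real_values and fitted_values must have the same length")
    # One boolean per observation: True when the prediction is correct.
    matches = [real == fitted for real, fitted in zip(real_values, fitted_values)]
    return sum(matches) / len(matches)


real = ["Foo", "Foo", "Bar", "Baz"]
fitted = ["Foo", "Bar", "Foo", "Baz"]
print(accuracy(real, fitted))  # 0.5
```

Two of the four pairs agree, which gives the same 0.5 as the example above.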
- as_dataframe()
Return the prediction as a dataframe containing information about the prediction quality.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.as_dataframe()
  Real Values Fitted Values  Prediction Matches
0         Foo           Foo                True
1         Foo           Bar               False
2         Bar           Foo               False
3         Baz           Baz                True
- Return type
pandas.core.frame.DataFrame
- describe()
Return a dataframe containing some key information about the prediction.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.describe()
          Value
N           4.0
Matches     2.0
Errors      2.0
Accuracy    0.5
- Return type
pandas.core.frame.DataFrame
- matches()
Return a boolean array of length N with True where fitted value is equal to real value.
- Return type
pd.Series | np.array
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.matches()
array([True, False, False, True])
BinaryPrediction
Subclass of Prediction specialized in representing a binary prediction, that is, a prediction where both the fitted and real data attain at most two different values.
It makes it possible to compute accuracy metrics such as true positives, true negatives, etc.
- class easypred.binary_prediction.BinaryPrediction
Bases:
easypred.base_prediction.Prediction
Subclass of Prediction specialized in representing numeric categorical predictions with binary outcome.
- fitted_values
The array-like object of length N containing the fitted values.
- Type
np.ndarray | pd.Series
- real_values
The array-like object containing the N real values.
- Type
np.ndarray | pd.Series
- value_positive
The value in the data that corresponds to 1 in the boolean logic. It is generally associated with the idea of “positive” or being in the “treatment” group. The default is 1.
- Type
Any
Examples
Classic 0/1 case:
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.real_values
array([0, 1, 1, 1])
>>> pred.fitted_values
array([1, 0, 1, 1])
>>> pred.value_positive
1
Other values are accepted:
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=["Foo", "Foo", "Bar", "Foo"],
...                         fitted_values=["Foo", "Bar", "Foo", "Bar"],
...                         value_positive="Foo")
>>> pred.value_positive
'Foo'
- __init__(real_values, fitted_values, value_positive=1)
Create an instance of BinaryPrediction to represent a prediction with just two possible outcomes.
- Parameters
real_values (np.ndarray | pd.Series | list | tuple) – The array-like object of length N containing the real values. If not pd.Series or np.array, it will be coerced into np.array.
fitted_values (np.ndarray | pd.Series | list | tuple) – The array-like object containing the fitted values. It must have the same length as real_values. If not pd.Series or np.array, it will be coerced into np.array.
value_positive (Any) – The value in the data that corresponds to 1 in the boolean logic. It is generally associated with the idea of “positive” or being in the “treatment” group. The default is 1.
Examples
>>> from easypred import BinaryPrediction
>>> pred1 = BinaryPrediction(real_values=[0, 1, 1, 1],
...                          fitted_values=[1, 0, 1, 1])
>>> pred2 = BinaryPrediction(real_values=["Foo", "Foo", "Bar", "Foo"],
...                          fitted_values=["Foo", "Bar", "Foo", "Bar"],
...                          value_positive="Foo")
- property accuracy_score: float
Return a float between 0 and 1 representing the share of items that are equal between the real and the fitted values.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.accuracy_score
0.5
- as_dataframe()
Return the prediction as a dataframe containing information about the prediction quality.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.as_dataframe()
  Real Values Fitted Values  Prediction Matches
0         Foo           Foo                True
1         Foo           Bar               False
2         Bar           Foo               False
3         Baz           Baz                True
- Return type
pandas.core.frame.DataFrame
- property balanced_accuracy_score: float
Return the float representing the arithmetic mean between recall score and specificity score.
It provides an idea of the goodness of the prediction in unbalanced datasets.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.balanced_accuracy_score
0.3333333333333333
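The arithmetic behind this value can be sketched in plain Python. The function name `balanced_accuracy` is hypothetical and only mirrors the definition given above (mean of recall and specificity); it is not easypred's implementation.

```python
def balanced_accuracy(real_values, fitted_values, value_positive=1):
    """Mean of recall and specificity (hypothetical helper, illustration only)."""
    pairs = list(zip(real_values, fitted_values))
    # True positives and true negatives.
    tp = sum(r == value_positive and f == value_positive for r, f in pairs)
    tn = sum(r != value_positive and f != value_positive for r, f in pairs)
    real_positives = sum(r == value_positive for r, _ in pairs)
    real_negatives = len(pairs) - real_positives
    # Note: this sketch divides by zero if either class is absent from the real data.
    recall = tp / real_positives
    specificity = tn / real_negatives
    return (recall + specificity) / 2


score = balanced_accuracy([0, 1, 1, 1], [1, 0, 1, 1])
print(round(score, 4))  # 0.3333
```

Here recall is 2/3 and specificity is 0, so the mean is 1/3, well below the plain accuracy of 0.5 because the single real negative was misclassified.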
- confusion_matrix(relative=False, as_dataframe=False)
Return the confusion matrix for the binary classification.
The confusion matrix is a matrix with shape (2, 2) that classifies the predictions into four categories, each represented by one of its elements:
- [0, 0] : negative classified as negative
- [0, 1] : negative classified as positive
- [1, 0] : positive classified as negative
- [1, 1] : positive classified as positive
- Parameters
relative (bool, optional) – If True, absolute frequencies are replaced by relative frequencies. By default False.
as_dataframe (bool, optional) – If True, the matrix is returned as a pandas dataframe for better readability. Otherwise a numpy array is returned. By default False.
- Returns
If as_dataframe is False, return a numpy array of shape (2, 2). Otherwise return a pandas dataframe of the same shape.
- Return type
np.ndarray | pd.DataFrame
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.confusion_matrix()
array([[0, 1],
       [1, 2]])
>>> pred.confusion_matrix(as_dataframe=True)
        Pred 0  Pred 1
Real 0       0       1
Real 1       1       2
>>> pred.confusion_matrix(as_dataframe=True, relative=True)
        Pred 0  Pred 1
Real 0    0.00    0.25
Real 1    0.25    0.50
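To make the layout of the matrix concrete, the following sketch counts the four categories by hand. It uses nested lists rather than NumPy, and the function name `confusion_counts` is hypothetical; it illustrates the indexing convention described above, not the library's code.

```python
def confusion_counts(real_values, fitted_values, value_positive=1):
    """Return [[TN, FP], [FN, TP]] as nested lists (hypothetical helper)."""
    matrix = [[0, 0], [0, 0]]
    for real, fitted in zip(real_values, fitted_values):
        row = int(real == value_positive)    # 1 if the real value is positive
        col = int(fitted == value_positive)  # 1 if the fitted value is positive
        matrix[row][col] += 1
    return matrix


counts = confusion_counts([0, 1, 1, 1], [1, 0, 1, 1])
print(counts)  # [[0, 1], [1, 2]]

# Relative frequencies: divide every cell by the number of observations.
total = sum(sum(row) for row in counts)
relative = [[cell / total for cell in row] for row in counts]
print(relative)  # [[0.0, 0.25], [0.25, 0.5]]
```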
- describe()
Return a dataframe containing some key information about the prediction.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.describe()
                Value
N            4.000000
Matches      2.000000
Errors       2.000000
Accuracy     0.500000
Recall       0.666667
Specificity  0.000000
Precision    0.666667
Negative PV  0.000000
F1 score     0.666667
- Return type
pandas.core.frame.DataFrame
- property f1_score
Return the harmonic mean of the precision and recall.
It gives an idea of an overall goodness of your precision and recall taken together.
Also called: balanced F-score or F-measure
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.f1_score
0.6666666666666666
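The harmonic mean can be worked out step by step. The sketch below is a plain-Python illustration with a hypothetical function name `f1`; it follows the textbook definition and is not taken from easypred's source.

```python
def f1(real_values, fitted_values, value_positive=1):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    pairs = list(zip(real_values, fitted_values))
    tp = sum(r == value_positive and f == value_positive for r, f in pairs)
    fp = sum(r != value_positive and f == value_positive for r, f in pairs)
    fn = sum(r == value_positive and f != value_positive for r, f in pairs)
    # Note: this sketch does not guard against empty denominators.
    precision = tp / (tp + fp)  # correct positives / predicted positives
    recall = tp / (tp + fn)     # correct positives / real positives
    return 2 * precision * recall / (precision + recall)


value = f1([0, 1, 1, 1], [1, 0, 1, 1])
print(round(value, 4))  # 0.6667
```

With two true positives, one false positive and one false negative, precision and recall are both 2/3, so their harmonic mean is 2/3 as well.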
- property false_negative_rate
Return the ratio between the number of false negatives and the total number of real positives.
It tells the percentage of positives falsely classified as negative.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.false_negative_rate
0.3333333333333333
- property false_positive_rate: float
Return the ratio between the number of false positives and the total number of real negatives.
It tells the percentage of negatives falsely classified as positive.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.false_positive_rate
1.0
- classmethod from_binary_score(binary_score, threshold=0.5)
Create an instance of BinaryPrediction from a BinaryScore object.
- Parameters
binary_score (BinaryScore) – The BinaryScore object the BinaryPrediction is to be constructed from.
value_positive (Any) – The value in the data that corresponds to 1 in the boolean logic. It is generally associated with the idea of “positive” or being in the “treatment” group. By default is 1.
threshold (float | str, optional) –
If float, it is the minimum value such that the score is translated into value_positive. Any score below the threshold is instead associated with the other value. If str, the threshold is automatically set such that it maximizes the metric corresponding to the provided keyword. The available keywords are:
- “f1”: maximize the f1 score
- “accuracy”: maximize the accuracy score
By default 0.5.
- Returns
An object of type BinaryPrediction, a subclass of Prediction specific for predictions with just two outcomes. The class instance is given the special attribute “threshold” that returns the threshold used in the conversion.
- Return type
BinaryPrediction
Examples
>>> from easypred import BinaryPrediction, BinaryScore
>>> real = [0, 1, 1, 0, 1, 0]
>>> fit = [0.31, 0.44, 0.24, 0.28, 0.37, 0.18]
>>> score = BinaryScore(real, fit, value_positive=1)
>>> BinaryPrediction.from_binary_score(score, threshold=0.5)
<easypred.binary_prediction.BinaryPrediction object at 0x000001AA51C3EEE0>
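The effect of a float threshold can be illustrated without the library. The sketch below assumes, based on the parameter description, that a score equal to or above the threshold maps to the positive value; the function `scores_to_labels` and its `value_negative` argument are hypothetical names introduced for this illustration.

```python
def scores_to_labels(scores, threshold=0.5, value_positive=1, value_negative=0):
    """Map each score to a label (hypothetical helper, not easypred's code).

    Assumption: scores >= threshold become value_positive, the rest value_negative.
    """
    return [value_positive if s >= threshold else value_negative for s in scores]


fit = [0.31, 0.44, 0.24, 0.28, 0.37, 0.18]
print(scores_to_labels(fit, threshold=0.5))  # [0, 0, 0, 0, 0, 0]
print(scores_to_labels(fit, threshold=0.3))  # [1, 1, 0, 0, 1, 0]
```

With the scores from the example above, a threshold of 0.5 labels every observation as negative, which is why choosing the threshold by a metric keyword can be useful for scores concentrated in a narrow range.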
- classmethod from_prediction(prediction, value_positive)
Create an instance of BinaryPrediction from a general Prediction object.
- Parameters
prediction (Prediction) – The prediction object the BinaryPrediction is to be constructed from.
value_positive (Any) – The value in the data that corresponds to 1 in the boolean logic. It is generally associated with the idea of “positive” or being in the “treatment” group. By default is 1.
- Returns
An object of type BinaryPrediction, a subclass of Prediction specific for predictions with just two outcomes.
- Return type
BinaryPrediction
Examples
>>> from easypred import BinaryPrediction, Prediction
>>> pred = Prediction(real_values=[0, 1, 1, 1],
...                   fitted_values=[1, 0, 1, 1])
>>> BinaryPrediction.from_prediction(pred, value_positive=1)
<easypred.binary_prediction.BinaryPrediction object at 0x000001AA51C3EF10>
- matches()
Return a boolean array of length N with True where fitted value is equal to real value.
- Return type
pd.Series | np.array
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.matches()
array([True, False, False, True])
- property negative_predictive_value
Return the ratio between the number of correctly classified negatives and the total number of predicted negatives.
It measures how accurate the negative predictions are.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.negative_predictive_value
0.0
- property precision_score
Return the ratio between the number of correctly predicted positives and the total number of predicted positives.
It measures how accurate the positive predictions are.
Also called: positive predictive value.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.precision_score
0.6666666666666666
- property recall_score
Return the ratio between the correctly predicted positives and the total number of real positives.
It measures how good the model is in detecting real positives.
Also called: sensitivity, hit rate, true positive rate.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.recall_score
0.6666666666666666
- property specificity_score
Return the ratio between the correctly predicted negatives and the total number of real negatives.
It measures how good the model is in detecting real negatives.
Also called: selectivity, true negative rate.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.specificity_score
0.0
- property value_negative: Any
Return the value that is not the positive value.
Examples
>>> from easypred import BinaryPrediction
>>> pred = BinaryPrediction(real_values=[0, 1, 1, 1],
...                         fitted_values=[1, 0, 1, 1])
>>> pred.value_negative
0
NumericPrediction
Subclass of Prediction specialized in representing numeric predictions, that is, predictions where both fitted and real data are either ints or floats.
It makes it possible to compute accuracy metrics that represent the distance between the prediction and the real values.
- class easypred.numeric_prediction.NumericPrediction
Bases:
easypred.base_prediction.Prediction
Subclass of Prediction specialized in representing numeric predictions.
- fitted_values
The array-like object of length N containing the fitted values.
- Type
np.ndarray | pd.Series
- real_values
The array-like object containing the N real values.
- Type
np.ndarray | pd.Series
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.real_values
array([7, 1, 3, 4, 5])
>>> pred.fitted_values
array([6.5, 2. , 4. , 3. , 5. ])
- __init__(real_values, fitted_values)
Class to represent a generic prediction.
- Parameters
real_values (np.ndarray | pd.Series | list | tuple) – The array-like object of length N containing the real values. If not pd.Series or np.array, it will be coerced into np.array.
fitted_values (np.ndarray | pd.Series | list | tuple) – The array-like object containing the fitted values. It must have the same length as real_values. If not pd.Series or np.array, it will be coerced into np.array.
Examples
>>> from easypred import Prediction
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
- property accuracy_score: float
Return a float between 0 and 1 representing the share of items that are equal between the real and the fitted values.
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.accuracy_score
0.5
- as_dataframe()
Return the prediction as a dataframe containing information about the prediction quality.
- Returns
Dataframe of shape (N, 5) containing summary information for each observation’s prediction.
- Return type
pd.DataFrame
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.as_dataframe()
   Fitted Values  Real Values  Prediction Matches  Absolute Difference  Relative Difference
0            6.5            7               False                  0.5             0.071429
1            2.0            1               False                 -1.0            -1.000000
2            4.0            3               False                 -1.0            -0.333333
3            3.0            4               False                  1.0             0.250000
4            5.0            5                True                  0.0             0.000000
- describe()
Return a dataframe containing some key information about the prediction.
- Returns
Dataframe of shape (6, 1) containing summary information on the prediction quality.
- Return type
pd.DataFrame
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.describe()
         Value
N     5.000000
MSE   0.650000
RMSE  0.806226
MAE   0.700000
MAPE  0.330952
R^2   0.861680
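The error metrics in this summary can be reproduced from the residuals with a few lines of plain Python. This is a sketch using the standard textbook formulas on the same data; the variable names are illustrative and the code is not taken from easypred.

```python
import math

real = [7, 1, 3, 4, 5]
fitted = [6.5, 2, 4, 3, 5]

# Residual = real value minus fitted value.
residuals = [r - f for r, f in zip(real, fitted)]
n = len(residuals)

mse = sum(e ** 2 for e in residuals) / n   # Mean Squared Error
rmse = math.sqrt(mse)                      # Root Mean Squared Error
mae = sum(abs(e) for e in residuals) / n   # Mean Absolute Error
# Mean Absolute Percentage Error; undefined if any real value is 0.
mape = sum(abs(e / r) for e, r in zip(residuals, real)) / n

print(round(mse, 6), round(rmse, 6), round(mae, 6), round(mape, 6))
# 0.65 0.806226 0.7 0.330952
```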
- property mae: float
Return the Mean Absolute Error.
References
https://en.wikipedia.org/wiki/Mean_absolute_error
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.mae
0.7
- property mape: float
Return the Mean Absolute Percentage Error.
References
https://en.wikipedia.org/wiki/Mean_absolute_percentage_error
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.mape
0.33095238095238094
- matches()
Return a boolean array of length N with True where fitted value is equal to real value.
- Return type
pd.Series | np.array
Examples
>>> pred = Prediction(real_values=["Foo", "Foo", "Bar", "Baz"],
...                   fitted_values=["Foo", "Bar", "Foo", "Baz"])
>>> pred.matches()
array([True, False, False, True])
- matches_tolerance(tolerance=0.0)
Return a boolean array of length N with True where the absolute difference between the real value and the fitted value is at most the given tolerance.
- Parameters
tolerance (float, optional) – The maximum absolute difference between the real value and its fitted counterpart such that the pair is considered a match. By default is 0.0.
- Returns
Boolean array of shape (N,). Its type reflects the type of self.real_values.
- Return type
np.ndarray or pd.Series
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.matches_tolerance()
array([False, False, False, False, True])
>>> pred.matches_tolerance(tolerance=2)
array([ True, True, True, True, True])
With pandas series:
>>> import pandas as pd
>>> pred = NumericPrediction(pd.Series([7, 1, 3, 4, 5]),
...                          pd.Series([6.5, 2, 4, 3, 5]))
>>> pred.matches_tolerance(tolerance=2)
0    True
1    True
2    True
3    True
4    True
dtype: bool
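The comparison itself can be sketched in plain Python. The function below is a hypothetical stand-in (named `within_tolerance` to avoid suggesting it is the library's method) and assumes the tolerance is inclusive, which is consistent with the default of 0.0 still matching identical values in the example above.

```python
def within_tolerance(real_values, fitted_values, tolerance=0.0):
    """True where |real - fitted| <= tolerance (hypothetical helper)."""
    return [abs(r - f) <= tolerance for r, f in zip(real_values, fitted_values)]


real = [7, 1, 3, 4, 5]
fitted = [6.5, 2, 4, 3, 5]
print(within_tolerance(real, fitted))                 # [False, False, False, False, True]
print(within_tolerance(real, fitted, tolerance=0.5))  # [True, False, False, False, True]
print(within_tolerance(real, fitted, tolerance=2))    # [True, True, True, True, True]
```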
- property mse: float
Return the Mean Squared Error.
References
https://en.wikipedia.org/wiki/Mean_squared_error
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.mse
0.65
- plot_fit(figsize=(20, 10), line_slope=1, title_size=14, axes_labels_size=12, ax=None, **kwargs)
Plot the scatterplot depicting real against fitted values.
- Parameters
figsize (tuple[int, int], optional) – Tuple of integers specifying the size of the plot. Default is (20, 10).
line_slope (int | None, optional) – Slope of the red dashed line added to the scatterplot. If None, no line is drawn. By default is 1, representing parity between real and fitted values.
title_size (int, optional) – Font size of the plot title. Default is 14.
axes_labels_size (int, optional) – Font size of the axes labels. Default is 12.
ax (matplotlib Axes, optional) – Axes object to draw the plot onto, otherwise creates new Figure and Axes. Use this option to further customize the plot.
kwargs (key, value mappings) – Other keyword arguments to be passed through to matplotlib.pyplot.scatter().
- Returns
Matplotlib Axes object with the plot drawn on it.
- Return type
matplotlib Axes
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.plot_fit()
<AxesSubplot:title={'center':'Real against fitted values'}, xlabel='Fitted values', ylabel='Real values'>
>>> from matplotlib import pyplot as plt
>>> plt.show()
- plot_fit_residuals(figsize=(20, 10), title_size=14, axes_labels_size=12, axs=None, **kwargs)
Plot a two panels figure containing the plot of real against fitted values and the plot of residuals against fitted values.
This method combines plot_fit and plot_residuals.
These two graphs are useful for detecting potential biases in the prediction, as they reveal deviations and clusters.
- Parameters
figsize (tuple[int, int], optional) – Tuple of integers specifying the size of the plot. Default is (20, 10).
title_size (int, optional) – Font size of the plots’ titles. Default is 14.
axes_labels_size (int, optional) – Font size of the axes labels. Default is 12.
axs (list of matplotlib Axes, optional) – List of axes object of length 2 to draw the plot onto. Otherwise creates new Figure and Axes. Use this option to further customize the plot.
kwargs (key, value mappings) – Other keyword arguments to be passed through to matplotlib.pyplot.scatter() of each subplot.
- Returns
NumPy array of shape (2,) containing one matplotlib Axes object for each of the subplots.
- Return type
np.ndarray[matplotlib Axes, matplotlib Axes]
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.plot_fit_residuals()
array([<AxesSubplot:title={'center':'Real against fitted values'}, xlabel='Fitted values', ylabel='Real values'>,
       <AxesSubplot:title={'center':'Residuals against fitted values'}, xlabel='Fitted values', ylabel='Residuals'>],
      dtype=object)
>>> from matplotlib import pyplot as plt
>>> plt.show()
- plot_residuals(figsize=(20, 10), hline=0, title_size=14, axes_labels_size=12, ax=None, **kwargs)
Plot the scatterplot depicting the residuals against fitted values.
- Parameters
figsize (tuple[int, int], optional) – Tuple of integers specifying the size of the plot. Default is (20, 10).
hline (int, optional) – Y coordinate of the red dashed line added to the scatterplot. If None, no line is drawn. By default is 0.
title_size (int, optional) – Font size of the plot title. Default is 14.
axes_labels_size (int, optional) – Font size of the axes labels. Default is 12.
ax (matplotlib Axes, optional) – Axes object to draw the plot onto, otherwise creates new Figure and Axes. Use this option to further customize the plot.
kwargs (key, value mappings) – Other keyword arguments to be passed through to matplotlib.pyplot.scatter().
- Returns
Matplotlib Axes object with the plot drawn on it.
- Return type
matplotlib Axes
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.plot_residuals()
<AxesSubplot:title={'center':'Residuals against fitted values'}, xlabel='Fitted values', ylabel='Residuals'>
>>> from matplotlib import pyplot as plt
>>> plt.show()
- property r_squared: float
Returns the r squared calculated as the square of the correlation coefficient. Also called ‘Coefficient of Determination’.
References
https://en.wikipedia.org/wiki/Coefficient_of_determination
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.r_squared
0.8616803278688523
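The "square of the correlation coefficient" definition can be checked by hand. The sketch below computes the squared Pearson correlation in plain Python; the function name `squared_correlation` is hypothetical and the code is an illustration of the stated definition, not easypred's implementation.

```python
def squared_correlation(real_values, fitted_values):
    """Squared Pearson correlation between two sequences (hypothetical helper)."""
    n = len(real_values)
    mean_real = sum(real_values) / n
    mean_fit = sum(fitted_values) / n
    # Sums of cross-products and of squared deviations from the means.
    cross = sum((r - mean_real) * (f - mean_fit)
                for r, f in zip(real_values, fitted_values))
    ss_real = sum((r - mean_real) ** 2 for r in real_values)
    ss_fit = sum((f - mean_fit) ** 2 for f in fitted_values)
    return cross ** 2 / (ss_real * ss_fit)


r2 = squared_correlation([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
print(round(r2, 6))  # 0.86168
```

For this data the cross-product sum is 14.5 and the two sums of squares are 20 and 12.2, so the result is 14.5² / (20 × 12.2) ≈ 0.8617, in line with the example above.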
- residuals(squared=False, absolute=False, relative=False)
Return an array with the difference between the real values and the fitted values.
- Parameters
squared (bool, optional) – If True, the residuals are squared, by default False.
absolute (bool, optional) – If True, the residuals are taken in absolute value, by default False.
relative (bool, optional) – If True, the residuals are divided by the real values to return a relative measure. By default False.
- Returns
Numpy array or pandas series depending on the type of real_values and fitted_values. Its shape is (N,).
- Return type
np.ndarray or pd.Series
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.residuals()
array([ 0.5, -1. , -1. , 1. , 0. ])
>>> pred.residuals(squared=True)
array([0.25, 1. , 1. , 1. , 0. ])
>>> pred.residuals(absolute=True)
array([0.5, 1. , 1. , 1. , 0. ])
>>> pred.residuals(relative=True)
array([ 0.07142857, -1. , -0.33333333, 0.25 , 0. ])
>>> pred.residuals(relative=True, absolute=True)
array([0.07142857, 1. , 0.33333333, 0.25 , 0. ])
- property rmse: float
Return the Root Mean Squared Error.
References
https://en.wikipedia.org/wiki/Root-mean-square_deviation
Examples
>>> from easypred import NumericPrediction
>>> pred = NumericPrediction([7, 1, 3, 4, 5], [6.5, 2, 4, 3, 5])
>>> pred.rmse
0.806225774829855