Machine Learning Guide for Oil and Gas Using Python by Hoss Belyadi & Alireza Haghighat

Author: Hoss Belyadi & Alireza Haghighat [Belyadi, Hoss & Haghighat, Alireza]
Language: eng
Format: epub
ISBN: 9780128219300
Publisher: Elsevier Inc.
Published: 2021-04-14T06:56:22+00:00


Figure 5.60 NPV training actual versus prediction.

Figure 5.61 NPV testing actual versus prediction.

Let's also obtain MAE, MSE, and RMSE for the 30% testing data as follows:

from sklearn import metrics
print('MAE:', round(metrics.mean_absolute_error(y_test, y_pred_test), 5))
print('MSE:', round(metrics.mean_squared_error(y_test, y_pred_test), 5))
print('RMSE:', round(np.sqrt(metrics.mean_squared_error(y_test, y_pred_test)), 5))

Python output=
MAE: 0.76514
MSE: 1.06659
RMSE: 1.03276
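To make the three error metrics printed above concrete, the following is a minimal sketch that computes them directly with NumPy. The function name "regression_errors" and the four demo values are hypothetical and are not taken from the NPV data set; the sketch only illustrates the definitions, and that RMSE is the square root of MSE.

```python
import numpy as np


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))   # mean absolute error
    mse = np.mean(residuals ** 2)      # mean squared error
    rmse = np.sqrt(mse)                # root mean squared error
    return mae, mse, rmse


# Hypothetical values, for illustration only (not the NPV test data)
y_true_demo = [3.0, 5.0, 2.0, 7.0]
y_pred_demo = [2.0, 5.0, 4.0, 8.0]

mae_demo, mse_demo, rmse_demo = regression_errors(y_true_demo, y_pred_demo)
print('MAE:', round(mae_demo, 5))
print('MSE:', round(mse_demo, 5))
print('RMSE:', round(rmse_demo, 5))
```

Because RMSE is expressed in the same units as the output variable, it is often easier to interpret than MSE. The reported values are consistent with this relationship: the square root of 1.06659 is approximately 1.03276.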

Let's also obtain the feature importance of each input variable and the permutation feature importance. The permutation feature importance is defined as the decrease in a model score when the values of a single feature are randomly shuffled (Breiman, 2001). Shuffling a feature breaks its relationship with the output, so the resulting drop in the model score indicates how much the model relies on that feature. The bigger the drop in the model score, the more impact the input feature has on the model, and the higher it will rank. Permutation importance can be obtained by importing the “permutation_importance” function from “sklearn.inspection” and passing in the trained model (in this case “gb”), the testing set (in this case “X_test,y_test”), n_repeats (in this case 10 was used), and random_state (to use a seed number of 1000 as was previously used). Please note that “n_repeats” refers to the number of times each feature is randomly shuffled. In this example, each feature was randomly shuffled 10 times, so 10 importance values are returned per feature (in “importances”), together with their mean (“importances_mean”) and standard deviation (“importances_std”).

from sklearn.inspection import permutation_importance

# Built-in (impurity-based) feature importance of the trained model
feature_importance=gb.feature_importances_
sorted_features=np.argsort(feature_importance)
pos=np.arange(sorted_features.shape[0]) + .5
fig=plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.barh(pos, feature_importance[sorted_features], align='center')
plt.yticks(pos, np.array(df.columns)[sorted_features])
plt.title('Feature Importance')

# Permutation importance on the testing set, 10 shuffles per feature
result=permutation_importance(gb, X_test, y_test, n_repeats=10, random_state=seed)
sorted_idx=result.importances_mean.argsort()
plt.subplot(1, 2, 2)
plt.boxplot(result.importances[sorted_idx].T, vert=False, labels=np.array(df.columns)[sorted_idx])
plt.title("Permutation Importance (test set)")
fig.tight_layout()
plt.show()

Python output=Fig. 5.62
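The shuffling procedure described above can also be sketched by hand, which may help clarify what the library call is doing. The following is a minimal illustration under stated assumptions, not the scikit-learn implementation: the function names "r2_score_np" and "manual_permutation_importance" are hypothetical, the data are synthetic, a least-squares linear fit stands in for the trained “gb” model, and R² is assumed as the model score (the default score of scikit-learn regressors).

```python
import numpy as np


def r2_score_np(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def manual_permutation_importance(predict, X, y, n_repeats=10, seed=1000):
    """Return an (n_features, n_repeats) array of score drops."""
    rng = np.random.default_rng(seed)
    baseline = r2_score_np(y, predict(X))
    drops = np.zeros((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_shuffled = X.copy()
            # Shuffle one column only, leaving all other features intact
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            drops[j, r] = baseline - r2_score_np(y, predict(X_shuffled))
    return drops


# Synthetic data (illustration only): feature 0 matters most,
# feature 1 matters a little, feature 2 has no effect on the output.
data_rng = np.random.default_rng(0)
X_demo = data_rng.normal(size=(200, 3))
y_demo = 5.0 * X_demo[:, 0] + 1.0 * X_demo[:, 1] + data_rng.normal(scale=0.1, size=200)

# Stand-in model: ordinary least squares instead of gradient boosting
coef, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)


def predict_demo(X):
    return X @ coef


drops_demo = manual_permutation_importance(predict_demo, X_demo, y_demo)
mean_drop = drops_demo.mean(axis=1)
print('Mean drop in R^2 per feature:', np.round(mean_drop, 4))
```

In this sketch the mean drop should be largest for feature 0 and close to zero for feature 2, mirroring how the box plot in the right-hand panel ranks features by the spread of their score drops across repeats.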


