A Straightforward Introduction To Machine Learning With Python Implementation by Md. Akramul Hossain

Author: Md. Akramul Hossain
Published: 2021-07-12


#train_data["Age"].fillna(train_data.groupby("Name")["Age"]. →transform("median"), inplace=True)

#test_data["Age"].fillna(test_data.groupby("Name")["Age"]. →transform("median"), inplace=True)

#test_data = test_data.interpolate()

[34]: 'from sklearn.impute import SimpleImputer
imputer = SimpleImputer()
train = pd.DataFrame(imputer.fit_transform(train_data))
train.columns = train_data.columns
test = pd.DataFrame(imputer.fit_transform(test_data))
test.columns = test_data.columns'

[35]: train = train_data.interpolate()
test = test_data.interpolate()
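The quoted-out SimpleImputer cell above fits the imputer on the test set as well, which leaks test statistics into the preprocessing. A minimal sketch of the leakage-free pattern, assuming the same train_data and test_data frames as above, fits the imputer on the training data only and reuses those statistics on the test data:

# assumes train_data and test_data are the numerically encoded frames shown above
from sklearn.impute import SimpleImputer
import pandas as pd

imputer = SimpleImputer(strategy='median')                    # fill missing values with each column's median
train_imp = pd.DataFrame(imputer.fit_transform(train_data),   # learn the medians from the training data only
                         columns=train_data.columns)
test_imp = pd.DataFrame(imputer.transform(test_data),         # reuse the training medians on the test data
                        columns=test_data.columns)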

[36]: train.head()
[36]:    PassengerId  Survived  Pclass  Name  Sex   Age  SibSp  Parch     Fare  Embarked
      0            1         0       3    12    1  22.0      1      0   7.2500         2
      1            2         1       1    13    0  38.0      1      0  71.2833         0
      2            3         1       3     9    0  26.0      0      0   7.9250         2
      3            4         1       1    13    0  35.0      1      0  53.1000         2
      4            5         0       3    12    1  35.0      0      0   8.0500         2

[37]: test.head()
[37]:    PassengerId  Pclass  Name  Sex   Age  SibSp  Parch     Fare  Embarked
      0          892       3     5    1  34.5      0      0   7.8292         1
      1          893       3     6    0  47.0      1      0   7.0000         2
      2          894       2     5    1  62.0      0      0   9.6875         1
      3          895       3     5    1  27.0      0      0   8.6625         2
      4          896       3     6    0  22.0      1      1  12.2875         2

Class Imbalance

[38]: # visualizing the class imbalance
checkingImbalance = sns.countplot(train['Survived'])
checkingImbalance.set_xticklabels(['Dead', 'Survived'])
plt.show()

[39]: # calculating weights to fix the class imbalance
# we will pass these weights to the sample_weight parameter of the fit method
freq_pos = np.sum(train.Survived, axis=0) / len(train.Survived)   # fraction of positive (survived) samples
freq_neg = 1 - freq_pos                                           # fraction of negative (dead) samples
pos_weights = freq_neg
neg_weights = freq_pos
#pos_contribution = freq_pos * pos_weights
#neg_contribution = freq_neg * neg_weights
weight = {'0': neg_weights, '1': pos_weights}   # the minority class gets the larger per-sample weight
weights = [weight[str(p)] for p in train.Survived.astype('int')]
#print(weights)
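The same kind of inverse-frequency weights can be produced directly by scikit-learn's compute_sample_weight helper; a minimal sketch, assuming the same train frame as above:

# assumes train as defined above; 'balanced' weights each sample by n_samples / (n_classes * count_of_its_class)
from sklearn.utils.class_weight import compute_sample_weight
weights_balanced = compute_sample_weight('balanced', train.Survived)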

[40]: # separating features and target

X = train.drop(['Survived'], axis=1)

y = train['Survived']

[41]: # splitting train data into training set and validation set

from sklearn.model_selection import train_test_split

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.20, random_state=1)

[42]: X_train.shape

[42]: (712, 9)

[43]: y_train.shape

[43]: (712,)
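Because the classes are imbalanced, the split can also be stratified so that the survived/dead ratio is preserved in both subsets; a minimal sketch, where stratify=y is the only change from the cell above:

# assumes X, y and train_test_split as imported above
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.20, random_state=1, stratify=y)   # keep the class ratio identical in both splits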

4.0.3 Cross Validation and Building Model

[44]: from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
k_fold = KFold(n_splits=10, shuffle=True, random_state=1)

[45]: from xgboost import XGBClassifier
model = XGBClassifier()
score = cross_val_score(model, X, y, cv=k_fold, n_jobs=1, scoring='accuracy')
print(score)

print(np.mean(score))

[0.73333333 0.76404494 0.79775281 0.84269663 0.76404494 0.84269663 0.79775281 0.80898876 0.84269663 0.78651685]

0.7980524344569287

[46]: from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=150, criterion='entropy', random_state=1)
score = cross_val_score(model, X, y, cv=k_fold, n_jobs=1, scoring='accuracy')
print(score)

print(np.mean(score))

[0.77777778 0.79775281 0.7752809 0.85393258 0.83146067 0.84269663 0.85393258 0.83146067 0.88764045 0.80898876]

0.8260923845193509
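The random forest does noticeably better than the default XGBoost model here. If you want to tune it, the same k_fold splitter can drive a grid search; a minimal sketch with an illustrative, untuned parameter grid:

# assumes X, y, k_fold and RandomForestClassifier as defined above; the grid values are only illustrative
from sklearn.model_selection import GridSearchCV

param_grid = {'n_estimators': [100, 150, 300],
              'max_depth': [None, 5, 10]}
search = GridSearchCV(RandomForestClassifier(criterion='entropy', random_state=1),
                      param_grid, cv=k_fold, scoring='accuracy', n_jobs=1)
search.fit(X, y)                                   # cross-validates every combination in the grid
print(search.best_params_, search.best_score_)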

4.0.4 Fit the data and Predict

[47]: model.fit(X_train, y_train)
y_pred = model.predict(X_valid)

[48]: # plotting the confusion matrix

from sklearn.metrics import plot_confusion_matrix

plot_confusion_matrix(model, X_valid, y_valid)
plt.show()
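Note that plot_confusion_matrix was removed in scikit-learn 1.2. On newer versions the equivalent call is ConfusionMatrixDisplay.from_estimator; a minimal sketch using the fitted model and validation split from above:

# equivalent plot on scikit-learn >= 1.0 (plot_confusion_matrix was removed in 1.2)
from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_estimator(model, X_valid, y_valid)
plt.show()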

[49]: # let's see the accuracy score, precision score, recall score, f1 score
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
accuracy = accuracy_score(y_valid, y_pred)
precision = precision_score(y_valid, y_pred, average='weighted')
recall = recall_score(y_valid, y_pred, average='weighted')
f1 = f1_score(y_valid, y_pred, average='weighted')
print(f"Accuracy score : {accuracy}\nPrecision score : {precision}\nRecall score : {recall}\nF1 score : {f1}")

Accuracy score : 0.7821229050279329

Precision score : 0.787092075315539

Recall score : 0.7821229050279329

F1 score : 0.7821229050279329
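If you want these metrics per class rather than as single weighted numbers, classification_report prints precision, recall, F1 and support for each class in one call; a minimal sketch using the same predictions:

# assumes y_valid and y_pred from the cell above
from sklearn.metrics import classification_report
print(classification_report(y_valid, y_pred, target_names=['Dead', 'Survived']))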

[50]: # Let's plot the decision boundary

from mlxtend.plotting import plot_decision_regions

from sklearn.decomposition import PCA

pca = PCA(n_components = 2)

X_train2 = pca.fit_transform(X_train)

model.fit(X_train2, y_train)

plot_decision_regions(X_train2, np.array(y_train).astype('int'), clf=model, legend=2)
plt.xlabel("x", size=14)

plt.ylabel("y", size=14)

plt.title('Random Forest Classifier Decision Region Boundary', size=16)

[50]: Text(0.5, 1.0, 'Random Forest Classifier Decision Region Boundary')

Fit on whole data and predict on test data

[51]: model.fit(X, y, sample_weight = weights)

preds = model.predict(test)

[52]: submission = pd.DataFrame({"PassengerId": test_data.PassengerId.astype('int'),
                                 'Survived': np.array(preds).astype('int')})
submission.to_csv('submission.csv', index=False)



