Mastering Predictive Analytics with Python (2016)
The parabola is a convex function because the values of the function between x1 and x2 (the two points where the blue line intersects the parabola) always lie below the blue line representing αF(x1) + (1-α)F(x2). As you can see, the parabola also has a global minimum between these two points.
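To make the convexity condition concrete, we can check it numerically for a parabola F(x) = x^2: for any two points x1 and x2 and any mixing weight α between 0 and 1, the function value at the interpolated point never exceeds αF(x1) + (1-α)F(x2). The points and weight below are illustrative choices, not values from the text:

>>> F = lambda x: x ** 2                        # the parabola
>>> x1, x2, alpha = -1.0, 3.0, 0.25             # illustrative points and mixing weight
>>> F(alpha * x1 + (1 - alpha) * x2) <= alpha * F(x1) + (1 - alpha) * F(x2)  # evaluates to True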
When we are dealing with matrices such as the Hessian referenced previously, this condition is fulfilled when the matrix is positive semidefinite, meaning that any vector multiplied by this matrix on either side (xTHx) yields a value ≥ 0. This means the function has a global minimum, and if our solution converges to a set of coefficients, we can be guaranteed that they represent the best parameters for the model, not a local minimum.

We noted previously that we could potentially offset an imbalanced distribution of classes in our data by reweighting individual points during training. In the formulas for either SGD or IRLS, we could apply a weight wi to each data point, increasing or decreasing its relative contribution to the value of the likelihood and to the updates made during each iteration of the optimization algorithm.
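As a sketch of this kind of reweighting (an illustration, not a command from the text), scikit-learn's SGDClassifier accepts a sample_weight argument to its fit method; here we arbitrarily double the weight of the positive class, assuming it is coded as 1 in census_income_train:

>>> import numpy as np
>>> point_weights = np.where(census_income_train == 1, 2.0, 1.0)  # illustrative per-point weights wi
>>> log_model_weighted = linear_model.SGDClassifier(alpha=10, loss='log', penalty='l2', n_iter=1000, fit_intercept=False).fit(census_features_train, census_income_train, sample_weight=point_weights)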
Now that we have described how to obtain the optimal parameters of the logistic regression model, let us return to our example and apply these methods to our data.

Fitting the model

We can use either SGD or second-order methods to fit the logistic regression model to our data. Let us compare the results of the two; using SGD, we fit the model with the following command:
>>> log_model_sgd = linear_model.SGDClassifier(alpha=10, loss='log', penalty='l2', n_iter=1000, fit_intercept=False).fit(census_features_train, census_income_train)
Here, the value log for the loss parameter specifies that it is a logistic regression we are training, n_iter specifies the number of times we iterate over the training data to perform SGD, and alpha represents the weight on the regularization term. We specify that we do not want to fit the intercept, to make comparison with other methods more straightforward (since the method of fitting the intercept could differ between optimizers). The penalty argument specifies the regularization penalty, which we already saw in Chapter 4, Connecting the Dots with Models – Regression Methods, for ridge regression. As l2 is the only penalty we can use with second-order methods, we choose l2 here as well to allow comparison between the methods. We can examine the resulting model coefficients by referencing the coef_ property of the model object:
>>> log_model_sgd.coef_
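If census_features_train is a pandas DataFrame with named columns (an assumption here, not something shown in the text), one convenient way to read the coefficients is to pair each one with its feature name:

>>> import pandas as pd
>>> # assumes census_features_train is a DataFrame; coef_ has shape (1, n_features)
>>> pd.Series(log_model_sgd.coef_[0], index=census_features_train.columns).sort_values()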
Compare these coefficients to the second-order fit we obtain using the following command:
>>> log_model_newton = linear_model.LogisticRegression(penalty='l2', solver='lbfgs', fit_intercept=False).fit(census_features_train, census_income_train)
As with the SGD model, we remove the intercept fit to allow the most direct comparison of the coefficients produced by the two methods. We find that the coefficients are not identical, with the output of the SGD model containing several larger coefficients. Thus, we see in practice that even with similar models and a convex objective function, different optimization methods can give different parameter results. However, we can see that the results are highly correlated based on a pairwise scatterplot of the coefficients:
>>> plt.scatter(log_model_sgd.coef_[0], log_model_newton.coef_[0])
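To put a number on this visual impression (an optional check, not part of the text), we can compute the Pearson correlation between the two coefficient vectors directly:

>>> import numpy as np
>>> # coef_ has shape (1, n_features), so take the first row of each before correlating
>>> np.corrcoef(log_model_sgd.coef_[0], log_model_newton.coef_[0])[0, 1]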