
Sunday, March 10, 2019

Regression in Python


Earlier, we discussed how to conduct regression analysis in R - Regression in R

In this article, we will go through some of the details of doing the same in Python.

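Assumption - The dependent and independent data sets are stored in the variables sbiR and niftyR respectively. Note that scikit-learn expects the independent data (niftyR) as a 2-D array of shape (n_samples, n_features).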

Code to Run Linear Regression

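from sklearn.linear_model import LinearRegression
# normalize=True was removed in scikit-learn 1.2; scale the inputs beforehand if needed
regression = LinearRegression()
# fit(X, y): independent data first, dependent data second
regression.fit(niftyR, sbiR)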

As you might notice, sklearn (scikit-learn) is the Python library that contains the linear_model module, and LinearRegression is the class we import from it to run the regression.

You can also print the R-squared (r2) value with regression.score(niftyR, sbiR)
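To make the steps above concrete, here is a minimal, self-contained sketch. The two return series are synthetic stand-ins generated with NumPy (an assumption for illustration, not real NIFTY or SBI data), and the slope of 1.2 is an arbitrary made-up value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the two return series (assumed data, not real prices).
rng = np.random.default_rng(0)
nifty_returns = rng.normal(0.0, 0.01, size=250)
sbi_returns = 1.2 * nifty_returns + rng.normal(0.0, 0.005, size=250)

# scikit-learn expects X as a 2-D array of shape (n_samples, n_features).
niftyR = nifty_returns.reshape(-1, 1)
sbiR = sbi_returns

regression = LinearRegression()
regression.fit(niftyR, sbiR)

slope = regression.coef_[0]            # estimated sensitivity of sbiR to niftyR
intercept = regression.intercept_
r2 = regression.score(niftyR, sbiR)    # R-squared on the data used for fitting
print("slope:", slope, "intercept:", intercept, "R^2:", r2)
```

The fitted slope should land close to the value used to generate the data, which is a quick sanity check that the arguments were passed in the right order.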

Code to Run Logistic Regression

from sklearn.linear_model import LogisticRegression
logistic = LogisticRegression()

# The dependent variable must hold class labels (for example 0/1), not continuous values
logistic.fit(niftyR,sbiR)

Everything else remains the same. The choice depends on the dependent variable (the output): when it is categorical, for example yes/no or up/down, rather than continuous, we need to use logistic regression.
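As a rough illustration of a categorical output, the sketch below turns a synthetic return series into a hypothetical up/down label (sbi_up, a name introduced here only for the example) and fits a logistic regression on it. The data are generated, not real, and returns are expressed in percent so that the default regularization does not swamp the signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic daily returns in percent (assumed data for illustration only).
rng = np.random.default_rng(1)
nifty_returns = rng.normal(0.0, 1.0, size=300)
sbi_returns = 1.2 * nifty_returns + rng.normal(0.0, 0.5, size=300)

X = nifty_returns.reshape(-1, 1)
# Hypothetical binary target: 1 when the stock return is positive, else 0.
sbi_up = (sbi_returns > 0).astype(int)

logistic = LogisticRegression()
logistic.fit(X, sbi_up)

# For classifiers, score() returns accuracy rather than R-squared.
accuracy = logistic.score(X, sbi_up)
# Estimated probability of an "up" day when the index return is +1 percent.
prob_up = logistic.predict_proba(np.array([[1.0]]))[0, 1]
print("accuracy:", accuracy, "P(up | index +1%):", prob_up)
```

Note that score() means something different here than for LinearRegression: it reports the fraction of correctly classified samples.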

Cross Validation & Train - Test Data sets

To test a model, we first need to split the available data into 2 parts - train data & test data. The split function is imported from the library below

# sklearn.cross_validation was removed in scikit-learn 0.20; use sklearn.model_selection
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

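Array X contains the independent data, while Y is the output. We split the whole data into 2 parts, run the regression on the train data, and check the results on the test data to establish whether the model can be used for prediction. If the error on the test data is much larger than on the train data, the model is overfitting.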

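from sklearn.metrics import mean_squared_error

regression.fit(X_train, y_train)
# Compare the error on the train data with the error on the unseen test data
print(mean_squared_error(y_true=y_train, y_pred=regression.predict(X_train)))
print(mean_squared_error(y_true=y_test, y_pred=regression.predict(X_test)))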
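Putting the split and the error check together, a minimal end-to-end sketch might look like the following. X and Y are synthetic arrays (an assumption for illustration). The last step uses cross_val_score for 5-fold cross-validation, which is one common way to do the cross-validation named in the heading; it is not necessarily the approach the original author had in mind.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic independent (X) and dependent (Y) data, assumed for illustration.
rng = np.random.default_rng(42)
X = rng.normal(0.0, 0.01, size=(300, 1))
Y = 1.2 * X[:, 0] + rng.normal(0.0, 0.005, size=300)

# Hold out roughly a third of the rows as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42
)

regression = LinearRegression()
regression.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, regression.predict(X_train))
test_mse = mean_squared_error(y_test, regression.predict(X_test))
print("train MSE:", train_mse, "test MSE:", test_mse)

# 5-fold cross-validation: each fold serves once as the held-out test set.
cv_r2 = cross_val_score(LinearRegression(), X, Y, cv=5, scoring="r2")
print("R^2 per fold:", cv_r2, "mean:", cv_r2.mean())
```

A single split gives one estimate that depends on which rows happen to fall into the test data; averaging over several folds gives a steadier picture of how the model would behave on new data.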

This article has details on installing Python and running Hello code on an Ubuntu machine

