
Sunday, March 10, 2019

Installing & Running Python Code in Ubuntu


To run Python, you will need two pieces of software:

Python, the interpreter that runs the code, and Jupyter, a browser-based notebook interface for managing projects and writing code.
Both ship together, so installing Anaconda takes care of both.


Installing Python & Jupyter

  1. Download the Anaconda installer (.sh file) from https://www.anaconda.com/distribution/#linux
  2. Run this command in the folder where the installer was saved in the previous step (use the name of the file you actually downloaded; Anaconda2 is the Python 2.7 build)
  • bash Anaconda2-5.3.1-Linux-x86_64.sh
Detailed instructions are on the Anaconda website - https://docs.anaconda.com/anaconda/install/linux/

Running 'Hello' code

Open Jupyter from your home directory using the command

./anaconda2/bin/jupyter-notebook

In the Jupyter page that opens, select New to create a notebook, type the Python code below and select Run

print("Hello Python")

Regression in Python


Earlier we discussed how to run a regression analysis in R - Regression in R

In this article, we will go through the details of doing the same in Python

Assumption - the dependent and independent data sets are stored in the variables sbiR and niftyR respectively
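A minimal sketch of how these two variables might be prepared, assuming they hold period-over-period returns computed from closing prices. The price values and the helper simple_returns are made up purely for illustration; the point is the shapes, since scikit-learn expects the independent variable as a 2-D array (rows are observations, columns are features) and the dependent variable as a 1-D array.

```python
import numpy as np

# Hypothetical closing prices, invented for illustration only
nifty_close = np.array([100.0, 101.0, 99.0, 102.0, 103.0, 101.5])
sbi_close = np.array([50.0, 50.8, 49.2, 51.1, 51.9, 50.9])


def simple_returns(prices):
    """Return period-over-period changes as fractions (0.01 means 1%)."""
    return prices[1:] / prices[:-1] - 1.0


# Independent variable: reshape to (n_samples, 1) because sklearn wants 2-D X
niftyR = simple_returns(nifty_close).reshape(-1, 1)
# Dependent variable: 1-D array of length n_samples
sbiR = simple_returns(sbi_close)

print(niftyR.shape, sbiR.shape)
```

Six prices give five returns, so niftyR has shape (5, 1) and sbiR has shape (5,).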

Code to Run Linear Regression

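from sklearn.linear_model import LinearRegression
# normalize=True is dropped here: the argument was removed in scikit-learn 1.2
regression = LinearRegression()
# niftyR must be 2-D (n_samples, n_features); sbiR is 1-D
regression.fit(niftyR, sbiR)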

As you might notice, sklearn (scikit-learn) is the Python library that provides the linear_model module, and LinearRegression is the class we import from it to run the regression.

You can also get the r2 value (coefficient of determination) by calling regression.score(niftyR, sbiR), which needs the data passed in as arguments
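The fit and score steps can be tried end to end with the sketch below. It uses synthetic data as a stand-in for niftyR and sbiR (the true slope of 1.2 and the noise levels are arbitrary assumptions, not market figures), so the fitted slope and r2 can be compared against known values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the article's variables (not real market data)
rng = np.random.RandomState(0)
niftyR = rng.normal(0.0, 0.01, size=(200, 1))            # independent, 2-D
sbiR = 1.2 * niftyR[:, 0] + rng.normal(0.0, 0.002, 200)  # dependent, 1-D

regression = LinearRegression()
regression.fit(niftyR, sbiR)

slope = regression.coef_[0]          # one coefficient per feature column
intercept = regression.intercept_
r2 = regression.score(niftyR, sbiR)  # score() needs X and y as arguments

print("slope:", slope)
print("intercept:", intercept)
print("r2:", r2)
```

The slope should come out close to 1.2 and the r2 close to 1, because the synthetic noise is small compared with the signal.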

Code to Run Logistic Regression

from sklearn.linear_model import LogisticRegression
logistic = LogisticRegression()

logistic.fit(niftyR,sbiR)

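Everything else remains the same. However, logistic regression is the right choice only when the dependent variable is categorical (for example up/down or yes/no), so the second argument to fit must hold class labels rather than continuous values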
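As a rough sketch of what a categorical target could look like, the example below turns a continuous series into a binary up/down label before fitting. The synthetic data, the label name sbi_up and the choice to express returns in percent are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic returns expressed in percent (not real market data)
rng = np.random.RandomState(1)
niftyR = rng.normal(0.0, 1.0, size=(300, 1))
sbiR = 1.2 * niftyR[:, 0] + rng.normal(0.0, 0.2, 300)

# Hypothetical binary target: 1 if the dependent series went up, else 0
sbi_up = (sbiR > 0).astype(int)

logistic = LogisticRegression()
logistic.fit(niftyR, sbi_up)

accuracy = logistic.score(niftyR, sbi_up)           # fraction classified correctly
probabilities = logistic.predict_proba(niftyR[:5])  # columns: P(class 0), P(class 1)

print("accuracy:", accuracy)
print(probabilities)
```

LogisticRegression applies L2 regularization by default, so the scale of the input matters; that is why the returns here are in percent rather than in small fractions.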

Cross Validation & Train - Test Data sets

To test a model, we first need to split the available data into two parts: train data and test data. The train_test_split function does this and is imported as follows

# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=42)

Array X contains the independent data while Y is the output. We split the whole data set into two parts, run the regression on the train data, and check the results on the test data to establish whether the model can be used for prediction

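from sklearn.metrics import mean_squared_error
regression.fit(X_train, y_train)
# Error on the data the model was fitted on
print(mean_squared_error(y_true=y_train, y_pred=regression.predict(X_train)))
# Error on data the model has not seen
print(mean_squared_error(y_true=y_test, y_pred=regression.predict(X_test)))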
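Put together, the split, fit and evaluation steps might look like the self-contained sketch below. X and Y are synthetic here (the slope of 1.2 and the noise level are arbitrary assumptions), so the numbers only illustrate the workflow.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for X (independent) and Y (output)
rng = np.random.RandomState(42)
X = rng.normal(0.0, 1.0, size=(300, 1))
Y = 1.2 * X[:, 0] + rng.normal(0.0, 0.2, 300)

# Hold out roughly a third of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42
)

regression = LinearRegression()
regression.fit(X_train, y_train)  # fit on the train part only

train_mse = mean_squared_error(y_train, regression.predict(X_train))
test_mse = mean_squared_error(y_test, regression.predict(X_test))

print("train MSE:", train_mse)
print("test MSE:", test_mse)
```

If the test error is close to the train error, the model is generalizing reasonably well; a test error far above the train error suggests overfitting.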

This article has details on installing Python and running 'Hello' code on an Ubuntu machine