
Friday, June 17, 2011

Regression Analysis in R

We will run a regression analysis on a few banking stocks against the NIFTY index to find their betas, and then decide how far these figures can be trusted. The validity of each regression coefficient is checked with a t-test: the t score is the coefficient estimate divided by its standard error, and it is compared against a t-distribution with the residual degrees of freedom.

To get the data into R, we download CSV files (from the nifty website) and use the following commands to read them and compute the return series.

> sbitemp <- read.csv("01-04-2010-TO-01-04-2011SBINALLN.csv")
> sbi <- sbitemp[[9]]
> pnbtemp <- read.csv("01-04-2010-TO-01-04-2011PNBALLN.csv")
> pnb <- pnbtemp[[9]]
> hdfctemp <- read.csv("01-04-2010-TO-01-04-2011HDFCBANKALLN.csv")
> hdfc <- hdfctemp[[9]]
> icicitemp <- read.csv("01-04-2010-TO-01-04-2011ICICIBANKALLN.csv")
> icici <- icicitemp[[9]]

> # nifty is assumed to hold the NIFTY index series, loaded the same way as the stocks above
> sbiR <- bankl(sbi, sbi)
> niftyR <- bankl(nifty, nifty)
> pnbR <- bankl(pnb, pnb)
> hdfcR <- bankl(hdfc, hdfc)
> iciciR <- bankl(icici, icici)

where the function bankl has the following source code:
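bankl <- function(first, second) {
  a <- first
  a[1] <- 0                  # no return is defined for the first observation
  k <- length(second)
  i <- 2
  while (i <= k) {
    # log return of observation i relative to observation i - 1
    a[i] <- log(second[i] / first[i - 1])
    i <- i + 1
  }
  a
}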
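When both arguments are the same price vector, as in the calls above, the loop in bankl amounts to taking first differences of log prices. A minimal vectorized sketch, assuming a plain numeric vector of prices (the name log_returns and the sample prices are made up for illustration and are not from the original post):

```r
# Hypothetical helper: log returns with a leading 0, matching bankl(x, x).
log_returns <- function(prices) {
  c(0, diff(log(prices)))
}

# Illustrative prices (invented for this sketch).
prices <- c(100, 102, 101, 105)
returns <- log_returns(prices)
print(returns)
```

Because log returns are additive, the entries sum to the log return over the whole period, which is a quick sanity check on any return series built this way.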

Let's plot the returns of NIFTY against the returns of SBI.
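A scatter plot is one way to draw this comparison. The sketch below uses simulated return series in place of sbiR and niftyR; the seed, the sample size and the slope of 1.15 are assumptions made only so that the example runs on its own.

```r
set.seed(1)
# Simulated daily log returns standing in for niftyR and sbiR.
nifty_sim <- rnorm(250, mean = 0, sd = 0.01)
sbi_sim   <- 1.15 * nifty_sim + rnorm(250, mean = 0, sd = 0.015)

# Scatter plot of stock returns against index returns.
plot(nifty_sim, sbi_sim,
     xlab = "NIFTY log return", ylab = "SBI log return",
     main = "Simulated SBI vs NIFTY returns")

# Add the least-squares line; its slope is the estimated beta.
abline(lm(sbi_sim ~ nifty_sim), col = "red")
```

With the real data, replacing nifty_sim and sbi_sim by niftyR and sbiR should give the corresponding picture for SBI.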

The next step is to run a pairwise regression of each security's returns on the index returns to find its beta. The glm function in R does the regression analysis, and its output can be saved in an object.
> glm(sbiR ~ niftyR)

Call:  glm(formula = sbiR ~ niftyR)

Coefficients:
(Intercept)       niftyR
  0.0005735    1.1515344

Degrees of Freedom: 254 Total (i.e. Null);  253 Residual
Null Deviance:     0.09468
Residual Deviance: 0.05277     AIC: -1434

We store the regression output in an object (here for SBI against NIFTY) and summarize it:

> sbi.linear <- glm(sbiR ~ niftyR)
> summary(sbi.linear)

Call:
glm(formula = sbiR ~ niftyR)

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-0.051342  -0.008509  -0.000283   0.008339   0.068983

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0005735  0.0009049   0.634    0.527
niftyR      1.1515344  0.0812352  14.175   <2e-16 ***
---

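The t value of 14.175 (p-value below 2e-16) shows that the estimated beta of 1.151 for SBI is significantly different from zero at well beyond the 99.99% confidence level. The intercept, with a t value of 0.634, is not significantly different from zero. Similarly, we run the regression for all the other stocks.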
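The t value in the summary can be reproduced by hand from the printed estimate and standard error. The sketch below copies those numbers and the 253 residual degrees of freedom from the output above; the variable names and the 95% interval are additions for illustration.

```r
# Values copied from the summary() output for the niftyR coefficient.
estimate  <- 1.1515344
std_error <- 0.0812352
df_resid  <- 253   # residual degrees of freedom reported by glm

# t value = estimate / standard error
t_value <- estimate / std_error

# Two-sided p-value from the t-distribution.
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

# 95% confidence interval for beta.
ci <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

print(round(t_value, 3))
print(p_value)
print(ci)
```

The interval is a more direct answer to "how sure are we about beta" than the t value alone, since the t value only tests whether beta differs from zero.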
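> pnb.linear <- glm(pnbR ~ niftyR)
> hdfc.linear <- glm(hdfcR ~ niftyR)
> icici.linear <- glm(iciciR ~ niftyR)
> sbiB <- sbi.linear[[1]][[2]]    # first component holds the coefficients; the second one is the slope
> sbiB
[1] 1.151534
> pnbB <- pnb.linear[[1]][[2]]
> hdfcB <- hdfc.linear[[1]][[2]]
> iciciB <- icici.linear[[1]][[2]]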
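Positional indexing such as [[1]][[2]] works but is easy to misread. The sketch below shows the same extraction by name with coef(), and checks that glm with its default gaussian family gives the same slope as lm. The data are simulated, and the names index_ret and stock_ret are placeholders rather than the series used in this post.

```r
set.seed(42)
# Simulated return series standing in for niftyR and sbiR.
index_ret <- rnorm(255, mean = 0, sd = 0.01)
stock_ret <- 0.0005 + 1.15 * index_ret + rnorm(255, mean = 0, sd = 0.014)

fit_glm <- glm(stock_ret ~ index_ret)  # gaussian family by default
fit_lm  <- lm(stock_ret ~ index_ret)   # ordinary least squares

# Extract the slope (beta) by name rather than by position.
beta_glm <- coef(fit_glm)[["index_ret"]]
beta_lm  <- coef(fit_lm)[["index_ret"]]

# Full coefficient table: estimate, standard error, t value, p-value.
coef_table <- summary(fit_lm)$coefficients
print(coef_table)
print(c(glm = beta_glm, lm = beta_lm))
```

For a plain linear regression like this one, lm is the more common choice; glm is kept here only to mirror the commands above.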







Distributions & Hypothesis Testing


In risk management we often need the statistical tools of distributions and hypothesis testing to make informed guesses about future portfolio returns and to protect ourselves from losses at minimum cost.
Plotting how often each level of stock return occurs over time gives us a distribution (it may be normal or log-normal). Some distributions are called parametric distributions, meaning they are fully described by a few parameters. A normal distribution is fully described by just its mean and variance.
Hypothesis testing is a way to verify our decisions. For example, suppose we want to know the return distribution of a portfolio but have only 30 days of data. Assuming stock returns follow a normal distribution, the 30 days of data can be used to estimate the mean return over a longer period. To test our hypothesis about that mean we use Student's t-test.
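T-score = (X - U) / (S / sqrt(n))
where X is the sample mean, U is the hypothesized mean, S is the sample standard deviation and n is the number of observations.
H0 : population mean = U
H1 : population mean not equal to U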

U in this case is the mean return of the portfolio that the portfolio manager uses as a benchmark.
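The T-score formula can be worked through in a few lines of R. This is a sketch on simulated data: the 30 returns, the seed and the benchmark mean of 0.0005 are invented for illustration.

```r
set.seed(7)
# 30 simulated daily returns standing in for the observed portfolio data.
returns <- rnorm(30, mean = 0.001, sd = 0.012)
benchmark_mean <- 0.0005   # U: hypothetical benchmark mean return

n <- length(returns)
# T-score = (X - U) / (S / sqrt(n))
t_score <- (mean(returns) - benchmark_mean) / (sd(returns) / sqrt(n))

# Critical value for a two-sided test at the 95% confidence level.
t_crit <- qt(0.975, df = n - 1)
reject_h0 <- abs(t_score) > t_crit

# The built-in test computes the same statistic.
builtin <- t.test(returns, mu = benchmark_mean)
print(c(manual = t_score, builtin = unname(builtin$statistic)))
print(reject_h0)
```

The manual statistic and the one from t.test should agree; t.test additionally reports the p-value and a confidence interval for the mean.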

We then locate the T-score on the Student's t-distribution with n - 1 degrees of freedom. If it falls in the critical region in either tail (the size of which depends on the chosen confidence level), the null hypothesis is rejected; that is, we reject the claim that the population mean equals U.
One of the most important points is choosing the right confidence level. Suppose we use a 95% confidence level. If the null hypothesis is true, the probability of wrongly rejecting it is 5%. If the null hypothesis is false, the probability of wrongly accepting it is B, which can lie anywhere between 0 and 95%. We have to strike the right balance between these two errors when using hypothesis testing as a tool for portfolio analysis.
Similarly, hypothesis tests can be used to check a portfolio manager's claims about daily return and volatility. This was the t-test; apart from it there are a number of other statistical tests, such as the normal test, the ANOVA test and the two-sample t-test, for comparing a portfolio's variance or returns with those of other portfolios. I recommend searching online for the details.
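The trade-off between the two error probabilities can be made concrete with power.t.test from the stats package. The inputs below (30 observations, a daily volatility of 1.2% and a true mean that differs from the benchmark by 0.5% per day) are hypothetical numbers chosen for illustration.

```r
# Hypothetical inputs: 30 daily returns, volatility 1.2% per day,
# true mean differing from the benchmark by 0.5% per day.
res <- power.t.test(n = 30, delta = 0.005, sd = 0.012,
                    sig.level = 0.05, type = "one.sample",
                    alternative = "two.sided")

type1 <- res$sig.level   # P(reject H0 | H0 is true)
type2 <- 1 - res$power   # P(accept H0 | H0 is false), B in the text
print(c(type1 = type1, type2 = type2))

# Tightening the significance level to 1% raises B for the same data.
res_strict <- power.t.test(n = 30, delta = 0.005, sd = 0.012,
                           sig.level = 0.01, type = "one.sample",
                           alternative = "two.sided")
type2_strict <- 1 - res_strict$power
print(type2_strict)
```

Lowering the chance of rejecting a true null hypothesis increases the chance of accepting a false one, unless more data are collected.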

Configuring a Linux Machine for QuickFIX C++

The first step towards programming with QuickFIX is to set up a machine that can run QuickFIX (a FIX Protocol API). I have Suse Linux 11.3 installed on my system.

Download the QuickFIX source code from http://www.quickfixengine.org/ and install it using the following steps:

cd Downloads/
tar -xzf quickfix-1.13.3.tar.gz
cd quickfix
./configure
sudo make
sudo make install

We can now run tradeclient from the examples, but first it needs a configuration file. The configuration file for tradeclient is stored in

/bin/cfg
cp tradeclient.cfg ../../example/tradeclient/

Now we can run make to compile the client code:

make

The next step is to execute the code. This can be done using
./tradeclient tradeclient.cfg

Similarly, we have to copy ordermatch.cfg before running ordermatch. In ordermatch.cfg, I changed the FIX42.XML path from ../spec to ../../ so that the required directories are found in the right place.

Now you can read through the source code to build your understanding of the FIX protocol and of how the QuickFIX engine is coded.