R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
>
>
> # from one of our students,
> # who pointed out the merge function (that I didn't know about)
>
> dji = read.table("http://www-stat.stanford.edu/~jtaylo/courses/stats202/data/dowjones.csv", sep=',', header=T)
> djiDate = strptime(dji$Date, format="%Y-%m-%d")
>
> sp = read.table("http://www-stat.stanford.edu/~jtaylo/courses/stats202/data/spx500.csv", sep=',', header=T)
> spDate = strptime(sp$Date, format='%Y-%m-%d')
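> # Sketch, not part of the original session: strptime() returns a POSIXlt
> # object. For calendar dates with no time of day, as.Date() with the same
> # format string gives a plain Date vector, which may be simpler to store in
> # a data frame. The vector d and the names dLt and dDate are hypothetical.
> d = c("2009-08-24", "2009-08-25")
> dLt = strptime(d, format = "%Y-%m-%d")
> dDate = as.Date(d, format = "%Y-%m-%d")
> stopifnot(inherits(dLt, "POSIXlt"), inherits(dDate, "Date"))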
>
> #Bind date data to each data frame:
> sp = cbind(sp,spDate)
> dji = cbind(dji,djiDate)
>
> #Change name of dow close so it's different than the sp500 name
> names(dji)[5] = "djiClose"
>
> #MERGE DATA using merge function:
> alldata = merge(dji, sp, by.x = "djiDate", by.y = "spDate")
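> # Sketch, not part of the original session: with by.x/by.y and the default
> # all = FALSE, merge() keeps only rows whose key appears in both data frames
> # (an inner join) and names the key column after by.x. The toy frames a and b
> # and their column names are hypothetical.
> a = data.frame(aDate = as.Date(c("2009-01-02", "2009-01-05", "2009-01-06")), aClose = c(1, 2, 3))
> b = data.frame(bDate = as.Date(c("2009-01-05", "2009-01-06", "2009-01-07")), Close = c(20, 30, 40))
> ab = merge(a, b, by.x = "aDate", by.y = "bDate")
> stopifnot(nrow(ab) == 2, identical(names(ab), c("aDate", "aClose", "Close")))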
>
> #sanity check:
> length(alldata$djiClose)
[1] 14940
> length(alldata$Close)
[1] 14940
>
> #OLS REGRESSION
> summary(lm(Close~djiClose, data=alldata))
Call:
lm(formula = Close ~ djiClose, data = alldata)
Residuals:
     Min       1Q   Median       3Q      Max
-136.541  -14.131   -7.805   18.731  261.991
Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept) 2.197e+00  4.493e-01    4.889 1.02e-06 ***
djiClose    1.161e-01  9.088e-05 1277.740  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 42.27 on 14938 degrees of freedom
Multiple R-squared: 0.9909, Adjusted R-squared: 0.9909
F-statistic: 1.633e+06 on 1 and 14938 DF, p-value: < 2.2e-16
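> # Sketch, not part of the original session: the estimates in a summary like
> # the one above can be read from the fitted object instead of the printout,
> # using coef() and summary()$r.squared. The data frame toy and the name fit
> # are hypothetical; the toy values were chosen so the least-squares slope is
> # 0.5 and the intercept is 1.98.
> toy = data.frame(x = c(1, 2, 3, 4, 5), y = c(2.4, 3.1, 3.4, 4.1, 4.4))
> fit = lm(y ~ x, data = toy)
> slope = unname(coef(fit)["x"])
> r2 = summary(fit)$r.squared
> stopifnot(isTRUE(all.equal(slope, 0.5)), r2 > 0, r2 < 1)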
>
>
>
>
> #DATA PLOT: two stacked panels sharing the same date axis:
> par(mfrow=c(2,1))
>
> plot(alldata$djiDate, alldata$djiClose, type='l', col='red', ylab='Dow Jones
+ close')
> plot(alldata$djiDate, alldata$Close, col='blue', type='l', ylab='S&P
+ 500 close')
>
>
>
>
> proc.time()
   user  system elapsed
  3.820   2.012   8.552
