Predictive market modeling in R Language

July 29, 2017 09:17 AM

Using R to predict tomorrow’s returns 

The complete code is too long to show here and will be posted in the online version. Here are some key parts: 

Let our “from” date be 2005-01-02.


library(quantmod)

from_date = "2005-01-02"

getSymbols("^GSPC", from=from_date)

spReturns = diff(log(Cl(GSPC)))

spReturns[as.character(head(index(Cl(GSPC)),1))] = 0

This creates an xts object named GSPC, which is the symbol name without the ^. We then take the log of the close and difference it, which gives daily log returns and makes the data stationary. The first return has no prior day to compare against, so we set it to 0.
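To see what the log-difference step produces, here is a minimal sketch in base R on a made-up price vector. The prices and the names close_px and log_ret are hypothetical and are not part of the article's code. Note that base diff() drops the first element, whereas diff() on an xts object keeps it as NA, which is why the article sets it to 0.

```r
# Hypothetical closing prices, for illustration only (not S&P 500 data)
close_px <- c(100, 102, 101, 103.02)

# Log returns: the difference of the log prices
log_ret <- diff(log(close_px))

# Pad the first day with 0, mirroring the article's treatment of the first row
log_ret <- c(0, log_ret)

# Log returns are additive: their sum equals log(last price / first price)
total <- sum(log_ret)
print(round(log_ret, 5))
print(all.equal(total, log(close_px[4] / close_px[1])))
```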

We next set up a window for testing and create a vector to save the forecasts: 

windowLength = window_Length

foreLength = length(spReturns) - windowLength

forecasts <- vector(mode="character", length=foreLength)
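The rolling-window bookkeeping can be sketched as follows. This is an illustration under stated assumptions, not the article's loop: the returns are simulated, the window size of 15 is arbitrary, and the "model" is a placeholder (the sign of the window mean) standing in for the ARIMA/GARCH fit. In this sketch the last window stops one row short of the end of the data, so every forecast has a realized return to compare against.

```r
set.seed(1)
returns <- rnorm(20, sd = 0.01)   # simulated stand-in for spReturns
windowLength <- 15                # hypothetical window size
foreLength <- length(returns) - windowLength

forecasts <- vector(mode = "character", length = foreLength)

for (d in 0:(foreLength - 1)) {
  # Rows (1 + d) through (windowLength + d): one window of returns
  window_ret <- returns[(1 + d):(windowLength + d)]

  # Placeholder model (an assumption): forecast the sign of the window mean
  forecasts[d + 1] <- as.character(sign(mean(window_ret)))
}

print(forecasts)
```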

Next we set up a loop that steps through the data, fitting our ARIMA/GARCH hybrid model to each window.

We save the ARIMA order with the lowest AIC as the final order and use it as the mean part of the GARCH model.
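One way to pick an order by AIC is a small grid search with base R's arima(). The sketch below assumes a search over p and q from 0 to 2 with no differencing, and runs on a simulated series. The search range and the names final.aic and final.order are illustrative choices, not taken from the article's full code.

```r
set.seed(42)
# Simulated AR(1) series standing in for one window of returns
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 300))

final.aic <- Inf
final.order <- c(0, 0, 0)

for (p in 0:2) {
  for (q in 0:2) {
    if (p == 0 && q == 0) next   # skip the pure-noise model

    # Some orders may fail to fit; treat a failure as "no candidate"
    fit <- tryCatch(arima(x, order = c(p, 0, q)),
                    error = function(e) NULL)
    if (is.null(fit)) next

    # Keep the order with the lowest AIC seen so far
    current.aic <- AIC(fit)
    if (current.aic < final.aic) {
      final.aic <- current.aic
      final.order <- c(p, 0, q)
    }
  }
}

print(final.order)
```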

The next step is to take the best model found for ARIMA and pass it into the ARMA part of the GARCH algorithm (see “Mixing models,” below).
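A hedged sketch of this hand-off, assuming the rugarch package is used for the GARCH step (the article does not name the package here). The GARCH(1,1) variance model, the normal distribution, the "hybrid" solver and the fall-back to a long position when the fit fails are all assumptions made for illustration. The helper name forecast_next_sign is hypothetical.

```r
# Sketch only: uses rugarch if available; the settings below are assumptions
forecast_next_sign <- function(x, p, q) {
  tryCatch({
    spec <- rugarch::ugarchspec(
      variance.model = list(garchOrder = c(1, 1)),
      # The ARIMA order found earlier becomes the ARMA mean model
      mean.model = list(armaOrder = c(p, q), include.mean = TRUE),
      distribution.model = "norm"
    )
    fit <- rugarch::ugarchfit(spec, x, solver = "hybrid")

    # One-step-ahead forecast of the conditional mean
    fore <- rugarch::ugarchforecast(fit, n.ahead = 1)
    mu <- as.numeric(fore@forecast$seriesFor[1])

    if (mu < 0) -1 else 1
  }, error = function(e) 1)   # assumed fall-back: stay long if the fit fails
}

# Run the example only when rugarch is installed
if (requireNamespace("rugarch", quietly = TRUE)) {
  set.seed(7)
  x <- rnorm(500, sd = 0.01)   # simulated returns
  print(forecast_next_sign(x, p = 1, q = 1))
}
```

Only the sign of the forecast is kept, which is all the backtest later in the article needs.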

Each row of our output pairs today’s date with the forecast for tomorrow, so we need two versions of the forecasts. The first is the “tomorrow” forecast, which is calculated using data from today back to window-length days. For the second, we shift the forecasts forward (or, equivalently, the dates backward) so that each forecast sits on the row holding the actual return it predicted. This lets us use a little trick to run a backtest.
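The alignment step can be illustrated with a few made-up rows. The dates, forecasts and returns below are hypothetical. The point is only that a forecast made on one date is moved down one row so that it sits beside the return it was trying to predict.

```r
# Hypothetical data: five consecutive trading days
dates <- as.Date("2017-01-09") + 0:4
forecast_for_tomorrow <- c(1, -1, 1, 1, -1)                # made on each date
actual_return <- c(0.004, -0.002, -0.001, 0.003, 0.005)    # realized on each date

n <- length(dates)

# Shift forecasts forward one row: the forecast made on day t
# is paired with the return realized on day t + 1
aligned <- data.frame(
  date     = dates[-1],
  forecast = forecast_for_tomorrow[-n],
  actual   = actual_return[-1]
)

print(aligned)
```

The first date has no forecast and the last forecast has no realized return yet, so the aligned table is one row shorter than the input.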

spArimaGarch = as.xts( read.zoo(file="forecasts_new.csv", format="%Y-%m-%d", header=F, sep=","))

spIntersect = merge( spArimaGarch[,1], spReturns, all=F )

spArimaGarchReturns = spIntersect[,1] * spIntersect[,2]

Here we merge the shifted GARCH forecasts with the returns. Multiplying the sign of each forecast by the actual return on that day gives us a poor man’s backtest. We then add 1 to each return, take the cumulative product and convert the result to a log scale.

spArimaGarchCurve = log( cumprod( 1 + spArimaGarchReturns ) )

spBuyHoldCurve = log( cumprod( 1 + spIntersect[,2] ) )
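A small worked example of the same arithmetic on plain vectors, using made-up numbers (the signs and returns below are hypothetical). A correct short call turns a negative market return into a positive strategy return, and a wrong call does the opposite.

```r
forecast_sign <- c(1, -1, 1, -1)                    # hypothetical aligned forecasts
actual_return <- c(0.010, -0.020, -0.005, 0.015)    # hypothetical daily returns

# Sign of the forecast times the realized return
strategy_return <- forecast_sign * actual_return

# Equity curves on a log scale, following the article's formula
strategy_curve <- log(cumprod(1 + strategy_return))
buy_hold_curve <- log(cumprod(1 + actual_return))

print(strategy_return)
print(round(cbind(strategy_curve, buy_hold_curve), 5))
```

Days 2 and 4 show the two cases: the short call on day 2 was right and earns 0.020, while the short call on day 4 was wrong and loses 0.015.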

We then merge the prediction equity curve with the buy-and-hold curve, plot the result and save it as a PDF. Note that you can only write the PDF to the active directory.

spCombinedCurve = merge( spArimaGarchCurve, spBuyHoldCurve, all=F )


library(lattice)

pl <- xyplot( spCombinedCurve, superpose=T, col=c("darkred", "darkblue"),

              lwd=2, key=list( text=list(c("ARIMA+GARCH", "Buy & Hold")),

                        lines=list(lwd=2, col=c("darkred", "darkblue"))))


print(pl)
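Writing a plot to a PDF in the working directory can be sketched with base R's pdf() device. This example uses base graphics and made-up curve values rather than the article's lattice plot, and the file name is hypothetical. With a lattice object, the print(pl) call would go between pdf() and dev.off() in the same way.

```r
# Hypothetical log-scale equity curves, for illustration only
curves <- cbind(strategy = c(0.010, 0.030, 0.025, 0.010),
                buy_hold = c(0.010, -0.010, -0.015, 0.000))

# File name is an assumption; it is written to the current working directory
out_file <- file.path(getwd(), "equity_curves_sketch.pdf")

pdf(out_file, width = 8, height = 5)   # open the PDF device
matplot(curves, type = "l", lty = 1, lwd = 2,
        col = c("darkred", "darkblue"),
        xlab = "Day", ylab = "Log equity")
legend("topleft", legend = c("ARIMA+GARCH", "Buy & Hold"),
       lty = 1, lwd = 2, col = c("darkred", "darkblue"))
dev.off()                              # close the device so the file is written
```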

This gives you an insight into the methods of creating trading models with machine learning that can predict market returns. In the second part of this series we will show you how it all works and measure the results.  

About the Author

Murray A. Ruggiero Jr. is the author of "Cybernetic Trading Strategies" (Wiley). E-mail him at