
Introduction

 

Due to the almost “unpredictable” nature of the stock market, predicting stock prices is one of the most challenging problems in the financial services industry. In the financial literature, a stock price is treated as a stochastic process and modeled with geometric Brownian motion (GBM), which is expressed as the following stochastic differential equation (SDE):

$$dS(t) = a\,S(t)\,dt + b\,S(t)\,dW(t)$$

Here S(t) is the price of the stock at time t, a is the percentage drift, b is the percentage volatility, and W(t) is the Brownian motion term. The equation has a closed-form solution:

$$S(t) = S(0)\,\exp\!\left(\left(a - \frac{b^2}{2}\right)t + b\,W(t)\right)$$

In practice, however, the percentage volatility b is hard to obtain reliably, which makes it extremely difficult to model the dynamics of a stock or predict its trends with this equation.
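To make the closed-form solution of GBM concrete, here is a minimal simulation sketch. It is only an illustration: the function name `simulate_gbm`, the daily step size of 1/252 years, and the drift and volatility values are assumptions for the example, not values from this study.

```python
import math
import random


def simulate_gbm(s0, a, b, n_steps, dt=1.0 / 252, seed=None):
    """Simulate one GBM price path with the exact closed-form update.

    s0: initial price S(0); a: percentage drift; b: percentage volatility.
    dt: step size in years (1/252 assumes daily steps, an illustrative choice).
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        # A Brownian increment over dt is normal with mean 0 and variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Exact step: S(t + dt) = S(t) * exp((a - b^2 / 2) * dt + b * dW)
        path.append(path[-1] * math.exp((a - 0.5 * b * b) * dt + b * dw))
    return path


if __name__ == "__main__":
    # Hypothetical parameters: 5% drift, 20% volatility, one year of daily steps.
    prices = simulate_gbm(s0=100.0, a=0.05, b=0.2, n_steps=252, seed=0)
    print(f"start={prices[0]:.2f} end={prices[-1]:.2f}")
```

The simulation needs a and b as inputs, which is exactly the difficulty noted above: without a trustworthy estimate of b, the simulated paths say little about a real stock.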


As a compromise, we use neural network models to predict stock price trends. As a data-driven approach, neural networks treat stock prices as time series, use historical data to learn the model parameters, and then make predictions for the future. Neural networks also come in a wide variety of architectures, which makes them good candidates for ensemble learning, an approach that usually produces more accurate predictions than any single model.


In this blog, we discuss building an ensemble prediction system of neural network models to predict the price trends of 542 technology companies’ stocks. We also show how to train the prediction system in parallel using the HPC (high performance computing) power of an Intel® Xeon cluster.

 



Ensemble Model for Predicting a Bundle of Stocks


The prices of different stocks do not evolve independently of each other. Instead, they influence each other in ways that are very hard to know exactly. We can approximately model the dynamics of stock prices with the following system:

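$$S_i(t) = F_i\big(S_1(t-1), \ldots, S_1(t-k),\; \ldots,\; S_n(t-1), \ldots, S_n(t-k)\big), \quad i = 1, 2, \ldots, n$$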

Here k is the maximum lag used for the predictors. In this blog, we explore using neural network models to approximate F_1, F_2, ..., F_n. As shown in Figure 1, we first train three different neural network models, a multi-layer perceptron (MLP), a convolutional neural network (CNN) and a long short-term memory (LSTM) network, and then produce an ensemble forecast via majority voting, i.e., we predict that the price of a stock will increase (or decrease) if at least two of the three models predict so.
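The majority-voting step can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the 1 = up / 0 = down encoding, and the sample predictions are all assumptions made for the example.

```python
def majority_vote(votes):
    """Return 1 (up) if more than half of the binary votes are 1, else 0 (down)."""
    ups = sum(votes)
    return 1 if 2 * ups > len(votes) else 0


def ensemble_trends(per_model_predictions):
    """Combine per-stock trend predictions from several models.

    per_model_predictions maps a model name to a list with one 0/1 entry
    per stock; all lists must have the same length and stock order.
    """
    return [majority_vote(votes) for votes in zip(*per_model_predictions.values())]


if __name__ == "__main__":
    # Hypothetical predictions for four stocks from the three models.
    predictions = {
        "mlp": [1, 0, 1, 0],
        "cnn": [1, 1, 0, 0],
        "lstm": [0, 1, 1, 0],
    }
    print(ensemble_trends(predictions))
```

With an odd number of models and a binary trend there are no ties, which is one practical reason to combine three models rather than two or four.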

figure1.png

Figure 1: Training the ensemble model for stock price trend prediction on multiple nodes with Intel® Xeon processors




Low Latency Prediction with Intel® Xeon Scalable Processors


Latency is critical for stock trading. An investment portfolio on the stock market needs to be adjusted frequently to hedge risk (that is, to minimize the probability of loss). The hedging strategy needs to be executed at high frequency with very low latency, so both training and prediction must finish within very short time periods.


We leverage the HPC power of Intel® Xeon processors to build a low latency neural network system for real-time prediction of stock price trends. The system in Figure 1 was tested on the Zenith supercomputer in the Dell EMC HPC and AI Innovation Lab, which consists of 422 Dell EMC PowerEdge C6420 servers, each with 2 Intel® Xeon Scalable Gold 6148 processors. Each of these processors has 20 physical cores.


In Figure 1, the MLP, CNN and LSTM models can be trained simultaneously with multiple processes (e.g., 40 processes) on 3 Zenith nodes. Multi-process training was performed with Uber’s Horovod framework. We trained the models on historical data from the past 200 consecutive trading days, with a batch size of 8 for each process. We tested over 20 trading days, so there were 10,840 (20 × 542) trends to predict. The ensemble system predicted about 56% of them correctly.
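Horovod trains in a synchronous data-parallel fashion: each process computes gradients on its own batch, the gradients are averaged across processes with an allreduce, and every process applies the same update. The sketch below simulates that idea in a single Python process. It does not use the Horovod API, and the toy one-parameter model, the data, and all function names are assumptions for illustration only.

```python
def gradient(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def allreduce_average(values):
    """Stand-in for an allreduce that averages one value per worker."""
    return sum(values) / len(values)


def data_parallel_step(w, shards, lr):
    """One synchronous step: local gradients, average, identical update."""
    local_grads = [gradient(w, shard) for shard in shards]
    return w - lr * allreduce_average(local_grads)


if __name__ == "__main__":
    n_workers, per_worker_batch = 40, 8  # process count and batch size from the text
    # Toy data generated from y = 3x (hypothetical, for illustration only).
    data = [(0.01 * i, 0.03 * i) for i in range(n_workers * per_worker_batch)]
    # Each simulated worker receives an equally sized shard of the global batch.
    shards = [data[rank::n_workers] for rank in range(n_workers)]
    w = 0.0
    for _ in range(200):
        w = data_parallel_step(w, shards, lr=0.1)
    print(f"global batch = {len(data)}, fitted w = {w:.3f}")
```

With 40 processes and a per-process batch size of 8, each synchronous step effectively uses a global batch of 320 samples. Because the shards are equally sized, the averaged gradient equals the gradient over the whole global batch.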

figure2.png

Figure 2: Training time comparison for 1 process and 40 processes.

Figure 2 compares the training time for 1 process and 40 processes. As shown, distributed training over multiple processes reduces training time by a factor of about 10. The speed-up is especially significant for the LSTM model, because it has more parameters and usually takes longer to train than the other models. With this prediction system, the latency is less than 20 seconds. A stock trading system can take advantage of this low latency to react more quickly to changes and fluctuations in the market, which ultimately helps optimize the investment portfolio so that the risk of loss is lower.
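The speed-up figure can be related to parallel efficiency with a short calculation. The sketch below is illustrative only: the function names are assumptions, and the timings are hypothetical values in arbitrary units, chosen solely to reproduce the roughly 10x ratio reported above.

```python
def speedup(t_single, t_parallel):
    """Ratio of single-process training time to multi-process training time."""
    return t_single / t_parallel


def parallel_efficiency(t_single, t_parallel, n_processes):
    """Speed-up divided by the number of processes (1.0 means ideal scaling)."""
    return speedup(t_single, t_parallel) / n_processes


if __name__ == "__main__":
    # Hypothetical timings (arbitrary units), not measurements from Figure 2.
    t_1, t_40 = 100.0, 10.0
    print(f"speed-up:   {speedup(t_1, t_40):.1f}x")
    print(f"efficiency: {parallel_efficiency(t_1, t_40, 40):.0%}")
```

A speed-up of about 10 on 40 processes corresponds to a parallel efficiency of roughly 25%, so for models of this size the gain comes from lower wall-clock time rather than from ideal scaling.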




Summary


While deep neural networks have had great success in areas such as computer vision, natural language processing, health care and weather forecasting, their applications in the financial industry deserve more exploratory research. Since most operations in the financial industry (e.g., stock trading and options trading) are time-sensitive and require low latency, training prediction models in parallel is a necessity when financial companies integrate artificial intelligence (AI) applications into their business. HPC clusters with Intel® Xeon Scalable processors are good infrastructure choices for building such low latency AI systems.