A Picture is Worth a Thousand Numbers

I’ve enjoyed the recent research and writing about politics and voting data. Time for a short break from that (at least until closer to the Georgia Senate election). Back to some deep learning topics for a bit. The most well-known advances in deep learning have been in computer vision (self-driving cars) and natural language processing (see GPT-3 and BERT). On some tasks, these models have matched or eclipsed human performance. At least publicly, fewer advances have been made in modeling time-series data, despite the prevalence of such data. As I’ve written about in the past, a fully-connected neural network or an LSTM model is often used to model time series or other tabular data, with mixed results. Many simpler time series problems can be reasonably well addressed with other machine learning techniques such as random forests and gradient boosting machines.

Recently, a number of academic papers (e.g. here and here) have proposed applying the power of computer vision to time series data. They are inspired by a fascinating 2015 paper, “Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks” by Zhiguang Wang and Tim Oates, presented at the Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence. The paper suggests turning one-dimensional time series data into two-dimensional images using two different concepts called Gramian Angular Fields (GAF) and Markov Transition Fields (MTF). Once an image dataset is produced, they apply Convolutional Neural Network (CNN) models to those datasets.

I applied Wang and Oates’s theories to a financial time series, S&P 500 equity futures. The dataset is daily closing prices of the continuous contract going back to 1998. I took rolling 90-day windows of prices and then calculated the return 90 days out into the future. That time series dataset was converted into images using the GAF technique. With that image dataset in hand, I then trained a CNN to classify the time series based on whether the future 90-day move was “big” (a gain or loss of more than 4%) or “small” (a move of less than 4% in either direction). I selected the 4% threshold for big versus small moves arbitrarily.

With only minutes of training, I was able to achieve an accuracy of nearly 80%. It’s easy to imagine how models such as these could be used in conjunction with other fundamental and technical techniques to further increase accuracy in financial time series forecasting.

The potential applications of this type of technique extend well beyond financial data. It can be used with virtually any time series: sports statistics, weather data, demographic information, voting data, etc.

Full code and data are on my GitHub under post 81. Many thanks to Dr. Shashank Virmani for his help in deciphering GAFs.

Converting Time Series to Images (Gramian Angular Fields)

Wang and Oates proposed two different types of transformations; I discuss one of them here, Gramian Angular Fields (GAF). As shown below by the diagram from the paper, the goal is to turn a time series into a picture:

The math is complex, and its full explanation is beyond the scope of this post. I don’t claim special expertise in the math. Below is a simple explanation.

Step 1: Rescale the Data

Per the authors, “we represent time series in a polar coordinate system instead of the typical Cartesian coordinates.”

The time series is rescaled so all values fall in the interval between -1 and 1 (or 0 and 1).
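As a rough illustration of this rescaling step, here is a minimal min-max sketch in NumPy. The function name `rescale` and the sample prices are made up for illustration; the paper and pyts may handle edge cases differently.

```python
import numpy as np

def rescale(series, low=-1.0, high=1.0):
    """Min-max rescale a 1-D series so its values fall in [low, high]."""
    series = np.asarray(series, dtype=float)
    s_min, s_max = series.min(), series.max()
    if s_max == s_min:
        raise ValueError("a constant series cannot be min-max rescaled")
    unit = (series - s_min) / (s_max - s_min)  # values now in [0, 1]
    return unit * (high - low) + low           # stretch and shift to [low, high]

prices = [100.0, 102.0, 101.0, 105.0, 110.0]   # made-up prices, for illustration only
scaled = rescale(prices)                       # smallest price maps to -1, largest to 1
```

Passing `low=0.0` gives the alternative 0-to-1 range mentioned above.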

Step 2: Represent the Data in Polar Coordinates

To understand the next step, we need to review some geometry. There are two main systems for pinpointing a location on a map or graph: Cartesian and polar. Per the website mathsisfun.com, Cartesian coordinates mark a point by how far along and how far up it is from (0,0). Polar coordinates mark how far away a point is and at what angle. The images below show an example of the point (12,5). In Cartesian coordinates, it is 12 across and 5 up. In polar coordinates, it is at an angle of 22.6° and a radius of 13.
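For readers who want to check the (12,5) example themselves, here is a small sketch using only Python's standard library. The helper name `to_polar` is my own, not something from the paper.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (radius, angle in degrees)."""
    radius = math.hypot(x, y)                    # distance from the origin
    angle_deg = math.degrees(math.atan2(y, x))   # angle measured from the x-axis
    return radius, angle_deg

radius, angle = to_polar(12, 5)
print(round(radius, 1), round(angle, 1))  # 13.0 22.6
```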

Given a time series X, Wang and Oates convert all values of X to polar coordinates using the following equation:

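$$\theta_i = \arccos(\tilde{x}_i), \quad -1 \le \tilde{x}_i \le 1, \qquad r_i = \frac{t_i}{N}$$

Here $\tilde{x}_i$ is the i-th value of the rescaled time series, $t_i$ is its time stamp, and $N$ is a constant (equation from Wang and Oates).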

Let’s break that down. Remember, to identify polar coordinates you need two things: an angle (ϴ) and a radius (r). Wang and Oates propose the time stamp as the radius (r): “In the equation above, ti is the time stamp and N is a constant factor to regularize the span of the polar coordinate system.” The angle (ϴ) comes from the rescaled value itself, which is treated as the cosine of the angle. Here is the basic math:

From trigonometry, we know that the cosine of an angle equals the x-value divided by the radius. This would be part one in converting Polar coordinates to Cartesian. See the example below:

The authors propose taking the arccos of each value of the rescaled time series X. What is the arccos? Per RapidTables.com, “The arccosine of x is defined as the inverse cosine function of x when -1≤x≤1.” Showing the math makes it easier to understand: when cos(y) = x, then arccos(x) = y.

Take the example above, with the Cartesian point (12,5) representing x and y. We know that cos(22.6°) = x/13 = 12/13, with 13 being the radius. Per the arccos formula above, we can thus infer that arccos(12/13) = 22.6°. Voila.
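The same arithmetic can be checked in a few lines of standard-library Python. This is only a sanity check of the worked example, not part of the GAF method itself.

```python
import math

x, radius = 12, 13

# arccos undoes cos: recover the angle from x / radius
angle_deg = math.degrees(math.acos(x / radius))
print(round(angle_deg, 1))  # 22.6

# Going the other way, cos(angle) * radius gives back x
x_again = radius * math.cos(math.radians(angle_deg))
```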

In our time series, each rescaled value already lies between -1 and 1, so it can be treated directly as a cosine: the angle is simply the arccos of the value. The radius is the timestamp (regularized using a constant N). So for each point in the time series, we can calculate both the angle and the radius of its polar coordinate. With both in hand, you have what the authors describe:

“Given a time series, the proposed map produces one and only one result in the polar coordinate system with a unique inverse map.”

What you end up with is an angle and a distance for each point in the time series. As this diagram from the paper suggests, each successive point sits farther from the origin, since the radius grows with the timestamp. For example, with an assumed constant N of 10, the point at time 1 has a radius of 1/10, or 0.1. The point at time 2 has a radius of 0.2, and is thus farther away.
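To make the polar encoding concrete, here is a minimal NumPy sketch. It assumes the series has already been rescaled into [-1, 1] and that time stamps are simply 1, 2, 3, ...; the function name `polar_encode` and the default choice of N are my own assumptions.

```python
import numpy as np

def polar_encode(scaled, n=None):
    """Return (angles in radians, radii) for an already-rescaled series."""
    scaled = np.asarray(scaled, dtype=float)
    if np.any(np.abs(scaled) > 1):
        raise ValueError("values must lie in [-1, 1]; rescale the series first")
    angles = np.arccos(scaled)               # angle = arccos of each value
    if n is None:
        n = len(scaled)                      # assumption: default N to the series length
    time_stamps = np.arange(1, len(scaled) + 1)
    radii = time_stamps / n                  # radius = time stamp / N
    return angles, radii

angles, radii = polar_encode([-1.0, 0.0, 0.5, 1.0], n=10)
# radii are 0.1, 0.2, 0.3, 0.4: each later point sits farther from the origin
```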

Step 3: Create Gramian Angular Fields

The final step is to create an image and here the authors define a Gramian Summation Angular Field (GASF) and a Gramian Difference Angular Field (GADF).

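$$\mathrm{GASF}_{i,j} = \cos(\theta_i + \theta_j), \qquad \mathrm{GADF}_{i,j} = \sin(\theta_i - \theta_j)$$

Here $\theta_i$ and $\theta_j$ are the angles of the points at times i and j (definitions from Wang and Oates).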

Let’s break this down as well. What the authors are doing is comparing every possible pair of points in a time series. If a time series has 10 steps, they compare step 1 to steps 1 through 10, then step 2 to steps 1 through 10, and so on. The proposed approach is to take the cosine of the sum of the two angles (for the GASF) or the sine of their difference (for the GADF), which produces a single number per pair. Those numbers are arranged in a grid with one cell for each pair of time points. So there is a cell for time (1,2) and for every other combination of two points in the data series. All of the numbers by definition will be between -1 and 1, and so a color scheme is applied to show differences. As shown in the paper, you end up with something like this:

Note that the above picture has x- and y-axes that end at ~500. This means that the time series used to create it has roughly 500 points. The authors make sure to highlight a key point, which is that GAFs retain “temporal dependency”. The top-left point in the above image is time (1,1), then as you go down, (1,2). Thus moving from top-left to bottom-right shows time increasing. Since each point on the GAF represents one point in the time series relative to another, points close together in time will tend to have a similar color if prices don’t change quickly. Points far apart in time are more likely to differ in color, since prices have had more time to move.
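The pairwise comparison in Step 3 can be sketched in a few lines of NumPy. This assumes the series is already rescaled into [-1, 1] and uses the paper's definitions, cos(ϴi + ϴj) for the GASF and sin(ϴi - ϴj) for the GADF. The function name is my own, and in practice a library such as pyts handles this along with resizing.

```python
import numpy as np

def gramian_angular_fields(scaled):
    """Return (GASF, GADF) matrices for an already-rescaled 1-D series."""
    scaled = np.asarray(scaled, dtype=float)
    theta = np.arccos(np.clip(scaled, -1.0, 1.0))    # one angle per time step
    # Broadcasting builds every (i, j) pair of angles at once
    gasf = np.cos(theta[:, None] + theta[None, :])   # cosine of summed angles
    gadf = np.sin(theta[:, None] - theta[None, :])   # sine of angle differences
    return gasf, gadf

series = np.array([-1.0, 0.0, 1.0])
gasf, gadf = gramian_angular_fields(series)
# A series of length n yields two n x n grids with values in [-1, 1]
```

The GASF is symmetric (comparing i to j gives the same number as comparing j to i), while the GADF flips sign when the pair is swapped.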

Applying GAFs to the S&P 500

I applied Wang and Oates’s theories to a financial time series, S&P 500 equity futures. The dataset is daily closing prices of the continuous contract going back to 1998. I took rolling 90-day windows of prices and then calculated the return 90 days out into the future. That time series dataset was converted into images using the GAF technique. With that image dataset in hand, I then trained a CNN to classify the time series based on whether the future 90-day move was “big” (a gain or loss of more than 4%) or “small” (a move of less than 4% in either direction). I selected the 4% threshold for big versus small moves arbitrarily.

With only minutes of training, I was able to achieve an accuracy of nearly 80%. It’s easy to imagine how models such as these could be used in conjunction with other fundamental and technical techniques to further increase accuracy in financial time series forecasting.
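Here is one way the windowing and labeling could be sketched. It is not the exact code from my repo. It assumes “90 days” means 90 rows of daily closes, that the future return is measured from the last close in each window, and that a move counts as “big” when its absolute value exceeds the threshold. The function name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(closes, window=90, horizon=90, threshold=0.04):
    """Build rolling price windows and a big/small label for each one."""
    closes = np.asarray(closes, dtype=float)
    windows, labels = [], []
    last_start = len(closes) - window - horizon
    for start in range(last_start + 1):
        end = start + window                   # exclusive end of the window
        last_close = closes[end - 1]           # final price inside the window
        future_close = closes[end - 1 + horizon]
        future_return = future_close / last_close - 1.0
        windows.append(closes[start:end])
        labels.append("big" if abs(future_return) > threshold else "small")
    return np.array(windows), labels

# Synthetic prices drifting up 0.1% per day, for illustration only
closes = 100.0 * 1.001 ** np.arange(200)
windows, labels = make_windows(closes)
```

Each row of `windows` is one time series record that can then be turned into a GAF image and filed under its label.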

Description of the Code

Sourcing the Data

The datasets were created from continuous contract S&P 500 data from Quandl.com. Once the data was in a csv file, I was ready to produce the GAFs.

Building the GAFs

I leveraged an implementation of Wang and Oates’s GAF models in pyts, a Python library for time series classification. This implementation requires the time series to be in the form of a NumPy array. The main steps are:

  1. Convert the csv file data to a NumPy array using the handy genfromtxt function
  2. Build a simple function to scroll through the entire data set, pulling out each record in the time series
  3. Apply the pyts implementation of GAFs to create the GAF data for each time series record

import numpy as np
from numpy import genfromtxt
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid
from pyts.image import GramianAngularField

def GAF(file):
    # Each row of the csv file is one time series record
    data = genfromtxt(file, delimiter=',')
    counter = 0  # running index, used to number the saved pictures
    for row in data:
        counter += 1
        # pyts expects a 2-D array of shape (n_samples, n_timestamps)
        X = np.reshape(row, (1, -1))

        # Transform the time series into Gramian Angular Fields
        gasf = GramianAngularField(image_size=21, method='summation')
        X_gasf = gasf.fit_transform(X)
        gadf = GramianAngularField(image_size=21, method='difference')
        X_gadf = gadf.fit_transform(X)

Once the above was done, I used matplotlib to number and save the pictures while also making sure to remove any titles or axis formatting that would influence the CNN model.
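For the saving step, a minimal sketch along these lines would work. It is not the exact code from my repo: the function name `save_gaf_image`, the `rainbow` colormap, the figure size, and the stand-in data are all assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def save_gaf_image(gaf, out_path, cmap="rainbow"):
    """Save a 2-D GAF array as a bare picture: no axes, ticks, or title."""
    fig = plt.figure(figsize=(2.24, 2.24), dpi=100)  # roughly 224 x 224 pixels
    ax = fig.add_axes([0, 0, 1, 1])                  # axes fill the whole figure
    ax.imshow(gaf, cmap=cmap, vmin=-1, vmax=1)       # fixed color scale for all images
    ax.axis("off")                                   # hide anything the CNN could latch onto
    fig.savefig(out_path)
    plt.close(fig)                                   # free memory when saving many images

# Stand-in for one 21 x 21 GAF; real values would come from the pyts transform
angles = np.linspace(0, np.pi, 21)
demo = np.cos(np.add.outer(angles, angles))

out_dir = tempfile.mkdtemp()
counter = 1                                          # picture number
out_file = os.path.join(out_dir, f"{counter}.png")
save_gaf_image(demo, out_file)
```

Fixing `vmin` and `vmax` keeps the same value mapped to the same color in every picture, which seems like a sensible default when the images are compared by a model.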

Running the Convolutional Neural Network

With the image dataset now built, running the CNN is straightforward in the fast.ai library, with a few lines of code workable for many models.

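from fastai.vision.all import *

# Folder holding 'train' and 'valid' subfolders, each with one subfolder per class
path = Path("/home/jupyter/fastbook/sameer/gaf")
# Build the data loaders: batches of 4 images, each resized to 224 x 224
dls = ImageDataLoaders.from_folder(path, train='train', valid='valid', bs=4, item_tfms=Resize(224))
# Pretrained ResNet-34 (newer fastai releases name this function vision_learner)
learn = cnn_learner(dls, resnet34, metrics=error_rate)
# By default: one epoch with the pretrained body frozen, then 2 epochs unfrozen
learn.fine_tune(2)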

Any opinions or forecasts contained herein reflect the personal and subjective judgments and assumptions of the author only. There can be no assurance that developments will transpire as forecasted and actual results will be different. The accuracy of data is not guaranteed but represents the author’s best judgment and can be derived from a variety of sources. The information is subject to change at any time without notice.