# A Data-Driven Fed Wouldn't Cut Rates

## TL;DR on Fed Rate Cuts

The Federal Open Market Committee (FOMC) announces its rate decision on Wednesday. A robust review of economic data does not support a rate cut. I built both a neural network and a random forest model, trained on 72 economic indicators (monthly data going back to 1992), to predict Fed rate decisions. The models, which learn relationships only from the data, suggest the Fed should hold rates. Furthermore, as discussed in my last post, NLP sentiment analysis suggests a mixed view on the part of voting FOMC members.

Yet we know that Fed Funds futures markets are pricing in a 100 percent probability of at least a 25 basis point cut. Conclusion? A rate cut would call into question the Fed’s data-driven bias. Some have already implied that the Fed is cherry-picking the data. A cut would also open the Fed up to more criticism of its communication strategy and its willingness to succumb to politics and the bond market. Finally, it would further the concern that low rates drive inequality. You would be correct in feeling like this is a no-win situation.

Given the mixed conclusions of both the economic and sentiment models, assuming 100 percent certainty of a cut feels like a high-risk, crowded trade. Markets have a way of inflicting maximum pain on the greatest number of participants. Consider the possibility that the Fed holds rates while offering strengthened language about its vigilance in following the data.

Let’s dig into the facts.

## Reviewing Core Economic Data

A yield curve inversion has markets concerned and the widely followed NY Fed Recession Probability Indicator is flashing possible recession. A variety of non-US economic data such as Purchasing Managers’ Index (PMI) are softening. However, core US economic data paints a picture of relative strength.

Let’s start with **GDP**: the latest data, as well as 2018 revisions, was released last week. The economy is still growing about as well as it did during the Obama administration.

How about **Initial Jobless Claims**? Nothing too concerning here.

What about **Industrial Production**? Right at highs.

And the **ISM Manufacturing Index**? Down from recent highs, but well within the range of activity post-2008.

Is the consumer struggling? Let’s look at **Consumer Confidence** and **Retail Sales**. Turns out, the consumer is fine.

Certainly, **Inflation** must be plummeting? Actually, it’s right near the Fed’s 2% target and within the last decade’s range. Sidebar: for a discussion on why lower inflation might be ok, read this post.

Let’s check out bond markets. The **Yield Curve** has inverted, and according to the San Francisco Fed, every recession of the past 60 years has been preceded by an inversion.

This spread drives the widely followed **NY Fed Recession Indicator**, which has moved into the historical zone of concern.

Finally, international data. The latest **PMI** readings for G20 countries show only seven countries above 50, and almost every country trending down.

## Predictive Model

I built a fully-connected neural network using base architecture from PyTorch and Fast.ai libraries to predict the Fed’s rate action (Raise, Hold, Lower). Given the small dataset, neural networks felt like a brute force technique so I also built a Random Forest model using the same dataset (excluding the 8 variables with missing data). Results were as follows:

| Model | Accuracy |
| --- | --- |
| Neural Network | 87.1% |
| Random Forest | 80.7% |

Using June data, both models predicted that the Fed would hold rates steady. Full code for the model can be found here.

### Training Data

The training set was monthly data on 72 key economic indicators going back to 1992 (8 of the series start later). The dependent variable was the Fed rate decision: Raise, Hold, or Lower.

### Model Architecture

The base fast.ai architecture for tabular neural networks works for many use cases. Given the possibility of overfitting with a small data set, I adjusted the architecture to 400 x 100 layers, from the 200 x 100 default.

The model created embedding matrices (initially filled with randomly generated numbers) for the one categorical variable (date) and for the eight variables that had missing data. An embedding matrix converts non-numerical data into numbers suitable for model training. These numbers become the activations that are fed into the model. It is a powerful concept because previously trained (learned) embeddings can be reused in new models.
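To make the embedding idea concrete, here is a minimal sketch using PyTorch's `nn.Embedding`. The sizes (12 categories, 4 numbers per category) and the example codes are hypothetical and are not taken from the model described here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 12 distinct categories, each mapped to a vector of 4 numbers.
n_categories, emb_dim = 12, 4
embedding = nn.Embedding(n_categories, emb_dim)  # rows start as random numbers

# A batch of three integer category codes (for example, three encoded dates).
codes = torch.tensor([0, 5, 5])
vectors = embedding(codes)  # one row of the matrix per code, shape (3, 4)

# The same code always looks up the same row of the matrix.
same_row = torch.equal(vectors[1], vectors[2])

# The rows are trainable parameters, updated during training like any other weight.
trainable = embedding.weight.requires_grad
```

Because the matrix is an ordinary trainable parameter, its learned rows can be saved and loaded into another model, which is what makes reuse possible.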

The model takes these categorical variable embedding matrices and applies Dropout to them. The default setting in the model was for no Dropout, which I adjusted to 0.5. Dropout is a form of regularization, a family of techniques for avoiding over-fitting. Some regularizers work by adding a penalty to the loss function; Dropout instead ignores a random fraction (that you set) of the activations fed to the model during training, and thus their corresponding weights. This works quite well because the model cannot rely on any single activation, so it does not learn interdependent sets of weights.
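As a rough illustration of what a Dropout setting of 0.5 does, the sketch below applies PyTorch's `nn.Dropout` to a vector of ones. The vector length of 1000 is an arbitrary choice for the demonstration, not a detail of the model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)        # drop each activation with probability 0.5
activations = torch.ones(1000)  # hypothetical activations, all equal to 1

# Training mode: roughly half the activations are zeroed at random,
# and the survivors are scaled by 1 / (1 - p) to keep the expected sum unchanged.
drop.train()
train_out = drop(activations)
zero_fraction = (train_out == 0).float().mean().item()
kept_value = train_out.max().item()

# Evaluation mode: Dropout is switched off and the input passes through untouched.
drop.eval()
eval_out = drop(activations)
```

Here `zero_fraction` should land close to 0.5, and every surviving activation is doubled. At prediction time nothing is dropped.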

Next, BatchNorm is applied to the 72 continuous variables. This is a form of normalization, a scaling technique that is commonly applied to neural networks to avoid model instability from inputs with a large range. A very large input can cascade through layers, creating gradients (derivative of your loss function with respect to your weights) that are imbalanced, making optimization difficult.

“Normalization is a process where, given a set of data, you subtract from each element the mean value for the dataset and divide it by the dataset’s standard deviation. By doing so, we put the input values onto the same “scale”, in that all the values have been normalized into units of “# of standard deviations from the mean”. BatchNorm extends this idea with two additions. After normalizing the activation layer, let’s multiply the outputs by an arbitrarily set parameter, and add to that value an additional parameter, therefore setting a new standard deviation and mean. Make all four of these values (the mean and standard deviation for normalization, and the two new parameters to set arbitrary mean and standard deviations after the normalization) trainable.”

Source: fast.ai
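The normalization step in the quote can be checked by hand. This sketch compares a manual "subtract the mean, divide by the standard deviation" calculation with PyTorch's `nn.BatchNorm1d` on a made-up batch of two features with very different scales. The data and sizes are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: 64 rows, 2 features. The second feature is on a much larger scale.
x = torch.randn(64, 2) * torch.tensor([1.0, 100.0]) + torch.tensor([0.0, 50.0])

# Manual normalization: units become "number of standard deviations from the mean".
mean = x.mean(dim=0)
std = x.std(dim=0, unbiased=False)
manual = (x - mean) / std

# BatchNorm1d does the same per-feature normalization in training mode, then applies
# a learnable scale (`weight`) and shift (`bias`), which start at 1 and 0.
bn = nn.BatchNorm1d(2)
bn.train()
out = bn(x)

max_gap = (out - manual).abs().max().item()  # tiny: only a small epsilon differs
```

After normalization both features have a mean near 0 and a standard deviation near 1, so neither dominates the gradients simply because of its units.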

The results at this point are concatenated into one set of inputs, and the core architecture follows. The first block is a Linear layer of size 131×400, followed by the most popular non-linear function in deep learning, ReLU, or Rectified Linear Unit (which replaces all values less than zero with zero). BatchNorm is once again applied. Another block follows with a Linear layer, this time of size 400×100 (taking the 400 outputs of the previous block as its input), followed by ReLU and BatchNorm. This culminates in a final Linear layer of 100×3. Three is the logical output size, since we asked our model to train on three outcomes: Raise, Hold, or Lower.
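The layer sizes above can be written out as a plain PyTorch stack. This is a minimal sketch of the core blocks only, not the fast.ai model itself: it assumes the embeddings and continuous variables have already been concatenated into a 131-wide input, uses random numbers in place of real data, and adds a softmax (not mentioned above) to turn the three outputs into probabilities. The label order is hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Core blocks as described: Linear -> ReLU -> BatchNorm, twice, then a final Linear.
model = nn.Sequential(
    nn.Linear(131, 400), nn.ReLU(), nn.BatchNorm1d(400),
    nn.Linear(400, 100), nn.ReLU(), nn.BatchNorm1d(100),
    nn.Linear(100, 3),  # one output per outcome
)

labels = ["Raise", "Hold", "Lower"]  # hypothetical ordering of the three classes

# Stand-in for 16 months of already-concatenated inputs (random, not real data).
batch = torch.randn(16, 131)

model.eval()
with torch.no_grad():
    logits = model(batch)                  # shape (16, 3)
    probs = torch.softmax(logits, dim=1)   # each row sums to 1
    predicted = [labels[i] for i in probs.argmax(dim=1).tolist()]
```

Since the weights here are untrained, the predictions are meaningless. The point is only to show how a 131-wide input is reduced to 400, then 100, then 3 values.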

### Alternative Model – Random Forest

For comparison, given the small dataset, I built a Random Forest model using the same dataset, excluding the 8 variables with missing data. I used the base RandomForestClassifier module from the Scikit-learn library. The random forest classifier consists of a large number of individual decision trees that operate as an ensemble. Each tree produces a prediction, and the class that receives the most votes becomes the full model’s prediction. An ensemble of many relatively uncorrelated models usually outperforms any of its individual members. This concept is visualized here.
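A minimal sketch of this setup with scikit-learn is shown below. The data is synthetic: 300 rows of random numbers standing in for monthly observations, 64 columns (72 indicators minus the 8 with missing data), and randomly assigned labels. The number of trees and the random seeds are assumptions as well, so the output says nothing about the Fed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the real dataset (random values, random labels).
X = rng.normal(size=(300, 64))
y = rng.choice(["Raise", "Hold", "Lower"], size=300)

# An ensemble of 100 decision trees, each fit on a bootstrap sample of the rows.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Predict the decision for the most recent row and inspect the ensemble's split.
latest = X[-1:]
prediction = forest.predict(latest)[0]
class_share = forest.predict_proba(latest)[0]  # ordered as in forest.classes_
```

One implementation detail: scikit-learn's forest averages each tree's class probabilities rather than counting one hard vote per tree, which usually gives the same answer as majority voting.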

### Revisiting the Sentiment Model

As discussed in my last post, NLP sentiment analysis suggests a mixed view toward rate cuts amongst FOMC voters. In fact, the most recent June statement text suggested no rate cut, although certain subsequent speeches and minutes released in July suggested otherwise.

*Any opinions or forecasts contained herein reflect the personal and subjective judgments and assumptions of the author only. There can be no assurance that developments will transpire as forecasted and actual results will be different. The accuracy of data is not guaranteed but represents the author’s best judgment and can be derived from a variety of sources. The information is subject to change at any time without notice.*