Deep learning in Macroeconomics — Treasury Bonds

How will US Treasury rates move over the coming year? The next financial quarter? What about next month? These questions play an essential role in the decision making of both financial market investors and policymakers. Investors search for higher investment returns, estimate longer-run returns, and model risk premia. Policymakers attempt to predict future rates to help set appropriate monetary and fiscal measures in order to maintain a healthy market and macroeconomy.

In this article, I compare the forecasting performance of a Convolutional-LSTM Neural Network to the aggregate forecast performance of the Philadelphia Federal Reserve’s Survey of Professional Forecasters. The model and approach are similar to those used in my previous article on predicting US inflation rates. Historically, such survey-aggregated approaches to forecasting have been used to improve performance by combining the models and predictions of many economists. A similar approach is used by the Blue Chip Economic Indicators forecast as well as the Dow Jones/Wall Street Journal Economic Forecast survey, to name a few. I find that through the use of a neural network prediction algorithm, forecast performance can be improved over all time horizons tested.

Treasury yields are not only important in signaling the state of the stock market and general economy but are also a driver of many other interest rates and security pricing.

The 10-year Treasury Bond is, in effect, a share of US debt. By purchasing a bond you make a small loan to the US federal government. The 10-year bond is one that matures ten years after its issuance by the US Department of the Treasury. These notes are auctioned by the US Treasury, allowing their price to be, in part, determined by demand.

These bonds are typically viewed as a risk-free debt instrument. That is, they determine the rate of return for debt with no risk of default. This is because all Treasury bonds are backed by the guarantee of the US government. Relative to many countries, there is very little perceived risk of the US defaulting on its debt.

This perception of Treasury bonds as a risk-free investment is part of what drives their importance in understanding economic perceptions and what drives their influence over other debt instruments.

When the economy is performing well and perceptions of future performance are high, investors will look for the highest rate of return on their investment and demand for Treasury securities will diminish. In this type of expansionary period of the business cycle, there are many other investment instruments which will yield higher returns than what can be achieved from a Treasury bond. As a result, demand declines and purchasers are only willing to pay below face value for the bond. This drives the yield higher as the market rebalances to compete with other investment instruments.

The opposite is true in times of economic contraction or when there is a perceived risk of recession. Investors shift their assets away from instruments perceived as higher risk in search of a safe and stable investment, like a Treasury bond. This high demand drives up the price of bonds and reduces the rate of return. Investors are willing to accept this lower return in exchange for the knowledge that their investment is safe. This is why in times of expansion we see Treasury rates rise, and in times leading up to contractions we can see rates fall.

In contractionary periods, this decrease in the risk-free rate leads bank lending rates and other interest rates to fall as well in order to compete in the marketplace. Less safe investments, such as mortgages, must reduce rates in order to draw in investors. This provides an added liquidity boost to the market. Lower mortgage and loan rates drive up borrowing by making it cheaper to buy a home and take on debt.

By better understanding the future movements of bond rates, individuals, policymakers, and market participants can improve their decision making. Investors can achieve higher returns and act before the market moves rather than reacting to it. Similarly, policymakers can make decisions about monetary policy and liquidity before the economy falls into crisis. Perhaps most importantly, by predicting movements in bond rates, we can better understand the broader sentiment about the economy. Large shifts in rates over a period can indicate changes in perceptions about risk and serve as an indicator for recession.

For data, I use the GS10 series from the FRED-MD database. FRED-MD is a database maintained by the Federal Reserve Bank of St. Louis which was “designed for the empirical analysis of ‘big data’.” The data is updated in real time through the FRED database.

The GS10 variable represents the constant-maturity yield at the end of the month for a 10-year Treasury Bond. This rate is calculated based on information provided by the US Department of the Treasury. The series is also available directly from FRED.
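For readers who want to pull the series themselves, here is a minimal sketch using pandas-datareader (an assumption on my part; the analysis in this article reads the series from the FRED-MD monthly file rather than calling FRED directly):

```python
# Minimal sketch: download the GS10 series directly from FRED.
# Assumes the pandas-datareader package is installed; the analysis in
# this article instead loads GS10 from the FRED-MD monthly CSV.
import pandas_datareader.data as web

gs10 = web.DataReader("GS10", "fred", start="1959-01-01", end="2019-09-30")
print(gs10.tail())  # monthly 10-year constant maturity yields, in percent
```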

For this analysis, I forecast the bond rate in each month. As bonds are traded daily, this corresponds to the rate calculated at the end of each month. The series is first differenced to induce stationarity and then structured into a tensor of rolling 24-month windows with a single feature. While there is debate over whether data must be stationary before being used in a non-linear forecasting model, in this case I found results improved after differencing. The rolling-average input is similarly structured: after differencing the series, the moving averages are calculated and then arranged into a tensor of rolling 24-month windows with each of the three moving averages serving as a feature.
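As a rough sketch of this preprocessing step (variable names are illustrative, and `gs10` is assumed to be the monthly series loaded above):

```python
import numpy as np

WINDOW = 24    # length of each rolling input window, in months
HORIZON = 12   # number of future months predicted from each window

# First-difference the monthly series to induce stationarity.
diff = gs10.squeeze().diff().dropna().values

# Build rolling 24-month input windows and, for each, the 12 future
# differenced values the model is trained to predict.
X, y = [], []
for t in range(WINDOW, len(diff) - HORIZON + 1):
    X.append(diff[t - WINDOW:t])
    y.append(diff[t:t + HORIZON])

X = np.array(X)[..., np.newaxis]   # shape (samples, 24, 1): single feature
y = np.array(y)                    # shape (samples, 12)
```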

I developed a Convolutional-LSTM Neural Network (CNN-LSTM) to predict bond rates at the end of each of the next twelve months.

Convolutional Neural Networks are a class of deep learning models originally designed for the classification of images. The network takes an image, passes it through a set of filters that apply weights to different aspects of the image, and ultimately produces a prediction. This works as a feature-engineering system whereby, over time, the network “learns” which filtered aspects are most important in classifying an image.

A similar method can be applied to time series. Although a time series does not have “physical” features in the same way an image does, time series data does contain time dimensional features. If we think of our time series data like an image we can think of the convolutional network like a spotlight, or window, which scans across the time period illuminating the shape of our series in that period and then filtering it to find the feature shape that it most represents. A simple illustration of this model is below:
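In code, that “sliding spotlight” is a one-dimensional convolution moving across the 24-month window. A minimal sketch (the filter count and kernel size here are arbitrary illustrative choices, not the article’s exact settings):

```python
from tensorflow import keras
from tensorflow.keras import layers

# One Conv1D layer: 32 learned filters, each a 3-month window of weights
# that slides across the 24-step sequence of differenced yields.
inputs = keras.Input(shape=(24, 1))
features = layers.Conv1D(filters=32, kernel_size=3, activation="relu")(inputs)
print(keras.Model(inputs, features).output_shape)  # (None, 22, 32)
```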

This convolutional model can be extended with a long short-term memory (LSTM) layer set in order to better learn which of these historic time-dimensional features impact rates, and when. An LSTM is a recurrent neural network (RNN) architecture used in deep learning. Unlike feed-forward neural networks, an LSTM network has feedback connections. These feedback connections allow the network to learn what past information is important and to forget what is not.

The cell is made up of a few gate functions which determine whether new information is important to the prediction problem and whether old information remains relevant. This memory is referred to as the cell state and can retain all previously learned relevant information for the full processing of the time series sequence. This allows information learned much earlier in the sequence to be maintained through to the end of processing.

As information is processed through the LSTM it passes through a series of gates which determine whether the information is maintained, updated, or forgotten entirely. This is the benefit of the LSTM architecture over other RNN structures. The LSTM is able to carry information from earlier in the processing through to the end, whereas other RNN networks simply update their understanding with each additional input in the sequence. This makes the LSTM network very powerful for the analysis of sequence data like time series.
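For reference, the standard LSTM gate equations (the generic formulation, not anything specific to this article’s model) make the keep/update/forget mechanics explicit: the forget gate scales the old cell state, the input gate admits new candidate information, and the output gate controls what is exposed as the hidden state.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
```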

The CNN-LSTM network that I utilize for this analysis is diagrammed below. First, we begin with two inputs: the raw time series and three moving-average smoothed series. The smoothed series represent the prior three-month, six-month, and one-year moving averages for any given observation.
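A short sketch of how these smoothed inputs can be built (again illustrative; `diff` is the differenced series from the preprocessing sketch above, and the exact implementation in the repository may differ):

```python
import pandas as pd

# Three moving-average features computed on the differenced series:
# trailing 3-month, 6-month, and 12-month means.
diff_series = pd.Series(diff)
ma_features = pd.concat(
    {
        "ma3": diff_series.rolling(3).mean(),
        "ma6": diff_series.rolling(6).mean(),
        "ma12": diff_series.rolling(12).mean(),
    },
    axis=1,
).dropna()
# These three columns are then windowed into (samples, 24, 3) tensors,
# mirroring how the raw series is structured above.
```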

These inputs are then fed into separate convolutional layers to extract the relatively important feature weights of each input series. The results are then merged and passed to a series of LSTM layers, and finally to a stack of fully connected blocks. Each subsequent block in the stack contains fewer nodes than the previous one. Between each block, a residual, or skip, connection is used, allowing the model to use the information learned in earlier layers while continuing to train the later ones. This prevents the information output from the convolutional layers from being too quickly lost or obscured by subsequent layers of the model. It also helps prevent the vanishing gradient problem and allows some smaller details from the original series to be maintained deeper in the model. Finally, dropout is applied to the final layer before outputting the predictions.
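The sketch below outlines one way to express this architecture with the Keras functional API. The layer widths, number of blocks, and dropout rate are illustrative placeholders rather than the exact configuration in my repository:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two inputs: the raw differenced series and the three moving averages.
raw_in = keras.Input(shape=(24, 1), name="raw_series")
ma_in = keras.Input(shape=(24, 3), name="moving_averages")

# Separate convolutional branches extract features from each input.
raw_feat = layers.Conv1D(32, 3, activation="relu")(raw_in)
ma_feat = layers.Conv1D(32, 3, activation="relu")(ma_in)

# Merge the branches and pass them through stacked LSTM layers.
merged = layers.concatenate([raw_feat, ma_feat])
x = layers.LSTM(64, return_sequences=True)(merged)
x = layers.LSTM(32)(x)

# Fully connected blocks of decreasing width, joined by skip connections.
block1 = layers.Dense(32, activation="relu")(x)
skip1 = layers.concatenate([x, block1])        # skip connection
block2 = layers.Dense(16, activation="relu")(skip1)
skip2 = layers.concatenate([block1, block2])   # skip connection

# Dropout before the final 12-month output.
out = layers.Dropout(0.2)(skip2)
out = layers.Dense(12, name="next_12_months")(out)

model = keras.Model([raw_in, ma_in], out)
model.compile(optimizer="adam", loss="mse")
```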

After differencing the data, I split the dataset into a training and test set (a 70/30 split). The model is trained on all months up to June 2001 and predictions are then made for July 2001 through September 2019. I use a one-step rolling prediction: for every month in the test set, the bond rate corresponding to the end of each of the next twelve months is predicted, and then the observed bond rate is used to predict the next set of twelve months. Predictions are made in a multivariate fashion, predicting each of the subsequent twelve months simultaneously, and validated on every tenth observation in the series.

This is a fairly realistic approach: in forecasting the next period of bond rates, an analyst would have access to all prior observed rates. By using a one-step prediction we maintain this growing stock of modeling information. Because the LSTM is stateful, the model weights are adjusted with each new forecast iteration. This allows the model to pick up any structural changes in the series that may occur over time. The error rates for the model are presented in annualized percentage points.
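A simplified sketch of this rolling evaluation loop is below. The names `X_raw`, `X_ma`, `y`, and `split_index` are placeholders for the windowed tensors and the June 2001 split point; the actual code additionally keeps the LSTM stateful and validates on every tenth observation.

```python
import numpy as np

# Rolling one-step evaluation: for each test month, predict the next 12
# months, record the error, then fold the newly observed month into the
# training data before moving on (model weights carry over between fits).
preds, actuals = [], []
for t in range(split_index, len(y)):
    x_t = [X_raw[t:t + 1], X_ma[t:t + 1]]           # inputs known at time t
    preds.append(model.predict(x_t, verbose=0)[0])  # 12-month-ahead forecast
    actuals.append(y[t])
    model.fit([X_raw[:t + 1], X_ma[:t + 1]], y[:t + 1],
              epochs=1, batch_size=32, verbose=0)

abs_errors = np.abs(np.array(preds) - np.array(actuals))  # per-horizon errors
```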

As a benchmark for the model, I present the error rates over the test period of the Survey of Professional Forecasters (SPF) and the Federal Reserve Bank of Philadelphia’s DARM model (their highest-performing benchmark for the SPF). The SPF is the longest-running forecasting survey of macroeconomic indicators in the United States. The error rates for these benchmarks are calculated from the SPF’s official error rate documentation.

Below is a summary of the model results:

We can see that the multivariate convolutional-LSTM model far outperforms the Direct Autoregressive Model and the survey-aggregated forecasts. While there are some differences in how these forecasts are run, and thus how their errors behave, the convolutional model outperforms these more traditional methods at every time horizon. The SPF forecasts are made quarterly, in the middle of each quarter, for the end of the current quarter and subsequent quarters rather than on a monthly basis. They can still be compared by treating each quarterly forecast as a three-month-ahead forecast and the current-quarter forecast as approximately a one-month-ahead forecast.

Below, we can see the model’s performance across the full time series. In general, we can see the model fits the data well with some slightly higher error over periods of greater variance or when the trend changes quickly. This is particularly true at the highest historic rate values during the 1980s.

If we look more closely at only the test period, we can see that the model nearly exactly matches the actual bond rate. Some of this is due to the model gaining additional information at each subsequent forecast. Treasury rates have been declining steadily since the peak in 1984. Recent economic theory has suggested that bond rates may have changed structurally to a lower steady state. This is particularly interesting when considering the Treasury yield curve. The model, however, seems to have had little trouble adjusting to this change in structure.

While there is a clear improvement over the benchmarks in terms of error, another key benefit can be seen in the model’s ability to forecast without lag. Typical econometric model forecasts often show a lag in their predictions and have difficulty quickly adapting to changes in trend. Looking at the additional benchmark forecasts below, we can see that, particularly for medium- and longer-run forecasts, there is a clear lag in the forecasted rate. We can also see what I would call model stickiness in the benchmark forecasts: a longer-run forecast tends to overestimate the magnitude of a shift in the series trend and is slow to adapt to a reversal in direction. This is most obvious in the DARM T+12 forecasts during the 1980s, where the model is slow to reverse direction following the peak.

This provides strong evidence for the use of neural networks in forecasting bond rates. This neural network structure allows for vastly improved forecasts and reacts faster to changes in the trend. The forecasting methodology may also perform better because it forecasts all twelve future periods simultaneously. Unlike benchmark models, which typically produce forecasts recursively (i.e., forecast T+2 is based on forecast T+1), the neural network forecasts all twelve periods using only information available at time T. This essentially allows the model to optimize its weighting as though twelve separate models were used.

This analysis used only a single predictor variable, the history of the bond rate itself, in order to predict future rates. It is possible that adding additional predictors would improve model performance. It is also possible that different smoothing methods, such as exponential smoothing, may allow for improved forecasts. Data on bond rates is also available as a daily series, which could make for an interesting extension of the forecasting challenge and provide additional data.

My code for this analysis is available on my GitHub. I will continue to update the code and refine the model so results may change slightly.

