Analysis of Consumer Sentiment Using Hidden Markov Model Regimes

Consumer sentiment plays a critical role in the growth of the economy, as the average consumer's opinion on the state of the economy can directly manifest in changes in spending, saving, investment, and the flow of capital. Understanding this sentiment is vital to understanding the economy, as consumption accounts for over two-thirds of GDP. When sentiment improves, people are more likely to spend and invest, boosting economic activity. When it falls, they tend to save more and cut back on purchases, slowing growth.

However, this value is also noisy and is itself a response to economic conditions, so deeper relationships may not be fully captured by traditional time series analysis. This analysis takes an alternate approach to identifying shifts in sentiment: it applies a Hidden Markov Model (HMM) to categorize consumer sentiment into specific states, which can then be interpreted using macroeconomic analysis.

Unlike linear models, which assume constant relationships between variables or independence between them, HMMs are well suited to time series whose behavior changes from one regime to another. From the data, the model infers unobserved states, each with its own probability of transitioning into another state. By training the HMM on macroeconomic indicators, we can analyze which variables play the greatest role in shaping sentiment, estimate the likelihood of transitions between sentiment states, and anticipate future shifts.

Once a month, the University of Michigan surveys around 800 citizens about their outlook on current and future economic conditions. The questions investigate how consumers feel about their personal finances and the direction of the economy, including questions like:

How are your current personal finances compared to a year ago?
Do you think business conditions in the country are good or bad at present?
Do you expect your personal finances to improve or worsen over the next year?
Do you expect business conditions to improve or worsen in the next year?
Over the next 5 years, do you think the economy will have good times or bad times?

Responses are categorized as positive, neutral, or negative. For each question, the net value (percent positive minus percent negative) is calculated and used to compute the index:

$$ \text{Index}_{i} = \frac{\text{Current period score}}{\text{Base period score}} \times 100 $$

The calculated value shows how consumers feel about the overall direction of the economy and their finances, relative to how consumers felt in 1966. Values above 100 indicate sentiment stronger than in the base period, while values below 100 indicate weaker sentiment. The index is also interpretable when comparing two separate periods. For example, if period 1 had a score of 85 and period 2 had a score of 87, consumer sentiment improved between the two periods.
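As a rough illustration of the calculation described above, the sketch below computes a net score per question and a ratio-based index scaled so that the base period equals 100. The response percentages are hypothetical, and summing the five net scores before taking the ratio is an assumption; the published index may aggregate and scale its components differently.

```python
def net_score(pct_positive, pct_negative):
    """Net value for one question: percent positive minus percent negative."""
    return pct_positive - pct_negative


def sentiment_index(current_nets, base_nets):
    """Ratio of current to base-period scores, scaled so the base period is 100.

    Assumption: the period score is the sum of the per-question net values.
    """
    return 100.0 * sum(current_nets) / sum(base_nets)


# Hypothetical (percent positive, percent negative) pairs for the five questions.
base_period = [(50, 20), (55, 25), (45, 15), (50, 30), (40, 20)]
current_period = [(45, 25), (50, 30), (40, 20), (45, 35), (38, 24)]

base_nets = [net_score(p, n) for p, n in base_period]
current_nets = [net_score(p, n) for p, n in current_period]

index = sentiment_index(current_nets, base_nets)
print(round(index, 1))  # below 100: weaker sentiment than the base period
```

With these made-up inputs the index falls below 100, which the text above would read as sentiment weaker than in the base period.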

For this analysis, a set of 17 macroeconomic indicators was collected from FRED (Federal Reserve Economic Data), the database maintained by the Federal Reserve Bank of St. Louis and a widely respected source of accurate, up-to-date economic data. Data from 1987 to 2024 was collected for modeling and analysis, balancing the length of history available to learn from against the quality of the data and ensuring that many economic sectors were well represented. The selection of indicators is critical because they serve as the observed variables in the HMM, providing the data used to train the model and identify complex, non-linear relationships.

The indicators are grouped into the following categories:

Output: Measures of economic productivity which reflect the overall health of the economy.
Labor Market: Indicators measuring employment, which directly impact people's financial security and future outlook.
Price Levels: Inflation metrics which greatly affect purchasing power and consumer confidence.
Monetary and Fiscal: Interest rates and government spending, which shape the financial environment and economic policy.
Housing and Construction: Indicators such as housing starts and building permits, which serve as leading indicators of economic cycles and household confidence.

Detailed information about each category and its relationship to consumer sentiment is in Economic Indicators.

Economic Indicators


The Hidden Markov Model

A Hidden Markov Model (HMM) is a statistical tool used to understand systems where you can't directly observe the state of the system, but you can observe evidence or signals that are influenced by that state.

Think of it like trying to guess the weather somewhere on another planet. You can't see the weather from where you are, but if you are told that the humidity is low, there is a high-pressure system, and wind speeds are low, you can make a good guess that the weather is calm at the moment.

There are two layers to this: the unobservable hidden states (the weather type) and the observations we can see (the meteorological indicators). The model assumes that the sequence of hidden states follows a specific type of random process called a Markov chain, where the next state depends only on the current state.

The primary goal of a Hidden Markov Model is to infer the most likely sequence of hidden states given a sequence of observations. For example, if you observe a pattern of {Arid, Arid, Humid} over three days, the HMM can calculate the most probable weather sequence, such as {Sunny, Sunny, Rainy}.
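The decoding step described above is commonly done with the Viterbi algorithm. The sketch below is a minimal version for the two-state weather example; every probability in it is hypothetical and chosen only for illustration, not taken from the analysis.

```python
import math

states = ["Sunny", "Rainy"]

# All probabilities below are hypothetical illustration values.
start_p = {"Sunny": 0.6, "Rainy": 0.4}
trans_p = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.6, "Rainy": 0.4},
}
emit_p = {
    "Sunny": {"Arid": 0.8, "Humid": 0.2},
    "Rainy": {"Arid": 0.1, "Humid": 0.9},
}


def viterbi(observations):
    """Return the most probable hidden-state path for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of s given the scores at the previous step
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most probable final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return list(reversed(path))


decoded = viterbi(["Arid", "Arid", "Humid"])
print(decoded)  # ['Sunny', 'Sunny', 'Rainy']
```

Working in log-probabilities avoids numerical underflow on long sequences without changing which path is selected.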

The model trains by analyzing a large set of historical observations. At this stage, its central task is to determine the values of its internal probabilities (the initial state, transition, and emission probabilities) that best explain the given data. To measure what 'best' means, a statistical score called the log-likelihood is used, which numerically measures how well the model explains the data.

To calculate the transition and emission probabilities directly, we would need to know the sequence of hidden states. However, that sequence is unknown. To solve this, the HMM uses an iterative method called the Expectation-Maximization (EM) algorithm. The algorithm starts with an initial guess for the probabilities and then repeatedly refines them through a two-step cycle: an expectation step that estimates how likely each hidden state is at each point in time under the current probabilities, and a maximization step that re-estimates the probabilities from those estimates. Each cycle is guaranteed never to decrease the log-likelihood.

This cycle of expectation and maximization repeats until the model converges, that is, until the improvements in the log-likelihood score become negligibly small. At this point, the algorithm has found a stable set of probabilities that describes the hidden dynamics within the data. Because EM can settle on a local rather than global optimum, the result can depend on the initial guess.
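The EM cycle described above can be sketched for a small HMM with discrete observations (the Baum-Welch procedure). This is an illustrative sketch only: the observation sequence and starting probabilities are hypothetical, and the analysis itself does not state which implementation or emission type it used.

```python
import math


def forward_backward(obs, pi, A, B):
    """Scaled forward and backward passes; returns alpha, beta, scale factors."""
    n, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([v / c for v in row])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    return alpha, beta, scale


def em_step(obs, pi, A, B):
    """One EM cycle; returns updated parameters and the log-likelihood
    of the parameters that were passed in."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale = forward_backward(obs, pi, A, B)
    # Expectation: probability of being in state i at time t
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    # Maximization: re-estimate parameters from expected counts
    new_pi = gamma[0][:]
    new_A = [[sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                  / scale[t + 1] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    loglik = sum(math.log(c) for c in scale)
    return new_pi, new_A, new_B, loglik


# Hypothetical observations (0 = Arid, 1 = Humid) and initial guesses.
obs = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1]
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.6, 0.4], [0.3, 0.7]]

history = []
for _ in range(200):
    pi, A, B, loglik = em_step(obs, pi, A, B)
    history.append(loglik)
    # Stop once the improvement in log-likelihood is negligible
    if len(history) > 1 and history[-1] - history[-2] < 1e-6:
        break

print(len(history), round(history[0], 3), round(history[-1], 3))
```

The `history` list records the log-likelihood at each cycle, so it can be inspected to confirm that the score never decreases and to see how quickly the fit levels off.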

Transition Probabilities govern how the hidden states change over time. They are the probabilities of moving from one hidden state to another in the next time step.

These probabilities model the dynamics of the system. For instance, if the weather is sunny today, what is the probability that it will be cloudy tomorrow? When the states are labeled, these probabilities can be learned from historical data by counting how often each state follows another. For example, the model would count how many times a sunny day was followed by a rainy day in the dataset to calculate that specific probability. When the states are hidden, as in an HMM, the EM procedure uses expected counts in place of observed ones.

In the matrix below, each transition likelihood is stored as a value between 0 and 1. Each row represents the likelihood of moving from one state to every state in the system, including staying in the same state. Using this matrix, we can identify trends over time. For example, the value from Sunny to Rainy may be low and the value from Rainy to Sunny may be high, meaning that a Sunny day is unlikely to be followed by a Rainy one, and a Rainy day is very likely to be followed by a Sunny one.

Emission Probabilities link the hidden states to the observations. They represent the probability of seeing a particular observation given that the system is in a specific hidden state. For example, given that it is Rainy, what is the probability that the air is Arid? These are also learned from historical data. For all the times the state was sunny, the model calculates the proportion of times the observations were arid and humid.
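The counting approach described above for transition and emission probabilities can be sketched as follows. It assumes the states are labeled, which is a simplification: in an actual HMM the states are hidden and expected counts stand in for observed ones. The labeled history below is hypothetical.

```python
from collections import Counter

# Hypothetical labeled history: one (state, observation) pair per day.
days = [
    ("Sunny", "Arid"), ("Sunny", "Arid"), ("Rainy", "Humid"), ("Sunny", "Arid"),
    ("Sunny", "Humid"), ("Rainy", "Humid"), ("Rainy", "Arid"), ("Sunny", "Arid"),
]

states = [state for state, _ in days]

# Count (today, tomorrow) state pairs and (state, observation) pairs.
trans_counts = Counter(zip(states, states[1:]))
emit_counts = Counter(days)


def normalize(counts, row_key):
    """Turn the counts for one row into probabilities that sum to 1."""
    total = sum(c for (row, _), c in counts.items() if row == row_key)
    return {col: c / total for (row, col), c in counts.items() if row == row_key}


transition = {s: normalize(trans_counts, s) for s in sorted(set(states))}
emission = {s: normalize(emit_counts, s) for s in sorted(set(states))}

print(transition)
print(emission)
```

Each row of the resulting tables is a set of proportions, which is why every row sums to 1.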

In the matrix below, each entry represents the likelihood of observing a condition given a state. Each row corresponds to one state and shows the probability of every possible observation in that state. This gives a mathematically useful way of identifying trends in our observations, such as assessing how informative an indicator is.

For example, if an indicator initially seemed useful but its emission probabilities were about the same in every state, then observing that indicator says little about which state the system is in, and it offers no predictive insight into the states.

Use the transition matrix to see how the probability values change the states in the simulation. Increasing the value of a cell makes that transition more likely to occur.

(Note: All probability rows must sum to 1)

Use the emissions matrix to see how the probability values change the observations in the simulation. Increasing the value of a cell makes that observation more likely to occur.

(Note: All probability rows must sum to 1)
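For readers without access to the interactive simulation, the sketch below generates a sequence of states and observations from a transition and an emissions matrix, and rejects any row that does not sum to 1. The function name, the probabilities, and the seed are all illustrative assumptions.

```python
import random


def simulate(start_p, trans_p, emit_p, length, seed=0):
    """Sample hidden states and observations from an HMM.

    Raises ValueError if any probability row does not sum to 1.
    """
    rows = [start_p, *trans_p.values(), *emit_p.values()]
    for row in rows:
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"probability row must sum to 1: {row}")

    rng = random.Random(seed)

    def draw(dist):
        # Pick one key with probability proportional to its value
        return rng.choices(list(dist), weights=list(dist.values()))[0]

    state = draw(start_p)
    states, observations = [], []
    for _ in range(length):
        states.append(state)
        observations.append(draw(emit_p[state]))
        state = draw(trans_p[state])
    return states, observations


# Hypothetical matrices; edit the values to see how the sequences change.
start_p = {"Sunny": 0.6, "Rainy": 0.4}
trans_p = {
    "Sunny": {"Sunny": 0.8, "Rainy": 0.2},
    "Rainy": {"Sunny": 0.6, "Rainy": 0.4},
}
emit_p = {
    "Sunny": {"Arid": 0.8, "Humid": 0.2},
    "Rainy": {"Arid": 0.1, "Humid": 0.9},
}

states, observations = simulate(start_p, trans_p, emit_p, length=10)
for s, o in zip(states, observations):
    print(f"{s:5s} -> {o}")
```

Raising the Sunny-to-Rainy value (and lowering Sunny-to-Sunny to keep the row at 1) should produce more Rainy stretches in the sampled sequence.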


To improve the interpretability and numerical stability of the model, the macroeconomic indicators were first filtered. This ensured that no single economic category was overrepresented, which would inadvertently bias the model. The process involved an initial screening to find the single indicator from each category that exhibited the highest predictive power for consumer sentiment. As part of this screening, each indicator was evaluated both contemporaneously (at time $t$) and with a one-year delay (a 12-month lag, $t-12$) to reflect how information typically reaches households and markets. This helped avoid look-ahead effects and capture genuine lead-lag dynamics.
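A 12-month lag of the kind described above can be sketched as a simple shift of a monthly series. The helper name and the sample values are hypothetical; the original pipeline may have built its lagged features differently.

```python
def lag_series(values, months):
    """Shift a monthly series so position t holds the value from t - months.

    The first `months` positions have no earlier value and are set to None.
    """
    padded = [None] * months + list(values)
    return padded[:len(values)]


# Hypothetical monthly indicator values (24 months).
real_gdp = [100 + 0.5 * m for m in range(24)]

contemporaneous = lag_series(real_gdp, 0)   # value at time t
lagged_12 = lag_series(real_gdp, 12)        # value at time t - 12

# Keep only months where the lagged value exists before fitting anything.
usable = [(t, lagged_12[t]) for t in range(len(real_gdp)) if lagged_12[t] is not None]
print(usable[0])  # first usable month pairs t = 12 with the value from t = 0
```

Because the lagged column only ever looks backward, a model trained on it never sees information from after the month being explained.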

Following this preliminary filtering, a feature selection process was applied to identify the optimal combination of three indicators. Each candidate model was then rigorously tested using out-of-sample validation via a rolling cross-validation scheme, using training and testing windows to ensure the model's performance is not artificially inflated by period-specific anomalies. The final specification therefore balances signal across categories and timing, with lags included only when they improved real-time predictability.
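One way to set up the rolling training and testing windows described above is sketched below. The window sizes, step, and fixed-length training window are assumptions made for illustration; the text does not state the actual values or whether the training window expands.

```python
def rolling_windows(n_obs, train_size, test_size, step):
    """Yield (train_indices, test_indices) pairs.

    Each test window starts immediately after its training window, so the
    model is always evaluated on data later than anything it was fit on.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


# Hypothetical setup: 456 monthly observations, 20-year training windows,
# 3-year test windows, moving forward 3 years at a time.
folds = list(rolling_windows(n_obs=456, train_size=240, test_size=36, step=36))

for train, test in folds:
    print(f"train {train.start}-{train.stop - 1}, test {test.start}-{test.stop - 1}")
```

Each candidate model would be fit on the training indices of a fold and scored on its test indices, giving one out-of-sample score per fold.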

Model selection relied on two core metrics:

Mean Log-Likelihood per Observation: how well a candidate explains unseen test-set data, judged relative to a baseline model with no time-varying drivers.
Standard Deviation Across Folds: the model’s stability and robustness across different training periods.

Preference was given to models that deliver a higher average log-likelihood, indicating better predictive performance, with lower variability across folds, highlighting their ability to generalize reliably across different economic cycles. Where lags were selected, they consistently improved these out-of-sample scores.
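The preference described above can be illustrated with a small ranking sketch. The per-fold scores are hypothetical, and penalizing the mean by one standard deviation is an assumed rule for combining the two metrics; the text does not specify how they were traded off.

```python
from statistics import mean, stdev

# Hypothetical per-fold improvement in mean log-likelihood per observation
# over the baseline model (higher is better).
fold_scores = {
    "Model A": [0.42, 0.38, 0.45, 0.40],
    "Model B": [0.60, 0.05, 0.75, 0.10],
    "Model C": [0.30, 0.31, 0.29, 0.30],
}

summary = {name: (mean(scores), stdev(scores)) for name, scores in fold_scores.items()}


def stability_adjusted(name):
    """Assumed selection rule: mean score penalized by its spread across folds."""
    avg, spread = summary[name]
    return avg - spread


ranked = sorted(summary, key=stability_adjusted, reverse=True)

for name in ranked:
    avg, spread = summary[name]
    print(f"{name}: mean={avg:.3f}, std={spread:.3f}")
```

In this made-up example, the model with occasional very high scores but large swings between folds ranks last, which mirrors the stated preference for reliable generalization.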

Conclusion

Sentiment Regimes
Best Model

The Hidden Markov Model identified two distinct regimes in the Consumer Sentiment Index: a High state associated with steady growth and tailwinds, and a Low state tied to stress periods. The model’s regimes line up with economic downturns and their immediate aftermath, but it also catches transitions on the way in and out, not just the troughs. Modeling state-to-state transitions made the approach robust to time dependence in both the explanatory data and the response, allowing it to analyze relationships and trends without the usual problems caused by strongly correlated data.

The High regime clusters well above the long-run baseline, while the Low regime sits significantly below it. As is common in regime-switching models, states persist for long stretches because self-transition probabilities dominate, and switches are asymmetric. Moves into Low tend to be abrupt during stress events, while recoveries back to High are more gradual as activity firms and price pressure cools, with cascading effects on interest rates, employment, and productivity that can last for years. The inferred state often turns before official recession dates and remains informative through the early expansion, which gives a peek at where the economy is heading.

The model using lagged Real GDP (12-month lag), the PCE Price Index, and the lagged Federal Surplus/Deficit (12-month lag), referred to as Model 1, was identified as the most effective combination. It consistently delivered the strongest out-of-sample improvement over the baseline and was more consistent than the alternative models at anticipating Consumer Sentiment regime shifts in unseen historical periods. The strength of this combination lies in capturing, without redundancy, the three key economic levers influencing households:

Real Productivity (lagged Real GDP captures income/jobs momentum as consumers feel it)
Cost of Living (PCE Price Index reflects inflation/price pressure)
Public Finance (lagged Federal Surplus/Deficit proxies for policy stance and stress as people learn about it)

t-SNE maps the three selected indicators (Real GDP, PCE Price Index, and Federal Surplus/Deficit) into a 3-D space by preserving local neighborhoods. Periods that look similar across those variables are placed near each other, and dissimilar periods are pushed apart. By analyzing the clustering and separation, we can get a better understanding of the relationship between the variables. In our plot, the groups align with the model's High and Low states, indicating that the chosen indicators naturally form two regimes with limited overlap. Points between or on the fringes of clusters likely mark transition periods and lower classification confidence. This suggests that the regimes are separable and correspond to two distinct kinds of period.

Live Forecasting Page

Table Definitions

Current (Z): Where the indicator is right now (in standard deviations).
Baseline (Z): Where the indicator usually is during this economic regime.
Difference: The distance between Current and Baseline. If this is greater than 1.0, it is flagged as Inconsistent with the current regime.
Status: 'Inconsistent' means the indicator is behaving abnormally for this specific regime.
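The table definitions above can be expressed as a short sketch. The function names, the "Consistent" label for the non-flagged case, and the sample numbers are assumptions; only the 1.0 threshold and the "Inconsistent" flag come from the definitions.

```python
def z_score(value, historical_mean, historical_std):
    """Express a raw indicator value in standard deviations from its mean."""
    return (value - historical_mean) / historical_std


def regime_consistency(current_z, baseline_z, threshold=1.0):
    """Return (difference, status) for one indicator within the current regime.

    Difference is the distance between Current (Z) and Baseline (Z); it is
    flagged only when strictly greater than the threshold.
    """
    difference = abs(current_z - baseline_z)
    status = "Inconsistent" if difference > threshold else "Consistent"
    return difference, status


# Hypothetical indicator: raw value 3.4 with historical mean 2.5 and std 0.5.
current_z = z_score(3.4, historical_mean=2.5, historical_std=0.5)
baseline_z = 0.5  # hypothetical typical level of this indicator in the regime

difference, status = regime_consistency(current_z, baseline_z)
print(f"Current (Z)={current_z:.2f}  Baseline (Z)={baseline_z:.2f}  "
      f"Difference={difference:.2f}  Status={status}")
```

Here the indicator sits more than one standard deviation away from its usual level for the regime, so it would be flagged.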

Monthly Analysis Summary | Generated: January 2026

AI Disclaimer

The summary is synthesized by a large language model (LLM) using model regime probabilities, anomaly scores, and conditional indicator baselines. Verify against raw data.