Pearson’s r is the most widely used measure of correlation. In the social sciences, researchers not attuned to the nuances of statistics may be tempted to use it whenever they need to measure the association between variables, including time series. However, several issues should be considered before applying it to time series data.
I present three scenarios where the correlation coefficient can be misleading without deeper data analysis:
- Series with a common long-term trend but divergent short-term behavior.
- Series with common seasonality or cyclical fluctuations.
- Time series with regime-switching characteristics, exhibiting different behaviors at different stages.
These examples illustrate why calculating correlations between time series requires careful consideration and a thorough understanding of the underlying processes.
Case 1: Common Trend
Two series may share a common trend, for example increasing or decreasing together. Such behavior yields a very high Pearson’s correlation coefficient. For the series in this example, the correlation coefficient is 0.99, indicating an almost perfect positive correlation.

However, if we follow the series point by point, we can see opposite movements (one goes up while the other goes down) or completely unrelated behavior. In this example, if we subtract the trend from each series and recalculate the correlation coefficient, we find a value indicating an inverse rather than a positive correlation: r = -0.38.
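The detrending step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures in this post: the two series are synthetic, the trend is assumed to be linear, and the helper names `pearson_r` and `detrend_linear` are hypothetical. The numbers it prints will not match the values reported above.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r between two equally long 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def detrend_linear(y):
    """Subtract a least-squares straight line (assumed trend shape) from y."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept)

# Synthetic data (an assumption for illustration): both series share the same
# upward trend, but their short-term fluctuations move in opposite directions.
rng = np.random.default_rng(42)
n = 200
t = np.arange(n)
shared = rng.normal(size=n)
a = 0.5 * t + 3 * shared + rng.normal(scale=3, size=n)
b = 0.5 * t - 3 * shared + rng.normal(scale=3, size=n)

r_raw = pearson_r(a, b)
r_detrended = pearson_r(detrend_linear(a), detrend_linear(b))

print(f"r on raw series:       {r_raw:.2f}")
print(f"r on detrended series: {r_detrended:.2f}")
```

With data built this way, the raw coefficient is dominated by the shared trend, while the detrended coefficient reflects the opposing short-term movements. A linear fit is only one way to remove a trend; differencing or a smoother may be more appropriate when the trend is not a straight line.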

Trending series are common. For example, the correlation between the numbers of scientific publications in two completely unrelated subject areas can be high simply because global scientific output increases over time, partly due to the ‘publish or perish’ culture, without any other notable relationship between the two series.
Case 2: Common Seasonality
Two series that follow a common seasonal pattern can also yield a high correlation coefficient. As with trend, correlation due to seasonality may mask an opposite or nonexistent correlation. The series in this example show a sizeable Pearson’s correlation of 0.70.

However, when we subtract the seasonal component and recalculate the correlation coefficient, we find that the series are essentially uncorrelated (r = -0.04).
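A simple way to see this effect is to remove the average value of each position in the seasonal cycle before computing the correlation. The sketch below is illustrative only: the data are synthetic daily series with an assumed weekly cycle (period of 7), the helper `remove_seasonal_means` is a hypothetical name, and subtracting per-position means is a simplification of a full seasonal decomposition. It does not reproduce the values reported above.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r between two equally long 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def remove_seasonal_means(y, period):
    """Subtract the mean of each position in the cycle (e.g. each weekday)."""
    y = np.asarray(y, dtype=float)
    phase = np.arange(len(y)) % period
    seasonal = np.array([y[phase == p].mean() for p in range(period)])
    return y - seasonal[phase]

# Synthetic data (an assumption for illustration): two otherwise independent
# daily series that both rise on the last two days of every week.
rng = np.random.default_rng(7)
period, n_weeks = 7, 52
weekly = np.array([0, 0, 0, 0, 0, 3, 3], dtype=float)
pattern = np.tile(weekly, n_weeks)
n = period * n_weeks
a = 10 + pattern + rng.normal(scale=1.0, size=n)
b = 20 + pattern + rng.normal(scale=1.0, size=n)

r_raw = pearson_r(a, b)
r_adjusted = pearson_r(remove_seasonal_means(a, period),
                       remove_seasonal_means(b, period))

print(f"r on raw series:                 {r_raw:.2f}")
print(f"r on seasonally adjusted series: {r_adjusted:.2f}")
```

Because the only thing the two synthetic series have in common is the weekly cycle, the coefficient drops toward zero once that cycle is removed. Series that also contain a trend would need the trend removed as well.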

Seasonality is common in social science data. For example, the daily volume of social media conversations on two topics may share a weekly pattern, with more conversations on weekends than on weekdays, producing a high correlation coefficient even though the two topics are unrelated.
Case 3: Regime‐Switching Time Series
In the third case, we find two time series that are each generated by a sequence of different processes. In the example, they first share a common positive trend, then remain flat with negligible variability, and in the third phase follow opposite trends. Such series can be called regime-switching time series because their parameters take different values in each of a series of regimes or phases.

Calculating the correlation coefficient between the two full series, we find an almost complete lack of correlation (r = -0.07), even though visual inspection suggests that the series share common determining factors. Indeed, if we split the series into their three phases and calculate the correlation coefficient for each, we find r = 0.99 in the first phase, r = 0.00 in the second phase, and r = -0.99 in the third phase.
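The phase-by-phase calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the series are synthetic, the breakpoints between phases are taken as known in advance, and `correlation_by_segment` is a hypothetical helper name. The printed values will be close to, but not identical with, those reported above.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r between two equally long 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def correlation_by_segment(x, y, breakpoints):
    """Pearson's r within each segment delimited by the given start indices."""
    edges = [0, *breakpoints, len(x)]
    return [pearson_r(x[s:e], y[s:e]) for s, e in zip(edges[:-1], edges[1:])]

# Synthetic data (an assumption for illustration): three phases of 100 points.
rng = np.random.default_rng(1)
m = 100
up = np.linspace(0, 10, m)
a = np.concatenate([
    up + rng.normal(scale=0.3, size=m),        # phase 1: rising
    10 + rng.normal(scale=0.1, size=m),        # phase 2: flat
    10 + up + rng.normal(scale=0.3, size=m),   # phase 3: rising
])
b = np.concatenate([
    up + rng.normal(scale=0.3, size=m),        # phase 1: rising
    10 + rng.normal(scale=0.1, size=m),        # phase 2: flat
    10 - up + rng.normal(scale=0.3, size=m),   # phase 3: falling
])

r_overall = pearson_r(a, b)
r_phases = correlation_by_segment(a, b, breakpoints=[m, 2 * m])

print(f"overall r: {r_overall:.2f}")
for i, r in enumerate(r_phases, start=1):
    print(f"phase {i} r: {r:.2f}")
```

The strong positive and strong negative phases cancel out in the overall coefficient. In real data the breakpoints are rarely known beforehand and have to be estimated or justified on substantive grounds.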
In the paper “Protest and repression on social media: Pro-Navalny and pro-government mobilization dynamics and coordination patterns on Russian Twitter” (Kulichkina, Righetti, & Waldherr, 2024), we used a preliminary changepoint analysis to distinguish the phases of social media mobilization for and against Alexey Navalny before analyzing each phase individually.
Conclusions
Time series are a special kind of data and therefore require specific statistical tools. Shared seasonal or trend patterns often produce high correlation coefficients even though those patterns are not generated by the processes the analyst intends to measure. Time series can also change behavior over time as new factors come into play and different processes take over. This becomes more likely the longer the period a series covers, and it is especially common in communicative and social processes.
Misleading time series correlations of this kind are commonly referred to in time series analysis as spurious correlations, an unclear term that I hope I have helped, in part, to clarify. The topic is complex and multifaceted. For a deeper look at the main concepts of classical time series analysis, a primer is available in the online handbook I wrote for my Master’s students in Communication Science at the University of Vienna (Righetti, 2022).
References
- Kulichkina, A., Righetti, N., & Waldherr, A. (2024). Protest and repression on social media: Pro-Navalny and pro-government mobilization dynamics and coordination patterns on Russian Twitter. New Media & Society, https://doi.org/10.1177/14614448241254126.
- Righetti, N. (2022). Time Series Analysis With R. https://nicolarighetti.github.io/Time-Series-Analysis-With-R/.