Data Analytics, Stationarity, and Cointegration in Policy Research

Session report

The Generation Alpha Data Centre at IMPRI Impact and Policy Research Institute, New Delhi, conducted a Four-Day Immersive Online Certificate Training Course on ‘Data Analytics for Policy Research’ from November 4th to 25th, 2023.

The course, spread over four days, equipped policymakers, researchers, and data enthusiasts with cutting-edge analytical skills. It went beyond theory and provided hands-on training in data analytics techniques, empowering participants to derive meaningful insights from complex datasets.

On the third day, our second speaker, Dr. Soumyadip Chattopadhyay (Associate Professor, Economics, Visva-Bharati, Santiniketan; Visiting Senior Fellow, IMPRI), opened the discussion by delving into time series variables and their role in regression analysis for policy research. He focused on non-forecasting aspects and explained the significance of stationarity.

Stationarity and Its Importance

Understanding stationarity is pivotal because it determines how a variable behaves over time. Dr. Chattopadhyay highlighted that shocks to a non-stationary variable do not die out, which is why stability matters. He also discussed spurious regression, where regressing one non-stationary time series on another can suggest a relationship that does not exist, and stressed the importance of stationarity for reliable forecasting.
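The spurious regression problem can be illustrated with a small simulation. The sketch below was not part of the session; it uses simulated data only, and the sample size, number of trials, and helper names are illustrative assumptions. It regresses pairs of independent random walks on each other and compares the average R-squared with that from pairs of independent white-noise series.

```python
import random


def r_squared(x, y):
    """R-squared from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy ** 2) / (sxx * syy)


def white_noise(n, rng):
    """Independent standard normal shocks (a stationary series)."""
    return [rng.gauss(0, 1) for _ in range(n)]


def random_walk(n, rng):
    """Cumulative sum of shocks: Y_t = Y_{t-1} + u_t (a non-stationary series)."""
    level, path = 0.0, []
    for shock in white_noise(n, rng):
        level += shock
        path.append(level)
    return path


rng = random.Random(0)      # fixed seed, chosen arbitrarily for repeatability
n, trials = 200, 200        # illustrative sizes, not from the session

# The two series in each pair are generated independently,
# so any apparent fit between them is spurious.
r2_walks = [r_squared(random_walk(n, rng), random_walk(n, rng)) for _ in range(trials)]
r2_noise = [r_squared(white_noise(n, rng), white_noise(n, rng)) for _ in range(trials)]

avg_r2_walks = sum(r2_walks) / trials
avg_r2_noise = sum(r2_noise) / trials
print(f"average R^2, independent random walks: {avg_r2_walks:.3f}")
print(f"average R^2, independent white noise:  {avg_r2_noise:.3f}")
```

In runs of this kind the unrelated random walks usually produce a far higher average R-squared than the unrelated white-noise series, which is the pattern the term "spurious regression" describes.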

Detecting Stationarity

  • Graphical Approach

Dr. Chattopadhyay advocated a graphical approach, urging researchers to plot their time series data first. A stationary series fluctuates around a stable level, while a non-stationary one shows trends or erratic patterns.

  • Autocorrelation Function

Introducing a more robust measure, Dr. Chattopadhyay discussed the autocorrelation function. This statistical tool is used to construct correlograms, which provide a quantitative method for assessing stationarity.

  • Unit Root Process

The unit root process, exemplified by the random walk model, served as a theoretical underpinning for non-stationary time series. Dr. Chattopadhyay illustrated how this process reflects dependencies on past values, emphasizing the implications for regression modeling.
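The dependence on past values can be made concrete with the first-order model Y_t = rho * Y_{t-1} + u_t, where rho = 1 gives the random walk. The following is a minimal sketch, not taken from the session; the function name and the comparison value rho = 0.5 are illustrative assumptions. It traces how much of a one-off unit shock remains in the series after each period.

```python
def impulse_response(rho, horizon):
    """Remaining effect of a one-off unit shock in Y_t = rho * Y_{t-1} + u_t."""
    effect, path = 1.0, []
    for _ in range(horizon):
        path.append(effect)
        effect *= rho   # each period carries over a fraction rho of the effect
    return path


unit_root = impulse_response(1.0, 6)    # random walk: rho = 1
stationary = impulse_response(0.5, 6)   # stationary case: |rho| < 1

print(unit_root)    # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(stationary)   # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

With a unit root the shock never fades, whereas in the stationary case it decays geometrically.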

Autocorrelation Function

This segment began with an explanation of the autocorrelation function at lag k, ρ_k, defined as the covariance between Y_t and Y_{t-k} divided by the variance of Y_t. The discussion emphasized the use of correlograms to understand stationarity. A correlogram is the graph obtained by plotting ρ_k against different lag lengths k.
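For reference, the definition can be written out as follows. The notation (γ_k for the autocovariance at lag k, n for the sample size, and hats for sample estimates) is a common textbook convention assumed here, not notation confirmed by the session.

```latex
\rho_k = \frac{\gamma_k}{\gamma_0}
       = \frac{\operatorname{Cov}(Y_t,\, Y_{t-k})}{\operatorname{Var}(Y_t)},
\qquad
\hat{\rho}_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}
                    {\sum_{t=1}^{n} (Y_t - \bar{Y})^2}
```

By construction ρ_0 = 1, and each ρ_k lies between -1 and 1.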

Decision Rule for the Correlogram

The decision rule for interpreting the correlogram was then explained. If the values of the autocorrelation coefficient ρ_k hover around 0 at the various lags, the series is likely stationary. This occurs when the covariance between Y_t and Y_{t-k} is 0, implying that shocks do not significantly affect the variable over time.

Maximum Lag Length

Two important considerations when using correlograms were discussed. The first is the choice of the maximum lag length over which the autocorrelation function is calculated; a rule of thumb suggests using lags up to one-third of the length of the time series.
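The autocorrelation function, the decision rule, and the one-third rule of thumb can be combined in a short sketch. The code below is illustrative only and uses simulated data. The band of 1.96 divided by the square root of n is a common large-sample approximation for judging whether a coefficient is "around 0"; it is an assumption added here, not a rule stated in the session.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (autocovariance / variance)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    gamma0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        gamma_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(gamma_k / gamma0)
    return acf


rng = random.Random(42)     # arbitrary seed for repeatability
n = 300                     # illustrative series length
noise = [rng.gauss(0, 1) for _ in range(n)]   # stationary series

walk, level = [], 0.0       # non-stationary series: cumulative sum of the shocks
for shock in noise:
    level += shock
    walk.append(level)

max_lag = n // 3                 # rule of thumb: one-third of the series length
band = 1.96 / math.sqrt(n)       # assumed approximate 95% band around zero

noise_acf = sample_acf(noise, max_lag)
walk_acf = sample_acf(walk, max_lag)

for name, acf in (("white noise", noise_acf), ("random walk", walk_acf)):
    outside = sum(1 for r in acf[1:] if abs(r) > band)
    print(f"{name}: rho_1 = {acf[1]:.2f}, "
          f"{outside} of {max_lag} lags outside +/-{band:.2f}")
```

For the stationary series most coefficients should fall inside the band, while for the random walk the early coefficients typically start close to 1 and decline only slowly.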

Statistical Significance of ρ_k

The second consideration involved testing the statistical significance of ρ_k. Various statistical methods were mentioned, with a focus on the unit root test, specifically the Dickey-Fuller test. This test checks whether ρ, the coefficient on the lagged value of the series, is statistically significantly different from 1.
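One way to see what the Dickey-Fuller test computes is to run its simplest form by hand. The sketch below is an assumption-laden illustration, not the procedure demonstrated in the session. It uses the variant with no constant and no lagged differences, regressing the change in Y_t on Y_{t-1}, so that the slope estimates ρ - 1. The data are simulated, and the 5% critical value of about -1.95 is the approximate tabulated value for this variant.

```python
import math
import random


def dickey_fuller_tau(series):
    """tau statistic from regressing (Y_t - Y_{t-1}) on Y_{t-1}, with no constant."""
    lagged = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(x * x for x in lagged)
    delta = sum(x * d for x, d in zip(lagged, diff)) / sxx   # estimate of rho - 1
    resid = [d - delta * x for x, d in zip(lagged, diff)]
    sigma2 = sum(e * e for e in resid) / (len(diff) - 1)     # residual variance
    return delta / math.sqrt(sigma2 / sxx)                   # delta / std. error


rng = random.Random(7)      # arbitrary seed for repeatability
shocks = [rng.gauss(0, 1) for _ in range(500)]

walk, ar = [], []
w = a = 0.0
for u in shocks:
    w = w + u          # unit root process: rho = 1
    a = 0.5 * a + u    # stationary process: rho = 0.5
    walk.append(w)
    ar.append(a)

tau_walk = dickey_fuller_tau(walk)
tau_ar = dickey_fuller_tau(ar)

# Null hypothesis: rho = 1 (unit root). Reject when tau is below the critical
# value; about -1.95 at the 5% level for this no-constant variant.
print(f"random walk tau: {tau_walk:.2f}")
print(f"AR(0.5) tau:     {tau_ar:.2f}")
```

The tau statistic does not follow the usual t distribution under the null, which is why special Dickey-Fuller critical values are used. Statistical packages also offer variants with a constant, a trend, and lagged differences (the augmented test).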

Cointegration and Regression Analysis

The session transitioned to the method of cointegration as a sophisticated approach to regression analysis with non-stationary time series variables.

Addressing Non-Stationarity

Methods for making non-stationary time series stationary were discussed, including the inclusion of trend terms and differencing.
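Both transformations are simple to express in code. The following sketch uses hypothetical helper names and toy numbers chosen purely for illustration. Differencing is usually applied when the trend is stochastic, as in a random walk, while including a trend term suits a series that fluctuates around a deterministic time trend.

```python
def difference(series):
    """First difference: Y_t - Y_{t-1} (one observation is lost)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def detrend(series):
    """Residuals from an OLS regression of the series on a constant and time."""
    n = len(series)
    times = list(range(n))
    t_mean = sum(times) / n
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times, series))
             / sum((t - t_mean) ** 2 for t in times))
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in zip(times, series)]


# Toy values for illustration only (not the session's data).
walk = [1.0, 3.0, 2.0, 5.0, 4.0]
trend_series = [2.0 + 0.5 * t for t in range(6)]   # a pure linear trend

walk_diff = difference(walk)
trend_resid = detrend(trend_series)

print(walk_diff)     # [2.0, -1.0, 3.0, -1.0]
print(trend_resid)   # all approximately zero: the trend is fully removed
```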

Cointegration Test

The Engle-Granger test was introduced as a three-step process to check for cointegration between two non-stationary variables. This involves testing for the order of integration, generating estimated residuals, and applying the ordinary Dickey-Fuller test to check for stationarity in residuals.
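The three steps can be sketched on simulated data as follows. This is an illustrative outline under stated assumptions, not the session's own implementation: the series x and y, the coefficients 1.0 and 0.8, and the helper names are hypothetical, and the Dickey-Fuller regression is run in its simplest form, with no constant and no lagged differences.

```python
import math
import random


def ols(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def df_tau(series):
    """Dickey-Fuller tau: regress (Y_t - Y_{t-1}) on Y_{t-1}, with no constant."""
    lagged = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(v * v for v in lagged)
    delta = sum(v * d for v, d in zip(lagged, diff)) / sxx
    resid = [d - delta * v for v, d in zip(lagged, diff)]
    sigma2 = sum(e * e for e in resid) / (len(diff) - 1)
    return delta / math.sqrt(sigma2 / sxx)


# Simulated cointegrated pair: x is a random walk and y follows x plus noise.
rng = random.Random(1)      # arbitrary seed for repeatability
x, level = [], 0.0
for _ in range(300):
    level += rng.gauss(0, 1)
    x.append(level)
y = [1.0 + 0.8 * v + rng.gauss(0, 1) for v in x]

# Step 1: order of integration. Test the level and the first difference.
# (Shown for x only; the same check would be applied to y.)
tau_x_level = df_tau(x)
tau_x_diff = df_tau([x[t] - x[t - 1] for t in range(1, len(x))])

# Step 2: estimate the long-run regression and keep its residuals.
intercept, slope = ols(x, y)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# Step 3: test the residuals for a unit root.
tau_resid = df_tau(residuals)

print(f"x in levels:       tau = {tau_x_level:.2f}")
print(f"x in differences:  tau = {tau_x_diff:.2f}")
print(f"estimated slope:   {slope:.3f}")
print(f"residuals:         tau = {tau_resid:.2f}")
```

One caveat on step 3: because the residuals come from an estimated regression, the test statistic is normally compared with Engle-Granger critical values, which are more negative than the standard Dickey-Fuller ones (roughly -3.34 at the 5% level for two variables, an approximate asymptotic figure that should be checked against published tables).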

Data Import and Transformation

The hands-on part of the session commenced with an overview of the dataset, which spans 1960 to 2010 and contains information on consumption expenditure and GDP. Dr. Chattopadhyay detailed the steps for importing Excel files into the statistical tool and highlighted the necessity of transforming the data into logarithmic form for subsequent analysis.

Stationarity Assessment

The first critical step was to assess the stationarity of the time series variables. Dr. Chattopadhyay demonstrated the use of autocorrelation functions, correlograms, and unit root tests to ascertain the non-stationary nature of the level forms of both consumption and GDP. Emphasis was placed on statistical significance in determining stationarity status.

Differencing and Unit Root Tests

Recognizing the non-stationarity in level forms, Dr. Chattopadhyay proceeded to illustrate how differencing could be applied to make the variables stationary. Unit root tests were employed to confirm the stationarity of the first difference forms, providing essential insights for subsequent analyses.

Cointegration Analysis

The core of the session involved exploring cointegration between consumption and GDP. Dr. Chattopadhyay performed a regression analysis and generated the residuals to test for cointegration. Augmented Dickey-Fuller tests were then conducted on the residuals. The test statistic was statistically significant, so the null hypothesis of a unit root in the residuals was rejected, indicating a credible long-run relationship between consumption and GDP.


In conclusion, Dr. Soumyadip Chattopadhyay’s meticulous walkthrough of the regression analysis underscored the importance of addressing stationarity concerns and assessing cointegration before drawing policy implications. The results indicated a robust relationship between consumption and GDP, laying the foundation for future forecasting and policy-making endeavors. Participants were assured of receiving the dataset and pertinent materials for further exploration. The session concluded with an open floor for questions and clarifications.

Acknowledgment: This article was posted by Rahul Soni, a research intern at IMPRI.
