
Mastering Time Series Analysis:
Interview Guide & FAQs

A comprehensive guide to Forecasting, ARIMA, Stationarity, and the Top 20 Questions asked in Data Science Interviews.

What is Time Series?
A sequence of data points recorded in strict chronological order. Unlike cross-sectional data, the rows cannot be shuffled, because time itself is the primary axis.
Examples: Stock prices, Heart rate monitoring, Daily temperature.

1. Time Series vs. Regression (Cross-Sectional)

This is the most fundamental question. Why can't we just use Linear Regression?

  • Order: Regression lets you shuffle rows; Time Series is strictly ordered.
  • Independence: Regression assumes independent observations; Time Series assumes dependence (autocorrelation).
  • Goal: Regression predicts $Y$ from features ($X$); Time Series predicts $Y_t$ from its own history ($Y_{t-1}$).
  • Cross-Validation: Regression uses random K-Fold; Time Series uses a rolling window (time-based split).
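A quick way to see why order matters: lag-1 autocorrelation survives in the ordered series but vanishes once the rows are shuffled. A minimal sketch with simulated data (the trend and noise levels are arbitrary):

```python
import numpy as np

# Illustrative data: an upward trend plus noise.
rng = np.random.default_rng(0)
y = np.arange(100, dtype=float) + rng.normal(0, 2, size=100)

def lag1_autocorr(series):
    """Correlation of the series with itself shifted by one step."""
    return np.corrcoef(series[:-1], series[1:])[0, 1]

ordered = lag1_autocorr(y)                    # high: the past predicts the future
shuffled = lag1_autocorr(rng.permutation(y))  # order destroyed -> near zero

print(f"ordered:  {ordered:.3f}")
print(f"shuffled: {shuffled:.3f}")
```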

2. Why do we perform Time Series Analysis?

  • Forecasting: Predicting the future (e.g., Inventory planning).
  • Descriptive Analysis: Finding patterns like seasonality (e.g., "Sales peak on Fridays").
  • Anomaly Detection: Identifying weird behavior (e.g., Credit Card Fraud).

3. Requirements & Pre-processing

To run models like ARIMA, your data must meet specific criteria:

  1. Chronological Order: Sorted by date.
  2. Constant Interval: No missing timestamps (Daily means every day).
  3. Stationarity: Mean and Variance should not change over time.
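In pandas, `asfreq` is a handy check for the constant-interval requirement: it snaps the index to a fixed frequency, so any missing timestamps surface as `NaN`. The dates and values here are made up for illustration:

```python
import pandas as pd

# Hypothetical daily series with one missing day (2024-01-03).
idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"])
s = pd.Series([10.0, 12.0, 15.0], index=idx)

daily = s.asfreq("D")  # enforce a constant daily interval; gaps become NaN
print(daily)
```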

4. Correlation vs. Autocorrelation

Correlation: Relationship between two different variables (Price vs. Demand).
Autocorrelation: Relationship of a variable with itself from the past (Price Today vs. Price Yesterday).

💡 The "Interview Trap" Question: "Why is Autocorrelation bad in Regression but good in Time Series?"

In Regression, autocorrelation violates the assumption of independence (bad).
In Time Series, Autocorrelation IS the signal. If the past isn't correlated with the future, the data is just random noise, and we cannot forecast it.
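The distinction in code, using pandas on made-up price/demand numbers:

```python
import pandas as pd

price = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])
demand = pd.Series([50.0, 48.0, 49.0, 45.0, 44.0, 45.0])

# Correlation: two different variables at the same points in time.
corr = price.corr(demand)

# Autocorrelation: the same variable against its own past (lag 1).
autocorr = price.autocorr(lag=1)

print(f"corr(price, demand)    = {corr:.2f}")
print(f"autocorr(price, lag=1) = {autocorr:.2f}")
```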

5. Properties: Trend, Seasonality, Noise

Time Series data ($Y_t$) is usually decomposed into:

  • Trend (T): Long-term direction (Upward/Downward).
  • Seasonality (S): Repeating pattern at fixed intervals (e.g., Monthly).
  • Cyclicity (C): Long-term waves (Economic recessions).
  • Irregularity (I): Random noise (Residuals).
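A minimal sketch of classical additive decomposition on a simulated series (the weekly pattern and trend slope are assumptions for illustration); in practice, statsmodels' `seasonal_decompose` does this for you:

```python
import numpy as np

# Toy additive series: linear trend + weekly seasonality + noise.
rng = np.random.default_rng(1)
n, period = 70, 7
trend = 0.5 * np.arange(n)
pattern = np.array([0.0, 1.0, 2.0, 3.0, -1.0, -3.0, -2.0])
y = trend + np.tile(pattern, n // period) + rng.normal(0, 0.1, n)

# Classical decomposition: estimate the trend with a centered moving
# average, subtract it, then average each seasonal position.
ma = np.convolve(y, np.ones(period) / period, mode="valid")
detrended = y[period // 2 : n - period // 2] - ma
positions = np.arange(period // 2, n - period // 2) % period
est = np.array([detrended[positions == p].mean() for p in range(period)])

print(np.round(est, 1))  # recovers (approximately) the seasonal pattern
```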

6. Visualizing ACF & PACF Plots

These plots help identify the orders of an ARIMA model: the PACF suggests the AR order ($p$) and the ACF the MA order ($q$).

[Figure: ACF and PACF plots against lag]
  • ACF plot (Autocorrelation): a slow decay across lags signals a trend / non-stationary series.
  • PACF plot (Partial Autocorrelation): a sharp cut-off after lag $p$ signals an AR($p$) term.
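You can see the "slow decay = non-stationary" rule numerically: a random walk's sample ACF stays high across lags, while white noise's hovers near zero. A small sketch on simulated data:

```python
import numpy as np

def acf(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(2)
noise = rng.normal(size=500)  # stationary white noise
walk = np.cumsum(noise)       # non-stationary random walk

print([round(acf(walk, k), 2) for k in (1, 5, 10)])   # decays slowly from ~1
print([round(acf(noise, k), 2) for k in (1, 5, 10)])  # near zero at all lags
```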

Top 20 Time Series Interview FAQs

Prepare for your exam or interview with these essential questions.

1. What is Stationarity?
It means statistical properties (Mean, Variance) are constant over time. The data looks like "static noise" with no trend. Essential for ARIMA.
2. Seasonality vs. Cyclicity?
Seasonality: Fixed frequency patterns (e.g., every Christmas).
Cyclicity: Fluctuations with no fixed duration (e.g., Economic Recessions).
3. What is White Noise?
Purely random data with Mean=0, Constant Variance, and Zero Autocorrelation. It cannot be predicted.
4. What is Heteroscedasticity?
When the variance changes over time (e.g., volatility clusters in stock markets). We fix this using Log or Box-Cox transformations.
5. What is Spurious Correlation?
A false relationship between two unrelated variables (like Cheese consumption vs Civil Engineering degrees) simply because both have an upward trend.
6. What is the ADF Test?
Augmented Dickey-Fuller: Tests for stationarity.
Null Hypothesis = Series has a unit root (Non-Stationary).
P-value < 0.05 = Stationary (reject the null).
P-value > 0.05 = Non-Stationary.
7. What is the KPSS Test?
The opposite of ADF.
Null Hypothesis = Series IS Stationary.
If P-value < 0.05, the data is Non-Stationary.
8. How to handle Missing Values?
Avoid dropping rows (that breaks the constant interval). Use Forward Fill (last known value), Interpolation (connect the dots), or Seasonal Adjustment.
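Both fills in pandas, on a toy series with two gaps:

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=5, freq="D")
s = pd.Series([10.0, None, None, 16.0, 18.0], index=idx)

ffilled = s.ffill()       # carry the last known value forward
interp = s.interpolate()  # linearly "connect the dots"

print(ffilled.tolist())  # [10.0, 10.0, 10.0, 16.0, 18.0]
print(interp.tolist())   # [10.0, 12.0, 14.0, 16.0, 18.0]
```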
9. What is Differencing?
Subtracting $Y_t - Y_{t-1}$ to remove trends and stabilize the mean. Used to make data stationary (The 'd' in ARIMA).
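In pandas (toy numbers):

```python
import pandas as pd

y = pd.Series([100.0, 103.0, 107.0, 112.0, 118.0])  # steady upward trend

dy = y.diff().dropna()  # first difference: Y_t - Y_{t-1} (the 'd' in ARIMA)
print(dy.tolist())      # [3.0, 4.0, 5.0, 6.0] -- trend removed
```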
10. What is Box-Cox Transformation?
A power transform ($(y^\lambda - 1)/\lambda$, with $\log y$ at $\lambda = 0$) used to stabilize variance and make non-normal data look more Gaussian (Normal distribution).
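With SciPy, which also picks $\lambda$ by maximum likelihood. On simulated log-normal data, the fitted $\lambda$ should land near 0 (i.e., close to a log transform):

```python
import numpy as np
from scipy.stats import boxcox

# Simulated right-skewed, strictly positive data (Box-Cox requires y > 0).
rng = np.random.default_rng(4)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=500)

transformed, lam = boxcox(skewed)  # lambda chosen by maximum likelihood
print(f"fitted lambda: {lam:.2f}")
```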
11. AR vs. MA terms?
AR (AutoRegressive): Uses past values ($Y_{t-1}$) for prediction.
MA (Moving Average): Uses past forecast errors ($\epsilon_{t-1}$).
12. How to read ACF vs PACF?
AR Model: PACF cuts off; ACF decays.
MA Model: ACF cuts off; PACF decays.
ARMA: Both decay gradually.
13. What is SARIMA?
Seasonal ARIMA. It adds seasonal parameters $(P,D,Q)_m$, where $m$ is the season length. Useful when patterns repeat every $m$ periods (e.g., $m=12$ for monthly data with a yearly cycle).
14. Additive vs. Multiplicative?
Additive: Trend + Seasonality (Constant seasonal height).
Multiplicative: Trend × Seasonality (Seasonal height grows with trend).
15. What are Exogenous Variables?
External parallel data used to help prediction. E.g., Using Rainfall data to help predict Umbrella Sales in an ARIMAX model.
16. What are AIC and BIC?
Metrics for model selection. They reward accuracy but penalize complexity (number of parameters). Lower is better.
17. Cross-Validation in Time Series?
Standard K-Fold is forbidden (shuffling destroys order). We use Rolling Window split (Training data is always before Test data).
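scikit-learn's `TimeSeriesSplit` implements exactly this split (toy indices; assuming `scikit-learn` is available):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(-1, 1)  # 10 time-ordered samples

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))
for train_idx, test_idx in splits:
    # Training indices always end before the test indices begin.
    print(train_idx.tolist(), "->", test_idx.tolist())
```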
18. What is MAPE?
Mean Absolute Percentage Error. Easy to interpret ("Error is 5%"), but fails if actual values are 0 (undefined).
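A minimal implementation (the function name and numbers are illustrative):

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error (undefined if any actual is 0)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

print(mape([100, 200, 400], [110, 180, 400]))  # (10% + 10% + 0%) / 3
```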
19. ARIMA vs. LSTM (Deep Learning)?
Use ARIMA for simple, linear data with little data. Use LSTM for massive datasets with complex, non-linear patterns.
20. What is Facebook Prophet?
A robust open-source library designed for business forecasting. It handles missing data, holidays, and trend shifts better than standard ARIMA.

