• Standard errors

  • Heteroskedasticity robust standard errors

    • HC0

    • HC1

    • HC2

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

生成模拟数据¶

Homoscedasticity standard errors

$$ y=X\beta+\epsilon $$

  • $y\in R^{n\times1}$

  • $X=[X_1, X_2,\dots,X_n]^T, X\in R^{n\times k}, X_i\in R^{k\times1}$

  • $\beta\in R^{k\times1}$

  • $\epsilon\in R^{n\times1}$

In [7]:
rng = np.random.default_rng(seed=42)

X = np.linspace(1, 10, 10)
X = np.repeat(X, 10)

y = 3 * X + rng.normal(0, 5, size=len(X))
X += rng.normal(0, 1, len(X))

X = X.reshape(-1, 1)
y = y.reshape(-1, 1)

n, k = X.shape[0], 1

plt.scatter(X, y, s=2)
plt.show()

print(y.shape, X.shape, 
      f'n={n} k={k}')
No description has been provided for this image
(100, 1) (100, 1) n=100 k=1

$$ \hat\beta = \arg\min_{\beta} ||y- X\hat\beta||^2 = (y-X\hat\beta)^T (y-X\hat\beta) $$ $$ \downarrow \\ $$ $$ \frac{\partial ||y-X\hat\beta||^2}{\partial \beta} = 2 (-X)^T (y- X\hat\beta ) = 0 $$ $$ \downarrow $$ $$ \hat\beta = (X^TX)^{-1} X^T y $$

In [9]:
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
epsilon_hat = y - X @ beta_hat

beta_hat.shape, epsilon_hat.shape, beta_hat, 
Out[9]:
((1, 1), (100, 1), array([[2.84726633]]))

$$ \hat\beta=(X^TX)^{-1}X^T(X\beta+\epsilon)=\beta+(X^TX)^{-1}X^T\epsilon $$

$$ \downarrow $$

$$ E[\hat\beta]=\beta + (X^TX)^{-1}X^T E[\epsilon]=\beta $$

  • $E[\epsilon]=0$

Sandwich estimator¶

$$ \begin{aligned} V[\hat\beta\mid X]&=V[\hat\beta-\beta\mid X]\\ &=V[(X^TX)^{-1}X^T\epsilon\mid X]\\ &=(X^TX)^{-1}X^T V[\epsilon\mid X] X(X^TX)^{-1}\\ &=(X^TX)^{-1} \{X^T V[\epsilon\mid X] X\}(X^TX)^{-1}\\ &=(X^TX)^{-1} \{X^T E[\epsilon\epsilon^T\mid X] X\}(X^TX)^{-1} \end{aligned} $$

  • $V[\hat\beta\mid X]\in R^{k\times k}$

  • $X^T V[\epsilon\mid X] X \in R^{k\times k}$


Standard errors¶

假设球形扰动(同方差、无自相关)

$$ V[\epsilon\mid X]=\sigma^2 I = \begin{bmatrix} \sigma^2 & & \\ & \ddots & \\ & & \sigma^2 \\ \end{bmatrix} $$

$$ \downarrow $$

$$ X^T V[\epsilon\mid X] X = [X_1, X_2,\dots,X_n] \begin{bmatrix} \sigma^2 & & \\ & \ddots & \\ & & \sigma^2 \\ \end{bmatrix} [X_1,X_2,\dots,X_n]^T=\sigma^2 X^TX $$

$$ V[\hat\beta\mid X]=\sigma^2 (X^TX)^{-1} X^TX (X^TX)^{-1}=\sigma^2 (X^TX)^{-1} $$


对扰动项方差 $\sigma^2$ 的无偏估计

$$ s^2\equiv\frac1{n-k}\sum_{i=1}^n e_i^2 = \frac{n}{n-k} \frac1n \sum_{i=1}^n e_i^2 $$

$$ \hat V[\hat\beta\mid X]= s^2 (X^TX)^{-1} $$

In [21]:
# standard errors
s2 = np.sum(epsilon_hat**2) / (n-k)

v_hat_beta_hat = s2 * np.linalg.inv(X.T @ X)

se = v_hat_beta_hat**0.5

print(v_hat_beta_hat, se)
[[0.00520589]] [[0.07215188]]

Heteroskedasticity robust standard errors¶

存在异方差,但无自相关

$$ V[\epsilon\mid X] = \begin{bmatrix} E[\epsilon_1^2\mid X] & & \\ & \ddots & \\ & & E[\epsilon_n^2\mid X] \\ \end{bmatrix} $$

$$ \downarrow $$

$$ \begin{aligned} \{X^TV[\epsilon\mid X]X\}_{k\times k} &=[X_1,X_2,\dots,X_n] \begin{bmatrix} E[\epsilon_1^2\mid X] & & \\ & \ddots & \\ & & E[\epsilon_n^2\mid X] \\ \end{bmatrix} [X_1,X_2,\dots,X_n]^T\\ &=[X_1 E[\epsilon_1^2 \mid X] , X_2 E[\epsilon_2^2 \mid X], \dots, X_n E[\epsilon_n^2 \mid X]]\ [X_1,X_2,\dots,X_n]^T\\ &=\sum_{i=1}^n X_i E[\epsilon_i^2\mid X] X_i^T \end{aligned}\\ $$


$$ \begin{aligned} V[\hat\beta\mid X] & =(X^TX)^{-1} \{\sum_{i=1}^n X_i E[\epsilon_i^2\mid X]X_i^T\} (X^TX)^{-1}\\ & =(X^TX)^{-1} \{\sum_{i=1}^n E[\epsilon_i^2\mid X] X_i X_i^T\} (X^TX)^{-1}\\ \end{aligned} $$

HC0¶

异方差稳健标准误 (Eicker, 1963, 1967; Huber, 1967; White, 1980)

$$ \hat V[\hat\beta\mid X]^{HC0}=(X^TX)^{-1} \{\sum_{i=1}^n e_i^2 X_i X_i^T\} (X^TX)^{-1}\\ $$

In [37]:
v_hat_beta_hat_hc0 = (
    np.linalg.inv(X.T @ X)
    @ X.T @ np.diag(epsilon_hat.reshape(-1))**2 @ X
    @ np.linalg.inv(X.T @ X)
)

se_hc0 = v_hat_beta_hat_hc0**0.5

print(v_hat_beta_hat_hc0, se_hc0)
[[0.0040926]] [[0.0639734]]

HC1¶

小样本校正的异方差稳健标准误(Stata的默认稳健标准误,选择项 robust)(Hinkley, 1977)

$$ \hat V[\hat\beta\mid X]^{HC1}=\frac{n}{n-k} (X^TX)^{-1} \{\sum_{i=1}^n e_i^2 X_i X_i^T\} (X^TX)^{-1}\\ $$

In [39]:
v_hat_beta_hat_hc1 = (
    np.linalg.inv(X.T @ X)
    @ X.T @ np.diag(epsilon_hat.reshape(-1))**2 @ X
    @ np.linalg.inv(X.T @ X)
) * n / (n - k)

se_hc1 = v_hat_beta_hat_hc0**0.5

print(v_hat_beta_hat_hc1, se_hc1)
[[0.00413394]] [[0.0639734]]

HC2¶

在同方差的情况下,$E[e_i^2]=(1-h_i)\sigma^2$,故 $E[\frac{e_i^2}{1-h_i}=\sigma^2$

定义如下稳健标准误(Horn, Horn and Duncan, 1975)

$$ \hat V[\hat\beta\mid X]^{HC2} = (X^TX)^{-1} \{ \sum_{i=1}^n \frac{e_i^2}{1-h_i} X_i X_i^T \} (X^TX)^{-1} $$

{stata}
vce(hc2)
In [53]:
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

v_hat_beta_hat_hc2 = (
    np.linalg.inv(X.T @ X)
    @ X.T @ (np.diag(epsilon_hat.reshape(-1))**2 / (1 - np.diag(h))) @ X
    @ np.linalg.inv(X.T @ X)
)

se_hc2 = v_hat_beta_hat_hc2**0.5

print(v_hat_beta_hat_hc2, se_hc2)
[[0.00415573]] [[0.06446497]]
In [54]:
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())

results_hc0 = results.get_robustcov_results(cov_type='HC0')
print(results_hc0.summary())

results_hc1 = results.get_robustcov_results(cov_type='HC1')
print(results_hc1.summary())

results_hc2 = results.get_robustcov_results(cov_type='HC2')
print(results_hc2.summary())
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.940
Model:                            OLS   Adj. R-squared (uncentered):              0.940
Method:                 Least Squares   F-statistic:                              1557.
Date:                Thu, 17 Oct 2024   Prob (F-statistic):                    2.26e-62
Time:                        00:04:43   Log-Likelihood:                         -292.83
No. Observations:                 100   AIC:                                      587.7
Df Residuals:                      99   BIC:                                      590.3
Df Model:                           1                                                  
Covariance Type:            nonrobust                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.8473      0.072     39.462      0.000       2.704       2.990
==============================================================================
Omnibus:                        1.520   Durbin-Watson:                   1.611
Prob(Omnibus):                  0.468   Jarque-Bera (JB):                1.361
Skew:                           0.142   Prob(JB):                        0.506
Kurtosis:                       2.505   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.940
Model:                            OLS   Adj. R-squared (uncentered):              0.940
Method:                 Least Squares   F-statistic:                              1981.
Date:                Thu, 17 Oct 2024   Prob (F-statistic):                    2.85e-67
Time:                        00:04:43   Log-Likelihood:                         -292.83
No. Observations:                 100   AIC:                                      587.7
Df Residuals:                      99   BIC:                                      590.3
Df Model:                           1                                                  
Covariance Type:                  HC0                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.8473      0.064     44.507      0.000       2.720       2.974
==============================================================================
Omnibus:                        1.520   Durbin-Watson:                   1.611
Prob(Omnibus):                  0.468   Jarque-Bera (JB):                1.361
Skew:                           0.142   Prob(JB):                        0.506
Kurtosis:                       2.505   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC0)
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.940
Model:                            OLS   Adj. R-squared (uncentered):              0.940
Method:                 Least Squares   F-statistic:                              1961.
Date:                Thu, 17 Oct 2024   Prob (F-statistic):                    4.57e-67
Time:                        00:04:43   Log-Likelihood:                         -292.83
No. Observations:                 100   AIC:                                      587.7
Df Residuals:                      99   BIC:                                      590.3
Df Model:                           1                                                  
Covariance Type:                  HC1                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.8473      0.064     44.284      0.000       2.720       2.975
==============================================================================
Omnibus:                        1.520   Durbin-Watson:                   1.611
Prob(Omnibus):                  0.468   Jarque-Bera (JB):                1.361
Skew:                           0.142   Prob(JB):                        0.506
Kurtosis:                       2.505   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC1)
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.940
Model:                            OLS   Adj. R-squared (uncentered):              0.940
Method:                 Least Squares   F-statistic:                              1951.
Date:                Thu, 17 Oct 2024   Prob (F-statistic):                    5.86e-67
Time:                        00:04:43   Log-Likelihood:                         -292.83
No. Observations:                 100   AIC:                                      587.7
Df Residuals:                      99   BIC:                                      590.3
Df Model:                           1                                                  
Covariance Type:                  HC2                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.8473      0.064     44.168      0.000       2.719       2.975
==============================================================================
Omnibus:                        1.520   Durbin-Watson:                   1.611
Prob(Omnibus):                  0.468   Jarque-Bera (JB):                1.361
Skew:                           0.142   Prob(JB):                        0.506
Kurtosis:                       2.505   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC2)

生成模拟数据¶

异方差

In [163]:
rng = np.random.default_rng(seed=42)

X = np.linspace(1, 10, 10)
X = np.repeat(X, 10)

y = 3 * X + rng.normal(0, X*7)
X += rng.normal(0, 1, len(X))
plt.scatter(X, y, 
            s=1, c='b')
Out[163]:
<matplotlib.collections.PathCollection at 0x2d102301f70>
No description has been provided for this image
In [55]:
beta_hat = np.matmul(
    np.matmul(
        np.linalg.inv(
            np.matmul(
                X.reshape(-1, 1).T,
                X.reshape(-1, 1)
            )
        ),
        X.reshape(-1, 1).T,
    ),
    y
)

print(beta_hat, beta_hat.shape)
[[2.84726633]] (1, 1)
In [165]:
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())

results_hc0 = results.get_robustcov_results(cov_type='HC0')
print(results_hc0.summary())

results_hc1 = results.get_robustcov_results(cov_type='HC1')
print(results_hc1.summary())

results_hc2 = results.get_robustcov_results(cov_type='HC2')
print(results_hc2.summary())
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.101
Model:                            OLS   Adj. R-squared (uncentered):              0.092
Method:                 Least Squares   F-statistic:                              11.14
Date:                Wed, 16 Oct 2024   Prob (F-statistic):                     0.00119
Time:                        00:50:05   Log-Likelihood:                         -490.56
No. Observations:                 100   AIC:                                      983.1
Df Residuals:                      99   BIC:                                      985.7
Df Model:                           1                                                  
Covariance Type:            nonrobust                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             1.7398      0.521      3.338      0.001       0.706       2.774
==============================================================================
Omnibus:                       11.728   Durbin-Watson:                   1.554
Prob(Omnibus):                  0.003   Jarque-Bera (JB):               12.448
Skew:                          -0.729   Prob(JB):                      0.00198
Kurtosis:                       3.930   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.101
Model:                            OLS   Adj. R-squared (uncentered):              0.092
Method:                 Least Squares   F-statistic:                              7.078
Date:                Wed, 16 Oct 2024   Prob (F-statistic):                     0.00910
Time:                        00:50:05   Log-Likelihood:                         -490.56
No. Observations:                 100   AIC:                                      983.1
Df Residuals:                      99   BIC:                                      985.7
Df Model:                           1                                                  
Covariance Type:                  HC0                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             1.7398      0.654      2.660      0.009       0.442       3.037
==============================================================================
Omnibus:                       11.728   Durbin-Watson:                   1.554
Prob(Omnibus):                  0.003   Jarque-Bera (JB):               12.448
Skew:                          -0.729   Prob(JB):                      0.00198
Kurtosis:                       3.930   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC0)
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.101
Model:                            OLS   Adj. R-squared (uncentered):              0.092
Method:                 Least Squares   F-statistic:                              7.007
Date:                Wed, 16 Oct 2024   Prob (F-statistic):                     0.00945
Time:                        00:50:05   Log-Likelihood:                         -490.56
No. Observations:                 100   AIC:                                      983.1
Df Residuals:                      99   BIC:                                      985.7
Df Model:                           1                                                  
Covariance Type:                  HC1                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             1.7398      0.657      2.647      0.009       0.436       3.044
==============================================================================
Omnibus:                       11.728   Durbin-Watson:                   1.554
Prob(Omnibus):                  0.003   Jarque-Bera (JB):               12.448
Skew:                          -0.729   Prob(JB):                      0.00198
Kurtosis:                       3.930   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC1)
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.101
Model:                            OLS   Adj. R-squared (uncentered):              0.092
Method:                 Least Squares   F-statistic:                              6.933
Date:                Wed, 16 Oct 2024   Prob (F-statistic):                     0.00982
Time:                        00:50:05   Log-Likelihood:                         -490.56
No. Observations:                 100   AIC:                                      983.1
Df Residuals:                      99   BIC:                                      985.7
Df Model:                           1                                                  
Covariance Type:                  HC2                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             1.7398      0.661      2.633      0.010       0.429       3.051
==============================================================================
Omnibus:                       11.728   Durbin-Watson:                   1.554
Prob(Omnibus):                  0.003   Jarque-Bera (JB):               12.448
Skew:                          -0.729   Prob(JB):                      0.00198
Kurtosis:                       3.930   Cond. No.                         1.00
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors are heteroscedasticity robust (HC2)

References¶

  • https://www.stata.com/meeting/china22-Uone-Tech/slides/China22_Qiang.pdf

    • 主要参考了陈强教授的讲义
  • https://en.wikipedia.org/wiki/Ordinary_least_squares

  • https://en.wikipedia.org/wiki/Heteroskedasticity-consistent_standard_errors

  • https://en.wikipedia.org/wiki/Gram_matrix

In [ ]: