Sunday, January 29, 2012

Hausman Test, Small Number of Clusters and Bootstrapped Standard Errors

Person A:
Have you ever run into estimation problems due to a finite number of
clusters (M < 50) or uneven cluster sizes (where some clusters make up
more than 5% of the sample)?

A problem that is troubling me at the moment: WITHOUT clustering (at the unit level), I cannot
reject exogeneity of the unit-specific effects and thus would be inclined to use random effects. But once I cluster and rerun the test, I reject exogeneity. SO:

WITHOUT clustering --> cannot reject exogeneity of RE (i.e. can use RE)
WITH clustering --> reject use of RE

I know that clustering is supposed to make the SEs larger, but how would that lead to rejection of exogeneity (using xtoverid after the re, cluster regression), since the SE estimates for both the RE and FE estimations would
be affected in the same way? Could this have anything to do with the finite number of clusters, or with having a few really big clusters (each of the three big ones comprises 8-11% of the sample)?



Person B:
 - Which Stata version?
 - Second, you are using xtoverid essentially to choose between RE and FE, correct?
 - Third, I think I read somewhere that you can't use xtoverid after clustering, but there may be a way to use the Hausman test.
 - Fourth, I don't think I have ever had a situation where someone asks, "why are you clustering?" Usually they complain if you are NOT clustering. So you may be able to get away with this in a publication sense.


Person A:
- I am using Stata 10, fully updated.
- I am using xtoverid to test exogeneity of the individual-specific
effect. If the effects come out as exogenous, I say "okay, the RE assumptions
are met and I can use RE."
- I am using xtoverid because the Hausman test does not work with
clustered standard errors, while xtoverid does. So, yes, they are two
different tests with two different results, but they are asymptotically
equivalent; if you run xtoverid with ordinary standard errors it gives
the same result as the Hausman test.
- I know I should cluster, so I guess that is not the issue. The issue
is: can I use random effects? I can just say "oh, the test for
exogeneity failed, so I assume the individual effects are correlated
with the regressors, the RE assumptions are not met, and I will go with
FE." BUT I really want to know why, because I worry it means there is
some bigger issue at stake here (like problems with the asymptotics due
to the finite number of clusters, or some clusters accounting for more
than 5% of the data). If I have problems that can mess with the
asymptotics, then my inference can be all wrong.
- SO, I want to know whether these things are driving this weird
difference in results when clustering vs. not clustering, whether I need
to be worried in a more general sense, OR whether there is something
else going on entirely. Why would the results differ with clustering?
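For concreteness, the two test sequences being compared look roughly like this in Stata (variable and panel names here are placeholders, not from the original post; xtoverid is the user-written command by Schaffer and Stillman, installable via -ssc install xtoverid-):

```stata
* Declare the panel structure (id = unit, year = time; placeholders)
xtset id year

* Classical Hausman test -- does not allow clustered SEs:
xtreg y x1 x2, fe
estimates store fe
xtreg y x1 x2, re
estimates store re
hausman fe re

* Asymptotically equivalent overidentification test that DOES
* allow clustering at the unit level:
xtreg y x1 x2, re vce(cluster id)
xtoverid
```

With ordinary (non-clustered) standard errors in the last regression, xtoverid should reproduce the Hausman result; the puzzle in this thread is that the clustered version rejects while the classical one does not.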



Person B:
Isn't the idea behind clustering that errors are correlated within a cluster but independent across clusters, so that you effectively want to treat each cluster as one unit?

There is a lot of grey area here that I hadn't really pondered. My gut intuition says it may have to do with unbalanced panels, so you are right, it's related to the size of the clusters.



Iamjustapointe wrote:
Let me restate the problem to see if I understand.

Problem A
If you cluster, i.e. allow the errors to be arbitrarily correlated within each cluster, then you reject RE in the overidentification test.

Namely, once within-cluster correlation in the errors is accounted for, the regressors appear to be correlated with the error term.

Problem B
You have a small number of clusters, i.e. < 50.
So asymptotic results at the cluster level (i.e. the betas being approximately Gaussian, or rather t-distributed) may not hold.

Given A & B
Should Problem A even be taken at face value if we face Problem B?


***********************************************************************************
Solution A
Have you tried bootstrapping by cluster, so as not to rely on the asymptotic distribution of your statistics?

I have had this problem in the past, and bootstrapping assuaged concerns with my small number of clusters. The asymptotic approximations behind the usual statistics depend on the CLT and a large number of clusters. Bootstrap inference leans on this less, because repeated resampling (of whole clusters) essentially re-creates the sampling distribution rather than assuming its shape.

See: Cameron, Gelbach & Miller, "Bootstrap-Based Improvements for Inference with Clustered Errors", Review of Economics and Statistics, 2008.


They consider small G (as small as 4 in some simulations). They use bootstrap methods which, under certain circumstances, can actually yield more reliable confidence intervals than the analytically "correct" (i.e. asymptotically correct) standard errors.


Solution B
The paper, beginning at the bottom of page 3, describes two alternatives you could try:

"One approach, suggested by Donald and Lang (2001), is to effectively treat the number of groups as the number of observations, and use finite sample analysis (with individual-specific unobservables becoming unimportant – relative to the cluster effect – as the cluster sizes get large). A second approach is to view the cluster-level covariates as imposing restrictions on cluster-specific intercepts in a set of individual-specific regression models, and then imposing and testing the restrictions using minimum distance estimation."

Person B:
That is a pretty cool response to small clusters (I have not yet faced that problem as I'm usually clustering on states or countries :-p But this could come in handy). One question though: can you still use xtoverid after using bootstrapped clustering?

Iamjustapointe wrote:
I believe so, since xtoverid is run after xtreg, which accepts bootstrapping:
http://www.stata.com/statalist/archive/2010-04/msg01412.html
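Something along these lines (names are placeholders; note that for xtreg, vce(bootstrap) resamples whole panels, which is exactly the cluster resampling we want here, though whether xtoverid accepts the bootstrapped results is worth verifying against that Statalist thread):

```stata
set seed 12345                       // reproducibility
xtset id year
xtreg y x1 x2, re vce(bootstrap, reps(400))
xtoverid
```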

You can also bootstrap "by hand" (cluster_unit below stands in for the clustering variable; add_stat is a user-written helper, not an official Stata command):

* BOOTSTRAPPED STD ERRORS
set seed 12345                 // for reproducibility

local B = 1000
matrix bs = J(`B', 1, 0)
forvalues b = 1(1)`B' {
  qui {
     * use "preserve"/"restore" so each iteration resamples
     * (by cluster) from the original dataset
     preserve
     bsample, cluster(cluster_unit)
     capture drop xb lambda
     * first stage: probit, then the inverse Mills ratio
     probit y x1 x2
     predict xb, xb
     gen lambda = normalden(xb) / normal(xb)
     * second stage: wage regression including the Mills ratio
     reg log_wage edyrs age lambda
     * store the first coefficient in e(b) (here: edyrs)
     matrix e = e(b)
     matrix bs[`b', 1] = e[1, 1]
     restore
  }
}

svmat bs
summ bs1, det
* the bootstrapped SE is the std. deviation of the stored coefficients
local bs_se = r(sd)
di "bootstrapped standard error: `bs_se'"
add_stat "bs_se" `bs_se'




Person A:


Update -- I figured out the reason my exogeneity test fails for random effects when I allow for arbitrary heteroskedasticity and autocorrelation by clustering the errors. Recall that one of the assumptions of random effects is that u_i and e_it are homoskedastic and that e_it is uncorrelated across t. When I tell Stata to cluster the errors, I am relaxing this assumption. That is not fatal for RE per se (see Wooldridge's panel data textbook), but it does mean that if my SE estimates change after allowing for het. and a.c., then the assumption was never sufficiently true in the first place and I was underestimating my errors. Underestimating the errors would over-estimate the t-stats in any exogeneity test and lead to over-rejection of the null.

A separate problem I was having: in addition to all this, one of my variables was correlated with the facility effect (which I was modeling with RE) -- when I estimate things without that variable, everything behaves better.

The question remains as to what to do from here, but at least I figured out that a) my cluster SIZE should not be an issue (in fact, cluster size is WAY less important than the number of clusters), and b) the estimation method behind Stata's cluster option does 'reasonably well' with a small number of clusters (G = 10). With G = 30, I fall between "small" and "enough" (G = 50 is considered safe, but some people say 30 is fine). In any case, cluster size or number does not seem to be my problem. Nonetheless, bootstrapping has been shown to perform better than the analytic cluster-robust errors, especially with fixed effects (not sure about RE), so that is my next step.


