Thursday, June 30, 2011

R Squared

Why and When to Use it?

Economists are not so keen on using R-squared, or even the adjusted R-squared, which dutifully adjusts for the number of parameters being estimated in a model.

Here are a few people's 2 cents on it:

Reader 1: I thought I'd put in my 2 cents here -- so far in the stuff I have
done with Ken and Andreas and Erkut, no one seems to care about the R2
at all! And papers I have read recently do not discuss it, though I
always report it. In general, if the adj R2 goes up, it suggests that
the model fit is better with the additional variable included. We use
adjusted because adding a var can never make the regular R2 go down, so
a rise in it does not really tell us anything (which you probably already know).

Reader 2: I don't ever use R squared. I use the adjusted R squared informally, in the spirit of Altonji (2005), to check the robustness of the coefficients to additional independent variables. So let's say you are looking at whether years of schooling lead to higher growth. Now someone might claim your estimation is too parsimonious. Let's say you have not included inflation. So you add inflation and see if the adjusted r squared improves. If it improves, it's a better fit. And if your coefficient on years of schooling is still significant, it implies your estimation is robust. That is basically when I use adjusted r squared. I read somewhere that the F-stats are better.

Reader 3:
Yes to adjusted r-squared... both the F stat and adjusted r squared do the same thing in terms of adjusting for the # of parameters being estimated (hence the adjustment); otherwise r squared straight up increases as the # of variables in your model (i.e. parameters being estimated) increases.

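Why? Because the sum of squared errors (residuals) is a convex function that you are minimizing, and with more variables in your function you are minimizing over a bigger set: the smaller model is just the bigger one with the extra coefficients forced to zero. So the minimum can never be higher with more variables. Hence SSerrors goes down (or at worst stays put) and Rsquared=1-(SSE/SST) goes up.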
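Here is a minimal sketch of that point in Python with NumPy, on made-up data. The helper `r_squared`, the variable names, and the seed are all just for illustration; the "junk" regressor is pure noise, unrelated to y by construction, and the regular R squared still does not go down when it is added.

```python
import numpy as np

def r_squared(y, X):
    """R-squared of an OLS fit of y on X plus an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
    sse = ((y - X @ beta) ** 2).sum()             # sum of squared errors
    sst = ((y - y.mean()) ** 2).sum()             # total sum of squares
    return 1 - sse / sst

rng = np.random.default_rng(0)   # arbitrary seed, simulated data only
n = 200
x = rng.normal(size=n)
junk = rng.normal(size=n)        # unrelated to y by construction
y = 1.0 + 2.0 * x + rng.normal(size=n)

r2_small = r_squared(y, x)                            # y on x
r2_big = r_squared(y, np.column_stack([x, junk]))     # y on x and junk
print(r2_small, r2_big)
```

Whatever the seed, the second number should be at least as large as the first, which is exactly why a rise in the regular R squared says nothing about whether the new variable belongs.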

R squared adjusted divides SSE by its degrees of freedom (the number of observations minus the number of parameters estimated) to counteract the SSE going down. Same idea with the F stat.
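For reference, these are the usual textbook definitions, written out in LaTeX. The symbols n (number of observations) and k (number of regressors, not counting the intercept) are not used in the discussion above; they are standard notation added here.

```latex
\[
R^2 = 1 - \frac{SSE}{SST},
\qquad
\bar{R}^2 = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)}
          = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}
\]
\[
F = \frac{(SST - SSE)/k}{SSE/(n-k-1)}
  = \frac{R^2/k}{(1-R^2)/(n-k-1)}
\]
```

Both the adjusted R squared and the F stat scale SSE by n-k-1, so an extra variable has to lower SSE by enough to make up for the lost degree of freedom.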

Response to Reader 2: This is the sentence I wanted: "And if your coefficient of years of schooling is still significant, it implies your estimation is robust."
So, reader 2 uses adjusted r squared for robustness.

But what if adj r sq goes up but schooling becomes insignificant?
Not robust, right?
Throw out the model?

Reader 2: If your adjusted r squared goes up and years of schooling loses significance, it implies that a) inflation is important to your model and b) years of schooling was capturing something that inflation explains. So your estimation is not robust, and you have to either justify theoretically why inflation does not belong in the model, or claim that inflation steals away an important years-of-schooling effect. Maybe inflation reflects the state of the economy, and when you have a bad economy, schooling plummets. Anyway, here is where everything becomes an art form. Obviously if you add every possible variable out there, you will eventually lose significance, so theory has to guide your specification. Of course this is just a robustness check. I found a paper that claimed it wasn't the greatest robustness check either, as you are hand-picking measurable variables when your issue is omitted variable bias.
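To make that scenario concrete, here is a rough sketch in Python with NumPy on simulated data. Everything in it is an assumption for illustration: the `ols` helper, the coefficients, and the seed are made up, and the data are built so that schooling has no true effect on growth while inflation drives both. It is not the procedure from Altonji (2005), just the informal "add a control and look again" check described above.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept; returns coefficients, t-stats, R2, adjusted R2."""
    n = len(y)
    X = np.column_stack([np.ones(n), X])
    k = X.shape[1]                                  # parameters incl. intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    sigma2 = sse / (n - k)                          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / (n - k)) / (sst / (n - 1))
    return beta, beta / se, r2, adj_r2

rng = np.random.default_rng(1)                      # arbitrary seed
n = 500
inflation = rng.normal(size=n)
schooling = -0.8 * inflation + rng.normal(size=n)   # bad economy, less schooling
growth = -1.0 * inflation + rng.normal(size=n)      # schooling has NO true effect

# Parsimonious model: growth on schooling only.
_, t_short, _, adj_short = ols(growth, schooling)
# Add the omitted control: growth on schooling and inflation.
_, t_long, _, adj_long = ols(growth, np.column_stack([schooling, inflation]))

print("schooling t-stat without inflation:", t_short[1])
print("schooling t-stat with inflation:   ", t_long[1])
print("adjusted R2:", adj_short, "->", adj_long)
```

In a setup like this the adjusted R squared rises once inflation is included and the schooling t-stat shrinks a lot, which is the "not robust" case. With real data you do not know which story generated it, which is where theory has to come in.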

Reader 3: I'm a non-believer in adj r-squared, except for Asif's use above.
Why does everyone else use it so much? Sociologists? Business? Do they not know better?

1 comment:

  1. I was talking over R-Squared with my co-author :) the other day, and asking, why the obsession with R squared in the business world? And he mentioned something I hadn't thought of. At least in finance, if you're making money off the market, or even if you're a small business and can't change any of your strategic parameters, then all you care about is getting the right model to predict where the market is going to go. Then you can still hedge against that prediction.

    But... if you have control over parameters that will affect the success of your own business, then you should start looking at significance. For some reason this distinction between financial prediction and hedging for profit vs. effecting change for profit is not made clear.