Oog Robot: biases

2011-01-25

Selection Biases in Real World Data

So a web page showing the average SAT score by state has been making the rounds in Iowa. Apparently Iowans are so smart! But something seemed fishy about the data... oh, right, the participation rate is 3% in Iowa, and in general states with a lower participation rate have better scores.

Now, if you take the ACT data for 2010, Iowa is around 15th, but with a much higher participation rate.

Look at Maine! Maine is ranked dead last, 51st, on the SAT list, but 5th on the ACT list, up there with a lot of East Coast states whose students didn't do so well on the SAT list. New York is 46th on one list and 4th on the other.

Why the difference in Iowa's rank? And why the huge difference in Maine's rank?

Selection bias among the test-takers is likely at work. ACT has its mothership in Iowa City, so Iowa is likely to push the ACT because of political and economic ties. In my experience, only seasoned test-takers in Iowa take the SAT. The ACT is "good enough" for most people, but because of the additional practice and the inherent variance of the tests, you can get a higher personal best simply by taking more tests (personal note from an MIT grad: I took each test 6 times, once per year from 7th grade to 12th grade). Thus Iowans hoping to get into a selective school are more likely to take the SAT. A similar effect is probably at work with the ACT in states like Massachusetts: lower participation rates in these states are caused by forces that select for the best test-takers and smartest students.
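To see how much participation alone can move a state average, here is a rough simulation. Everything in it is made up for illustration: the score distribution (mean 500, standard deviation 100), the function name `mean_score_of_takers`, and the extreme assumption that only the strongest students sit the test. Real selection is far noisier, so treat it as a sketch of the direction of the effect, not its size.

```python
import random


def mean_score_of_takers(participation, rng, n_students=50_000):
    """Mean score among the top `participation` fraction of students.

    Assumption: every "state" draws from the same score distribution, and
    only the strongest students take the test.
    """
    # Hypothetical score distribution: normal with mean 500, sd 100.
    scores = sorted((rng.gauss(500, 100) for _ in range(n_students)), reverse=True)
    # Keep only the top slice of students, at least one.
    takers = scores[: max(1, int(participation * n_students))]
    return sum(takers) / len(takers)


rng = random.Random(0)  # fixed seed so the run is repeatable
results = {rate: mean_score_of_takers(rate, rng) for rate in (0.03, 0.25, 0.90)}
for rate, mean in results.items():
    print(f"participation {rate:.0%}: mean score {mean:.0f}")
```

Identical student populations end up with very different averages, and the 3% "state" comes out on top purely because of who showed up.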

Making the SAT difficult to take in Iowa might help the state look the best in the SAT rankings, even though the ACT is HQ'ed in Iowa City. But most of all, reading too much into improperly gathered rankings is dangerous.

2009-11-02

Randomizing a Coin Toss

Way to report only the problem, Freakonomics blog:

But it may be that the random coin toss isn’t so random. A 2007 study found that a vigorously flipped coin is likely to land on the same side it started on at least 51 percent of the time, possibly more depending on the person doing the flipping.

Say you have a coin with a bias that doesn't change from flip to flip. You can remove the bias by flipping the coin twice and taking the first result if the second result differs from the first, or repeating the pair of flips if it lands the same way both times.

Mathematically, if the probability of landing heads is P, then:

P^2 = Prob[HH]
P*(1-P) = Prob[HT]
(1-P)*P = Prob[TH]
(1-P)^2 = Prob[TT]

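Since the probability of HT is the same as TH, each of those two outcomes has probability 1/2 once you throw out HH and TT, so taking the first element removes the bias (as long as P is not exactly 0 or 1).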
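The trick is short enough to sketch in code. The names `biased_flip` and `fair_flip` and the bias of 0.7 are just illustrative choices, and the "coin" here is a pseudorandom generator standing in for a real one.

```python
import random


def biased_flip(p_heads, rng):
    """Return 'H' with probability p_heads, otherwise 'T'."""
    return "H" if rng.random() < p_heads else "T"


def fair_flip(p_heads, rng):
    """Flip twice; keep the first result if the two differ, else retry."""
    if not 0 < p_heads < 1:
        # A coin that always lands the same way would loop forever.
        raise ValueError("p_heads must be strictly between 0 and 1")
    while True:
        first = biased_flip(p_heads, rng)
        second = biased_flip(p_heads, rng)
        if first != second:
            return first


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
raw_heads = sum(biased_flip(0.7, rng) == "H" for _ in range(n)) / n
fair_heads = sum(fair_flip(0.7, rng) == "H" for _ in range(n)) / n
print(f"raw: {raw_heads:.3f}, debiased: {fair_heads:.3f}")
```

The price is extra flips: a pair is kept with probability 2*P*(1-P), so on average you need 1/(P*(1-P)) flips per fair result, which is about four for a nearly fair coin and more as the bias grows.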

Mad props to von Neumann for inventing this trick (publication was "Various techniques used in connection with random digits"), along with fundamental portions of the theory for quantum mechanics, cryptography, and game theory. They don't make mathematicians like they used to.
