Friday, April 15, 2011

The alpha and the beta

Stats 101, just so everyone's on the same page: there are 2 types of errors.

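Type-1 (alpha) error is a false-rejection error: we reject something that was actually fine (a false alarm).

Type-2 (beta) error is a false-acceptance error: we accept something we should have rejected (a missed detection).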

What does this mean for our lives?

Let's pretend there exists a social welfare program for people whose houses spontaneously combusted. In order to determine if they should be granted restitution, their situation is quantified on a scale by some means. Of course there will be people who burn their houses down to scam money and we would seek to exclude them from this system. In an ideal world, we would imagine a distribution of scammers and non-scammers on the scale as such:


Yes, prepare for a lot of bad MSPaint diagrams because I can't be assed to work with Illustrator.

So in this ideal world, it's simple. We set P as the cut-off point and exclude everyone who scores higher than P on the scale and accept everyone who scores less than P. Done!

Except no.

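By the central limit theorem (applied loosely, on the assumption that each score is the sum of many small independent factors), we would expect roughly normal distributions of both populations centered around separate means (if this test has any amount of effectiveness):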


Point Q is where the average Spontaneous Combustor shows up and point R is where the average Scammer does. Under the two bells is how the populations of both categories are expected to be distributed. Notice, though, how they overlap! I guess you can argue that a better test would space the curves further apart, but realistically you rarely see that kind of separation in any meaningful way, even in stochastic processes that don't involve one set of sentient beings trying to appear like the other. What you can do is tighten the acceptance criteria and shift P left, or relax them and shift P right.
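To put rough numbers on that trade-off, here is a minimal Python sketch. Everything in it is assumed for illustration: the means standing in for Q and R, the shared spread, and the candidate cut-offs are made up, since the scale above has no real units. It only shows that moving P trades one error for the other.

```python
from statistics import NormalDist

# Hypothetical numbers for illustration only; the post gives no actual scale.
combustors = NormalDist(mu=40, sigma=10)  # genuine claimants, centered at Q
scammers = NormalDist(mu=60, sigma=10)    # scammers, centered at R


def error_rates(p):
    """Return (alpha, beta) as fractions for a cut-off p.

    alpha: share of genuine combustors scoring above p (wrongly rejected).
    beta:  share of scammers scoring at or below p (wrongly accepted).
    """
    alpha = 1.0 - combustors.cdf(p)
    beta = scammers.cdf(p)
    return alpha, beta


# Shifting P left (45) tightens the criteria; shifting it right (55) relaxes them.
for p in (45, 50, 55):
    alpha, beta = error_rates(p)
    print(f"P={p}: alpha={alpha:.1%}, beta={beta:.1%}")
```

Under these assumed curves, a lower P shuts out more genuine claimants while letting fewer scammers through, and a higher P does the reverse. No value of P makes both errors zero as long as the bells overlap.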

Okay, here's where the math gets a bit sketchy with assumptions, but bear with me. You can decide if it makes sense or not (it does to me), but it doesn't affect my core premise either way.

The population of Scammers is much smaller than the population of Spontaneous Combustors, so our distributions actually look like this:


Now if P is where the percentage error is equal for both populations (% alpha equals % beta), the absolute alpha and beta for our leftmost graph is:


I might've eyeballed P too far right actually, but no matter. As you can see, because one group is larger than the other, even if we accidentally accept the same percentage of blue as we reject red, the absolute alpha is much bigger than beta.
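The same point as a quick calculation, again in Python and again with made-up inputs: the population sizes (9,000 genuine claimants against 1,000 scammers), the means, and the spread are all assumptions picked only to make the arithmetic visible.

```python
from statistics import NormalDist

# All figures here are invented for illustration; the post supplies none.
combustors = NormalDist(mu=40, sigma=10)
scammers = NormalDist(mu=60, sigma=10)
n_combustors = 9_000  # assumed: genuine claimants far outnumber scammers
n_scammers = 1_000

# With equal spreads, the two percentage errors match at the midpoint of the means.
p = (combustors.mean + scammers.mean) / 2

alpha_rate = 1.0 - combustors.cdf(p)  # fraction of genuine claimants rejected
beta_rate = scammers.cdf(p)           # fraction of scammers accepted

# Same percentage, very different head counts.
alpha_count = alpha_rate * n_combustors
beta_count = beta_rate * n_scammers

print(f"alpha: {alpha_rate:.1%} of combustors, about {alpha_count:.0f} people")
print(f"beta:  {beta_rate:.1%} of scammers, about {beta_count:.0f} people")
```

Because the two rates are equal at this P, the absolute errors differ by exactly the ratio of the population sizes: with the assumed 9:1 split, nine genuine claimants are turned away for every scammer let through.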

So a huge part of social policy is really all about where we want to move P. We will never know absolutely how big each slice is because obviously if we had a foolproof way to detect it, we wouldn't be committing these errors in the first place! The question is in which direction do we want to err?

In every aspect of our social systems, Conservatives are so frightened of Type-2 errors that they cripple those systems beyond usability for many people with legitimate needs for them, whether it's something like unemployment insurance, disability insurance, or even something as fundamental to democracy as voting.

Realistically, some number of people in real need are going to look exactly like some number of people who aren't, in whatever system you're using to quantify "need". I would argue the number of the former is much greater than the number of the latter in whatever confidence interval we're using, but for the sake of argument let's say it's 1:1.

You have 2 applicants to social welfare program x. One who is in dire need through no fault of his own and another one who is scamming for money.

You can choose to shut neither or both out. Which do you choose? Which do you think is the appropriate choice for a developed democratic society?

Either the CPC thinks that a token amount of infringement is so unbearably galling that they would rather shut the door on someone who has paid into the social system in the expectation of being protected when luck turns sour, or they are just looking for an excuse to Not Give a Damn.

Pathetic. Or evil. You decide.

Additional food for thought: Engineering stats begins at the CLT, whereas that's about the point where the stats most math students take ends. It's almost as though they said to themselves, "whoa, we better stop now or else they might learn something useful!".
