Generalizations vs. Statistics

This topic has 51 replies, 10 voices, and was last updated 12 years, 10 months ago by squeak.

Viewing 50 posts - 1 through 50 (of 52 total)

Author

Posts
February 17, 2013 1:48 am at 1:48 am #608239

Torah613Torah
Participant

In another topic, OOM states that “Generalizations are almost as bad as statistics”. I think her statement is worthy of deeper examination.

OOM, by way of introduction, it may interest you as a Humanities person that a very early description of statistics – “by a small sample, we may judge of the whole piece”, comes from Don Quixote.

Statistics are not exactly the same thing as generalizations, but they are more similar than they look. A generalization uses statistics in exactly the same way that a three year old boy uses calculus when learning how to aim a ball. He has no idea of the principles involved, but he can do it anyway. He can’t build an Iron Dome without calculus, though.

The world, and most of our life, especially our shidduch system, runs on generalizations. Girls from seminary x are type x. Boys from x chabura are good boys. This works, or doesn’t work, depending on who you ask, but it is useful.

Generalizations may help in making choices, but they can’t be validated or compared. Most boys from Brisk are smart, but how smart? Are they less smart than boys from Gush? If Lakewood boys have a lower average intelligence, but with higher variance, than boys from Brisk, then if you date an above average boy from Lakewood, he is probably more intelligent than an average Brisk boy. A generalization would not help you in deciding a question like which boy to date, but statistics would.

Statistics allows you to say TRUE or FALSE by turning generalizations into quantified facts. “Seminary is really important for spiritual growth” vs. “Seminary attendance increases average prayer time by 2 hours per student per day in the three-year post-seminary period”. Either the students pray for 2 hours more a day, or they don’t. The former statement can be argued with. The latter speaks for itself.

However, statistics are no substitute for judgment (paraphrased). NASI (National Abusers of Statistical Information), is an example of an organization that abuses statistics. They use statistics as a drunken man uses lamp-posts, for support rather than illumination (paraphrased). However, the drunken man could not see the lamp-posts, were it not for illumination. So they have the right idea, even if they are not going about it properly.

Both statistics and generalizations can be used to argue for or against anything you wish. Real Truth is only found in the Torah.

Your thoughts?

February 17, 2013 1:54 am at 1:54 am #930856

popa_bar_abba
Participant

Popa knows statistics. He can run a Z score, and a T score, and all the clever things he can do to convince you of things that aren’t true.

February 17, 2013 1:58 am at 1:58 am #930857

Torah613Torah
Participant

Popa: Who do you believe – me or your lyin’ eyes?

February 17, 2013 2:02 am at 2:02 am #930858

popa_bar_abba
Participant

Knowing statistics can make you sound very smart, when in fact you are really dumb. I use this to great effect.

February 17, 2013 2:03 am at 2:03 am #930859

OneOfMany
Participant

Relying on statistics is way worse than relying on generalizations.

Read How to Lie with Statistics by Darrell Huff (favorite book of LIFE).

February 17, 2013 2:09 am at 2:09 am #930860

Torah613Torah
Participant

PBA: We all do. Although I don’t recall when you last used statistics, period.

February 17, 2013 2:18 am at 2:18 am #930861

popa_bar_abba
Participant

I don’t mean citing a study; I mean knowing how to examine info for statistical significance.

I don’t think I ever did it on this site, but on a different one, I did. You won’t find it.

But if the occasion arises, I’ll do it for you.

February 17, 2013 2:18 am at 2:18 am #930862

Torah613Torah
Participant

OOM: I own that book. 🙂 How to Lie with Statistics is just about my favorite secular book ever, right up there with Harry Potter, Dale Carnegie, The Phantom Tollbooth, and Larger than Life.

February 17, 2013 2:20 am at 2:20 am #930863

OneOfMany
Participant

yayz Phantom Tollbooth

I wrote a college admissions essay on that book. ^_^

February 17, 2013 2:25 am at 2:25 am #930864

popa_bar_abba
Participant

There is something funny about a book that purports to use numbers to show how numbers are manipulated, and everyone reading the book believes it.

Perhaps the book is self-proving.

February 17, 2013 2:34 am at 2:34 am #930865

Torah613Torah
Participant

PBA: Very nice of you to offer. What kind of info and what kind of significance?

Z/T are pretty basic.

February 17, 2013 2:36 am at 2:36 am #930866

Torah613Torah
Participant

PBA: It’s a quick and entertaining read. The author was a journalist, not a statistician.

February 17, 2013 2:45 am at 2:45 am #930867

OneOfMany
Participant

It doesn’t use numbers. Did you read the book?

February 17, 2013 2:51 am at 2:51 am #930868

Torah613Torah
Participant

OOM: It does use numbers. At least in my version.

February 17, 2013 3:13 am at 3:13 am #930869

OneOfMany
Participant

I took “use” to imply that they are the basis for his ideas. They aren’t.

And I have to ask, how is that book of any use to someone who is ideologically opposed to knowledge?

February 17, 2013 6:17 am at 6:17 am #930870

squeak
Participant

” If Lakewood boys have a lower average intelligence, but with higher variance, than boys from Brisk, then if you date an above average boy from Lakewood, he is probably more intelligent than an average Brisk boy.”

Inconclusive. He could be more or less.

” Statistics allows you to say TRUE or FALSE by turning generalizations into quantified facts.”

Patently false. No amount of statistics yields facts.

” NASI (National Abusers of Statistical Information), is an example of an organization that abuses statistics.”

No, their entire problem is that they completely lack statistics and instead use emotional anecdotery. Would that they abused statistics, for then they could be made to see reason.

You asked for our thoughts so here they are. Study harder for your test.

February 17, 2013 6:45 am at 6:45 am #930871

popa_bar_abba
Participant

Squeak is as usual correct. Statistics allow you to predict how accurate your outcome is by determining how likely it is that a sample would have the outcome you have if the underlying population did not have the criteria you hypothesize. But it will always be a likelihood.

If you flip a quarter 10 times and get 10 heads, statistics will tell you that the quarter is weighted and we will say that it is statistically significant at the 99% level. But we still don’t know it is true, because it will still happen with a normal quarter 1% of the time.

(z score=3.16 p hat is 1, and p is .5. Square root of .5*.5 is .5. Divided by square root of 10 is .158. Which means we are 3.16 stdevs off the mean.)

Stated differently, if you took 100 quarters and flipped each one of them 10 times, you could expect that one of the times you would have either 10 heads or 10 tails. If you do it with one quarter though, it is so unlikely that it will happen, that you can still conclude that it is weighted with reasonable certainty.

Torah613: Maybe what I just did is pretty basic, but I’m pretty proud of myself for knowing how to do it anyway, and very few of my friends can do that.
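The 100-quarters figure can be checked directly from the coin-flip probabilities, without a z table. The sketch below assumes fair, independent flips; the variable names (`p_all_same`, `n_quarters`, and so on) are illustrative and do not come from the post.

```python
# Chance that one fair quarter shows the same face on all 10 flips
# (10 heads or 10 tails): the first flip is free, the other 9 must match it.
flips = 10
p_all_same = 0.5 ** (flips - 1)  # 1/512

# Repeat the 10-flip experiment with 100 independent fair quarters.
n_quarters = 100
expected_streaks = n_quarters * p_all_same
p_at_least_one = 1 - (1 - p_all_same) ** n_quarters

print(f"P(all same, one quarter) = {p_all_same:.6f}")
print(f"Expected streaks among {n_quarters} quarters = {expected_streaks:.3f}")
print(f"P(at least one streak) = {p_at_least_one:.3f}")
```

Under these assumptions the expected number of streaks among 100 quarters is about 0.2 rather than 1, so it would take roughly 500 quarters before one such streak is expected.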

February 17, 2013 7:13 am at 7:13 am #930872

Torah613Torah
Participant

Squeak:

“Inconclusive. He could be more or less.”

Nope – higher variance means the spread of the data is larger, covering a wider range. Assume a special bell curve. Let’s say the average Brisk IQ is 120, with a variance of 4. That’s a tall bell curve. And let’s say the average Lakewood IQ is 118, with a variance of 20. That’s a flat bell curve.

The top half of Brisk IQs would only run from 120 to 124, but the top half of Lakewood IQs would be 118-138, roughly speaking, since I am not going to make up data for my hypothetical situation. So you’d be more likely to get a smarter boy in Lakewood if there was a high variance and you’re already selecting for above average intelligence.

(Yes, I’m exaggerating to make a point about how to use statistics. They never line up so nicely in real life.) Yes, a particular boy will be what he will be, but we are discussing probabilities, and in real life we make decisions without knowing everything, through probabilities.
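The hypothetical can be made concrete with a short calculation. This sketch takes the means and variances above at face value and additionally assumes both groups are normally distributed, which is an assumption of the sketch and not something the post establishes. Note that a variance of 20 corresponds to a standard deviation of about 4.5, not a 20-point range.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers from the post; normality is an added assumption.
brisk = NormalDist(mu=120, sigma=sqrt(4))
lakewood = NormalDist(mu=118, sigma=sqrt(20))

# P(Lakewood IQ > Brisk mean | Lakewood IQ > Lakewood mean).
# Because the Brisk mean is the higher one, "above the Brisk mean" already
# implies "above the Lakewood mean", so the conditional is a simple ratio.
p_above_own_mean = 1 - lakewood.cdf(lakewood.mean)   # 0.5 by symmetry
p_above_brisk_mean = 1 - lakewood.cdf(brisk.mean)
p_conditional = p_above_brisk_mean / p_above_own_mean

print(f"P(above-average Lakewood boy beats the Brisk mean) = {p_conditional:.3f}")
```

With these particular numbers the answer is roughly two in three, so "probably" is defensible here, yet about a third of above-average Lakewood boys would still fall below the Brisk mean. Different made-up numbers would give a different answer.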

“Patently false. No amount of statistics yields facts.”

Really? What about a census?

“No, their entire problem is that they completely lack statistics and instead use emotional anecdotery. Would that they abused statistics, for then they could be made to see reason.”

Ever saw their red, blue and gray chart? They claim it is based on statistics, exaggerated to make their point of course.

Thanks for your good wishes on my studying.

February 17, 2013 7:17 am at 7:17 am #930873

Torah613Torah
Participant

PBA: It is definitely something to be proud of! If you enjoy basic statistics, you should read the Cartoon guide to statistics, it’s a fun way to get more of the basics and helps in understanding Pubmed articles if you don’t want to get too serious about them. And you’re absolutely correct that most people unfortunately have no idea about anything to do with statistics. IMHO this makes them make poorer decisions in life.

February 17, 2013 7:18 am at 7:18 am #930874

Torah613Torah
Participant

OOM: I value knowledge very much when it is useful to me in real life, and statistics is extremely useful for me. 🙂

February 17, 2013 7:25 am at 7:25 am #930875

OneOfMany
Participant

I don’t think anything bothers me more than intelligent people who drink the Kool-Aid…

February 17, 2013 1:26 pm at 1:26 pm #930876

squeak
Participant

“Ever saw their red, blue and gray chart? They claim……..”

Yes, they claim many things. It’s one of their greatest achievements. Making claims.

” higher variance means the spread of the data is larger, covering a wider range.”

But that doesn’t tell you enough. If one mean is higher than the other, there will be some above average points in set 1 that are still below the other mean. Sure, you can make up an example that proves your point, but you didn’t give enough information at first to state your conclusion.

And skoyach, popa!

February 17, 2013 1:53 pm at 1:53 pm #930877

Chortkov
Participant

Studies have shown that 83.6% of statistics are made up on the spot.

February 17, 2013 1:55 pm at 1:55 pm #930878

Torah613Torah
Participant

Hi Squeak!

“Yes, they claim many things. It’s one of their greatest achievements. Making claims.”

We agree completely. What annoys me is that they claim that their claims are backed up by data, and imply that they are using well-researched statistics which “prove” that this is the problem. At least as of last week in Hamodia (letters Section D, I think). To my knowledge, not one reputable statistician has weighed in on this.

“But that doesn’t tell you enough. If one mean is higher than the other, there will be some above average points in set 1 that are still below the other mean. Sure, you can make up an example that proves your point, but you didn’t give enough information at first to state your conclusion.”

Correct, I was not specific in my opening post. I was merely trying to make the point that it is not wise to use generalizations to compare data, since statistical differences in the data can make them invalid.

“And skoyach, popa!”

yep!

February 17, 2013 2:03 pm at 2:03 pm #930879

squeak
Participant

Popa- your example with the quarter is also incorrect, sorry. Shame on me because I just skipped over it expecting you to be right. But the chance of a quarter landing on the same side 10 times in a row is less than one in a thousand. I don’t know t and z, but I can multiply 50% by itself as many times as need be.

February 17, 2013 2:06 pm at 2:06 pm #930880

benignuman
Participant

“Stated differently, if you took 100 quarters and flipped each one of them 10 times, you could expect that one of the times you would have either 10 heads or 10 tails. If you do it with one quarter though, it is so unlikely that it will happen, that you can still conclude that it is weighted with reasonable certainty.”

I don’t think that is correct. If the percentage of weighted quarters in circulation is less than 1% (and I would guess it is a whole lot less than that) and you picked up the coin at random, then it is more likely that you just got lucky with your flipping than that the coin is weighted.

This is like the urban legend of the young man that shot himself after testing positive for AIDS. Only .04% of the tests produced false positives, but that number was still much larger than the percentage of people with AIDS in society. In other words there were more false positives with this test than true positives.
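The base-rate argument in this post can be written out with Bayes' rule. In the sketch below only the 0.04% false-positive rate comes from the post. The prevalence, the test's sensitivity, the share of weighted quarters, and the behaviour of a weighted quarter are all hypothetical values chosen for illustration, and `posterior` is an illustrative helper name.

```python
def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' rule: probability the hypothesis is true given the evidence."""
    true_path = prior * p_evidence_if_true
    false_path = (1 - prior) * p_evidence_if_false
    return true_path / (true_path + false_path)

# Medical test: 0.04% false positives (from the post).
# HYPOTHETICAL: 0.01% of people have the condition, and the test never misses it.
p_sick_given_positive = posterior(
    prior=0.0001, p_evidence_if_true=1.0, p_evidence_if_false=0.0004
)

# Coin: a fair quarter gives 10 heads in a row with probability 1/1024.
# HYPOTHETICAL: 1 in 10,000 quarters is weighted, and a weighted one always lands heads.
p_weighted_given_10_heads = posterior(
    prior=0.0001, p_evidence_if_true=1.0, p_evidence_if_false=0.5 ** 10
)

print(f"P(sick | positive test) = {p_sick_given_positive:.3f}")
print(f"P(weighted | 10 heads) = {p_weighted_given_10_heads:.3f}")
```

With these assumed inputs both posteriors stay well below one half, which is the point being made: a rare result from a fair process can still be more likely than a rarer underlying cause.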

February 17, 2013 2:10 pm at 2:10 pm #930881

squeak
Participant

” To my knowledge, not one reputable statistician has weighed in on this.”

The truth is many have weighed in, but NASI did not like what they heard so shouted them down.

February 17, 2013 2:16 pm at 2:16 pm #930882

chevron
Member

When have statisticians weighed in on NASI’s theories?

February 17, 2013 2:18 pm at 2:18 pm #930883

Torah613Torah
Participant

Squeak: Flipping heads is 1/1024, and his Z-score of 3.16 is about 1 in 1268 (according to one online calculator). Z score is just a measure of how many standard deviations the result is from the mean, and it sounds about right.

His math is correct since he is doing it to 100 quarters, not 1 quarter.

February 17, 2013 2:20 pm at 2:20 pm #930884

Torah613Torah
Participant

“The truth is many have weighed in, but NASI did not like what they heard so shouted them down.”

Would you mind pointing me to where they weighed in? I would be interested in reading what they said. What should I google? You clearly know a lot more about NASI than I do (my knowledge comes from reading the papers, so I have no way of judging)

February 17, 2013 2:46 pm at 2:46 pm #930885

popa_bar_abba
Participant

Squeak: I rounded some of the numbers. Some because my calculator only goes to 10 decimals. And including the final 3.16 since the z table I used only went to the second decimal.

February 17, 2013 3:12 pm at 3:12 pm #930886

popa_bar_abba
Participant

Squeak: I think the real answer to your question is that the distribution of sample means approximates a normal distribution, but it does not do so precisely, and it gets more precise as the sample size grows bigger.

I used a sample size of 10 (I spun 10 times to divine what would happen if I spun it infinite times), which is not so large. I think if I ran it with a sample size of 100, it would be much closer to the mathematical answer.
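Whether the normal approximation really gets closer for an all-heads result at a sample size of 100 can be checked against the exact coin-flip probability. This is a sketch assuming a fair coin and a two-sided question (all heads or all tails); `two_sided_tails` is an illustrative name. `math.erfc` is used because computing 1 minus the cumulative probability rounds to zero at a z score of 10.

```python
from math import erfc, sqrt

def two_sided_tails(n: int) -> tuple[float, float]:
    """Return (exact, normal approximation) for the probability that
    n fair flips all land on the same face."""
    exact = 2 * 0.5 ** n
    z = (1.0 - 0.5) / (sqrt(0.5 * 0.5) / sqrt(n))  # simplifies to sqrt(n)
    normal = erfc(z / sqrt(2))                     # P(|Z| >= z) for a standard normal
    return exact, normal

for n in (10, 100):
    exact, normal = two_sided_tails(n)
    print(f"n={n:>3}: exact={exact:.3e} normal={normal:.3e} ratio={normal / exact:.3g}")
```

In this sketch the approximation is off by about 20% at 10 flips and by several orders of magnitude at 100 flips. The normal curve tracks the binomial well near the middle of the distribution, but its relative error in the far tails grows, so for a streak like this the exact product of 0.5s is the more reliable figure.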

The z score for that is an even 10, but I can’t find a z table that goes that low, so…

February 18, 2013 1:05 am at 1:05 am #930887

squeak
Participant

This is starting to go over my head, but I think you missed what I said. You claimed that 1% of the time you would get a quarter to land on ten heads in a row and I pointed out that it is only a tenth as likely. Everything you said after that was to try and confuse me.

Torah- there are several other blogs that discussed NASI, and posters known to be experts (i.e. not anonymous to the forum) made their points. But most of my due diligence was by asking real people what they think, not online, so I have no links for that. But if you are interested reach out and ask someone who has the analytical ability to help you figure it out for yourself. NASI offers the easy blameless answer, so it takes some willpower to allow yourself to disbelieve it.

February 18, 2013 1:18 am at 1:18 am #930888

Torah613Torah
Participant

Hello, I’m your friendly local statistician, please let me help you all out today!

Firstly, here are some online calculators that make your life a lot easier. Statistical Computing for The Internet Savvy:

1. Input your data and SD (standard deviation) at easycalculations’s calculator to obtain a z-score.

2. Plug in your Z-score at Fourmilab’s Z-score calculator to calculate the probability of your z-score, which will give you a sense of how rare your result was.

3. Or, use the normal distribution calculator by computerpsych research software, which takes your z-scores and places them on a bell curve which is conveniently drawn for you.

Another note: the empirical rule (68-95-99.7) says that for normally distributed data, 99.7% of results will fall within 3 standard deviations of the mean. (Chebyshev’s inequality, which applies to any distribution, guarantees only about 89%.) So if you’re getting a z-score above 3, make sure that’s what you’re expecting.
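The 99.7% figure for three standard deviations is a property of the normal distribution, and it can be compared with the weaker guarantee that Chebyshev's inequality gives for any distribution. A minimal sketch using Python's standard library; the helper names `fraction_within` and `chebyshev_floor` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def fraction_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def chebyshev_floor(k: float) -> float:
    """Chebyshev's lower bound on that share, valid for any distribution."""
    return max(0.0, 1 - 1 / k ** 2)

for k in (1, 2, 3):
    print(f"within {k} SD: normal = {fraction_within(k):.4f}, "
          f"Chebyshev guarantees at least {chebyshev_floor(k):.4f}")
```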

What does this have to do with Popa’s coin flipping?

For Popa’s original scenario, a z-score of 3 or so sounds about right. 1 means typical, 2 means a little less than typical, 3 means very rare, 4 means nearly unheard of.

Popa, a Z score of 10 is about 1 in infinity. The chance of getting 1000 heads is not infinity.

We’re all familiar with IQs and SATs as probability distributions. In standardized testing, a perfect score is nearly always set to 3 SDs above the mean.

A Z score of 10 would be equivalent to an IQ of 250, which is considered a mathematically vacuous result. To put that into perspective, a Z score of 10 is the equivalent of an SAT score of 1500 PER SECTION, for a total of 3000 rather than 1600 which is the maximum SAT score. It is functionally meaningless.

I hope that was helpful. This is your friendly local statistician signing off for today!

February 18, 2013 3:42 am at 3:42 am #930889

popa_bar_abba
Participant

Squeak: I understand what you are asking now. I have no idea what the answer is. I double checked my math, and it is correct, and I double checked the z score and it is correct. There is .4992% to the left of the score, which means there is slightly less than a 1% chance of falling this far off the mean. Which means that it will happen about 1% of the time even with a quarter that is evenly weighted.

I frankly have no idea what the mistake I am making is.

Torah613: Squeak points out that the chances are in fact 1 in a thousand, while a z score of 3.16 gives us about a 1 in a hundred chance. Which makes absolutely no sense.

I don’t get it; I’ll ponder more tomorrow.

February 18, 2013 3:58 am at 3:58 am #930890

popa_bar_abba
Participant

But, just to show you that I’m still more correct than you, I will note that the mathematical chances are 1/500–not 1/1000.

This is because I am calculating the chances of being this far off the mean in either direction, which would happen if there are either 10 heads or 10 tails. Therefore, the formula is 1*.5*.5*.5*.5*.5*.5*.5*.5*.5. Because the first time it doesn’t matter which one it lands on.

This is true even though I said the case was 10 heads. Because I am calculating the chances that a sample of 10 would be that far off the mean of all spins which is .5, and it would be that far off the mean whether it was 10 heads or 10 tails.

Snort at you.

February 18, 2013 4:38 am at 4:38 am #930891

ari-free
Participant

Hopefully we won’t end up with a debate between the Bayesians and the Frequentists. Otherwise, things here will get REALLY ugly.

February 18, 2013 4:45 am at 4:45 am #930892

Torah613Torah
Participant

Popa: I just realized that I never explicated the data set.

“Stated differently, if you took 100 quarters and flipped each one of them 10 times, you could expect that one of the times you would have either 10 heads or 10 tails. If you do it with one quarter though, it is so unlikely that it will happen, that you can still conclude that it is weighted with reasonable certainty.”

If you take 100 quarters and flip them 10 times, that’s 1000 flips. I’m not sure what you are doing – flipping 10 times, 100 times or 1000 times?

February 18, 2013 8:22 pm at 8:22 pm #930893

squeak
Participant

If you guys are struggling with 10 flips, maybe you can reduce it to 2 flips for simplicity. The chance of getting two heads in a row is one in four. Same chance of getting two tails in a row. With three flips, the chance is one in eight, count em. Just add two possibilities to each of the four from the two flip case. Keep adding a flip at a time and you’ll get to ten. All your fancy z scores don’t mean a thing if they give you the wrong answer.
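The counting argument above can be carried out literally by listing every possible sequence of flips. A small sketch, assuming each sequence of a fair coin is equally likely; `count_streaks` is an illustrative name.

```python
from itertools import product

def count_streaks(n_flips: int) -> tuple[int, int, int]:
    """Enumerate every H/T sequence of length n_flips and return
    (total sequences, all-heads sequences, all-same-face sequences)."""
    total = all_heads = all_same = 0
    for seq in product("HT", repeat=n_flips):
        total += 1
        if len(set(seq)) == 1:      # every flip shows the same face
            all_same += 1
            if seq[0] == "H":
                all_heads += 1
    return total, all_heads, all_same

for n in (2, 3, 10):
    total, heads, same = count_streaks(n)
    print(f"{n} flips: all heads {heads}/{total}, all same face {same}/{total}")
```

Counting all heads gives 1 in 1024 for ten flips, while counting either all heads or all tails gives 2 in 1024, which is where a one-sided and a two-sided answer part ways.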

I never would have imagined that I would one day be explaining something to a statistician. Too bad we lost dr. Pepper.

February 18, 2013 8:54 pm at 8:54 pm #930894

nfgo3
Member

The opening post shows some good insights for a 14-year old, or some egregious gaps in the education of a 21-year old.

Statistics is a branch of mathematics that enables us to draw reliable conclusions about a phenomenon based on a sample of information about that phenomenon. Generalizations are non-statistical conclusions about a phenomenon that are untested and unproven, e.g., all Jews love money, all frum Jews are crazy, all goyim are stupid. Generalizations are an important tool for yentas. Statistics are an important tool for scientists.

February 18, 2013 9:18 pm at 9:18 pm #930895

OneOfMany
Participant

Statistics are an important tool for scientists.

As the progeny of two scientists who have engrained in me a deep disdain of statistics, to that I say HA.

February 18, 2013 9:20 pm at 9:20 pm #930896

Torah613Torah
Participant

Popa: If you’re doing it for 1000 flips, a really high z score is correct.

Squeak: No one is trying to confuse you. You seem really smart. It would be easy for you to read up on statistics without getting confused.

AriFree: LOL. Just use what works for you, that’s my philosophy.

NFGO: The point of the opening post was to help people with low numeracy enjoy reading about statistics, and realize they use it unconsciously to make generalizations, but it can help them make better decisions. The point was not my knowledge (or lack thereof) of statistics. If you like, we can go deeper into the topic.

I find that people get very intimidated and confused when statistics come into the picture, and a bit of exposure goes a long way in our community. And shidduchim are a lot less intimidating than coin flipping. 🙂

Maybe I simplified too much, but I wanted everyone on this website to enjoy reading it, even people who don’t like numbers.

And we do generalize, from a sample to a larger population, which you yourself acknowledge. But we agree that statistics are much more useful than generalizations.

February 18, 2013 10:18 pm at 10:18 pm #930897

popa_bar_abba
Participant

Forget the 100 flips.

Back to squeak’s question. If the odds of getting 10 in a row of heads or tails can be computed by multiplying .5 to the 9th power and gives us 1 in 500, then how can the z score for the same thing be 1 in 100?

Squeak: You don’t get a reliable z score on 2 flips because the sample size is not big enough, so it won’t approximate a normal distribution.

February 18, 2013 10:57 pm at 10:57 pm #930898

benignuman
Participant

nfgo3,

Generalizations are not necessarily “untested and unproven.” E.g. if I have statistical evidence that the average Jewish IQ is nearly 2 standard deviations above the national average, I can then make the generalization “Jews are smart.”

February 19, 2013 3:43 am at 3:43 am #930899

squeak
Participant

“If the odds of getting 10 in a row of heads or tails can be computed by multiplying .5 to the 9th power and gives us 1 in 500, then how can the z score for the same thing be 1 in 100?”

Since torah613 decided to patronize me, let me provide the answer. Not 42. I refer to popas first post in the thread. “Popa knows statistics. He can run a Z score, and a T score, and all the clever things he can do to convince you of things that aren’t true.”

QED

February 19, 2013 3:54 am at 3:54 am #930900

popa_bar_abba
Participant

lol

I’ll ask someone who knows stuff.

February 19, 2013 4:07 am at 4:07 am #930901

Torah613Torah
Participant

Squeak and PBA, that’s because it isn’t the correct Z score. 3.16 is a chance out of a thousand, 10 is 99.999999999999999999999 and probably more than that percent chance.

February 19, 2013 4:16 am at 4:16 am #930902

popa_bar_abba
Participant

Torah:

Again, we are discussing that you flipped one quarter 10 times and got 10 heads.

The calculation is this:

p-hat minus p

divided by

square root of (p*(1-p)) / square root of sample size

Where, p hat is the sample mean and p is the population mean.

So,

p-hat is 1, and p is .5

square root of (.5*.5)=.5

divided by square root of 10 = 0.15811388300841896659994467722164

Now, I subtract p from p-hat and get .5

.5/0.15811388300841896659994467722164 = 3.1622776601683793319988935444326

Then I google a handy dandy z table and find that the value for z score of 3.16 is .4992, which means that half a percent of results on a normal distribution lie to the left of z score 3.16, which means that there is a 1/100 chance of getting a result this far off the mean if you pick results at random from the distribution.

So where do you disagree?

February 19, 2013 4:25 am at 4:25 am #930903

popa_bar_abba
Participant

Oh, I get it. I’m reading the z table wrong. It’s formatted differently than I’m used to–it is showing the percentage between z score and mean instead of percentage left of the z score.

.4992 means there is 49.92% between the z score and the mean, which means that there is only .08% outside, which means a total of .16% on both sides, which is much closer to 1/500 than to 1/100.
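The two z-table formats can be reproduced without a printed table. The sketch below recomputes the z score from the numbers in this thread (10 heads in 10 flips, fair-coin proportion of 0.5) and derives both table readings from the standard normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p, n = 1.0, 0.5, 10                 # 10 heads in 10 flips; fair coin has p = 0.5
z = (p_hat - p) / sqrt(p * (1 - p) / n)    # about 3.1623

std_normal = NormalDist()
left_of_z = std_normal.cdf(z)              # "area to the left of z" table format
mean_to_z = left_of_z - 0.5                # "area between the mean and z" table format
one_tail = 1 - left_of_z                   # this far above the mean
two_tails = 2 * one_tail                   # this far off in either direction
exact_two_sided = 2 * 0.5 ** n             # all heads or all tails, computed directly

print(f"z = {z:.4f}")
print(f"area between mean and z = {mean_to_z:.4f}")
print(f"one tail = {one_tail:.6f} (about 1 in {1 / one_tail:.0f})")
print(f"two tails = {two_tails:.6f} (about 1 in {1 / two_tails:.0f})")
print(f"exact two-sided = {exact_two_sided:.6f} (1 in {1 / exact_two_sided:.0f})")
```

The normal-curve tails land in the same range as the exact coin-flip figures without matching them, since the normal curve is only an approximation at 10 flips.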

February 19, 2013 5:11 am at 5:11 am #930904

Torah613Torah
Participant

Popa, I’m glad you realized you were reading the table wrong. I’ve been feeling a bit attacked in this thread by you and Squeak. I definitely have to work on my middos. It’s okay if anonymous people think I don’t know anything, really it is. I just need to convince myself. 🙂

BTW if you use a z-score calculator, it will give you a much more accurate number. My feeling, without doing calculations, is that your answers are a bit too rounded. 3.16 should be closer to 1 in 1000.

I just came home and am too tired to focus, but, if you put your Z 3.1622 into the Fourmilab z-score probability calculator (to four decimal places), it comes out to 1/1277.