I don’t have the tools to decide whether this is a deliberate effort on the part of the DNC, but I can try to measure what effect the different debate schedules have had on the candidates’ opportunity to present their views.

Most obviously, the Democrats have 6 debates scheduled, and the Republicans have 13. 13 is more than 6. But that’s a pretty simplistic analysis. Much of the discussion in the media has centered on when the debates are scheduled. The Democrats have a Tuesday, a Wednesday, a Thursday, two Saturdays, and a Sunday. Weeknights typically have higher TV viewership than Saturdays, but Sunday is often better still (the Sunday in question falls after a long weekend, however, which may make it less ideal). The Republicans have two Tuesdays, two Wednesdays, four Thursdays, a Friday, and two Saturdays (plus one planned but not yet scheduled debate).

So, the Republicans’ three badly scheduled debates out of twelve is a smaller share than the Democrats’ two out of six. If the Sunday debate lands in a ratings hole, the Republicans clearly scheduled better; if it is a bright spot, the two may be equivalent. It’s hard to say right now, but time will tell. In any case, the Democrats aren’t winning back through scheduling any of the points they lost on the number of debates.

However, there is one place where the Democratic debates are doing a much better job than the Republican ones: giving plenty of time to each candidate. In the first Republican debate, only Trump edged past ten minutes of speaking time, whereas in the first Democratic debate Hillary Clinton had over half an hour, and Bernie Sanders nearly as much. This trend has continued over the debates so far.

Now, this is obviously because there are far fewer Democratic candidates than Republican ones. But that fact was at least somewhat known when the debate schedules were designed, so it can’t be ignored.

To combine these various factors into a measurement of how helpful the debates have been in allowing a candidate to express their views, I will create a new unit: the eyeball-minute. One eyeball-minute represents the attention of one debate viewer for one minute of speaking time. (Actually, it represents one million debate viewers for one minute, for ease of calculation.)

So, by combining the published television ratings for the debates with compiled lists of speaking times, I can calculate a sum total number of eyeball-minutes that each candidate has earned in the debates so far. This uses the average television rating of each debate, where I could find it, and the peak rating where I could not. It includes both main and secondary debates on the Republican side, but only for those candidates who have qualified for at least one main debate.

- Donald Trump (R) – 1212
- Hillary Clinton (D) – 1059
- Ted Cruz (R) – 984
- Jeb Bush (R) – 959
- Marco Rubio (R) – 934
- Ben Carson (R) – 865
- Bernie Sanders (D) – 839
- John Kasich (R) – 826
- Carly Fiorina* (R) – 821
- Chris Christie* (R) – 798
- Rand Paul (R) – 767
- Martin O’Malley (D) – 619
- Mike Huckabee* (R) – 585
- Scott Walker** (R) – 334
- Jim Webb** (D) – 239
- Lincoln Chafee** (D) – 141

A single asterisk means that the candidate participated in at least one secondary debate, and a double asterisk indicates that the candidate withdrew from the race and so missed subsequent debates.
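
The tally above boils down to a weighted sum: each debate’s rating (in millions of viewers) multiplied by each candidate’s minutes of speaking time, summed per candidate. A minimal sketch, with hypothetical ratings and speaking times standing in for the real compiled data:

```python
# Hypothetical ratings (millions of viewers) and speaking times (minutes);
# the real tally uses published Nielsen ratings and compiled time lists.
debates = [
    {"rating": 24.0, "minutes": {"Trump": 11.1, "Cruz": 7.0}},
    {"rating": 15.3, "minutes": {"Clinton": 31.0, "Sanders": 28.0}},
]

eyeball_minutes = {}
for debate in debates:
    for candidate, minutes in debate["minutes"].items():
        # one eyeball-minute = one million viewers watching one minute
        eyeball_minutes[candidate] = (
            eyeball_minutes.get(candidate, 0.0) + debate["rating"] * minutes
        )
```

Using the average rating for a whole debate (rather than per-segment ratings) slightly flattens the numbers, but it is the best the published data allow.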

Returning to our original question, Bernie Sanders’s collection of 839 eyeball-minutes places him in a position equivalent to most of the Republican field. Martin O’Malley’s 619 is lower, but not catastrophically so.

The eyeball-minute may not be the best measure of debate impact, but it has the advantage of being objectively measurable, and by that measure, the Democratic debates are by and large just as effective candidate message platforms as the Republican debates.

There are many debates to come, and I will hopefully return to check on these numbers again.


This is not a minor problem, this is an industry. Under a most favorable scenario one has to expect the overwhelming majority of the voters matching name and date of birth are the same person.

I don’t think we can accept that claim without some analysis. The math of matching in large groups is funny, and most people don’t judge it very well.

The classic example of this is to ask, “How many people need to be in a room before there is a 50% chance that two of them share a birthday?” Most people think that, with 365 days to choose from, it would need to be in the hundreds. The actual answer is 23. The first person to enter the room can have any birthday. The second person has one day they cannot have. The third has two, and so on. While the chance of each person colliding with someone else as they enter the room is low, the cumulative chance that every person will dodge every other person shrinks faster than most people expect.
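
A quick way to convince yourself of the 23-person answer is to compute that cumulative dodge probability directly:

```python
def prob_shared_birthday(n):
    """Chance that at least two of n people share a birthday,
    assuming 365 equally likely birthdays."""
    p_all_distinct = 1.0
    for k in range(n):
        # the (k+1)th person must dodge the k birthdays already taken
        p_all_distinct *= (365 - k) / 365
    return 1 - p_all_distinct
```

`prob_shared_birthday(23)` comes out just over 50%, while 22 people fall just short of it.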

The math for voters isn’t the same, but it’s not too hard to figure out. There were 6.5 million NC voters in the data, and 101 million non-NC voters. Voter birthdays are not evenly distributed, which means they will tend to match more often than by pure chance, but to keep the math simple, I assumed all voters were spread evenly between ages 18 and 62. The lessened range is more than compensated for by the flattening of the voter density. Therefore, each NC voter with a given name has a 1 in 16,000 chance of sharing a birthdate with any particular out-of-state voter who shares that name.

Now, consider John Smith. Howmanyofme.com says that there are 45,963 people named John Smith in the United States. Assuming they are evenly distributed, 939 would be among the 6.5 million NC voters, and 14,670 would have voted in another state. Each of the NC John Smiths has a 60% chance of sharing a birthdate with at least one out-of-state John Smith, which means that about 564 of them would turn up on this list.
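
The John Smith arithmetic can be reproduced in a few lines. The US population total used for proportioning is my assumption; the other figures come from the text:

```python
john_smiths = 45_963          # howmanyofme.com's count
nc_voters = 6_500_000
other_voters = 101_000_000
us_population = 318_000_000   # assumed denominator for proportioning
pair_odds = 1 / 16_000        # birthdate-match chance per out-of-state pair

in_nc = john_smiths * nc_voters / us_population            # ~939
out_of_state = john_smiths * other_voters / us_population  # ~14,600

# Chance an NC John Smith shares a birthdate with at least one
# out-of-state John Smith, then the expected number of such matches:
p_match = 1 - (1 - pair_odds) ** out_of_state              # ~60%
expected_matches = in_nc * p_match                         # ~560
```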

And that’s just one name. There’s also James Smith, Michael Smith, Robert Smith… and John Jones, James Jones… and so on and so on and so on. In fact, just the top 10 first names paired with the top 10 last names on howmanyofme.com predicts nearly 25,000 matches. So, the reported 35,570 matches are easily reachable with no foul play whatsoever. The key to the math is that a common name not only is more likely to match, because there are more of its bearers in other states; it also accounts for more matches, because there are more of its bearers in NC.

An alternate analysis is to take a sample of real names and average them. Here, we count each name in NC only once, but we still use its frequency to predict the chances of a match. Finding a good sample is hard; I used the officers of several NC clubs with webpages, and the rosters of several NC high school sports teams.

The results were interesting. Many names are unique or nearly unique in the country, so they have very little chance of matching. But a few spike very high, and account for the vast majority of the matches. My results were trending considerably lower than the reported results until I came across a Michael Smith, which pulled the average up to nearly double the reported value.

It seems that this method is very sensitive to the sample, so I will consider the other method more reliable. If the top 100 names can account for a significant fraction of the reported value, we can assume that the rest of the names will account for the rest.

But what about the matches that also included social security numbers? The math for this is really the same as above, just with a 1 in 159,984,000 (16,000 × 9,999) chance that matching names match completely. Now, however, the odds do grow long. We expect only about 0.08 of a John Smith match, and the top 100 names account for only about 6 total matches. It is still possible, and indeed likely, that some of the 765 voters who matched in this way are pure coincidence. But not all of them. Voter fraud isn’t the only explanation for this, but it is one worth looking into.
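
The expected-match arithmetic for the SSN-matched subset, using the figures above (939 NC John Smiths, 14,670 elsewhere) and the article’s per-pair odds:

```python
pair_odds = 1 / 159_984_000       # 16,000 * 9,999, per the article
nc_john_smiths = 939
out_of_state_john_smiths = 14_670

# Chance an NC John Smith fully matches (name, birthdate, SSN digits)
# at least one out-of-state John Smith, then the expected match count:
p_match = 1 - (1 - pair_odds) ** out_of_state_john_smiths
expected_matches = nc_john_smiths * p_match   # ~0.086
```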

Streiff’s claim that this report supports 1 million cases of voter fraud is preposterous. Matching names and birthdates is a completely insufficient tool to make any claim about the uniqueness of the people in question. Including social security numbers goes a long way towards resolving this concern, and those results warrant further investigation. However, they are well within the potential margin of human error.


Last week Nate Silver of FiveThirtyEight wrote a piece about flawed statistical thinking in an op-ed by Peggy Noonan. He used some simple calculations to show his point.

Dan McLaughlin of RedState had an issue with Mr. Silver’s piece.

Silver concedes of his statistical analysis that “this calculation assumes that individuals’ risk of being audited is independent of their political views,” which of course is the very thing in dispute; it’s like the old joke about an economist stranded on a desert island with a stack of canned goods whose solution begins, “assume a can opener.” All things being equal, all things are equal.

Mr. McLaughlin fundamentally misidentified what Silver was doing when he made that assumption. It doesn’t weaken his argument; it is necessary to make it, statistically.

What Mr. Silver was doing in his piece was using an informal version of the null hypothesis, which is the foundation of much of modern statistics. The fundamental mathematical principle behind statistical significance relies, not on proving a hypothesis, but on disproving the null hypothesis.

Thus, if a statistician wants to show that smoking causes cancer, he does the math assuming that smoking has no effect on cancer. If the math leads to an unlikely result, he has disproven the null hypothesis. If the math doesn’t, then he has failed to disprove the null hypothesis. A statistician never proves a positive hypothesis, they simply disprove null hypotheses.

This is a bit tough to grasp, so I’ll try to explain it with a very simple example. If I have a coin that I think might be weighted, the way I test that is to flip it a bunch of times and write down the results. Then, I assume the coin is a fair 50-50, and ask, “If the coin weren’t biased, how unlikely would it be that I got the results I just did?” In 100 flips, if my sample came out 47-53, sure, the most likely answer is that the coin is slightly weighted. But the null hypothesis is still very likely, so I would not reject it. If my sample were 82-18, however, that would be a staggeringly unlikely event with a fair coin, so it is probably safe to reject the null. If I only flipped the coin twice, though, I couldn’t disprove the null even if both flips were heads, since that has a good chance of happening regardless. Mathematically, this is what is represented by the p-value: the probability of a result like the one in question, given the null hypothesis.
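
That p-value can be sketched with an exact two-sided binomial test (a simplified version of what a statistics package would compute):

```python
from math import comb

def coin_p_value(heads, flips):
    """Probability, under a fair coin, of a result at least as far
    from an even split as the one observed (two-sided)."""
    observed_gap = abs(heads - flips / 2)
    return sum(
        comb(flips, k) * 0.5 ** flips
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_gap
    )
```

For 47 heads in 100 flips the p-value is above 0.6, so the null survives easily; for 82 heads in 100 it is vanishingly small; and two heads in two flips gives p = 0.5, nowhere near rejection.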

That an individual’s risk of audit is independent of their political views is a null hypothesis. Mr. Silver proposes it, then shows that Peggy Noonan’s evidence does not disprove it. Thus, statistically, Peggy Noonan has very weak evidence. He does this by showing that, if the null is true, it would not be unusual to find four or five (indeed, four or five thousand) Republican donors that were audited. So, the fact that Peggy Noonan did find four or five Republican donors that were audited is not statistical evidence that the null is false.

The null hypothesis itself, however, is not a political statement. It is simply the way one has to formulate the problem in order to use the mathematical tools available.

As a side note, I was banned from RedState some time ago for formulating a statistical query in this way, because the null hypothesis looked like a political position, so the fact that it has come up again is of some interest to me.


…Without cherry-picking data as union bosses must in order to defend forced unionism, total seasonally adjusted non-farm employment growth shows a huge advantage for residents of right to work states.

The actual data presented, however, are the employment growth over 20 years for all right to work states, but only a few union-friendly states.

This, particularly considering that the introduction explicitly calls out cherry-picking, triggered my sensors. So, let’s see what happens if we look at data for all states.

First, I have to find the data used to create this chart. It took some rummaging on the BLS website, but I finally found numbers that almost, but not quite, recreate the numbers on the chart in the original article. My version of the chart is below.

As you can see, it looks essentially like the version in the article, though because I am using slightly different data, the percentages vary by a point or two.

Now, I’d like to present a different chart, this time comparing Ohio to other union-friendly states.

Here is where the cherry picking comes in. Ohio job growth isn’t terrible because it is union friendly, it’s terrible because it’s terrible. Nearly everyone does better than Ohio regardless of their labor policy.

Now, there is potentially something to be said for the claim that right-to-work states have seen greater gains over the last 20 years than union-friendly states have. But that wasn’t the argument; Jason Hart asserted a “huge advantage” for right-to-work states, and presented evidence built around a comparison to the second-worst performing state of any kind.

Presenting the data like that, particularly in the same sentence as calling out others for cherry-picking data, is disingenuous at best.


“There’s something going on with Republican-governed states. Seven out of the 10 states nationwide, Candy, that have the lowest unemployment rates: Republican governor states.”

By now you should have figured out the drill; does this show real evidence that Republican-governed states have lower unemployment than Democratic-governed ones?

The short answer is no. The long answer is nnnnnnooooooooooooooooooooooooooooooooooooooooooo. (Sorry, bad joke.) Simply put, there are more Republican governors than Democratic ones, so more Republican states appear in every part of the unemployment list. Only 3 of the 10 lowest-unemployment states are governed by Democrats, but the same is true of the 10 highest-unemployment states (and one state in that group has an independent governor, so only 6 of them are Republican).

Of course, we can do better than just counting from the top 10 and bottom 10. A simple statistical model can give a much better sense of whether a governor’s party affiliation affects unemployment. The answer is no. While Republican-governed states have slightly lower average unemployment, the difference is tiny (0.3%) and is very likely caused by random chance (p=0.54). Trying to refine the model by adding length of incumbency or length of party incumbency produces nothing but noise. Sometimes one party is a little ahead, sometimes the other, but the results are never significant.

The conclusion is pretty clear, then. Mr. McDonnell’s statement is factually true only in the most technical sense, and any implication he tries to draw from it is faulty.


So, let’s take a look at the numbers, shall we?

The 2011 salaries and titles of all 454 White House staff members are public information, published here. However, gender is not included. Converting names to genders is risky and full of ambiguities. The Free Beacon claims that they researched every name that was not obvious to ascertain the person’s gender; I have done the same. However, I still don’t know the gender of 17 employees, and I am sure that I am wrong about some of the ones I am claiming. Three entries in the list had a salary of $0.00; I removed these as either errors or some sort of volunteer arrangement. I also make no attempt to distinguish anyone who identifies with or falls into a non-binary gender category.

I do not know how similar my resultant dataset is to the Free Beacon’s on a case-by-case basis. They do not indicate whether there were any names on the list that they did not assign genders to, or any other data-cleaning steps they took. I can, however, compare my median salaries with theirs (the only number they published). I counted 223 men, 211 women, and 17 unknowns in the sample. The median salary for men was $75,000, and the median for women was $63,240 (the median for unknowns was $50,000). This actually differs somewhat from the Free Beacon set, though the proportions are about the same. I assume that the Free Beacon data assigned a gender to everyone and left no unknowns.

The first question, which I have seen raised in a few places, is why use the median? The mean values are much closer ($78K to $86K, in my data), so is this just the Free Beacon cherry-picking statistics? Probably not. Because salaries are not normally distributed (they tend to clump up near the low end and spread out more and more as you go higher), the mean is often not considered a useful summary. Here, since the maximum salary is only 4 times the minimum, this matters less, but there is plenty of precedent for using the median.

Next, looking at $75,000 and $63,240, one is obviously bigger. But salaries range from $41K to $170K, so is that difference actually significant, or just a coincidence? The standard deviation (here just a general estimate of fluctuation, since the distribution isn’t normal) of both genders’ salary distributions is around $40K, so a $12K difference is less impressive by comparison. But a proper test is to check the significance of a model which predicts salary by gender. This model shows a p-value of right around 0.05, depending on how one treats the unknowns, which is on the threshold of traditional “statistical significance”. But the R^2 of the model is 0.0086. This means that knowing the gender explains less than 1% of the variation in salaries between people.
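
As a sketch of what that R^2 means for a binary predictor (on made-up salaries, not the actual dataset): the model predicts everyone’s salary as their group’s mean, and R^2 measures how much of the total variation that prediction explains.

```python
def r_squared_binary(groups):
    """R^2 of a model predicting each value as its group's mean.
    groups: list of lists of salaries, one list per gender."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_residual = sum(
        (v - sum(group) / len(group)) ** 2
        for group in groups for v in group
    )
    return 1 - ss_residual / ss_total

# Hypothetical salaries with heavy overlap between the groups:
men = [41_000, 62_000, 75_000, 98_000, 170_000]
women = [41_000, 55_000, 63_000, 91_000, 160_000]
```

On these made-up numbers the group means differ by about $7K against roughly $40K spreads, and R^2 comes out under 1%: an illustration of how a real gap between groups can still explain almost none of the person-to-person variation.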

So, the Free Beacon’s mathematical claims hold up. The difference is real, and it probably indicates an underlying fact about the data.

The interesting part comes when you remember that the pay-equality argument is “equal pay for equal work.” Maybe women just have different sorts of jobs in the White House. So, perhaps we should consider how men and women are paid for the same job. For example, there are 19 “Staff Assistants” on the White House payroll (14 women, 4 men, and one unknown). All but one of these make the same amount; one man makes slightly more. For the “Analyst” job, things are reversed: there are more men than women, nearly everyone makes the same, but one woman makes more.

We can make a model that accounts for this. It models salary based on title, then on gender to try to describe the remaining differences. The salary gap under these circumstances is reduced to just under $2000, though not eliminated. However, the chance that that $2k is meaningful decreases significantly, with the p-value climbing past 0.1. Add the interaction between position and gender (to see if women are being underpaid only in certain positions), and the significance of the model decays further. The flaw in this model is that while many titles are generic, many of the people in this dataset have unique titles, such as “Special Assistant To The President For Urban Affairs”. An analysis like this basically doesn’t use those at all.

There are two criticisms that flowed from this article. The first is the accusation that the White House made a conscious and deliberate decision to pay female employees less than males because they were women. If this were true, we would expect to see women being paid less than men while doing the same job, and the data do not show that.

The second criticism is that the White House does not employ women as inner-circle trusted advisers to the President. The data are more supportive of this, but only if you accept many more assumptions, such as that more pay directly maps to a more influential policy role. To prove this, it would be much better to actually categorize who among the staff are the influential advisers to the President, rather than how much they are paid.

So, my ultimate conclusion is a mixed bag. I don’t agree with the reasoning used to turn this information into an attack on the President, but the numbers themselves are valid and meaningful.


But could there be a good reason why Justice Ginsburg doesn’t think the US Constitution is a good model? I think there is; Presidential systems do not lend themselves well to long-term stable democracies, which is the goal of a well-written Constitution. Of course, the United States is the exception, but how well do other countries with a Presidential system fare?

Once more, I will rely on the Polity IV database to judge the degree of democracy in a country. As a reminder, the Polity IV project rates every country every year from -10 (totally autocratic) to +10 (totally democratic). The threshold for a functional democracy in this scale is generally considered to be +6 or higher.

I define a stable democracy, somewhat arbitrarily, as a country with a Polity IV score of +6 or better every year since 1970, and without any coups, revolutions, or other extra-Constitutional transfers of power in that time.

Armed with this information, it only remains to check the 43 or so countries large enough to be listed in the Polity IV database that presently have Presidential systems of government to see if they meet these conditions.

And there are five.

- Colombia
- Costa Rica
- Cyprus
- Sri Lanka
- United States of America

Plenty more countries are functioning democracies now, but only those five have been so long enough to be called stable. Far more Presidential countries show a common pattern: spurts of true democracy followed by descents into authoritarian rule.

Of course, without any comparison, 5 of 43 (or 11.6%) might be a good score. So let’s check parliamentary democracies. There are 64 of those (a few countries with hybrid systems overlap both lists, including Sri Lanka). How many of them are stable democracies?

- Australia
- Austria
- Belgium
- Botswana
- Canada
- Denmark
- Finland
- Germany
- India
- Ireland
- Israel
- Italy
- Jamaica
- Japan
- The Netherlands
- New Zealand
- Norway
- Sri Lanka (shared)
- Sweden
- United Kingdom

So, for Parliamentary systems of government, 20 out of 63 (or 31.7%) are stable democracies. That sounds like much better odds.
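
A quick two-proportion z-test on those counts suggests the gap is not just noise, though since these are whole populations rather than random samples, treat the result as a rough gauge rather than a formal inference:

```python
import math

# Stable-democracy counts from the text
pres_stable, pres_total = 5, 43
parl_stable, parl_total = 20, 63

p_pres = pres_stable / pres_total          # ~11.6%
p_parl = parl_stable / parl_total          # ~31.7%
pooled = (pres_stable + parl_stable) / (pres_total + parl_total)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / pres_total + 1 / parl_total))
z = (p_parl - p_pres) / std_err            # ~2.4 standard errors apart
```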

Another thing that I don’t understand about these complaints is that they seem to undermine the theory of American Exceptionalism, also popular in conservative circles. If anyone can adopt the US Constitution and become just as free as Americans, what is so special about America itself?

In fact, I think I may have just gathered evidence that there is something special about America, but whether it is due to some inherent quality of the American people, or just the unusual circumstances under which the country was formed and grew is well beyond the scope of this article, or indeed this blog.

P.S. Yes, it has been nearly six months since I wrote anything. It’s hard to find subjects that lend themselves well to this kind of analysis, and I have several half-finished but not very good drafts sitting around unpublished from that time.


Well, another sex scandal has made the news, so it’s time to put my model to the test again. Herman Cain’s scandal isn’t very interesting, quite frankly. The variables that matter to the model are pretty straightforward; Mr. Cain’s scandal is nothing special.

- Intensity: 5 – multiple instances of sexual advances, but no actual sex.
- Unfaithfulness: 7 – Cain has been married for 40+ years, but hasn’t quite been accused of actually cheating on his wife.
- Kinkiness: 3 – Nothing more than a little dirty talk.
- Hypocrisy: 4 – Courting the religious right but having adulterous intentions.
- Coercion: 6 – The actions were non-consensual.

The other ratings, such as Contrition, which is 1 (Cain denies the events), and Plausibility, which is 6 (there isn’t very strong evidence that the events happened), aren’t a part of the model.

So, as a low intensity Republican with a coercive but not kinky scandal, the model does not predict a happy outcome for Mr. Cain. Specifically, the result is a value of 0.16, which means he will most likely drop out of the race or lose the nomination. But, this is the same model that predicted that David Wu wasn’t going anywhere on the precise day he announced his resignation, so take that with a grain of salt.

It should also be noted that only one of my model data cases (Jack Ryan) was a non-incumbent candidate for election, so the dynamics may be very different. But I have the model, so it’s worth testing it again. And the best way to test is to make the prediction in advance of the event, so there you are.


How wrong I was. Herman Cain recently declared his plan to build an electrified fence on the border to keep out illegal immigrants. He later said he was joking, but his description of the plan is lacking in humor.

It’s going to be 20 feet high. It’s going to have barbed wire on the top. It’s going to be electrified. And there’s going to be a sign on the other side saying, ‘It will kill you — Warning’

By now, you probably know how this works. For the moment, let’s assume Mr. Cain was serious, and let’s set aside the issues of killing illegal immigrants in cold blood. How much would a fence like this cost?

First we have to ask, where do people build lethal electric fences? The only place I know of is around prisons. Fortunately, the US Federal Bureau of Prisons recently upgraded fencing at a number of facilities, and the costs of the plan were published. With a little work on Google Earth, I calculated the total perimeter of the seven prisons to be just about 9 miles. The project was planned to cost $10 million or more, for a rate of $1.11 million per mile of fencing.

This isn’t a perfect estimate, by any means. It probably significantly underestimates construction costs, but it’s not clear by how much. The physical nature of the fencing at prisons is about what Mr. Cain is describing, and any pricing advantage from working in bulk will likely be more than offset by the inaccessibility of the work site.

So, the cost to fence the 1,980-mile border comes to $2.2 billion, at a minimum. That’s not actually so bad. It comes to around 20% of the budget of Customs and Border Protection, but a project of this nature would fundamentally change the way they operate.
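
The arithmetic behind that figure, as a quick sketch using the prison-project numbers above:

```python
prison_project_cost = 10_000_000   # dollars, seven-prison fencing upgrade
prison_perimeter_miles = 9         # total perimeter, measured on Google Earth
border_miles = 1_980

cost_per_mile = prison_project_cost / prison_perimeter_miles  # ~$1.11M/mile
border_cost = cost_per_mile * border_miles                    # ~$2.2 billion
```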

Even if building across the desert causes serious cost inflation, the price of building this project, while significant, is not out of the realm of possibility. However, the plausibility of this plan is not yet assured. Electric fences require one other important thing; electricity.

Modern electric fences don’t actually consume a lot of power; they only energize when touched. However, they need access to a lot of power in order to deliver a lethal shock anywhere along their length. So the major cost in electrifying the fence will be in power transmission, not consumption.

Pinning down good numbers for this is almost impossible, because of the number of variables involved. Each section of fence must be separately supplied with power, but how long can the sections be? How close is electrical infrastructure to each section along the entire border? Still, consider the following numbers: 69kV electric lines, suitable for transmitting power a few dozen miles, cost several hundred thousand dollars a mile. At a minimum, the fence would need several thousand miles of these power lines to connect it to the grid in various areas, as well as electrical substations and other infrastructure. This would be a very expensive project, but not out of the realm of possibility, if a major policy change dictated it.

But would a fence alone really help the situation? Electrified fences require constant monitoring to remain effective. Once the fence is touched, it delivers a lethal shock. But now the body is grounding the fence out, and it cannot continue to discharge lethal electricity for any significant time. In a prison, this isn’t a concern, since guards are mere seconds away from any alarm. But in the middle of the desert, what good does the fence actually do? Can the US afford to energize sections for as long as it takes agents to respond (considering that there will be many, many false triggers over 1,980 miles of fence)? Will the system withstand such heavy use? How will it stop well-prepared groups of people with ladders, insulators, and other tools?

The fence, really, seems like just the start of a set of comprehensive static border defenses, much like the Inner German Border I discussed the last time I addressed this issue. This is a valuable discussion, because last time I had no starting point for an estimate of the costs of erecting such a wall in the US. Now I do ($3 billion or more). Another estimate would be to compare the cost to that of the Israeli Security Fence, which is running around $3.3 million per mile. At that rate, a wall would cost $6.5 billion. As an aside, please note that the Israelis, despite attempting to deny entry to persistent terrorist attackers, did not resort to lethal countermeasures in their static defenses.

But remember, the static defenses of the Inner German Border only worked because in addition to all of the fixed defenses, they were staffed by a massive border protection force, many times larger than that of the US. As expensive as it is, just building a fence is not enough, even if it expresses a wanton disregard for human life. There is no way around the fact that controlling a border takes both manpower and infrastructure.

So what about Herman Cain, and his idea? Well, obviously, I think it is both callous and immoral, but that isn’t what I’m trying to evaluate here. It is technically feasible to build, but I don’t think it would be nearly as effective as he does. Or perhaps it was just a joke.


Writing about restrictions on unpasteurized milk, Bob Unruh observes:

The reason cannot be safety, the report said, since a report from the Weston A. Price Foundation revealed that from 1980 to 2005 there were 10 times more illnesses from pasteurized milk than from raw milk.

Unfortunately, this cites a report (without giving details or linking to it) that cites another report that makes this claim. I eventually found the original report here, only to discover that it has almost no further details about what the claim means. From that report, we learn that there were 41 outbreaks and 19,531 illnesses attributed to pasteurized milk, and that this is 10.7 times the illnesses from raw milk. (If you use the raw milk numbers given later in the report, it comes out to 8.4 times, but they may be using different numbers.) It sounds damning, until you realize that this does not mean that an illness is ten times more likely from pasteurized milk. The problem is that far more people drink pasteurized milk, so even if illness is less likely for each drinker, there will be more illnesses in total.

In fact, according to the CDC in 2002, about 80% of Americans drink milk, and about 3.5% drink raw milk. Other numbers (cited in the report linked above) put the consumption of raw milk much lower, at around 0.5%, but we’ll stick with the higher number. To understand what this means, consider a notional 1,000-person town. 35 people drink raw milk, and 765 drink regular milk. 1 raw milk drinker gets ill, and 10 regular milk drinkers do.

So, the danger to a raw milk drinker is 1/35, a 2.8% chance of illness. The danger to a regular milk drinker is 10/765, a 1.3% chance of illness. Remember, these specific chances are made up (in reality both numbers are probably far lower), but the key fact remains: even though ten times more regular milk drinkers got sick, there are so many more of them to choose from that the rate of sickness among regular milk drinkers is less than half that among raw milk drinkers.
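
The town arithmetic in one small block (the counts are the notional ones above):

```python
# Notional 1,000-person town: 3.5% drink raw milk, the rest of the
# 80% milk drinkers stick with pasteurized.
raw_drinkers, raw_ill = 35, 1
regular_drinkers, regular_ill = 765, 10

raw_risk = raw_ill / raw_drinkers              # ~2.8% per drinker
regular_risk = regular_ill / regular_drinkers  # ~1.3% per drinker
# Ten times the illnesses, yet less than half the per-drinker risk.
```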

This is a version of the base rate fallacy, where one can be misled about what numbers really mean by forgetting the base probabilities underlying the situation (in this case, the chance that a random sick person drank raw milk or regular milk). In the usual presentation of the base rate fallacy there are two intersecting probabilities, usually a population and a test, and without doing the math carefully, the fallacy is easy to fall into. Here, there is no hidden probability, just a count of illnesses, and the math to show how the base rate affects the probabilities is trivial. So in this case, the fallacy is less forgivable.

I have opinions on the larger issues about the safety of raw milk, and on whether regulating a farm as described in the article is an unlawful violation of personal property rights or a proper application of the law against legal trickery. But this article is simply about the real meaning of the “ten times as many” claim. Using that as evidence that pasteurized milk is more dangerous than raw milk is like saying that driving is more dangerous than hang-gliding because there are more car than hang-glider accidents. Both claims are preposterous if given any real thought.
