Pay Equality at the White House

Posted: April 20, 2012 in analysis

An article on The Free Beacon here makes the simple claim that the White House pays women less than men, according to public records.  They then go on to imply (as others who link to them do more explicitly, like here) that this is demonstrative of an anti-woman attitude in the administration.

So, let’s take a look at the numbers, shall we?

The Dataset

The 2011 salaries and titles of all 454 White House staff members are public information, published here.  However, gender is not included.  Converting names to genders is risky and full of ambiguities.  The Free Beacon claims that they researched every name that was not obvious to ascertain the person’s gender; I have done the same.  However, I still don’t know the gender of 17 employees, and I am sure that I am wrong about the gender of some of the ones I did assign.  Three entries in the list showed a salary of $0.00; I removed these as either errors or indications of some sort of volunteer arrangement.  I also make no attempt to distinguish anyone who identifies with or falls into a non-binary gender category.

I do not know how similar my resultant dataset is to the Free Beacon’s on a case-by-case basis.  They do not indicate whether there were any names on the list that they did not assign genders to, or what other data-cleaning steps they took.  I can, however, compare my median salaries with theirs (the only numbers they published).  I counted 223 men, 211 women, and 17 unknowns in the sample.  The median salary for men was $75,000, and the median for women was $63,240 (the median unknown was $50,000).  This differs somewhat from the Free Beacon’s figures, though the proportions are about the same.  I assume that the Free Beacon assigned a gender to everyone and left no unknowns.

Analysis

The first question, which I have seen in a few places, is why use the median?  The mean values are much closer ($78K to $86K, in my data), so is this just cherry-picking statistics by the Free Beacon?  Probably not.  Because salaries are not normally distributed (they tend to be clumped up near the low end and spread out more and more as you get higher), the mean is often not considered a useful value.  Here, since the maximum salary is only about 4 times the minimum, this matters less, but there is plenty of precedent for using the median.
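To see why a long upper tail pulls the mean away from the median, here is a minimal sketch.  The salaries below are made up for illustration (clumped low, stretched high); they are not the White House figures.

```python
from statistics import mean, median

# Hypothetical salaries, NOT the White House data:
# most values sit near the low end, a few stretch toward the top.
salaries = [42_000, 45_000, 45_000, 50_000, 55_000,
            60_000, 80_000, 120_000, 172_000]

avg = mean(salaries)    # pulled upward by the two largest salaries
mid = median(salaries)  # the middle (5th of 9) value, unaffected by the tail

print(f"mean   = {avg:,.0f}")
print(f"median = {mid:,.0f}")
```

In this toy list the mean lands well above the median, which is the usual pattern for salary data and the usual reason for reporting the median.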

Next, looking at $75,000 and $63,240, one is obviously bigger.  But salaries range from $41k to $170k, so is that difference actually significant, or just a coincidence?  The standard deviation (here just a general estimate of fluctuation, since the distribution isn’t normal) of both genders’ salary distributions is around $40k, so a difference of $12k is small in comparison.  But a proper test would be to test the significance of a model which predicts salary by gender.  This model shows a p-value of right around 0.05, depending on how one treats the unknowns, which is on the threshold of traditional “statistical significance”.  But the R^2 of the model is 0.0086.  This means that knowing the gender explains less than 1% of the variation in salaries between people.
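For readers who want to see what “R^2 of a model that predicts salary by gender” means mechanically, here is a rough sketch.  With a single yes/no predictor, the model’s prediction for each person is just their group’s mean salary.  The function names and the toy numbers are mine, purely for illustration; the last lines plug in the R^2 quoted above and assume the 434 people with an assigned gender were the ones modeled.

```python
from statistics import mean

def r_squared_by_group(groups):
    """R^2 of a model that predicts each value by its group's mean."""
    everyone = [x for values in groups.values() for x in values]
    grand_mean = mean(everyone)
    ss_total = sum((x - grand_mean) ** 2 for x in everyone)
    ss_resid = sum((x - mean(values)) ** 2
                   for values in groups.values() for x in values)
    return 1 - ss_resid / ss_total

def f_statistic(r2, n):
    """F statistic for a one-predictor model fit to n observations."""
    return r2 / (1 - r2) * (n - 2)

# Toy data (hypothetical, in $k): a 10k gap between group means,
# but a much larger spread within each group.
toy = {"M": [50, 100], "F": [40, 90]}
toy_r2 = r_squared_by_group(toy)
print(f"toy R^2 = {toy_r2:.4f}")

# Assumption: 223 men + 211 women = 434 observations, unknowns dropped.
f_value = f_statistic(0.0086, 434)
print(f"F = {f_value:.2f}")
```

Turning the F statistic into a p-value needs the F distribution (for example `scipy.stats.f.sf`), which I have left out to keep the sketch dependency-free.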

So, the Free Beacon’s mathematical claims are holding up.  The difference is real, and it probably indicates an underlying fact about the data.

The interesting part comes when you remember that the pay-equality argument is “equal pay for equal work.”  Maybe women just have different sorts of jobs in the White House.  So, perhaps we should consider how men and women are paid for the same job.  For example, there are 19 “Staff Assistants” on the White House payroll (14 women, 4 men, and one unknown).   All but one of these make the same amount; one male makes slightly more.  For the “Analyst” job, things are reversed.  There are more men than women, nearly everyone makes the same, but one woman makes more.

We can make a model that accounts for this.  It models salary based on title, then on gender to try to describe the remaining differences.  The salary gap under these circumstances is reduced to just under $2000, though not eliminated.  However, the chance that that $2k is meaningful decreases significantly, with the p-value climbing past 0.1.  Add the interaction between position and gender (to see if women are being underpaid only in certain positions), and the significance of the model decays further.  The flaw in this model is that while many titles are generic, many of the people in this dataset have unique titles, such as “Special Assistant To The President For Urban Affairs”.  An analysis like this basically doesn’t use those at all.
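The “title first, then gender” idea can be sketched in a simplified two-stage form: subtract each title’s average salary, then compare what is left over by gender.  This is not exactly the regression I ran (a full model estimates both effects together), and the rows below are invented, not the real payroll.  It does show the flaw mentioned above: anyone with a unique title has a leftover of exactly zero and tells the model nothing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows (title, gender, salary); NOT the real payroll.
rows = [
    ("Staff Assistant", "F", 42_000),
    ("Staff Assistant", "F", 42_000),
    ("Staff Assistant", "M", 43_000),
    ("Analyst", "M", 50_000),
    ("Analyst", "M", 50_000),
    ("Analyst", "F", 52_000),
    ("Special Assistant For Urban Affairs", "M", 100_000),  # unique title
]

def raw_gap(rows):
    """Mean male salary minus mean female salary, ignoring titles."""
    men = [s for _, g, s in rows if g == "M"]
    women = [s for _, g, s in rows if g == "F"]
    return mean(men) - mean(women)

def residual_gap(rows):
    """Same gap, measured after subtracting each title's mean salary."""
    by_title = defaultdict(list)
    for title, _, salary in rows:
        by_title[title].append(salary)
    title_mean = {t: mean(s) for t, s in by_title.items()}

    leftover = defaultdict(list)
    for title, gender, salary in rows:
        leftover[gender].append(salary - title_mean[title])
    return mean(leftover["M"]) - mean(leftover["F"])

print(f"raw gap:        {raw_gap(rows):,.0f}")
print(f"gap after title: {residual_gap(rows):,.0f}")
```

In the toy rows, almost all of the raw gap comes from the one highly paid unique title, and it vanishes once title is accounted for.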

Conclusions

There are two criticisms that flowed from this article.  The first is the accusation that the White House made a conscious and deliberate decision to pay female employees less than males because they were women.   If this were true, we would expect to see women being paid less than men while doing the same job, and the data do not show that.

The second criticism is that the White House does not employ women as inner-circle trusted advisers to the President.  The data are more supportive of this, but only if you accept many more assumptions, such as that more pay directly maps to a more influential policy role.  To prove this, it would be much better to actually categorize who among the staff are the influential advisers to the President, rather than how much they are paid.

So, my ultimate conclusion is a mixed bag.  I don’t agree with the reasoning used to turn this information into an attack on the President, but the numbers themselves are valid and meaningful.

A few months ago, Supreme Court Justice Ruth Bader Ginsburg was discussing the drafting of a new Egyptian Constitution, and she said that she didn’t believe that the US Constitution was the best model.  This terribly offended many commentators, as can be seen here, here, and here.

But could there be a good reason why Justice Ginsburg doesn’t think the US Constitution is a good model?  I think there is; Presidential systems do not lend themselves well to long-term stable democracies, which is the goal of a well-written Constitution.  Of course, the United States is the exception, but how well do other countries with a Presidential system fare?

Political Sex Scandals Revisited

Posted: November 8, 2011 in analysis

Some of you may recall that last summer, I tried to build a model to predict the results of political sex scandals, and documented my efforts here and here.  The model was unusual, and it turned out to predict the then-current sex scandal (David Wu) very poorly.

Well, another sex scandal has made the news, so it’s time to put my model to the test again.  Herman Cain’s scandal isn’t very interesting, quite frankly.  The variables that matter to the model are pretty straightforward; Mr. Cain’s scandal is nothing special.

  • Intensity: 5 – multiple instances of sexual advances, but no actual sex.
  • Unfaithfulness: 7 – Cain has been married for 40+ years, but hasn’t quite been accused of actually cheating on his wife.
  • Kinkiness: 3 – Nothing more than a little dirty talk.
  • Hypocrisy: 4 – Courting the religious right but having adulterous intentions.
  • Coercion: 6 – The actions were non-consensual.

The other ratings, such as Contrition, which is 1 (Cain denies the events), and Plausibility, which is 6 (there isn’t very strong evidence that they happened), aren’t a part of the model.

So, as a low intensity Republican with a coercive but not kinky scandal, the model does not predict a happy outcome for Mr. Cain.  Specifically, the result is a value of 0.16, which means he will most likely drop out of the race or lose the nomination.  But, this is the same model that predicted that David Wu wasn’t going anywhere on the precise day he announced his resignation, so take that with a grain of salt.

It should also be noted that only one of my model data cases (Jack Ryan) was a non-incumbent candidate for election, so the dynamics may be very different.  But I have the model, so it’s worth testing it again.  And the best way to test is to make the prediction in advance of the event, so there you are.

Border Security Revisited

Posted: October 23, 2011 in analysis

As my loyal readers may recall, my first real posts on this site were a 3-part series about border security.  In part 3 of that series, I assumed that nobody was seriously discussing putting lethal deterrents on the U.S.-Mexican border (like minefields and electric fences), both for political reasons and for reasons of basic human dignity.

How wrong I was.  Herman Cain recently declared his plan to build an electrified fence on the border to keep out illegal immigrants.  He later said he was joking, but his description of the plan is lacking in humor.

It’s going to be 20 feet high. It’s going to have barbed wire on the top. It’s going to be electrified. And there’s going to be a sign on the other side saying, ‘It will kill you — Warning’

By now, you probably know how this works.  For the moment, let’s assume Mr. Cain was serious, and let’s set aside the issues of killing illegal immigrants in cold blood.  How much would a fence like this cost?

The Fallacy of the Base Rate

Posted: October 6, 2011 in analysis

This isn’t an analysis up to my usual standards, but I have been struggling for content these last few weeks.  Nothing out there seems to fit the somewhat particular requirements I have for a subject to write about: that it be a factual statement which requires aggressive or unorthodox analysis to get to the bottom of, and that there be a path of analysis that sheds light on it.  So, today I have a much simpler error, from wnd.com.

Writing about restrictions on unpasteurized milk, Bob Unruh observes:

The reason cannot be safety, the report said, since a report from the Weston A. Price Foundation revealed that from 1980 to 2005 there were 10 times more illnesses from pasteurized milk than from raw milk.

Unfortunately, this is citing a report (without giving details or linking to it) that cites another report that makes this claim.  I eventually found the original report here, only to discover that it has almost no further details about what it means.  From that report, we learn that there were 41 outbreaks and 19,531 illnesses attributed to pasteurized milk, and this is 10.7 times the illnesses for raw milk.  If you use the raw milk numbers later in the report, it comes out to 8.4 times, but they may be using different numbers.  It sounds damning, until you realize that this does not mean that an illness is ten times more likely from pasteurized milk.  The problem is that more people drink pasteurized milk, so even if illness is less likely, there will be more in total.
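The arithmetic behind that objection is short enough to sketch.  The 10.7 ratio is from the report; the share of milk drinkers who drink raw milk is NOT, so the 1% below is a purely hypothetical placeholder.  The sketch also assumes everyone drinks one kind or the other, in similar amounts.

```python
def raw_vs_pasteurized_risk(illness_ratio, raw_share):
    """Per-drinker illness risk of raw milk relative to pasteurized milk.

    illness_ratio: pasteurized illnesses divided by raw-milk illnesses
    raw_share:     fraction of milk drinkers who drink raw milk (assumed)
    """
    pasteurized_share = 1.0 - raw_share
    # risk = illnesses / drinkers, so the ratio of risks is
    # (raw illnesses / pasteurized illnesses) * (pasteurized drinkers / raw drinkers)
    return (1.0 / illness_ratio) * (pasteurized_share / raw_share)

def break_even_share(illness_ratio):
    """Raw-milk share at which both kinds of milk are equally risky."""
    return 1.0 / (1.0 + illness_ratio)

# Hypothetical: suppose 1% of milk drinkers drink raw milk.
risk = raw_vs_pasteurized_risk(10.7, 0.01)
print(f"raw milk is {risk:.1f}x as risky per drinker")

share = break_even_share(10.7)
print(f"break-even raw-milk share: {share:.1%}")
```

Under these assumptions, raw milk only comes out as safe as pasteurized if roughly one milk drinker in twelve drinks it raw; any smaller share means raw milk is the riskier one per drinker, despite causing fewer illnesses in total.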

Failed Ideas

Posted: September 19, 2011 in Meta

Finding good subject material to write about for this blog can be difficult.  Indeed, even once I have what seems like a good idea for an article, sometimes it doesn’t pan out.  While I keep looking for more ideas, I thought I’d share a few of the past ones that didn’t really work out, for one reason or another.

  • An analysis of President (then candidate) Obama’s 2008 comment about putting air in tires instead of drilling for oil.  Won’t do it, because many people already did.  See here for a good example of what I would hope to have done, if they hadn’t done it first.  Looking into this helped me realize that I needed to seek claims and questions that would be subject to more unusual or unorthodox analysis, not just number crunching.
  • An analysis of a claim on Conservapedia’s page describing public schools.  The claim is “Given that public schools educate about 90% of Americans, it is astounding how few prominent Americans attended public school after the banning of school prayer in 1962.”  I actually did a large amount of research on this one before this blog existed, and collected a fairly large list of prominent conservative Americans sorted by schooling (my compulsion to do that research was one of the reasons I started this blog in the first place), but it doesn’t make sense to put it up here, since the methodology is pretty sloppy, and picking apart bias from Conservapedia feels almost unfairly easy.
  • An attempt to quantify military success rates of countries that allow gays to serve openly vs. those that do not allow gays at all, also based in part on previous research.  The problem with this is that most wars are messy things without clear winners and with multiple parties on each side, and data on precisely when each country might have changed its policies on gays in the military is even harder to find than just what that policy is.   The short version is that Israel allows gays to serve in the military, and if there’s a country on this planet that needs every bit of military efficiency it can get, it’s Israel.
  • Some sort of data-based analysis of the media coverage of Hurricane Irene relative to its tangible effects.  I had just started to ponder how to approach this analysis when I saw that Nate Silver did it better than I could hope to.
  • An analysis of the “13 keys to victory” election prediction that was making the rounds on the Internet.  While I waited for the book to come via inter-library loan, Nate Silver ninja’d me again.  I’m working on a different kind of analysis that complements rather than retreads what Mr. Silver did, but it is slow going.

Lightbulb Energy Savings, part II

Posted: September 8, 2011 in analysis

In part I of this article, we discussed whether or not high efficiency lightbulbs would cause sufficiently higher heating bills to offset their energy savings, and determined that, for one case of a St. Paul, MN suburban homeowner, they would not.

There are some more areas to explore about this, however.

LED Lighting

But what if my friend is concerned about warmup time, light quality, or the dangerous substances in CFLs, and decided to light his house with LEDs instead?