Oliver Sherouse Writes Occasionally

on Public Policy
and Python Programming

The Metric System vs. the Soul

26 Feb 2015

I found this article on Facebook1, which argues that the Fahrenheit system is better for everyday use than the Celsius scale because it corresponds to a human range of hot and cold, rather than to the scientific but arbitrary freezing and boiling points of water. I find this argument obviously correct.

But in fact, the Celsius scale is only the tip of the 32-degree iceberg. I hold that the whole metric system dehumanizes us, when we use it out of its proper context.

That’s not me being funny to make a point: I actually believe that using the metric system for everything cheapens the human experience. Some people use this map, with countries that use the metric system in green and those that do not in gray, to mock the United States as a hopeless yokel of a nation:2

Map showing that the US is almost alone in not adopting the metric system

To me, that map shows the US as a lone holdout of common sense and civilization.

The metric system was developed to accomplish a few specific goals. It simplifies calculating higher or lower by its omnipresent powers of ten. It aligns, where possible, different kinds of measurement; a cubic centimeter of water is also one milliliter, and at four degrees Celsius it has a mass of one gram. In the laboratory, say, or in large-scale manufacturing, these properties are no doubt desirable, because the extreme precision required comes most easily when unencumbered by factors purely human.

And it is for the exact same reason that the metric system ruins the glory and splendor and even romance of everyday life.

A meter, for example, is the length light travels in about one three-hundred millionth of a second. That is a very precise definition, but to any person who does not go around noting the precise locations of photons, it is a useless definition. A foot, on the other hand, is about the length of a man’s foot when he wears a shoe.

A liter is the volume of a container 10 centimeters long, wide, and high. A cup is about as much as you get in a cup. A pint is two of those, the perfect size for a serving of beer.

The metric system has no connection to humanity as such. You can see this just by looking at the arts.

When Shylock demands a pound of flesh, we shudder; if he demanded a kilogram, we would laugh. When Falstaff says “Peace, good pint-pot” to the hostess, he is having a good time; if he said “Peace, good point-five-liter-pot,” he would be a pedant.

No-one would be much moved if Frost sighed “But I have promises to keep and kilometers to go before I sleep, and kilometers to go before I sleep.”

I do not say that no-one has ever or will ever write a poem about a kilometer; only that I doubt that anyone has or will write a good one.

It does no good to say that measurement has nothing to do with art. That answer proves my point; it loses that part of the human experience that sees the romance in the mile of a thousand steps, that perceives the relationship of man to the cosmos.

The imposition of the metric system on the public first occurred during the French Revolution. If it was the most minor atrocity of the Jacobins’ bloody and merciless rationalism, it was also the most lasting. It embodies the Revolution’s determination to cram the majestic complexity of the world into a human mechanical design.

When someone says that we should give up our old miles for kilometers or pounds for kilograms, what they are really saying is that they wish our everyday life were more like a machine, or a laboratory, or a mass-production facility; that it would be less like humanity, and less like life.

I prefer humanity to machinery, and I value art over easy convertibility. And if I am the last man to measure my journeys in miles, I will probably be the man who enjoys them most along the way.

  1. Hat tip to Nancy vander Veer

  2. Source

Afternoon Links

12 Jan 2015

Today I’m reading a few papers from NBER:

  • Cognitive Economics:

    “Cognitive Economics is the economics of what is in people’s minds. It is a vibrant area of research (much of it within Behavioral Economics, Labor Economics and the Economics of Education) that brings into play novel types of data—especially novel types of survey data. Such data highlight the importance of heterogeneity across individuals and highlight thorny issues for Welfare Economics. A key theme of Cognitive Economics is finite cognition (often misleadingly called “bounded rationality”), which poses theoretical challenges that call for versatile approaches. Cognitive Economics brings a rich toolbox to the task of understanding a complex world.”

  • Austerity in 2009-2013, ungated version here:

    “The conventional wisdom is (i) that fiscal austerity was the main culprit for the recessions experienced by many countries, especially in Europe, since 2010 and (ii) that this round of fiscal consolidation was much more costly than past ones. The contribution of this paper is a clarification of the first point and, if not a clear rejection, at least it raises doubts on the second.”

    I’m hoping that this paper on austerity will be a little more illuminating than the fly-by analysis I was talking about earlier.

Austerity Arguments are a Mess (Chart Fight!)

12 Jan 2015

Quick chart fight. A while back, Matt Yglesias posted this, saying that “2014 is the year American austerity came to an end”:

yglesias_chart

Econ blogger Angus argued that Yglesias is trying to re-define austerity because we’re now seeing some decent growth. He posted the nominal graph and quipped, “Either austerity means nominal cuts and we never had any of it, or austerity means cuts relative to trend and we are still savagely in its grasp”:

angus_chart

Kevin Drum says that’s bogus, because you have to look at real spending per capita, like so:

drum_chart

So here’s my entry. I’m going to add two economic indicators to that same chart: growth in real GDP per capita, and the prime-age employment-population ratio (which I like better than unemployment):

oliver_chart

To put growth and the E-P ratio on the same scale, I’ve arbitrarily subtracted 79 percentage points from the E-P ratio, which is about its average over the period in question. It’s the trend, not the level, that matters.
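For the curious, here’s roughly how a chart like that can be put together in Python with pandas and matplotlib. This is only a sketch, not the actual script behind the chart above; the CSV file names are made up, and it assumes each file holds one quarterly series (real spending per capita, real GDP per capita growth, and the prime-age E-P ratio, the latter two in percent).

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical inputs: each CSV holds one quarterly series indexed by date.
spending = pd.read_csv("real_spending_per_capita.csv",
                       index_col=0, parse_dates=True).squeeze()
growth = pd.read_csv("real_gdp_per_capita_growth.csv",
                     index_col=0, parse_dates=True).squeeze()
ep_ratio = pd.read_csv("prime_age_ep_ratio.csv",
                       index_col=0, parse_dates=True).squeeze()

fig, ax = plt.subplots()
ax.plot(spending.index, spending, label="Real spending per capita")

# Shift the E-P ratio down by roughly its period average so it and growth
# fit on a common second axis; the trend, not the level, is what matters.
ax2 = ax.twinx()
ax2.plot(growth.index, growth, label="Real GDP per capita growth (%)")
ax2.plot(ep_ratio.index, ep_ratio - 79, label="Prime-age E-P ratio minus 79 (%)")

ax.legend(loc="upper left")
ax2.legend(loc="lower right")
plt.show()
```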

The point, as I see it, is this: to make an argument about the “end of austerity” and what it means, you have to look at that graph and say that the 2014 part of that chart is meaningfully different from the 2009-2013 part. If you see that, you have better eyes than I do.

This is why people don’t trust economists or economics writers. It’s why they shouldn’t. You can’t tell anything from that graph, and claiming you can means you’re at best overstating your case, and at worst lying. It can be a data point1, but only as part of a larger analysis, and I haven’t seen any that I’m particularly thrilled about or ready to bank on.

  1. Paul Krugman, for what it’s worth, has taken this route; Scott Sumner responds to him and Simon Wren-Lewis here.

Can We Really Say Voter ID Suppressed Turnout?

10 Nov 2014

In a post dramatically entitled Voter Suppression in 2014, Sean McElwee of the think tank Demos argues that early statistics1 already suggest that meaningful numbers of voters were wrongly disenfranchised. He makes three points: first, that the number of people who cannot vote because they committed a felony was high relative to some victory margins; second, that states with voter ID laws saw suppressed turnout; and third, that states with same-day registration had higher turnouts.

I want to focus on the second point there, because it’s been a hot-button issue lately, and because I’m more skeptical than most people that voter ID makes much of a difference2. McElwee tries to demonstrate his point by graphing the mean voter turnout among states in three pools: those which require photo ID, those which require non-photo ID, and those with no ID requirement3.

mean

Mean turnout was highest in the no-ID states, and higher in the (presumably less restrictive) non-photo ID states than in the photo ID states. Case closed, right?

Not exactly. To use statistics like this to make a real point, you have to remember that you’ve got an incredibly small sample size. What we really want to know is whether the variance between groups is bigger than the variance within groups.
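If you want to put a number on that comparison, one standard way is a one-way ANOVA, which asks exactly whether the between-group differences are large relative to the within-group spread. The real data and script are in the GitHub repository linked in the footnotes; the sketch below just assumes a hypothetical CSV with one row per state and columns for turnout and the type of ID law.

```python
import pandas as pd
from scipy import stats

# Hypothetical file: one row per state, columns state, turnout, id_law
# (id_law taking the values "photo", "non-photo", and "none").
df = pd.read_csv("turnout_2014.csv")

# Compare between-group differences to within-group spread.
groups = [g["turnout"].values for _, g in df.groupby("id_law")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```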

For example, here’s another version of that graph, but I’ve added confidence intervals:

mean-ci

The idea here is that, for each group, I can be 95% sure, statistically, that the group’s true mean turnout falls between the top and bottom of the black line. You can see that there’s a lot of overlap. A mean turnout of 38 percent, say, wouldn’t be out of line for any group.
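Reproducing those intervals is straightforward. Here’s a sketch using the same hypothetical file as above (again, the real script is in the repo), computing a 95% confidence interval for each group’s mean with the t distribution, since each group contains only a handful of states.

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("turnout_2014.csv")  # hypothetical: state, turnout, id_law

for law, group in df.groupby("id_law"):
    turnout = group["turnout"]
    mean = turnout.mean()
    # 95% confidence interval for the group mean.
    half_width = stats.t.ppf(0.975, df=len(turnout) - 1) * stats.sem(turnout)
    print(f"{law}: mean {mean:.1f}%, "
          f"95% CI ({mean - half_width:.1f}%, {mean + half_width:.1f}%)")
```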

Maybe we’d be better off if we didn’t look at the mean, but rather the median—the state that ranks exactly in the middle of its group in terms of turnout. This takes care of any outliers—observations that aren’t characteristic of the group as a whole:

median

Whoops! Now the suppression story doesn’t fit at all. There’s almost no difference between photo ID states and no-ID states, and non-photo ID states do worse for some reason. Of course, at this point, we start to suspect that it’s not so much a reason as chance, and other unexplained factors that affect turnout.
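The medians themselves are a one-liner with the same hypothetical DataFrame:

```python
import pandas as pd

df = pd.read_csv("turnout_2014.csv")  # hypothetical: state, turnout, id_law

# Median turnout by ID-law group; far less sensitive to a couple of
# unusually high-turnout states than the mean is.
print(df.groupby("id_law")["turnout"].median())
```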

Heck, let’s do one more. Here’s a box plot:

box

The line in the middle is the mean, the same as in the first graph. The box represents the middle 50 percent of the states in that group. Finally, the lines (called “whiskers”) cover the rest of the group, extending up to one and a half times the spread of that middle 50 beyond the edges of the box.

Here we see an important point: there are two dots in the no-ID group that are so much higher than the rest that they fall outside even the whiskers. Those dots happen to represent Maine and Wisconsin, which had particularly high turnouts, and which pulled the mean of the no-ID group up quite a bit. Now, looking at the whole distribution, that difference in means looks a lot less compelling.
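A box plot like that takes only a few lines with pandas and matplotlib. Again, this is a sketch against the same hypothetical file, not the exact script in the repo; by default the whiskers stop at one and a half times the spread of the box, and anything beyond them is drawn as an individual dot.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("turnout_2014.csv")  # hypothetical: state, turnout, id_law

# One box per ID-law group; states beyond the whiskers show up as dots.
df.boxplot(column="turnout", by="id_law")
plt.suptitle("")  # drop pandas' automatic grouped-by title
plt.title("2014 turnout by voter ID requirement")
plt.ylabel("Turnout (%)")
plt.show()
```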

This all amounts to a huge statistical nothingburger. As more data comes out, I’m sure more careful analyses will be run on the numbers to see whether we think voter ID laws were important to the election. My bet’s on the null hypothesis, but I might be wrong.

But let’s not excite ourselves about statistically meaningless charts just yet, shall we?

  1. The turnout numbers come from Michael P. McDonald, a professor at the University of Florida, and his website, electproject.org.

  2. I believe some nefarious folks have tried to use voter ID to improve their chances in elections; I’m just skeptical that it worked.

  3. I put the data and script I used to create these charts in a GitHub repository for anyone who’s interested.

The Big Problem With Stata

29 Oct 2014

I use Python for almost all my data work, but both in my workplace and in my field more generally, Stata dominates. People use Stata for a reason1: it provides a far wider range of advanced statistical tools than you can find with Python (at least so far). But I hate working in it.

I’ve always found it hard to explain to others just why I hate it so much. You can generally get your problem solved, the help files aren’t terrible, there’s lots of Google-able help online2, and you can write functions if you want to learn how. And while I find lots of little things annoying (the way you get variable values, for example, or the terrible do-file editor), the big problem was the one other people didn’t understand.

Today, however, I was re-reading some pages about the Unix Philosophy, when I saw something that hit the nail on the head. It’s Rob Pike’s Rule 5:

Rule 5. Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

Stata only has one data structure: the dataset. A dataset is a list of columns of uniform length. You can only have one dataset open at a time.

This is the right data structure for performing the actual analysis of data—say, a regression—and the wrong data structure for literally everything else. The problem is, 90 percent of doing data work is cleaning, aligning, adjusting, aggregating, disaggregating, and generally mucking around with your source data, because source data always comes from people who hate you. And because the data structure is wrong, you’re forced to use algorithms that look like they come from an H.P. Lovecraft story.
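To make that concrete, here’s the kind of everyday mucking-around that’s painless in Python with pandas, where you can keep as many tables in memory as you like, and that in Stata turns into a dance of merge, preserve, and restore against the one open dataset. The file and column names here are invented for illustration.

```python
import pandas as pd

# Hypothetical sources: a firm-level panel and a separate industry lookup table.
firms = pd.read_csv("firms.csv")            # firm_id, industry_code, year, revenue
industries = pd.read_csv("industries.csv")  # industry_code, industry_name

# Hold both tables at once, join them, and aggregate, with no temporary
# files and no preserve/restore gymnastics.
merged = firms.merge(industries, on="industry_code", how="left")
by_industry = (merged
               .groupby(["industry_name", "year"], as_index=False)["revenue"]
               .sum())

# A single number is just a variable, not a column repeated down the dataset.
total_revenue = by_industry["revenue"].sum()
print(by_industry.head())
print(total_revenue)
```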

Never having seen anything better, most Stata users seem to be resigned to doing things like creating an entire column to store a single number and writing impenetrable loops for simple tasks. Or they use sensible tools to create their datasets (increasingly Python, but also even something like Excel) and then use Stata just for the analysis.

The latter is my approach when I can’t avoid Stata entirely. But I’m really looking forward to the day when I can leave its fundamentally flawed design behind altogether.

  1. In my graduate program, we started learning econometrics with a different statistical program, called SAS. SAS is…SAS is rough.

  2. I’m looking at you, R.