Category Archives: Science

T-Shirts, Feminism, Parenting, and Data Science, Part 1: Colors

Before I was a parent I never gave much thought to children’s clothing, other than to covet a few of the baby shirts at T-Shirt Hell. Now that I have a two-year-old daughter, I have trouble thinking of anything but children’s clothing. (Don’t tell my boss!)

What I have discovered over the last couple of years is that clothing intended for boys is fun, whereas clothing intended for girls kind of sucks. There’s nothing inherently two-year-old-boy-ish about dinosaurs, surfing ninjas, skateboarding globes, or “become-a-robot” solicitations, just as there’s nothing inherently two-year-old-girl-ish about pastel-colored balloons, or cats wearing bows, or dogs wearing bows, or ruffles. Forget about gender: I want Madeline to grow up to be a “surfing ninja” kind of kid, not a “cats wearing bows” kind of kid. An “angry skateboarding dog” kind of kid, not a “shoes with pretty ribbons” kind of kid.

Accordingly, I have taken to buying all of Madeline’s shirts in the boys section, the result of which (combined with her boy-ish haircut) is that half the time people refer to her as “he”. This doesn’t terribly bother me, especially if she ends up getting the gender wage premium that people are always yammering about on Facebook, but it makes me wonder why there’s such a stark divide between toddler boy shirts and toddler girl shirts. And, of course, it makes me wonder if the divide is so stark that I can build a model to predict it!

The Dataset

I downloaded images of every “toddler boys” and “toddler girls” t-shirt from Carters, Children’s Place, Crazy 8, Gap Kids, Gymboree, Old Navy, and Target. Because each site kept its shirts at a different (random) location, I decided that using an Image Downloader Chrome extension would be quicker and easier than writing a scraping script that worked with all the different sites.

I ended up with 616 images of boys shirts and 446 images of girls shirts. My lawyer has advised me against redistributing the dataset, although I might if you ask nicely.

Attempt #1: Colors

(As always, the code is on my GitHub.)

A quick glance at the shirts revealed that boys shirts tend toward boy-ish colors, girls shirts toward girl-ish colors. So a simple model could just take into account the colors in the image. I’ve never done much image processing before, so the Pillow Python library seemed like a friendly place to start. (In retrospect, a library that made at least a half-hearted attempt at documentation would probably have been friendlier.)

The PIL library has a getcolors function that returns a list of

(# of pixels, (red, green, blue))

for each RGB color in the image. This gives 256 * 256 * 256 = almost 17 million possible colors, which is probably too many, so I quantized the colors by bucketing each of red, green, and blue into either [0,85), [85,170), or [170,255]. That leaves 3 * 3 * 3 = 27 possible colors.

To make things even simpler, I only cared about whether an image contained at least one pixel of a given color [bucket] or whether it contained none. This allowed me to convert each image into an array of length 27 consisting only of 0s and 1s.
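Here’s a minimal sketch of that feature extraction, assuming Pillow; the helper name and the exact bucket arithmetic are mine rather than anything from the repo:

```python
from PIL import Image

def color_features(path):
    """27-element 0/1 vector: does the image contain at least one pixel
    whose (r, g, b) falls in each of the 3 x 3 x 3 color buckets?"""
    img = Image.open(path).convert("RGB")
    # getcolors returns None unless maxcolors is at least the number of
    # distinct colors, so pass the pixel count to be safe
    colors = img.getcolors(maxcolors=img.size[0] * img.size[1])
    features = [0] * 27
    for _count, (r, g, b) in colors:
        bucket = min(r // 85, 2) * 9 + min(g // 85, 2) * 3 + min(b // 85, 2)
        features[bucket] = 1
    return features
```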

Finally, I trained a logistic regression model to predict, based solely on the presence or absence of the 27 colors, whether a shirt was a boys shirt or a girls shirt. Without getting too mathematical, we end up with a weight (positive or negative) for each of the 27 colors. Then for any shirt, we add up the weights for all the colors in the shirt, and if that total is positive, we predict “boys shirt”, and if that total is negative, we predict “girls shirt”.

I trained the model on 80% of the data and measured its performance on the other 20%. This (pretty stupid) model predicted correctly about 77% of the time.
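The training and evaluation step is short. Here’s roughly what it looks like, assuming scikit-learn (the post doesn’t say which implementation the real code uses); boy_paths and girl_paths are hypothetical lists of the downloaded image files:

```python
import random
from sklearn.linear_model import LogisticRegression

# 1 = boys shirt, 0 = girls shirt
examples = ([(color_features(p), 1) for p in boy_paths] +
            [(color_features(p), 0) for p in girl_paths])
random.shuffle(examples)
split = int(0.8 * len(examples))
train, test = examples[:split], examples[split:]

model = LogisticRegression()
model.fit([x for x, _ in train], [y for _, y in train])
print("test accuracy:", model.score([x for x, _ in test], [y for _, y in test]))
```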

Plotted below is the number of boys shirts (blue) and girls shirts (pink) in the test set by the score assigned them in the model. Without getting into gory details, a score of 0 means the model thinks it’s equally likely to be a boys shirt or a girls shirt, with more positive scores meaning more likely boys shirt and more negative scores meaning more likely girls shirt. You can see that while there’s a muddled middle, when the model is really confident (in either direction), it’s always right.

[Figure: number of boys shirts (blue) and girls shirts (pink) in the test set, by model score]

If we dig into precision and recall, we see

P(is actually girl shirt | prediction is “girl shirt”) = 75%
P(is actually boy shirt | prediction is “boy shirt”) = 77%
P(prediction is “girl shirt” | is actually girl shirt) = 63%
P(prediction is “boy shirt” | is actually boy shirt) = 86%

One way of interpreting the recall discrepancy is that it’s much more likely for girls shirts to have “boy colors” than for boys shirts to have “girl colors”, which indeed appears to be the case.
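Those four numbers are just per-class precision and recall on the held-out 20%. Continuing the scikit-learn sketch above, they can be computed like so:

```python
from sklearn.metrics import precision_score, recall_score

test_X = [x for x, _ in test]
test_y = [y for _, y in test]
predicted = model.predict(test_X)

# 1 = boys shirt, 0 = girls shirt
print("P(actually boy | predicted boy):  ", precision_score(test_y, predicted, pos_label=1))
print("P(predicted boy | actually boy):  ", recall_score(test_y, predicted, pos_label=1))
print("P(actually girl | predicted girl):", precision_score(test_y, predicted, pos_label=0))
print("P(predicted girl | actually girl):", recall_score(test_y, predicted, pos_label=0))
```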

Superlatives

Given this model, we can identify

The Girliest Girls Shirt (no argument from me):

[image: the girliest girls shirt]

The Boyiest Girls Shirt (must be the black-and-white and lack of color?):

[image: the boyiest girls shirt]

The Girliest Boys Shirt (I can see that if you just look at colors):

[image: the girliest boys shirt]

The Boyiest Boys Shirt (a slightly odd choice, but I guess those are all boy-ish colors?):

[image: the boyiest boys shirt]

The Most Androgynous Shirt (this one is most likely some kind of image compression artifact, the main colors are boyish but the image also has some girlish purple pixels in it that cancel those out):

[image: the most androgynous shirt]

The Blandest Shirt (for sure!):

[image: the blandest shirt]

The Most Colorful Shirt (no argument with this one either!):

[image: the most colorful shirt]

Scores for Colors

By looking at the coefficients of the model, we can see precisely which colors are the most “boyish” and which are the most “girlish”. The results are not wholly unexpected:

[Table: the 27 colors with their model coefficients, from +151.71 for the most boyish color down to -224.74 for the most girlish.]
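If you want to pull the per-color weights out yourself, the scikit-learn sketch above makes it a short loop plus a helper for naming buckets (the helper is mine, and this won’t reproduce the exact numbers above, which came from the real model):

```python
def bucket_rgb(i):
    # map a bucket index back to a representative (r, g, b) midpoint
    mids = [42, 127, 212]
    return (mids[i // 9], mids[(i // 3) % 3], mids[i % 3])

# positive weight = boyish, negative weight = girlish
for i, weight in sorted(enumerate(model.coef_[0]), key=lambda iw: -iw[1]):
    print(bucket_rgb(i), round(weight, 2))
```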

In Conclusion

In conclusion, by looking only at which of 27 colors are present in a toddler t-shirt, we can do a surprisingly good job of predicting whether it’s a boys shirt or a girls shirt. And that pretty clearly involves throwing away lots of information. What if we were to take more of the actual image into account?

Coming soon, Part 2: EIGENSHIRTS

Secrets of Fire Truck Society

Hi, I gave a talk at Ignite Strata on “Secrets of Fire Truck Society” and at the end I promised that for more information you could visit this blog. Unfortunately, I haven’t had time to write a blog post. Here are some links to tide you over until I do:

Hacking Hacker News

Hacker News, if you don’t know it, is an aggregator / forum attached to Y Combinator. People submit links to news stories and blog posts, questions, examples, and so on. Other people vote them up or down, and still other people argue about them in the comments sections.

If you have unlimited time on your hands, it’s an excellent firehose for things related to hacking. If your time is more limited, it’s more challenging. People submit hundreds of stories every day, and even if you only pay attention to the ones that get enough votes to make it to the homepage, it’s still overwhelming to keep up.

What’s more, a lot of the stories are about topics that are boring, like OSX and iPads and group couponing. So for some time I’ve been thinking that what Hacker News really needs is some sort of filter for “only show me stories that Joel would find interesting”. Unfortunately, it has no such filter. So last weekend I decided I would try to build one.

Step 1 : Design

To keep things manageable, I made a couple of simplifying design decisions.

First, I was only going to take into account static features of the stories. That meant I could consider their title, and their url, and who submitted them, but not how many comments they had or how many votes they had, since those would depend on when they were scraped.

In some ways this was a severe limitation, since HN itself uses the votes to decide which stories to show people. On the other hand, the whole point of the project was that “what Joel likes” and “what the HN community likes” are completely different things.

Second, I decided that I wasn’t going to follow the links to collect data. This would make the data collection easier, but the predicting harder, since the titles aren’t always indicative of what’s behind them.

So basically I would use the story title, the URL it linked to, and the submitter’s username. My goal was just to classify the story as interesting-to-Joel or not, which meant the simplest approach was probably to use a naive Bayes classifier, so that’s what I did.

Step 2 : Acquire Computing Resources

I have an AWS account, but for various reasons I find it kind of irritating. I’d heard some good things about Rackspace Cloud Hosting, so I signed up and launched one of their low-end $10/month virtual servers with (for no particular reason) Debian 6.0.

I also installed a recent Ruby (which is these days my preferred language for building things quickly) and MongoDB, which I’d been meaning to learn for a while.

Step 3 : Collect Data

First I needed some history. A site called Hacker News Daily archives the top 10 stories each day going back a couple of years, and it was pretty simple to write a script to download them all and stick them in the database.

Then I needed to collect the new stories going forward. At first I tried scraping them off the Hacker News “newest” page, but very quickly they blocked my scraping (which I didn’t think was particularly excessive). Googling this problem, I found the unofficial Hacker News API, which is totally cool with being scraped, so now I scrape it once an hour. (Unfortunately, it seems to go down several times a day, but what can you do?)

Step 4 : Judging Stories

Now I’ve got an ever-growing database of stories. To build a model that classifies them, I need some training data with stories that are labeled interesting-to-Joel or not. So I wrote a script that pulls all the unlabeled stories from the database, shows them to me one at a time, asks whether I’d like to click on each story or not, and then saves that judgment back to the database.

At first I was judging them most-recent-first, but then I realized I was biasing my training set toward SOPA and PIPA, so I changed it to judge them in random order.
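The real tool is Ruby, but the loop is simple enough that a Python/pymongo sketch gets the idea across (the database and field names here are hypothetical):

```python
import random
from pymongo import MongoClient

stories = MongoClient().hn.stories
unlabeled = list(stories.find({"label": {"$exists": False}}))
random.shuffle(unlabeled)   # judge in random order to avoid recency bias

for story in unlabeled:
    answer = input(f"{story['title']} ({story['url']}) -- click? [y/n/q] ")
    if answer == "q":
        break
    stories.update_one({"_id": story["_id"]},
                       {"$set": {"label": answer == "y"}})
```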

Step 5 : Turning Stories into Features

The naive Bayes model constructs probabilities based on features of the stories. This means we need to turn stories into features. I didn’t spend too much time on this, but I included the following features:

* contains_{word}
* contains_{bigram}
* domain_{domain of url}
* user_{username}
* domain_contains_user (a crude measure of someone submitting his own site)
* is_pdf (generally I don’t want to click on these links)
* is_question
* title_has_dollar_amount
* title_has_number_of_years
* title_references_specific_YC_class (e.g. “(YC W12) seeks blah blah”)
* title_is_in_quotes

For the words and bigrams, I removed a short list of stopwords, and I ran them all through a Porter stemmer. The others are all pretty self-explanatory.
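Here’s a rough sketch of the featurizer. The original is Ruby and the feature set below is abbreviated; the helpers, field names, and regexes are my reconstruction, with nltk’s PorterStemmer standing in for whatever stemmer the real code uses:

```python
import re
from urllib.parse import urlparse
from nltk.stem import PorterStemmer

STOPWORDS = {"a", "an", "the", "of", "to", "in", "and", "for"}   # short list
stem = PorterStemmer().stem

def features(story):
    title, url, user = story["title"], story["url"], story["user"]
    words = [stem(w) for w in re.findall(r"[a-z0-9']+", title.lower())
             if w not in STOPWORDS]
    domain = urlparse(url).netloc
    feats = {f"contains_{w}" for w in words}
    feats |= {f"contains_{a}_{b}" for a, b in zip(words, words[1:])}   # bigrams
    feats.add(f"domain_{domain}")
    feats.add(f"user_{user}")
    if user.lower() in domain.lower():
        feats.add("domain_contains_user")
    if url.lower().endswith(".pdf"):
        feats.add("is_pdf")
    if title.strip().endswith("?"):
        feats.add("is_question")
    if re.search(r"\$\d", title):
        feats.add("title_has_dollar_amount")
    if re.search(r"\(YC [WS]\d\d\)", title):
        feats.add("title_references_specific_YC_class")
    return feats
```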

Step 6 : Training a Model

This part is surprisingly simple:

* Get all the judged stories from the database.
* Split them into a training set and a test set. (I’m using an 80/20 split.)
* Compute all the features of the stories in the training set, and for each feature count (# of occurrences in liked stories) and (# of occurrences in disliked stories).
* Throw out all features that don’t occur at least 3 times in the dataset.
* Smooth each remaining feature by adding an extra 2 likes and an extra 2 dislikes. (2 is on the large side for smoothing, but we have a pretty small dataset.)
* That’s it. We YAML-ize the feature counts and save them to a file.
* For good measure, we use the model to classify the held-out test data, and plot a Precision-Recall curve.
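In code, and continuing the pymongo and features() sketches above, the counting and smoothing step looks roughly like this (my reconstruction, not the original Ruby):

```python
import random
from collections import Counter
import yaml

judged = list(stories.find({"label": {"$exists": True}}))
random.shuffle(judged)
split = int(0.8 * len(judged))
train, test = judged[:split], judged[split:]

like_counts, dislike_counts = Counter(), Counter()
num_likes = num_dislikes = 0
for story in train:
    if story["label"]:
        num_likes += 1
        like_counts.update(features(story))
    else:
        num_dislikes += 1
        dislike_counts.update(features(story))

model = {}
for feat in set(like_counts) | set(dislike_counts):
    if like_counts[feat] + dislike_counts[feat] < 3:
        continue                                    # too rare, throw it out
    model[feat] = [like_counts[feat] + 2,           # add-2 smoothing
                   dislike_counts[feat] + 2]

with open("model.yaml", "w") as f:
    yaml.safe_dump({"counts": model,
                    "num_likes": num_likes,
                    "num_dislikes": num_dislikes}, f)
```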

Step 7 : Classifying the Data

A naive Bayes classifier is fast, so it only takes a few seconds to generate and save interesting-to-Joel probabilities for all the stories in the database.
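Scoring a story is just adding up log-likelihood ratios for whichever of its features made it into the saved counts, then squashing the total back into a probability. Again, this is a hedged reconstruction (including the stored field name), not the original code:

```python
import math
import yaml

with open("model.yaml") as f:
    saved = yaml.safe_load(f)
counts = saved["counts"]
n_like, n_dislike = saved["num_likes"], saved["num_dislikes"]

def p_interesting(story):
    """Naive Bayes estimate of P(interesting-to-Joel | story's features)."""
    log_odds = math.log(n_like / n_dislike)             # prior log-odds
    for feat in features(story):
        if feat in counts:
            likes, dislikes = counts[feat]
            log_odds += math.log((likes / n_like) / (dislikes / n_dislike))
    return 1 / (1 + math.exp(-log_odds))                # back to a probability

for story in stories.find({"p_interesting": {"$exists": False}}):
    stories.update_one({"_id": story["_id"]},
                       {"$set": {"p_interesting": p_interesting(story)}})
```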

Step 8 : Publishing the Data

This should have been the easiest step, but it caused me a surprising amount of grief. First I had to decide between

* publish every story, accompanied by its probability; or
* publish only stories that met some threshold

In the long term I’d prefer the second, but while I’m getting things to work the first seems preferable.

My first attempt involved setting up a Twitter feed and using the Twitter Ruby gem to publish the new stories to it as I scored them. This worked, but it wasn’t a pleasant way to consume them, and anyway it quickly ran afoul of Twitter’s rate limits.

I decided a blog of batched stories would be better, and so then I spent several hours grappling with Ruby gems for WordPress, Tumblr, Blogger, Posterous, and even LiveJournal [!] without much luck. (Most of the authentication APIs were for more heavy-duty use than I cared about — I just wanted to post to a blog using a stored password.)

Finally I got Blogger to work, and after some experimenting I decided the best approach would be to post, once an hour, all the new stories since the last time I posted. Eventually I realized that I should rank the stories by interesting-to-Joel-ness, so that the ones I’d most want to read would be at the top and the ones I’d least want to read would be at the bottom.

The blog itself is at

http://joelgrus-hackernews.blogspot.com/

Step 9 : Automate

This part was pretty easy with two cron jobs. The first, once an hour, goes to the Hacker News API and retrieves all new unknown stories (up to a limit of like 600, which should never be hit in practice). It then scores them with the last saved model and adds them to the database. In practice, the API isn’t working half the time.

The second, a few minutes later, takes all the new stories and posts them to the blog. The end result is a blog of hourly scored digests of new Hacker News posts.

Step 10 : Improve the Model

The model can only get better with more training data, which requires me to judge whether I like stories or not. I do this occasionally when there’s nothing interesting on Facebook. Right now this is just the above command-line tool, but maybe I’ll come up with something better in the future.

Step 11 : Profit

I’m still trying to figure this one out. If you’ve got any good ideas, the code is here.

Hyphen Class Post-Mortem

Last fall I signed up for two of the hyphen classes: the Machine Learning ml-class (Ng) and the Artificial Intelligence ai-class (Thrun and Norvig). Both were presented by Stanford professors but one of the conditions of taking the courses was that whenever I discuss them I am required to present the disclaimer that THEY WERE NOT ACTUALLY STANFORD COURSES and that I WAS NEVER ACTUALLY A STANFORD STUDENT and that furthermore I AM NOT FIT TO LICK THE BOOTS OF A STANFORD STUDENT and so on. (Caltech is better than Stanford anyway, even if whenever you tell people you’re in the economics department they always say, “we have one of those?!”)

My background is in math and economics, but I’ve taught myself quite a bit of computer science over the years, and I consider myself a decent programmer now, to the point where I could probably pass a “code on the chalkboard” job interview if that’s what I needed to do in order to support my family and/or drug habit.

I’d worked on some machine learning projects at previous jobs, so I’d picked up some of the basics, but I’d never taken any sort of course in machine learning. At my current job I’m the de facto subject matter expert, so I thought the courses might be a good idea.

The classes ended up being vastly different from one another. Here’s kind of a summary of each:

ml-class:

* Every week there were 5-10 recorded lectures, totaling 1-2 hours of my time. (There was an option to watch the lectures at 1.2x or even 1.5x speed, which I always used, so at normal speed it might have been more like 3 hours of lecture. This means that if I ever meet Ng in real life, he will appear to me to be speaking very, very slowly.)

* Most lectures had one or two (ungraded) integrated multiple choice quizzes with the sole purpose of “did you understand the material I just presented?”

* Each week had a set of “review questions” that were graded and were designed to make sure you understood the lectures as a whole. You could retake the review if you missed any (or if you didn’t) and they were programmed to slightly vary each time (so that a “which of the following are true” might be replaced with a “which of the following are false” with slightly different choices but covering the same material).

* Each week also had a programming assignment in Octave, for which they provided the bulk of the code, and you just had to code in some functions or algorithms. I probably spent 2-3 hours a week on these, a fair amount of that chasing down syntax-error bugs in my code and/or yelling at Octave for crashing all the time.

* Machine learning is a pretty broad topic, and this course mostly focused on what I’d call “machine learning using gradient descent.” There was some amount of calculus involved (although you could probably get by without it) and a *lot* of linear algebra. If you weren’t comfortable with linear algebra, the class would have been very hard, and the programming assignments probably would have taken a lot longer than they took me.

* The material was a nice mix of theoretical and practical. I’ve already used some of what I learned in my work, and if there was a continuation of the class I would definitely take it. As it stands I’m right now signed up for the nlp-class and the pgm-class, which should be starting soon, both of which are relevant to what I do.

* The workload, and the corresponding amount I learned, were substantially less than they would have been in an actual 10-week on-campus university course. This was great for me, since I also have a day job and a baby. If I were a full-time student being offered ml-class instead of a real machine learning class, I might feel a little cheated. (I saw a blog post by some Stanford student whining about this, but he was mostly upset that the hyphen classes were devaluing his degree. Someone should have reminded him about the disclaimer.)

* The class was very solidly prepared. The lectures were smooth and well thought out. The review questions did a good job of making sure you’d learned the right things from the lectures. The programming assignments were good in their focus on the algorithms, although that did insulate you from the real-world messiness of getting programs set up correctly.

* It certainly seemed like Ng really enjoyed teaching, and at the end of the last lecture he thanked everyone in a very heartfelt way for taking the class.

ai-class:

* Every week dozens of lectures, each a couple of minutes long, interspersed with little multiple choice quizzes. This was my first point of frustration, in that the quizzes were frequently about parts of the lecture that hadn’t happened yet. Furthermore, they often asked ambiguous questions, or questions that were unanswerable based on the material presented so far.

* Each week had a final quiz that you submitted answers for one time only. Then you waited until the deadline passed to find out if your answers were correct (and then you waited another day, because the site always went down on quiz submission day, and so they always extended the deadline by 24 hours). These quizzes were also ambiguous, which meant that if you wanted to get them correct you had to pester for clarifications (and sometimes for clarifications of the clarifications).

* This resulted in the feeling that the grading in the class was stochastic, and that your final score was more reflective of “can I guess what the quiz-writer really meant” than “did I really understand the material”. Although I didn’t particularly care about my grade in the class, I was still frustrated and disheartened by the feeling that the quizzes were more interested in *tricking* me than in helping me learn.

* What’s more, the quizzes often seemed to focus on what seemed to me tangential or inconsequential parts of the lesson, like making sure that I really, deeply understood step 3 of a 5-step process, but not whether I understood the other four steps or the process itself.

* The material also seemed very grab-bag, almost like an “artificial intelligence for non-majors” survey course.

* Anyway, partly on account of my finding the class frustrating, partly on account of time pressures, and partly because I didn’t feel like I was learning a whole lot, I dropped the ai-class after about four weeks.

* There were no programming assignments, but there was a midterm and a final exam, both after I quit the course. From what I could tell, they were longer versions of the quizzes, with the same problems of clarity and ambiguity. (I never unfollowed the @aiclass twitter, and during exam time it was a steady stream of clarifications and allowed assumptions.)

* Compared to the tightly-planned ml-class, the ai-class felt very haphazard. In addition, I found the ml-class platform more pleasant to use than the ai-class platform.

* I quit long before the last lecture, so I have no idea how heartfelt it was.


One thing about both classes: I *hate* lectures. I learn much better reading than I do being lectured at, and I found the lecture aspect of *both* classes frustrating. I have complained about this in many venues, but my prejudice is that if you’re using the internet to make me watch *lectures*, you’re not really reinventing education, because I still have to watch lectures, and I hate lectures. Did I mention that I hate lectures?

By way of comparison, I have also been doing CodeYear. It is currently below my level (I am plenty familiar with variables and if-then statements and for loops), but I don’t know much Javascript, and the current pace makes me hopeful that it will get interesting for me after another month or two.

If you don’t know that platform, it gives you a task (“create a variable called myName, assign your name to it, and print it to the console”) and a little code window to do it in. Then you click “run” and it runs and tells you if you got it right or not. There is a pre-canned hint for each problem.

What I really like about Codecademy is that I can do it at my own pace. The lessons are wildly variable in quality, but I’m glad not to have to sit through hours of lectures every week. They also do “badges”, which I find more satisfying than I wish I did. That said, I suspect someone with no experience debugging code would find the experience impenetrable and waste hours tracking down simple syntax errors, and indeed I saw a post on Hacker News to this effect a few weeks ago.

In the end, despite all this, the way I learn best is through a combination of reading books and writing actual code. I’ve had to learn F# over the last month, which I’ve done by reading a couple of (quite nice) books and writing a lot of actual code. It’s hard for me to imagine the course that would have done me any better (or any faster).

Similarly, if I wanted to learn Rails (which some days I think I do and other days I think I don’t), I have trouble imagining a course that would do better for me than just working through the Rails Tutorial (which I have skimmed, which has convinced me that I could learn well from it).

Similarly similarly, I suspect that the right Machine Learning book (and some quality time with e.g. Kaggle) would have been much more effective for me than the ml-class was. But if such a book exists, I haven’t found it yet.

Machine Learning Beverage

Although my formal training is in subjects like math and economics and animal husbandry, most of the money-work I do is in subjects like data science and fareology and writing over-the-top religious polemics. This is one of the reasons why I’m so sour on the value of college, as my multi-million-dollar investment in tuition and pitchers of Ice Dog beer and Tower Party t-shirts didn’t even provide me the opportunity to learn any of these.

I did get to take an “Artificial Intelligence” class. The only listed prerequisite was the “Intro to CS” class, but a brand new professor was teaching and she decided to make it a much more advanced class, and then I was going to partner with my friend who was a CS major so that he could handle all the more advanced programming aspects, but he dropped the class after a couple of weeks so he could spend his senior year focused on “not taking classes”, which meant that I got to spend my senior year focused on “learning enough about computer programming to not fail the class”, after which I picked up a bit of “how to sometimes beat the computer at tic-tac-toe” and “how to sometimes beat the computer at Reversi” and “how to narrowly avoid coming in last place in the classwide ‘Pac War’ tournament.”

Despite that initial setback, over the course of my career I’ve managed to learn bits and pieces of what’s variously called “machine learning”, “artificial intelligence”, or “guessing stuff”. I suspect I would be more popular at data mining parties if I had a smidge more training in these subjects, and so I was very excited at the prospect of Stanford’s free online Artificial Intelligence Class and Machine Learning Course, both of which are offered this fall. (There’s also a Database Class, but I know too much about databases already.)

You don’t get actual Stanford credit if you take the classes online, but I don’t particularly want Stanford credit, which means that’s not a deal-breaker. You get some sort of certificate signed by the professors listing your rank in the class, which will probably be somewhere in the millions thanks to all the Chinese students who will be cheating on their assignments, but I don’t particularly want a certificate either. I wouldn’t mind some sort of bumper sticker (“MY COMPUTER ALGORITHM IS SMARTER THAN YOUR HONOR STUDENT AND FURTHERMORE WON’T EVER BE UNEMPLOYED AND LIVING IN MY BASEMENT UNDER A CRIPPLING MOUNTAIN OF STUDENT-LOAN DEBT”), but that doesn’t seem to be part of the plan.

Most likely I won’t have enough time to devote to the classes anyway, what with work and training the baby to take over the world someday and trying to finish the novel about the boy who likes to play baseball but is no good at it. And this isn’t helped by the fact that both classes are going to have hours of online lectures that I’m going to have to sit through. Lectures!

I twittered the other day that if I have to sit through lectures then you’re not really transforming education. A lot of people (reasonably) interpreted this as a dig at the Khan Academy, but I was more angry at the Stanford CS department, which is tech-savvy enough to offer courses over the Internet to millions of cheating Chinese people and yet not tech-savvy enough to think of a better method of knowledge transmission than lectures with slides, which were invented by Moses or possibly even God, making them thousands of years old. I’m happy to take their quizzes and solve their problem sets and write their examinations, but the prospect of having to spend time listening to lectures is really glooming me down.

It’s not that I don’t appreciate what they’re doing, but if the Stanford Computer Science department really wants to revolutionize the educational process, they should figure out a way to upload information directly into my brain, or to embed it subliminally in Spider-Man cartoons, or to make it somehow drinkable. “Machine Learning Class” is the past; the future belongs to whoever first figures out “Machine Learning Beverage”!

Endogeneity, or “The Skill of the Brewmeister”

The latest OkCupid blog post is one of their more interesting ones:

No matter their gender or orientation, beer-lovers are 60% more likely to be okay with sleeping with someone they’ve just met. Sadly, this is the only question with a meaningful correlation for women.

Of course, once every dude starts asking this question on the first date and every girl figures out that her answer is a signal of how easy she is, the correlation will almost surely vanish. Even if a woman is willing to put out on the first date, that doesn’t mean she wants to advertise the fact early on. (It’s at least possible that I am out of touch with the youth of America and am wrong about this.)

Accordingly, I predict a brief surge in beer-lover questions (and guys trying to bring their first dates to Bierstubes, Bierhausen, Bierhalls, and the like), followed by the evolution of non-committal, correlation-breaking answers to “do you like the taste of beer?” like

  • Only if it comes from a big keg,
  • It depends on the skill of the brewmeister, and
  • I do like the taste of beer, but not on the first date.

If OkCupid were evil (which they probably are, now that they’re a division of IAC), they’d instead sell a limited number of subscriptions to this sort of information, so that dudes could use these questions without having to worry that the dating pool had been overfished.

If they were really evil (which they probably are, now that they’re a division of IAC), they’d report false correlations and then laugh at people who tried to put them into practice. However, I assume that if they’d done this they would have chosen a funnier question than “Do you like the taste of beer?”

I’ll explore this further in my next post, “The Only Question That Correlates With Whether Women Put Out Is ‘Did you ever find Bugs Bunny attractive when he put on a dress and played girl bunny?’”

Sam Harris is Nonsensical in Principle

In my younger days, when I was full of libertarian bluster, I used to formulate arguments in terms of “Natural Rights.” Murder was Wrong (with a capital ‘W’) because it violated your “right to life.” I used to go on like this all day, until finally my friend Cesar (I think) kindly pointed out that I was full of shit.

I’m still full of libertarian bluster, I suppose, although you’d never in a million years catch me arguing based on “natural rights,” which (after my youthful indiscretions) I came to realize represent either religious (“they’re the rights god gave us”) or pseudo-religious (“they’re self-evident!”) attempts to create an “objective” basis for one’s policy preferences. (As a general rule, if most people refuse to agree with a proposition even after you’ve made your best case for it, it’s not “self-evident.”)

There’s no shortage of people who want an “objective” basis for their policy preferences. It turns them from opinions (e.g. “it’s my opinion that we should pay teachers more”) or hypothetical imperatives (e.g. “if we want to make teaching a more attractive profession, we should pay teachers more”) or self-interest (e.g. “speaking as a teacher, we should pay teachers more”) into “facts” (e.g. “it’s a fact that we should pay teachers more”) and “morals” (e.g. “if you don’t think we should pay teachers more, you’re a moral reprobate”). You can argue against opinions, but you can’t argue against facts! You can rail against self-interest, but not against morals!

It’s a nice sleight of hand when you can pull it off. Unfortunately, you usually can’t. Neither can Sam Harris, who has a new book out claiming that “science has a universal moral code.”

The book’s not quite out yet, but he’s posted an excerpt online.

Since it’s Sam Harris, his purpose is of course to debunk one of the arguments for god:

The defense one most often hears for belief in God is not that there is compelling evidence for His existence, but that faith in Him is the only reliable source of meaning and moral guidance.

It’s true that this is sometimes offered (I wouldn’t say “most often”) as a defense of religion. In my own book it’s addressed in chapter 85: “But without religion…”

Some people argue that religion is a necessary source of morality, and that if people all realized their religions were false, they would no longer have any incentives to fly airplanes into skyscrapers, to chop off the tips of their babies’ penises, to restrict poor people’s access to contraception, to censor cartoons, to make it difficult to purchase liquor on Sundays, to stone homosexuals, or to murder apostates and heathens. Society, they argue, would subsequently break down.

Of course, the sensible person’s rejoinder to this is that the truth of a belief is independent of its consequences. And anyway we don’t need an absolute morality; we just need a set of rules to help us get along. We don’t want to be murdered, and we don’t want our friends and neighbors to be murdered, so we outlaw murder and we punish murderers. We (ideally) enact new rules if they seem necessary and (ideally) repeal them if they seem counterproductive. There’s no need to embarrass ourselves by dragging philosophy into things and trying to make metaphysical statements about murder in the abstract.

It’s disappointing, then, that Harris’s klugy response is to advocate for a “science” of morality. Science is (among other things) a spirit of inquiry. If you’re serious about using science to solve a problem, then you’re committing to accepting science’s answer whatever it turns out to be.

If you believe that science can make a statement that (say) child abuse is wrong in some absolute sense, then you’re tacitly accepting that new evidence might reveal that child abuse is not actually wrong. If you’re not open to that possibility, then you’re not doing science. You can call it science, but it’s not science.

Maybe he doesn’t care. (Or maybe he’s open to the possibility that child abuse might be “moral,” but I’ll give him the benefit of the doubt.) Maybe he’s only interested in the name:

But whether morality becomes a proper branch of science is not really the point. Is economics a true science yet?

I’m not sure whether economics is a “true science.” But most honest economists are forthright that the scientific part of what they do is only the positive part. Economics can tell you which allocation rules satisfy certain “fairness” criteria. But it can’t tell you which criteria are the correct ones.

Look, I think murder is awful. But I don’t pretend that this is some sort of scientific judgment. It’s my opinion, and luckily most everyone else agrees with me.

You know what else is awful? Putting people in jail because they like to use drugs. It’s wicked, it’s evil, it’s barbaric, it’s disgusting, it’s shameful, it’s every bad adjective you could apply to it. This is as plainly obvious to me as is my feeling that murder is awful. And I’m not just talking marijuana. I’m talking heroin, cocaine, opium, you name it. Somehow, though, most people disagree with me. Most scientific people disagree with me. Of course, I’m right and they’re wrong. But science is powerless to settle this dispute. Science tells you what drugs do and what happens when you mix them and how to get a better high. Science tells you the likely consequences of your policy of throwing drug users in prison. But science doesn’t tell you whether it’s evil to throw drug users in prison. Science can’t tell you whether it’s evil to throw drug users in prison. Science can’t tell you how to find “peaks” on a “moral landscape” because there’s no such thing as a “moral landscape.”

This isn’t a problem unless you decide to start worrying about the bizarrely abstract:

Imagine how terrifying it would be if great numbers of smart people became convinced that all efforts to prevent a global financial catastrophe, being mere products of culture, must be either equally valid or equally nonsensical in principle. And yet this is precisely where most intellectuals stand on the most important questions in human life.

I’ve puzzled over this section for a long time, and I can’t make the slightest sense of it. Whether you’re evaluating “efforts to prevent a catastrophe” or “the most important questions in human life,” you ought to be concerned with practical things like “whether they’ll work” and “what side effects they’ll have” and “how much they’ll cost.” Worrying which ones are “valid in principle” (whatever on earth that means) is a perfect way to waste time and not solve anything. Not unlike the new Harris book, I suspect.