Vegas with a Lap Infant

Madeline is about to turn two, which is the magical age at which kids transition from fly-for-free lap infants to requires-a-ticket-and-some-sort-of-kid-specific-restraint-and-did-I-mention-a-ticket seat toddlers. Which meant we needed to squeeze in one last vacation. And since Seattle weather kind of sucks, we wanted to go somewhere where the weather was nice. And since flying with a lap infant also kind of sucks, we wanted to go somewhere that wasn’t too far away. Hence Vegas.

You might think Vegas an unorthodox place to take a two-year-old. Now that I’ve finally been here, I’m inclined to agree with you. Nonetheless, with a few caveats, Vegas is an awesome place to bring a lap infant.

1. You have to like to walk

Really, you have to like to walk. I forgot to own a pedometer, but based on the amount of grime that has accumulated on my shoes and a fairly elaborate spreadsheet, I estimate that we’ve been walking somewhere between 3 and 5 miles a day. Generally speaking, we are not stroller people, we are “let Madeline walk when she wants to, and carry her the rest of the time” people. This works fine when you walk about a mile a day. This does not work fine when you walk five miles, and our first day here ended with severe backaches.

Naturally, we didn’t even bring a stroller, so on the second day I hoofed it another 1.5 miles to the nearest Target and bought their cheapest $20 stroller, which was pink. (Then I took a bus back and got yelled at for trying to bring a coffee on the bus, where do you think you are, Seattle, and got chatted up by a junkie who assured me that if he had kids he never would have started using.) Being a $20 stroller, it is a complete piece of junk, and so of course Madeline has grown completely attached to it, has named it (“Pink”, imaginatively), and will probably cry when I throw it into the dumpster behind the hotel at check out, as is my plan.

Anyway, just about everywhere on the Strip is at least a 30-minute walk from anywhere else on the Strip. There’s kind of no way around this. Say you want to support your Wazzou Cougs, who are playing basketball in the Pac-12 Tournament, which — in order to show that gambling on college sports is in no way acceptable — is being held at the MGM Grand. Aha, you think, to make things convenient I’ll just stay at the MGM Grand myself. What you failed to account for is that the MGM Grand is itself a 30-minute walk from the MGM Grand, past a Rainforest Cafe, several Joël Robuchon Ateliers, and about a gazillion slot machines with Gen-X enticing themes like “Ghostbusters” and “Ghostbusters II” and “On Our Own (Theme from Ghostbusters II)”.

Additionally, in most non-Strip parts of the world, if you can see something it is generally close by. However in Las Vegas all of the hotels are built at grotesquely unintuitive scale, so that if you can see (say) the Bellagio then it’s likely (but not certain) that you could probably walk there in less than an hour, although your walk — despite both starting and ending at street level — will involve a bewildering variety of elevation changes, most of which involve escalators that you will get yelled at by security for bringing a stroller on, requiring you to ride a bewildering variety of foul-smelling elevators with a bewildering variety of obese people riding a bewildering variety of rented mobility scooters.

2. You have to like to eat

Lap infants are not allowed to gamble, are not allowed near gambling, not even if you just want to sit in the Rockin’ Sensory Immersion Surround Sound Gaming Chair of the KISS slot machine one more time so that you can “UNLOCK THE STARCHILD”. Lap infants are not allowed to see PEEPSHOW, featuring Coco of E!’s “Ice Loves Coco”. Lap infants are not allowed into the bar at Cabo Wabo, Coyote Ugly, or the Tabú Ultra Lounge.

They are, however, allowed into buffets, which all have a “kids 3 and under eat free” policy, which makes them good places for your lap infant to practice eating with utensils, since even if she drops every spoonful of creme brulee on the floor or her lap you can just grab a few more ramekins and try again, and even if she pukes up an entire cheese omelet you can just get another one.

Suffice it to say that we ate at a lot of buffets in Las Vegas. Here is how I would rank them:

1. The Bacchanal Buffet at Caesars Palace
2. The Wicked Spoon Buffet at the Cosmopolitan
3. just about every other buffet in Las Vegas
4. Le Buffet aux Paris Las Vegas

Supposedly there are also non-buffet places to eat in Vegas, many of them named after chefs who have appeared on television programs and/or have French-sounding names. I wouldn’t know anything about those.

3. You have to like to spend money

Vegas is not cheap. Sure, you could stay at Terrible’s, where I think they actually pay you to sleep and eat, and where the $9.99 Sunday Champagne Buffet Brunch is deservedly legendary. But it is a long, long walk from the strip, past a variety of foul-smelling homeless people, and past the same three HOT ASS ESCORTS advertisement dispensers over and over and over again. (Also, the hipsters at Yelp are kind of down on the place.)

However, if you want to stay and eat at one of the casinos named after birds, or dead people, or capitals of France, it’s going to cost you. If you want to eat at one of the buffets where “angry” describes the mac and cheese and not the service, it’s going to cost you. If you want your frozen sex-on-the-moon grape-raspberry daiquiri in the 32-ounce souvenir neck-lanyard yard-tube container, it’s going to cost you. And then you look back and realize that all the money you saved not buying the baby a plane ticket you spent on a dessert named after Emeril Lagasse and on getting your picture taken with a weirdo dressed like SpongeBob SquarePants dressed like a showgirl.

4. You have to like kid-friendly activities

Surprisingly, there are a few kid-friendly activities in Vegas. Lap infants are kind of at that sweet spot where they like to look at flashing lights and captive flamingos and garish costumes, but where they are too young to ask awkward questions like “Daddy, what’s a ‘hot ass escort’?” and “Daddy, isn’t it cruel to clip flamingos’ wings and put them on display for a bunch of drunken gamblers?” and “Daddy, what does ‘Cabo Wabo’ mean?” at which point you have to have the talk about the unlistenable “Van Hagar” years.

The Circus Circus (“What kind of circus?” “A circus circus!”) has an “AdventureDome” that contains three rides suitable for lap infants (who ride free as long as their parent buys a $5 ticket), one of which is a terrifying school-bus-themed ride which helps prepare lap infants for their mind-numbing trips through the public education system.

The Mandalay Bay (“What kind of bay?” “A Mandalay bay!”) has a “Shark Reef” that is not actually a reef (due to aquarium acidification, I suppose) but does have a handful of sharks and a manta ray petting zoo that’s surprisingly fun to frighten lap infants with.

The Excalibur (“What kind of caliber?”) has a “Tournament of Kings”, which involves horses and swords and broasted chicken and pyrotechnics and a mediocre A/V system that makes it impossible to understand whether Merlin the Wizard is telling you that you’re supposed to tip your servers or that you’re not supposed to tip your servers.

The Bellagio has a pretty incredible fountain show where they play Lee Greenwood and shoot water around in patriotic patterns, and the Mirage has a pretty incredible volcano show, which is fun to explain to your lap infant as a manifestation of the gods’ anger, which can only be assuaged by throwing a lap infant into the volcano.

If your lap infant has reached the age of obsession with choo choo trains, then you can spend the day riding the Las Vegas Monorail (after a bewildering trek through one of the casinos using a bewildering variety of elevators to reach one of the stations), where she can happily yell out “choo choo train!” over and over again all the while watching a bunch of drunk bros putting their lamest moves on a group of amateurishly-tattooed girls from Canada (“whoa, you’re from Canada, that’s so awesome, eh!”).

There is also a supposedly-family-friendly “Tribute to Red Skelton” show, which Madeline refused to see for political reasons.

All that said, bringing a lap infant also means you can’t eat at one of the Joël Robuchon Ateliers or see the “Steve-O and Tom Green Stand-Up Comedy Extravaganza” or slap Kathy Griffin, not unless you’re willing to pawn your father’s watch in order to afford the services of a Vegas Babysitter, who is sort of like a nanny except infinitely more expensive. (And you would have already had to pawn your father’s watch in order to put a deposit down on your Joël Robuchon meal anyway.)

In conclusion, Vegas is sort of like Disneyland for lap infants, except

(a) Vegas is cheaper
(b) Vegas is more fun
(c) the Mickey Mouse impersonators in Vegas have crappier costumes
(d) Vegas is marginally less evil

Highly recommend!

Secrets of Fire Truck Society

Hi, I gave a talk at Ignite Strata on “Secrets of Fire Truck Society” and at the end I promised that for more information you could visit this blog. Unfortunately, I haven’t had time to write a blog post. Here are some links to tide you over until I do:

On On Leaving Academia

Several people in my influencesphere have linked to this essay by a CS prof who’s leaving academia to join Google in order to “make a positive difference in the world.” I am, of course, wholly supportive of such a program, if not of his precise rationale, which is a mish-mash of ranting about wicked Republicans and wild-eyed idealism about the Academy.

What interests me most about his essay is the section entitled “Mass Production Of Education”, which is misguided in all the ways you’d expect from someone steeped in the culture of “bespoke” education. It lists three “worries”:

First, I worry that mass-production here will have the same effect that it has had on manufacturing for over two centuries: administrators and regents, eager to save money, will push for ever larger remote classes and fewer faculty to teach them.

Said differently, technologies that allow fewer faculty to teach the same number of students will allow universities to operate with fewer faculty. Let’s call this worry “Luddism”. I love a good loom-smashing as much as the next guy, but it’s sort of hard to take seriously a preference for the 19th-century manufacturing regime.

It seems likely that in a hundred years our grandchildren and those of us who’ve successfully been cryonically revived will share a laugh about how “education” used to involve crowding people into a room and making them sit still while someone stood up front and lectured at them. And then someone will brain-cast a ludicrous hyper-essay about how 4-D printing is democratizing the singularity, pining for the good old days of 3-D printing. And so on.

Second, I suspect that the “winners win” cycle will distort academia the same way that it has industry and society. When freed of constraints of distance and tuition, why wouldn’t every student choose a Stanford or MIT education over, say, UNM?

Said differently (and with apologies to UNM, which I’m sure is a fine school), if every student has access to cheap, high-quality education, few of them will choose to pursue a low-quality education. It is easy to see how purveyors of low-quality education might worry about this, but it’s hard to imagine why anyone else should.

Are we approaching a day in which there is only one professor of computer science for the whole US?

Seems pretty unlikely, but if we were that would be awesome because it would free up all the other computer science Ph.D.s, many of whom are brilliant, to do other stuff (like building Groupon and Pinterest clones)! This would be sad for the ones who really, really, really want to be teachers, but on balance it would be a huge win for the world.

Third, and finally, this trend threatens to kill some of what is most valuable about the academic experience, to both students and teachers. At the most fundamental level, education happens between individuals — a personal connection, however long or short, between mentor and student.

I have no idea how to say this differently, so I won’t try. Having been a teacher, I agree that the most rewarding moments happened between individuals. (Particularly when one of the individuals was the cute goth freshman girl who aced all the quizzes but still came to office hours.) Were those the most valuable parts of the teaching experience? Less clear. What’s more clear is that what was/is most valuable about my experience as a student was/is learning stuff. And these days most of what I know that’s useful I’ve learned from books or doing or even Coursera, not from the academy. I’ve broadened my horizons by pleasure reading, by arguing on LiveJournal, by discussions with peers on geek hikes far more than I ever did through school. With very few exceptions, my most profound intellectual connections have been with people I met outside of the school system.

It resonates at levels far deeper than the mere conveyance of information — it teaches us how to be social together and sets role models of what it is to perform in a field, to think rigorously, to be professional, and to be intellectually mature.

I suspect you have to have spent your whole life in academia to seriously assert that “the human connection in education” is the only path to these things, or even the easiest path to these things. College taught me how to play the same juvenile bulshytt status games we played in high school but at a slightly higher level. College professors were (sometimes) great role models for how to behave if you ever became a college professor, but not for much else. The levels of professionality and intellectual maturity I experienced in the academy were certainly no greater than I’ve experienced in the real world. I will freely admit to learning rigor (some would say too much rigor) while studying mathematics, which primed me to recognize the lack of rigor in so many other fields.

I am terribly afraid that our efforts to democratize the process will kill this human connection and sterilize one of the most joyful facets of this thousand-year-old institution.

Said differently, “we fear change”. Hopefully at Google he’ll learn to stop saying “democratize”, and maybe he’ll even meet a Republican or two. There must be one or two Republicans at Google, right?

The Hardest Job There Is

One summer during college I was stringing together temp jobs in order to make money so that I could afford to go out with my friends at night and play “Star Trek” pinball. (I would have preferred, of course, to spend my summer developing my idea for a “group couponing” website, but as the summer in question predated widespread adoption of the Internet, the decision was out of my hands.)

These were super-boring temp jobs, involving things like data-entering anonymous “secret shopper” surveys for Jersey Subs, filing papers alphabetically, and going through medical bills with a red pen to make sure that the prices didn’t exceed prescribed rates. (The last was the worst, as their computer system ran on OS/2, which some genius decided should have chess rather than Minesweeper, which made it very difficult to blow off steam after decimating a particularly tough bill, which is why I originally took up amphetamines.)

At some point the temp work simply dried up, possibly because there were no more medical bills, possibly because no one was willing to eat at Jersey Subs anymore, possibly because of the amphetamines. And so my dad arranged it that I could work for a friend of his who owned a warehouse of surplus metal parts.

What were these metal parts? I have no idea. They were large and heavy and in bins on pallets, and it’s possible they were used to repair trains, or in air conditioning, or as weapons. They came in various shapes and sizes and weights (heavy *and* very heavy), and every day orders would pour into the warehouse that some company wanted 137 of the metal pieces from bin A17. My job, then, was to retrieve bin A17 (which involved a forklift, which was sort of cool, except that I never got the hang of rear-wheel steering and always ended up crashing into things) and get an empty pallet and then manually choose 137 of the least-rusty metal pieces from bin A17 and pile them onto the empty pallet, all the while counting (and then double-counting) to make sure that there were indeed exactly 137 of them. Then I’d put the bin back and move on to the next order of 94 metal pieces from bin C29, and so on, and so forth.

(To this day, it is tough for me to imagine a job that is a worse mismatch for my aptitudes and preferences, except possibly for building model histories of men’s shoes.)

At the end of each day I would collect my pay (which was itself in non-descript metal pieces) and go home and take painkillers and try to scrub all the fine metal grit off my skin and try to cough all the fine metal grit out of my lungs and then cry myself to sleep and have nightmares about counting metal pieces. All of which, quite obviously, left no time for “Star Trek” pinball.

And so after a week, over the vociferous objections of my parents, who insisted that the metal pieces I was earning were likely to represent the difference between success and failure in life, I quit. Accordingly, I have blamed the various subsequent failures in my life on the metal pieces that never were.

So it stood until this week, when Hilary Rosen (who, for reasons inexplicable to me, is still allowed to show her face in public after her stint running the RIAA) made some crack disparaging Mitt Romney’s wife for being a stay-at-home mom. Tactically this was moronic, as everyone knows plenty of admirable stay-at-home moms, and also everyone knows that the most fruitful line of attack on Mitt Romney’s wife is that she married Mitt Romney, and let’s see how her “the angel Moroni pointed a shotgun at us and said we had to” excuse plays in the court of public opinion.

Which means that everyone and his brother is rushing to throw Hilary Rosen under one of a variety of buses. Bill Donohue, for instance, wants to throw her under some sort of “lesbian parent” bus, which I’m pretty sure runs on biodiesel, and I would love to throw her under the “she ran the RIAA, which means that nothing she says should ever be listened to by anyone ever” bus, but most people are focusing on the old “parenting is the hardest job there is!” bus.

It turns out, though, that I’m a parent, and so I happen to know that PARENTING IS NOT EVEN CLOSE TO THE HARDEST JOB THERE IS. Metal piece warehouse was a harder job. Burger King was a harder job. Even MATH FREAKING GRAD SCHOOL was a harder job. (As some versions of the bus insist that only mothering is the hardest job, I double-checked with Ganga, and she agrees with my analysis.)

That’s not to say that parenting isn’t work. It is, and occasionally it’s even very unpleasant work, like when it’s 3am and the baby won’t sleep and will scream if you don’t rock her, and you still haven’t prepared your slides for your 8am meeting with Hilary Rosen to present your new plan for permanently ruining the lives of music-downloading teenagers, and all you want to do is sleep and use your dreams to figure out a way to pretend like you care about “artists”. Or when she poops on you. (The baby, not Hilary Rosen, although that also sucks.) Or when you’re trying to write a blog post making fun of Hilary Rosen and the baby won’t stop screaming in your ear and banging on your keywinevsoivdkdsvl

But parenting is also a lot of fun. It’s a huge joy when you finally teach your kid how to Chicken Dance, or when she learns to swear, or the first time she asks you “please can you read me one more chapter before bed, daddy?” of Atlas Shrugged. No metal part ever even asked me about The Fountainhead!

I recognize that it’s uncharacteristic of me to stake out the middle ground like this, but I guess having a kid has been a deeply moderating influence and has taught me the value of compromise. So can’t we all just agree that parenting is nowhere near as hard as sorting and lifting and counting metal parts, that Hilary Rosen has no place in polite society, and that babies love Atlas Shrugged?

Why Have You Not Signed Up For BIL Already?

I’m sure you’ve heard of TED, which is a really expensive, really exclusive annual conference at which famous and/or accomplished people give lectures to wealthy and/or lucky people. Surprisingly, despite my fame, accomplishments, wealth, and luck, I have never been invited to attend or lecture. (Actually, it’s not that surprising, given that they once gave their TED Prize to Karen Armstrong, my mortal enemy, and that they seem to like Nathan Myhrvold, my other mortal enemy.1)

Luckily for me, there is a non-union, Mexican equivalent (er, an open-source equivalent), the BIL conference, which costs only $50, and which is open to pretty much everyone. Three years ago they were kind enough to let me give my “Your Religion Is False” talk, and then two years ago they didn’t firm up the date until it was too late for me to make travel plans, and then last year they let me give my lukewarmly-received “How To Be Funny” talk.

This year I plan to outdo them all with my balanced discussion of intellectual property: “Hitler Loved Patents”. Although I have spent the majority of the past 10 years arguing on the Internet about intellectual property with various weirdos and libertarians and weirdo libertarians and libertarian weirdos, it has only recently become acceptable to express my views in public. And what better way than through a profanity-laden speed-talking PowerPoint presentation?

There will, of course, be a large number of other talks, many of which will be almost as entertaining and/or compelling as mine. There will also be, I’m told, a “sex-positive boiler room”2 and some sort of lockpicking workshop, one or both of which certainly addresses your hesitations about attending.

If it’s anything like last year, there will also be interesting breaks between sessions, where BILders socialize and where crazy people grab the empty mics and perform spoken-word-poetry-ish rants about free energy and capitalism, all the while people chuckle nervously and wonder whether this is a scheduled part of the performance or simply the result of too little security. There might be coffee too.

There will certainly be a huge assortment of burners, transhumanists, futurists, cryonicists, libertarians, anti-libertarians, polyamorists, monoamorists3, objectivists, subjectivists, artists, crossfitters, politicians, entertainers, hosts of invention-related television shows, hackers, humorists, Paul Grasshoffs, atheists, and doers and makers of all types. Many of them are my good friends, and many more will be by the time the weekend is over. (Also, many of them will be my enemies by the end of the conference, since you can’t exactly tell people that the industry they’ve dreamed of working in their whole lives is morally on par with the death gulags without alienating a few folks, but such is the price of progress.)

In addition, the whole event takes place on a boat, which has some sort of giggly significance that is lost on me but probably has something to do with some creepy anime that everyone except me downloads and watches illegally.

Anyway, Long Beach really isn’t that far from wherever you are, and $50 is less money than you’d spend buying a dozen Original Six Dollar Burger®s at Carl’s Junior, so why have you not signed up already? And in the event you need burgers that badly, Simone gave me this code for 20% off the registration, which will save you $10, which means you’ll still be able to buy two of those tasty, tasty Original Six Dollar Burger®s4 and have the conference weekend of your life.

So I guess I’m not really sure what your objection is at this point. Sometimes I hear “Joel, you’re biased because the whole event is organized and produced by your friends,” and sometimes I hear “Joel, surely you’re on the take from the Long Beach Convention and Visitors Bureau and/or Carl’s Jr.,” and still other sometimes I hear “Joel, you recommended that I attend the Libertarian National Convention in Anaheim in 2000, and that really sucked,” to which I can only respond, “were you at the same Libertarian Convention I was at, because I guarantee you that that was the most fun that anyone’s ever had in Anaheim in the history of mankind.”

So can you just go ahead and sign up already?

1. I’m only ten and I already got two mortal enemies.
2. No, I have no idea what this is either, although I suspect it has something to do with high-pressure stock trading.
3. Monoamorists. It’s a word. Look it up.
4. Six dollars is what you put on your tax return, but the cash price is closer to $4.

Hacking Hacker News

Hacker News, if you don’t know it, is an aggregator / forum attached to Y Combinator. People submit links to news stories and blog posts, questions, examples, and so on. Other people vote them up or down, and still other people argue about them in the comments sections.

If you have unlimited time on your hands, it’s an excellent firehose for things related to hacking. If your time is more limited, it’s more challenging. People submit hundreds of stories every day, and even if you only pay attention to the ones that get enough votes to make it to the homepage, it’s still overwhelming to keep up.

What’s more, a lot of the stories are about topics that are boring, like OSX and iPads and group couponing. So for some time I’ve been thinking that what Hacker News really needs is some sort of filter for “only show me stories that Joel would find interesting”. Unfortunately, it has no such filter. So last weekend I decided I would try to build one.

Step 1 : Design

To keep things simple, I made a couple of design decisions.

First, I was only going to take into account static features of the stories. That meant I could consider their title, and their url, and who submitted them, but not how many comments they had or how many votes they had, since those would depend on when they were scraped.

In some ways this was a severe limitation, since HN itself uses the votes to decide which stories to show people. On the other hand, the whole point of the project was that “what Joel likes” and “what the HN community likes” are completely different things.

Second, I decided that I wasn’t going to follow the links to collect data. This would make the data collection easier, but the predicting harder, since the titles aren’t always indicative of what’s behind them.

So basically I would use the story title, the URL it linked to, and the submitter’s username. My goal was just to classify the story as interesting-to-Joel or not, which meant the simplest approach was probably to use a naive Bayes classifier, so that’s what I did.

Step 2 : Acquire Computing Resources

I have an AWS account, but for various reasons I find it kind of irritating. I’d heard some good things about Rackspace Cloud Hosting, so I signed up and launched one of their low-end $10/month virtual servers with (for no particular reason) Debian 6.0.

I also installed a recent Ruby (which is these days my preferred language for building things quickly) and mongoDB, which I’d been meaning to learn for a while.

Step 3 : Collect Data

First I needed some history. A site called Hacker News Daily archives the top 10 stories each day going back a couple of years, and it was pretty simple to write a script to download them all and stick them in the database.

Then I needed to collect the new stories going forward. At first I tried scraping them off the Hacker News “newest” page, but very quickly they blocked my scraping (which I didn’t think was particularly excessive). Googling this problem, I found the unofficial Hacker News API, which is totally cool with me scraping it, which I do once an hour. (Unfortunately, it seems to go down several times a day, but what can you do?)

Step 4 : Judging Stories

Now I’ve got an ever-growing database of stories. To build a model that classifies them, I need some training data with stories that are labeled interesting-to-Joel or not. So I wrote a script that pulls all the unlabeled stories from the database, shows them to me one at a time, asks whether I’d like to click on each story or not, and then saves that judgment back to the database.

At first I was judging them most-recent-first, but then I realized I was biasing my training set toward SOPA and PIPA, and so I changed it to judge them in random order.
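For the curious, here is a rough sketch of what such a judging loop might look like. It is not the actual script: the "title" and "liked" keys are made up for illustration, the stories live in a plain in-memory array rather than in mongoDB, and the input and output streams are passed in so the loop can be exercised without a terminal.

```ruby
# Sketch only. Each story is a hash with hypothetical keys "title" and
# "liked"; "liked" is nil until the story has been judged.
def judge_stories(stories, input: $stdin, output: $stdout, rng: Random.new)
  unlabeled = stories.select { |story| story["liked"].nil? }

  # Shuffle so the judgments are not biased toward the most recent news.
  unlabeled.shuffle(random: rng).each do |story|
    output.puts story["title"]
    output.print "Would you click on this? [y/n/q] "
    answer = input.gets.to_s.strip.downcase

    # "q", a blank line, or end of input stops the session.
    break if answer == "q" || answer.empty?

    # Anything other than "y" counts as a "no".
    story["liked"] = (answer == "y")
  end

  stories
end
```

A real version would write each judgment back to the database as it goes, so that quitting halfway through doesn't lose anything.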

Step 5 : Turning Stories into Features

The naive Bayes model constructs probabilities based on features of the stories. This means we need to turn stories into features. I didn’t spend too much time on this, but I included the following features:

* contains_{word}
* contains_{bigram}
* domain_{domain of url}
* user_{username}
* domain_contains_user (a crude measure of someone submitting his own site)
* is_pdf (generally I don’t want to click on these links)
* is_question
* title_has_dollar_amount
* title_has_number_of_years
* title_references_specific_YC_class (e.g. “(YC W12) seeks blah blah”)
* title_is_in_quotes

For the words and bigrams, I removed a short list of stopwords, and I ran them all through a Porter stemmer. The others are all pretty self-explanatory.
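Here is a minimal sketch of how a story might be turned into those features. Treat it as an illustration rather than the real code: the stopword list and every regular expression are my guesses at reasonable definitions, the function name is made up, and the Porter stemming step is left out so that the example needs nothing beyond the standard library.

```ruby
require "set"
require "uri"

# Hypothetical stopword list; the real one is not given in this post.
STOPWORDS = Set.new(%w[a an and for in of on the to with]).freeze

# Returns a Set of feature names for one story.
# Stemming is deliberately omitted in this sketch.
def story_features(title, url, user)
  features = Set.new

  words = title.downcase.scan(/[a-z0-9']+/).reject { |w| STOPWORDS.include?(w) }
  words.each { |w| features << "contains_#{w}" }
  words.each_cons(2) { |a, b| features << "contains_#{a}_#{b}" }

  host = begin
    URI.parse(url).host
  rescue URI::InvalidURIError
    nil
  end
  domain = host.to_s.downcase.sub(/\Awww\./, "")
  features << "domain_#{domain}" unless domain.empty?

  features << "user_#{user}"
  if !domain.empty? && !user.empty? && domain.include?(user.downcase)
    features << "domain_contains_user"
  end

  # The patterns below are assumptions about what each feature means.
  features << "is_pdf" if url.downcase.end_with?(".pdf")
  features << "is_question" if title.strip.end_with?("?")
  features << "title_has_dollar_amount" if title.match?(/\$\d/)
  features << "title_has_number_of_years" if title.match?(/\b\d+\s+years?\b/i)
  features << "title_references_specific_YC_class" if title.match?(/\(YC [WS]\d{2}\)/)
  features << "title_is_in_quotes" if title.strip.match?(/\A["“].*["”]\z/)

  features
end
```

Using a Set means a word that shows up twice in a title still counts as one feature, which matches the "contains" naming.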

Step 6 : Training a Model

This part is surprisingly simple:

* Get all the judged stories from the database.
* Split them into a training set and a test set. (I’m using an 80/20 split.)
* Compute all the features of the stories in the training set, and for each feature count (# of occurrences in liked stories) and (# of occurrences in disliked stories).
* Throw out all features that don’t occur at least 3 times in the dataset.
* Smooth each remaining feature by adding an extra 2 likes and an extra 2 dislikes. (2 is on the large side for smoothing, but we have a pretty small dataset.)
* That’s it. We YAML-ize the feature counts and save them to a file.
* For good measure, we use the model to classify the held-out test data, and plot a Precision-Recall curve
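The counting, pruning, and smoothing steps above can be sketched in a few lines. This is an approximation under stated assumptions, not the real training script: the function names and the layout of the saved hash are made up, the per-class story totals are stored because a classifier will need them later (the post only mentions feature counts), and the 3-occurrence cutoff is applied to the training set.

```ruby
require "yaml"

MIN_OCCURRENCES = 3 # the cutoff described above
SMOOTHING = 2       # extra likes and dislikes added to each kept feature

# stories: array of [feature_list, liked] pairs, liked being true or false.
# Returns [training_set, test_set]; the fixed seed is only for repeatability.
def split_stories(stories, train_fraction: 0.8, seed: 42)
  shuffled = stories.shuffle(random: Random.new(seed))
  cut = (shuffled.size * train_fraction).round
  [shuffled.first(cut), shuffled.drop(cut)]
end

def train(training_set)
  counts = {}
  training_set.each do |features, liked|
    features.uniq.each do |feature|
      counts[feature] ||= { "likes" => 0, "dislikes" => 0 }
      counts[feature][liked ? "likes" : "dislikes"] += 1
    end
  end

  # Drop rare features, then smooth the ones that remain.
  kept = counts.select { |_, c| c["likes"] + c["dislikes"] >= MIN_OCCURRENCES }
  smoothed = kept.transform_values do |c|
    { "likes" => c["likes"] + SMOOTHING, "dislikes" => c["dislikes"] + SMOOTHING }
  end

  {
    "total_likes" => training_set.count { |_, liked| liked },
    "total_dislikes" => training_set.count { |_, liked| !liked },
    "features" => smoothed
  }
end

# YAML-ize the counts and save them to a file.
def save_model(model, path)
  File.write(path, model.to_yaml)
end
```

Sticking to string keys and integers keeps the YAML file readable by hand, which is handy when you want to see which features the model thinks you love or hate.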

Step 7 : Classifying the Data

A naive Bayes classifier is fast, so it only takes a few seconds to generate and save interesting-to-Joel probabilities for all the stories in the database.
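As a sketch of what the scoring might look like: given smoothed like/dislike counts per feature plus the number of liked and disliked training stories, add up log-odds and squash the result into a probability. The model layout and function name here are hypothetical, the denominators assume the +2 smoothing described earlier, and only features that are present in the story contribute (a common simplification, not necessarily what the real code does).

```ruby
SMOOTHING = 2 # assumed to match the smoothing used when the counts were saved

# model: { "total_likes" => Integer, "total_dislikes" => Integer,
#          "features" => { name => { "likes" => n, "dislikes" => m } } }
# The per-feature counts are assumed to be already smoothed, and both
# classes are assumed to contain at least one training story.
def interesting_probability(model, story_features)
  likes = model["total_likes"].to_f
  dislikes = model["total_dislikes"].to_f

  # Start from the prior odds of liking a story at all.
  log_odds = Math.log(likes / dislikes)

  story_features.uniq.each do |feature|
    counts = model["features"][feature]
    next unless counts # pruned or never seen during training

    # Smoothed estimates of P(feature | liked) and P(feature | disliked).
    p_given_like = counts["likes"] / (likes + 2 * SMOOTHING)
    p_given_dislike = counts["dislikes"] / (dislikes + 2 * SMOOTHING)
    log_odds += Math.log(p_given_like / p_given_dislike)
  end

  # Convert log-odds back into a probability between 0 and 1.
  1.0 / (1.0 + Math.exp(-log_odds))
end
```

Working in log-odds avoids multiplying lots of tiny probabilities together, and a story with no known features simply falls back to the prior.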

Step 8 : Publishing the Data

This should have been the easiest step, but it caused me a surprising amount of grief. First I had to decide between

* publish every story, accompanied by its probability; or
* publish only stories that met some threshold

In the long term I’d prefer the second, but while I’m getting things to work the first seems preferable.

My first attempt involved setting up a Twitter feed and using the Twitter Ruby gem to publish the new stories to it as I scored them. This worked, but it wasn’t a pleasant way to consume them, and anyway it quickly ran afoul of Twitter’s rate limits.

I decided a blog of batched stories would be better, and so then I spent several hours grappling with Ruby gems for WordPress, Tumblr, Blogger, Posterous, and even LiveJournal [!] without much luck. (Most of the authentication APIs were for more heavy-duty use than I cared about; I just wanted to post to a blog using a stored password.)

Finally I got Blogger to work, and after some experimenting I decided the best approach would be to post once an hour, all the new stories since the last time I posted. Eventually I realized that I should rank the stories by interesting-to-Joel-ness, so that the ones I’d most want to read would be at the top and the ones I want to read least would be at the bottom.

The blog itself is at

Step 9 : Automate

This part was pretty easy with two cron jobs. The first, once an hour, goes to the Hacker News API and retrieves all new unknown stories (up to a limit of like 600, which should never be hit in practice). It then scores them with the last saved model and adds them to the database. In practice, the API isn’t working half the time.

The second, a few minutes later, takes all the new stories and posts them to the blog. The end result is a blog of hourly scored digests of new Hacker News posts.
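The digest-building part of that second job can be sketched roughly as follows. This is Python for illustration only, not the Ruby the project presumably uses given the gems mentioned above. The dictionary keys, the `build_digest` name, and the example stories are all hypothetical. The `threshold` parameter covers both publishing options from Step 8: leave it at zero to publish everything, or raise it to publish only stories the model likes.

```python
def build_digest(stories, threshold=0.0):
    """Return the lines of one digest post, most interesting story first.

    stories: list of dicts with hypothetical keys "title", "url", and
    "probability". threshold=0.0 publishes everything; raise it to
    publish only stories scoring at or above the cutoff.
    """
    kept = [s for s in stories if s["probability"] >= threshold]
    ranked = sorted(kept, key=lambda s: s["probability"], reverse=True)
    return [f'{s["probability"]:.3f}  {s["title"]}  {s["url"]}' for s in ranked]


# Hypothetical stories scored since the last post.
new_stories = [
    {"title": "A hypothetical story", "url": "http://example.com/a", "probability": 0.12},
    {"title": "Another hypothetical story", "url": "http://example.com/b", "probability": 0.91},
]

for line in build_digest(new_stories):
    print(line)
```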

Step 10 : Improve the Model

The model can only get better with more training data, which requires me to judge whether I like stories or not. I do this occasionally when there’s nothing interesting on Facebook. Right now this is just the above command-line tool, but maybe I’ll come up with something better in the future.

Step 11 : Profit

I’m still trying to figure this one out. If you’ve got any good ideas, the code is here.

Hyphen Class Post-Mortem

Last fall I signed up for two of the hyphen classes: the Machine Learning ml-class (Ng) and the Artificial Intelligence ai-class (Thrun and Norvig). Both were presented by Stanford professors but one of the conditions of taking the courses was that whenever I discuss them I am required to present the disclaimer that THEY WERE NOT ACTUALLY STANFORD COURSES and that I WAS NEVER ACTUALLY A STANFORD STUDENT and that furthermore I AM NOT FIT TO LICK THE BOOTS OF A STANFORD STUDENT and so on. (Caltech is better than Stanford anyway, even if whenever you tell people you’re in the economics department they always say, “we have one of those?!”)

My background is in math and economics, but I’ve taught myself quite a bit of computer science over the years, and I consider myself a decent programmer now, to the point where I could probably pass a “code on the chalkboard” job interview if that’s what I needed to do in order to support my family and/or drug habit.

I’d worked on some machine learning projects at previous jobs, so I’d picked up some of the basics, but I’d never taken any sort of course in machine learning. At my current job I’m the de facto subject matter expert, so I thought the courses might be a good idea.

The classes ended up being vastly different from one another. Here’s kind of a summary of each:

ml-class

* Every week 5-10 recorded lectures, total 1-2 hours of lecture time. (There was an option to watch the lectures at 1.2x or even 1.5x speed, which I always used, so it might have been more like 3 hours in real time. This means that if I ever meet Ng in real life, he will appear to me to be speaking very, very slowly.)

* Most lectures had one or two (ungraded) integrated multiple choice quizzes with the sole purpose of “did you understand the material I just presented?”

* Each week had a set of “review questions” that were graded and were designed to make sure you understood the lectures as a whole. You could retake the review if you missed any (or if you didn’t) and they were programmed to slightly vary each time (so that a “which of the following are true” might be replaced with a “which of the following are false” with slightly different choices but covering the same material).

* Each week also had a programming assignment in Octave, for which they provided the bulk of the code, and you just had to code in some functions or algorithms. I probably spent 2-3 hours a week on these, a fair amount of that chasing down syntax-error bugs in my code and/or yelling at Octave for crashing all the time.

* Machine learning is a pretty broad topic, and this course mostly focused on what I’d call “machine learning using gradient descent.” There was some amount of calculus involved (although you could probably get by without it) and a *lot* of linear algebra. If you weren’t comfortable with linear algebra, the class would have been very hard, and the programming assignments probably would have taken a lot longer than they took me.

* The material was a nice mix of theoretical and practical. I’ve already used some of what I learned in my work, and if there was a continuation of the class I would definitely take it. As it stands I’m right now signed up for the nlp-class and the pgm-class, which should be starting soon, both of which are relevant to what I do.

* The workload, and the corresponding amount I learned, were substantially less than they would have been in an actual 10-week on-campus university course. This was great for me, since I also have a day job and a baby. If I were a full-time student being offered ml-class instead of a real machine learning class, I might feel a little cheated. (I saw a blog post by some Stanford student whining about this, but he was mostly upset that the hyphen classes were devaluing his degree. Someone should have reminded him about the disclaimer.)

* The class was very solidly prepared. The lectures were smooth and well thought out. The review questions did a good job of making sure you’d learned the right things from the lectures. The programming assignments were good in their focus on the algorithms, although that did insulate you from the real-world messiness of getting programs set up correctly.

* It certainly seemed like Ng really enjoyed teaching, and at the end of the last lecture he thanked everyone in a very heartfelt way for taking the class.

ai-class

* Every week dozens of lectures, each a couple of minutes long, interspersed with little multiple choice quizzes. This was my first point of frustration, in that the quizzes were frequently about parts of the lecture that hadn’t happened yet. Furthermore, they often asked ambiguous questions, or questions that were unanswerable based on the material presented so far.

* Each week had a final quiz that you submitted answers for one time only. Then you waited until the deadline passed to find out if your answers were correct (and then you waited another day, because the site always went down on quiz submission day, and so they always extended the deadline by 24 hours). These quizzes were also ambiguous, which meant that if you wanted to get them correct you had to pester for clarifications (and sometimes for clarifications of the clarifications).

* This resulted in the feeling that the grading in the class was stochastic, and that your final score was more reflective of “can I guess what the quiz-writer really meant” than “did I really understand the material”. Although I didn’t particularly care about my grade in the class, I was still frustrated and disheartened by the feeling that the quizzes were more interested in *tricking* me than in helping me learn.

* What’s more, the quizzes often seemed to focus on what seemed to me tangential or inconsequential parts of the lesson, like making sure that I really, deeply understood step 3 of a 5-step process, but not whether I understood the other four steps or the process itself.

* The material also seemed very grab-bag, almost like an “artificial intelligence for non-majors” survey course.

* Anyway, partly on account of my finding the class frustrating, partly on account of time pressures, and partly because I didn’t feel like I was learning a whole lot, I dropped the ai-class after about four weeks.

* There were no programming assignments, but there was a midterm and a final exam, both after I quit the course. From what I could tell, they were longer versions of the quizzes, with the same problems of clarity and ambiguity. (I never unfollowed the @aiclass twitter, and during exam time it was a steady stream of clarifications and allowed assumptions.)

* Compared to the tightly-planned ml-class, the ai-class felt very haphazard. In addition, I found the ml-class platform more pleasant to use than the ai-class platform.

* I quit long before the last lecture, so I have no idea how heartfelt it was.

One thing about both classes: I *hate* lectures. I learn much better reading than I do being lectured at, and I found the lecture aspect of *both* classes frustrating. I have complained about this in many venues, but my prejudice is that if you’re using the internet to make me watch *lectures*, you’re not really reinventing education, because I still have to watch lectures, and I hate lectures. Did I mention that I hate lectures?

By way of comparison, I have also been doing CodeYear. It is currently below my level (I am plenty familiar with variables and if-then statements and for loops), but I don’t know much JavaScript, and the current pace makes me hopeful that it will get interesting for me after another month or two.

If you don’t know that platform, it gives you a task (“create a variable called myName, assign your name to it, and print it to the console”) and a little code window to do it in. Then you click “run” and it runs and tells you if you got it right or not. There is a pre-canned hint for each problem.

What I really like about Codecademy is that I can do it at my own pace. The lessons are wildly variable in quality, but I’m glad not to have to sit through hours of lectures every week. They also do “badges”, which I find more satisfying than I wish I did. That said, I suspect someone with no experience debugging code would find the experience impenetrable and waste hours tracking down simple syntax errors, and indeed I saw on Hacker News a post to this effect a few weeks ago.

In the end, despite all this, the way I learn best is through a combination of reading books and writing actual code. I’ve had to learn F# over the last month, which I’ve done by reading a couple of (quite nice) books and writing a lot of actual code. It’s hard for me to imagine the course that would have done me any better (or any faster).

Similarly, if I wanted to learn Rails (which some days I think I do and other days I think I don’t), I have trouble imagining a course that would do better for me than just working through the Rails Tutorial (which I have skimmed, which has convinced me that I could learn well from it).

Similarly similarly, I suspect that the right Machine Learning book (and some quality time with e.g. Kaggle) would have been much more effective for me than the ml-class was. But if such a book exists, I haven’t found it yet.

What Part of Your Oath Do You Not Understand?

I’m mad as hell, and I’m not going to something something anymore!

It all started with Wil Wheaton[1], who used to be the bartender (I believe) on “Star Trek”, but who is now some sort of Twitter celebrity. I myself have zero tolerance for Twitter celebrities, but one of the “data scientists” I follow “retweeted” the following into my newshose:

The SOPA/NDAA, in case you have more important things to do than follow politics, is the latest power grab by the content industries, and would allow the President to use unmanned drones to assassinate you and/or the Internet without a trial if he suspects you’re selling counterfeit handbags or illegally downloading Hall & Oates MP3s or waging jihad. It is indeed an abomination, which is why it is only supported by heartless, baby-killing monsters like record company executives and United States Senators. And it certainly seems plausible that a President who signed such a bill would be in violation of his oath to “defend the Constitution.”

You know what else is in violation of his oath to defend the Constitution? JUST ABOUT EVERYTHING HE’S EVER DONE. Invade Libya without declaring war? NOT IN THE CONSTITUTION. Illegally traffic guns to criminals in order to drum up popular support for eviscerating the Second Amendment? NOT IN THE CONSTITUTION. Override state medical marijuana laws? NOT IN THE CONSTITUTION. Force people to buy private health insurance? NOT IN THE CONSTITUTION. And so on. If it takes the NDAA to get you to care about Obama’s oath to defend the Constitution, then either you’ve been living in a cave in Pakistan for the past 3 years, or YOU DON’T ACTUALLY CARE ABOUT THE CONSTITUTION.

As it happens, I’m not one of those libertarian types who pounds the table about what is and what isn’t in the Constitution. Of course I’d rather the government lived up to its promises not to quarter soldiers in my condo, not to take away my guns, and not to censor my XXXXXXXXXXX. But they don’t, and no one seems to care that they don’t, and in fact most people are quite happy to let the government quarter soldiers in their condos as long as it gets them something they want, like endless war in Afghanistan, or patents on being aware of medical best practices, or subsidized pharmaceuticals for wealthy old people. In any event, I don’t treat the Constitution as holy writ, or think something is necessarily a good idea because it’s in the Constitution or necessarily a bad idea because it’s not, or consider it a good use of anyone’s time to yell “READ THE CONSTITUTION!” to people who don’t particularly care about what’s in the Constitution.

But I will pound the table when some Obama-endorsing, juvenile-name-calling Twitter celebrity suddenly starts chastising people as if in this one case the Constitution is the most important thing in the world. You don’t get to do that. If you didn’t care about the Constitution back when activist judges insisted that deep in its penumbrae one could divine secret rights to funnel taxpayer money to politically-connected banks and carbuilding unions, then no one is going to take you seriously when you pretend to care about it now. Oh, they’ll pretend to care about your pretending, and maybe they’ll even mention to their friends that “that bartender from the Starship Enterprise had some great tweet where he pretended like he cares about the Constitution, and he used #hashtags and everything, and it was really such a stellar example of pretending to care about the Constitution that I favorited it and retweeted it and @replied to it, so you should check it out!” But they know that you’re posturing and that you know perfectly well that the President and the Congress perfectly well understand their oath to “uphold the Constitution”, they JUST DON’T GIVE A RAT’S ASS ABOUT IT, and they also know that 364 days out of the year NEITHER DO YOU.

What’s extra-sad is that this guy had a particularly unpleasant run-in with the TSA last spring:

You’d think that might have indicated to him that the “teabaggers’” fear of government power was maybe not so off-base after all. The bartenders at the places I hang out certainly would have noticed this, so maybe it’s that all the cosmic rays in space kill brain cells.

All that said, the NDAA and SOPA are both horrible laws and we’re worse as a society for passing them (or for being about to pass them) and the people defending them are heartless, baby-killing monsters who you should probably go out of your way to spit on if you encounter them. But they’re also perfectly predictable consequences of having the kind of busybody government that you’ve been loudly clamoring for your whole life. It wasn’t so long ago that you were blogging a stupid “CHANGE WE CAN BELIEVE IN” graphic and telling people to vote for this jerk. To the extent you care about preventing the next SOPA, you might consider next time listening to the libertarians instead of just calling them vulgar names and putting sugar in their gas tanks.

1. Technically, it started when I read the article about Sheila Jackson-Lee stopping a SOPA hearing so they could discuss whether someone had insulted her on Twitter, and I realized that I was the one with the “crazy” politics for not being eager to subject myself to thousands of pages of laws written by emotional preschoolers.

Doubling the Compost Box

If you are on Facebook you have probably seen the articles about the unnamed school board member who couldn’t do any of the math problems on the math standardized test (and who couldn’t pass the reading section). Most of the discussion drew the conclusion that the tests were too hard for 10th graders and covered topics that were irrelevant to success (at least, if “success” is defined by being an unnamed school board member).

(There was also an unspoken undercurrent that, as standardized testing had been embraced by the Pepsi party, maligning it was a good signal of one’s allegiance to the Coke party.)

The article, of course, did not give any examples of the questions that were too hard, leaving open the alternative hypothesis that perhaps one simply doesn’t have to be very bright to serve on a school board. (Having casually observed the Seattle School Board over the last several years, I am inclined toward this position.)

You probably didn’t see the follow-up article outing the test failer as someone with a Bachelor’s in Education (which he describes as a “Bachelor of Science”, making it sound like he has a science degree, which he doesn’t), a Master’s in Education, and a Master’s in Educational Psychology. Given my prior that each of these degrees is worthless (except, of course, for its value in jumping through some sort of public-servant-pay-grade hoop), I feel even safer about my alternative hypothesis.

Luckily, the second article names the test. It’s the Florida Comprehensive Assessment Test, and the state of Florida has thoughtfully provided sample questions on the web. The questions are not particularly interesting, nor are they particularly hard:

(Although the test is quite plainly biased against students from cultures that lack access to composting.)

What is interesting is his final criticism:

The math section, he said, tests information that most people don’t need when they get out of school.

There is a sense in which this is true. Most people never compute the volumes of composting bins. (Although if the city of Seattle gets its way, soon we’ll all be forced to.)

There’s also a larger sense in which it’s false. Solving word problems is a valuable skill (that most people sadly lack), and word problems have to be about something. And whatever that something is, probably most kids will never have to know its specifics again.

But there’s a bigger sense in which it’s irrelevant. Most of what you learn in high school (insofar as you learn anything) is information that you’ll never need again. I myself have forgotten almost all of my American history (although I remember our teacher’s stories about her redneck neighbors, who used to jump out of their second-story door after their deck collapsed), almost all of my chemistry (although I remember that the teacher had a toy stuffed mole named Avogadro), almost all of my English lit (although I remember that F. Scott Fitzgerald liked to use “flower imagery”), almost all of my Spanish (although I remember listening to cassette tapes of commercials for “Pal-mo-LEE-vay”), almost all of the pep rallies (although I remember that DHS Wildcats are “paw-some”), almost all of the motivational assemblies (although I remember the “what thou see-est, that thou be-est” guy), and almost all of my classmates (although I remember Josh Adams, because he visits Seattle every 10 years or so).

And the things you do need to know vary a lot from person to person. While it’s important to me that the Wildcats are paw-some, it might be equally important to Josh Adams that when the Wildcats rock the house they rock it all the way down. A test that asks about one neglects the other, and vice versa.

To the extent that most of what you learn in school is useless (and, believe me, most of what you learn in school is useless), any test that makes sure you actually learned it is going to be testing information that you don’t actually need. Blaming the test for that hardly seems fair.

And to the extent that your fancy degrees in education are useless (and, believe me, your fancy degrees in education are useless), then they’re not going to help you answer questions on a test. Again, blaming the test hardly seems fair.

All that said, standardized tests are essentially a 19th-century technology, and fixing education will almost certainly entail getting rid of them (although merely getting rid of them will not fix education in the slightest). I don’t mean to offer a blanket defense for them, only a defense against the criticism “I have three degrees and can’t do the test and therefore the problem is with the test.”

That leaves only the unresolved issue that you don’t have to be particularly competent to be on the school board, although if progressive Seattle is cool with it, then I imagine everyone else is too.

Fiction: The Difference Principle

Jessica pointed at a pile of rags beside a dumpster.

“This is the guy?” I asked.  I looked up and down the filthy alley we were standing in.  “This is a person?”

“It is,” she said tentatively, and then she checked the little brown Moleskine she carried everywhere.  “It is,” she repeated more confidently.  “A lot of them aren’t even this clean, so you’d better get used to it.”

“Sir?”  I crouched down closer to what I assumed was his head, trying to ignore the stench of cigarette smoke and sour beer and body odor.  “Sir?”  I asked again.

“Go ‘way.”  The voice sounded rusty, as if it weren’t used very often, but it was indeed coming from where I’d guessed his head was.  “Leave me ‘lone.”

“Keith Runson?” I asked.  I’d memorized the name on the drive over.  The pile shifted.

“Who’s askin’?”

“My name is Harry…”  Jessica kicked me, and I immediately remembered that we weren’t supposed to tell them our names.  Shit.  But this guy didn’t seem like he’d remember it, so I kept going. “We’re from Original Position.”

He poked his head out of the blankets. His hair and face clearly hadn’t been washed in months, he had the decrepit teeth common to those who prioritized drug abuse over hygiene, and his eyes were pretty much the saddest I’d ever seen.  The computer clearly knew its business.

“The char’ty?” he asked.

I puzzled over my answer for too long, and Jessica stepped in.  “Original Position is not a charity.  It’s a fundamental part of the social contract.”

“I never no signed no contract,” he objected.

“No,” Jessica explained, in a tone indicating that she’d delivered this exact explanation countless times before, “but you would have…”

“I never would have signed no contract,” he insisted.

“Maybe not in your lifetime,” she told him, “but before you were born you certainly would have. This is well understood.”

“Before I was born I ain’t would have signed no contract!”

“Before you even knew who you were,” Jessica patiently explained.  “Back when you didn’t know if you’d be the President or if you’d be … well … you.”

“I am me,” he growled.

“You’re not just you,” Jessica told him.  “According to our computers you’re the worst-off person in the United States.”

“Well, fuck you!”  He spat at her, but she didn’t flinch.  She’d warned me that they often got angry.  It was an occupational hazard.

“Don’t spit,” she patiently chided him.  “We’re here to help.  Before you were born, back behind the Veil of Ignorance, you would have wanted to live in a society that focused on the well-being of the worst-off person.”

“I’m not that ignorant,” he objected.  “And you don’t know what I would have wanted.”

She ignored his objection.  “Right now you are that worst-off person.  And so we’re here to make you better off.  What would you like?  Within reason it’s yours.  Food?  Shelter?  Toilets?  A job?”

“Booze,” he wheezed.  “I want booze.”

Nine times out of ten they want booze.  They’d warned us in training.  I opened the unlabeled messenger bag Original Position had given me and pulled out a bottle of cheap whiskey.  He grabbed it out of my hand, opened it, and started drinking before I could say anything.

“Congratulations, Mister Runson,” Jessica told him.  “You’re no longer the worst-off person.”  I don’t think he even heard her.

“What did you think of your first assignment?” she asked me as we walked back to the car.

I thought for a while before I asked, “Was it really a good idea to give him whiskey?”  I didn’t feel like we’d been particularly philanthropic.

“A good idea?  Probably not.  Possibly he’ll end up on our list again someday.  But for the meantime he’s no longer the worst-off person, which means that our attention is needed elsewhere.”

“I didn’t realize it would be so depressing,” I told her.

“Rawls didn’t call his book A Theory of Why Justice is Fun.  Just wait until you get a quadriplegic.”

I tried not to think about that.

“Let’s see who’s next.”  She opened up the Moleskine.  “Ooh, child abuse!”