Five ways your data app can catch the big news hook.

01. Practice news-driven development

Most data-driven news applications I’ve encountered follow what I would call The Chicago Crime model, a name lifted from Adrian Holovaty’s famous site. Steady streams of government-provided data are repurposed into a flexible interface that allows users to compare disparate sources (“the mashup”) and easily localize the information so it can provide particulars to a wide body of users (“the long tail”).

It’s a brilliant model, the app that launched 1,000 ships. But it’s not the only way to get things done.

In news terms, where minutes matter, it can still require a relatively long time to do. Especially when it comes to data acquisition. Let’s face it, if you’re using government data as your starting point, the idea of a SOAPy API is laughable. So don’t get your hopes up. Goofing around with Delicious tags or Flickr photos is fun, but if you want to do something original from the public sector, they’re only going to get you so far. You’re going to be FOIA’ing, or, if you’re lucky, scraping. And then you’re going to be cleaning. Especially if you’re invested in serving accurate and consistent information. Because if there’s a government database out there that’s ready to serve, I’ve yet to see it.

And there’s usually not much of a news hook. Look, I appreciate Everyblock and Chicago Crime and that whole style. Hell, I’ve essentially remodeled my career to emulate them. But when you get down to it, they’re essentially built around the idea that umpteen little news hooks (“Someone was robbed in my neighborhood,” “A liquor store wants to open up on your block.”) will add up to something greater than the sum of their parts. That “hyperlocal” or “long tail” philosophy, to use the parlance of our time, may ultimately be where a lot of us end up, but blockbuster news is still happening and there’s no reason all the same tools that made Chicago Crime successful can’t be used to cover the hell out of a big story when it breaks.

I had just such an opportunity last Friday at the L.A. Times. Late in the afternoon, news broke that a commuter train had crashed in the Valley, potentially killing many riders on board. We didn’t know how many fatalities to expect, nor how long it would take for their identities to be released. But we knew that our audience was going to want to know, and as soon as possible. The typical newspaper.com way to handle this sort of thing is to publish a simple list, or “blob of text”, when it’s available. And then follow up later with a scattershot of obituaries, usually released as they appear in the paper. But, when you think about it in terms of the Holovaty manifesto and the general concept of the Internet, there’s really no reason that information couldn’t be better collected and presented as a browsable database application. It’s a lesson the LA Times learned earlier this year when our ripoff of Adrian’s Faces of the Fallen concept reinvigorated the way the paper covers military casualties.

It meant staying late at work on a Friday night, busting ass most of my weekend, and putting more faith in memcached than most IT people are comfortable with, but the result was that when the government finally did cough up the fatality list we were ready to immediately publish it as a linked database that, over time, has been filled in by further reporting to include greater detail, photos, and more than 1,600 user comments, many of them extremely moving. It’s a long way from perfect, but it provided some amount of public service, was way ahead of the competition and generated a pretty goodly amount of traffic along the way. The site is called Chatsworth Metrolink Crash.

That’s all my long way of saying that I think big events matter and that database journalists shouldn’t be afraid to dive in when they happen. Whether it’s posting the location of hurricane shelters, letting people know who the hell all those superdelegates are, or connecting survivors following a disaster, there are plenty of obvious opportunities to do our thing. But it’s not going to happen if we don’t see taking on big news as an opportunity, anticipate things like the next hot Google search term, or have the capability to deploy very very quickly.

I’m a long way from an authority on the whole deal, but I’m stumbling my way through it. And here are a couple things I’ve learned along the way.

02. Let last year’s data be your guide.

Earlier this month, we released California Schools Guide, a collection of data about public and private schools across the state, at the very moment the government lifted its embargo on this year’s scores. I didn’t have the newsworthy data in hand until less than 24 hours before it would be publicly released. But by developing the site in advance using the previous year’s data as dummy entries, I was able to pre-script the loading of the 2008 data after only a few minor changes to the code. This meant that we were able to get our product out when the news hook dropped, at the same time as the paper was otherwise promoting an investigative story on the topic and the state’s propaganda arms were blasting their own message (“Things are getting better! Trust us!”).
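To make the idea concrete, here is a minimal sketch of a loader that doesn't care which year it is fed. It is not the code behind the Schools Guide. The function name `load_scores` and the `school` and `score` columns are made up for illustration, and a real state file will have its own layout.

```python
import csv
import io


def load_scores(csv_file, year):
    """Read one year's score file into a list of dicts.

    The column names ("school", "score") are hypothetical; swap in
    whatever the real data file uses.
    """
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            "school": row["school"].strip(),
            "year": year,
            "score": int(row["score"]),
        })
    return records


# Stand-in data. During development this would be last year's real file;
# on release day the same function gets pointed at the new one.
dummy_2007 = io.StringIO("school,score\nExample High,700\nSample Middle,650\n")
records = load_scores(dummy_2007, 2007)
print(len(records), records[0])
```

Because the year is an argument and not something baked into the loader, swapping in the embargoed file should mostly be a matter of changing the file and the year you pass in, assuming the layout hasn't changed much from one year to the next.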

03. Don’t Repeat Yourself, unless it saves you time.

Let me be clear. The DRY goal of elegance through efficiency is laudable. And, as a guiding principle for development, you probably can’t get any better. It is the single point of truth. It’s like natural selection, except for awesomeness. But when you’re on a tight deadline, and you’ve already got a code implementation that works, sometimes you JDFWI, Just Don’t Fuck With It. Yeah, so maybe you just copied and pasted and introduced a little redundancy. And maybe your CSS is just a hodgepodge of divs repurposed from other apps. But it works, right? And what’s more important, trimming down your code base, or getting the news out ahead of your competition?

04. Use Django’s admin to your advantage.

For anyone who’s already doing this stuff, it probably goes without saying, but Django’s admin is really great. As soon as your database models are written, you’ve instantly got a set of entry forms that are ready to deploy. This is incredibly useful when trying to turn around simple data apps on deadline. For instance, when it came to the Metrolink crash, I was able to get the models and admin up Friday night so that reporters on Metro desk could begin working on entry as I shifted to work on the views and templates.

05. Publish now, or perish.

You can have the greatest app in the world, but if you can’t push it out to the web ASAP, you’re nowhere. If you’re going the Chicago Crime route, this isn’t as big of a deal. But if you’re trying to hit the big news hook, it’s utterly essential. And treating big news like you would anything else on your “product schedule” or “iteration cycle” just isn’t going to be good enough. You can call it a waterfall, you can call it reckless, you can call it news-driven development.

California’s War Dead.

This Memorial Day weekend marked the formal launch of California’s War Dead, our database of the state’s casualties from the wars in Afghanistan and Iraq. It’s the result of a lot of hard work by many people at the paper, a large share of which had already been carried through the years by our many obituary writers.

The site intends to allow users to explore the data using a variety of criteria (for example, you can quickly look up fallen troops by hometown, high school or marital status). And to learn more about individuals by reading their obituaries from our back archives. Choice quotes have been selected to “pop” out of the individual profile pages and visitors are encouraged to leave memories and thoughts as comments.

Besides all my coworkers who pitched in to make this happen on a tight deadline, thank yous should be extended to all the great developers in the Django community. They not only provided the Web programming tools that made this idea possible, but also the leadership that showed me how the tools can be used to make journalism for the Web, not just on the Web. The same goes for all the people in the NICAR community who, by leading by example, have pushed me to keep learning new things and have the courage to take chances outside of journalism’s well-worn comfort zones. Personally, I just hope that first group can forgive me for ripping off their ideas and that the second group doesn’t resent my getting the opportunity to do things like this without having to put in the once-requisite 5 to 10 years on the cops-and-courts beat.

If you’re stretched for time, or maybe doubting there’s anything new to be learned about the war, let me promote a couple spots that might interest you.

  • Over the course of assembling the data, I was surprised to learn how many immigrants to California have died. It’s more than fifty, from Mexico and the Philippines and South Korea and a number of other places. Check out the lists here. A fascinating story is of Sgt. Rafael Peralta of San Diego, who enlisted the same day he received his Green Card and died in Fallouja, Iraq, when he sacrificed himself to save his compatriots from a grenade attack. His profile is here and the story of his heroic death is here.
  • The most rewarding part of the project for me has been to see how quickly we’re getting great, thoughtful comments submitted by friends and family members of the deceased. One of my goals in the design was to give their writing equal footing with our previous reporting. It can be heartbreaking to read, but I’m proud to have helped make something that people think is worthy of such sensitive information. Examples I find particularly moving are the memories shared by the family of Sgt. Jason J. Buzzard of Ukiah and Corporal Christopher D. Leon of Lancaster, who I’m honored to know better now than I did before our commenters contributed.
  • It seems natural to expect that spending so much time with casualty data would have a numbing effect. But I think that’s only the case when we let the very real people we’ve lost remain numbers in a casualty count or unknown names on a page. It’s the stories that bring them to life, and my experience has been that the more stories you hear, the less numb you feel. The pain is in the details. A moving example is Teresa Watanabe’s obituary of Lt. Mark J. Daily of Irvine, who was inspired to join the war by the political writing of war advocate Christopher Hitchens. Hitchens has since gone on to write a moving response to learning of Daily’s readership, and sacrifice, that you can find here.

LA red light cameras on your TomTom or Garmin.

Today our A1 features Rich Connell’s look at the effectiveness of all those automated red light cameras positioned around Los Angeles. Here’s the nut:

In Los Angeles, officials estimate that 80% of red light camera tickets go not to those running through intersections but to drivers making rolling right turns, a Times review has found.

One of the most powerful selling points for photo enforcement systems, which now monitor 175 intersections in Los Angeles County and hundreds more across the United States, has been the promise of reducing collisions caused by drivers barreling through red lights.

But it is the right-turn infraction — a frequently misunderstood and less pressing safety concern — that drives tickets and revenue in the nation’s second-biggest city and at least half a dozen others across the county.

Our web package includes some hot tape put together by Rich, an awesome interactive explainer by Raoul Ranoa, the now perfunctory Google Map, and my own little goofy idea: portable downloads for TomTom and Garmin GPS devices (check out the roadblock halfway down the main story).

Loading the points into your device will not only map them on your dashboard monitor, it will also let you program your system to give you an audio warning as you approach upcoming lights. And in that same soothing computer voice that already tells you when to turn.

I’m not sure how interested readers will be in this sort of product, but it seemed like a fun experiment. And since Rich had put in a great effort collecting the data from LA’s many fragmented municipalities, it seemed like we ought to go the extra yard with it.

The technical part is pretty easy. Both manufacturers have handy developer guides that — once the data is prepared — only take a couple hours to suss out. Here’s TomTom. Here’s Garmin.
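For a rough idea of what the data-preparation step can look like, here is a sketch that writes camera locations out as a plain CSV of points of interest. It assumes a longitude, latitude, name column order with no header row, which is my understanding of what Garmin’s loader expects, so confirm the layout against the manufacturer’s guide before trusting it. The function name `to_poi_csv` and the sample points are placeholders, not the data from the story.

```python
import csv
import io


def to_poi_csv(cameras):
    """Return CSV text with one point of interest per line.

    Assumes a longitude, latitude, name column order with no header row;
    check the manufacturer's guide before loading onto a device.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for cam in cameras:
        writer.writerow([
            "%.5f" % cam["lon"],
            "%.5f" % cam["lat"],
            "Red light camera: " + cam["name"],
        ])
    return buffer.getvalue()


# Placeholder points, not real camera locations.
cameras = [
    {"name": "Example Blvd & Sample Ave", "lat": 34.05, "lon": -118.25},
    {"name": "First St & Main St", "lat": 34.1, "lon": -118.3},
]
poi_text = to_poi_csv(cameras)
print(poi_text)
```

Using the csv module instead of joining strings by hand means an intersection name with a comma in it gets quoted properly and doesn't spill into an extra column.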

Any thoughts on other newspapery data projects that might work for GPS? The most dangerous intersections? The location of famous landmarks around town?

Ubuntu Recipe: How to automagically post your Last.fm feed to Twitter.

I signed up for Twitter this morning, opening an account at http://twitter.com/palewire. Since I haven’t seen or heard from my cell phone in a week or two, don’t count on much on-the-scene reporting. But I did take a few minutes this morning to line up my Last.fm feed, so that my latest listenings are now automatically Twittered to the huddled masses yearning to have my musical taste shoved down their throat.

For any other Ubuntu users who’d like to follow along, here’s a quick recap on how I made it happen.

1. Move to the folder where you store random scripts. Me, I use…

cd /usr/local/bin

2. Create a new Perl script and open it in gedit.

sudo gedit twitter_fm.pl

3. Copy and paste in the ready-to-serve code provided by Walter Higgins.

4. Edit in your Twitter and Last.fm login information. Save and exit the file.

5. Create a new shell script.

sudo gedit twitter_fm.sh

6. Paste in the following, editing the folder structure to reflect wherever you stuck your steez.

#!/bin/sh
 
perl /usr/local/bin/twitter_fm.pl

7. Set the shell script so it becomes executable.

sudo chmod +x twitter_fm.sh

8. Navigate through the System>Preferences>Session menu as described here and add the shell script to your startup processes.

9. Restart!

I just patched this mess together a couple minutes ago, so there might be some bugs. Either in my setup or in Walter’s script. Don’t know yet. Let me know if you see anything idiotic on my part.

I also installed WordPress’s Twitter Tools plugin, so now my latest blog posts will also be sent out via Twitter.

Also on the Twitter tip, earlier this week we launched a feed at work for our popular political blog, Top of the Ticket. It includes the latest posts from our team of writers, and, on election nights, live election results as they come in. You can sign up here. For anyone looking to reroute their own data streams to Twitter, I can’t recommend Chris Thompson’s Net::Twitter Perl module enough. Easy. Peasy.

Petraeus ‘07 vs. Petraeus ‘08.

Here’s a word cloud I cooked up real quick over at Many Eyes comparing today’s opening statement from Iraq commander General David Petraeus to his previous Congressional visit last September. As Dana Milbank has noted, you’ll find less focus on Al Qaeda this time around, and more mentions for Iran.

Note that this isn’t his entire testimony. Just the opening statements. So, it doesn’t include the many questions he’s fielded.
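Many Eyes did the counting for me, but the comparison underneath a cloud like this is simple enough to sketch. What follows is not how Many Eyes works internally, just a minimal illustration of tallying words in two texts and ranking the ones whose counts moved the most. The function names and the sample strings are placeholders, not the actual testimony.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase words, ignoring punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def biggest_shifts(before, after, top=3):
    """Return the words whose counts changed most between two texts."""
    old, new = word_counts(before), word_counts(after)
    changes = {w: new[w] - old[w] for w in set(old) | set(new)}
    # Sort by size of change, then alphabetically so ties are stable.
    ranked = sorted(changes.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return ranked[:top]


# Placeholder text, not the actual statements.
first = "alpha alpha alpha beta gamma"
second = "alpha beta beta beta gamma"
shifts = biggest_shifts(first, second)
print(shifts)
```

A negative number means the word showed up less often in the second text, a positive one means it showed up more. With real speeches you would probably also want to drop common words like "the" and "and" before ranking.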

What’s the Standard Issue?

A ritual stop on my regular tour of DC blogs is The Worldwide Standard, an online outpost of the conservative magazine The Weekly Standard.

Even if you’re not a DC newsjunkie, you’ve probably come across TWS’s editor, Bill Kristol, at one time or another. He’s on cable news all the time, serving as one of the Bush Administration’s leading supporters.

I like to follow the site’s blog, which is tended by editor Michael Goldfarb and a team of bloggers, to keep tabs on conservative opinion. The content has an interesting focus on military matters, so it’s also a good way to skim my way into what’s going on in the circle of military bloggers (“milbloggers”) who have bubbled up in Washington over the past couple of years.

One of the site’s regular features is a post called “Required Reading” that provides a short list of links and maybe a picture or video.

In the spirit of a previous post I made analyzing the links to online outlets offered by one of TWS’s political opponents, I wrote a script this afternoon to fetch all of the TWS’s “Required Reading” lists and add up what sources we’ve been pointed to the most.

If you click here, you can download a spreadsheet ranking the different sources. It totals all the links from posts they’ve tagged as Required Reading, which stretch back to February of this year.

I’ve eliminated all of the internal links to Weekly Standard’s own material, so those aren’t even in the running.
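The tallying half of a script like that can be pretty short. Here is a minimal sketch, not the script I actually ran, that takes a list of already-scraped links, groups them by domain and skips anything pointing back at the site itself. The function name `rank_sources` and the example URLs are placeholders.

```python
from collections import Counter
from urllib.parse import urlparse


def rank_sources(urls, own_domain):
    """Tally links by domain, leaving out the site's own pages."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same source.
        if host.startswith("www."):
            host = host[4:]
        # Skip unparseable links and anything on the site's own domain.
        if not host or host == own_domain or host.endswith("." + own_domain):
            continue
        counts[host] += 1
    return counts.most_common()


# Placeholder links standing in for the scraped posts.
links = [
    "http://www.example.com/story-one",
    "http://example.com/story-two",
    "http://blog.example.org/post",
    "http://www.ownsite.test/internal",
]
ranking = rank_sources(links, "ownsite.test")
print(ranking)
```

Note that this counts subdomains as separate sources, so a newspaper’s blog and its main site would show up as two rows. Whether that’s the right call depends on what you’re trying to measure.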

At the top of the list is the Washington Post, followed by a number of publications with a reputation for conservative editorializing. Fellow Rupert Murdoch properties, The New York Post and The Wall Street Journal, finish ahead of the NYTimes. And a number of military-oriented organizations, foreign policy wonks and blogs pepper the rest of the list. The national security blog at Wired and BillRoggio.com have been particularly popular. You’ll also find a couple regional newspapers and a few other oddballs. Unlike my previous study, there are, sadly, no referrals for my employer, The Center for Public Integrity (Hey, guys. You might like my military aid database!).

Any thoughts? Anything I screwed up? Overlooked?

I work as a reporter, albeit a somewhat unconventional one. My job calls on me to specialize in what is often called computer-assisted reporting. That’s a funny phrase — have you ever heard of a computer-assisted photographer or a computer-assisted architect? — but what it means is that I use computers to collect, organize, analyze and present large amounts of information. Databases. Maps. Web Toys. Scripts. That stuff.

While I’m excited by the journalistic potential of new technology, I have an abiding admiration for the virtues of traditional reporting techniques, which I plan to continue using wherever I work.

I’m employed at the Los Angeles Times, a daily newspaper and 24-hour Web site based in Southern California. Nothing I write here should be interpreted as the opinion of that organization.

Before working at the Times, I worked on data projects at The Center for Public Integrity, covered state politics and elections in Jefferson City, Missouri, helped produce long-form documentaries for cable channels like CNN and Discovery Times, and pitched in on some television and newspaper reporting in Chicago. I earned a master’s degree from the Missouri School of Journalism — where I worked at the National Institute for Computer-Assisted Reporting (NICAR) — after receiving my undergraduate training at DePaul University.

Portfolio

Resumé

Resumé (.doc)
Resumé (hResume)

Selected Data Analysis and Presentation

California’s War Dead
The Los Angeles Times (Memorial Day 2008)

Hear No Evil, Smell No Evil
Fort Worth Weekly (June 11, 2008)

California Schools Guide
The Los Angeles Times (Sept. 4, 2008)

LA’s Top Dogs
The Los Angeles Times (June 2008)

The 700 (MHz) Club: When Lobbying the FCC, Sometimes Less is More
The Center for Public Integrity (August 10, 2007)

Collateral Damage: Human Rights and U.S. Military Aid after 9/11
The Center for Public Integrity (May-June 2007)

Charity Fundraising Database
The Los Angeles Times (July 6, 2008)

Who Owns Your Media? Get the Facts from CPI’s Media Tracker.
The Center for Public Integrity (Autumn 2006)

Wasting Away: Superfund’s Toxic Legacy
The Center for Public Integrity (April-May 2007)

Passing the Buck: How the House majority leader exploited a campaign cash loophole
The Center for Public Integrity (March 16, 2007)

Selected Bylines

Only 48% of California high schools meet federal standards, even with easier measure
The Los Angeles Times (Sept. 4, 2008)

Federal loans go for risky business
The Columbia Missourian (Dec. 27, 2005)

Pakistan’s $4.2 Billion “Blank Check” for U.S. Military Aid
The Center for Public Integrity (March 27, 2007)

Clear Channel gives Tate Talking Points Against XM-Sirius Merger
The Center for Public Integrity (April 14, 2007)

Searching for John Swenson: Recluse, Luddite, Candidate for Governor
The Columbia Missourian (Oct. 20, 2004)

Selected Video Production Credits

Nobody Told Me The Road Would Be Easy
WMAQ/WTTW (Winter 2006)

Keeping The Faith: Becoming a Priest in Today’s Catholic Church
Discovery Times (Feb. 1, 2005)

The Fight Over Faith
CNN Presents (Oct. 24, 2004)

Selected Side Projects

Shawington.com: An online hub for DC’s bloggiest neighborhood.
Summer 2007

AnyaLitvak.org: A journalist’s portfolio.
Autumn 2007

Awards

2007 IRE Certificate, Online Category
Collateral Damage: Human Rights and U.S. Military Aid after 9/11

2007 AHCJ Award Winner, Trade/Online Journals/Newsletters Category
Wasting Away: Superfund’s Toxic Legacy

2007 SPJ Sigma Delta Chi Award, Online Investigative Reporting (Independent) Category
Collateral Damage: Human Rights and U.S. Military Aid after 9/11

2007 SPJ Sigma Delta Chi Award, Online Non-Deadline Reporting (Independent) Category
Wasting Away: Superfund’s Toxic Legacy