Permalinks, low-rent data viz and other stupid Caspio tricks.

Today marked the release of a new Times investigation into the poor performance of for-profit fundraisers hired by not-for-profit charities. The poster child is Citizens Against Government Waste (CAGW), an advocacy group that rails against reckless government spending. According to reporting and analysis by Charles Piller and Doug Smith:

Records filed with the California attorney general’s office show that over the last decade, for-profit fundraisers for [CAGW] kept more than 94 cents of every donated dollar.

And the bigger picture:

In more than 5,800 campaigns on behalf of charities that were registered with the state attorney general from 1997 to 2006, the fundraisers reported taking in $2.6 billion. They kept nearly $1.4 billion — about 54 cents of every dollar raised.

As part of our effort to package the story for the Web, I worked with Times staff to publish all of the records collected for analysis as an online database. What we came up with allows readers to look up the track record of individual charities, browse charities of similar types, and quickly seek out the most and least efficient charities using a goofball visualization I cooked up with our graphics guy, Thomas Lauder. You can check it out here.

The app was pulled together using Caspio, a browser-based program for building data-driven web applications. While it is technically true, as the site claims, that developing a working Caspio app requires “no more programming,” my experience has been that you’re going to have to invest a significant amount of time hacking at its kludgey GUI to come up with something half-way decent. Whether you want to invest your time doing that, or mastering a more robust development option, is entirely up to you.

Other, smarter people have devoted a goodly amount of space to explaining Caspio’s deficiencies, so I’ll leave that to the links. Instead, let’s break out a couple of tricks that helped me at least marginally improve today’s product, in hopes they might be useful to somebody. (Though I suppose any “improvement” is a matter of opinion! Let me know what I fucked up.)

Hack 01: Roll your own forms

Caspio offers several templates. The one I use most often is the “search-and-result” set. It accepts a user’s input and returns any matching values. Might sound complicated, but it’s the same thing as Google. You pop something in, and you get back any hits. You can examine specimens in the wild here, here and here. (Thorough readers will notice that, at least at the time of writing, the Cincinnati app is dead on arrival, bearing only the cryptic message “DataPage does not exist. (Caspio Bridge error) (50501).”)

Since the “search” and “result” sides of the app are glued together in a single panel, the search box can’t be very easily plugged in around your site. You’ll have to find a way to make Caspio’s gunky JavaScript code work in each and every location where you want to encourage user input. The result is that most Caspio apps — including all three linked above — tend to live in backwater, standalone pages, lampooned by Matt Waite as “data ghettos.” (Personally, I prefer “Ghettos of the Mind.”)

That might be acceptable if you’re looking to make a destination page for your corporate intranet, like an employee directory. But it’s just not good enough for news Web sites, which draw a huge share of their incoming traffic on the homepage and the first page of featured stories. If your database isn’t prominently displayed there (and it isn’t unless you’ve got a search box or other entry point gaping open on the page), you’re losing a whole lot of potential traffic. I think there’s something to be said for a “data central” section, but you’re probably giving up a lot of clicks if you’re waiting for people to hit the vague-looking “data” link in your left-nav bar.

So what’s the hack? It’s pretty simple. Just build a search-and-result box without a search, which you then provide with your own custom HTML. You can then reuse the search box anywhere you want: the frontpage, right-rail, story-level reefer or — heaven forfend — standalone “data ghetto.”

Here’s how you do it, shot by shot.

First turn on the advanced options and allow parameters.

Tell Caspio it should look for an external parameter in the URL, rather than use its native search form.

Tell it which field it should run the inputs against. In this case, we’re building a search on a data table’s “name” field.

Now instruct Caspio to look for the user input after a query string variable called “name,” and to evaluate it against the data table using “contains” style matching, as opposed to “exact” or “starts with” matching. If you were using a unique identifier like a primary key for the lookup (as you likely would if you were building a dropdown menu rather than a search box), you would probably want to use an “exact” match instead of “contains.”

Then finish up by telling Caspio how to handle blank variables and searches that turn up no match.
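If the difference between the three matching styles is fuzzy, here is a minimal Python sketch of the idea. It is only an illustration, not Caspio’s code: the function name, the sample charity names and the choice to ignore case are all my own assumptions.

```python
def matches(value, query, mode="contains"):
    """Return True if query matches value under the given matching style.

    Ignoring case is an assumption made for this sketch; I have not
    checked how Caspio treats capitalization.
    """
    value, query = value.lower(), query.lower()
    if mode == "exact":
        return value == query
    if mode == "starts_with":
        return value.startswith(query)
    if mode == "contains":
        return query in value
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical names, made up for illustration.
names = ["American Red Cross", "Red Cross of Santa Monica", "Blue Cross Fund"]
hits = [name for name in names if matches(name, "red cross")]
```

With “contains,” the first two names come back. Under “exact,” neither would, which is why that style suits a primary key better than a free-text search box.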

Now you should deploy the Caspio app as you normally would, and then craft an HTML form on a different page that points to its location, placing the user’s input in the query string. For example, the search box in our charity app looks like this, with all the styling removed:

<form action="http://www.latimes.com/news/local/la-charity-search-name,0,5949050.htmlstory" method="get">
<input maxlength="100" name="name" size="6" type="text" />
<input type="submit" value="Go" />
</form>

That’ll send people to the following link, where they’ll see the search results as they’re formatted by the Caspio GUI.

http://www.latimes.com/news/local/la-charity-search-name,0,5949050.htmlstory?name=Red Cross
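If you ever need to generate links like that yourself rather than relying on a form, remember that the space in “Red Cross” has to be encoded. Here is a small sketch using Python’s standard library; the function name is my own invention, and the only thing taken from the app is the URL and the “name” variable.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The results page from the form above.
BASE_URL = "http://www.latimes.com/news/local/la-charity-search-name,0,5949050.htmlstory"


def search_url(name):
    """Build a results link with the user's input in the query string."""
    return BASE_URL + "?" + urlencode({"name": name})


url = search_url("Red Cross")

# Reading the query string back recovers the original input.
recovered = parse_qs(urlsplit(url).query)["name"][0]
```

Here urlencode turns the space into a plus sign, which is the same encoding a browser applies when it submits a form with the “get” method.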

Hack 02: Permalinks for easy deep linking

An added benefit of using Hack 01 is that your results pages can have permalinks, albeit long and ugly ones. The link above will always call up the results for a search of “Red Cross,” and if you build all your drilldown pages this way, using a primary key as the external parameter, they’ll each have a distinct URL. That came in handy with the charity story because it allowed me to deep link charity names and types from the story down into the database (e.g., Citizens Against Government Waste and disaster relief).

Hack 03: Low-rent data visualization as a novel entry point

Once you set up the query string, there’s no reason that your custom entry point must be an HTML form. My editors wanted to group the charities by their fundraising efficiency and give readers the chance to look at them group by group (i.e. which are the best, average, worst, et cetera.) We could have made a dropdown box, ordered list or sortable table. But the idea Thomas Lauder and I hatched instead was an interactive grid modeled on the Morningstar Style Box that sorts charities by the size and efficiency of their fundraising efforts. I built it with an old A List Apart trick so that each square links to the list of charities in its category. Take a look at it here. We also made a smaller version, currently on the site’s frontpage and in a story-level reefer. Here’s a hideous screenshot to prove it. You’ll have to go to the site if you actually want to play with it.
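For the curious, here is a rough Python sketch of the logic behind a grid like that: drop each charity into one of nine cells, then give each cell its own query-string link. Everything specific in it is an assumption for illustration, including the cutoffs, the “size” and “efficiency” variable names and the example.com address. None of it is lifted from the real app.

```python
from urllib.parse import urlencode

GRID_URL = "http://example.com/charity-grid"  # hypothetical address


def bucket(value, cutoffs, labels):
    """Return the label for the first cutoff the value falls under."""
    for cutoff, label in zip(cutoffs, labels):
        if value < cutoff:
            return label
    return labels[-1]


def grid_cell(total_raised, share_to_charity):
    """Place a charity in the 3x3 grid. The cutoffs are made up."""
    size = bucket(total_raised, (100_000, 1_000_000), ("small", "medium", "large"))
    efficiency = bucket(share_to_charity, (0.33, 0.66), ("low", "average", "high"))
    return size, efficiency


def cell_url(size, efficiency):
    """Build the link that a grid square would point to."""
    return GRID_URL + "?" + urlencode({"size": size, "efficiency": efficiency})


# A big campaign where the charity keeps six cents on the dollar.
cell = grid_cell(2_600_000, 0.06)
link = cell_url(*cell)
```

Since each square is just a link with parameters, the results page behind it can be built with the same external-parameter setup described in Hack 01.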

Alright, I’ve got a few more up my sleeve, but that’s probably enough for now. Per usual, far be it from me to say that these methods are the only or most efficient route to a solution. They’re just the ones I got done on deadline. Feel free to tell me where I screwed up, or how I can do it better next time.

Gasoline and his pet snake.

While walking to work this morning, I came upon Gasoline. Bike courier by day, graffiti artist by night, he told me his cold-blooded companion, an unnamed snake, enjoys soaking up the sun on warm days here in downtown Los Angeles.

Gasoline said his buddy is a python. I’d like to believe him, but I’m not sure. Palewire massive, what say you?

The view from my window.

This afternoon I experimented with my first effort at photo stitching, using a program called Hugin to piece together the view from my window. Click for greater detail.

The view from my window

As you can see, it’s hardly a perfect job. But I think it fits together well enough. The most obvious flaw seems to be the shift in color that splits St. Vibiana’s in half. I’m a long way from a photo expert, but I suspect that’s caused by the automatic adjustments my camera makes as it saves images in jpg format. There’s probably some easy way to avoid that (shooting all of the photographs in a manually selected adjustment scheme, or RAW format), but I’ll leave figuring that out for another day. But if you are an expert, or if I’m totally off base, please feel free to chide away. I’m eager to learn.

California’s War Dead.

This Memorial Day weekend marked the formal launch of California’s War Dead, our database of the state’s casualties from the wars in Afghanistan and Iraq. It’s the result of a lot of hard work by many people at the paper, a large share of which had already been carried through the years by our many obituary writers.

The site is intended to let users explore the data using a variety of criteria (for example, you can quickly look up fallen troops by hometown, high school or marital status) and to learn more about individuals by reading their obituaries from our back archives. Choice quotes have been selected to “pop” out of the individual profile pages and visitors are encouraged to leave memories and thoughts as comments.

Besides all my coworkers who pitched in to make this happen on a tight deadline, thank yous should be extended to all the great developers in the Django community. They not only provided the Web programming tools that made this idea possible, but also the leadership that showed me how the tools can be used to make journalism for the Web, not just on the Web. The same goes for all the people in the NICAR community who, by leading by example, have pushed me to keep learning new things and have the courage to take chances outside of journalism’s well worn comfort zones. Personally, I just hope that first group can forgive me for ripping off their ideas and that the second group doesn’t resent my getting the opportunity to do things like this without having to put in the once requisite 5 to 10 years on the cops-and-courts beat.

If you’re stretched for time, or maybe doubting there’s anything new to be learned about the war, let me promote a couple spots that might interest you.

  • Over the course of assembling the data, I was surprised to learn how many immigrants to California have died. It’s more than fifty, from Mexico and the Philippines and South Korea and a number of other places. Check out the lists here. A fascinating story is of Sgt. Rafael Peralta of San Diego, who enlisted the same day he received his Green Card and died in Fallouja, Iraq, when he sacrificed himself to save his compatriots from a grenade attack. His profile is here and the story of his heroic death is here.
  • The most rewarding part of the project for me has been to see how quickly we’re getting great, thoughtful comments submitted by friends and family members of the deceased. One of my goals in the design was to give their writing equal footing with our previous reporting. It can be heartbreaking to read, but I’m proud to have helped make something that people think is worthy of such sensitive information. Examples I find particularly moving are the memories shared by the family of Sgt. Jason J. Buzzard of Ukiah and Corporal Christopher D. Leon of Lancaster, who I’m honored to know better now than I did before our commenters contributed.
  • It seems natural to expect that spending so much time with casualty data would have a numbing effect. But I think that’s only the case when we let the very real people we’ve lost remain numbers in a casualty count or unknown names on a page. It’s the stories that bring them to life, and my experience has been that the more stories you hear, the less numb you feel. The pain is in the details. A moving example is Teresa Watanabe’s obituary of Lt. Mark J. Daily of Irvine, who was inspired to join the war by the political writing of war advocate Christopher Hitchens. Hitchens has since gone on to write a moving response to learning of Daily’s readership, and sacrifice, that you can find here.

Am I too hot for an anonymous American newspaper?

Evidence is mounting that my blog is considered too hot for a variety of Web filter programs. Another screenshot — this time submitted by a friend at an anonymous American newspaper — is displayed below.

Hot.

LA red light cameras on your TomTom or Garmin.

Today our A1 features Rich Connell’s look at the effectiveness of all those automated red light cameras positioned around Los Angeles. Here’s the nut:

In Los Angeles, officials estimate that 80% of red light camera tickets go not to those running through intersections but to drivers making rolling right turns, a Times review has found.

One of the most powerful selling points for photo enforcement systems, which now monitor 175 intersections in Los Angeles County and hundreds more across the United States, has been the promise of reducing collisions caused by drivers barreling through red lights.

But it is the right-turn infraction — a frequently misunderstood and less pressing safety concern — that drives tickets and revenue in the nation’s second-biggest city and at least half a dozen others across the county.

Our web package includes some hot tape put together by Rich, an awesome interactive explainer by Raoul Ranoa, the now perfunctory Google Map, and my own little goofy idea: portable downloads for TomTom and Garmin GPS devices (check out the roadblock halfway down the main story).

Loading the points into your device will not only map them on your dashboard monitor; you can also easily program your system to give you an audio warning as you approach upcoming lights, in that same soothing computer voice that already tells you when to turn.

I’m not sure how interested readers will be in this sort of product, but it seemed like a fun experiment. And since Rich had put in a great effort collecting the data from LA’s many fragmented municipalities, it seemed like we ought to go the extra yard with it.

The technical part is pretty easy. Both manufacturers have handy developer guides that — once the data is prepared — only take a couple hours to suss out. Here’s TomTom. Here’s Garmin.
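To give a flavor of the data prep, here is a hedged Python sketch that writes points out as a plain CSV. As I understand it, Garmin’s POI Loader reads CSV rows ordered longitude, latitude, name, so that is the layout assumed here; check the developer guide before trusting it. TomTom’s format is different and isn’t covered. The camera names and coordinates below are placeholders, not real data from the story.

```python
import csv
import io


def to_poi_csv(points):
    """Turn (name, latitude, longitude) tuples into CSV text.

    Column order is longitude, latitude, name. That layout is an
    assumption about what the GPS loader expects.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name, lat, lon in points:
        writer.writerow([f"{lon:.5f}", f"{lat:.5f}", name])
    return buffer.getvalue()


# Placeholder points, invented for illustration.
cameras = [
    ("Placeholder camera A", 34.05, -118.25),
    ("Placeholder camera B", 34.10, -118.30),
]
poi_text = to_poi_csv(cameras)
```

The easy mistake is swapping latitude and longitude, which in this part of the world would quietly park your red light cameras in Antarctica or thereabouts.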

Any thoughts on other newspapery data projects that might work for GPS? The most dangerous intersections? The location of famous landmarks around town?

Ubuntu Recipe: How to automagically post your Last.fm feed to Twitter.

I signed up for Twitter this morning, opening an account at http://twitter.com/palewire. Since I haven’t seen or heard from my cell phone in a week or two, don’t count on much on-the-scene reporting. But I did take a few minutes this morning to line up my Last.fm feed, so that my latest listenings are now automatically Twittered to the huddled masses yearning to have my musical taste shoved down their throats.

For any other Ubuntu users who’d like to follow along, here’s a quick recap on how I made it happen.

1. Move to the folder where you store random scripts. Me, I use…

cd /usr/local/bin

2. Create a new Perl script and open it in gedit.

sudo gedit twitter_fm.pl

3. Copy and paste in the ready-to-serve code provided by Walter Higgins.

4. Edit in your Twitter and Last.fm login information. Save and exit the file.

5. Create a new shell script.

sudo gedit twitter_fm.sh

6. Paste in the following, editing the folder structure to reflect wherever you stuck your steez.

#!/bin/sh
 
perl /usr/local/bin/twitter_fm.pl

7. Set the shell script so it becomes executable.

sudo chmod +x twitter_fm.sh

8. Navigate through the System>Preferences>Session menu as described here and add the shell script to your startup processes.

9. Restart!

I just patched this mess together a couple minutes ago, so there might be some bugs. Either in my setup or in Walter’s script. Don’t know yet. Let me know if you see anything idiotic on my part.
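I haven’t reproduced Walter’s script here, and the following is not his code. But for anyone wondering what a script like this has to do before it posts anything, here is a small Python sketch of one step: squeezing a track into Twitter’s 140-character limit. The function name, the wording of the message and the sample artist are all hypothetical.

```python
MAX_LENGTH = 140  # Twitter's character limit for a status update


def format_status(artist, track, prefix="Listening to"):
    """Build a status line and trim it to fit the character limit."""
    status = f"{prefix} {artist} - {track}"
    if len(status) > MAX_LENGTH:
        # Leave room for the trailing dots that signal the cut.
        status = status[: MAX_LENGTH - 3] + "..."
    return status


# Made-up artist and track, for illustration only.
status = format_status("Example Artist", "Example Track")
```

A real script would also need to fetch the latest track from Last.fm and avoid posting the same song twice, neither of which is shown here.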

I also installed Wordpress’s Twitter Tools plugin, so now my latest blog posts will also be sent out via Twitter.

Also on the Twitter tip, earlier this week we launched a feed at work for our popular political blog, Top of the Ticket. It includes the latest posts from our team of writers, and, on election nights, live election results as they come in. You can sign up here. For anyone looking to reroute their own data streams to Twitter, I can’t recommend Chris Thompson’s Net::Twitter Perl module enough. Easy. Peasy.

More Many Eyes.

Today we sprung what might be the LAT’s first ever data app plugged directly into the front page. Some new foreclosure numbers came in and we were able to quickly turn around the data so users could pop in their zipcode, or drill down and browse around the vast five county area we call “SoCal.”

yep.

Anyway, with a little free time this evening, I ferried the data over to Many Eyes and cooked up a couple data visualizations. They’re too much fun to keep to myself.

First, a visual version of the zipcode search, via ME’s “block histogram.” Try popping in “LA” or “Santa Monica” or 90210. The data isn’t adjusted to account for variations in population, but you can see what a cool spin on the classic search-and-return mechanism this gives you. Not only can you easily learn more about a particular locality, you can — at the same time — see where it falls on the distribution curve.

The second is a bit fancier. It’s a three-dimensional scatterplot charting foreclosure frequency on the Y axis against median household income on the X axis, with the size of the zipcode dots determined by the number of foreclosures per 1000 households (the Z-axis), a number that gives you a nice angle for comparison. Try flipping the Y and Z around, for a fun twist. It gives a quick way to explore the richest and poorest areas hit by the foreclosure boom, and it’s a hell of a lot of fun to mouse around with.
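The per-1000 figure is simple arithmetic, but it is what makes a tiny zipcode comparable to a huge one. Here is a quick Python sketch of the calculation. The zipcode labels and counts are invented for illustration and have nothing to do with the real foreclosure data.

```python
def per_thousand(foreclosures, households):
    """Return foreclosures per 1,000 households."""
    if households <= 0:
        raise ValueError("households must be a positive number")
    return foreclosures * 1000 / households


# Made-up counts: (foreclosures, households)
zipcodes = {
    "zip A": (120, 15000),
    "zip B": (45, 3000),
}
rates = {name: per_thousand(f, h) for name, (f, h) in zipcodes.items()}
```

In this made-up case, zip A has more foreclosures in raw terms, but zip B is hit nearly twice as hard once you account for its size (15 per 1,000 households against 8).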

Or at least I think so. What do you think?

History Meme.

Just to join in the fun, here are the most common commands in the bash history on the Ubuntu Linux machine I run at home.

ben@loftbox:/home$ uname -a
Linux loftbox 2.6.22-14-generic #1 SMP Tue Feb 12 07:42:25 UTC 2008 i686 GNU/Linux
 
ben@loftbox:/home$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}'|sort -rn|head
163 python
152 vim
55 ls
35 cd
16 rm
14 sudo
11 curl
10 vi
7 clear
4 mount

Basically what you’re looking at are the commands I used to write the python recipes I put up on the blog this past week.
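If awk isn’t your thing, the same tally can be sketched in Python. This is a rough equivalent of the one-liner above rather than a replacement for it: it assumes each line looks like the output of the history command, with an entry number first and the command second. The sample lines are made up.

```python
from collections import Counter


def top_commands(history_lines, limit=10):
    """Count the second field of each history line, most common first."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        # Field 0 is the history number, field 1 is the command itself.
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts.most_common(limit)


# Invented history output, for illustration.
sample = [
    "  1  python build.py",
    "  2  vim build.py",
    "  3  python build.py",
    "  4  ls",
]
ranking = top_commands(sample)
```

Like the awk version, it only looks at the first word of each command, so something run behind sudo gets counted as sudo.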

I have been cultured, but not yet sophisticated.

Tonight I attended my first world-class classical music performance (I’m guessing my teenage visit to the Cedar Rapids Symphony probably doesn’t count). It was up Bunker Hill at the Walt Disney Concert Hall where I saw András Schiff perform four of Beethoven’s sonatas, including the famous no. 14, the Moonlight Sonata. My seat had a clear view of Schiff’s gliding hands, an enjoyable sight. I’m not sure I was ready for two hours of piano playing, but it was definitely an impressive show.

For your enjoyment, here’s a YouTube recording of Schiff performing a Schubert sonata.

And, if you’re into this kind of thing, the Guardian has published a series of lectures Schiff gave on Beethoven’s work, including the four sonatas I saw tonight.