Tuesday, February 23, 2010

Inside Google's Secret Search Algorithm [Google]

Source: http://feeds.gawker.com/~r/gizmodo/full/~3/zzkIcilnJp4/inside-googles-secret-search-algorithm

Wired's Steven Levy takes us inside the "algorithm that rules the web"—Google's search algorithm, of course—and if you use Google, it's kind of a must-read. PageRank? That's so 1997.

It's known that Google constantly updates the algorithm, with 550 improvements this year—to deliver smarter results and weed out the crap—but there are a few major updates in its history that have significantly altered Google's search, distilled in a helpful chart in the Wired piece. For instance, in 2001, they completely rewrote the algorithm; in 2003, they added local connectivity analysis; in 2005, results got personal; and most recently, they've added in real-time search for Twitter and blog posts.

The sum of everything Google's worked on—the quest to understand what you mean, not what you say—can be boiled down to this:

This is the hard-won realization from inside the Google search engine, culled from the data generated by billions of searches: a rock is a rock. It's also a stone, and it could be a boulder. Spell it "rokc" and it's still a rock. But put "little" in front of it and it's the capital of Arkansas. Which is not an ark. Unless Noah is around. "The holy grail of search is to understand what the user wants," Singhal says. "Then you are not matching words; you are actually trying to match meaning."

Oh, and by the way, you're a guinea pig every time you search for something, if you hadn't guessed as much already. Google engineer Patrick Riley tells Levy, "On most Google queries, you're actually in multiple control or experimental groups simultaneously." It lets them constantly experiment on a smaller scale—even if they're only conducting a particular experiment on .001 percent of queries, that's a lot of data.

Be sure to check out the whole piece, it's ridiculously fascinating, and borders on self-knowledge, given how much we all use Google (sorry, Bing). [Wired, Sweet graphic by Wired's Mauricio Alejo]