June 28, 2005
Isn't It del.icio.us?
In my continuing survey of tags, tagging, and folksonomies, I now turn to del.icio.us. (Previous posts on the Googleplex Blog and O'Reilly covered Technorati, universal tagging, Folksonomic Discovery, and Gataga.)
First, the domain name del.icio.us is clever beyond belief. In fact, it is delicious. Whether or not tagging is important, these people should be given credit for cool word play.
To use del.icio.us, you need to register for the service. Once logged in, you can add "bookmarklets" to your browser. In Internet Explorer, the bookmarklets sit on the browser's Links menu.
Adding these bookmarklets to your browser lets you tag web pages - meaning, you can add one-word descriptors. You can view your own tags for pages, and those of others, using another bookmarklet in your browser. Tags can also be discovered in quite a few other ways: by URL, by tag, or by author (and more), and sliced and diced via RSS. For example, each web page that has been tagged in del.icio.us has an RSS feed (the tags are the items in the feed).
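To make the "tags are the items in the feed" idea concrete, here is a minimal Python sketch that pulls the item titles out of an RSS document. The sample feed is made up for illustration: its layout (one tag per item title) is my assumption, not the documented del.icio.us feed format, and the function name tags_from_feed is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. The element layout (one tag per <item><title>)
# is an assumption for illustration, not the real del.icio.us format.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Tags for http://example.com/</title>
    <item><title>folksonomy</title></item>
    <item><title>tagging</title></item>
    <item><title>folksonomy</title></item>
  </channel>
</rss>"""


def tags_from_feed(feed_xml):
    """Return the item titles (treated here as tags) in feed order."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("title", default="").strip()
            for item in root.iter("item")]
```

Calling tags_from_feed(SAMPLE_FEED) gives back the three titles in order, duplicates included, which is what you would want before counting how often each tag was applied.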
There's also a nice API in del.icio.us, and quite a few programming hacks already out there that use this folksonomy.
Finally, each user has a home page which is an easy way to access del.icio.us tags, users, and categories they've bookmarked.
So why should one bother with this? First, there's an appeal to self-interest. It makes sense to appropriately tag one's own pages in del.icio.us. Good tagging should lead to more traffic in much the same way that developing good meta information helps webmasters with the search engines. Second, it is fun to categorize our worlds. Third, since del.icio.us tags are used as part of the core library of other web folksonomies (for example, Technorati), the effort you put into tagging via del.icio.us contributes to the common welfare. Finally, browsing popular tags is a good discovery mechanism - both for ad hoc serendipity and for following people who are interested in a research question similar to something you may be looking into.
Once again, if I am missing something, please let me know!
Posted by Harold Davis at
09:59 AM
June 26, 2005
What Is It about Technorati and Tagging?
I'm going to survey the world of tagging on the web, because I think this is an important (and occasionally obscure) area. Tagging has been dubbed the process of creating a folksonomy because it creates categorization schemes from the ground up, rather than imposing hierarchies based on the opinions of experts.
Let's start with Technorati. Technorati bills itself as the leading authority on what is going on in weblogs.
A little less nebulously, Technorati offers three important facilities:
- You can claim a weblog
- You can mark a post with a tag
- You can contribute your own tags
For these services to be meaningful, Technorati needs a library of tags and associations. This compendium comes from photos tagged in Flickr, links tagged in Delicious and Furl, and tags suggested by bloggers.
To claim a blog, you create a Technorati account, enter the URL for your blog, click the Claim Weblog button, add a little JavaScript code to the front page (or sidebar) of your blog, and ping Technorati to tell them you've added the code. (Here are instructions and discussion in the Technorati blog.) The code on your blog front page adds a link to all the blogs that link to yours. For example, I followed this process for Photoblog 2.0; you can see the Technorati logo and link part way down the sidebar on the right.
To add a blog entry of yours to the Technorati discussion "about" a given tag, search in Technorati for the tag, and then add the generated HTML code for the tag to your blog entry. For example, I am adding the Technorati link code for the tags Technorati, Tags, tagging, and folksonomy to this entry. In my opinion, this process and facility is the most important Technorati offering - it's a great way to direct traffic interested in an area of discussion to your blog. This makes for high-quality traffic, because the content is relevant, and it makes it in the interest of bloggers to extend the Technorati folksonomy.
To add tags to the Technorati collection, you can add a tag to one of the sources Technorati uses, such as Flickr. Alternatively, you can add your tag in one of two ways within your blog page (more information). Creating a category within a blog tells Technorati to treat the name of the category as a tag.
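As a rough illustration of what tag link code can look like, here is a small Python sketch that builds an anchor marked rel="tag" for a given tag. The base URL and the exact link pattern are my assumptions, as is the helper name tag_link; use the HTML that Technorati generates for you rather than relying on this.

```python
from html import escape
from urllib.parse import quote


def tag_link(tag, base="http://technorati.com/tag/"):
    """Build an anchor marked rel="tag" for one tag.

    The base URL and link pattern are assumptions for illustration.
    """
    href = base + quote(tag)  # percent-encode spaces and other unsafe characters
    return '<a href="{}" rel="tag">{}</a>'.format(
        escape(href, quote=True), escape(tag))
```

For example, tag_link("folksonomy") produces a single anchor whose URL ends in the tag name and whose link text is the tag itself. A tag containing a space gets percent-encoded in the URL but is left readable in the link text.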
Posted by Harold Davis at
09:45 AM
June 24, 2005
AdSense Site Search Box
As you may know, Google's AdSense program has two major parts: Content and Search.
When one thinks AdSense, one probably thinks content. These are the text ads that appear on sites. But AdSense Search is important, and a real revenue opportunity for content sites as well. With AdSense Search, you put a Google search box on your site. When visitors search using the box, ads are displayed on the search results page. If a visitor clicks through one of the ads, the publisher who displayed the search box makes some money.
There are a couple of neat extensions to this that Google has come up with as part of the AdSense program. You can configure the search box to search sites as well as the web (up to three domains). The caveat here is that you can only search that which Google has indexed, so the site search may not be as good as a true keyword based site search mechanism.
Also, Google recognizes that the search result return page is in some ways part of the publisher's site, even though it is actually served by Google. As a publisher in the AdSense program, you have the opportunity to choose your own graphics scheme, and add your own logo that clicks back to your site.
By way of example, you can check out the Google site search box I configured for Braintique (note that if you search for something, the results page will have a logo you can click to return to Braintique) on the bottom of this page or on the Braintique home page.
Posted by Harold Davis at
09:14 AM
June 19, 2005
Towards a Universal Tagging Standard
I recently wrote (Folksonomic Discovery) about the importance of folksonomy and the dawning era of tagging for the people. But the problem with tagging is that it is balkanized. Flickr, 43 Things, and del.icio.us tags don't speak to one another.
A universal tagging standard (or protocol) would solve this problem. Flickr, 43 Things and all the other tags would be the same kind of animals, and tagging applications written on top of this animal kingdom could provide the visualization, correlation, linking, and other services that are needed.
The catch, of course, is that for a standard to be useful it has to be widely adopted. This is a critical mass issue. With enough vendors implementing tagging using a given standard, pressure can be brought on everyone else to implement their tags in the same way - and to tag in the first place.
But different competing tagging standards might make it hard to write services across the entire universe of tags.
Spaceman, at de.lirio.us, points out to me that de.lirio.us uses tags that follow the xFolk format for tagging. Here's an example of xFolk as it can be used.
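To give a feel for how a tagging application might read xFolk-style markup, here is a hedged Python sketch. It assumes, from my recollection of the xFolk draft, that each bookmark sits in an element with class "xfolkentry", the bookmarked link carries class "taggedlink", and each tag is an anchor with rel="tag". The sample markup, the example.org URLs, and the names XFolkParser and parse_xfolk are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup; class names follow my recollection of the xFolk draft.
SAMPLE = """
<div class="xfolkentry">
  <a class="taggedlink" href="http://example.com/">An example page</a>
  <a rel="tag" href="http://example.org/tag/folksonomy">folksonomy</a>
  <a rel="tag" href="http://example.org/tag/tagging">tagging</a>
</div>
"""


class XFolkParser(HTMLParser):
    """Collect the bookmarked URL and tag names from each xfolkentry."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._in_tag = False  # True while inside an <a rel="tag"> element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        rels = (attrs.get("rel") or "").split()
        if tag == "div" and "xfolkentry" in classes:
            self.entries.append({"url": None, "tags": []})
        elif tag == "a" and self.entries:
            # Simplification: links are attributed to the most recent entry.
            if "taggedlink" in classes:
                self.entries[-1]["url"] = attrs.get("href")
            elif "tag" in rels:
                self._in_tag = True

    def handle_data(self, data):
        if self._in_tag and data.strip():
            self.entries[-1]["tags"].append(data.strip())

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tag = False


def parse_xfolk(html_text):
    parser = XFolkParser()
    parser.feed(html_text)
    return parser.entries
```

The point of the sketch is that once everyone marks up tags the same way, one small parser can read bookmarks from any site that follows the convention.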
Posted by Harold Davis at
10:32 PM
The Googleplex Blog is 200!
Happy Birthday, Googleplex Blog! Well, it's 200 entries, not 200 years (it has only been six months). But 200 entries is really quite a lot - a small book in its own right if you care to look at it that way.
Regrets? Perhaps I would have chosen another name if I had realized how far afield from Google I would wander.
I thought the 200th entry would be a good place for me to look back at some (well, many :-) ) of my favorites. Thanks for reading. Harold Davis
Some of my favorite posts:
Post 1 Jan 11, 2005 Getting Started!
Modest beginnings: "I plan to use this blog to answer questions about research, Google, and using the Google APIs." Also, in support of Building Research Tools with Google for Dummies.
Post 20 Jan 20, 2005 The Lord of the Rings
Starting to go further afield!
Post 27 Jan 21, 2005 Make Lay Pay
About AdSense (Really!)
Post 33 Jan 24, 2005 RSS Rocks!
Syndication becomes an obsession for me!
Post 38 Jan 27, 2005 Dr Dobson and Squarepants Spongebob
Dr Dobson gets spanked!
Post 43 Feb 1, 2005 Dadiaries
Getting personal about fathering
Post 45 Feb 3, 2005 Beyond the Valley of the Google Search Dolls
Google makes its money from advertising
Post 50 Feb 8, 2005 Euphemism Du Jour "Intelligent Design"
Post 62 Feb 15, 2005 Haruki Murakami
Over the borders of the everyday
Post 64 Feb 16, 2005 Power to the Blogosphere!
Jeff Gannon, Eason Jordan
Post 74 Feb 19, 2005 The Anti-Gates
Bill, Cracker, Somer, and Duplo
Post 79 Feb 21, 2005 The real origins of cyberspace
Post 85 Feb 24, 2005 Dead People Don't Validate
Dead People RSS Feed
Post 92 March 1, 2005 Words for Sale
Google meets the Phantom Tollbooth
Post 99 March 4, 2005 Wal-Mart and Google slug it out!
A big Googlefight
Post 110 March 16, 2005 The Decline and Fall of VB6
Whatever happened to Visual Basic?
Post 123 March 22, 2005 Google Code
The Google APIs now in one place
Post 126 March 27, 2005 Contextual Advertising: Not
Post 130 March 31, 2005 Publish the PageRank Algorithm Now!
More than 100 variables is too many!
Post 139 April 9, 2005 It's Time to Scour the Shire!
Post 147 April 18, 2005 Maps and Satellite Photos @ Google
Post 155 April 22, 2005 Statistically Improbable Phrases (SIP)
Post 166 May 2, 2005 Code for Stripping Google Ads from RSS
PHP, a regular expression, what else do you need?
Post 173 May 11, 2005 My new digital photography site is up!
Now photos go on my *other* blog!
Post 177 May 17, 2005 The Long Tail!
Post 183 May 20, 2005 Google Maps Captures UFO
Post 185 May 24, 2005 Google in the Enterprise
Post 186 May 31, 2005 ODP in Trouble
Post 191 June 8, 2005 Nigritude Ultramarine, Seraphim Proudleduck, and Loquine Glupe
Post 199 June 17, 2005 Folksonomic Discovery
43 Things, Flickr, and del.icio.us
Posted by Harold Davis at
09:40 PM
June 17, 2005
Folksonomic Discovery
In a previous entry, I wrote about 43 Things and Flickr. These are two interesting, trendy, and (at least in the case of Flickr) extremely useful applications. (I think that 43 Things may be powerful in its own way, too.)
Both applications are well worth a look if you don't know them. You use 43 Things to create a list of personal goals (here are some of mine, and a further discussion of the application). Flickr, in contrast, is used to share photos with a global community of photographers - and also for off-site image management, as in my Photoblog 2.0 and Digital Field Guide.
Flickr and 43 Things have in common that they provide a self-tagging mechanism. In 43 Things, you can apply tags you create to goals. In Flickr, you tag photos. A context in which everyone can freely tag (and categorize things) has come to be called a "folksonomy". Put differently, a folksonomy is a bottom-up taxonomy created by the people for the people rather than a top-down hierarchy constructed by experts - the usual model for a taxonomy.
These folksonomies are very useful for sorting, searching, categorizing, and making relevance determinations within an application. Both 43 Things, on its home page, and Flickr, on the Flickr Tags page, make use of a common visual metaphor in which the larger the font size of the tag, the more people have applied it (and the more important it is).
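The font size metaphor is easy to sketch. Here is a minimal Python version that maps tag counts linearly onto pixel sizes. The size range and the linear scaling are my assumptions (I don't know what formula either site actually uses), and the function name font_sizes is hypothetical.

```python
def font_sizes(tag_counts, min_px=10, max_px=32):
    """Map each tag's count linearly onto a font size in pixels."""
    if not tag_counts:
        return {}
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    span = hi - lo
    sizes = {}
    for tag, count in tag_counts.items():
        # When every tag has the same count, fall back to the smallest size.
        scale = (count - lo) / span if span else 0.0
        sizes[tag] = round(min_px + scale * (max_px - min_px))
    return sizes
```

With counts of 1, 6, and 11, the least-used tag lands at the minimum size, the most-used at the maximum, and the middle one halfway between. A logarithmic scale is a common alternative when a few tags dwarf the rest.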
"Social bookmark" manager del.icio.us lets you tag and categorize web links, creating a web folksonomy competitive with web taxonomies like that of the ODP. (Technorati provides a somewhat reverse service which allows you to track usage by tag in weblogs.) So del.icio.us and Technorati have created folksonomy-related services that distribute across the myriad sites in the web.
But what about aggregating folksonomic discovery across applications (as opposed to sites)? Why shouldn't I be able to cross-correlate 43 Things tags with Flickr tags?
A beta application named Gataga uses a frankly Google-esque user interface to aggregate social bookmark tags from del.icio.us, blogmarks, blinklist, jots, spurl, furl, simpy and connotea.
Gataga will display its folksonomic search results as an RSS feed (just as Technorati does), which is very useful: you can subscribe to stay updated. But there are big missing pieces in this application. For one thing, it doesn't include 43 Things and Flickr, off the beaten track of social bookmarking spanning web content, but far and away my favorites for fun and utility as self-tagging folksonomies.
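Cross-correlating tags across applications is not hard in principle once you have the counts in hand. Here is a minimal Python sketch, assuming each service can hand you a simple tag-to-count mapping. That is an assumption, the sample numbers in the note below are invented, and the function names are hypothetical.

```python
from collections import Counter


def merge_tag_counts(sources):
    """Sum per-tag counts across services, normalizing case and whitespace."""
    total = Counter()
    for counts in sources.values():
        for tag, n in counts.items():
            total[tag.strip().lower()] += n
    return total


def shared_tags(sources):
    """Return the tags that appear in more than one service, sorted."""
    seen = Counter()
    for counts in sources.values():
        # A set ensures each service contributes at most once per tag.
        seen.update({tag.strip().lower() for tag in counts})
    return sorted(tag for tag, k in seen.items() if k > 1)
```

Given {"flickr": {"Berkeley": 3, "garden": 2}, "43things": {"berkeley": 1, "novel": 5}}, the merged count for "berkeley" is 4 and it is the only shared tag. The hard part in practice is deciding when two differently spelled tags mean the same thing, which this sketch only handles for case and stray whitespace.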
There's also the issue of what you do with the folksonomic information to make it easy to grasp and genuinely useful. There has to be more than the font size = number of instances visual metaphor. I'd like to see graphic representations of similarity, relevance, occurrence, and connection using dynamic link maps. This stuff has a ways to go.
Still, it is a big mistake to underestimate the power of bottom-up technology movements (witness Linux and open source). An apparently humble concept, self-tagging and the folksonomy, has the potential for toppling the hegemony of indexed search as the predominant way we find information on the web.
Posted by Harold Davis at
09:21 AM
June 16, 2005
My 15 (out of 43) Things
If you haven't visited 43 Things, you are missing one of the great current phenomena on the web.
Visitors to the 43 Things site are asked to list 43 things they want to do today, tomorrow, or in general in their life. It's best to establish an account before you start doing this. This sounds like an amusement, or at best an exercise in personal improvement, and this is a fair characterization.
43 Things is also an exercise in using lightweight web development technologies that are often called by the loose term Ajax. In some respects, 43 Things is not the best advertisement for this set of development tools: the UI is irritating and the database connectivity flaky at best.
Where 43 Things does better is as a genuine community site. Who else near you has the same goals? What advice can other members give about the goals? What goals have others found related to your goals, and so on? From this viewpoint, 43 Things is a great social experiment.
43 Things is also interesting as a site that provides a tagging mechanism. You can tag things as you like, and in the aggregate 43 Things is becoming an interesting collection of searchable tags and tag cross-references: a genuine folksonomy (although perhaps not as useful as my favorite folksonomy site, Flickr, which lets you add tags to photos).
I'd suspect that 43 Things is also a great host for Google AdSense ads, because when contextual ads come up they are relevant to one's goals.
So now I've made a pretty good start on 43 Things' 43 Things, but what about mine? I've got up to 15 so far. Here they are:
1 Watch my children grow
2 spend more time in the mountains
3 be financially solvent
4 live passionately
5 write a novel
6 have great sex
7 visit New Zealand
8 Lose a bit of weight
9 Write a neat program
10 be more thoughtful
11 Watch the sunrise from the top of Mt. Whitney (again!)
12 Scuba dive (again!)
13 Spend more time gardening
14 Teach my kids to garden
15 Show my kids Pompeii and Delphi
P.S. It's also somewhat satisfying to see all the goals other people have that I *have* already done!
Posted by Harold Davis at
06:08 PM
June 15, 2005
Google and the George W. Bush Fart Doll
Brad Hill in the Google Unofficial Weblog has brought to my attention a story that is circulating that Google exhibits a pro-Clinton (and, by inference, anti-Bush and pro-left-wing) bias.
The story is that Google's AdWords unit declined to run contextual ads for a book bashing Clinton published by World Ahead Publishing. The information comes from a press release put out by World Ahead, a "premier publisher of conservative and Libertarian books" based in Los Angeles.
Now, I've found that AdWords does accept or reject ads in a capricious and silly fashion (this is presumably a function of inadequate software, not political bias), and that there is no effective appeal from an AdWords rejection. That said, some of the claims in the World Ahead "Google Censors Ads for Anti-Clinton Book" press release appear to be false, to wit:
- Google has accepted ads for a George W. Bush Fart Doll. This does not appear to be the case. A Google search for George W. Bush Fart Doll yields Google results pages with plenty of Bush dolls in the AdWords ads, but no flatulence (at least in the ads).
- Claims of "political bias" and "liberal leanings" in ad acceptance policies also are false, in my experience. I've previously noted the creationist Google ads that consistently show up on my blog items blasting intelligent design. And my wife's site about successfully managing high-risk pregnancies was getting ads for anti-abortionists until we took steps to ban these organizations (by listing their URLs with Google) from the site. To me, these ads seem to show a conservative bias - and a true contextual analysis would not have placed them on our sites. (Note to readers of whatever political persuasion: if you see a Google ad for an organization, cause or politician you truly hate, by all means click the link. Each time an AdWords link is clicked, it costs the advertiser!)
Admittedly, there are some anomalies. For example, the first search return link in Google for the miserable failure is famously the official White House biography of George W. Bush. (If you enter "miserable failure" in the Google search box and click the I'm Feeling Lucky button, this is the page that will open.)
Personally, I don't disagree with the equation of President Bush with miserable failure (to wear my politics on my sleeve). However, the result appears to have to do with Google bombing rather than intentional bias on the part of Google. As such, it speaks to the automated contextual analysis and relevance ordering algorithms Google uses hitting a wall - particularly when confronted by perpetual efforts to game the system - rather than intentional bias.
Related entries: Publish the PageRank Algorithm, Humans Tweak Google Rankings
Posted by Harold Davis at
10:34 AM
Addendum to 3-D Mapping @ Google
I didn't make it clear in my story on Google's ground level mapping campaign and truck here and in my ORA blog that Google already has 3-D mapping based on satellite photos (although you likely know this if you've been reading my blog).
To use this feature, go to Google Maps. Enter an address (or don't enter an address - you can do it after you've started 3-D satellite mapping). Click on the Satellite link shown to the far right of the screen capture below (the capture shows a somewhat blurry view roughly centered on my home in Berkeley). Simple. And cool.

Posted by Harold Davis at
09:54 AM
June 14, 2005
Continuing Developments on the Pornstar Dinner with POTUS Story
In a previous story, I detailed the plans for pornstar Mary Carey and porn impresario Mark Kulkis (CEO of Kick Ass Pictures) to dine with President Bush at a dinner arranged by a Republican business group. Now it appears that the dinner is still on, although the White House declines to comment.
Posted by Harold Davis at
04:19 PM
Google 3D Mapping Truck Coming Soon to a City Near You!
According to SiliconValleyWatcher, Google is planning to use trucks equipped with lasers and digital mapping software to create realistic 3-D maps from the ground. There's already an experimental truck cruising San Francisco, which is running into some problems with line of sight measurements due to pedestrians and vehicles.
Apparently, second and subsequent passes by the trucks through the city could eliminate erroneous data due to moving objects. But Google is looking for a way to 3-D map a city with a single pass.
I've been wondering for a while about the arms race into very cool mapping software - wonderful stuff to play with (interesting anomalies and all), but without a clear path to monetization (at least to me). A glimmer of where this is going is beginning to dawn on me: a very sci-fi world of realistic virtual mapping of local information truly would be a yellow pages killer, and Google has the resources and smarts to maybe pull this off.
Posted by Harold Davis at
10:36 AM
June 12, 2005
World's Longest Domain Name
http://www.thelongestdomainnameintheworldandthensomeandthensomemoreandmore.com/ claims to be the world's longest domain name. But is it? It turns out that this is a matter of definition. According to the domain registrars, the longest legal domain name is 63 characters starting with a letter or number.
If you include subdomains (which precede the primary domain name and are followed by a period) you can get longer, probably up to some limit supported by individual browser software.
If you include domain suffixes in your character count you also can get longer (+4 for .com and +6 for .co.uk, for example).
I don't think either subdomains or suffixes should be included in the search for the world's longest domain name, meaning the best you can do is tie for first with 63 characters.
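If you want to check a candidate yourself, here is a small Python sketch of my definition: ignore subdomains and the suffix, and measure only the name part. The 63-character limit and the rule about starting with a letter or number come from the registrars as described above. Also requiring the name to end with a letter or number and allowing hyphens in between is my assumption about the usual rule, and the function names are hypothetical.

```python
import re

MAX_LABEL = 63  # per-name limit cited above

# Letters, digits, and hyphens; starts and ends with a letter or digit.
# The "ends with" part and the hyphen allowance are assumptions.
LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def name_label(hostname, suffix_labels=1):
    """Return the part just left of the suffix ('example' in www.example.com).

    suffix_labels is how many dot-separated parts the suffix has:
    1 for .com, 2 for .co.uk.
    """
    labels = hostname.lower().rstrip(".").split(".")
    return labels[-(suffix_labels + 1)]


def is_legal_label(label):
    """True if the name part fits the length and character rules above."""
    return 0 < len(label) <= MAX_LABEL and bool(LABEL_RE.match(label))
```

Running name_label on the claimed record holder and taking its length gives exactly 63, so by this definition it really is at the limit. For a .co.uk address, pass suffix_labels=2 so the "co" is not mistaken for the name.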
As Esther Dyson notes, nobody will type in these long names - they are opened by clicking links or selecting from a list.
Some other fun long domain names:
http://3.141592653589793238462643383279502884197169399375105820974944592.com/: the 3 is a subdomain, I like the photo of Dr. Evil
http://www.abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijk.com/: a free email service
http://www.llanfairpwllgwyngyllgogerychwyrndrobwyll-llantysiliogogogoch.com/: named after a Welsh village, claims to be the world's longest domain name (of course, by my definition, it is at best tied) and in Guinness records as the world's longest domain name (turns out the title is for world's longest domain named after a real place)
Here are more longest domain names according to the Internet Book of Records (a strange name for a web site). Thanks to comments on Google Blogoscoped for getting me started on this trivia.
Posted by Harold Davis at
11:33 AM
More on Google Human Tweaking
There's considerably more commentary on Henk Van Ess's blog about the truth of his disclosures that the Google Eval lab exists, whether or not Google Eval Labs are actually used to impact search results, and the ethics of Van Ess's disclosure of this NDA protected material.
For background on this interesting story, see my earlier posts to the Googleplex Blog and on O'Reilly.
Posted by Harold Davis at
10:48 AM
June 08, 2005
Nigritude Ultramarine, Seraphim Proudleduck, and Loquine Glupe
SEO (search engine optimization) contests take nonsense phrases such as (most famously) nigritude ultramarine, seraphim proudleduck, and loquine glupe. At the time the contest begins, the nonsense phrase (which may consist of "real" words) generates no results when entered as a query into Google. Afterwards, of course, this is no longer true. For example, nigritude ultramarine currently gets about 116,000 results in Google.
The contest ends after a specified amount of time, usually six months. The Webmaster whose site is ranked first in Google's search results for the contest phrase wins. Here's the FAQ for nigritude ultramarine, probably the best known of the completed SEO contests sponsored by DarkBlue and SearchGuild.com. (Good SEO forums at this last destination.)
How does a site get to the top of the results for one of these bizarre phrases? Almost anything goes, but the key points are getting inbound links, arranging for an ODP listing (or even two) and keyword stuffing. For an example of rather hilarious keyword stuffing, check out the winner of the loquine glupe contest.
At a time when Google's market capitalization has passed Time Warner's (as of June 8, 2005), it's worth remembering that Google is built on stuff such as nigritude ultramarine, seraphim proudleduck, and loquine glupe - not to mention text ads for Mark Felt, the real Deep Throat.
Posted by Harold Davis at
09:47 AM
Frank Lloyd Wright Google

Posted by Harold Davis at
09:19 AM
June 02, 2005
Humans Tweak Google Rankings
I've long believed that Google's ranking of responses to search queries--the famous PageRank algorithm, now with more than 100 variables--is manipulated by human editors working for Google under an algorithmic facade. (See related posts on the Googleplex Blog and in my O'Reilly blog).
Now, there's some hard evidence that this is true. Dutch investigative reporter and search expert Henk Van Ess blogs about what he calls Google's Secret Evaluation Lab.
The real name for this secret part of Google is Rater Hub Google. It's staffed, mostly on a temp basis, mostly from international universities. Google calls these hires "international agents" or "quality raters." Here's a help wanted ad for the position from Monster.com.
Quality raters apparently spend their time checking search results, deprecating spam, moving the best results to the top of the search result stack, and (possibly) testing experimental Google features. This sounds like a kind of fun job!
Seriously, it isn't really surprising that Google has found the need to inject human editors into the equation. My objection is to the false pretence that Google's results derive from some purely formulaic (and supposedly objective) measure (likened in my previous posts to the Wizard of Oz hiding behind a screen while he makes a show for Dorothy and the others).
The Henk Van Ess blog item is really worth checking out. He promises more information to come. By all means review the Flash presentation on his site that shows some of the Rater Hub Google software - very interesting indeed!
Posted by Harold Davis at
03:13 PM