October 31, 2005
Interestingness Is to Flickr As PageRank Is to Google
Interestingness is a measure of the popularity of a photograph on Flickr. While the exact recipe for interestingness is a secret--"proprietary" in corporate speak--here's the general idea as told by Flickr itself:
"There are lots of things that make a photo 'interesting' (or not) in the Flickr. Where the clickthroughs are coming from; who comments on it and when; who marks it as a favorite; its tags and many more things which are constantly changing. Interestingness changes over time, as more and more fantastic photos and stories are added to Flickr."
There's an implication of recursiveness in this description. My photograph will have more interestingness if the people who view it, comment on it, and mark it as a personal favorite themselves have a high interestingness quotient--which, in turn, can only be the case if these people have been favorited (etc.) by those with relatively high interestingness.
The recursive nature of Flickr's interestingness makes it analogous to Google's PageRank--the site ranking system in which a site's rank depends fundamentally on the PageRanks of the sites that link to it (its "inbound links").
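To make the recursion concrete, here is a minimal sketch of the textbook, simplified form of PageRank, computed by repeated passes over a toy link graph. It is not Google's production system, and it says nothing about Flickr's secret recipe. The function name, the toy graph, the damping value of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a small baseline share, whatever its links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Toy web: a, b, and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph, page c ends up ranked highest because it has the most inbound links, and page a outranks b and d only because its single inbound link comes from the highly ranked c. That second effect is the recursive part.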
You could also look at Amazon's collaborative filtering mechanism ("people who bought what you bought also bought x, y, and z") as somewhat similar, although the recursive nature of the process is lacking.
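The "also bought" idea can be sketched as simple co-purchase counting. This is only an illustration of the general technique, not Amazon's actual mechanism, which isn't described here. The function name and the sample baskets are hypothetical.

```python
from collections import Counter


def also_bought(baskets, item, top_n=3):
    """Return the items most often bought together with `item`.

    `baskets` is a list of sets, one set of item names per customer.
    This is plain co-occurrence counting, an assumed stand-in for
    whatever a real store does.
    """
    counts = Counter()
    for basket in baskets:
        if item in basket:
            # Count every other item that shares a basket with `item`.
            counts.update(other for other in basket if other != item)
    return [name for name, _ in counts.most_common(top_n)]


# Hypothetical purchase history.
baskets = [
    {"camera", "tripod", "memory card"},
    {"camera", "memory card"},
    {"camera", "memory card", "lens cloth"},
    {"tripod", "lens cloth"},
]
best_match = also_bought(baskets, "camera", top_n=1)
```

Note that a single pass over the purchase data is enough. Nothing feeds back into itself the way a rank does in PageRank, which is the sense in which the recursion is lacking.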
Interestingness on Flickr, PageRank on Google, and collaborative filtering on Amazon all point to one fundamental about the Web: its power is in community.
My experience with Flickr is that interestingness works fairly well. It is closely (but not exactly) correlated with number of views and number of people who have "favorited" an image.
When a community works well together it is a democracy in the best sense. However, a collection of people can easily degenerate into a conformist mob, and may not have good judgment--a hazard for these recursive and collaborative algorithms. The best photo in the world on Flickr will not be ranked very interesting if people with good interestingness kudos don't recognize it, just as the best Web content in the world may go ignored by Google without the right kind of inbound links.
It's also interesting that interestingness changes over time. As with Google rankings, an interestingness ranking on Flickr is fickle and ephemeral. Today's top photo may well not be ranked so well some time from now.
Here's my (currently) most interesting photo on Flickr (I like the photo too!):

Posted by Harold Davis at 01:53 PM
June 19, 2005
Towards a Universal Tagging Standard
I recently wrote (Folksonomic Discovery) about the importance of folksonomy and the dawning era of tagging for the people. But the problem with tagging is that it is balkanized. Flickr, 43 Things, and del.icio.us tags don't speak to each other.
A universal tagging standard (or protocol) would solve this problem. Flickr, 43 Things, and all the other tags would be the same kind of animal, and tagging applications written on top of this animal kingdom could provide the visualization, correlation, linking, and other services that are needed.
The catch, of course, is that for a standard to be useful it has to be widely adopted. This is a critical mass issue. With enough vendors implementing tagging using a given standard, pressure can be brought on everyone else to implement their tags in the same way - and to tag in the first place.
But different competing tagging standards might make it hard to write services across the entire universe of tags.
Spaceman, at de.lirio.us, points out to me that de.lirio.us uses tags that follow the xFolk format for tagging. Here's an example of xFolk as it can be used.
Posted by Harold Davis at 10:32 PM
May 17, 2005
The Long Tail
Reader beware: this is a tale of buzzwords, so if buzzwords bug you, you might want to pass. In fact, it is a long tale of two buzzwords. They are the best of buzzwords, and the worst of buzzwords. And so it begins...
The buzzword, or phrase, that’s my primary topic is “The Long Tail.” “The Long Tail” is a meme—meaning a phrase used to denote a topic of general community discussion. According to various definitions, a meme is (1) an idea that can replicate and evolve, and (2) a basic unit of cultural information subject to mutation, crossover, and adaptation. The use of the term “meme” is credited to evolutionary biologist Richard Dawkins (he used it in his 1976 book The Selfish Gene). Dawkins is the author of many fine books besides The Selfish Gene, including notably The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design (1986).
“Meme” is, of course, a meme itself, and as such has become a meta-meme. The genetic metaphor is interesting, but from a common sense viewpoint my first, simplest definition (that a meme is a buzzword that is the focus of community discussion) isn’t far off. But I digress. Digression is one of the problems with memes.
To digress further, I like the phrase “the long tail.” It’s evocative. It makes me think of good luck dragons, the greatest story ever told, fur hats, beavers, and particularly Mr. and Mrs. Beaver in C.S. Lewis’s The Lion, the Witch, and the Wardrobe. If you can’t follow this train of associations, don’t worry: it’s my personal weirdness. I also have a racier association, but let’s not even go there.
To statisticians, economists, and econometricians, “the long tail” has been a fairly arcane bit of professional slang meaning the long downward slope of many distribution curves, such as the Pareto distribution named after the Italian economist and social theorist Vilfredo Pareto.
Pareto illustrated his distribution curve with his “80-20 rule” regarding wealth in society: 20% of the people own 80% of the wealth. The long tail is (as seen on a graph) the lower and lower probability of a given person having much wealth.
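The 80-20 figure can be tied to the shape of the curve with a small worked formula. For a Pareto distribution with shape parameter alpha greater than 1, the share of the total held by the richest fraction p of the population is p raised to the power (1 - 1/alpha). The sketch below assumes that standard textbook result; the function and constant names are mine, not Pareto's.

```python
import math


def top_share(p, alpha):
    """Share of total wealth held by the richest fraction `p`.

    Assumes wealth follows a Pareto distribution with shape `alpha` > 1
    (the mean is not finite otherwise).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    return p ** (1.0 - 1.0 / alpha)


# The shape value at which the top 20% hold exactly 80%: log(5) / log(4).
ALPHA_80_20 = math.log(5) / math.log(4)
share_of_top_fifth = top_share(0.2, ALPHA_80_20)
```

A larger alpha gives a thinner tail and a more even split; as alpha approaches 1, nearly everything concentrates in a very few hands.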
Pareto distribution curves have been found, with the long tail intact, in a wide variety of things, including the frequency of words in a text, the file size distribution of TCP/IP traffic (few large files, many smaller ones), and the value of oil reserves in oil fields (a few very valuable fields, many less valuable ones).
Incidentally, I’ve focused on the Pareto distribution here because I am so fond of Pareto the historical character, who was not only a brilliant thinker but also stood up for social justice, was an aristocrat (though he disdained the use of his title) who married for love, and was a crack practitioner of the art of dueling. But Pareto distributions are not unique in having long tails. A variety of other probability distributions have long tails, including, for example, the log-normal distribution.
The concept is pretty intuitively simple: the long tail refers to the long downward slope as probabilities lessen over time. Most distributed phenomena have long tails. For example, a hot product may sell zillions of units for several months. Several years later, it is still selling a few units per month on the long downward slope of its unit sales distribution tail.
Here’s where my tale moves from the long tail (note lower case usage) to the meme “The Long Tail.” The capitalized version, The Long Tail, was coined as a meme by Chris Anderson in a 2004 Wired magazine article.
As Anderson originally used the phrase, The Long Tail refers to a new reality of the Web in which inventory and distribution costs are low. His article mostly referred to media: “Forget squeezing millions from a few megahits at the top of the charts. The future of entertainment is in the millions of niche markets at the shallow end of the bitstream.”
The implication is that low volume “products,” which sit somewhere down the long tail’s slope, can sell profitably to niche markets. Without The Long Tail, to be successful a product needs high-volume appeal. According to Anderson, and he’s probably right about this, The Long Tail accounts for the success of a variety of Internet business models from Amazon and Netflix (low inventory and distribution costs in stocking even low volume items) to eBay (the ultimate aggregation of niche markets) and beyond.
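A rough way to see why a bigger catalog matters: assume, purely for illustration, that unit sales fall off in proportion to 1/rank (a Zipf-like curve). That assumption is mine and is not taken from Anderson's data. The hypothetical function below then computes what fraction of all sales comes from items ranked below the top sellers.

```python
def tail_share(catalog_size, head_size):
    """Fraction of total unit sales coming from items ranked below the head.

    Assumes sales of the item at rank r are proportional to 1 / r,
    an illustrative Zipf-like model rather than measured data.
    """
    if catalog_size < 1 or not 0 <= head_size <= catalog_size:
        raise ValueError("need 0 <= head_size <= catalog_size and catalog_size >= 1")
    total = sum(1.0 / r for r in range(1, catalog_size + 1))
    head = sum(1.0 / r for r in range(1, head_size + 1))
    return (total - head) / total


# A store limited to 1,000 titles versus online catalogs of growing size,
# each compared against the same top-1,000 "hits".
shelf_limited = tail_share(1_000, 1_000)
small_online = tail_share(10_000, 1_000)
large_online = tail_share(100_000, 1_000)
```

Under this model the tail's share keeps growing as the catalog grows, which is the point: a seller with low inventory and distribution costs can afford to stock the tail, and a shelf-limited one cannot.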
Anderson is now working on The Long Tail the book (to be published by Hyperion in 2006), and presents his ideas on The Long Tail the Website (“a public diary on the way to a book”).
I think that Anderson and some others are stretching things a bit to apply The Long Tail as a general business principle of our changing times, and I’d hate to see Anderson’s book end up as another vapid business chicken soup for the pocketbook.
But what’s quite interesting to me is that the concept of The Long Tail is being applied (I think quite usefully) to the subject of information delivery on the Web. No longer do we have information dissemination via prime time television (the short head). Instead, information is being broken down into thousands of niches, blogs, and feeds (the long tail).
People can reassemble this stuff to find what they need in their own way and on their own time using mechanisms like search, RSS aggregation, and so on. As Eric Schmidt, Google’s CEO, put it at a recent Google shareholder meeting, Google does a good job of serving mid-size businesses who advertise. But Google is looking hard at serving The Long Tail (Schmidt's words)—individuals and small businesses for which self-help tools like AdWords are ideal. (Google is also working on the Fortune 500 short head 80-20 rule side of things, but that’s another story.)
As The Long Tail meme goes around the Web community it cross-pollinates with another meme, Web 2.0 (for example, see this New Media Musings blog entry). Web 2.0 is shorthand for a set of attitudes, practices, technologies, and design disciplines. Stay tuned to this Long Tail channel for my thoughts on the interaction of Web 2.0 and The Long Tail.
Posted by Harold Davis at 11:17 AM
April 14, 2005
Mister Softee in His Big Blue Period
This is a tale of two software companies (or maybe three).
They were the best of companies, and they were the worst of companies. And they've yet to go to a far, far better place to peddle their software.
To get back to my story, corporations, like people, have a lifecycle with a beginning, middle, and end.
Once upon a time, a long time ago, in a land far far away, Big Blue - IBM - was the be-all and end-all of everything to do with the computer industry. Bureaucratic, ponderous, rich, and powerful, IBM watched the nimble Mister Softee - Microsoft - steal the software side of the computer business out from under it.
Mister Softee was everything Big Blue wasn't: a teenage rebel, improvising like crazy, able to turn on a dime, handing out stock options like candy in a bowl near the cash register of a restaurant where the food isn't so good.
Now Mister Softee has grown as soft as its nickname and frumpy, with middle-aged love handles to match. Stock options are long gone. Microsoft pays a dividend!
Microsoft wants to climb the enterprise.
This company is no longer nimble, and takes literally years to pass software through its bureaucratic process before release.
We didn't love Mister Softee when he was young and agile, but he always impressed us with his vigor and chutzpah (though we always wished he built better, less buggy, software).
Now Mister Softee is as rich as Croesus and out-IBMs IBM. He's dangerous! He's fat! He's rich! His speed of innovation is falling way behind compared to younger rivals like Google. More than ever, he's fun to despise.
Maybe, just maybe, Mister Softee is also starting to become irrelevant (thanks to Open Source, Linux, and the Web).
Posted by Harold Davis at 04:29 PM
March 16, 2005
The Decline and Fall of VB6
Sometimes there's an event that has little significance on its own, but tells the story of great shifts in the underlying technology protoplasmic ether. Such an event is Microsoft's recent announcement that they would no longer officially support VB6. Although greeted with howls of protest by Microsoft's own MVP team, the announcement simply formalizes what has already happened.
By the way, what the official announcement of non-support means is that no further service packs will be issued for VB6, and that all unpaid technical support ends at the end of March. Through 2008, developers can still get technical assistance from Microsoft if they pay for it.
Once the programming language with the most programmers in the world, Visual Basic is now a backwater of a language with little to recommend it. In the .Net world, C# is a much more elegant language than VB, and even has a bit more functionality. So there's really no reason anyone sensible would use VB.Net, the .Net version of Visual Basic, even if they were building .Net applications.
De facto, Microsoft killed Visual Basic in 2001 when it introduced .Net without a good way to upgrade old-style VB6 code. Now, admittedly, .Net, which provides an abstraction layer with a great deal of functionality between operating environments such as Windows or the Web and a fully object-oriented programming syntax such as VB.Net or C#, is cool and a great development environment. Far better, in fact, than the old VB6. But the problem is that the structures of VB6 and VB.Net are so different that there's no reasonable way to move code from one to the other. For any sizable project, you'd truly be better off rewriting in .Net to do it right and follow OOP best practices rather than attempting some kind of mechanical port. It's also the case that some VB6 code actually compiles under .Net, but produces results that are not what the VB6 developer intended.
These issues are significant. But even more important is just who .Net and VB.Net are intended for. These are enterprise products, with an enterprise price tag, and an enterprise overhead in terms of the knowledge necessary to use the product well, computing power required, the operating system needed, and so on. But it is the mom and pop developer that made VB6 so incredibly popular, and VB.Net has left this core constituency in the dust.
To a very great extent, instead of trying to deal with the move from VB6 to VB.Net (or C#.Net), the mom and pop developer decided to put their applications on the Web, using languages such as JavaScript, Perl, and (most widely and appropriately) PHP. It's unwise to underestimate the intelligence of any computer programmer, even the mom and pop developer, and given the alternative of a horrendous and dubiously appropriate upgrade, these people probably made a very smart move. The Web is the closest thing we have to a universal platform.
All these mom and pop developers have left the Microsoft stable forever, and are not coming back. In a classic case of shutting the barn door after the equine inhabitants have fled, Microsoft is attempting to address this issue with the upcoming release of Visual Web Developer Express (part of the Visual Studio 2005 release). It won't work.
Related link: C# Programming Tips and Techniques on Braintique.com
Posted by Harold Davis at 09:43 AM