<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Then each went to his own home &#187; Tags</title>
	<atom:link href="http://www.pui.ch/phred/archives/category/tags/feed" rel="self" type="application/rss+xml" />
	<link>http://www.pui.ch/phred</link>
	<description>Philipp Keller's weblog</description>
	<lastBuildDate>Tue, 17 Aug 2010 19:58:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Remembering on the web &#8211; 5 reasons why online bookmarking is the wrong tool</title>
		<link>http://www.pui.ch/phred/archives/2007/10/remembering-on-the-web-5-reasons-why-social-bookmarking-doesnt-work.html</link>
		<comments>http://www.pui.ch/phred/archives/2007/10/remembering-on-the-web-5-reasons-why-social-bookmarking-doesnt-work.html#comments</comments>
		<pubDate>Tue, 23 Oct 2007 14:28:38 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Bookmarking]]></category>
		<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2007/10/remembering-on-the-web-5-reasons-why-social-bookmarking-doesnt-work.html</guid>
		<description><![CDATA[One common task while browsing the web is making sure you will be able to recall valuable information you are looking at right now. This article aims to prove that social bookmarking as in delicious, simpy, magnolia et al. is the wrong tool for that task.
Clarification
According to comments here and on reddit, it was obvious [...]]]></description>
			<content:encoded><![CDATA[<p>One common task while browsing the web is making sure you will be able to recall valuable information you are looking at right now. This article aims to prove that social bookmarking as in <a href="http://www.delicious.com">delicious</a>, <a href="http://www.simpy.com">simpy</a>, <a href="http://ma.gnolia.com/">magnolia</a> et al. is the wrong tool for that task.</p>
<h2>Clarification</h2>
<p>Judging from the comments here and on <a href="http://programming.reddit.com/info/5yy1h/comments/">reddit</a>, it became obvious that the intention of this post was somewhat misunderstood &#8211; partly because of the misleading original title (was: &#8220;.. &#8211; 5 reasons why social bookmarking doesn&#8217;t work&#8221;). Maybe these adaptations of <a href="http://xkcd.com/187/">an xkcd comic</a> clarify things:</p>
<h3>Right tool: Use bookmarks to get things done</h3>
<p><img id="image61" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/clarification_gtd.png" alt="clarification_gtd.png" style="float: none" /><br />
I think <a href="http://programming.reddit.com/info/5yy1h/comments/c02axbo">derefr sums this up very nicely</a>:</p>
<blockquote><p>I find a GTD approach works well: what next action are you going to apply to this bookmark? If it&#8217;s just &#8220;well, it was neat!&#8221; you have no reason to save it (perhaps share it, but not save it), and can throw it away.</p></blockquote>
<p>The same goes for using the tag &#8220;mycomment&#8221; to follow up on discussions you&#8217;ve taken part in, or &#8220;toread&#8221; to know what to read once you&#8217;ve got some free time. These bookmarks all serve a purpose that is clear to you while bookmarking, which also helps you pick an appropriate tag. No criticism there.</p>
<h3>Right tool: Sharing links</h3>
<p><img id="image62" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/clarification_sharing.png" alt="clarification_sharing.png" style="float: none" /><br />
It is clear that link sharing sites such as <a href="http://reddit.com">reddit</a>, <a href="http://www.digg.com/">Digg</a>, or <a href="http://www.stumbleupon.com/">StumbleUpon</a> have proven that this concept works. Delicious, Simpy, Magnolia et al. all have features to help you share your bookmarks. No criticism there either.</p>
<h3>Wrong tool: Remembering potentially interesting links</h3>
<p><img id="image60" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/clarifiction_interesting.png" alt="clarifiction_interesting.png" style="float: none; margin-left: 0" /><br />
This is what this article is about: saving bookmarks that are not useful to you now, but that you save anyway &#8211; without yet knowing what you&#8217;ll use them for &#8211; because they might be interesting in the future. I think that doesn&#8217;t work, and the 5 points below should show why.</p>
<p><span id="more-50"></span></p>
<h2>Reason 1: You can&#8217;t foresee the future</h2>
<p>Deciding which web site will be valuable in the future is a very hard task, and I&#8217;m not too good at it. I pile up tons of bookmarks I never look at again, and on the other hand I decide not to bookmark sites that I end up needing later. In fact, I&#8217;m so unsure about my ability to bookmark the right pages that I often don&#8217;t even try searching my pile of bookmarks but google first, because I expect to be faster that way. Too often I have searched my bookmarks, varying tags and search terms, and still didn&#8217;t find the bookmark in the end.</p>
<p>Additionally: even if I knew which links would be of interest in the future, I couldn&#8217;t decide how to tag (categorize) my bookmarks. When I tag an article, I have normally skimmed it, and while categorizing I look at its title. When I tag, I&#8217;m in a completely different situation &#8211; information-wise &#8211; from when I search for the link.</p>
<div class="caption"><img id="image53" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/ipod.png" alt="ipod.png" /><br />Your categories may change when you get<br />familiar with a product or topic</div>
<div class="caption"><img id="image54" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/strategy.png" alt="strategy.png" /><br />Your information level when looking at a document<br />differs from when trying to recall that document</div>
<h2>Reason 2: You tear links out of their context</h2>
<div class="caption"><a href="http://www.flickr.com/photos/ilikespoons/84355382/"><img id="image59" src="http://www.pui.ch/phred/wp-content/uploads/2007/10/dissect_small.jpg" alt="dissect_small.jpg" /></a><br />Bookmarking is like cutting passages<br />from books: you remove information<br />from the context you originally found it</div>
<p>The word &#8220;bookmark&#8221; comes from the pretty cardboard markers you use when reading books. Although the way it is used on the web is far from what it means for books, let&#8217;s delve into that comparison a bit:<br />
To make sure you will be able to find an important passage once you have finished a book, you underline it or write a few words in the margin to summarize the paragraph. Then, when you recall that great sentence, you almost certainly know which book it was in (unless that book is a collection of quotes). You can often also remember how the statement was used in the argument and what topic it was embedded in. And finally, amazingly, your brain often tells you where on the page (e.g. bottom left) the sentence is. So you normally get quite a bunch of context information to guide your search, and you will find the sentence you want within a short amount of time, even if it wasn&#8217;t underlined. And even if you don&#8217;t find it, you often have a good time reading through the other amazing statements and end up quoting something you didn&#8217;t intend to.</p>
<p>Applied to books, the way bookmarks are handled on the web would mean tearing that sentence out of the book, sticking a few colored post-its on it and throwing the snippet onto the pile with the 1325 other quotes. Bookmarking means taking information out of the context you originally found it in. On the web, context means how you found that link: Was it on Google or in your feed aggregator? Was it a blog post by one of your colleagues? Was it in an email? I often remember these things. Without being a psychologist or having any training in these things, I guess our brain is pretty good at remembering context. So why don&#8217;t we use techniques that help our brain instead of trying to replace it?</p>
<h2>Reason 3: It takes too much time</h2>
<p>Bookmarking should save you time &#8211; and frustration. Leaving out the frustration bit: Does it really save you time?<br />
Let&#8217;s say it takes 10 seconds to categorize a bookmark, and let&#8217;s say you&#8217;ll use every 20th of your saved bookmarks (both rather optimistic guesses). That means that when trying to recall a URL from your bookmarking service you need to be 200 seconds faster than if you hadn&#8217;t bookmarked any pages at all (as it took you 20 &#215; 10 = 200 seconds to save the 20 bookmarks of which you used 1).</p>
<p>I&#8217;m pretty sure you won&#8217;t save over 3 minutes on average by searching your pile of bookmarks compared to thinking for half a minute about where you found that link and then going down that trail. So: Why the hassle?</p>
<h2>Reason 4: It didn&#8217;t work for me</h2>
<p>I tried it. I gathered 3444 bookmarks in 2 years using 3034 tags. I asked myself how I could change my tagging practices to improve the recall. I failed. <a href="http://www.pui.ch/phred/archives/2007/09/the-delicious-lesson-revisited.html">I gave up</a>. I cannot believe there&#8217;s no one out there feeling the same.</p>
<p>I stopped bookmarking nearly two months ago. At first, when reading articles that felt really interesting, it was hard not to bookmark them. Then it became kind of liberating not having to think &#8220;will this page be valuable in the future?&#8221; or &#8220;what tags should I use?&#8221;.</p>
<p>I never missed it. I always found that link. I don&#8217;t regret it.</p>
<h2>Reason 5: Social bookmarking won&#8217;t improve that soon</h2>
<p>You may argue that there soon will be techniques to overcome the problems I just mentioned. But my claim is that social bookmarking sites won&#8217;t improve that soon.</p>
<p>In my last post I asked: &#8220;Why is tagging stuck?&#8221;. Gene Smith <a href="http://www.atomiq.org/archives/2007/09/is_tagging_stuck_hardly.html">argues correctly that tagging isn&#8217;t stuck</a>. He continues:</p>
<blockquote><p>
Want to know what <em>is</em> stuck? Del.icio.us
</p></blockquote>
<p>The same is true for all the other social bookmarking sites. RawSugar took a <a href="http://vanderwal.net/random/entrysel.php?blog=1945#futurepromise">brilliant next step</a> (before it went offline) but the social bookmarking market has been quiet ever since. I couldn&#8217;t find fresh ideas in <a href="http://blog.delicious.com/blog/2007/09/taste-test.html">delicious&#8217; current redesign</a>. It seems like they just moved buttons from here to there. I had hoped they wouldn&#8217;t just redesign the appearance but would also change the way users interact with their data.</p>
<p>So, I guess these services are as good as they are going to get. No improvements to wait for. That means it&#8217;s up to us &#8211; the users &#8211; to change our habits and find the right tool for the job.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2007/10/remembering-on-the-web-5-reasons-why-social-bookmarking-doesnt-work.html/feed</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>The delicious lesson &#8211; revisited</title>
		<link>http://www.pui.ch/phred/archives/2007/09/the-delicious-lesson-revisited.html</link>
		<comments>http://www.pui.ch/phred/archives/2007/09/the-delicious-lesson-revisited.html#comments</comments>
		<pubDate>Mon, 03 Sep 2007 15:31:27 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[History]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2007/09/the-delicious-lesson-revisited.html</guid>
		<description><![CDATA[I&#8217;m very happy that a recent post titled «Tag history and gartners hype cycles» stirred up a discussion in the
folksonomy-blog-space that got some people musing about the state of tagging:
Paolo Valdemarin:

4 years later I&#8217;m still wondering when will we get some truly advanced tagging tools.
Where are all these tools to manage all my tags (on [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m very happy that a recent post titled «<a href="http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html">Tag history and gartners hype cycles</a>» stirred up a discussion in the<br />
folksonomy-blog-space that got some people musing about the state of tagging:</p>
<p><a href="http://paolo.evectors.it/2007/08/28.html">Paolo Valdemarin</a>:</p>
<blockquote><p>
4 years later I&#8217;m still wondering when will we get some truly advanced tagging tools.<br />
Where are all these tools to manage all my tags (on Flickr, on del.icio.us, on technorati, in my RSS reader, on my blog, etc), to help me organizing them, to allow me to gain more advantages from tagging? (maybe they are somewhere and I simply have not found them yet&#8230;)
</p></blockquote>
<p><a href="http://matt.blogs.it/entries/00002618.html">Matt Mower</a>:</p>
<blockquote><p>
I have been surprised, that [...] the state of the art in tagging seems firmly wedged in 2003. Surprised because there seemed [...] to be a momentum building in the use of tagging
</p></blockquote>
<p><a href="http://www.everythingismiscellaneous.com/2007/08/28/tagging-like-it-was-2002/">David Weinberger</a>:</p>
<blockquote><p>
Tagging like it was 2002
</p></blockquote>
<p><a href="http://vanderwal.net/random/entrysel.php?blog=1945">Thomas Vander Wal</a>:</p>
<blockquote><p>
In the consumer space thing have been stagnant for a while, but in the enterprise space there is some good forward movement and some innovation taking place<br />
[...]<br />
While there are examples that tagging services have moved forward, there is so much more room to advance and improve. As people&#8217;s own collection of tagged pages and objects have grown the tools are needed to better refind them.
</p></blockquote>
<p>Vander Wal&#8217;s post is very insightful and worth a read: he sums up the history of tagging and offers a few brilliant ideas on how to proceed.</p>
<p><span id="more-49"></span></p>
<h3>The delicious lesson &#8211; revisited</h3>
<p>The big question remains: Why is tagging stuck?</p>
<p>My suggestion is that we rethink <a href="http://bokardo.com/archives/the-delicious-lesson/">the delicious lesson</a>: Not in terms of “is it true that personal value precedes network value?” but in terms of “what is the real benefit for the users?” or in other words: “How can we design the itch that causes users to generate valuable metadata?”</p>
<p>Recently I talked with <a href="http://www.keepthebyte.ch/blog.html">Cédric Huesler, a coworker of mine</a>, about <a href="http://del.icio.us/keepthebyte">his use of del.icio.us</a>: Instead of using delicious to store his bookmarks for later retrieval, he stores them to exchange links with strangers. Indeed, he has <a href="http://del.icio.us/network/keepthebyte">19 regular consumers of his bookmarks</a>, and he in turn follows 7 of them.</p>
<p>He doesn&#8217;t store his personal bookmarks at all. He can recall from memory where or how he found a certain website and goes back to his <a href="http://www.google.com/history/">google history</a>.</p>
<p>There are just a few entry points to new information on the web: there is Google, <a href="http://beta.bloglines.com/">feed aggregators</a> or <a href="http://programming.reddit.org">frontpage sites</a>. When those tools have good search utilities, who needs bookmarks? I must confess that searching at those entry points feels more natural to me than remembering the exact tag I used.</p>
<p>Let&#8217;s be straight: Using tags to find my bookmarks later just doesn&#8217;t work. I give up. And no, it&#8217;s not just the lack of good tools to help me go through my bookmarks and reorganize them. I won&#8217;t do that for all my 3444 bookmarks. And no, this won&#8217;t be solved with better tools to refind my items. What do you want to throw into the mix? Fulltext search and time-based drill-down? That has nothing to do with tags.</p>
<p>So, we might have to rephrase the user&#8217;s motivation to tag, as I don&#8217;t think <a href="http://bokardo.com/archives/the-delicious-lesson/">Joshua Porter was right when he wrote</a>:</p>
<blockquote><p>
in order to gain more personal value, <i>they use tags to be able to find their bookmarks later</i>
</p></blockquote>
<p>I&#8217;m not yet at the point where I could correctly rephrase that statement, but I think Cédric&#8217;s approach of using tags not for personal recall but for publishing is worth a thought. The value therein is close to the value of blogging: You get attention and you communicate. And that&#8217;s what the web is about, isn&#8217;t it?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2007/09/the-delicious-lesson-revisited.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Improving navigation in tag spaces</title>
		<link>http://www.pui.ch/phred/archives/2007/06/improving-navigation-in-tag-spaces.html</link>
		<comments>http://www.pui.ch/phred/archives/2007/06/improving-navigation-in-tag-spaces.html#comments</comments>
		<pubDate>Thu, 21 Jun 2007 19:46:05 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[History]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2007/06/improving-navigation-in-tag-spaces.html</guid>
		<description><![CDATA[At the beginning of May at webtuesday, I gave a presentation about the current problems with tags and what could be done to improve that situation.
Corsin was kind enough to record the presentation (thanks a lot for that!). I&#8217;m not completely happy with the presentation &#8211; especially the part about tag history was way too long. [...]]]></description>
			<content:encoded><![CDATA[<p>At the beginning of May <a href="http://webtuesday.ch/meetings/20070508">at webtuesday, I gave a presentation</a> about the current problems with tags and what could be done to improve that situation.<br />
<a href="http://cocaman.ch/">Corsin was kind enough</a> to record the presentation (thanks a lot for that!). I&#8217;m not completely happy with the presentation &#8211; especially the part about tag history was way too long. I&#8217;d suggest skipping that part and reading <a href="http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html">my blog post about this subject</a> instead (this part probably works better in a blog post than in a presentation). Ah, and the last 3 or 4 minutes are missing, but you&#8217;re not really missing anything.</p>
<p><embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=7213509817373019825&#038;hl=en" flashvars=""> </embed></p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2007/06/improving-navigation-in-tag-spaces.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tag history and gartners hype cycles</title>
		<link>http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html</link>
		<comments>http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html#comments</comments>
		<pubDate>Sat, 12 May 2007 13:21:49 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html</guid>
		<description><![CDATA[For last Webtuesday I gathered some historical data on the «tag movement» (which has gotten very quiet in the last two years).

History of tags



Feb&#160;2002
Delicious


Dec 2003
Delicious &#34;takes off&#34;


 Feb 2004
Flickr


 Feb 2004
last.fm


 Mar 2004
spurl.net


 May 2004
simpy.com


 May 2004
furl.net


 May 2004
del.icio.us has 400k bookmarks


 Jun 2004
Flickr adds tagging


 Aug 2004
Vander Wal coins &#34;folksonomy&#34;


 Dec 2004
Connotea


 Jan 2005
Louis [...]]]></description>
			<content:encoded><![CDATA[<p>For <a href="http://www.webtuesday.ch/meetings/20070508">last Webtuesday</a> I gathered some historical data on the «tag movement» (which has gotten very quiet in the last two years).</p>
<div class="caption"><a href="/phred/images/tagging_history_900.gif"><img alt="History of tags" src="/phred/images/tagging_history_400.gif" /><br />
<strong>History of tags</strong></a></div>
<table class="muse-table" border="2" cellpadding="5">
<tbody>
<tr>
<td>Feb&nbsp;2002</td>
<td><a href="http://del.icio.us">Delicious</a></td>
</tr>
<tr>
<td>Dec 2003</td>
<td>Delicious &quot;takes off&quot;</td>
</tr>
<tr>
<td> Feb 2004</td>
<td><a href="http://www.flickr.com">Flickr</a></td>
</tr>
<tr>
<td> Feb 2004</td>
<td><a href="http://last.fm">last.fm</a></td>
</tr>
<tr>
<td> Mar 2004</td>
<td><a href="http://www.spurl.net/">spurl.net</a></td>
</tr>
<tr>
<td> May 2004</td>
<td><a href="http://www.simpy.com">simpy.com</a></td>
</tr>
<tr>
<td> May 2004</td>
<td><a href="http://www.furl.net/">furl.net</a></td>
</tr>
<tr>
<td> May 2004</td>
<td>del.icio.us has 400k bookmarks</td>
</tr>
<tr>
<td> Jun 2004</td>
<td>Flickr adds tagging</td>
</tr>
<tr>
<td> Aug 2004</td>
<td><a href="http://atomiq.org/archives/2004/08/folksonomy_social_classification.html">Vander Wal coins &quot;folksonomy&quot;</a></td>
</tr>
<tr>
<td> Dec 2004</td>
<td><a href="http://www.connotea.org/">Connotea</a></td>
</tr>
<tr>
<td> Jan 2005</td>
<td><a href="http://louisrosenfeld.com/home/bloug_archive/000330.html">Louis Rosenfeld</a> warns that tags won&#8217;t be the answer to everything</td>
</tr>
<tr>
<td> Mar 2005</td>
<td>Yahoo! buys Flickr</td>
</tr>
<tr>
<td> May 2005</td>
<td><a href="http://www.shirky.com/writings/ontology_overrated.html">Clay Shirky: Ontology is overrated</a>: Tags are the answer to everything</td>
</tr>
<tr>
<td> Jun 2005</td>
<td><a href="http://myweb2.search.yahoo.com/">Yahoo! My Web 2.0</a></td>
</tr>
<tr>
<td> Jun 2005</td>
<td><a href="http://www.youtube.com">YouTube</a> &#8211; with tags</td>
</tr>
<tr>
<td> Aug 2005</td>
<td><a href="http://blog.flickr.com/flickrblog/2005/08/the_new_new_thi.html">Flickr adds tag clustering</a></td>
</tr>
<tr>
<td> Aug 2005</td>
<td>Last.fm adds tagging</td>
</tr>
<tr>
<td> Aug 2005</td>
<td><a href="http://www.randomhouse.com/anchor/catalog/display.pperl?isbn=9780385721707">The Wisdom Of Crowds</a></td>
</tr>
<tr>
<td> Sep 2005</td>
<td><a href="http://www.librarything.com/">LibraryThing</a> &#8211; tag your books</td>
</tr>
<tr>
<td> Oct 2005</td>
<td><a href="http://ma.gnolia.com/">Ma.gnolia.com</a></td>
</tr>
<tr>
<td> Dec 2005</td>
<td>Yahoo! buys Delicious</td>
</tr>
<tr>
<td> Dec 2006</td>
<td>rawsugar closes R&amp;D</td>
</tr>
<tr>
<td> Mar 2007</td>
<td><a href="http://www.buzzillions.com/">buzzillions.com</a>: faceted tagging</td>
</tr>
</tbody>
</table>
<p><strong>Update September, 2007</strong>: <a href="http://vanderwal.net/random/entrysel.php?blog=1945">Thomas Vander Wal wrote a very good roundup on the tag history</a>.</p>
<p><span id="more-46"></span></p>
<h3>Gartner&#8217;s hype cycle applied to tag history</h3>
<p class="first">I think <a href="http://en.wikipedia.org/wiki/Hype_cycle">Gartner&#8217;s hype cycle</a> holds up well when applied to the history of tagging (hype cycle phase descriptions taken from <a href="http://www.floor.nl/ebiz/gartnershypecycle.htm">Floor eTrends</a>):</p>
<h4>Technology trigger</h4>
<blockquote>
<p class="quoted">
A breakthrough, public demonstration, product launch or other event that generates significant<br />
press and industry interest.</p>
</blockquote>
<p>The technology trigger was most likely <a href="http://del.icio.us">del.icio.us</a>, followed by Flickr adding tagging to its service.</p>
<h4>Peak of inflated expectations</h4>
<blockquote>
<p class="quoted">
A phase of overenthusiasm and unrealistic projections during which a flurry of publicized<br />
activity by technology leaders results in some successes but more failures as the technology is<br />
pushed to its limits. The only enterprises making money at this stage are conference organizers<br />
and magazine publishers.</p>
</blockquote>
<p>In this phase there were indeed many blog posts talking about this subject, as <a href="http://louisrosenfeld.com/home/bloug_archive/000330.html">Louis Rosenfeld</a><br />
put it:</p>
<blockquote>
<p class="quoted">
Lately, you can&#8217;t surf information architecture blogs for five minutes without stumbling on a<br />
discussion of folksonomies</p>
</blockquote>
<p>I guess in this phase many people said things they now feel embarrassed about.</p>
<h4>Trough of disillusionment</h4>
<blockquote>
<p class="quoted">
The point at which the technology becomes unfashionable and the press abandons the<br />
topic, because the technology did not live up to its overinflated expectations.</p>
</blockquote>
<p>This is the phase we&#8217;re in now. There are hardly any blog posts any more. Tagging is not really<br />
unfashionable but the topic is &#8220;done&#8221;, à la «if that&#8217;s all tagging adds to the web experience, I&#8217;m not interested in this technology any more». There isn&#8217;t much thinking and innovation going on.</p>
<h4>Slope of enlightenment</h4>
<blockquote>
<p class="quoted">
Focused experimentation and solid hard work by an increasingly diverse range of organizations<br />
lead to a true understanding of the technology&#8217;s applicability, risks and benefits. Commercial<br />
off-the-shelf methodologies and tools become available to ease the development process.</p>
</blockquote>
<p>Let&#8217;s hope Gartner is right about the future of folksonomies!</p>
<h4>Plateau of productivity</h4>
<blockquote>
<p class="quoted">
The real-world benefits of the technology are demonstrated and accepted. Tools and<br />
methodologies are increasingly stable as they enter their second and third generation. The final<br />
height of the plateau varies according to whether the technology is broadly applicable or only<br />
benefits a niche market.</p>
</blockquote>
<p>It remains to be seen whether folksonomies such as those in del.icio.us or Flickr will prove themselves with the masses.</p>
<h4 id="apply_at_all">Update (September, 2007): Do hype cycles apply to folksonomies at all?</h4>
<p>Joe Lamantia <a href="http://tagsonomy.com/index.php/the-tagging-hype-cycle/">raises the question of whether Gartner&#8217;s Hype Cycle should be applied to tagging at all:</a></p>
<blockquote><p>
Tagging in fact shows few characteristics of the enterprise technologies that Gartner&#8217;s Hype Cycle is built around
</p></blockquote>
<p>Joe rightly argues that tagging has not yet reached the broader economy, so Gartner is unlikely to care about placing folksonomies on its Hype Cycle.</p>
<p>Then again, Gartner does apply the hype cycle to technologies such as <a href="http://www.gartner.com/DisplayDocument?doc_cd=140881&amp;ref=g_SiteLink">&#8220;corporate blogging&#8221; or wikis</a>. So it does not seem to lie in the nature of tagging that it can never be placed on a hype cycle; the only thing that keeps Gartner from doing so is that no money is earned with it. I&#8217;m not into business analysis at all, so I am grateful for Joe&#8217;s insights, which he concludes with:</p>
<blockquote><p>
If it doesn&#8217;t cost money, the perceived risks of the technology are lower, and the big analysis firms pay less attention, because their customers see less need to pay for analysis
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>New Job / Presentation at Webtuesday</title>
		<link>http://www.pui.ch/phred/archives/2007/04/new-job-presentation-at-webtuesday.html</link>
		<comments>http://www.pui.ch/phred/archives/2007/04/new-job-presentation-at-webtuesday.html#comments</comments>
		<pubDate>Thu, 26 Apr 2007 18:09:26 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Job]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2007/04/new-job-presentation-at-webtuesday.html</guid>
		<description><![CDATA[I started a new job at local.ch in February &#8211; yeah, it&#8217;s been a while already.
Local.ch is a local search engine for Switzerland, that means I can now work on information retrieval related stuff full time &#8211; which was what I did in my free time already. Being paid for doing the things I like [...]]]></description>
			<content:encoded><![CDATA[<p class="first">I started a new job at <a href="http://www.local.ch">local.ch</a> in February &#8211; yeah, it&#8217;s been a while already.</p>
<p>Local.ch is a local search engine for Switzerland, so I can now work full time on information retrieval related stuff &#8211; which is what <a href="http://www.pui.ch/phred/archives/2006/07/automated_tag_clustering.html">I already did in my free time</a>. Being paid for doing the things I like is a gift I don&#8217;t take for granted.</p>
<p>The R&amp;D team <a href="http://weblog.patrice.ch/">consists</a> <a href="http://www.dexter.cc/">of</a> <a href="http://www.keepthebyte.ch/blog.html">about</a> <a href="http://www.sitepoint.com/articlelist/210">10</a> people &#8211; all very talented and smart. Plus, the atmosphere is friendly yet challenging.</p>
<h3>Say bye to tag clouds</h3>
<p class="first">Then, I&#8217;ll <a href="http://www.webtuesday.ch/meetings/20070508">give a talk at webtuesday</a> in Zurich about &quot;Improving navigation in tag spaces&quot;: why tag clouds don&#8217;t make much sense, why<br />
tagging lost ground and what could be done to improve the user experience.</p>
<p>The talk will be based on the few blog posts I wrote about this subject plus some newly gained insights.<br />
If you live near Zurich, it would be a pleasure to meet you there.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2007/04/new-job-presentation-at-webtuesday.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Automated tag clustering</title>
		<link>http://www.pui.ch/phred/archives/2006/07/automated-tag-clustering.html</link>
		<comments>http://www.pui.ch/phred/archives/2006/07/automated-tag-clustering.html#comments</comments>
		<pubDate>Tue, 11 Jul 2006 06:03:37 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[RawSugar]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2006/07/automated_tag_clustering.html</guid>
		<description><![CDATA[Grigory Begelman (Technion &#8211; Israel Institute of Technology, Computer Science Dept.), Frank Smadja (RawSugar) and I wrote a paper for www2006 called &#8220;automated tag clustering&#8221;. It explains why clustering the tag space makes sense and how this could be done.
After the presentation at the tagging workshop at www2006 we felt the need to give [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cs.technion.ac.il/%7Egbeg/">Grigory Begelman</a> (<a href="http://www.cs.technion.ac.il/">Technion &#8211; Israel Institute of Technology Computer Science Dpt</a>), <a href="http://smadja.us/">Frank Smadja</a> (<a href="http://www.rawsugar.com/">RawSugar</a>) and I wrote a paper for <a href="http://www2006.org">www2006</a> called &#8220;automated tag clustering&#8221;. It deals with why clustering the tag space makes sense and how this could be done.</p>
<p>After the presentation at the <a href="http://blog.rawsugar.com/wikka/wikka.php?wakka=HomePage">tagging workshop</a> at www2006 we felt the need to give our paper a more www-friendly, I-don&#8217;t-want-to-read-through-those-theoretical-equation-flooded-papers face.</p>
<p>So, here you go: <a href="http://www.pui.ch/phred/automated_tag_clustering/">Automated Tag Clustering: Improving search and exploration in the tag space</a>. To read this document you should have an idea of what tags are about, and you should know some tag services such as <a href="http://del.icio.us">delicious</a> or <a href="http://www.flickr.com">flickr</a> so you can understand the limitations these services currently have. <span id="more-41"></span><a href="http://www.pui.ch/phred/automated_tag_clustering/#cluster"><img title="clustering the tag space" alt="clustering the tag space" id="image42" src="http://www.pui.ch/phred/wp-content/uploads/2006/07/clusters.png" /></a>If you don&#8217;t want to read through the whole paper, the numerous figures give you a good summary. Finally, to whet your appetite, here are a few excerpts from the document:</p>
<blockquote><p>Currently tagging services still provide a relatively marginal value for information discovery and we claim that with the use of clustering techniques this can be greatly improved [from <a href="http://www.pui.ch/phred/automated_tag_clustering/#p_motivation">introduction</a>]</p></blockquote>
<blockquote><p>The whole promise of collaborative tagging is that by exploring the tag space you can discover a lot of useful information you would not find with traditional search engines.  When your information need is not well defined, the idea that you can explore and see what other people tagged with certain tags is very attractive. We believe that tagging will be able to reach a very wide audience only when exploration techniques will be effective. [from <a href="http://www.pui.ch/phred/automated_tag_clustering/#p_exploration">limited exploration</a>]</p></blockquote>
<blockquote><p>Although a great visualization paradigm, we believe that with today&#8217;s tagclouds it is hard to find more than one or two tags to click on. Tags are not grouped, there is too much information, so that you find lot of related tags scattered on the tag cloud.  One or two popular topics and all their related tags tend to dominate the whole cloud.  For example, looking at the del.icio.us tagcloud, one would mostly see tags related to web design and technologies. This is because these topics are overwhelmingly more frequent than anything else. [from <a href="http://www.pui.ch/phred/automated_tag_clustering/#p_exploration">limited exploration</a>]</p></blockquote>
<blockquote><p>Tag <em>web2.0</em> nowadays is so popular and is combined wildly with anything. In fact this tag is so overused that if you look at <a href="http://del.icio.us/tag/bookmarks">tag <em>bookmarks</em> in the del.icio.us dataset</a>, the most used cotag is <em>web2.0</em>[...]. Basing tag similarity on these numbers often doesn&#8217;t make sense at all. The similarity measure should be chosen so the popularity of a tag doesn&#8217;t affect the set of a tags related tags. Don&#8217;t cut the <a href="http://en.wikipedia.org/wiki/Long_tail">long tail</a>. The success of blogs is driven by the importance of the long tail. We all know that it is crucial to support the niches. Tagging applications should empower the long tail too. If you just sort by popularity, you&#8217;d loose all those niches. [from <a href="http://www.pui.ch/phred/automated_tag_clustering/#p_similarity">choosing a similarity measure</a>]</p></blockquote>
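<p>To make the last excerpt a bit more tangible, here is a minimal sketch of a similarity measure that is normalized by tag popularity. It is only an illustration: the toy bookmarks are made up, and the cosine-style normalization is one common choice, not necessarily the measure used in the paper.</p>

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each entry is the set of tags on one bookmark.
bookmarks = [
    {"web2.0", "bookmarks", "social"},
    {"web2.0", "ajax"},
    {"web2.0", "design"},
    {"web2.0", "bookmarks"},
    {"bookmarks", "social", "delicious"},
    {"bookmarks", "delicious"},
    {"web2.0", "css"},
    {"web2.0", "ajax", "css"},
    {"web2.0", "bookmarks"},
]


def tag_counts(posts):
    """Count how often each tag is used and how often each pair co-occurs."""
    usage = Counter()
    cooc = Counter()
    for tags in posts:
        usage.update(tags)
        # Sort so that every pair is stored under one canonical key.
        cooc.update(combinations(sorted(tags), 2))
    return usage, cooc


def raw_cooccurrence(a, b, cooc):
    """Plain co-occurrence count: favours tags that are popular everywhere."""
    return cooc[tuple(sorted((a, b)))]


def cosine_similarity(a, b, usage, cooc):
    """Co-occurrence divided by the geometric mean of both tags' usage."""
    return raw_cooccurrence(a, b, cooc) / sqrt(usage[a] * usage[b])


usage, cooc = tag_counts(bookmarks)
for other in ("web2.0", "delicious"):
    print(other,
          raw_cooccurrence("bookmarks", other, cooc),
          round(cosine_similarity("bookmarks", other, usage, cooc), 3))
```

<p>In this toy data, the raw count ranks <em>web2.0</em> above <em>delicious</em> as a cotag of <em>bookmarks</em>, while the normalized score ranks them the other way round, which is the effect the excerpt asks for.</p>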
<p>We&#8217;d be happy to get any kind of feedback on the article. Just post a comment to this blog post.</p>
<p><strong>Edit (4 years later!)</strong>: A few guys asked me about the source code: <a href="http://pastie.org/1098455">Source code with syntax highlighting</a>, <a href="http://www.pui.ch/phred/archives/cluster.py">download</a>.<br />
You need <a href="http://people.sc.fsu.edu/~jburkardt/c_src/kmetis/kmetis.html">kmetis</a> to run it; see <code>usage()</code> for how to use it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2006/07/automated-tag-clustering.html/feed</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>www2006 and collaborative tagging workshop</title>
		<link>http://www.pui.ch/phred/archives/2006/04/www2006-and-collaborative-tagging-workshop.html</link>
		<comments>http://www.pui.ch/phred/archives/2006/04/www2006-and-collaborative-tagging-workshop.html#comments</comments>
		<pubDate>Tue, 25 Apr 2006 06:16:47 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Projects]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/?p=39</guid>
		<description><![CDATA[Just a short note:
Grigory Begelman (Technion &#8211; Israel Institute of Technology Computer Science Dpt), Frank Smadja (RawSugar) and I are giving a presentation at this year&#8217;s www2006 conference in Edinburgh. I&#8217;m very glad our paper was accepted to the Collaborative Web Tagging Workshop. We will talk about automated tag clustering. I will give a demo [...]]]></description>
			<content:encoded><![CDATA[<p>Just a short note:<br />
<a href="http://www.cs.technion.ac.il/%7Egbeg/">Grigory Begelman</a> (<a href="http://www.cs.technion.ac.il/">Technion &#8211; Israel Institute of Technology Computer Science Dpt</a>), <a href="http://smadja.us/">Frank Smadja</a> (<a href="http://www.rawsugar.com">RawSugar</a>) and I are giving a presentation at this year&#8217;s <a href="http://www2006.org/">www2006</a> conference in Edinburgh. I&#8217;m very glad our <a href="http://www.rawsugar.com/www2006/20.pdf">paper</a> was accepted to the <a href="http://www.rawsugar.com/www2006/taggingworkshopschedule.html">Collaborative Web Tagging Workshop</a>. We will talk about automated tag clustering. I will give a demo of clustering popular urls. It&#8217;s like <a href="http://popurls.com/">popurls</a> grouped by categories instead of origin.
</p>
<p>
I will write more about it afterwards as I&#8217;m pretty busy finishing my demo.
</p>
<p>
If you are attending the conference, leave me a note so we can meet sometime during it.<br />
I&#8217;m looking forward to this conference as it will be my first one.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2006/04/www2006-and-collaborative-tagging-workshop.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How tagging could gain ground</title>
		<link>http://www.pui.ch/phred/archives/2005/11/how-tagging-could-gain-ground.html</link>
		<comments>http://www.pui.ch/phred/archives/2005/11/how-tagging-could-gain-ground.html#comments</comments>
		<pubDate>Tue, 29 Nov 2005 20:54:28 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/?p=35</guid>
		<description><![CDATA[Is the revolution stuck?
When I first heard about del.icio.us (and after those few days when I didn&#8217;t get it..) I thought: &#8220;This is revolutionary&#8221;. There were many things tags made possible that were just not possible until that day.
Joshua Schachter was the guy that invented tags (or at least that&#8217;s how the story is being [...]]]></description>
			<content:encoded><![CDATA[<h2>Is the revolution stuck?</h2>
<p>When <a href="http://www.pui.ch/phred/archives/2005/02/delicious_is_te.html">I first heard about del.icio.us</a> (and after those few days when I didn&#8217;t get it..) I thought: &#8220;This is revolutionary&#8221;. There were many things tags made possible that were just not possible until that day.</p>
<p><a href="http://burri.to/~joshua/">Joshua Schachter</a> was the guy who invented tags (or at least that&#8217;s how the story is told). Tags were originally <a href="http://loosewire.typepad.com/blog/2005/01/the_tag_report__3.html">meant as a way to organize one&#8217;s own bookmarks</a>, but the social effect soon became obvious:</p>
<blockquote><p>If everyone tags, the &#8220;community&#8221; profits.</p></blockquote>
<p>Now, we have del.icio.us. Now we organize our bookmarks with tags. <a href="http://www.flickr.com">And our photos</a>.<br />
And our <a href="http://www.librarything.com/">books</a>, <a href="http://www.millionsofgames.com/">our games</a>, <a href="http://myprogs.net/">our software</a>, <a href="http://supr.c.ilio.us/">our tagging sites</a>, and <a href="http://bulldogster.ning.com/">also your bulldogs</a>, if you have any.</p>
<p>However, as we have tagged our whole life, what do we do with it? What is it good for?<br />
I fear the tagging revolution is about to calm down. And I believe that&#8217;s because many people don&#8217;t see the advantages of tagging. I believe that <strong>many many</strong> things can be made possible by using tag-based systems. If we realized this, tagging would get some fresh air and eventually go mainstream.</p>
<p>Is it just me, or is the tagging revolution really stuck? I desperately miss new, visionary, inventive articles on tags.</p>
<ul>
<li>To all smart people, where are your ideas?</li>
<li>To all programming geeks: Where are your algorithms, your &#8220;proof of concept&#8221; web services?</li>
</ul>
<p>I could stop here with my article, but, hey, I don&#8217;t want to be the grumbling guy that sits and waits for new things coming up, so here I am, trying to expose my brain to you.<br />
In this article I want to take a look at what areas tags are already strong in and how tagging could gain ground in these areas.<br />
<span id="more-35"></span></p>
<h2>Tags help you to organize</h2>
<p>When Joshua came up with the idea of tags, it was purely meant for organizing. It was only when other people also started organizing by tags that the whole idea of &#8220;folksonomy&#8221; came up.<br />
What does organizing mean? It is like tidying one&#8217;s room: You put every paper and pencil in a place that seems logical to you, so you can easily remember where you put it. Since we are not limited to physical means when we organize data, we have many new possibilities. There are already <a href="http://wiki.osafoundation.org/bin/view/Journal/HierarchyVersusFacetsVersusTags">good articles</a> about this so I won&#8217;t discuss this in detail here.</p>
<p>At the end of the day, the question arises: Is organizing your bookmarks by tags really that good? </p>
<p>Just to make the point, here is another way to remember things: While browsing, the browser could save all pages in a cache, and when you are searching for a page you have visited (which is why you bookmark pages in the first place), you run a full-text search through all your cached pages. It&#8217;s a kind of &#8220;Google search&#8221; over pages you have already visited. I know this would have some downsides but it would have some advantages too. I have often searched for a page I bookmarked and couldn&#8217;t remember the tag I used. This problem wouldn&#8217;t occur in the &#8220;searching through cache&#8221; system.</p>
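<p>The &#8220;searching through cache&#8221; idea can be sketched in a few lines. This is only a toy under stated assumptions: the <code>PageCache</code> class, its method names and the example URLs are hypothetical, everything lives in memory, and a real browser cache would need persistence and ranking.</p>

```python
import re
from collections import defaultdict


class PageCache:
    """Toy in-memory stand-in for a browser cache with a full-text index."""

    def __init__(self):
        self.pages = {}                 # url -> page text
        self.index = defaultdict(set)   # word -> urls whose text contains it

    @staticmethod
    def _words(text):
        # Lower-case the text and split it into alphanumeric words.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def visit(self, url, text):
        """Remember a visited page and index every word on it.

        Simplification: re-visiting a URL with changed text does not
        remove the old words from the index.
        """
        self.pages[url] = text
        for word in self._words(text):
            self.index[word].add(url)

    def search(self, query):
        """Return the urls of pages that contain every word of the query."""
        words = self._words(query)
        if not words:
            return set()
        hits = [self.index.get(word, set()) for word in words]
        return set.intersection(*hits)


cache = PageCache()
cache.visit("http://example.org/crypto-intro",
            "An overview of cryptography and public key encryption")
cache.visit("http://example.org/css-tips",
            "Ten CSS tips for cleaner layouts")
print(cache.search("public key cryptography"))
```

<p>No tag has to be remembered here: any words that appeared on the page are enough to find it again.</p>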
<p>What I am trying to say is: <strong>If tagging were solely for organizing your stuff, it wouldn&#8217;t be worth the trouble</strong>.</p>
<h2>Folksonomy &#8211; Classification of the masses</h2>
<p><a href="http://en.wikipedia.org/wiki/Folksonomy">Folksonomy</a> is &#8211; as I understand it &#8211; the distributed classification of data by the big mass of people who tag stuff. Folksonomy is often described as a new way to build a <a href="http://en.wikipedia.org/wiki/Taxonomy">taxonomy</a>. It&#8217;s like building the <a href="http://dmoz.org/about.html">Open Directory</a> by having thousands of people tag stuff.</p>
<p>What is folksonomy good for? Why do we want to put bookmarks into categories?</p>
<h3>Folksonomy enables exploration</h3>
<p>Where do you go to start building a new expertise? Is it del.icio.us? Is it Google?<br />
Let&#8217;s say your boss tells you that the data your software saves in the database should be encrypted. You haven&#8217;t thought much about cryptography; it was merely a topic that you &#8220;should know about&#8221; but were never really interested in (I&#8217;m speaking for myself here.. :-) ). You don&#8217;t really know where to start. You know you want to know something about cryptography, but you don&#8217;t know exactly what.<br />
A good list of articles or even starting points could shorten your learning curve.<br />
Thereafter, you may want to &#8220;travel through the cryptography universe&#8221;. And to travel means knowing which articles are related to the one you just read and are so enthusiastic about. You need a map of the cryptography universe, you want to know what is left and right, top and bottom, you want to know everything and everyone related to &#8220;cryptography&#8221;.<br />
Now then: What would you do?</p>
<h3>Do tag systems help you to explore?</h3>
<h4>Delicious on cryptography</h4>
<div class="caption"><a href="http://del.icio.us/tag/cryptography"><img src="/phred/modules/delicious_cryptography.png" alt="delicious results on cryptography" title="delicious results on cryptography" /></a><br />
Delicious results on &laquo;cryptography&raquo;</div>
<p>I would go to <a href="http://del.icio.us/tag/cryptography+introduction">del.icio.us/tag/cryptography+introduction</a>. There I find a nice article titled &#8220;<a href="http://www.garykessler.net/library/crypto.html">An Overview of Cryptography</a>&#8221;. I guess I&#8217;m lucky! If I read the article, I&#8217;d probably find out which subtopics exist, how cryptography relates to similar issues and so on. I&#8217;d get a kind of &#8220;map of the cryptography universe&#8221;. But this map is drawn by only one author. Perhaps I don&#8217;t trust him (though I probably should, after reading his <a href="http://www.garykessler.net/resume.html">cv</a>), or I simply do not have the time and/or energy to read through 44 pages, although the article looks good. I&#8217;ll probably <a href="http://del.icio.us/tag/cryptography">go back to delicious and find out</a> that the related tags of &#8220;cryptography&#8221; are:</p>
<ul>
<li>security</li>
<li>reference</li>
<li>encryption</li>
<li>crypto</li>
<li>algorithms</li>
<li>computing</li>
<li>software</li>
<li>nsa</li>
<li>tutorial</li>
<li>kids</li>
<li>education</li>
</ul>
<p>Now this is not very convincing, is it? You argue:</p>
<blockquote><p>Yeah, but this is far better than what I get on Google.</p></blockquote>
<h4>Google on cryptography</h4>
<div class="caption"><a href="http://www.google.ch/search?q=cryptography"><img src="/phred/modules/google_cryptography.png" alt="Google results on cryptography" title="Google results on cryptography" /></a><br />
Google results on &laquo;cryptography&raquo;</div>
<p><a href="http://www.google.ch/search?q=cryptography">It is</a>. When looking at these Google results I remember that Google is meant for searching when I already know what I am searching for. But now I am at a different stage. I don&#8217;t know exactly what to search for. I don&#8217;t know, because I don&#8217;t have any expertise in cryptography. BTW: Google does come up with an article that looks like a good introduction to cryptography as well..</p>
<h4>Open directory on cryptography</h4>
<p>What about <a href="http://dmoz.org/about.html">open directory</a>? Let&#8217;s give it a try: After typing in &#8220;cryptography&#8221; I find out that this topic is classified in <a href="http://www.google.com/Top/Science/Math/Applications/Communication_Theory/Cryptography">Science &gt; Math &gt; Applications &gt; Communication Theory &gt; Cryptography</a>. Clicking this link, you get what you were probably looking for.<br />
You get a nice overview:
<div class="caption"><a href="http://www.google.com/Top/Science/Math/Applications/Communication_Theory/Cryptography"><img src="/phred/modules/google_directory_cryptography.png" alt="Google open directory on cryptography" title="Google open directory on cryptography"/></a><br />
Google open directory on &laquo;cryptography&raquo;</div>
<ul>
<li>Algorithms</li>
<li>Books</li>
<li>Events</li>
<li>Historical</li>
<li>Journals</li>
<li>People</li>
<li>Programming Libraries</li>
<li>Research Groups</li>
<li>Theory</li>
</ul>
<p>Now you stand at a guidepost. You see the &#8220;cryptography universe&#8221;. You probably don&#8217;t see what lies to the left and right of cryptography, but here you have &#8220;cryptography at a glance&#8221;.<br />
Now it&#8217;s up to you: Do you want to explore &#8220;algorithm land&#8221;, take the shortcut and download the programming library of the language of your choice? Or do you even want to get advice from people that are experts on that matter?<br />
Even if the links provided here don&#8217;t give you what you are looking for, here you get a clue what you should look for.</p>
<h4>Comparing the three</h4>
<p>Let&#8217;s compare browsing to a real-life quest: Finding out where your next conference will take place. Say you want to go to the next <a href="http://conferences.oreillynet.com/etech/">etech conference</a>, you don&#8217;t know where it is and you are not an American citizen.</p>
<div class="caption"><img src="/phred/modules/too_near.png" alt="Ouch, nearly bumped my head into horton plaza!" title="Ouch, nearly bumped my head into horton plaza!"/><br />
Ouch, nearly bumped my head into horton plaza!</div>
<p>Conference websites often show a map of the venue as seen from about 10 meters above the ground. This map <strong>is</strong> helpful. But only once you are already close to the venue.</p>
<div class="caption"><img src="/phred/modules/too_far.png" alt="Help, I cannot breathe out there!" title="Help, I cannot breathe out there!"/><br />
Help, I cannot breathe out there!</div>
<p>Then, when you desperately search for a more general map, you&#8217;ll possibly find a map of how it looks from outer space. Yeah, I know that San Diego is in the US, but I&#8217;d like to know which airport is closest to the conference.</p>
<div class="caption"><img src="/phred/modules/web_organization.png" alt="Distances between observer and data" title="Distances between observer and data" /><br />
Distances between observer and data</div>
<p>That&#8217;s quite similar to the views we have with del.icio.us and open directory.<br />
Delicious would tell you: &#8220;the roads nearby are &#8216;union street&#8217;, &#8216;Broadway circle&#8217; and &#8216;Broadway&#8217;&#8230;&#8221;, open directory proclaims: &#8220;we have five continents in the world: &#8216;America&#8217;, &#8216;Asia&#8217;, &#8216;Africa&#8217;, &#8216;Australia&#8217; and &#8216;Europe&#8217;&#8230;&#8221;. Now, I&#8217;m exaggerating a bit but you get the point: Sometimes you need a map that lies between the too detailed and the too general map.<br />
Looking for this type of view is like saying: &#8220;I want a bit more <a href="http://en.wikipedia.org/wiki/Ontology">ontology</a> than tags but not as much <a href="http://en.wikipedia.org/wiki/Taxonomy">taxonomy</a> as open directory&#8221;. That&#8217;s where I&#8217;ve put the question mark. It&#8217;s not that you always want to see the data at that distance but sometimes you desperately want to have that viewpoint.</p>
<p>Now, what has this to do with tagging? I believe that this missing in-between view can be gained by analyzing tags.<br />
Have you noticed how flickr does this in-between view?<br />
When you search for love, <a href="http://www.flickr.com/photos/tags/love/clusters/">flickr cluster</a> asks you: &#8220;What do you mean by &#8216;love&#8217;?&#8221;:</p>
<div class="caption"><a href="http://www.flickr.com/photos/tags/love/clusters/"><img src="/phred/modules/flickr_clusters.png" alt="flickr clusters on love" title="flickr clusters on love" /></a><br />
flickr cluster results on &laquo;love&raquo;</div>
<ul>
<li>a <strong>couple</strong> <strong>kiss</strong>ing?</li>
<li>a <strong>mother</strong> holding her <strong>baby</strong>?</li>
<li>a <strong>red</strong> <strong>heart</strong>?</li>
</ul>
<p>&#8220;Wait: Flickr is a bit different from del.icio.us&#8221;, you say. Yup. Flickr uses a <a href="http://www.personalinfocloud.com/2005/02/explaining_and_.html">narrow</a>, del.icio.us a broad <a href="http://www.personalinfocloud.com/2005/02/explaining_and_.html">folksonomy</a> system.<br />
But I believe that the data clusters flickr creates with its narrow folksonomy data can also be generated with delicious&#8217; broad folksonomy data. I am writing an algorithm that computes del.icio.us clusters. I&#8217;m still at an early stage but I get clusters like this &#8220;shopping cluster&#8221;:</p>
<div class="caption"><img src='/phred/modules/shopping_cluster.png' alt="shopping cluster" title="shopping cluster" /><br />
&laquo;shopping&raquo; cluster</div>
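<p>For readers who want to play with the idea, here is a minimal sketch of one way to get tag clusters out of bookmark data. It is not the algorithm mentioned above, which this article does not describe: the toy posts, the threshold and the use of connected components over a thresholded similarity graph are all assumptions chosen for brevity.</p>

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each entry is the set of tags on one bookmark.
posts = [
    {"shopping", "store", "deals"},
    {"shopping", "deals", "coupons"},
    {"store", "shopping"},
    {"deals", "coupons"},
    {"css", "design", "webdesign"},
    {"css", "webdesign"},
    {"design", "css"},
    {"shopping", "design"},   # a weak link between the two topics
]


def similar_pairs(posts, threshold):
    """Return the tag pairs whose normalized co-occurrence reaches the threshold."""
    usage, cooc = Counter(), Counter()
    for tags in posts:
        usage.update(tags)
        cooc.update(combinations(sorted(tags), 2))
    return [(a, b) for (a, b), n in cooc.items()
            if n / sqrt(usage[a] * usage[b]) >= threshold]


def clusters(posts, threshold=0.5):
    """Group tags into the connected components of the similarity graph."""
    graph = defaultdict(set)
    for a, b in similar_pairs(posts, threshold):
        graph[a].add(b)
        graph[b].add(a)
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from this tag.
        stack, component = [start], set()
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(graph[tag] - component)
        seen |= component
        result.append(component)
    return result


for cluster in clusters(posts):
    print(sorted(cluster))
```

<p>With this toy data the single bookmark tagged both <em>shopping</em> and <em>design</em> is too weak a link to merge the two topics, so the shopping tags and the web design tags end up in separate clusters.</p>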
<p>I realize that even if the cluster data is available, there&#8217;s the question of how to navigate through the data. The &#8220;zooming in&#8221; and &#8220;zooming out&#8221; won&#8217;t be as easy as with Google maps.<br />
But anyway, here is the land no one has explored before. I think this is the area we should talk about. Here is room for improvement.</p>
<h3>Folksonomy helps you to stay informed about a certain topic</h3>
<p>Back to what folksonomies are good for: If you have built an expertise in cryptography, you want to stay informed. If <a href="http://en.wikipedia.org/wiki/RSA">RSA</a> is hacked, you certainly want to be informed.<br />
Delicious has got an &#8220;<a href="http://del.icio.us/inbox/phred">Inbox</a>&#8221; where you can subscribe to a tag, e.g. &#8220;cryptography&#8221;.<br />
Each bookmark that is tagged &#8220;cryptography&#8221; lands in your inbox. That&#8217;s a great way to <strong>stay</strong> informed. Alternatively you have a list of <a href="http://del.icio.us/popular/cryptography">recent popular sites</a> tagged &#8220;cryptography&#8221;. You can subscribe to these lists using RSS and hopefully you get informed in time if RSA is hacked..</p>
<h3>Do tag systems keep you informed?</h3>
<p>I think the comparison with the distance to the data applies here too:<br />
If I <a href="http://del.icio.us/rss/tag/cryptography">subscribed to cryptography</a>, I&#8217;d probably miss some important items, just because the guy who bookmarked them used the tag &#8220;crypto&#8221;. On the other hand, I do not want to be informed about yet another <a href="http://en.wikipedia.org/wiki/Rijndael">Rijndael</a> implementation; I want to narrow the incoming links to articles or essays that deal with cryptography.<br />
Delicious already offers a way to narrow results: I could <a href="http://del.icio.us/rss/tag/cryptography+essay">subscribe to &laquo;cryptography&raquo; and &laquo;essay&raquo;</a>, and, once delicious supports union (and it will, <a href="http://lists.del.icio.us/pipermail/discuss/2005-November/004390.html">as Joshua promises</a>), I could also <a>subscribe to (cryptography or crypto) and (essay or article)</a> but you see that it doesn&#8217;t really solve the problem.<br />
I imagine that one day you can say:</p>
<blockquote><p>I want to keep being informed about cryptography</p></blockquote>
<p>and the service asks you:</p>
<blockquote><p>Should I keep you informed about</p>
<ul>
<li>new implementations</li>
<li>new articles/essays</li>
<li>security issues</li>
</ul>
</blockquote>
<p>And I believe this is possible. Flickr already asks you this when you are searching for <a href="http://www.flickr.com/photos/tags/love/clusters/">love pictures</a>. I guess it will be based on clusters again.</p>
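<p>The combined subscription mentioned above, (cryptography or crypto) and (essay or article), is easy to express as a filter over incoming bookmarks. The following is just a sketch: the <code>matches</code> helper and the example bookmarks are made up, and it says nothing about how delicious implements such queries.</p>

```python
def matches(tags, query):
    """Return True if the bookmark's tags satisfy the query.

    The query is a list of tag groups. A bookmark matches when it carries
    at least one tag from every group (an AND of ORs).
    """
    return all(tags & group for group in query)


# (cryptography or crypto) and (essay or article)
query = [{"cryptography", "crypto"}, {"essay", "article"}]

# Hypothetical incoming bookmarks: (url, tags)
incoming = [
    ("http://example.org/a", {"crypto", "essay"}),
    ("http://example.org/b", {"cryptography", "rijndael", "code"}),
    ("http://example.org/c", {"cryptography", "article", "rsa"}),
    ("http://example.org/d", {"essay", "politics"}),
]

subscribed = [url for url, tags in incoming if matches(tags, query)]
print(subscribed)
```

<p>The filter catches the &#8220;crypto&#8221; essay that a plain &#8220;cryptography&#8221; subscription would miss, but it still depends on me guessing every synonym up front, which is why clusters look like the better long-term answer.</p>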
<h2>Tags help you share lists</h2>
<p>Back to what tags are good for: They help you build lists. Let&#8217;s name a few examples:</p>
<ul>
<li><strong>Wish lists</strong>: I know that <a href="http://www.amazon.com/exec/obidos/wishlist">numerous</a> <a href="http://froogle.google.com/shoppinglist">online</a> <a href="http://www.giftboxhome.com/">shops</a> let you build wish lists. But I&#8217;d like to have a wish list that&#8217;s not bound to a company, that I can arrange and rearrange. <a href="http://del.icio.us/mpe/whishlist">Many</a> <a href="http://del.icio.us/janson/wishlist/">are</a> <a href="http://del.icio.us/Lillith_Within/whishlist">already</a> <a href="http://del.icio.us/a9bejo/whishlist">using</a> del.icio.us to store their wish lists.</li>
<li><strong>Share your bookmarks</strong>: A friend asked me for some links to javascript WYSIWYG editors. <a href="http://del.icio.us/phred/javascript+editor">I gave him a list</a> of all my bookmarks tagged <code>javascript</code> and <code>editor</code>.</li>
<li><strong>Offer viewpoints of your data</strong>: Let&#8217;s say your favourite CMS features tagging (<a href="http://dema.ruby.com.br/articles/2005/08/27/easy-tagging-with-rails">featured in many of those new fancy ruby on rails applications</a>); I&#8217;m not speaking about blogs here. To allow &#8220;normal&#8221; visitors to view your data, you&#8217;ll add a navigation that provides starting points to your entries: specific locations where a visitor can jump in and immerse themselves in your articles. You would probably add a link to all items tagged &#8220;references&#8221; and &#8220;networking&#8221; to achieve that.</li>
</ul>
<h3>How can tag lists be improved?</h3>
<p>I&#8217;m often annoyed that I cannot put my del.icio.us links in a specific order. I <a href="http://www.pui.ch/del_list/">wrote a little script</a> that puts my newest bookmark at the bottom but it doesn&#8217;t fully solve the problem.<br />
What I&#8217;d really like is to be able to compose a <a href="http://en.wikipedia.org/wiki/View_%28database%29">view</a> of tagged bookmarks, e.g. I want to offer a list of all firms our company has built the network for:</p>
<blockquote>
<h3>Networking references</h3>
<h4>Big firms</h4>
<ul>
<li><a href="http://www.ubs.ch">UBS</a></li>
<li><a href="http://www.migros.ch">Migros</a></li>
<li><a href="http://www.abb.ch">ABB</a></li>
</ul>
<h4>Medium-sized firms</h4>
<ul>
<li><a href="http://www.stadlerrail.ch/">Stadlerrail</a></li>
<li><a href="http://www.search.ch/rim.html">Räber Information Management GmbH</a></li>
</ul>
<h4>Small firms</h4>
<ul>
<li><a href="http://www.citrin.ch">Citrin Informatik GmbH</a></li>
<li><a href="http://www.thildykeller.ch">Goldschmiedeatelier Thildy Keller</a></li>
<li><a href="http://www.minifruits.ch">Mini Fruits Trading</a></li>
</ul>
</blockquote>
<p>Nowadays, such a list can&#8217;t be automatically generated from my bookmarks, but it could be, by letting me configure my view as <code>myView = (references+networking, "Networking References", (big_firms, medium-sized_firms, small_firms))</code>.<br />
I know it&#8217;s not a <strong>big</strong> challenge to program such a thing, but as far as I know it doesn&#8217;t exist yet.</p>
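<p>To show how little it would take, here is a rough sketch of such a view. The <code>View</code> tuple, the <code>render</code> function and the tags on the example bookmarks are all hypothetical; they only mirror the <code>myView</code> configuration above.</p>

```python
from collections import namedtuple

# A view: the tags every entry must carry, a title, and the tags that
# define the sections, in display order.
View = namedtuple("View", "required_tags title sections")

my_view = View(
    required_tags={"references", "networking"},
    title="Networking References",
    sections=["big_firms", "medium-sized_firms", "small_firms"],
)

# Hypothetical bookmarks: (name, url, tags)
bookmarks = [
    ("UBS", "http://www.ubs.ch",
     {"references", "networking", "big_firms"}),
    ("Migros", "http://www.migros.ch",
     {"references", "networking", "big_firms"}),
    ("Stadlerrail", "http://www.stadlerrail.ch/",
     {"references", "networking", "medium-sized_firms"}),
    ("Citrin Informatik GmbH", "http://www.citrin.ch",
     {"references", "networking", "small_firms"}),
    ("Some CSS article", "http://example.org/css",
     {"css", "design"}),
]


def render(view, bookmarks):
    """Render the view as indented plain text, one section per tag."""
    lines = [view.title]
    for section in view.sections:
        entries = [(name, url) for name, url, tags in bookmarks
                   if view.required_tags <= tags and section in tags]
        if entries:
            lines.append("  " + section)
            lines.extend(f"    {name} ({url})" for name, url in entries)
    return "\n".join(lines)


print(render(my_view, bookmarks))
```

<p>The order of the sections comes from the view definition, not from the order in which the bookmarks were saved, which is exactly the control a plain tag list does not give me.</p>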
<h2>Bottom line</h2>
<p>It appears to me that there hasn&#8217;t been much progress in tagging systems lately. What has improved instead is the <a href="http://blog.del.icio.us/blog/2005/11/find_the_url_of.html">embedding of tagging systems into already existing technologies such as search</a>. It gives the impression that the core issues are solved and that there&#8217;s not much room for improvement. In this article I wanted to disprove this.<br />
I think that there&#8217;s much, much more than I have written here. I even believe that today&#8217;s tagging applications cover just about 5% of all the features tagging makes possible. Thus, let&#8217;s gain ground.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2005/11/how-tagging-could-gain-ground.html/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Does del.icio.us scale?</title>
		<link>http://www.pui.ch/phred/archives/2005/08/does-delicious-scale.html</link>
		<comments>http://www.pui.ch/phred/archives/2005/08/does-delicious-scale.html#comments</comments>
		<pubDate>Wed, 31 Aug 2005 06:12:50 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2005/08/does-delicious-scale.html</guid>
		<description><![CDATA[Lately it has become very quiet around del.icio.us. There are some new features but nothing groundbreaking. Either people are used to it and use it as a daily tool and there&#8217;s no need for new things or otherwise folks just don&#8217;t have faith in the future of del.icio.us.
I am a big fan of delicious. I&#8217;ve got [...]]]></description>
			<content:encoded><![CDATA[<p>Lately it has become very quiet around <a href="http://del.icio.us">del.icio.us</a>. There are <a href="http://blog.del.icio.us/blog/2005/08/we_rolling.html">some</a> <a href="http://blog.del.icio.us/blog/2005/08/search_me.html">new</a> <a href="http://blog.del.icio.us/blog/2005/08/people_who_like.html">features</a> but nothing groundbreaking. Either people are used to it and use it as a daily tool and there&#8217;s no need for new things or otherwise folks just don&#8217;t have faith in the future of del.icio.us.</p>
<p>I am a big fan of delicious. I&#8217;ve got 1.5K bookmarks there, I like its spirit and how open everything is. This article isn&#8217;t meant to criticize, but I think delicious is facing some problems.<br />
<span id="more-34"></span></p>
<h2>Performance scale</h2>
<p>You might have read my article about <a href="http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html">Tag system performance</a>. To summarize my tests: MySQL is just not built for large tag systems. It scales up to about 1 million items, but delicious has far more posts than that.<br />
I am pretty sure delicious is still on the MySQL train. This strong belief comes from my performance tests: The MySQL schemas I tested show the same performance characteristics as delicious does.<br />
I fear delicious faces a performance dead end: They <a href="http://blog.del.icio.us/blog/2005/06/moving_to_new_s.html">have put more servers in the mix</a>, they cache quite a bit, and it is still slow. I strongly believe that for delicious to have a future it must become much faster. For me this is the number one downside of delicious. I dream of a bookmark service that has billions of bookmark posts yet still performs nicely. I think it is time for new tag systems to come up. On the <a href="http://lists.tagschema.com/mailman/listinfo/tagdb">tagdb mailing list</a>, there are very good ideas about how large-scale tagging systems should work (e.g. systems powered by <a href="http://lucene.apache.org/">Lucene</a>).</p>
<h2>Popular link scale</h2>
<p>I think one of the coolest features of delicious is the <a href="http://del.icio.us/popular/">popular</a> page. When you read this page regularly you are up to date.. wait: you are up to date concerning CSS tips and firefox and life hacks. You all know that if delicious went mainstream, that page wouldn&#8217;t be that interesting any more. It has already become a bit boring. As someone put it:</p>
<blockquote><p>I particularly cannot look at that CSS link lists anymore</p></blockquote>
<p>I think this page doesn&#8217;t scale. It is stuck. And moreover it&#8217;s a pity that the coolest page on delicious is not about tags. At first glance you don&#8217;t even see what tags a popular link has.<br />
IMHO what is needed here are clusters. Bookmarks go into categories: &#8220;browsers&#8221;, &#8220;programming&#8221;, &#8220;design&#8221; but also &#8220;health&#8221;, &#8220;politics&#8221;. When delicious gets mainstream there most certainly will be &#8220;sports&#8221; or &#8220;stars&#8221;.<br />
One should then be able to subscribe to certain clusters or, better, have this subscription generated automatically from the tags in a user&#8217;s bookmarks.</p>
<h2>Bottom line</h2>
<p>I think there are some fundamental things that must be rearranged at delicious, otherwise there will be</p>
<ul>
<li>a) a big competitor (Google? Yahoo? Microsoft?) coming up or </li>
<li>b) people will spread to different bookmark services that concentrate on certain clusters. Probably some meta-sites will arise where you can get an overview of all the different sites</li>
</ul>
<p>I think these problems will arise for every bigger tagsystem. I hope that people will not sniff at tagging systems thinking that they don&#8217;t perform well enough..</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2005/08/does-delicious-scale.html/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Analyzing tag-connections</title>
		<link>http://www.pui.ch/phred/archives/2005/07/analyzing-tag-connections.html</link>
		<comments>http://www.pui.ch/phred/archives/2005/07/analyzing-tag-connections.html#comments</comments>
		<pubDate>Sun, 17 Jul 2005 18:03:43 +0000</pubDate>
		<dc:creator>Philipp Keller</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Del.icio.us]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Tags]]></category>

		<guid isPermaLink="false">http://www.pui.ch/phred/archives/2005/07/analyzing-tag-connections.html</guid>
		<description><![CDATA[When you tag an item, for instance a bookmark, you give it several tags. For instance, I tagged the bookmark for &#8220;How to Write More Clearly, Think More Clearly, and Learn Complex Material More Easily&#8221; (you know this link if you pay attention to delicious popular.. :-)) with 
&#8220;writing&#8221;, &#8220;toread&#8221;, &#8220;productivity&#8221;, &#8220;language&#8221;
What immediately comes [...]]]></description>
			<content:encoded><![CDATA[<p>When you tag an item, for instance a bookmark, you give it several tags. For instance, I tagged the bookmark for &#8220;<a href="http://www.ai.uga.edu/mc/WriteThinkLearn_files/frame.htm">How to Write More Clearly, Think More Clearly, and Learn Complex Material More Easily</a>&#8221; (you know this link if you pay attention to <a href="http://del.icio.us/popular">delicious popular</a>.. :-)) with </p>
<blockquote><p>&#8220;writing&#8221;, &#8220;toread&#8221;, &#8220;productivity&#8221;, &#8220;language&#8221;</p></blockquote>
<p>What immediately comes to mind is that the tag &#8220;toread&#8221; is quite different from the other tags. It describes something I want to do with this bookmark later on. I call this type of tag an &#8220;<strong>adjective</strong>&#8221; (I will come back to that name later on..). The other tags I consider &#8220;<strong>categories</strong>&#8221;.<br />
Now you&#8217;ll probably say &#8220;ah, this is a rare exception&#8221;. This is not true. I often tag items with &#8220;blog&#8221; because the interesting page I found about my favourite hobby happens to be a blog. That is why I call this type of tag an &#8220;adjective&#8221;: it describes the item rather than categorizing it.<br />
Other tags used often as adjectives are &#8220;reference&#8221;, &#8220;tutorial&#8221;, &#8220;fun&#8221;, &#8220;cool&#8221;, &#8220;news&#8221;, &#8220;free&#8221;..<br />
<span id="more-33"></span><br />
Now this categorization is not entirely accurate. Sometimes I use &#8220;blog&#8221; not as an adjective. That is the case when I want to bookmark a blog whose content doesn&#8217;t interest me but which simply looks good. Then I&#8217;ll probably tag it as &#8220;design blog&#8221;. On the day I redesign my blog, I want to search for all the design blogs I tagged..<br />
You see: it all lies in the connections between the tags, not in the tags themselves. This is IMO pretty important.</p>
<h2>What is that for?</h2>
<p></p>
<h3>Clusters</h3>
<p>You have probably tried to cluster your bookmarks by using <a href="http://laurie.informatik.uni-bremen.de/clusty/">clusty</a>. What this service does: it tries to put your tags into separate clouds. You know the &#8220;<a href="http://lists.del.icio.us/pipermail/discuss/2005-March/002266.html">tag-bundles</a>&#8221; of delicious? This is something like an &#8220;auto-tag-bundle&#8221; feature. Try it out if you haven&#8217;t already and see the problems that arise..<br />
I think the key problem in this cluster-service lies in the fact that it considers all connections (including the adjectives). But it shouldn&#8217;t do so! Adjectives aren&#8217;t tags I want in my clusters. Adjectives are spread all over my tags, so they should first be cut away from my &#8220;tag-tree&#8221; (the tree that is built out of the tag-connections you create by tagging bookmarks).</p>
<h3>Similar items</h3>
<p>This categorization is also important when you search for items &#8220;similar&#8221; to a bookmark. When I want to find items similar to that &#8220;how to write more clearly&#8221;-article, I&#8217;ll search for &#8220;writing+productivity+language&#8221; and leave out the &#8220;toread&#8221; tag (an adjective).<br />
Probably this made you realize that categorizing tag-connections is an important task. </p>
<h3>Tag clouds</h3>
<p>Now there are those tag clouds. When I look at <a href="http://kevan.org/extispicious.cgi?name=phred">my tag cloud</a>, the &#8220;biggest&#8221; tag is &#8220;resource&#8221;. Tag clouds are there to help you find bookmarks easily (I never search my bookmarks for &#8220;resource&#8221; alone) or to give you a map of your main interests (&#8220;what is your hobby?&#8221; &#8220;ah, I am a big fan of resources&#8221;.. :-)). I am sure you have been annoyed by that too. I want those adjective-tags cut away..!</p>
<h2>Synonyms</h2>
<p>Now back to some theory: there is a third type of tag-connection: synonyms. &#8220;delicious&#8221; and &#8220;del.icio.us&#8221; are classic synonyms. But I consider &#8220;ruby&#8221; and &#8220;rails&#8221; synonyms too (no, they aren&#8217;t really synonyms, but up to now they have been used as such). You type in the second tag just to be sure that you won&#8217;t later search for it and find nothing.. I don&#8217;t think this type is too important for the cluster-task, but I name it here because I&#8217;ll use it further on.</p>
<h2>Example</h2>
<p>Let&#8217;s go for an example.<br />
Let&#8217;s consider the tags that are connected to the tag &#8220;ajax&#8221;. I gathered some tag-connection data from delicious (via its <a href="http://del.icio.us/rss/">rss-feed</a>) and ran a query on this statistical data. The data was gathered over a period of one week, so it is not complete. But our experiment will work anyway:</p>
<table>
<thead>
<tr>
<td>tag-connection</td>
<td>weight</td>
<td>type</td>
</tr>
</thead>
<tbody>
<tr>
<td><strong>ajax-javascript</strong></td>
<td>234</td>
<td>synonym</td>
</tr>
<tr>
<td><strong>ajax-web</strong></td>
<td>105</td>
<td>category</td>
</tr>
<tr>
<td><strong>ajax-programming</strong></td>
<td>100</td>
<td>category</td>
</tr>
<tr>
<td><strong>ajax-xmlhttprequest</strong></td>
<td>52</td>
<td>synonym</td>
</tr>
<tr>
<td>ajax-css</td>
<td>51</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-design</td>
<td>46</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-php</td>
<td>44</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-development</td>
<td>36</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-xml</td>
<td>34</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-DHTML</td>
<td>33</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-webdev</td>
<td>33</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-webdesign</td>
<td>31</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-google</td>
<td>23</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-HTML</td>
<td>21</td>
<td>adjective</td>
</tr>
<tr>
<td>ajax-tutorial</td>
<td>14</td>
<td>adjective</td>
</tr>
</tbody>
</table>
<p>Column &#8220;tag-connection&#8221; names the tag connected to &#8220;ajax&#8221; (e.g. &#8220;javascript&#8221;), and column &#8220;weight&#8221; gives the number of times this connection occurred in a bookmark-post on delicious. The rows are ordered by weight. In column &#8220;type&#8221; you see the result of my computations for this tag-connection. Just to make it clear: these are all the tags connected to the tag &#8220;ajax&#8221;, ordered by the number of occurrences of the connection. If a bookmark-post somebody made on delicious is tagged with both &#8220;ajax&#8221; and &#8220;javascript&#8221;, that gives one point in the &#8220;weight&#8221; column for &#8220;ajax-javascript&#8221;.<br />
The outcome is quite good, I think (I must admit that I have taken the example that worked out best :-))<br />
There are some errors, sure: ajax-xml should be of type &#8220;category&#8221; as well. But we are looking at the usage of these tags, not their &#8220;real&#8221; meaning (whatever that is).</p>
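<p>To make the &#8220;weight&#8221; column concrete, here is a minimal Python sketch of the counting described above. It is not the query I actually ran: the function name <code>connection_weights</code> and the sample posts are made up for illustration, and I assume each bookmark-post is simply a list of tags.</p>

```python
from collections import Counter
from itertools import combinations


def connection_weights(posts):
    """Count how often each unordered pair of tags occurs in the same post."""
    weights = Counter()
    for tags in posts:
        # Deduplicate and sort so ("ajax", "javascript") and
        # ("javascript", "ajax") end up under the same key.
        for pair in combinations(sorted(set(tags)), 2):
            weights[pair] += 1
    return weights


# Hypothetical sample posts, not the data behind the table above.
posts = [
    ["ajax", "javascript", "web"],
    ["ajax", "javascript"],
    ["ajax", "css"],
]
weights = connection_weights(posts)
print(weights[("ajax", "javascript")])
```

<p>Every post that carries both tags adds one point to the pair, which is exactly the rule stated for &#8220;ajax-javascript&#8221;.</p>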
<h2>Computation</h2>
<p></p>
<h3>Synonyms</h3>
<p>To compute this categorization I first went for the &#8220;synonyms&#8221;. The connection &#8220;ajax-javascript&#8221; is considered a synonym because it is the &#8220;number one connection&#8221; among all connections that ajax is part of. And when considering the connections of &#8220;javascript&#8221; (the &#8220;vice-versa-connection&#8221;), ajax is number two.<br />
I consider two tags synonyms if in one &#8220;direction&#8221; the other tag is number one and in the other &#8220;direction&#8221; the other tag is in the top 10. I made up this rule because I think that in most cases there is one &#8220;stronger&#8221; synonym that is used most of the time when the &#8220;weaker&#8221; one is used. The fact that the tag &#8220;ajax&#8221; is mostly used with the tag &#8220;javascript&#8221; could also mean that &#8220;javascript&#8221; is a supercategory of ajax (which it somehow is). To avoid treating such sub-/supercategory connections as synonyms, we make sure that &#8220;ajax&#8221; is also important for &#8220;javascript&#8221;, so ajax is not so sub to javascript.. I hope you can follow :-)</p>
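<p>The synonym rule can be sketched in a few lines of Python. This is an illustration under assumptions, not my original code: the helper names <code>ranked_neighbours</code> and <code>is_synonym</code>, the <code>top_n</code> parameter and the sample weights are invented here, and connection weights are assumed to be stored in a dict keyed by tag pairs.</p>

```python
def ranked_neighbours(weights, tag):
    """Return the tags connected to `tag`, heaviest connection first."""
    neighbours = {}
    for (a, b), weight in weights.items():
        if a == tag:
            neighbours[b] = weight
        elif b == tag:
            neighbours[a] = weight
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))


def is_synonym(weights, tag_a, tag_b, top_n=10):
    """True if one tag is the other's number one and the reverse rank is in the top_n."""
    rank_ab = ranked_neighbours(weights, tag_a)
    rank_ba = ranked_neighbours(weights, tag_b)
    if tag_b not in rank_ab or tag_a not in rank_ba:
        return False
    pos_ab = rank_ab.index(tag_b)  # 0 means "number one connection"
    pos_ba = rank_ba.index(tag_a)
    return (pos_ab == 0 and pos_ba < top_n) or (pos_ba == 0 and pos_ab < top_n)


# Hypothetical weights: javascript is number one for ajax,
# ajax is number two for javascript.
sample = {
    ("ajax", "javascript"): 234,
    ("ajax", "web"): 105,
    ("javascript", "programming"): 300,
    ("programming", "web"): 50,
}
print(is_synonym(sample, "ajax", "javascript"))
```

<p>With only a handful of tags almost everything lands in the top 10, so the rule only becomes selective on real data where each tag has many neighbours.</p>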
<h3>Category/Adjective</h3>
<p>Then I compute the &#8220;category&#8221;. Let&#8217;s put the values of the above table into a graph.<br />
<img src="/phred/modules/ajax_dist.png" alt="distribution of tags related to ajax" title="distribution of tags related to ajax"/><br />
On the x-axis you see the tags: tick 1 stands for &#8220;web&#8221;, 2 for &#8220;programming&#8221;, 3 for &#8220;css&#8221;, 4=&#8221;design&#8221;, 5=&#8221;php&#8221; and so on. You see I removed the synonym-connections &#8220;ajax-javascript&#8221; and &#8220;ajax-xmlhttprequest&#8221; as I think they &#8220;disturb&#8221; the distribution.<br />
The y-axis depicts the weight of the connection: ajax-web has weight &#8220;105&#8221;, ajax-programming has weight &#8220;100&#8221; and so on.<br />
The black line is the &#8220;weight&#8221;-column of the table above, the red one is the first <a href="http://en.wikipedia.org/wiki/Derivative">derivative</a>, the blue one the second derivative of the weight function.<br />
This graph makes it clear that &#8220;web&#8221; and &#8220;programming&#8221; are used quite often in combination with &#8220;ajax&#8221;, then, there is quite a &#8220;gap&#8221; followed by the &#8220;adjective tail&#8221;. I consider the &#8220;adjective tail&#8221; as connections to be categorized as &#8220;adjective&#8221;. The tags in this tail are used &#8220;out of context&#8221;: They don&#8217;t really belong to the &#8220;ajax-cluster&#8221;. They sometimes occur together with ajax, but just sometimes. Mostly not. Therefore they are considered as &#8220;adjectives&#8221;.<br />
Now the task is to find this &#8220;gap&#8221;. In my experiments I tried to find the last gap. To find it I started at the end of the tail, searched for the first peak of the first derivative (that is, where the second derivative goes from positive to negative) and checked whether the peak was high enough. If these two conditions were fulfilled, I split the connections into two parts: the &#8220;pre-gap&#8221; connections (category) and the &#8220;post-gap&#8221; connections (adjective).<br />
The same computation has to be made for the &#8220;vice-versa&#8221; connection. I considered a connection a &#8220;category&#8221; if either of the two computations said that it is a &#8220;category&#8221;.</p>
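<p>Here is a rough Python sketch of the gap search for one direction, using the weights from the table above with the two synonym-connections removed. Several things are assumptions made for illustration: the name <code>split_at_last_gap</code>, the <code>min_drop</code> threshold of 20 standing in for &#8220;high enough&#8221;, the use of the size of the drop between neighbouring weights as a discrete first derivative, and the choice to keep scanning past peaks that are too low.</p>

```python
def split_at_last_gap(weights, min_drop):
    """Return how many leading connections are "category", or None if no gap.

    `weights` must be sorted in descending order.
    """
    # Discrete first derivative, taken as the size of each drop.
    drops = [weights[i] - weights[i + 1] for i in range(len(weights) - 1)]
    # Walk from the end of the tail towards the front.
    for i in range(len(drops) - 1, -1, -1):
        left = drops[i - 1] if i > 0 else 0
        right = drops[i + 1] if i + 1 < len(drops) else 0
        is_peak = drops[i] >= left and drops[i] >= right
        if is_peak and drops[i] >= min_drop:
            return i + 1  # connections before the gap
    return None


tags = ["web", "programming", "css", "design", "php", "development", "xml",
        "DHTML", "webdev", "webdesign", "google", "HTML", "tutorial"]
tag_weights = [105, 100, 51, 46, 44, 36, 34, 33, 33, 31, 23, 21, 14]

cut = split_at_last_gap(tag_weights, min_drop=20)
categories = tags[:cut]
adjectives = tags[cut:]
print(categories)
```

<p>With this threshold the only peak that qualifies is the drop from 100 to 51, so &#8220;web&#8221; and &#8220;programming&#8221; end up before the gap and everything else falls into the adjective tail.</p>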
<p><ins datetime="2005-07-18T15:43:36-02:00"></p>
<h2>Further processing: Ambiguous tags</h2>
<p>To achieve good clustering results, I think we need to check whether a tag is used in different ways. The prominent example is &#8220;apple&#8221;. For now, while delicious is still mostly confined to the blog world, it is clear that apple means Mac-apple. But in future this may change. To recognize whether a tag is used in different environments, the algorithm would have to check the &#8220;neighbours of neighbours&#8221; (<a href="http://blog.pietrosperoni.it/2004/09/19/clustering-delicious-tags/">as suggested by Pietro Speroni</a>). For ajax that means: check whether the neighbours of &#8220;javascript&#8221; are more or less the same as the neighbours of &#8220;web&#8221;. You see that it all lies in the connections between tags. The tag per se is not well-defined, but the tag in connection with another tag defines it quite well. Therefore, for clustering I&#8217;m proposing to split up ambiguous tags. That would make the resulting clusters much simpler.</ins></p>
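<p>One way to put a number on &#8220;more or less the same neighbours&#8221; is the share of neighbours two tags have in common. The following Python sketch is only an illustration: the function name <code>neighbour_overlap</code>, the choice of the Jaccard index as the measure and the sample neighbour lists are all assumptions, not something I computed on the delicious data.</p>

```python
def neighbour_overlap(neighbours, tag_a, tag_b):
    """Jaccard overlap of two tags' neighbour sets, ignoring the two tags themselves."""
    a = set(neighbours.get(tag_a, ())) - {tag_a, tag_b}
    b = set(neighbours.get(tag_b, ())) - {tag_a, tag_b}
    if not a and not b:
        return 0.0
    return len(a.intersection(b)) / len(a.union(b))


# Hypothetical neighbour lists for three neighbours of "apple".
neighbours = {
    "mac": ["apple", "osx", "software"],
    "osx": ["apple", "mac", "software"],
    "fruit": ["apple", "recipes", "health"],
}
print(neighbour_overlap(neighbours, "mac", "osx"))
print(neighbour_overlap(neighbours, "mac", "fruit"))
```

<p>If the neighbours of a tag fall into groups that barely overlap with each other, as &#8220;mac&#8221; and &#8220;fruit&#8221; do here, that hints that the tag is used in more than one sense and is a candidate for splitting.</p>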
<h2>We are onto something</h2>
<p>I&#8217;m pretty sure we are onto something. I think this is the direction it should go. Computations over tag-connection-distributions are cool. Users shouldn&#8217;t have to enter this information when posting bookmarks. Posting should stay easy. I&#8217;m not that sure about the &#8220;synonym&#8221;-computation, but I think the &#8220;category&#8221;-computation turned out pretty well. I tried to build some clusters by hand just by considering the category- and synonym-connections and I found a completely detached cluster consisting of the tags &#8220;cooking&#8221;, &#8220;health&#8221;, &#8220;recipes&#8221;, &#8220;diet&#8221; and &#8220;food&#8221;. As I said, I think we are onto something..</p>
<h2>Further reading</h2>
<ul>
<li><a href="http://www.rashmisinha.com/archives/05_02/tag-sorting.html">Building tag clusters by hand</a></li>
<li><a href="http://blog.pietrosperoni.it/2004/09/19/clustering-delicious-tags/">Pietro Speroni&#8217;s different approach to clustering tags (with java-mindmap-visualisation!)</a>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.pui.ch/phred/archives/2005/07/analyzing-tag-connections.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
