On 23 Things we were asked (last week!) to think about the use of tags (in the sense of informal user-created labels on internet objects). One of the problems with discussing tagging is that the same technology is used for tagging very different kinds of objects, in collections that vary widely in size, and can be used both by creators of a particular object and other users. Trying to generalise about these is tricky, so I want to look at a few particular cases.
Let's start with my blog, which has around 340 posts. If you're looking for a specific topic, such as iGoogle or naked monks, then it's best to search the blog, because there's only one blog post which refers to each (although there's now this post as well). I use tags only as broader-based groupings: for example, to group all my posts on a particular conference or on themes such as US politics, where I may not use the specific phrase in the post. It's partly for that reason that I have a "medieval" tag, but not a "Carolingian" one, because almost all my discussions of the Carolingian empire will use the term and so can be found via searches.
Even though I keep my tagging so simple and I'm an experienced librarian, my tagging of entries is still not of particularly high quality and suffers from biases. I'm inconsistent about the use of the tag "religion", as contrasted with specific religions, such as "Christianity". I have a tag for "homosexuality", but not "heterosexuality", even though I discuss both. And I periodically discover that there are useful themes that I haven't tagged, and either have to go back and update the tags, or decide that it's too big a task (as with tagging things as "Carolingian").
My tagging doesn't need to be very good, because I'm tagging objects that already have lots of searchable text. In contrast, tagging non-textual objects, such as images or bookmarks, is a lot harder, and the quality of tagging is very variable. Take something as simple and definite as a place name. I found a couple of pictures on Flickr from the tiny Sussex village where I grew up. One is tagged "bonfire", "bonfire night", "fireworks", "madehurst". The other is tagged "d40", "18-55mm f/3.5-56G", "lenstagged", "unmodified", "20081001", "madehurst", "church", "madehurst church", "west sussex", "england", "uk", "200810", "3008×2000". If I wanted to find photos from West Sussex, I'd only find the first one by knowing that Madehurst was in West Sussex (and very few people have ever heard of Madehurst). And I suspect there are pictures on Flickr from Madehurst that haven't been tagged or captioned with a place, and that therefore I can't find at all.
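To make the place-name problem concrete, here is a minimal sketch in Python of an exact-match tag search over those two photos. The tags are the ones quoted above; the photo ids and the function name `find_by_tag` are invented for illustration, and this is not a description of how Flickr's search actually works.

```python
# Tags copied from the two Flickr photos described above.
# The ids "bonfire_photo" and "church_photo" are made up for this sketch.
photos = {
    "bonfire_photo": {"bonfire", "bonfire night", "fireworks", "madehurst"},
    "church_photo": {
        "d40", "18-55mm f/3.5-56G", "lenstagged", "unmodified", "20081001",
        "madehurst", "church", "madehurst church", "west sussex", "england",
        "uk", "200810", "3008×2000",
    },
}


def find_by_tag(photos, tag):
    """Return the ids of photos that carry exactly this tag (case-insensitive)."""
    wanted = tag.strip().lower()
    return sorted(
        photo_id
        for photo_id, tags in photos.items()
        if wanted in {t.lower() for t in tags}
    )


print(find_by_tag(photos, "west sussex"))  # ['church_photo']
print(find_by_tag(photos, "madehurst"))    # ['bonfire_photo', 'church_photo']
```

A search on the county misses the bonfire picture entirely, because nothing in its tags says that Madehurst is in West Sussex. A controlled vocabulary would supply that broader-term link; free tags only do so if the tagger happens to think of it.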
Why, then, is there such enthusiasm among some internet gurus for tags? One of the articles we were pointed to for this week was Clay Shirky's "Ontology is overrated: categories, links, tags". Shirky has two main points. One is that methods of formal subject categorisation don't work for something as big and varied as the internet. I'm working with Library of Congress Subject Headings myself at the moment in my job, and I know their many weaknesses. But it's perfectly possible to admit that formal classification schemes often don't work effectively, and still point out that informal tagging has even more problems with inconsistency and inadequacy.
Shirky's false step seems to me to be assuming that you can somehow generate adequate forms of categorisation by aggregating poor forms of categorisation. I take this to be a variant of the "wisdom of crowds" approach: that averaging the views of many people can sometimes give a better answer than any individual one, as, for example, when guessing the weight of a cake. Unfortunately, aggregating answers only really works when people have similar levels of knowledge. If you need to ask the audience in "Who Wants to Be a Millionaire?" you'll almost certainly get the right answer for one of the early questions. For the million-pound question, they're unlikely to be much help.
In the same way, Shirky is wrong to claim that "As long as at least one other person tags something the way you would, you'll find it." Eventually, maybe. But if there are 241 Delicious bookmarks for "Edward II", how do you plod through them to find the ones about the king, as opposed to the play or the “mutant calypso/reggae/African style English dance band”? Or Edward Wells II?
Shirky starts with a contrast between Yahoo's attempts at categorisation and Google's lack of hierarchy. But Google doesn't actually make much use of tags: it uses hyperlinks and clusters of interest. A site is "about" something not just because of terminology within it, but because lots of other sites point to it. When you start looking at the useful forms of recommendations in large systems, they don't predominantly work on tags; they work on such clusters of interest. Amazon's and LibraryThing's recommendations are based on the fact that the people who buy or own one book also buy or own similar books. Delicious seems to work best when you can find a person whose interests mean they bookmark the kind of sites you're interested in, even if they tag them slightly differently. Flickr's ability to create pools of pictures can link together specific themes more effectively than tags.
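The "people who own one book also own similar books" idea can be sketched in a few lines of Python. This is only an illustration of simple co-ownership counting, not a description of how Amazon or LibraryThing actually compute their recommendations; the users, the book labels and the function name `also_owned` are all invented.

```python
from collections import Counter

# Hypothetical data: each (made-up) user and the set of books they own.
libraries = {
    "ann": {"carolingian_history", "charlemagne_biography", "monastic_life"},
    "bob": {"carolingian_history", "charlemagne_biography"},
    "cat": {"carolingian_history", "monastic_life", "us_politics"},
    "dan": {"us_politics", "gardening"},
}


def also_owned(libraries, book, top_n=3):
    """Rank other books by how many owners of `book` also own them."""
    counts = Counter()
    for owned in libraries.values():
        if book in owned:
            # Count every other book on the shelf of someone who owns `book`.
            counts.update(owned - {book})
    # Highest count first; ties broken alphabetically so the order is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


print(also_owned(libraries, "carolingian_history"))
# [('charlemagne_biography', 2), ('monastic_life', 2), ('us_politics', 1)]
```

Nobody in this toy data has labelled anything. The grouping falls out of what people chose to own, which is the sense in which such systems lean on clusters of interest, not on tags.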
The message I'd draw is that people are often poor at labelling things, but they're a lot better at knowing what they like or find useful. Should librarians be using tags? They may have a limited role in blogs or on social media sites, but I'm not convinced they're the right way forward for library catalogues. Why should we make users do the work of tagging, when we can provide far more useful information for them automatically via a "people who borrowed this also borrowed that" button? (At the University of Huddersfield Library, they've got even more whizzy tricks than this, thanks to Dave Pattern.) Tagging for yourself may make sense if your needs are simple: using other people's tags is often a waste of time.