Thing 8 of 23: the tags don’t work

On 23 Things we were asked (last week!) to think about the use of tags (in the sense of informal user-created labels on internet objects). One of the problems with discussing tagging is that the same technology is used for tagging very different kinds of objects, in collections that vary widely in size, and can be used both by creators of a particular object and other users. Trying to generalise about these is tricky, so I want to look at a few particular cases.

Let’s start with my blog, which has around 340 posts. If you’re looking for a specific topic, such as “iGoogle” or “naked monks”, then it’s best to search the blog, because there’s only one blog post which refers to each (although there’s now this post as well). I use tags only for broader groupings: for example, to group all my posts on a particular conference, or on themes such as US politics, where I may not use the specific phrase in the post. It’s partly for that reason that I have a “medieval” tag, but not a “Carolingian” one, because almost all my discussions of the Carolingian empire will use the term and so can be found via searches.

Even though I keep my tagging so simple and I’m an experienced librarian, my tagging of entries is still not of particularly high quality and suffers from biases. I’m inconsistent about the use of the tag “religion”, as contrasted with specific religions, such as “Christianity”. I have a tag for “homosexuality”, but not “heterosexuality”, even though I discuss both. And I periodically discover that there are useful themes that I haven’t tagged, and either have to go back and update the tags, or decide that it’s too big a task (as with tagging things as “Carolingian”).

My tagging doesn’t need to be very good, because I’m tagging objects that already have lots of searchable text. In contrast, tagging non-textual objects, such as images or bookmarks, is a lot harder, and the quality of tagging is very variable. Take something as simple and definite as a place name. I found a couple of pictures on Flickr from the tiny Sussex village where I grew up. One is tagged “bonfire, bonfire night, fireworks, madehurst”. The other is tagged “d40, 18-55mm f/3.5-56G, lenstagged, unmodified, 20081001, madehurst, church, madehurst church, west sussex, england, uk, 200810, 3008×2000”. If I wanted to find photos from West Sussex, I’d only find the first one by knowing that Madehurst is in West Sussex (and very few people have ever heard of Madehurst). And I suspect there are pictures on Flickr from Madehurst that haven’t been tagged or captioned with a place, and that I therefore can’t find at all.
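As a rough illustration of that search problem, here is a minimal Python sketch. The photo ids and the search_by_tag function are made up for the example (this is not Flickr’s API), and the camera-metadata tags on the second photo are trimmed for brevity; the remaining tags are the ones quoted above.

```python
# Hypothetical records: the ids are invented, the tags are taken from the two
# photos described above (camera-metadata tags omitted from the second one).
photos = {
    "bonfire_photo": {"bonfire", "bonfire night", "fireworks", "madehurst"},
    "church_photo": {
        "madehurst", "church", "madehurst church",
        "west sussex", "england", "uk",
    },
}


def search_by_tag(collection, tag):
    """Return the ids of photos that carry exactly this tag (case-insensitive)."""
    wanted = tag.lower()
    return sorted(
        photo_id
        for photo_id, tags in collection.items()
        if wanted in {t.lower() for t in tags}
    )


west_sussex = search_by_tag(photos, "west sussex")
madehurst = search_by_tag(photos, "madehurst")

print(west_sussex)  # ['church_photo']
print(madehurst)    # ['bonfire_photo', 'church_photo']
```

The bonfire photo only turns up if the searcher already knows to ask for “madehurst”; nothing in its tags connects it to the county.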

Why, then, is there such enthusiasm among some internet gurus for tags? One of the articles we were pointed to for this week was Clay Shirky’s “Ontology is overrated: categories, links, tags”. Shirky has two main points. One is that methods of formal subject categorisation don’t work for something as big and varied as the internet. I’m working with Library of Congress Subject Headings myself at the moment in my job, and I know their many weaknesses. But it’s perfectly possible to admit that formal classification schemes often don’t work effectively, and still point out that informal tagging has even more problems with inconsistency and inadequacy.

Shirky’s false step seems to me to be assuming that you can somehow generate adequate forms of categorisation by aggregating poor forms of categorisation. I take this to be a variant of the “wisdom of crowds” approach: that averaging the views of many people can sometimes give a better answer than any individual one, as, for example, when guessing the weight of a cake. Unfortunately, aggregating answers only really works when people have similar levels of knowledge. If you need to “ask the audience” in Who Wants to Be a Millionaire?, you’ll almost certainly get the right answer for one of the early questions. For the million-pound question, they’re unlikely to be much help.

In the same way, Shirky is wrong to claim that “As long as at least one other person tags something the way you would, you’ll find it.” Eventually, maybe. But if there are 241 Delicious bookmarks for “Edward II”, how do you plod through them to find the ones about the king, as opposed to the play or the “mutant calypso/reggae/African style English dance band”? Or Edward Wells II?

Shirky starts with a contrast between Yahoo’s attempts at categorisation and Google’s lack of hierarchy. But Google doesn’t actually make much use of tags: it uses hyperlinks and clusters of interest. A site is “about” something not just because of the terminology within it, but because lots of other sites point to it. When you start looking at the useful forms of recommendation in large systems, they don’t predominantly work on tags; they work on such clusters of interest. Amazon’s and LibraryThing’s recommendations are based on the fact that the people who buy or own one book also buy or own similar books. Delicious seems to work best when you can find a person whose interests mean they bookmark the kind of sites you’re interested in, even if they tag them slightly differently. Flickr’s ability to create pools of pictures can link together specific themes more effectively than tags.
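Here is a minimal sketch of the “people who own this also own that” idea, in Python. The readers, the book labels and the also_owned function are all invented for illustration. This is not a claim about how Amazon or LibraryThing actually compute their recommendations, only a demonstration that simple co-ownership counts need no tags at all.

```python
from collections import Counter

# Hypothetical data: which (invented) readers own which (invented) books.
shelves = {
    "reader_a": {"charlemagne_biography", "carolingian_history", "edward_ii_biography"},
    "reader_b": {"charlemagne_biography", "carolingian_history"},
    "reader_c": {"charlemagne_biography", "monastic_life"},
    "reader_d": {"edward_ii_biography", "marlowe_plays"},
}


def also_owned(shelves, title, top_n=3):
    """Rank other titles by how many owners of `title` also own them."""
    counts = Counter()
    for books in shelves.values():
        if title in books:
            # Count every other book on the same shelf.
            counts.update(books - {title})
    # Highest count first; ties broken alphabetically so the order is stable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


recommendations = also_owned(shelves, "charlemagne_biography")
print(recommendations)
# [('carolingian_history', 2), ('edward_ii_biography', 1), ('monastic_life', 1)]
```

With this made-up data, owners of the Charlemagne biography most often also own the Carolingian history, and nobody had to agree on a label for either book.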

The message I’d draw is that people are often poor at labelling things, but they’re a lot better at knowing what they like or find useful. Should librarians be using tags? They may have a limited role in blogs or on social media sites, but I’m not convinced they’re the right way forward for library catalogues. Why should we make users do the work of tagging, when we can provide far more useful information for them automatically via a “people who borrowed this also borrowed that” button? (At the University of Huddersfield Library, they’ve got even more whizzy tricks than this, thanks to Dave Pattern.) Tagging for yourself may make sense if your needs are simple; using other people’s tags is often a waste of time.

4 thoughts on “Thing 8 of 23: the tags don’t work”

Niamh says:

June 22, 2010 at 7:48 am

This is a really interesting post, thank you! I do think there’s a role for user tagging alongside the formal catalogue, but agree that the recommendation system would be even more useful.

Celine says:

June 22, 2010 at 8:06 am

Very interesting post, Magistra.

I really enjoyed reading about Dave Pattern’s work at the University of Huddersfield Library, there are some really good ideas there using the vast amounts of data we already have (even if we don’t know it yet).

Cosma Shalizi says:

June 28, 2010 at 4:51 pm

This made me re-think my fondness for tags. I now suspect (read: guess) that tags will ultimately prove most valuable to other people when we start treating them as features for recommendation systems, rather than just doing brute-force searching on them. And some of the techniques (“latent semantic indexing” etc.) that help for text processing could also finesse the Madehurst/West Sussex issue.

Also: thanks for linking to David Pattern’s very cool stuff.

- magistra says:
  
  July 2, 2010 at 8:57 pm
  
  I don’t know how much research has been done on tagging yet (though another cam23 participant pointed out an old but interesting article). Intuitively, I’d expect tagging to work better for scientific/technical topics (which tend to be about something very specific), rather than the more nebulous concepts in many discussions of the humanities. And the ability of people to tag well is also going to vary a lot, because it’s essentially an analytical operation (what combination of things is this about?) and that kind of analytical thought, which comes very naturally to some people, is much harder for others. One of the things it would be very interesting to research is whether people who tend to use very limited tags would be able to provide more useful keywords if they instead wrote a free-form description of a resource, i.e. whether they find tagging hard because of the intellectual activity involved or because of its specific format.
  
  But you’re still stuck with the problem of scalability: if you have 100 resources and you use 10 tags, you probably have at most 20-30 things for every tag, and that’s manageable to browse through. If you then expand your collection to 1000 things, you either have to change your tags or you’re looking at a couple of hundred things with the same tag, and that’s losing its usefulness. There are some things tags probably do make sense for, but we need to think harder about when and how we use them.
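The arithmetic behind those figures can be sketched as follows, assuming (an assumption for illustration, not something stated above) that each item carries two or three tags and that the tags are used evenly. The function name is hypothetical.

```python
def average_items_per_tag(num_items, num_tags, tags_per_item):
    """Average number of items under each tag, if tags are used evenly."""
    return num_items * tags_per_item / num_tags


for num_items in (100, 1000):
    # Assumed range: each item gets between two and three tags.
    low = average_items_per_tag(num_items, num_tags=10, tags_per_item=2)
    high = average_items_per_tag(num_items, num_tags=10, tags_per_item=3)
    print(f"{num_items} items, 10 tags: about {low:.0f} to {high:.0f} items per tag")

# 100 items, 10 tags: about 20 to 30 items per tag
# 1000 items, 10 tags: about 200 to 300 items per tag
```

Keeping the vocabulary fixed while the collection grows tenfold makes every tag ten times as crowded, which is the “couple of hundred things” problem.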
  
Magistra et Mater

Where history, religion and feminism meet and have a long intellectual conversation
