Recently I have been interested in creating hierarchical taxonomies from flat tag data. Tagging systems like del.icio.us, Flickr, and CiteULike tend to have (relatively) flat tags. This means that while one can easily browse by a tag, like photography, one cannot as easily see tags which are broader or narrower than that tag. As a result, it is also difficult to get a broad overview of what tags exist in these sorts of systems, aside from frequency-based displays like tag clouds.
Some commentators have suggested that ontology is overrated, even irrelevant, and that there is no hierarchy in ideas, only links:
Tagging systems are excellent at the task that they were designed for: allowing a large, disparate group of users to collaboratively label massive, dynamic information systems like the web, media collections of millions of images, and so on. We are working to make these systems better by automating the production of hierarchical taxonomies that describe the data from the raw flat tags generated by users.
I've found some interesting features of tagging datasets from del.icio.us and CiteULike which have in turn suggested reasonably good ways to create hierarchies. An example hierarchy generated from del.icio.us using some of these methods is here: mgfgsm-hierarchy.
Title: | Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems |
Authors: | Paul Heymann and Hector Garcia-Molina |
Type: | Preliminary Technical Report |
Accessible: | (info)(ps)(pdf) |
Description: | This paper describes a simple algorithm for constructing hierarchies in social tagging systems that usually works reasonably well. The main contribution is a notion of generality in social tagging systems based on centrality in a similarity graph. |
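
The idea in the description above, generality as centrality in a similarity graph, can be illustrated with a minimal sketch. This is not the report's implementation: the function names, the `"ROOT"` marker, the two thresholds, the use of cosine similarity over tag-to-object counts, and the use of degree centrality as the centrality measure are all assumptions chosen to keep the example short. The report may define similarity and centrality differently.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_hierarchy(annotations, graph_threshold=0.3, attach_threshold=0.3):
    """Return {tag: parent} from (tag, object_id) pairs.

    Hypothetical sketch: thresholds and degree centrality are assumptions.
    Top-level tags get the parent "ROOT".
    """
    # Represent each tag as a vector counting the objects it annotates.
    vectors = defaultdict(lambda: defaultdict(int))
    for tag, obj in annotations:
        vectors[tag][obj] += 1
    tags = sorted(vectors)

    # Similarity graph: link two tags when their similarity is high enough.
    # A tag's degree in this graph serves as a rough proxy for generality.
    degree = {t: 0 for t in tags}
    for i, s in enumerate(tags):
        for t in tags[i + 1:]:
            if cosine(vectors[s], vectors[t]) >= graph_threshold:
                degree[s] += 1
                degree[t] += 1

    # Place the most central (assumed most general) tags first; ties are
    # broken alphabetically so the result is deterministic.
    order = sorted(tags, key=lambda t: (-degree[t], t))
    parent = {}
    for tag in order:
        best, best_sim = "ROOT", 0.0
        for placed in parent:
            sim = cosine(vectors[tag], vectors[placed])
            if sim > best_sim:
                best, best_sim = placed, sim
        # Attach under the most similar placed tag, or under the root
        # when nothing already placed is similar enough.
        parent[tag] = best if best_sim >= attach_threshold else "ROOT"
    return parent


# Toy data (invented for illustration only).
toy = [
    ("photography", "p1"), ("photography", "p2"),
    ("photography", "p3"), ("photography", "p4"),
    ("nikon", "p1"), ("nikon", "p2"),
    ("canon", "p3"), ("canon", "p4"),
    ("python", "d1"), ("python", "d2"),
]
hierarchy = build_hierarchy(toy)
```

On this toy data, photography co-occurs with both nikon and canon, so it has the highest degree and is placed first; nikon and canon then attach beneath it, while python, which shares no objects with the others, stays at the top level.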