Alice in Wonderland

A few posts ago I showed the picture on the right as a colourful visualisation of a (my) blog. The cells are terms I use, and the colour of a cell indicates how strongly it is related to neighbouring cells (red is high, green is medium, blue is low). When I showed a visualisation like this derived from a weblog of a colleague, a dialogue along the lines of Alice in Wonderland took place:
Alice: "This map of my words is rubbish. The cell for RSS reader is all red. It cannot be red, I don't blog about RSS readers." Humpty Dumpty frowned. "That cell has to be red. Look at the cells around it: RSS feed, reader, aggregator, and feed. It all makes perfect sense!" Alice's sister agreed with Humpty Dumpty: "Although, Alice does have these strange associations. Each time she finds a place with wifi, she tags it with orange juice. Can we find wifi on the map?" Alice, whose logic dictates that only wifi places that also serve fresh orange juice are real wifi places, explored the map further and found another red spot called blog community. "That is a real topic, I blog about that!" And Alice got really excited when she discovered that blog community had blog network, tie, boundary, and blogosphere as neighbours. "It is working, sometimes", Alice concluded, "and when it is working the red spots should be shown as peaks, and when it is not working they should be shown as valleys. That would make finding real topics a lot easier." Humpty Dumpty did not understand any of this. "The map is working, and if you don't like it, you'll have to write differently. Then, perhaps, you get a better map."
Strange as it may seem, Alice actually has a point. Previously, in Topics: on and off, I showed the strongest co-occurrences for Bush from a community of Knowledge Management bloggers:
bush: bush, patriot act, civil liberty, court, voter, neocons, litigation, opposition, john kerry, republican, welfare, abortion, gay, immigrant, vote, americans, swing, candidate, iran, gun, iraqis, mob, kerry, ballot, tax cut, county, legislation, province, charter, delay.
Replace Bush in the KM community by RSS reader in Alice's blog and the similarities (meta-wise, that is) become apparent. When Alice uses the term RSS reader she is bound also to mention the related terms given above (and vice versa for the related terms). The strong co-occurrence, and hence the red cell for RSS reader, occurs because she has a (small) set of terms she will (always) use when using RSS reader. In other words, a very strong relation is a possible indicator of not being on topic. After all, if Alice is on topic she has hundreds of terms available to choose from, and given that there are only a few neighbours in the visualisation, the real topics (peaks) will be orange rather than red.
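To make the argument a bit more concrete, here is a minimal sketch in Python. It is not how tOKo computes the map; the function names, the toy posts, and the scoring rule (the average share of a term's posts that also contain each of its few strongest neighbours) are all assumptions made up for illustration. It only shows why a side topic with a small fixed vocabulary can come out "redder" than a core topic with a varied one.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(posts):
    """Count in how many posts each term occurs, alone and together with others."""
    term_freq = Counter()
    pair_freq = defaultdict(Counter)
    for terms in posts:
        unique = set(terms)  # count a term once per post
        term_freq.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_freq[a][b] += 1
            pair_freq[b][a] += 1
    return term_freq, pair_freq


def neighbour_strength(term, term_freq, pair_freq, k=3):
    """Hypothetical 'redness' score: mean fraction of the term's posts that
    also contain each of its k strongest co-occurring neighbours."""
    top = pair_freq[term].most_common(k)
    if not top:
        return 0.0
    return sum(count / term_freq[term] for _, count in top) / len(top)


# Toy data (invented): a side topic that always drags along the same few
# terms, and a core topic written about with varying vocabulary.
posts = [
    ["rss reader", "rss feed", "aggregator", "feed"],
    ["rss reader", "rss feed", "aggregator", "feed"],
    ["rss reader", "rss feed", "aggregator"],
    ["blog community", "blog network", "tie"],
    ["blog community", "boundary", "blogosphere"],
    ["blog community", "tie", "blogosphere"],
    ["blog community", "blog network", "boundary"],
]

term_freq, pair_freq = cooccurrence_counts(posts)
side = neighbour_strength("rss reader", term_freq, pair_freq)
core = neighbour_strength("blog community", term_freq, pair_freq)
print(f"rss reader: {side:.2f}, blog community: {core:.2f}")
```

In this toy data the side topic scores higher than the core topic, even though the core topic appears in more posts. With a real blog the numbers would of course depend on how terms are extracted and how many neighbours are shown.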
The preliminary conclusion is that the maps may be more useful than I originally thought: my understanding of topics related to weblogs is growing and one day, hopefully, it will result in something useful. In our research we (with Lilia, Rogier and Robert de Hoog) are going to evaluate blog-related visualisations like the Alice in Wonderland map.
PS. A number of readers asked whether they can also draw maps of their blogs. The answer is yes: the map is part of the open source version of tOKo / Sigmund (at the moment I'm finalising the documentation, which should be finished by the middle of November at the latest). There is a small caveat though: at the moment tOKo can read Movable Type files automatically, and I'm at a loss as to what other formats people can export their blogs to.
Alice had lots of fun reading this and explaining to her husband what the whole thing was about :)
But seriously: the essence of my comment was that the core topics of my blog rarely appear as red areas, while some side topics (RSS) do appear as such.
Current "reds" are good topical clusters, no doubt about it, and I do blog about those. However, the whole thing doesn't help to define the key topics of a weblog. I guess that is because those involve more varied connections between terms, so the placing algorithm ends up scattering them all over the place.
And - I only tag 'fresh orange juice' places with fresh orange juice. High co-occurrence between 'wifi' and 'fresh orange juice' tells more about my selection criteria for wifi locations than about a natural correlation between those two :)))
Posted by: Lilia | October 20, 2006 at 05:07 PM