More than 25 centimeters of snow in a day and the area has come to a halt. No paper, the television channels are also full of snow, and after struggling to get to the supermarket 100 meters away, I discovered they had run out of fresh food (and had no papers either). Luckily, the power is still on in my house; in the immediate neighbourhood, more than 250,000 people have been without power and heating for days.
Having run out of other attractive options to keep myself occupied over the weekend (the chess match was also cancelled), I picked up an old "problem": that of looking at weblog conversations. Lilia Efimova and Aldo de Moor have written a paper on the subject and drew a diagram by hand of what the conversations looked like. The first challenge was therefore to draw the diagram of a conversation automatically. Ignoring the notion of what a conversation is for the moment, one of the diagrams is shown below:
Left to right is time (the data consisted of posts by KM bloggers from 2004). Top to bottom is the chronological order in which bloggers entered the conversation. Colours are: blue (the post both links to and is linked from other posts in the conversation), red (the post is only linked from other posts in the conversation), green (the post only links to other posts in the conversation). Note that posts by the same blogger on (nearly) the same day overlap each other (one pixel is one day).
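The layout and colour rules can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the code that produced the diagrams; the function name `layout_conversation`, the shape of the `posts` and `links` inputs, and the example bloggers are all assumptions made for the sketch.

```python
from datetime import date

def layout_conversation(posts, links):
    """Hypothetical helper: compute (x, row, colour) for every post.

    posts: dict mapping post id -> (blogger, date of the post)
    links: iterable of (source id, target id) pairs inside the conversation
    """
    linking = {src for src, dst in links}   # posts that link to another post
    linked = {dst for src, dst in links}    # posts that are linked from another post
    start = min(day for _, day in posts.values())

    # One row per blogger, in the order in which they enter the conversation.
    rows = {}
    by_date = sorted(posts.items(), key=lambda item: (item[1][1], item[0]))
    for post_id, (blogger, day) in by_date:
        rows.setdefault(blogger, len(rows))

    layout = {}
    for post_id, (blogger, day) in posts.items():
        if post_id in linking and post_id in linked:
            colour = "blue"    # links to and is linked from
        elif post_id in linked:
            colour = "red"     # only linked from
        elif post_id in linking:
            colour = "green"   # only links to
        else:
            colour = None      # no links: not part of the conversation
        x = (day - start).days  # one pixel is one day
        layout[post_id] = (x, rows[blogger], colour)
    return layout

# Made-up example: Bob replies to Ann, Ann then replies to Bob.
example_posts = {
    "a": ("Ann", date(2004, 2, 26)),
    "b": ("Bob", date(2004, 3, 1)),
    "c": ("Ann", date(2004, 3, 5)),
}
example_links = [("b", "a"), ("c", "b")]
example_layout = layout_conversation(example_posts, example_links)
```

Because x is a whole number of days, two posts by the same blogger on the same day get identical coordinates, which is exactly the overlap mentioned above.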
For the example above, it is easy to see the "conversation" continues for a while (first post is Feb. 26, 2004 and last is Dec. 31, 2004).
Another one, and I quite like its geometry, is:
The topic of this "conversation" is the KM Europe conference in 2004. Several people comment on it independently (red posts), some also link to other posts (blue) and Gabriela Avram wraps most of them up (green one in the middle right).
The two examples above are spread widely over time. Below is an example of a very topical conversation compressed in time (BlogTalk 2.0).
The above are just some examples of how posts are linked over time. Do these links (and the networks they depict) constitute a conversation? This is a tricky question. The colours and the distribution over time provide some clues, but the meat of the matter has to come from text analysis. Is there a common topic that can be identified? Or, phrased otherwise, can it be determined "why" bloggers join the conversation?