
Passion and Profession (2)

Below is an image of the "ties" in the weblog community described in the previous post. The "ties" are based entirely on text analysis, not on (reciprocal) linking. For each member of the community, a tie is created to the five blogs that are most similar in term usage. (Precisely how this was done would make this post much longer, would include some mathematical symbols, and, to be honest, I'm not entirely sure the method used is valid at all.) The resulting tie data is fed into NetDraw, a Social Network Analysis tool Stephanie rather seems to like.
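For readers who want something concrete, here is a minimal sketch of the general idea of linking each blog to the blogs most similar in term usage. It is not the method used for the image: the raw word counts, the cosine similarity measure, and the names `term_vector`, `cosine` and `top_k_ties` are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag of lower-cased words (assumption: no stemming, stop words or weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_ties(blogs, k=5):
    """Return directed ties (blog, neighbour) to the k most similar other blogs.

    `blogs` maps a blog name to its concatenated text (hypothetical input format).
    Equal similarities are broken alphabetically so the result is deterministic.
    """
    vectors = {name: term_vector(text) for name, text in blogs.items()}
    ties = []
    for name, vec in vectors.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in vectors.items()
            if other != name
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        ties.extend((name, other) for _, other in scored[:k])
    return ties
```

The ties are directed: blog A can have B among its five nearest neighbours without B having A among its own. A list of pairs like this could then be converted into whatever input format a network drawing tool expects.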

Colours of the nodes in the image correspond to a tentative manual classification of the weblogs involved by Lilia and Stephanie: Knowledge Management (blue), Education (red), Internet Research (black), A-list (dark green). If all is well, the text analysis should be able to pick up the manual classification and cluster the blogs by "topic". Theoretically, the notion of an A-list as a category poses a small problem, as A-list is orthogonal to topic. Nevertheless, it apparently clusters well. There is a large blue cluster in the North-West, an Internet Research cluster in the South-West, and the A-list bloggers are happily cuddling in the North-East.
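One rough way to put a number on "picks up the manual classification" is to count how many ties connect two blogs with the same manual label. This is only a sketch of such a check, not something that was done for this post; the function name `same_label_fraction` and the toy blog names and labels are hypothetical.

```python
def same_label_fraction(ties, labels):
    """Fraction of ties whose two endpoints carry the same manual label.

    `ties` is a list of (blog, neighbour) pairs and `labels` maps a blog name
    to its category; ties with an unlabelled endpoint are ignored.
    """
    labelled = [(a, b) for a, b in ties if a in labels and b in labels]
    if not labelled:
        return 0.0
    same = sum(1 for a, b in labelled if labels[a] == labels[b])
    return same / len(labelled)


# Toy data, invented purely to show the call.
toy_ties = [("blog1", "blog2"), ("blog1", "blog3"), ("blog3", "blog4")]
toy_labels = {
    "blog1": "Knowledge Management",
    "blog2": "Knowledge Management",
    "blog3": "Education",
    "blog4": "Education",
}
toy_score = same_label_fraction(toy_ties, toy_labels)
```

A value well above what random ties would give suggests the text analysis agrees with the manual labels; it says nothing about whether the labels themselves are right.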

Perhaps weblog communities exist :-).

NB. Data is based on analysis of 2004 blogging.




Listed below are links to weblogs that reference Passion and Profession (2):

» More on the Science of Networks from ghjunior.com
I've finally managed to finish reading through Albert-László Barabási's Linked. It's a great book and I'm going to read over some parts again before I start making some conclusions of my own. In the meanwhile I...

» You say profession, I say passion from Monkeymagic
Missed this while on my travels. Anjo, Lilia and Stephanie have been doing some interesting things analysing blog communities through terminology rather than links. [Anjo writes it up here, here and here.] One of the upshots of it all is...

» Relationships Above Information Exchange from Ton's Interdependent Thoughts
Whenever I talk about knowledge work and the role of social software in doing knowledge work in complex environments I always say that the key to social software is its emphasis on relationships above information. Whatever you do with social...

Comments

Cool :)

Next to all other things - it may be time to revisit the ideas about controlling for quotes from others in the text...

Good stuff, Anjo. I got a preview Wednesday night with Lilia in Chicago.

What is the next step for this? Can it do any predictive work? Could one take a new blog and analyze the text to determine whether it clumps with the others?

The next step is to think! Two rather obvious directions are:

- Communities change over time.
- Does linking "predict" terminology, or does terminology "predict" linking?

Part of the research agenda, we call it "Dynamic Conversations".

Great stuff, Anjo! Would also be interested in seeing how it fits with the various people's geolocation, if only on a per-continent basis. E.g. from a cursory glance, there's more US "action" on the right ... sort of ;)

Lilia and Stephanie labelled the data by "topic" (KM and such), gender and country. I played a little with gender, and some patterns showed :-). Did not look at location, although you are right in noticing there may be relations there as well (for example, there is a Scandinavian and a Dutch cluster, I meet some of them at birthday parties).

One problem is that, for my blog, I have to select from the data to keep the posts readable and "somewhat" relevant.

Hello Anjo, your experiment is an interesting one. I was triggered to read that blog relationships are established on the basis of text analysis. Was this done on a lemma-basis (word corpus) or semantic basis (similarity of meaning)?

How would this model help you establish chunks of information across various blogs on the basis of text similarity, if the blogs analysed cover a wider variety of topics? Would a blog covering multiple, different topics rank lower in text similarity than a narrow-topic blog?
I have done an experiment with my own blog: I have integrated the four topic blogs I had into one. The result: reader satisfaction has gone up, but I am observing that automated matching engines are having a hard time classifying the blog. This raises the question of what the right level of analysis is: person, blog, topic or posting?
Best wishes, Olaf.
