
Blog certificates

Over the last couple of years I have become associated with research in which people set up experiments to investigate cognitively complex tasks (e.g. indexing learning material or learning Knowledge Management). Once the experimental data (action, time taken) has been obtained, hardly any methods exist to analyse it.

The amount of data is usually too large to manually "replay" the subject's behaviour and draw conclusions that way. So the raw data has to be "compressed" so that it can be interpreted by the researcher (or SPSS) and published (very important!). Unfortunately, this only reformulates the problem: how do we compress the data? Experience shows that this is not easy in the general case.

I was reminded of this problem when reading: Bridging the Gap: A Genre Analysis of Weblogs, Susan C. Herring, Lois Ann Scheidt, Sabrina Bonus, Elijah Wright, Proceedings of the 37th Hawaii International Conference on System Sciences (HICSS 2004), January 2004.

The authors collected a lot of data (203 blogs, to be precise) and analysed it manually. This results in observations such as: 70.4% of the blogs are a "personal journal", 51.2% of the links are to other blogs, and the average blog entry contains 210.4 words. Drawing conclusions from these compressions is not easy, although it has to be said that the authors succeeded in writing an interesting article that contains them.

The question that remained in my mind after reading the paper is whether it would be possible to create a "blog certificate". Such a certificate would present to the user (e.g. a potential blog reader) at a glance some fundamental characteristics of the blog, comparable to certificates for movies. So that a certificate does not have to be compiled manually, the characteristics must be mechanically computable. An initial selection would be:

  • Frequency (average number of posts per day).
  • Length (average number of words per post).
  • Quotes (percentage of content that is quoted from elsewhere).
  • Self links (percentage of links to own blog).
  • Other blogs (percentage of links to other blogs).
  • Other links (percentage of links to other websites).
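
The characteristics above could be computed along the following lines. This is only a minimal sketch: the `Post` record, its field names, and the idea that a set of known blog hostnames (`blog_hosts`) is available are all assumptions made for illustration, and deciding automatically which links point to blogs is left open.

```python
from dataclasses import dataclass, field
from datetime import date
from urllib.parse import urlparse


@dataclass
class Post:
    # Hypothetical record for one blog entry; the field names are assumptions.
    day: date
    text: str                                   # full text, quotes included
    quoted: str = ""                            # part quoted from elsewhere
    links: list = field(default_factory=list)   # URLs linked from the post


def certificate(posts, own_host, blog_hosts):
    """Compute the six characteristics for one blog (needs at least one post).

    own_host:   hostname of the blog itself
    blog_hosts: hostnames known to be other blogs (assumed to be given)
    """
    # Number of days covered, counting both the first and the last day.
    days = (max(p.day for p in posts) - min(p.day for p in posts)).days + 1
    words = sum(len(p.text.split()) for p in posts)
    quoted = sum(len(p.quoted.split()) for p in posts)
    hosts = [urlparse(u).netloc.lower() for p in posts for u in p.links]

    def pct(count, total):
        # Percentage, defined as 0 when there is nothing to count.
        return 100.0 * count / total if total else 0.0

    self_n = sum(h == own_host for h in hosts)
    blog_n = sum(h in blog_hosts and h != own_host for h in hosts)
    return {
        "frequency": len(posts) / days,
        "length": words / len(posts),
        "quotes": pct(quoted, words),
        "self_links": pct(self_n, len(hosts)),
        "other_blogs": pct(blog_n, len(hosts)),
        "other_links": pct(len(hosts) - self_n - blog_n, len(hosts)),
    }
```

The three link percentages add up to 100 by construction, so every link is counted in exactly one category.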

If we had averages for the above and then computed relative values for a particular blog, the result might be useful. For example, a relatively high score on "Self links" points to an introspective blogger, and a relatively high score on "Other links" points to a blog that filters the web (an interpretation also given in the cited paper).
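
One simple reading of "relative value" is the ratio of a blog's score to the average over a collection of blogs. The sketch below assumes that reading; the function name and the dictionary layout are made up for illustration, and other normalisations (e.g. percentiles) would serve equally well.

```python
def relative_values(blog, population):
    """Express each characteristic of `blog` relative to the population mean.

    blog:       dict mapping characteristic name to value
    population: list of such dicts, one per blog in the comparison set
    A result of 1.0 means "average"; 2.0 means twice the average.
    """
    result = {}
    for name, value in blog.items():
        mean = sum(b[name] for b in population) / len(population)
        # A zero mean gives no meaningful ratio, so report None in that case.
        result[name] = value / mean if mean else None
    return result
```

With this convention, a blog whose "Self links" value comes out well above 1.0 would be the introspective blogger of the example.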

In true metification fashion the relative values could be compressed to a six-coloured icon that might aid navigation in blog space.
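
What such an icon encodes is an open question. As one possible encoding, each of the six characteristics could get a cell coloured by whether the blog is below, around, or above average. The colour names, the cell order, and the `tolerance` threshold in this sketch are arbitrary assumptions.

```python
# Fixed order of the six cells in the icon (names are assumptions).
CHARACTERISTICS = ["frequency", "length", "quotes",
                   "self_links", "other_blogs", "other_links"]


def icon_colours(relative, tolerance=0.25):
    """Return one colour name per characteristic, in a fixed order.

    relative: dict mapping characteristic name to its value relative to
              the average (1.0 means average).
    """
    colours = []
    for name in CHARACTERISTICS:
        value = relative[name]
        if value > 1.0 + tolerance:
            colours.append("red")     # clearly above average
        elif value < 1.0 - tolerance:
            colours.append("blue")    # clearly below average
        else:
            colours.append("grey")    # about average
    return colours
```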


Listed below are links to weblogs that reference Blog certificates:

» Metification and blog certificates. from Mathemagenic
My colleagues are moving into blogging :) One of them, Anjo Anjewierden [Read More]

» Blog certificates from Blogzine
A professor at the University of Amsterdam and expert in knowledge management, Anjo Anjewierden, puts forward in Blog Certificates the idea of a "certificate" that would present readers with the basic information about a weblog in condensed form. [Read More]

» Lilia introduces Anjo from Monkeymagic
New bloggers introduced by Mathemagenic: on Anjo and blog certificates [Read More]

Comments

Anjo,

just got an idea: what if we experiment with a blog certificate for my weblog? :)))

Lilia,

I will, but first have to collect enough data. Meanwhile, I am awaiting
suggestions for what the certificate might look like.

Anjo,

I found some comments on your ideas in Spanish :)
As you don't have a trackback from it - http://blogzine.blogalia.com/historias/16266

The author also points to http://www.popdex.com/ticker/

>>the result might be useful

For what?
It is only useful if it is a statistical predictor for something else, e.g. Blog Quality.
Prove the hypothesis manually first; then we can judge whether we want to provide the data automatically.

Personally, I think Blog Quality depends on the content, not on its length or link density.

Stu

Stu,
"It is only useful if it is a statistical predictor for something else, e.g. Blog Quality". Unfortunately, quality is not easy to measure, and movie certificates certainly do not measure quality, yet they are used (and useful :-)).

"Personally, I think Blog Quality depends on the content, not on its length or link density." True, in general, but still raw counting can be useful to appreciate quality. For example, I prefer television stations that have few (or no) advertising breaks. And a station that has good content and a lot of ads will make me zap.

Anyway, in the other posts there is some proof I'm not just counting...
Anjo.
