Over the last couple of years I have become involved in research in which people set up experiments to investigate cognitively complex tasks (e.g. indexing learning material or learning Knowledge Management). Once the experimental data (action, time taken) has been obtained, hardly any methods exist to analyse it.
The amount of data is usually too large to manually "replay" the subject's behaviour and draw conclusions that way. So the raw data has to be "compressed" such that it can be interpreted by the researcher (or SPSS) and published (very important!). Unfortunately, this only reformulates the problem: how do we compress the data? Experience shows that this is not easy in the general case.
I was reminded of this problem while reading: Bridging the Gap: A Genre Analysis of Weblogs, Susan C. Herring, Lois Ann Scheidt, Sabrina Bonus, Elijah Wright, Proceedings of the 37th Hawaii International Conference on System Sciences (HICSS 2004), January 2004.
The authors have collected a lot of data (203 blogs to be precise) and manually analysed these. This results in observations like 70.4% of the blogs are a "personal journal", 51.2% of the links are to other blogs, and the average blog entry contains 210.4 words. Drawing conclusions from these compressions is not easy, although it has to be said that the authors succeeded in writing an interesting article that contains them.
The question that remained in my mind after reading the paper is whether it would be possible to create a "blog certificate". Such a certificate would present to the user (e.g. a potential blog reader) at a glance some fundamental characteristics of the blog, comparable to certificates for movies. So that a certificate does not have to be compiled manually, the characteristics must be mechanically computable. An initial selection would be:
- Frequency (average number of posts per day).
- Length (average number of words per post).
- Quotes (percentage of content that is quoted from elsewhere).
- Self links (percentage of links to own blog).
- Other blogs (percentage of links to other blogs).
- Other links (percentage of links to other websites).
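As a rough illustration of what "mechanically computable" could mean here, the six characteristics might be derived from already-parsed posts along these lines. This is only a sketch: the `Post` record, the `certificate` function, and the idea of passing in a set of known blog hostnames are assumptions made for the example, not something taken from the paper.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass
class Post:
    """Hypothetical, already-parsed blog post (an assumption for this sketch)."""
    published: date
    words: int          # total words in the post
    quoted_words: int   # words inside quoted passages
    links: list         # absolute URLs linked from the post


def certificate(posts, own_host, blog_hosts):
    """Compute the six characteristics for one blog.

    own_host:   hostname of the blog itself.
    blog_hosts: hostnames known to be other blogs (assumed to be supplied).
    """
    if not posts:
        raise ValueError("need at least one post")

    # Days covered by the sample, counting both the first and the last day.
    first = min(p.published for p in posts)
    last = max(p.published for p in posts)
    days = (last - first).days + 1

    total_words = sum(p.words for p in posts)
    hosts = [urlparse(url).hostname for p in posts for url in p.links]
    n_links = len(hosts)
    n_self = sum(h == own_host for h in hosts)
    n_blogs = sum(h in blog_hosts and h != own_host for h in hosts)

    def pct(part, whole):
        # Percentage, defined as 0 when there is nothing to divide by.
        return 100.0 * part / whole if whole else 0.0

    return {
        "frequency": len(posts) / days,
        "length": total_words / len(posts),
        "quotes": pct(sum(p.quoted_words for p in posts), total_words),
        "self_links": pct(n_self, n_links),
        "other_blogs": pct(n_blogs, n_links),
        "other_links": pct(n_links - n_self - n_blogs, n_links),
    }
```

The hard part in practice is deciding which hostnames count as "other blogs"; the sketch sidesteps that by taking the set as input.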
If we had averages for the above and then computed relative values for a particular blog, the result might be useful. For example, a relatively high score on "Self links" points to an introspective blogger, and a relatively high score on "Other links" points to a blog that filters the web (also given in the cited paper).
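One simple way to get relative values is to divide each characteristic by its average over all sampled blogs, so that 1.0 means "typical". The sketch below assumes this ratio definition; the function names, the dictionary keys, and the threshold of 1.5 for "relatively high" are all choices made for the example.

```python
def relative_scores(blog, averages):
    """Divide each characteristic by the average over all sampled blogs.

    A score of 1.0 means "typical"; above 1.0 means relatively high.
    A characteristic whose average is 0 gets None (no meaningful ratio).
    """
    return {
        name: (value / averages[name] if averages[name] else None)
        for name, value in blog.items()
    }


def interpret(scores, threshold=1.5):
    """Apply the two readings mentioned above; the threshold is an assumption."""
    notes = []
    if (scores.get("self_links") or 0) > threshold:
        notes.append("introspective blogger")
    if (scores.get("other_links") or 0) > threshold:
        notes.append("filters the web")
    return notes


# Invented numbers, purely to show the calculation.
blog = {"self_links": 60.0, "other_links": 10.0}
averages = {"self_links": 20.0, "other_links": 40.0}
scores = relative_scores(blog, averages)
print(scores, interpret(scores))
```

Other definitions (z-scores, percentile ranks) would work as well and are less sensitive to a few extreme blogs; the ratio is used here only because it is the easiest to read.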
In true metification fashion the relative values could be compressed to a six-coloured icon that might aid navigation in blog space.
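To make the icon idea slightly more concrete, each relative value could be mapped to one coloured cell. The colour scheme below (blue for low, white for average, red for high, on a logarithmic scale) is purely an assumption for illustration, as are the function names and the ordering of the six cells.

```python
import math

# Hypothetical ordering of the six cells in the icon.
ORDER = ("frequency", "length", "quotes",
         "self_links", "other_blogs", "other_links")


def score_to_colour(score, span=2.0):
    """Map a relative score (1.0 = average) to a hex colour.

    log2 is used so that half and double the average sit equally far
    from white. `span` is the number of doublings at which the colour
    saturates (an assumption).
    """
    if score <= 0:
        t = -1.0
    else:
        t = max(-1.0, min(1.0, math.log2(score) / span))
    fade = round(255 * (1 - abs(t)))
    if t >= 0:
        r, g, b = 255, fade, fade   # white towards red
    else:
        r, g, b = fade, fade, 255   # white towards blue
    return f"#{r:02x}{g:02x}{b:02x}"


def icon(scores):
    """Return six hex colours, one per characteristic, in a fixed order."""
    return [score_to_colour(scores[name]) for name in ORDER]
```

A fixed cell order matters more than the exact palette: a reader can only learn to scan the icon if "Self links" is always in the same position.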
Anjo,
just got an idea: what if we experiment with blog certificate for my weblog? :)))
Posted by: Lilia | March 01, 2004 at 10:49 AM
Lilia,
I will, but first have to collect enough data. Meanwhile, I am awaiting suggestions for what the certificate might look like.
Posted by: Anjo | March 01, 2004 at 02:44 PM
Anjo,
I found some comments to your ideas in Spanish :)
As you don't have a trackback from it - http://blogzine.blogalia.com/historias/16266
The author also points to http://www.popdex.com/ticker/
Posted by: Lilia | March 02, 2004 at 01:43 AM
>>the result might be useful
For what?
It is only useful if it is a statistical predictor for something else, e.g. Blog Quality.
Prove the hypothesis first manually, then we can judge if we want to provide data automatically.
Personally, I think Blog Quality depends on the content, not on its length or link density.
Stu
Posted by: Stu Savory | March 15, 2004 at 02:16 PM
Stu,
"It is only useful if it is a statistical predictor for something else, e.g. Blog Quality". Unfortunately, quality is not easy to measure, and certainly movie certificates do not measure quality, still they are used (and useful :-)).
"Personally, I think Blog Quality depends on the content, not on its length or link density." True, in general, but still raw counting can be useful to appreciate quality. For example, I prefer television stations that have few (or no) advertising breaks. And a station that has good content and a lot of ads will make me zap.
Anyway, in the other posts there is some proof I'm not just counting...
Anjo.
Posted by: Anjo | March 16, 2004 at 12:21 AM