tOKo does Movable Type

A little bit of progress on the open source version of tOKo (and the like), and in particular making it suitable for bloggers.

The first problem is turning a (your?) blog into a corpus. tOKo is pretty flexible as to what a corpus looks like, but the process must be automated. Jack Vinson and Ton Zijlstra provided great help by converting their blogs to a Movable Type export file and making the result available. As a result, tOKo now contains a "Create corpus from Movable Type" function. The nice thing is that several blogging platforms provide Movable Type (MT) export. For example, in TypePad (which I use) an MT file can be generated from the web interface. Moreover, an MT file contains the full content of a weblog, including comments and trackbacks. Compared to RSS or Atom there is also a drawback: the MT file does not contain permalinks. The Movable Type site provides some hints on how to turn post titles into a permalink, but I'm not sure how reliable this is.

On the right is a snapshot of the hierarchy of the tOKo corpus after automatically converting Ton's weblog from an MT export file. The hierarchy contains three indexes: Posts by date (at the top), Posts by topic (derived from the categories) and Posts by keyword. For all indexes, the comments and trackbacks are also in the hierarchy.
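The three indexes can be pictured as three groupings of the same posts. The sketch below (Python, for illustration only, not how tOKo stores its hierarchy) assumes each post is a dictionary with a "date" in the MT style "MM/DD/YYYY hh:mm:ss AM", a list of "categories" and a list of "keywords". Where the keywords come from is left open here; the function name and the dictionary keys are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

MT_DATE = "%m/%d/%Y %I:%M:%S %p"  # assumed date layout of an MT export


def build_indexes(posts):
    """Group posts by (year, month), by category and by keyword."""
    by_date = defaultdict(list)
    by_topic = defaultdict(list)
    by_keyword = defaultdict(list)
    for post in posts:
        when = datetime.strptime(post["date"], MT_DATE)
        # The whole post is stored, so its comments and trackbacks
        # travel along into every index.
        by_date[(when.year, when.month)].append(post)
        for category in post.get("categories", []):
            by_topic[category].append(post)
        for keyword in post.get("keywords", []):
            by_keyword[keyword.lower()].append(post)
    return {
        "date": dict(by_date),
        "topic": dict(by_topic),
        "keyword": dict(by_keyword),
    }
```

Because every index holds the post itself rather than a copy of its text, the comments and trackbacks attached to a post show up under all three indexes without extra work.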

This is only an initial attempt to make tOKo an interesting tool for bloggers. Perhaps Ton's reaction says it all: "it looks pretty".

The Movable Type format is pretty much fixed. This is not the case for RSS (and perhaps also Atom). The obvious next step would be to write a "Create corpus from RSS / Atom" function as well, to accommodate those not using Movable Type. For this, it would be necessary to have example "real data" (i.e. full-text RSS / Atom feeds, preferably with categories, optionally with comments and trackbacks). If you have linked to Jack, Ton, Lilia or myself in the past, this would be particularly interesting (also if you can only export to Movable Type). The only disadvantage of making your weblog available is that I might ask you to alpha-test tOKo :-).
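As a starting point for such a function, here is a rough Python sketch that pulls posts out of an RSS 2.0 feed using only the standard library. It is an assumption about what a first version could look like, not something tOKo does today. It expects un-namespaced "item" elements (so RSS 1.0 / RDF feeds and Atom would need their own handling) and prefers the full text in "content:encoded" over "description" when a feed provides it. The function name and the keys of the result are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, used by many full-text feeds.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def read_rss_items(xml_text):
    """Return one dictionary per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        full_text = item.findtext(CONTENT_NS + "encoded")
        if full_text is None:
            # Fall back to the (possibly shortened) description.
            full_text = item.findtext("description", default="")
        posts.append({
            "title": item.findtext("title", default=""),
            "permalink": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
            "categories": [c.text or "" for c in item.findall("category")],
            "text": full_text,
        })
    return posts
```

Unlike the MT file, the feed does carry a permalink for each post, but comments and trackbacks have no agreed place in it, which is exactly why real example feeds are needed.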

My email address is: anjo science uva nl (one at, two dots).

TrackBack

Listed below are links to weblogs that reference tOKo does Movable Type:

» tOKo Eats Movable Type Alive from Ton's Interdependent Thoughts
Those of you who have been at BlogTalk 2 in Vienna in 2004, have glimpsed some of the work Anjo Anjewierden (University of Amsterdam) has been doing on tracking communities, conversations and topics through blognetworks. Original aim was to... [Read More]

» tOKo from Monkeymagic
Anjo is progressing apace with his tOKo tool, which aims (I think) to give some analytical clout to questions such as whether or not there is any "knowledge transfer" between blogs. It's intriguing stuff. Spurred on by Ton's call to arms,... [Read More]

» OPML archive for your blog from Library clips
Can RSS wrapped in OPML be the answer for blog archiving, sharing blog data, and displaying a directory type index for your blog? Archiving blog content (blog posts) has been emerging again, and I'm hoping OPML can be the container. The questi... [Read More]

Comments

Anjo, I'm willing to help. I have not read up on the utility yet, and I blog on Blogger. But if my data is useful, please, urge me on!

Nancy (and other Blogger users),

For users of Blogger the way to export appears to be described here: http://help.blogger.com/bin/answer.py?answer=130&query=export&topic=0&type=f

Anjo- I can create an RSS file with all comments included, based on my current expanded feed: http://blog.jackvinson.com/atom_w_comments.xml. I could include trackbacks too, but there is no clear way to reference trackbacks in RSS. (Actually, there isn't a good way to do comments either. I've set them up as separate entries in the RSS feed.)

Jack, Comments and trackbacks are secondary: it is nice that Movable Type makes it so easy and portable. If you could provide me with an RSS and/or Atom feed as well that would be extremely helpful.

Many thanks once again.

I made a discoverable archive of my weblog available. If you're interested, the details are here:

http://matt.blogs.it/entries/00002181.html

M

Thanks Matt, I'll have a go at it.

If I can make them easier to process or more useful somehow, please let me know and I will see what I can do.

M

Matt: finished the conversion, it was easy.
This also seems a suitable format for exchanging full weblog content.