I'm still recovering a little bit from the workshop we organised yesterday. Our main objective was to show the tools we have developed for text analysis (tOKo, Sigmund) and ontology development (Triple20), and how these tools co-operate in supporting semantic web type applications.
The audience had lots of questions and comments, which is probably a good indicator of interest in what was being presented and demonstrated. The obvious question, regarding the availability of tOKo, was also partially answered (you can already download Triple20, although it lacks documentation, which I think is a big obstacle to getting it into the mainstream). The plan is to make tOKo Open Source; before that, there will be a restricted version for use by Ph.D. students and other researchers who have an immediate need for it (quite a few were at the workshop :-)).
Wouter Jansweijer said in his presentation that he only used 10% of the features of tOKo. This is shocking! And he certainly must be wrong. The user interface of tOKo makes available about 20% of the functionality that is inside. The remaining 80% is either used by internal modules (e.g. Sigmund), under development, too time-consuming to be interactive, or specific to certain types of documents. I think Wouter's 10% should be at least 20%, and even that would mean you can develop ontologies with tOKo using 20% × 20% = 4% of the functionality. Is there anyone who knows a beach where it is sunny 96% of the year?
A question of prime importance was related to the tOKo buttons that connect language and the semantic web. The default set of buttons is shown above; their function, and how these functions are mapped onto RDF(S), is given in the table below (this was actually part of Wouter's presentation):
The question was: "Suppose, in my domain, I have a relation (property) that is different from the default set". My response was: "Just add a button". Disbelief, and I was fortunate that the person who asked the question had not brought her glasses: the tomato missed the target. Can it really be that simple? Yes. As the table above shows, each of the buttons corresponds to an RDF(S) property, and designing buttons is a lot more difficult than adding RDF(S) properties. In the cooking domain, an example would be the usePreparationMethod property between food and food preparation methods, for instance: cake usePreparationMethod bake. And the nice thing is that tOKo will actually suggest it: enter cake, use the function Collocated verb, and there it is: bake is the most frequently collocated verb of cake. This is, of course, not a tOKo feature. tOKo doesn't know anything about cooking; it just happens that Chocolate and Zucchini does (and Clotilde likes baking cakes). tOKo, applying some statistics, just picks it up.
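To make the "just add a property" point concrete, here is a minimal sketch in Python of what defining usePreparationMethod and stating cake usePreparationMethod bake could look like as plain RDF triples. The namespace `http://example.org/cooking#`, the class names `Food` and `PreparationMethod`, and the helper functions are assumptions made up for illustration; this is not how tOKo or Triple20 store their data.

```python
# Standard RDF / RDFS vocabulary URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"

# Hypothetical namespace for the cooking domain (an assumption, not from tOKo).
EX = "http://example.org/cooking#"


def define_property(name, domain, range_):
    """Return the three triples that declare a new property with domain and range."""
    prop = EX + name
    return [
        (prop, RDF_TYPE, RDF_PROPERTY),
        (prop, RDFS_DOMAIN, EX + domain),
        (prop, RDFS_RANGE, EX + range_),
    ]


def assert_relation(subject, prop, obj):
    """Return one triple that uses the property between two resources."""
    return (EX + subject, EX + prop, EX + obj)


def to_ntriples(triples):
    """Serialise URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Declare the property, then state: cake usePreparationMethod bake.
triples = define_property("usePreparationMethod", "Food", "PreparationMethod")
triples.append(assert_relation("cake", "usePreparationMethod", "bake"))

print(to_ntriples(triples))
```

Three triples declare the property and one uses it, which is about all there is to it on the RDF(S) side. The button is the hard part.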
Another set of questions can be paraphrased under the heading "Can tOKo discover interesting patterns in a text by itself?". The answer to this question is "Yes, but it takes a lot of computing power". Given that tOKo is interactive, the UI does not provide these functions. A simple example (see above) is the noun-verb collocation of bake/cake. There are lots of nouns and verbs, and comparing each verb to each noun (possibly compound) takes time. Ultimately, such time-consuming functions are useful (provided the user knows they take time), and perhaps in the Open Source version of tOKo it will show a little popup that states: "Bake a cake!".
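For the curious, a rough sketch of the idea in Python. It assumes the text has already been split into sentences and tagged with parts of speech, and it simply counts how often each verb occurs in the same sentence as a given noun. The tags, the toy sentences and the function name `collocated_verbs` are all made up for illustration; whatever statistics tOKo applies are certainly more refined than raw counts.

```python
from collections import Counter


def collocated_verbs(sentences, noun):
    """Count verbs that occur in the same sentence as `noun`, most frequent first.

    `sentences` is a list of sentences, each a list of (word, tag) pairs,
    where tag is "NOUN" or "VERB" (a simplified, assumed tag set).
    """
    counts = Counter()
    for sentence in sentences:
        nouns = {word for word, tag in sentence if tag == "NOUN"}
        if noun in nouns:
            counts.update(word for word, tag in sentence if tag == "VERB")
    return counts.most_common()


# Toy corpus, invented for this example.
sentences = [
    [("bake", "VERB"), ("cake", "NOUN")],
    [("bake", "VERB"), ("cake", "NOUN"), ("oven", "NOUN")],
    [("eat", "VERB"), ("cake", "NOUN")],
    [("slice", "VERB"), ("zucchini", "NOUN")],
]

print(collocated_verbs(sentences, "cake"))  # [('bake', 2), ('eat', 1)]
```

Asking for one noun is quick enough to be interactive. Doing it unprompted means repeating this for every noun in the document, and with thousands of nouns and verbs that is where the computing power goes.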
You probably guessed it. The workshop was a lot of fun for presenters and audience, and I think the message was clear and appreciated: tOKo keeps it simple and obvious (you don't want to know how much time it took me to figure out what simple and obvious means in this context :-)).