One of the noun phrases identified in the US Presidential Debate between Bush and Kerry was: Osama bin. One may question whether the name of a person is a noun phrase at all. In text analysis, persons are normally labelled as a specific kind of named entity (along with organisations, locations, etc.).
Acknowledging the sub-title of this blog ("The source code is the ultimate documentation"), I understand what has gone wrong. The three-letter word "bin" has not been recognised as a person preposition. Although person prepositions are not very common in English names, they are very common in some European names (in particular Dutch, but also Spanish, French and Italian) and in Arabic. Unfortunately, a lot of language technology developed in English-speaking countries ignores this. Perhaps the intelligence agencies hired by the Bush and Blair governments should consider this if they think the threats originate from persons from non-English-speaking countries. Knowing the correct name of your enemy seems a first step in the right direction.
The following is a list of person prepositions I use in Sigmund: al, bin, da, de, del, den, der, des, di, din, du, el, het, in, la, las, los, mac, mc, op, s, t, ta, ten, ter, van, von, and y. Using this list Sigmund correctly identifies Osama bin Laden as the name of a person.
I believe this list to be complete. If not, let me know.
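To make the idea concrete, here is a minimal sketch in Python of how such a list can be used. It is not Sigmund's actual code: the names PERSON_PREPOSITIONS and find_person_names are hypothetical, and the rule (a run of capitalised words, optionally joined by lower-case person prepositions) is an assumption made for illustration only.

```python
import re

# Person prepositions as listed in the post above.
PERSON_PREPOSITIONS = frozenset({
    "al", "bin", "da", "de", "del", "den", "der", "des", "di", "din", "du",
    "el", "het", "in", "la", "las", "los", "mac", "mc", "op", "s", "t", "ta",
    "ten", "ter", "van", "von", "y",
})


def is_capitalised(token):
    """True if the token starts with an upper-case letter."""
    return token[:1].isupper()


def find_person_names(text, prepositions=PERSON_PREPOSITIONS):
    """Return candidate person names found in text (hypothetical heuristic).

    A candidate is a run of capitalised words in which lower-case person
    prepositions may appear between two capitalised words. Single
    capitalised words are ignored. Punctuation and sentence boundaries
    are not taken into account.
    """
    tokens = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    names = []
    i = 0
    while i < len(tokens):
        if not is_capitalised(tokens[i]):
            i += 1
            continue
        name = [tokens[i]]
        j = i + 1
        while j < len(tokens):
            if is_capitalised(tokens[j]):
                name.append(tokens[j])
                j += 1
                continue
            # Look ahead over one or more lower-case person prepositions.
            k = j
            while k < len(tokens) and tokens[k] in prepositions:
                k += 1
            # Accept them only if a capitalised word follows.
            if k > j and k < len(tokens) and is_capitalised(tokens[k]):
                name.extend(tokens[j:k + 1])
                j = k + 1
            else:
                break
        if len(name) > 1:
            names.append(" ".join(name))
        i = j
    return names


print(find_person_names("the hunt for Osama bin Laden continues"))
# prints ['Osama bin Laden']
```

Passing an empty set as prepositions makes the sketch lose the name altogether, because "bin" then breaks the run of capitalised words. Note also that a heuristic this naive produces false positives: since "in" is on the list, a phrase such as "Bush in Iraq" would be reported as one name, so a real system needs more context than this.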
Later. Victor de Boer points out that "ibn" and "ben" are also (Arabic) person prepositions.
Speaking of Arabic names, there are also "ibn", "bint", "abu" and "umm" (from http://faculty.juniata.edu/tuten/islamic/names.html).
Posted by: Map | November 27, 2004 at 12:14 AM