|Evaluation of modules|
|Evaluation of facts|
|Evaluation of semantic search|
|Evaluation of the knowledge editor|
The ultimate goal of KYOTO is the extraction of facts from text. This is done by the Kybot system at the end of the processing pipeline. The Kybots depend on the results of all the previous processing modules, so errors made in earlier stages may stack. For example, if the parser assigns the wrong part-of-speech to a word, the word-sense-disambiguation will fail to assign the proper concept, and the onto tagger will not insert the correct ontological implications into the KAF file. Consequently, the Kybots may fail because a profile specified for a certain part-of-speech does not match, or because the wrong profile specified for a concept is matched, thus affecting recall and/or precision.
That said, modules may also introduce errors that have no effect on the extraction of facts. It is therefore wise to evaluate both the modules and the end-applications that depend on them, to learn about the effects and relevance of errors. Accordingly, the results of KYOTO have been evaluated in different ways:
- System evaluation:
  - Evaluation of different modules
  - Evaluation of the facts extracted by the KYOTO system
- User evaluation:
  - Use of the knowledge editor
  - Use of the semantic search in the extracted facts
Evaluation of modules
Module evaluations have been carried out for Word-Sense-Disambiguation and Named-Entity-Recognition. These are the most important modules that connect text to concepts.
- Word-Sense-Disambiguation (WSD): the results are described in the proceedings of the SemEval2010 workshop that was organized by KYOTO: http://aclweb.org/anthology-new/S/S10/S10-1013.pdf
- Named-Entity-Recognition (NER): the results are described in the KYOTO deliverable D09.2.
Named-entity recognition (NER) has been developed for dates and places. Dates and places are necessary to turn event-relations into facts with a time and place. The NER module detects 85.3% of the locations and 92.5% of the dates. 95.5% of the dates are correctly interpreted; of the locations, 89.1% are disambiguated to the correct country and 42.0% to the correct feature type.
Evaluation of facts
We developed a complete package to evaluate the output of the event/fact mining by the Kybots. We defined a neutral triplet representation for the facts, which consists of:
- a relation
- a list of word token identifiers that represent the event
- a list of word token identifiers that represent the participant
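The triplet representation above lends itself to a simple set-based comparison between system output and a gold standard. The following Python sketch is only an illustration and is not the code from the KYOTO evaluation package: the class and function names, the token identifiers such as "w12", the relation labels, and the exact-match criterion are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Triple:
    """One fact: a relation linking event tokens to participant tokens."""
    relation: str
    event_tokens: FrozenSet[str]        # word token identifiers of the event
    participant_tokens: FrozenSet[str]  # word token identifiers of the participant


def make_triple(relation: str,
                event_tokens: Iterable[str],
                participant_tokens: Iterable[str]) -> Triple:
    # Assumption: token order is irrelevant, so the lists are stored as sets.
    return Triple(relation, frozenset(event_tokens), frozenset(participant_tokens))


def precision_recall(system: Iterable[Triple],
                     gold: Iterable[Triple]) -> Tuple[float, float]:
    """Score system triples against gold triples using exact matching."""
    system_set, gold_set = set(system), set(gold)
    correct = len(system_set & gold_set)
    precision = correct / len(system_set) if system_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical data: identifiers and relation names are invented.
gold = [
    make_triple("patient", ["w12"], ["w14", "w15"]),
    make_triple("location", ["w12"], ["w18"]),
]
system = [
    make_triple("patient", ["w12"], ["w15", "w14"]),  # matches the first gold triple
    make_triple("agent", ["w12"], ["w10"]),           # not in the gold standard
]

precision, recall = precision_recall(system, gold)
```

Exact matching is the strictest possible criterion; an evaluation could equally give credit for partially overlapping token sets, which would raise both scores.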
The evaluation data and software can be downloaded from the following URL:
https://kyoto.let.vu.nl/~kyoto/files/data/kybotevaluation/11767/KybotEvaluationDataMarch2011.zip (12 MB)
We also carried out an open competition evaluation in combination with the 2nd KYOTO workshop for which we created a different gold-standard. Two other groups participated. The details can be found on the workshop-event page.
Evaluation of semantic search
The facts extracted by the Kybots are indexed and searchable in a cross-lingual semantic index. To evaluate the usefulness of these facts, we carried out a benchmark evaluation of the search indexes and an end-user evaluation. In both, we compare the search on Kybot facts with a standard text search solution. The benchmark evaluation checks to what extent the Kybot facts represent all the information in the database. We compared the recall of the semantic search in the Kybot system with the recall of a standard text search. This showed that the semantic search achieves 35% of the coverage of the standard system. The extraction by the Kybots is thus a good step towards a full text representation, but there is still room for improvement. The details of the benchmark can be found in the KYOTO deliverable D09.5.
For the end-user evaluation, 21 users (students and environmentalists) had to find answers to questions using the standard retrieval system, a mash-up fact retrieval system and the semantic search in the Kybot facts. The evaluation showed no significant difference in quality and performance, but the users had major difficulties understanding and appreciating the innovative features of the semantic search in the fact index. Some less conservative users, on the other hand, liked precisely these features. The details are discussed in the KYOTO deliverable D09.6.
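One possible reading of the coverage figure is the recall of the semantic search divided by the recall of the text search on the same set of relevant documents. The sketch below illustrates that ratio in Python; the function names and document identifiers are hypothetical, and the numbers are not taken from deliverable D09.5.

```python
from typing import Iterable


def recall(retrieved: Iterable[str], relevant: Iterable[str]) -> float:
    """Fraction of the relevant documents that were retrieved."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved) & relevant_set) / len(relevant_set)


def relative_coverage(semantic_hits: Iterable[str],
                      text_hits: Iterable[str],
                      relevant: Iterable[str]) -> float:
    """Recall of the semantic search as a share of the text search recall."""
    relevant_set = set(relevant)
    baseline = recall(text_hits, relevant_set)
    if baseline == 0:
        return 0.0
    return recall(semantic_hits, relevant_set) / baseline


# Invented example: the text search finds all four relevant documents,
# the semantic search finds one of them.
relevant_docs = {"d1", "d2", "d3", "d4"}
text_results = ["d1", "d2", "d3", "d4", "d9"]
semantic_results = ["d1", "d7"]

coverage = relative_coverage(semantic_results, text_results, relevant_docs)
```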
Evaluation of the knowledge editor
The Wikyoto knowledge editor was used by environmentalists who have no training in linguistics or knowledge engineering. They created an English domain wordnet with mappings to the central ontology. The details are described in deliverable D08.4.