The goal of KYOTO is a system that lets communities define the meaning of their words and terms on a shared wiki platform, so that those meanings are anchored across languages and cultures, and so that a computer can use this knowledge to detect facts in text. Whereas the current Wikipedia uses free text to share knowledge, KYOTO will represent this knowledge in a form a computer can understand. For example, the notion of an environmental footprint will be defined in the same way in all these languages, and in such a way that the computer knows what information is needed to calculate a footprint. With these definitions it becomes possible to find information on footprints in documents, websites and reports, so that users can ask the computer directly for actual information about their environment, e.g. the footprint of their town, their region or their company.
The KYOTO system works in roughly six steps, as shown in the figure:
- People from a domain specify the locations of diverse and distributed sources of knowledge in different languages.
- The text in various languages is collected from the sources by a capture module, for example the text "Sudden increase of CO2 emissions in 2008 in Europe". So-called linguistic processors then analyse the text and generate a representation of its linguistic structure.
- Term-yielding robots (so-called Tybots) automatically extract all the important terms and possible semantic relations, e.g. "CO2 emission", and relate these to existing semantic networks (so-called wordnets) in each language. Tybots can do this for any language.
- The wiki environment (so-called Wikyoto) allows people in the domain to maintain the terms and concepts and to agree on their meaning within the community and across languages. The meanings are formalized in a domain ontology which can be used by computer programs. Similar terms from different languages map to the same ontology concept. For example, the Dutch term "CO2-uitstoot" also relates to CO2Emission in the ontology.
- Knowledge-yielding robots (so-called Kybots) use the terms and knowledge to detect factual data in text in various languages. A fact here is a specific instance of CO2 emission that took place at a particular point in time and in a specific location or region. Given a shared ontology and wordnets in different languages, Kybots can collect these facts again and again, from large volumes of data from different countries and in different languages.
- The factual data is indexed and can be accessed by anybody through semantic search, again in various languages, e.g. facts on CO2 emission in Europe from 2000 to 2009.
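The cross-lingual mapping described in the Wikyoto step above can be sketched in a few lines. This is purely illustrative, not KYOTO's actual code: the lookup table, language codes, and function name are all hypothetical, and the real system derives such mappings from wordnets rather than a hand-written dictionary.

```python
from typing import Optional

# Hypothetical term-to-concept table: language-specific terms from
# different wordnets all map to one shared ontology concept.
ONTOLOGY_MAP = {
    ("en", "co2 emission"): "CO2Emission",
    ("nl", "co2-uitstoot"): "CO2Emission",
    ("es", "emisión de co2"): "CO2Emission",
}

def to_concept(lang: str, term: str) -> Optional[str]:
    """Return the shared ontology concept for a language-specific term."""
    return ONTOLOGY_MAP.get((lang, term.lower()))

# The Dutch term resolves to the same concept as the English one:
to_concept("nl", "CO2-uitstoot")  # → "CO2Emission"
```

Because all languages converge on one concept identifier, later steps (fact detection and search) never need to know which language a source document was written in.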
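A Kybot-style fact detector, as in the example sentence above, can be approximated with a single extraction pattern that anchors a concept to a time and a place. This is a minimal sketch under assumed input, not the actual Kybot pattern language.

```python
import re

# Hypothetical extraction pattern: a CO2-emission mention followed by
# a year and a capitalized place name, as in
# "Sudden increase of CO2 emissions in 2008 in Europe".
FACT_PATTERN = re.compile(
    r"CO2 emissions? in (?P<year>\d{4}) in (?P<place>[A-Z][a-z]+)"
)

def detect_facts(text: str) -> list:
    """Return each match as a fact anchored to the shared ontology concept."""
    return [
        {
            "concept": "CO2Emission",
            "year": int(m.group("year")),
            "place": m.group("place"),
        }
        for m in FACT_PATTERN.finditer(text)
    ]

detect_facts("Sudden increase of CO2 emissions in 2008 in Europe")
# → [{'concept': 'CO2Emission', 'year': 2008, 'place': 'Europe'}]
```

The real system matches against the linguistic structure produced by the processors rather than raw strings, which is what lets one set of Kybot profiles work across languages.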
More details and demos of the system can be found in the Demos section.