Providing Information

A company has many different information sources. First, there is a multitude of IT systems that store information in different formats and of varying quality: among others, file systems, Microsoft Exchange servers, Lotus Notes, web sites (Internet and intranet), databases, newsgroups and e-mail systems. Second, a great deal of knowledge resides only in people's heads (and their paper notes).

Given this diversity, a comprehensive knowledge system must offer a set of methods and tools to tap these sources efficiently. Each source must be accessed by its own mechanism, and the information must then be transferred into the concepts of Ontobroker. For databases, it is only necessary to map the DB concepts into the concepts of our system. For documents, however, we need to know something about the information they contain, which means that we need some kind of metadata.

This chapter demonstrates different ways of capturing information with a strong focus on how the information sources in the BT case study are tapped. In this case study, we want to concentrate on three main points in the area of information provisioning. We try:

1. to collect experience with a wrapper tool that transforms existing HTML documents into XML documents by adding metadata.

2. to demonstrate that Ontobroker can handle XML documents.

3. to demonstrate that the conceptual basis of an ontology allows loose pieces of information to be integrated into a coherent frame.

To achieve these three points, we first wrap the information from HTML documents into XML documents. In a second step, we parse those XML documents into F-Logic (the underlying language of Ontobroker, see 4.3.2). In principle, Ontobroker could be started with these facts and the ontology. Because of the number of facts in the case study, however, it was necessary to insert an additional step: the F-Logic output from the XML parser is mapped by rules into the fact format of the ontology that structures the knowledge base. For this mapping we have used the capabilities of Ontobroker's inference engine (see Figure 8).
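The second step, parsing XML into F-Logic facts, can be illustrated with a minimal sketch. It is not the XML parser used in the case study. The function name xml_to_flogic, the generated object identifiers (n1, n2, ...) and the attribute names text and child are assumptions made for this illustration, as is the exact surface syntax of the emitted facts. Tag names are assumed to be lowercase so that they can serve directly as F-Logic class names.

```python
import xml.etree.ElementTree as ET


def xml_to_flogic(xml_text):
    """Translate an XML document into generic F-Logic-style facts.

    Illustrative only: identifiers and attribute names are hypothetical,
    not the format produced by the case-study parser.
    """
    root = ET.fromstring(xml_text)
    facts = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        oid = f"n{counter}"  # hypothetical object identifier
        # The element name becomes the class of the object.
        facts.append(f"{oid}:{elem.tag}.")
        text = (elem.text or "").strip()
        if text:
            escaped = text.replace('"', '\\"')
            facts.append(f'{oid}[text->"{escaped}"].')
        # Nested elements are recorded as a set-valued 'child' attribute.
        for child in elem:
            child_id = visit(child)
            facts.append(f"{oid}[child->>{child_id}].")
        return oid

    visit(root)
    return facts


if __name__ == "__main__":
    sample = "<person><name>Alice</name></person>"
    for fact in xml_to_flogic(sample):
        print(fact)
```

Facts of this generic kind mirror the document structure rather than the ontology, which is why a subsequent rule-based mapping into the ontology's fact format is useful.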

Another way to get the information into Ontobroker is to wrap the information from the HTML documents directly into F-Logic by writing a special wrapper (e.g. a Perl program). However, in the near future more and more documents will be available in XML, either because they are written directly in XML or because there will be gateways from most proprietary systems. We therefore want to show how XML sources can be tapped so that their information can be used in Ontobroker.
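For comparison, a direct wrapper of the kind mentioned above might look like the following sketch, written here in Python rather than Perl. It extracts only the first-level headings of an HTML page and emits them as facts. The class name HeadingWrapper, the function wrap_html, the class document and the attribute title are hypothetical and chosen for illustration; a real wrapper would be tailored to the layout of the pages in question.

```python
from html.parser import HTMLParser


class HeadingWrapper(HTMLParser):
    """Emit F-Logic-style facts for the <h1> headings of one HTML page.

    Illustrative only: the 'document' class and 'title' attribute are
    assumed names, not part of the case-study ontology.
    """

    def __init__(self, doc_id):
        super().__init__()
        self.doc_id = doc_id
        self.in_heading = False
        self.facts = [f"{doc_id}:document."]

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_heading = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_heading and text:
            escaped = text.replace('"', '\\"')
            self.facts.append(f'{self.doc_id}[title->"{escaped}"].')


def wrap_html(doc_id, html_text):
    """Run the wrapper over one page and return the collected facts."""
    wrapper = HeadingWrapper(doc_id)
    wrapper.feed(html_text)
    wrapper.close()
    return wrapper.facts


if __name__ == "__main__":
    page = "<html><body><h1>Annual Report</h1><p>text</p></body></html>"
    for fact in wrap_html("doc1", page):
        print(fact)
```

Such a wrapper is tied to one page layout and one target format, which is one reason to prefer the XML route where XML sources are available.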

In the following, the information sources that can be tapped by a wrapper are discussed first; I distinguish between semi-structured and structured sources. Afterwards, I show metadata information sources that explicitly describe the contents of documents on a semantic basis, and how their information is read into Ontobroker.

