Information Systems and Machine Learning Lab, University of Hildesheim, Germany

Courses in winter term 2005/6 / "Hauptstudiums-Praktikum"/Project on XML and Semantic Web Technologies:

abstract

Time:	irregular; Wednesdays 14-18
Location:	SR 01-016, Geb. 101
Begin:	26.10.2005
Kick-off:	Wed. 26.10.2005, 14:15, SR 01-016, Geb. 101
Credits:	4 SWS, 6 CPs

The practical course lets students gain hands-on knowledge and skills in using XML and semantic web technologies (XML, XML Schema, XSLT, XQuery, RDF, RDFS, OWL, query languages and inferencing) in different application scenarios.

Each topic is intended for a small group of 3-4 students.
Software should be written in Java or C++.
Final talks can be given in English or German.
Each topic consists of a generic tool and its proof-of-concept application in an example domain.
There is considerable potential for cooperation between topics, which is worth exploiting!

Organisation:

Groups can start immediately.
Each group is supposed to give at least three presentations:
- a first presentation introducing the concept and plan (as early as possible, mid-November at the latest),
- a second presentation about ongoing work, showing a first implementation and commenting on problems (before Christmas),
- a final presentation of the whole work (end of term).

Registration:

Kickoff meeting Wed. 26.10.2005, 14:15, SR 01-016, Geb. 101
You can register for topics now via email (lst@informatik.uni-freiburg.de).
Topics will be assigned in the order in which registration emails arrive.
If you state several topics in decreasing order of preference, you will be assigned the first one that is available.
Registration of pre-formed groups is preferred.

Topics: (methodological focus, technical focus)

available

1. Ontology-based Information Extraction

example domains: price comparison in e-commerce, online communities, bibliographic metadata.

Design and implement a generic software application that crawls data from the web, identifies and extracts instances w.r.t. a given ontology, and represents the information in OWL. The application should allow the heuristics for entity recognition to be customized via a plugin mechanism and, where possible, make use of existing techniques for entity recognition.
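As an illustration of the plugin mechanism mentioned above, here is a minimal Java sketch. The names `EntityRecognizer`, `PriceRecognizer` and `ExtractionPipeline`, the ontology class name "Price" and the price pattern are all assumptions made for this example, not part of the assignment. Crawling and OWL output are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical plugin contract: one recognizer per ontology class. */
interface EntityRecognizer {
    /** Name of the ontology class this plugin recognizes (assumed naming). */
    String ontologyClass();

    /** Returns the substrings of the text taken to be instances of that class. */
    List<String> recognize(String text);
}

/** Example plugin: a regex heuristic for prices written like "EUR 19.99". */
class PriceRecognizer implements EntityRecognizer {
    private static final Pattern PRICE =
            Pattern.compile("EUR\\s?\\d+(?:[.,]\\d{2})?");

    public String ontologyClass() {
        return "Price";
    }

    public List<String> recognize(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = PRICE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }
}

/** Runs every registered plugin over a piece of crawled text. */
class ExtractionPipeline {
    private final List<EntityRecognizer> plugins = new ArrayList<>();

    void register(EntityRecognizer plugin) {
        plugins.add(plugin);
    }

    /** Returns "class: instance" strings; a real tool would create OWL individuals. */
    List<String> extract(String text) {
        List<String> out = new ArrayList<>();
        for (EntityRecognizer plugin : plugins) {
            for (String hit : plugin.recognize(text)) {
                out.add(plugin.ontologyClass() + ": " + hit);
            }
        }
        return out;
    }
}
```

In this sketch plugins are registered by hand; `java.util.ServiceLoader` would be one standard way to discover them at run time instead.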

The populated ontology should be browsable, either by means of some generic XML-based portal technology (Cocoon; Jetspeed; KAON portal; see also topic #) or by a small web app specific to the showcase.

available

2. Duplicate Detection

example domains: bibliographic metadata; full-text retrieval.

sources: DBLP

Design and implement a generic software tool that detects duplicate instances in large XML or OWL repositories. The heuristics for spotting duplicates should be customizable, e.g., by plugging in existing libraries for edit distances and various semantic distances.
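The pluggable heuristics described above could be sketched as follows. The `Distance` interface, the class names and the normalization by the longer string's length are assumptions made for this example. Levenshtein distance stands in for any edit distance a group might choose.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plugin contract for a distance between two field values. */
interface Distance {
    double between(String a, String b);
}

/** Levenshtein edit distance, normalized to [0, 1] by the longer string's length. */
class NormalizedLevenshtein implements Distance {
    /** Classic dynamic program, keeping only two rows of the table. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public double between(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 0.0 : (double) editDistance(a, b) / longest;
    }
}

class DuplicateFinder {
    /** Returns index pairs {i, j} with i < j whose distance is at most the threshold. */
    static List<int[]> candidates(List<String> values, Distance d, double threshold) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            for (int j = i + 1; j < values.size(); j++) {
                if (d.between(values.get(i), values.get(j)) <= threshold) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }
}
```

The all-pairs loop is quadratic in the number of values, so a repository of DBLP's size would probably need some way of pre-grouping candidates before comparing them. That part is not shown here.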

The tool should provide a user interface that shows users a list of potential duplicates, allows them to accept or reject the proposed duplicates (and to manage their choices), and provides a view of the merged results. -- As an application, use the huge DBLP bibliographic metadata collection and spot duplicate publications and authors.

available

3. Semantic Browsing and Annotating

example domains: semantic email; bibliographic metadata; research group homepages; software repositories.

Design and implement a web application for browsing any given ontology, i.e., navigating along the taxonomy, viewing lists of instances, searching for instances and classes, etc. Additional metadata (itself specified by an ontology?) and stylesheets should allow customization of the presentation. -- As a proof of concept, implement a browsing and query interface to a large database of bibliographic metadata in RDF / OWL.

In a second step, users should be able to annotate parts of the textual properties of entities, i.e., mark occurrences of persons or topics in emails or abstracts (annotating). These annotations should be included in the browsing process on the fly, i.e., once a phrase in an email has been marked as person A, this email should be accessible through the taxonomy in the branch --> persons --> A.
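One way to picture the on-the-fly behaviour is an index that is updated whenever an annotation is made and consulted whenever a taxonomy branch is browsed. The sketch below makes several assumptions: the class `AnnotationIndex`, the "category/instance" key format and the document ids are invented for illustration. A real implementation would store annotations in the ontology itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory index from taxonomy branches to annotated documents. */
class AnnotationIndex {
    // Key: taxonomy branch such as "persons/A"; value: ids of documents annotated with it.
    private final Map<String, Set<String>> byBranch = new HashMap<>();

    /** Records that some phrase in the document was marked as the given instance. */
    void annotate(String documentId, String category, String instance) {
        byBranch.computeIfAbsent(category + "/" + instance, k -> new LinkedHashSet<>())
                .add(documentId);
    }

    /** Documents reachable under the branch category --> instance, in annotation order. */
    Set<String> browse(String category, String instance) {
        return byBranch.getOrDefault(category + "/" + instance,
                                     Collections.<String>emptySet());
    }
}
```

Because the index is written at annotation time, a document shows up under its new branch on the next browse request without any batch reprocessing.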

available

4. Semantic Recommender Systems

example domains: movies; bibliographic metadata.

Design and implement a platform for recommender systems that models products, user data, and explicit and implicit rating information with an ontology and provides access to that data, i.e., allows users to rate products, view rated products, get recommendations, etc., and logs users' actions. Use a semantic information portal (cf. topic 4, possibly the KAON portal) or an XML-based portal software (Cocoon, Jetspeed, etc.) as a basis. -- Implement a semantic movie recommender as a proof of concept by crawling some product data from IMDB.
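The operations listed above (rate products, view rated products, get recommendations) can be sketched with a plain in-memory store. Everything here is an assumption made for illustration: the class `RatingStore`, the integer rating scale, and above all the recommendation rule, which is a naive baseline (unrated products ordered by their mean rating) and not a method prescribed by the topic. The ontology-based modelling is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store for explicit ratings: user -> (product -> stars). */
class RatingStore {
    private final Map<String, Map<String, Integer>> ratings = new HashMap<>();

    /** Stores or overwrites a user's rating for a product. */
    void rate(String user, String product, int stars) {
        ratings.computeIfAbsent(user, k -> new HashMap<>()).put(product, stars);
    }

    /** Products the user has rated, with the ratings given. */
    Map<String, Integer> ratedBy(String user) {
        return ratings.getOrDefault(user, new HashMap<>());
    }

    /** Naive baseline: products the user has not rated, best mean rating first. */
    List<String> recommend(String user, int limit) {
        // For each product, element 0 is the sum of ratings and element 1 their count.
        Map<String, int[]> sumAndCount = new HashMap<>();
        for (Map<String, Integer> perUser : ratings.values()) {
            for (Map.Entry<String, Integer> e : perUser.entrySet()) {
                int[] sc = sumAndCount.computeIfAbsent(e.getKey(), k -> new int[2]);
                sc[0] += e.getValue();
                sc[1]++;
            }
        }
        Map<String, Integer> seen = ratedBy(user);
        List<String> result = new ArrayList<>();
        for (String product : sumAndCount.keySet()) {
            if (!seen.containsKey(product)) {
                result.add(product);
            }
        }
        result.sort((a, b) -> Double.compare(mean(sumAndCount.get(b)),
                                             mean(sumAndCount.get(a))));
        return new ArrayList<>(result.subList(0, Math.min(limit, result.size())));
    }

    private static double mean(int[] sumAndCount) {
        return (double) sumAndCount[0] / sumAndCount[1];
    }
}
```

Products with equal mean ratings come out in no particular order in this sketch.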

possible extensions:

Allow users to evolve their ontology, e.g., specify their own topic hierarchy, add relations.
Model users' reputation.

available

*6. An inference engine for the semantic web

Research the state of the art of inferencing in description logics. Write a reasoner that can read RDF triples (in N3 syntax) and is capable of reasoning based on OWL DL and a suitable query language, starting with a quasi-naive tableaux calculus and then adding optimization options. -- Compare your prototype with other reasoners in the field (e.g., KAON2, FaCT and Racer) based on published benchmark suites.
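To give a feeling for what a naive tableau procedure looks like, here is a sketch for a much smaller setting than the one the topic asks for. It only decides satisfiability of a single concept in ALC, a small fragment of the logic behind OWL DL, with no terminology, no individuals, no N3 parsing and no optimizations. Concepts must already be in negation normal form. All class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** ALC concepts in negation normal form (negation only in front of atoms). */
abstract class Concept {
    static Concept atom(String n) { return new Atom(n, false); }
    static Concept not(String n) { return new Atom(n, true); }
    static Concept and(Concept l, Concept r) { return new Binary(true, l, r); }
    static Concept or(Concept l, Concept r) { return new Binary(false, l, r); }
    static Concept some(String role, Concept c) { return new Restriction(true, role, c); }
    static Concept all(String role, Concept c) { return new Restriction(false, role, c); }
}

class Atom extends Concept {
    final String name; final boolean negated;
    Atom(String name, boolean negated) { this.name = name; this.negated = negated; }
}

class Binary extends Concept {
    final boolean conjunction; final Concept left, right;
    Binary(boolean conjunction, Concept left, Concept right) {
        this.conjunction = conjunction; this.left = left; this.right = right;
    }
}

class Restriction extends Concept {
    final boolean existential; final String role; final Concept filler;
    Restriction(boolean existential, String role, Concept filler) {
        this.existential = existential; this.role = role; this.filler = filler;
    }
}

class Tableau {
    static boolean satisfiable(Concept c) {
        List<Concept> label = new ArrayList<>();
        label.add(c);
        return expand(label, new HashSet<>(), new HashSet<>(), new ArrayList<>());
    }

    // todo: concepts of the current node still to be expanded; pos/neg: atoms asserted
    // so far; restrictions: the node's existential and universal restrictions.
    private static boolean expand(List<Concept> todo, Set<String> pos, Set<String> neg,
                                  List<Restriction> restrictions) {
        if (todo.isEmpty()) return successorsSatisfiable(restrictions);
        Concept c = todo.get(0);
        List<Concept> rest = new ArrayList<>(todo.subList(1, todo.size()));
        if (c instanceof Atom) {
            Atom a = (Atom) c;
            if ((a.negated ? pos : neg).contains(a.name)) return false; // clash: A and not A
            Set<String> pos2 = new HashSet<>(pos);
            Set<String> neg2 = new HashSet<>(neg);
            (a.negated ? neg2 : pos2).add(a.name);
            return expand(rest, pos2, neg2, restrictions);
        }
        if (c instanceof Restriction) {
            List<Restriction> more = new ArrayList<>(restrictions);
            more.add((Restriction) c);
            return expand(rest, pos, neg, more);
        }
        Binary b = (Binary) c;
        if (b.conjunction) { // conjunction rule: both conjuncts must hold at this node
            rest.add(b.left);
            rest.add(b.right);
            return expand(rest, pos, neg, restrictions);
        }
        // Disjunction rule: try the left disjunct, fall back to the right one.
        List<Concept> leftBranch = new ArrayList<>(rest);
        leftBranch.add(b.left);
        if (expand(leftBranch, pos, neg, restrictions)) return true;
        rest.add(b.right);
        return expand(rest, pos, neg, restrictions);
    }

    // Each existential restriction gets a fresh successor node whose label is its filler
    // plus the fillers of all universal restrictions on the same role.
    private static boolean successorsSatisfiable(List<Restriction> restrictions) {
        for (Restriction ex : restrictions) {
            if (!ex.existential) continue;
            List<Concept> label = new ArrayList<>();
            label.add(ex.filler);
            for (Restriction un : restrictions) {
                if (!un.existential && un.role.equals(ex.role)) label.add(un.filler);
            }
            if (!expand(label, new HashSet<>(), new HashSet<>(), new ArrayList<>())) return false;
        }
        return true;
    }
}
```

For example, `Tableau.satisfiable(Concept.and(Concept.some("r", Concept.atom("A")), Concept.all("r", Concept.not("A"))))` is false, because the required r-successor would have to be both A and not A. Handling a terminology needs blocking to guarantee termination, which this sketch does not attempt.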