ISMLL
 
Courses in summer term 2006 / Seminar on Text Mining and Ontology Learning:
abstract

Time: Tue. 14-16
Location: SR 01-018, Building 101
Preliminary meeting: Mon., April 24, 16-18 (Room 026, Building 101)
Begin: 25.04.2006

Enabling machines to understand text written in a human language such as English or German is one of the major goals of artificial intelligence and machine learning. While it may not be entirely clear what "understanding" means for a machine, and anything worthy of being called true machine understanding is still decades away, the ability to deal with textual data in a very narrow context is already sufficient for many interesting applications. For example:

(i) All of us rely on our spam filter to automatically sort incoming emails into legitimate and spam email; such a filter is a binary classifier learned from example texts/emails.

(ii) Retailers and comparison shopping platforms that integrate offers from hundreds of different providers have to identify offers of the same product based on textual descriptions that typically vary slightly from provider to provider, so that their customers can view all offers for the same product in a single place.

(iii) Information portals for bibliographic data, such as CiteSeer, or for job offers often crawl their information automatically from the web and therefore have to extract relevant pieces of information such as addresses, job titles, etc. from texts.
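To make example (i) concrete, here is a minimal sketch of a binary text classifier learned from example emails. It uses multinomial Naive Bayes with Laplace smoothing, which is only one of many possible methods and is chosen here for brevity; the function names and the four toy training emails are hypothetical and purely illustrative.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Collect word and document counts per label from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set().union(*word_counts.values())
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior under Naive Bayes."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        # Log prior: fraction of training emails carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training set (hypothetical data, far too small for real use).
TRAINING = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("seminar schedule for tuesday", "ham"),
    ("please send the seminar slides", "ham"),
]
MODEL = train(TRAINING)

if __name__ == "__main__":
    print(classify("buy cheap pills", MODEL))
    print(classify("seminar slides", MODEL))
```

A real spam filter would need far more training data and better features; the sketch only shows the principle of learning a classifier from labelled example texts.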

Methods that address these tasks have recently been developed in different research communities, such as Statistical Natural Language Processing (NLP) and Computational Linguistics, Text Mining, Information Retrieval, Information Extraction, etc.

More recently, the output of such methods has also been represented formally in logic, e.g., entities as instances, relations between entities as predicates, etc. In particular, fragments of first-order logic known as description logics, sometimes also called ontologies, have been used for this task. In this context, the task is often called ontology learning.
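As a rough illustration of this representation, the following sketch stores what might be extracted from a sentence such as "Jane Doe works for Acme Corp." as facts: entities become instances of concepts, and the relation between them becomes a binary predicate. All names (`Fact`, `query`, `jane_doe`, `acme_corp`, `works_for`) are hypothetical, and a plain set of tuples is only a stand-in for a real description logic formalism.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """A ground fact predicate(subject, obj)."""
    predicate: str
    subject: str
    obj: str


# Hypothetical facts extracted from "Jane Doe works for Acme Corp."
FACTS = {
    Fact("instance_of", "jane_doe", "Person"),     # entity as instance
    Fact("instance_of", "acme_corp", "Company"),   # entity as instance
    Fact("works_for", "jane_doe", "acme_corp"),    # relation as predicate
}


def query(facts, predicate: Optional[str] = None,
          subject: Optional[str] = None, obj: Optional[str] = None):
    """Return, in sorted order, all facts matching every argument that is not None."""
    return sorted(
        fact for fact in facts
        if (predicate is None or fact.predicate == predicate)
        and (subject is None or fact.subject == subject)
        and (obj is None or fact.obj == obj)
    )


if __name__ == "__main__":
    for fact in query(FACTS, subject="jane_doe"):
        print(fact)
```

Ontology learning, in this picture, amounts to filling such a fact base (and the concept hierarchy behind it) automatically from text.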

This seminar aims to present a broad overview of methods for dealing with texts that address some of these problems.

Talks can be given in English or German.

Interested students can register for a topic from now on via email to . Topics will also be assigned at the joint seminar introduction session of the Department of Computer Science on Monday, April 24, 16-18 (Room 026, Building 101).

Supervisors are: Prof. Dr. Lars Schmidt-Thieme, Christine Preisach, Karen Tso and Leandro Balby Marinho

Topics and preliminary schedule:

Tue. 25.04   (0)   -- Introduction --

I. Text Classification
Tue. 16.05   (1)   A Survey of Text Classification Methods, especially Support Vector Machines
Tue. 23.05   (2)   Text Classification considering Background Knowledge
Tue. 30.05         No Seminar
Tue. 06.06         Pentecost Holidays
Tue. 13.06   (3)   Automatic Classification based on Semantic Hierarchies

II. Some Basic Problems
Tue. 20.06   (4)   Named Entity Recognition
Tue. 27.06   (5)   Word Sense Disambiguation
Tue. 04.07   (6)   Coreference Resolution

III. Learning Concept Taxonomies
Tue. 11.07   (7)   Ontology Semantic Similarity
--           (8)   Evaluation of Information Extraction Tasks and Ontologies
Tue. 18.07   (9)   Learning Concept Taxonomies

IV. Learning General Relations
Tue. 25.07   (10)  Learning Relations using Association Rules
--           (11)  Learning Relations using Kernel Methods
--           (12)  Adaptive Information Extraction (LP2 algorithm)

V. Applications
--           (13)  Text Summarization