abstract
Time: | Tue. 14-16 |
Location: | SR 01-018, Building 101 |
Preliminary meeting: | Mon., April 24, 16-18 (Room 026, Building 101) |
Begin: | 25.04.2005 |
Getting machines to understand text written in a human language such as English or German is one of the major goals of artificial intelligence and machine learning. While it may not be entirely clear what "understanding" means for machines, and anything worthy of being called true machine understanding is decades away, for many interesting applications the ability to deal with textual data in a very narrow context is already sufficient. For example:
(i) All of us rely on our spam filter to automatically sort incoming emails into legitimate and spam email; such a filter is a binary classifier learned from example texts/emails.
(ii) Retailers and comparison shopping platforms that integrate offers from hundreds of different providers have to identify offers of the same product based on textual descriptions that typically vary slightly from provider to provider, so that their customers can view all offers for the same product in a single place.
(iii) Information portals for bibliographic data, such as CiteSeer, or for job offerings often crawl their information automatically from the web and therefore have to extract relevant pieces of information such as addresses, job titles, etc. from texts.
Methods that address these tasks have recently been developed in different research communities, such as Statistical Natural Language Processing (NLP) and Computational Linguistics, Text Mining, Information Retrieval, Information Extraction, etc.
More recently, the output of such methods has also been represented formally in logic, e.g., entities as instances, relations between entities as predicates, etc. In particular, fragments of first-order logic known as description logics, sometimes also called ontologies, have been used for this task. In this context, the task is often called ontology learning.
This seminar aims to present a broad overview of methods for dealing with texts that address some of these problems.
Talks can be given in English or German.
Interested students can register for a topic from now via email to
.
Topics will also be assigned at the common seminar introduction session of the Department of Computer Science on Monday, April 24, 16-18 (Room 026, Building 101).
Supervisors: Prof. Dr. Lars Schmidt-Thieme, Christine Preisach, Karen Tso, and Leandro Balby Marinho.
Topics and preliminary schedule:
Tue. 25.04 | (0) | -- Introduction -- | |
I. Text Classification | |||
Tue. 16.05 | (1) | A Survey of Text Classification Methods, especially Support Vector Machines | |
Tue. 23.05 | (2) | Text Classification considering Background Knowledge | |
Tue. 30.05 | No Seminar | ||
Tue. 06.06 | Pentecost Holidays | ||
Tue. 13.06 | (3) | Automatic Classification based on semantic hierarchies | |
II. Some Basic Problems | |||
Tue. 20.06 | (4) | Named Entity Recognition | |
Tue. 27.06 | (5) | Word Sense Disambiguation | |
Tue. 04.07 | (6) | Coreference Resolution | |
III. Learning Concept Taxonomies | |||
Tue. 11.07 | (7) | Ontology Semantic Similarity | |
-- | (8) | Evaluation of information extraction tasks and ontologies | |
Tue. 18.07 | (9) | Learning Concept Taxonomies | |
IV. Learning General Relations | |||
Tue. 25.07 | (10) | Learning Relations using Association Rules | |
-- | (11) | Learning Relations using Kernel Methods | |
-- | (12) | Adaptive Information Extraction (LP2 algorithm) | |
V. Applications | |||
-- | (13) | Text Summarization |