Courses in Summer Term 2008 / Semi-Supervised Learning

Time: Wednesday 14-16
Location: B26
Preliminary meeting: 09.04
Start: 09.04

Semi-supervised learning is a class of machine learning techniques that make use of both labeled and unlabeled data for training - typically a small amount of labeled data with a large amount of unlabeled data. Semi-supervised learning falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data). Many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce considerable improvement in learning accuracy. The acquisition of labeled data for a learning problem often requires a skilled human agent to manually classify training examples. The cost associated with the labeling process thus may render a fully labeled training set infeasible, whereas acquisition of unlabeled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Semi-supervised learning methods have been applied in different domains, for instance in text and web mining, speech recognition and bioinformatics.

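To illustrate the basic idea, here is a minimal self-training sketch in Python (with hypothetical toy data; this is only an illustrative example, not an algorithm taken verbatim from the seminar material): a simple nearest-neighbour classifier fit on the few labeled points repeatedly pseudo-labels the unlabeled point it is most confident about and adds it to the labeled set.

    # A minimal self-training sketch (toy data, hypothetical names).
    # A 1-nearest-neighbour "classifier" is fit on the labeled points and
    # repeatedly pseudo-labels the unlabeled point it is closest to,
    # moving that point into the labeled set.
    import numpy as np

    def self_train(X_lab, y_lab, X_unlab, max_iter=100):
        X, y, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
        for _ in range(max_iter):
            if len(pool) == 0:
                break
            # distances between every unlabeled point and every labeled point
            d = np.linalg.norm(pool[:, None, :] - X[None, :, :], axis=2)
            nearest = d.argmin(axis=1)     # closest labeled point per unlabeled point
            best = d.min(axis=1).argmin()  # most confident unlabeled point (smallest distance)
            X = np.vstack([X, pool[best]])        # adopt its pseudo-label ...
            y = np.append(y, y[nearest[best]])    # ... and grow the labeled set
            pool = np.delete(pool, best, axis=0)
        return X, y

    # two labeled seed points, four unlabeled points in between
    X_lab = np.array([[0.0, 0.0], [5.0, 5.0]])
    y_lab = np.array([0, 1])
    X_unlab = np.array([[0.5, 0.4], [1.0, 1.1], [4.2, 4.0], [3.8, 4.1]])
    X_all, y_all = self_train(X_lab, y_lab, X_unlab)
    print(y_all)   # the two given labels followed by four pseudo-labels
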
The seminar aims to present a broad overview of different semi-supervised learning methods. It is based on the book Chapelle, Schölkopf, and Zien (eds.): "Semi-Supervised Learning" and on selected papers. Knowledge of statistics or machine learning is helpful, but not formally required.

Talks can be given in English or German.

Supervisor: Christine Preisach

Topics:

    Introduction

  1. Introduction to Supervised Learning
  2. Introduction to Unsupervised Learning
  3. Generative Models

  4. Co-Training Algorithm
  5. Kernel Methods and Fisher Kernel
  6. Semi-Supervised Text Classification Using EM
  7. Risks of Semi-Supervised Learning
  8. Probabilistic Semi-Supervised Clustering with Constraints
  9. Low-Density Separation

  10. Transductive Support Vector Machines
  11. Gaussian Processes and the Null-Category Noise Model
  12. Graph-Based Methods

  13. Label Propagation in Graphs
  14. Semi-Supervised Learning with Conditional Harmonic Mixing
  15. Change of Representation

  16. Spectral Methods for Dimensionality Reduction
  17. Semi-Supervised Learning in Practice

  18. Semi-Supervised Protein Classification Using Cluster Kernels
  19. Prediction of Protein Function from Networks

Interested students can register for a topic now via email to .