Information Systems and Machine Learning Lab, University of Hildesheim, Germany

Courses in summer term 2005 / Seminar on Predictive Modelling:

Time:	Tue. 14-16
Location:	SR 00-007, Building 106
Preliminary meeting:	12.04.2005
Start:	12.04.2005

Predictive modelling (also known as supervised learning, or classification / regression) is the key approach for automating tasks by learning from examples. By means of a predictive model such as a decision tree, a neural network, or a support vector machine, a property can be inferred from other properties, or a decision can be made based on the available information. Applications abound, e.g., automatically detecting spam emails, predicting consumer choices, or translating speech signals to text.
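
To make "learning from examples" concrete, the following is a minimal sketch, not part of the seminar material. It uses a 1-nearest-neighbour rule (chosen only because it fits in a few lines) on made-up spam data; the feature choice, the data values, and the function name `predict_1nn` are all assumptions for illustration.

```python
# Illustrative sketch only: a 1-nearest-neighbour classifier on made-up data.
from math import dist  # Euclidean distance, available in Python 3.8+


def predict_1nn(train, query):
    """Return the label of the training example closest to `query`."""
    _, nearest_label = min(train, key=lambda example: dist(example[0], query))
    return nearest_label


# Hypothetical training examples: (features, label). The two features are
# assumed to be (number of links, fraction of capitalised words) of an email.
train = [
    ((0, 0.05), "ham"),
    ((1, 0.10), "ham"),
    ((7, 0.60), "spam"),
    ((9, 0.45), "spam"),
]

# Infer the unknown property (the label) of a new email from its features.
prediction = predict_1nn(train, (8, 0.50))
print(prediction)
```

The model here is simply the stored examples; a decision tree, neural network, or support vector machine would replace `predict_1nn` while the overall setting (features in, predicted property out) stays the same.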

The basic methods and theory are developed for the scenario of real-valued and/or categorical predictor and target variables that can be represented by a data matrix, and machine learning and data mining textbooks accordingly focus on introducing different model families such as decision trees, neural networks, and support vector machines. In practice, however, this clean situation hardly ever occurs. In a real problem, some observations might be missing (e.g., test persons not answering), some target class might be extremely rare, rendering straightforward methods useless (e.g., fraudulent transactions), predictors or targets might carry some structure (e.g., a topic in a topic hierarchy, a web or citation graph), or valuable information might be lost by applying classifiers in a straightforward manner, e.g., information from objects without an observed target, or relations between objects.
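
As a small worked illustration of the rare-class problem mentioned above, consider a sketch with invented numbers: 10 fraudulent transactions among 1000. A model that always predicts the majority class looks excellent by accuracy yet detects no fraud at all. The data and the helper names `accuracy` and `recall` are assumptions for illustration only.

```python
# Illustrative sketch only: why plain accuracy misleads for a rare target class.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of truly positive cases that were predicted as positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    hits = sum(p == positive for p in predictions_for_positives)
    return hits / len(predictions_for_positives)


# Made-up data: 10 fraudulent transactions among 1000.
y_true = ["fraud"] * 10 + ["ok"] * 990

# A trivial "model" that always predicts the majority class.
y_pred = ["ok"] * 1000

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred, "fraud")
print(f"accuracy = {acc:.2f}, recall on fraud = {rec:.2f}")
```

Here the accuracy is 990/1000 while the recall on the fraud class is 0/10, which is why evaluation measures and learning methods need to account for the class distribution.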

This seminar aims to present a broad overview of methods for predictive modelling that address some of these problems. As each of these problems could easily fill a complete seminar by itself, we will study selected approaches by way of example.

Talks can be given in English or German.

Topics and preliminary schedule:

	Tue. 12.4.	(0)	-- Introduction --
I. Some Fundamentals
	---	(1)	Evaluation of Classifier Performance
	Tue. 10.5.	(2)	Support Vector Machines (SVMs)
	Tue. 24.5.	(3)	Missing Values and the Expectation Maximization (EM) algorithm
II. Some Basic Problems
	---	(4)	Imbalanced class distributions
	Tue. 31.5.	(5)	Multi-class predictions (aka multi-label, multi-category)
	Tue. 7.6.	[cancelled]
	Fri. 10.6.	(6)	Hierarchical targets
III. Classification of structured objects (structured input) and predicting structured targets (structured output)
	Tue. 14.6.	(7)	Classification of sequences with Support Vector Machines
	Fri. 24.6.	(9)	Predicting sequences
	Tue. 21.6.	[cancelled]
	Fri. 24.6.	(10)	Predicting structured targets in general (e.g., rankings)
IV. Using unlabeled data (transductive inference, transduction, semi-supervised classification)
	---	(11)	Transductive ridge regression
	Tue. 28.6.	(12)	Transductive Support Vector Machines
	---	(13)	Transductive k-nearest neighbor classifiers
V. Using relations between instances (relational learning, collective inference)
	Tue. 5.7.	(8)	Classification of graphs with Support Vector Machines
	?Tue. 5.7.	(14)	Co-training
	Tue. 12.7.	(15)	Collective Inference
