Seminar on Spam (winter term 2003/2004)

Time: Wednesday 14-16
Location: SR 01-018, Building 101
Start: 15.10.2003
Spam, or unsolicited bulk email, is both a nuisance for users who are flooded with advertising messages and an interesting, evolving problem: on the technical side for the design of messaging services, and on the methodological side for text classification. Over the last two years the volume of spam has reached a level that has forced most regular email users to adopt some form of automatic spam filtering. Since the first Spam Conference at Stanford in spring 2003, interest in the topic within the scientific community has grown even further.

The seminar aims to build a picture of the current state of spam prevention. In the first part we will look at technical issues such as the email transport protocols and systems in use, as well as the architectures of recent anti-spam software. In the second part we will focus on text classification methods for spam prediction, covering a wide range of methods from simple nearest-neighbor and naive Bayes classifiers to more advanced techniques such as boosting and genetic algorithms.

Talks can be given in German or English.

Topics (t = with technical focus, m = with focus on methods):

  1. (t) Spam at transport level
  2. (t) Reporting and tracing spam
  3. (t) Architectures of spam filters
  4. (t) CRM114 - a programming language for filters
  5. (t) Filtering spam in the Haystack framework
  6. (m) Rule-based and memory-based filtering
  7. (m) Bayesian filtering
  8. (m) Classifying spam using support vector machines
  9. (m) Classifying spam using association rules
  10. (m) Improving spam classifiers by genetic algorithms
  11. (m) Improving spam classifiers by stacking and boosting
  12. (m) Text classification using background knowledge