Unbiased Assessment of Learning Algorithms
Journal
Proceedings of the International Joint Conference on Artificial Intelligence
Date Issued
1997
Author(s)
Scheffer, Tobias
Herbrich, Ralf
Abstract
In order to rank the performance of machine learning algorithms, many researchers conduct experiments on benchmark datasets. Since most learning algorithms have domain-specific parameters, it is common practice to adapt these parameters so as to obtain a minimal error rate on the test set. The same error rate is then used to rank the algorithm, which introduces an optimistic bias. We quantify this bias, showing in particular that an algorithm with more parameters will probably be ranked higher than an equally good algorithm with fewer parameters. We demonstrate the relevance of this result by showing the number of parameters and trials required to appear to outperform C4.5 or FOIL, respectively, on various benchmark problems. We then describe how unbiased ranking experiments should be conducted.
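To make the contrast concrete, below is a minimal, hypothetical sketch of the two evaluation protocols the abstract distinguishes. It is not taken from the paper: it assumes scikit-learn, uses a decision tree as a stand-in for a parameterized learner such as C4.5, and uses an arbitrary benchmark dataset. The first protocol tunes parameters directly against the test set and reports the best test error found; the second tunes by cross-validation on the training data only and measures the selected model once on an untouched test set.

```python
# Hypothetical sketch contrasting a biased and an unbiased evaluation protocol.
# Assumes scikit-learn; a decision tree stands in for a parameterized learner
# such as C4.5. This does not reproduce the paper's analysis.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

param_grid = {"max_depth": [2, 4, 6, 8, None], "min_samples_leaf": [1, 5, 10]}

# Biased protocol: try every parameter setting against the test set and
# report the best test-set score found -- the same number is used both to
# select the parameters and to rank the algorithm.
biased_score = max(
    DecisionTreeClassifier(random_state=0, max_depth=d, min_samples_leaf=m)
    .fit(X_train, y_train)
    .score(X_test, y_test)
    for d in param_grid["max_depth"]
    for m in param_grid["min_samples_leaf"]
)

# Unbiased protocol: tune the parameters by cross-validation on the training
# data only, then evaluate the selected model once on the held-out test set.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)
unbiased_score = search.score(X_test, y_test)

print(f"biased (tuned on test set):      {biased_score:.3f}")
print(f"unbiased (tuned on train folds): {unbiased_score:.3f}")
```

With enough parameter settings or repeated trials, the biased score will tend to exceed the unbiased one, which is the optimistic effect the paper quantifies.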