myKLR - kernel logistic regression

by Stefan Rüping, rueping@ls8.cs.uni-dortmund.de

About myKLR

myKLR is a tool for large-scale kernel logistic regression based on the algorithm of [Keerthi/etal/2003a] and the code of mySVM.

License

This software is free only for non-commercial use. It must not be modified or distributed without the prior permission of the author. The author is not responsible for any consequences of the use of this software.

Installation

Installation under Unix

  • Download myKLR.
  • Create a new directory, change into it and unpack the files into this directory.
  • On typical UN*X systems simply type make to compile myKLR. On other systems you will have to invoke your C++ compiler manually.

On some systems you might get an error message about sys/times.h. If you do, open the file globals.h and uncomment the line #undef use_time.
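
For reference, the change amounts to making the following line active in globals.h. Only the directive itself is taken from the instructions above; the comments are added here as explanation:

    /* in globals.h: if sys/times.h is not available on your system,  */
    /* uncomment the following line so that the timing code that      */
    /* needs sys/times.h is compiled out.                             */
    #undef use_time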

Using myKLR

myKLR is based on the code of mySVM. Hence, the formats of example files, parameter files and kernel definitions are identical. Please see the documentation of mySVM for further information.

When myKLR is called, it reads its parameters and kernel definition from the given files and computes the KLR function on the first given example set. The function is then applied to the examples in the subsequent example set. You can also use the predict tool of mySVM to apply the KLR function to further data sets.

Note that the KLR function f(x) is not the probability estimate P(Y=1|x). This estimate can be calculated by the transformation P(Y=1|x) = 1 / (1+exp(-f(x))). For compatibility reasons, the model of myKLR differs slightly from that of [Keerthi/etal/2003a]: myKLR uses -b instead of b and stores y*alpha instead of alpha.
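
As a minimal illustration, the transformation can be computed as in the following C++ sketch. The form of the decision function is an assumption based on the stored y*alpha coefficients and the sign convention described above, not code taken from myKLR:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    /* Turn a KLR decision value f(x) into the probability estimate
       P(Y=1|x) = 1 / (1 + exp(-f(x))). */
    double klr_probability(double f)
    {
        return 1.0 / (1.0 + std::exp(-f));
    }

    /* Assumed form of the decision function: coef[i] holds the stored
       y_i*alpha_i values, sv[i] the corresponding training examples, and
       b the offset as stored in the model (note the sign convention
       described above); kernel stands for whichever kernel was defined
       in the kernel file. */
    double klr_decision(const std::vector<double>& coef,
                        const std::vector<std::vector<double> >& sv,
                        const std::vector<double>& x,
                        double b,
                        double (*kernel)(const std::vector<double>&,
                                         const std::vector<double>&))
    {
        double f = b;
        for (std::size_t i = 0; i < coef.size(); ++i)
            f += coef[i] * kernel(sv[i], x);
        return f;
    }

The probability estimate for a new example x is then klr_probability(klr_decision(...)).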

Parameters:

The format of kernel and example definitions is identical to that of mySVM. Parameter definitions also have the same syntax, but partly different semantics. The following parameters are used in myKLR:

C (float)                    the complexity constant
max_iterations (int)         stop after this many iterations
is_zero (float)              numerical precision (epsilon and mu in the paper; default: 1e-10)
convergence_epsilon (float)  precision of the KKT conditions (tol in the paper; default: 1e-3)
kernel_cache (int)           size of the cache for kernel evaluations in MB (default: 256)
scale                        scale the training examples to mean 0 and variance 1 (default; see the sketch after this list)
no_scale                     do not scale the training examples (may be numerically less stable)
format                       set the default example file format
delimiter                    set the default delimiter for example files
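
To illustrate what the scale option does, here is a minimal C++ sketch that standardizes each attribute to mean 0 and variance 1. It is an approximation of the described behaviour, not code from myKLR:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    /* Standardize each attribute (column) of the examples to mean 0 and
       variance 1. Attributes with zero variance are left unchanged to
       avoid division by zero. */
    void standardize(std::vector<std::vector<double> >& examples)
    {
        if (examples.empty()) return;
        const std::size_t dim = examples[0].size();
        const double n = static_cast<double>(examples.size());
        for (std::size_t j = 0; j < dim; ++j) {
            double mean = 0.0, var = 0.0;
            for (std::size_t i = 0; i < examples.size(); ++i)
                mean += examples[i][j];
            mean /= n;
            for (std::size_t i = 0; i < examples.size(); ++i) {
                const double d = examples[i][j] - mean;
                var += d * d;
            }
            var /= n;
            if (var <= 0.0) continue;
            const double sd = std::sqrt(var);
            for (std::size_t i = 0; i < examples.size(); ++i)
                examples[i][j] = (examples[i][j] - mean) / sd;
        }
    }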

References

Keerthi/etal/2003a    Keerthi, S. S., Duan, K., Shevade, S. K., and Poo, A. N.: A Fast Dual Algorithm for Kernel Logistic Regression.