Support Vector Machine

SVMlight: Support Vector Machine FAQ


Author: Thorsten Joachims <thorsten@ls8.cs.uni-dortmund.de>
University of Dortmund, Informatik, AI-Unit
Collaborative Research Center on 'Complexity Reduction in Multivariate Data'

Arrghs, SVMlight does not compile properly!

  • I want to compile SVMlight on a PowerMac using Code Warrior:
    • You need to modify the source code a little (as suggested by Jewgeni Starikow). Use #include "console.h" to emulate a UNIX shell. Then add argc=ccommand(&argv) as the first instruction of each main(). Furthermore, remove #include "sys/times.h", since it does not exist on Macs. The timing routines are used to calculate the runtime of the program. If you do not need this feature, remove the body of get_runtime() in svm_common.c. Otherwise, replace the body with the appropriate Mac routines from 'time.h'. A sketch of these changes follows after this list.
  • There are no other known compilation problems at the moment. If you run into one, send me mail.
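
Roughly, the Code Warrior modifications look like this (a sketch only, not tested with Code Warrior; match the return type of get_runtime() to the one in your copy of svm_common.c):

    #include "console.h"          /* Metrowerks console emulation instead of a UNIX shell */
    /* #include "sys/times.h" */  /* removed: this header does not exist on the Mac */

    int main(int argc, char *argv[])
    {
      argc = ccommand(&argv);     /* let Code Warrior ask for the command-line arguments */
      /* ... the rest of main() stays unchanged ... */
      return 0;
    }

    /* in svm_common.c: timing disabled (otherwise use the routines from 'time.h') */
    long get_runtime(void)
    {
      return 0;
    }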

I get the following error message:

  • ERROR: terminating optimizer - choldc failed, matrix not positive definite
    • If the program terminates after this message, get the latest version of PR_LOQO and SVM-Light V2.01 (or later).
    • If the program continues after this error message, don't worry :-)

The program hangs when ...

  • ... reading in the examples.
    • Get version 3.02 or later.

Convergence during learning is very slow!

  • In verbose mode 2 I observe that max violation bounces around and does not really converge.
    • Use a smaller C (option -c); see the example command line after this list.
    • Your data contains a lot of noise, so the convergence simply IS very slow. Sorry, there is not much you can do about it.
  • Nearly all my training examples end up as support vectors.
    • Use a "stiffer" kernel (e.g. a lower value of gamma for the RBF kernel); again, see the example after this list. If you pick a kernel which is very far away from the optimum, you will not get good generalization performance anyway.
    • Your data is really difficult to separate. Think about a better representation of your data. It is, for example, a bad idea if different features have values of very different orders of magnitude. You might want to normalize all features to the range [-1,+1].
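
  For example, a hypothetical call with a smaller C and a lower RBF gamma (the file names and the values 0.1 and 0.5 are placeholders, not recommendations):

      svm_learn -v 2 -c 0.1 -t 2 -g 0.5 train.dat model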


It does not converge!

  • If you are using the built-in HIDEO optimizer, get version 3.01 or later.
  • It should always converge, unless there are numerical problems :-)
  • Numerical problems:
    • Make sure your data is properly scaled. It is, for example, a bad idea if different features have values of different orders of magnitude. You might want to normalize all features to the range [-1,+1], especially for problems with more than 100 features.
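
A minimal sketch of such a scaling in plain C (independent of SVMlight; the minimum and maximum of each feature are assumed to be computed on the training set and re-used unchanged for the test set):

    #include <stdio.h>

    /* Map a feature value x from its training range [min,max] to [-1,+1]. */
    double scale_feature(double x, double min, double max)
    {
      if (max <= min)
        return 0.0;                                /* constant feature */
      return 2.0 * (x - min) / (max - min) - 1.0;
    }

    int main(void)
    {
      /* hypothetical feature with training range [0,500] */
      printf("%g\n", scale_feature(125.0, 0.0, 500.0));   /* prints -0.5 */
      return 0;
    }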