打开APP
userphoto
未登录

开通VIP,畅享免费电子书等14项超值服

开通VIP
Xu Cui ? SVM (support vector machine) with libsvm

I am learning svm lately and tried libsvm. It’s a good package.

Linear kernel example (support vectors are in circles):

Linear

Nonlinear example (radial basis)

Nonlinear, circle

Nonlinear, two circles

Nonlinear, quadrant

3-class example

Linear, 3 classes

Basic procedure to use libsvm:

  1. Preprocess your data. This including normalization (make all values between 0 and 1) and transform non-numeric values to numeric. You can use the following code to normalize (from libsvm webpage):
    (data - repmat(min(data,[],1),size(data,1),1))*spdiags(1./(max(data,[],1)-min(data,[],1))',0,size(data,2),size(data,2))
  2. Find optimal parameter values. For linear kernel, you have 1 parameter C (penalize parameter). For commonly used radial kernel, you have two parameters (C and gamma). Different parameter values will yield different accuracy rate. To avoid over fitting, you use n-fold cross validation. For example, a 5-fold cross validation is to use 4/5 of the data to train the svm model and the rest 1/5 to test. The option -c, -g, and -v controls parameter C, gamma and n-fold cross validation. A piece of code from libsvm website is:
    bestcv = 0;
    for log2c = -1:3,
    for log2g = -4:1,
    cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = svmtrain(heart_scale_label, heart_scale_inst, cmd);
    if (cv >= bestcv),
    bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
    fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
    end
    end
  3. You may have to run the above code several times with different range of parameter values to find the optimal values. For example, you might want to start from a bigger range with coarse resolution; then fine tune to smaller regions with higher resolution.
  4. After finding the optimal parameter values, use all data to train your model with your optimal parameter values.
    cmd = ['-t 2 -c ', num2str(bestc), ' -g ', num2str(bestg)];
    model = svmtrain(l, d, cmd);
  5. If you have new data, you may use this model to classify the new data.
    [predicted_label, accuracy, decision_values] = svmpredict(zeros(size(dd,1),1), dd, model);

Commonly used options

  • -v n: n-fold cross validation
  • -t 0: linear kernel
  • -t 2: radial basis (default)
  • -s 0: SVC type = C-SVC
  • -C: C parameter value, default 1
  • -g: gamma parameter value

libsvm performance

I tested on different data size and record the time spent (in second).

Computer: Processor: 2×2.66G, memory: 12G, OS: Windows XP installed in VMWare in Mac OS 10.5

data size    # features    svmtrain    svmpredict
100    2    0.00    0.00
100    6    0.00    0.00
100    10    0.00    0.00
100    20    0.00    0.00
100    50    0.01    0.00
100    100    0.02    0.01
500    2    0.02    0.01
500    6    0.03    0.02
500    10    0.05    0.03
500    20    0.08    0.03
500    50    0.46    0.07
500    100    0.56    0.12
1000    2    0.07    0.04
1000    6    0.10    0.06
1000    10    0.15    0.10
1000    20    0.36    0.14
1000    50    1.09    0.30
1000    100    3.07    0.50

It’s fairly fast.

Resources:

MatLab code to generate the plots above:cuixu_test_svm1

SVM basics: http://en.wikipedia.org/wiki/Support_vector_machine

Download libsvm for matlab at: http://www.csie.ntu.edu.tw/~cjlin/libsvm/#matlab

The meaning of libsvm output is at: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#f804

本站仅提供存储服务,所有内容均由用户发布,如发现有害或侵权内容,请点击举报
打开APP,阅读全文并永久保存 查看更多类似文章
猜你喜欢
类似文章
【热】打开小程序,算一算2024你的财运
支持向量机算法及其代码实现
libsvm 2.6 的代码注释.
LibSVM for Python 使用
libSVM介绍(二)
libsvm多分类模型文件解释
LibSVM学习(六)——easy.py和grid.py的使用 - 东海的日志 - 网易博...
更多类似文章 >>
生活服务
热点新闻
分享 收藏 导长图 关注 下载文章
绑定账号成功
后续可登录账号畅享VIP特权!
如果VIP功能使用有故障,
可点击这里联系客服!

联系客服