cuSVM for CUDA 5.0 and Matlab x64

This page shows how to build cuSVM, a GPU-accelerated SVM library that works with dense-format data. The library was written by Austin Carpenter. The procedure uses CUDA 5.0, Matlab x64 and Visual Studio 2010. The code and project files were modified so that the library compiles and links; many of the steps were taken from http://www.parallelcoding.com/2012/02/09/cusvm-in-visual-studio-2010-with-cuda-4-0/

Modifications:

  1. Added Matlab variables:
    1. cuSVMTrainIter – contains the number of iterations the solver performed
    2. cuSVMTrainObj – contains the final objective function value after training
  2. In cuSVMSolver.cu, lines 869-874, all calls to cudaMemcpyToSymbol were changed, because the CUDA 5.0 runtime library no longer accepts the symbol name as a string – http://stackoverflow.com/questions/12947914/error-in-cudamemcpytosymbol-using-cuda-5
    before the change:
    mxCUDA_SAFE_CALL(cudaMemcpyToSymbol("taumin", &h_taumin, sizeof(float) ));
    after the change:
    mxCUDA_SAFE_CALL(cudaMemcpyToSymbol(taumin, &h_taumin, sizeof(float) ));
  3. In the functions FindBI, FindBJ and FindStoppingJ, the way the reduction in shared memory is done was changed (http://stackoverflow.com/questions/6510427/cuda-finding-max-using-reduction-error)
  4. The kernel cache size is limited to 400 MB; if you want a bigger cache, modify line 24 of cuSVMSolver.cu:
    #define KERNEL_CACHE_SIZE (400*1024*1024)
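
To get a feel for what the 400 MB limit means, the sketch below estimates how many kernel-matrix rows fit in the cache. It assumes that each cached row stores one float per training sample and ignores any padding the solver may add; the helper name cached_rows is hypothetical and does not appear in cuSVMSolver.cu.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Same value as the #define in cuSVMSolver.cu (400 MB, in bytes).
const std::size_t KERNEL_CACHE_SIZE = 400u * 1024u * 1024u;

// Hypothetical helper: estimate how many kernel rows fit in the cache.
// Assumption: one row = n_samples floats, with no padding.
std::size_t cached_rows(std::size_t cache_bytes, std::size_t n_samples) {
    assert(n_samples > 0);
    std::size_t row_bytes = n_samples * sizeof(float);
    // Caching more rows than the kernel matrix has would be pointless.
    return std::min(n_samples, cache_bytes / row_bytes);
}
```

Under these assumptions, cached_rows(KERNEL_CACHE_SIZE, 100000) gives 1048 rows, so with 100,000 training samples only about 1% of the kernel matrix stays cached, while a dataset of 10,000 samples fits in the cache completely.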

 

Build Procedure

Download the preconfigured cuSVM Visual Studio 2010 solution with LibSVM and a Matlab script for classification

All the steps described below are already done in that solution; you only have to check that all paths are set correctly and that your GPU compute capability is set properly.

My setup:

  • Windows 7 x64
  • Visual Studio 2010
  • CUDA 5.0
  • Matlab R2011b
  • the code was tested on GeForce GT 330M and GeForce GTX 690

Prerequisites:

Determine paths:

  1. Matlab include path, mine is "D:\Program Files\MATLAB\R2011b\extern\include" (Matlab was installed on drive D:\)
  2. Matlab library path: "D:\Program Files\MATLAB\R2011b\extern\lib\win64\microsoft"
  3. CUDA toolkit include path: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include"
  4. GPU compute capability: mine is 1.2 for the GeForce GT 330M (compute_12,sm_12) and 3.0 for the GeForce GTX 690 (compute_30,sm_30)
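
The compute capability maps directly onto the Code Generation value used in the CUDA C/C++ project settings. A minimal sketch of that mapping, assuming you already know the major and minor version (for example from the major and minor fields that cudaGetDeviceProperties fills in); the helper name code_generation is hypothetical:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build the "compute_XY,sm_XY" string for a
// device with compute capability major.minor.
std::string code_generation(int major, int minor) {
    assert(major >= 1 && minor >= 0 && minor <= 9);
    std::string cc = std::to_string(major) + std::to_string(minor);
    return "compute_" + cc + ",sm_" + cc;
}
```

For the two cards above, code_generation(1, 2) returns compute_12,sm_12 (GeForce GT 330M) and code_generation(3, 0) returns compute_30,sm_30 (GeForce GTX 690).
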
Changes made in the project properties (the steps are the same for both projects, cuSVMPredict and cuSVMTrain):
  1. Open the solution in VS 2010
  2. Right-click the project (cuSVMTrain or cuSVMPredict) and choose "Build Customizations…", make sure that "CUDA 5.0(.targets, .props)" is checked
  3. Right-click the project and choose "Properties"
    1. Expand "Configuration Properties"
      1. General->Target Extension: .mexw64
      2. General->Configuration Type: Dynamic Library (.dll)
    2. Expand C/C++
      1. General->Additional Include Directories: $(SolutionDir)inc\;D:\Program Files\MATLAB\R2011b\extern\include;$(CudaToolkitIncludeDir);%(AdditionalIncludeDirectories)
    3. Expand CUDA C/C++
      1. Common->Additional Include Directories: $(SolutionDir)inc\;D:\Program Files\MATLAB\R2011b\extern\include;$(CudaToolkitIncludeDir);%(AdditionalIncludeDirectories)
      2. Common->Target Machine Platform: 64-bit (--machine 64)
      3. Device->Code Generation: compute_30,sm_30 – this depends on your GPU compute capability
    4. Expand Linker
      1. General->Additional Library Directories: %(AdditionalLibraryDirectories); $(CudaToolkitLibDir); D:\Program Files\MATLAB\R2011b\extern\lib\win64\microsoft
      2. Input->Additional Dependencies: cuda.lib;cublas.lib;libmex.lib;libmat.lib;libmx.lib;cudart.lib;kernel32.lib;user32.lib;gdi32.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies)
      3. Input->Module Definition File: TrainModule.def (for the cuSVMTrain project; for cuSVMPredict use PredictModule.def)
    5. Expand Build Events
      1. Post-Build Event->Command Line (each command on a separate line):
        echo copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"
        copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"

You may also want to check whether the active configuration is "Release" or "Debug".

How to use cuSVM

The zip package contains two folders:

  • cuSVM – Visual Studio 2010 solution
  • cuSVMmatlab – contains:
    1. libsvm,
    2. compiled cuSVMTrain.mexw64 and cuSVMPredict.mexw64 in the Lib folder,
    3. sample datasets in the data folder,
    4. Matlab script cuSVMTest.m

  1. Build cuSVM in Release or Debug mode. Important: check your GPU compute capability.
  2. Copy cuSVMTrain.mexw64 and cuSVMPredict.mexw64 to the Lib folder.
  3. Add the Lib folder to the Matlab search path.
  4. To classify a dataset, open the cuSVMTest.m file.

Comments (5)
  1. Felix
    2:57 on December 3rd, 2013

    Hi, I have a problem with cuSVMTest.m: when it reaches cuSVMTrain, I get the errors below.

    Cuda errro in line 952: invalid device symbol.
    Cuda errro in line 955: invalid device symbol.
    Cuda errro in line 957: invalid device symbol.
    Kernel Cache GPU memory allocation failed invalid device function . Error using cuSVMTrain
    Kernel Cache GPU memory allocation failed

    My GPU is GeForce GT 620.

    I followed all the steps but still cannot find the reason.

  2. ksopyla
    9:46 on February 4th, 2014

    Did you download the code from my site, or are you using the original Carpenter code?

  3. itzahk
    21:41 on March 12th, 2014

    To fix Felix’s "Kernel Cache GPU memory allocation failed" error, replace:

    size_t free, total;
    cuMemGetInfo( &free, &total );

    with

    size_t free, total;
    CUdevice device;
    CUcontext context;
    cuInit(0);
    cuDeviceGet( &device, 0 );
    cuCtxCreate( &context, 0, device );
    cuMemGetInfo( &free, &total );
    cuCtxDetach( context );

  4. Marcus
    17:43 on March 19th, 2015

    Hi ksopyla,

    thank you very much for sharing your work. It helped me a lot!
    I can build your solution with Visual Studio and your precompiled binaries worked for me out of the box. At least your classification example did.
    But when I modify the example to run a regression task I get a super high mse and the predictions don’t seem to make sense at all.
    I think there are two main changes necessary: replace "epsilon = []" with a correct value -> for comparison I took the standard libsvm value (which I think is 0.1 or 0.001). Next I changed the regression indicator variable from 0 to 1.
    LibSVM produces correct (expected) results, but cuSVM does not. The mse of LibSVM is 0.17 while cuSVM’s mse is something like 659135.
    The compute capability for my GeForce GT 650M (MacBook Pro) should be 3.0 – but I also tried building with settings for a capability of 1.2.
    Can I send you the modified regression cuSVM example so that you can try it with some of your own datasets?
    Or maybe you could provide a simple regression example? But I think I did everything as described in the cuSVM documentation.

    Many thanks in advance,
    Marcus

  5. Hamnis
    22:06 on June 25th, 2015

    Kernel Cache GPU memory allocation failed invalid device function . Error using cuSVMTrain
    Kernel Cache GPU memory allocation failed

    I got this problem too.

    And I downloaded the code from this site (ksopyla).

    Can you give me a solution, sir?
