cuSVM for CUDA 5.0 and Matlab x64
This page shows how to build cuSVM, a GPU-accelerated SVM library that works with data in dense format. The library was written by Austin Carpenter. The procedure uses CUDA 5.0, MATLAB x64 and Visual Studio 2010. The code and project files were modified so that the library compiles and links; many steps were taken from http://www.parallelcoding.com/2012/02/09/cusvm-in-visual-studio-2010-with-cuda-4-0/
Modifications:
- Added Matlab variables:
  - cuSVMTrainIter – contains the number of iterations performed by the solver
  - cuSVMTrainObj – contains the final objective function value after training
- In file cuSVMSolver.cu, lines 869-874, all calls to cudaMemcpyToSymbol were changed, because the CUDA 5.0 runtime library no longer accepts a symbol name passed as a string – http://stackoverflow.com/questions/12947914/error-in-cudamemcpytosymbol-using-cuda-5
before the change:
mxCUDA_SAFE_CALL(cudaMemcpyToSymbol("taumin", &h_taumin, sizeof(float) ));
after the change:
mxCUDA_SAFE_CALL(cudaMemcpyToSymbol(taumin, &h_taumin, sizeof(float) ));
- In functions FindBI, FindBJ and FindStoppingJ, the way the reduction in shared memory is done was changed (http://stackoverflow.com/questions/6510427/cuda-finding-max-using-reduction-error)
- The kernel cache size is limited to 400 MB. If you want a bigger cache, modify line 24 of cuSVMSolver.cu:
#define KERNEL_CACHE_SIZE (400*1024*1024)
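When raising the cache limit, keep in mind that an expression such as 400*1024*1024 is evaluated in int arithmetic, so a value of 2048 MB or more overflows a 32-bit int. The sketch below shows one way to compute the size in size_t and to estimate how many kernel-matrix rows would fit. The names cacheBytes and rowsThatFit are hypothetical, and the assumption that each cached row holds one float per training sample is made for illustration only; it is not taken from cuSVMSolver.cu.

```cpp
#include <cassert>
#include <cstddef>

// Cache size in bytes, computed in size_t so that sizes of 2048 MB and
// above do not overflow a 32-bit int (hypothetical helper, not part of cuSVM).
inline std::size_t cacheBytes(std::size_t megabytes) {
    return megabytes * 1024u * 1024u;
}

// Number of kernel-matrix rows that fit in the cache, ASSUMING each row
// stores one float per training sample (an illustrative assumption).
inline std::size_t rowsThatFit(std::size_t bytes, std::size_t nSamples) {
    assert(nSamples > 0);
    return bytes / (nSamples * sizeof(float));
}
```

In the macro itself, a cast such as ((size_t)2048*1024*1024) keeps the multiplication in size_t. The GPU still needs that much free memory for the allocation to succeed.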
Build Procedure
All the steps described below have already been applied to the project files. You only have to check that all paths are set correctly and that your GPU compute capability is set properly.
My setup:
- Windows 7 x64
- Visual Studio 2010
- CUDA 5.0
- Matlab R2011b
- the code was tested on a GeForce GT 330M and a GeForce GTX 690
Prerequisites:
Determine paths:
- Matlab include path, mine is "D:\Program Files\MATLAB\R2011b\extern\include" (Matlab was installed on drive D:\)
- Matlab library path: "D:\Program Files\MATLAB\R2011b\extern\lib\win64\microsoft"
- CUDA toolkit include path: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include"
- GPU compute capability, mine is 1.2 for the GeForce GT 330M (compute_12,sm_12) and 3.0 for the GeForce GTX 690 (compute_30,sm_30)
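The Code Generation value used later in the project properties follows directly from the major and minor numbers of the compute capability. The helper below is a minimal sketch of that mapping. The name codeGenFlag is hypothetical; on a real machine the two numbers can be read from the major and minor fields of cudaDeviceProp (filled in by cudaGetDeviceProperties) or from the output of the deviceQuery sample.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build the "compute_XY,sm_XY" string for the Device -> Code Generation
// field from a compute capability X.Y (hypothetical helper for illustration).
inline std::string codeGenFlag(int major, int minor) {
    assert(major >= 1 && minor >= 0 && minor <= 9);
    std::ostringstream cc;
    cc << major << minor;  // e.g. 3 and 0 give "30"
    return "compute_" + cc.str() + ",sm_" + cc.str();
}
```

For the two cards mentioned above, codeGenFlag(1, 2) and codeGenFlag(3, 0) produce the same strings as listed in the last bullet.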
Changes made in the project properties (the steps are the same for both projects, cuSVMPredict and cuSVMTrain):
- Open the solution in VS 2010
- Right-click on the project (cuSVMTrain or cuSVMPredict) and choose "Build Customizations …", make sure that "CUDA 5.0(.targets, .props)" is checked
- Right-click on the project (cuSVMTrain or cuSVMPredict) and choose "Properties"
- Expand "Configuration Properties"
  - General->Target Extension: .mexw64
  - General->Configuration Type: Dynamic Library (.dll)
  - Expand C/C++
    - General->Additional Include Directories: $(SolutionDir)inc\;D:\Program Files\MATLAB\R2011b\extern\include;$(CudaToolkitIncludeDir);%(AdditionalIncludeDirectories)
  - Expand CUDA C/C++
    - Common->Additional Include Directories: $(SolutionDir)inc\;D:\Program Files\MATLAB\R2011b\extern\include;$(CudaToolkitIncludeDir);%(AdditionalIncludeDirectories)
    - Common->Target Machine Platform: 64-bit (--machine 64)
    - Device->Code Generation: compute_30,sm_30 – this depends on your GPU compute capability
  - Expand Linker
    - General->Additional Library Directories: %(AdditionalLibraryDirectories); $(CudaToolkitLibDir); D:\Program Files\MATLAB\R2011b\extern\lib\win64\microsoft
    - Input->Additional Dependencies: cuda.lib;cublas.lib;libmex.lib;libmat.lib;libmx.lib;cudart.lib;kernel32.lib;user32.lib;gdi32.lib;winspool.lib;comdlg32.lib;advapi32.lib;shell32.lib;ole32.lib;oleaut32.lib;uuid.lib;odbc32.lib;odbccp32.lib;%(AdditionalDependencies)
    - Input->Module Definition File: TrainModule.def (for the cuSVMTrain project; for cuSVMPredict set PredictModule.def)
  - Expand Build Events
    - Post-Build Event->Command Line (each command on a separate line):
      echo copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"
      copy "$(CudaToolkitBinDir)\cudart*.dll" "$(OutDir)"
Optionally, check whether the active configuration is "Release" or "Debug".
How to use cuSVM
The zip package contains two folders:
- cuSVM – Visual Studio 2010 solution
- cuSVMmatlab – contains:
  - libsvm,
  - compiled cuSVMTrain.mexw64 and cuSVMPredict.mexw64 in the Lib folder,
  - sample datasets in the data folder
  - Matlab script cuSVMTest.m
Steps:
- Build cuSVM in Release or Debug mode. Important: check your GPU compute capability first.
- Copy cuSVMTrain.mexw64 and cuSVMPredict.mexw64 to the Lib folder.
- Add the Lib folder to the Matlab search path.
- If you want to classify a dataset, open the cuSVMTest.m file.
Comments (5)
2:57 on December 3rd, 2013
Hi, I ran into a problem with cuSVMTest.m: when it reaches cuSVMTrain, I get the errors below.
Cuda errro in line 952: invalid device symbol.
Cuda errro in line 955: invalid device symbol.
Cuda errro in line 957: invalid device symbol.
Kernel Cache GPU memory allocation failed invalid device function . Error using cuSVMTrain
Kernel Cache GPU memory allocation failed
My GPU is GeForce GT 620.
I followed all the steps but still cannot find the cause.
9:46 on February 4th, 2014
Did you download the code from my site, or are you using the original Carpenter code?
21:41 on March 12th, 2014
To fix Felix’s "Kernel Cache GPU memory allocation failed" error, replace:
size_t free, total;
cuMemGetInfo( &free, &total );
with
CUdevice device;
CUcontext context;
cuInit(0);
cuDeviceGet( &device, 0 );
cuCtxCreate( &context, 0, device );
cuMemGetInfo( &free, &total );
cuCtxDetach( context );
17:43 on March 19th, 2015
Hi ksopyla,
thank you very much for sharing your work. It helped me a lot!
I can build your solution with Visual Studio and your precompiled binaries worked for me out of the box. At least your classification example did.
But when I modify the example to run a regression task I get a very high MSE and the predictions don’t seem to make sense at all.
I think two main changes are necessary: replace "epsilon = []" with a valid value (for comparison I took the standard libsvm value, which I think is 0.1 or 0.001). Next, I changed the regression indicator variable from 0 to 1.
LibSVM produces correct (expected) results, but the cuSVM does not. The mse of LibSVM is 0.17 while cuSVM’s mse is something like 659135.
The compute capability of my GeForce GT 650M (MacBook Pro) should be 3.0, but I also tried building with the settings for a capability of 1.2.
Can I send you the modified regression cuSVM example so that you can try it with some of your own datasets?
Or maybe you could provide a simple regression example? But I think I did everything as described in the cuSVM documentation.
Many thanks in advance,
Marcus
22:06 on June 25th, 2015
Kernel Cache GPU memory allocation failed invalid device function . Error using cuSVMTrain
Kernel Cache GPU memory allocation failed
I got this problem too,
and I downloaded the code from this site (ksopyla).
Can you give me a solution, sir?