Here are the steps I ran to test out Caffe on an AWS G2 instance. The current rate for running a g2.2xlarge is 65 cents/hour, and an EBS General Purpose SSD volume costs 10 cents per GB-month, so running these commands will cost you a couple of dollars.
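For a rough sense of the bill, the arithmetic can be sketched as below. The function name estimate_cost, the 3-hour session, and the 720-hour month used to prorate storage are assumptions made for illustration; the two rates are the ones quoted above.

```shell
# Rough cost estimate for one session (illustrative sketch).
# Usage: estimate_cost HOURS VOLUME_GB
estimate_cost() {
    awk -v hours="$1" -v gb="$2" 'BEGIN {
        instance_rate   = 0.65   # dollars per instance-hour (g2.2xlarge)
        storage_rate    = 0.10   # dollars per GB-month (General Purpose SSD)
        hours_per_month = 720    # assumption: 30-day month for prorating storage
        cost = hours * instance_rate + gb * storage_rate * hours / hours_per_month
        printf "%.2f\n", cost
    }'
}

estimate_cost 3 30   # prints 1.96
```

The storage share is tiny for a short session; it only starts to matter if the volume is kept around after the instance is stopped.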
1) Launch a GPU Instance with an HVM AMI. Here are some parameters I chose:
- Datacenter: US East (N. Virginia)
- AMI: Community AMIs -> Fedora -> Fedora_20_HVM_AMI
- Instance Type: g2.2xlarge
- Storage: 30 GB General Purpose SSD EBS volume
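If you prefer the command line to the web console, the same parameters could be passed to the AWS CLI roughly as follows. This is only a sketch: the AMI ID and key pair name are placeholders, the root device name /dev/sda1 is an assumption about the AMI, and the DRY_RUN switch is a convenience added here so the command can be inspected before it is run.

```shell
# Placeholders (hypothetical): substitute your own AMI ID and key pair name.
AMI_ID="ami-xxxxxxxx"
KEY_NAME="key_filename"

launch_gpu_instance() {
    # With DRY_RUN set, the command is only printed, not executed.
    ${DRY_RUN:+echo} aws ec2 run-instances \
        --region us-east-1 \
        --image-id "$AMI_ID" \
        --instance-type g2.2xlarge \
        --key-name "$KEY_NAME" \
        --block-device-mappings \
        '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":30,"VolumeType":"gp2"}}]'
}

# Print the command without launching anything.
DRY_RUN=1 launch_gpu_instance
```

Calling launch_gpu_instance without DRY_RUN set would actually start (and bill for) an instance, so check the printed command first.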
2) Connect to your instance once it's running and you have the IP address (nn.nn.nn.nn). I use ssh from my local Linux machine with a command that looks something like:
ssh -X -i key_filename.pem fedora@nn.nn.nn.nn
Then update and install some basic packages:
# initial security updates
sudo yum update -y
# gcc toolchain
sudo yum groupinstall -y "C Development Tools and Libraries"
# git and stuff
sudo yum groupinstall -y "Development tools"
# for the nvidia driver
sudo yum install -y kernel-devel dkms
# for lspci, locate and wget
sudo yum install -y pciutils mlocate wget
# basic X11
sudo yum install -y xorg-x11-apps xorg-x11-xauth
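A quick way to confirm the packages landed is to check that the commands they provide are on your PATH. The helper below is a minimal sketch; the name missing_tools and the particular list of commands checked are my own choices, not part of any package.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools gcc make git lspci wget xauth)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```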
If you want to make sure you can see the GPU device on your PCI bus, run this:
lspci | grep NVIDIA
My output was:
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)
3) Install CUDA. Grab the CUDA repository RPM for Fedora from: https://developer.nvidia.com/cuda-downloads
I copied the URL and ran this command:
sudo rpm -Uvh http://developer.download.nvidia.com/compute/cuda/repos/fedora20/x86_64/cuda-repo-fedora20-6.5-14.x86_64.rpm
Also install the RPM Fusion repositories for akmods and other good stuff:
sudo rpm -Uvh http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-20.noarch.rpm
sudo rpm -Uvh http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-20.noarch.rpm
Install CUDA with this command:
sudo yum install -y cuda
Make sure the nouveau driver is blacklisted on the latest kernel, which you are about to reboot into. Edit grub.conf (sudo vi /etc/grub.conf) and make sure these parameters are added to the kernel line:
nouveau.modeset=0 rd.driver.blacklist=nouveau
For example, mine is:
kernel /boot/vmlinuz-3.16.6-203.fc20.x86_64 ro root=UUID=f1d4c251-e4c9-408b-a7b8-f5a9be8511fd console=hvc0 LANG=en_US.UTF-8 nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off vga=normal
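Since a typo here means rebooting into a kernel that still loads nouveau, it can be worth checking the line mechanically. Below is a minimal sketch; has_nouveau_blacklist is a hypothetical helper name, and the sample line is a shortened version of mine.

```shell
# Succeed only if both nouveau-disabling parameters appear as whole words
# in the kernel line passed as the first argument.
has_nouveau_blacklist() {
    case " $1 " in
        *" nouveau.modeset=0 "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" rd.driver.blacklist=nouveau "*) ;;
        *) return 1 ;;
    esac
}

line="ro console=hvc0 nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off vga=normal"
if has_nouveau_blacklist "$line"; then
    echo "nouveau is blacklisted"
else
    echo "parameters missing"
fi
```

After the reboot, the same check can be pointed at the running kernel with has_nouveau_blacklist "$(cat /proc/cmdline)".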
Add the CUDA libraries to your standard shared library path:
echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda-x86_64.conf
4) Now is a good time to reboot:
sudo reboot
Once you reconnect, if you want to make sure X11 forwarding works (the -X in the ssh command), run the xlogo command and you should see an X window pop up on your desktop.
If you want to make sure your nvidia kernel driver works, run this command:
nvidia-smi -q | head
My output was:
==============NVSMI LOG==============
Timestamp : Fri Oct 31 20:09:04 2014
Driver Version : 340.29
Attached GPUs : 1
GPU 0000:00:03.0
Product Name : GRID K520
Product Brand : Grid
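If you want to use this check in a script rather than read it by eye, the driver version can be pulled out of that report. This is a sketch that assumes the "Driver Version : ..." line format shown above; driver_version is a hypothetical helper name, and the sample text is copied from my output.

```shell
# Read an "nvidia-smi -q" style report on stdin and print the driver version.
driver_version() {
    awk -F':' '/^[ \t]*Driver Version/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

sample='Timestamp : Fri Oct 31 20:09:04 2014
Driver Version : 340.29
Attached GPUs : 1'

printf '%s\n' "$sample" | driver_version   # prints 340.29
```

On the instance itself the equivalent call would be nvidia-smi -q | driver_version; an empty result suggests the driver did not load.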
It's a good idea to build the CUDA samples just to make sure they work:
cd /usr/local/cuda/samples/
sudo make
To see what Device Query returns run:
./1_Utilities/deviceQuery/deviceQuery
Mine returned:
./1_Utilities/deviceQuery/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GRID K520"
CUDA Driver Version / Runtime Version 6.5 / 6.5
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 4096 MBytes (4294770688 bytes)
( 8) Multiprocessors, (192) CUDA Cores/MP: 1536 CUDA Cores
GPU Clock rate: 797 MHz (0.80 GHz)
Memory Clock rate: 2500 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 0 / 3
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS
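The two lines I care most about in that output are the compute capability and the final result. A minimal sketch for extracting them follows; the helper names are hypothetical, and the patterns assume the line formats shown in the output above.

```shell
# Read deviceQuery output on stdin; succeed only if it reports a pass.
device_query_passed() {
    grep -q '^Result = PASS'
}

# Read deviceQuery output on stdin; print the compute capability.
compute_capability() {
    sed -n 's/^ *CUDA Capability Major\/Minor version number: *//p'
}

sample='  CUDA Capability Major/Minor version number:    3.0
Result = PASS'

printf '%s\n' "$sample" | compute_capability   # prints 3.0
printf '%s\n' "$sample" | device_query_passed && echo "deviceQuery passed"
```

On the instance, ./1_Utilities/deviceQuery/deviceQuery | device_query_passed gives an exit status you can test in a setup script.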
5) Install cuDNN. Download cuDNN from NVIDIA and get the file onto your AWS instance. Since you need to register as an NVIDIA developer and accept the license terms, I downloaded it on my Linux desktop and then copied it to my AWS instance with scp. The cuDNN URL is: https://developer.nvidia.com/cuDNN. Once I had the file in my home directory, these are the commands I used to 'install' it:
tar -xvzf cudnn-6.5-linux-R1.tgz
sudo cp -var cudnn-6.5-linux-R1/libcudnn* /usr/local/cuda/lib64/
sudo cp -var cudnn-6.5-linux-R1/cudnn.h /usr/local/cuda/include/
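Because this 'install' is just a file copy, a simple existence check is enough to confirm it worked. The sketch below assumes the /usr/local/cuda prefix used above; cudnn_installed is a hypothetical helper name.

```shell
# Succeed if cudnn.h and at least one libcudnn* file exist under the CUDA prefix.
cudnn_installed() {
    prefix="${1:-/usr/local/cuda}"
    [ -f "$prefix/include/cudnn.h" ] || return 1
    for lib in "$prefix"/lib64/libcudnn*; do
        # If the glob matched nothing it stays literal and this test fails.
        [ -e "$lib" ] && return 0
    done
    return 1
}

if cudnn_installed /usr/local/cuda; then
    echo "cuDNN files found"
else
    echo "cuDNN files missing"
fi
```

If the loader later complains that it cannot find libcudnn even though the files are there, running sudo ldconfig refreshes the shared library cache.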
6) Download Caffe. Follow the instructions on: http://caffe.berkeleyvision.org/installation.html
Clone the Caffe source code from GitHub:
git clone https://github.com/BVLC/caffe.git
cd caffe
Then install a bunch of caffe dependencies (some of which are optional for python):
sudo yum install -y atlas-devel bc
sudo yum install -y protobuf-devel leveldb-devel
sudo yum install -y snappy-devel opencv-devel
sudo yum install -y boost-devel hdf5-devel
sudo yum install -y gflags-devel glog-devel lmdb-devel
sudo yum install -y python-pip python-devel boost-python
sudo yum install -y gcc-gfortran
sudo yum install -y libpng-devel freetype-devel
for req in $(cat python/requirements.txt); do sudo pip install $req; done
7) Build and test Caffe. Create and edit your config:
cp Makefile.config.example Makefile.config
This is all I changed:
USE_CUDNN := 1
BLAS_LIB := /usr/lib64/atlas
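If you would rather not open an editor, the same two changes could be applied with sed. This is a sketch under an assumption: that Makefile.config contains USE_CUDNN and BLAS_LIB lines that are either commented out with a leading "#" or already active. The function name configure_caffe is hypothetical, and the demo runs on a throwaway file rather than the real config.

```shell
# Apply the two edits in place to a Makefile.config-style file.
configure_caffe() {
    config="$1"
    # Enable the cuDNN switch (uncommenting it if necessary).
    sed -i 's|^[#[:space:]]*USE_CUDNN := .*|USE_CUDNN := 1|' "$config"
    # Point BLAS_LIB at the ATLAS libraries (uncommenting it if necessary).
    sed -i 's|^[#[:space:]]*BLAS_LIB := .*|BLAS_LIB := /usr/lib64/atlas|' "$config"
}

# Demo on a temporary file with two commented-out lines.
demo=$(mktemp)
printf '%s\n' '# USE_CUDNN := 1' '# BLAS_LIB := /path/to/your/blas' > "$demo"
configure_caffe "$demo"
cat "$demo"
rm -f "$demo"
```

In the caffe directory the call would be configure_caffe Makefile.config, after which a quick grep for the two variables shows whether the lines were actually there to be changed.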
Build the source and the tests, then run the tests:
make all
make test
make runtest
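The build takes a while on one core, and make can compile in parallel. The sketch below picks one job per CPU core using nproc; the variable njobs and the wrapper build_caffe are names introduced here for illustration, and the wrapper is only defined, not called.

```shell
# One compile job per CPU core, falling back to 1 if nproc is unavailable.
njobs=$(nproc 2>/dev/null || echo 1)

# Same three targets as above, with the two compile steps run in parallel.
build_caffe() {
    make -j"$njobs" all && make -j"$njobs" test && make runtest
}

echo "Would build with $njobs parallel jobs"
```

Run build_caffe from the caffe directory; if a parallel build fails with a confusing error, rerunning plain make all usually shows the first real failure more clearly.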
Test MNIST for good measure:
pushd data/mnist; ./get_mnist.sh; popd
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh
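To measure the training time on your own instance, a small wrapper like the following could be used. It is a sketch: timed is a hypothetical helper, it only resolves whole seconds, and the demo times a trivial command instead of the training script.

```shell
# Run a command, report its wall-clock time on stderr, and pass through its exit status.
timed() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start)) s" >&2
    return $status
}

timed true
```

For the training run that would be timed ./examples/mnist/train_lenet.sh.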
It took about 47 seconds to train and achieved an accuracy of 0.9909. Don't forget to shut down your instance when you are done:
sudo shutdown -h now