Friday, October 31, 2014

Caffe on AWS with Fedora 20, CUDA 6.5 and CUDNN

Here are the steps I ran to test out Caffe on an AWS G2 instance.  The current rate for running a g2.2xlarge is 65 cents/hour, and an EBS General Purpose SSD costs 10 cents per GB-month, so running through these commands will cost you a couple of dollars.
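For a rough sense of the bill, the arithmetic can be sketched in shell (rates as quoted above; the 3 hours of runtime is an assumed figure for one session):

```shell
# Rough cost estimate at the rates above: $0.65/hour for the instance,
# $0.10 per GB-month for the EBS volume. Assumed usage: 3 hours, 30 GB.
hours=3
gb=30
awk -v h="$hours" -v g="$gb" \
  'BEGIN { printf "instance: $%.2f, EBS (1 month): $%.2f\n", h*0.65, g*0.10 }'
# prints: instance: $1.95, EBS (1 month): $3.00
```

Note the EBS charge is pro-rated, so a volume deleted after a few hours costs far less than the monthly figure.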

1) Launch a GPU Instance with an HVM AMI.  Here are some parameters I chose:
  • Region: US East (N. Virginia)
  • AMI: Community AMIs -> Fedora -> Fedora_20_HVM_AMI
  • Instance Type: g2.2xlarge
  • Storage: 30 GB General Purpose SSD EBS volume
I encourage you to create a security group that only allows in SSH from your specific subnet.

2) Connect to your instance once it's running and you have its IP address (nn.nn.nn.nn). I use ssh from my local Linux machine with a command that looks something like:

  ssh -X -i key_filename.pem fedora@nn.nn.nn.nn
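If you reconnect often, an entry in ~/.ssh/config saves retyping the flags. A sketch (the host alias and key path are assumptions; substitute your own):

```
Host caffe-aws
    HostName nn.nn.nn.nn
    User fedora
    IdentityFile ~/.ssh/key_filename.pem
    ForwardX11 yes
```

With that in place, `ssh caffe-aws` is equivalent to the full command above.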

3) Update and install some basic packages:
  # initial security updates
  sudo yum update -y
  # gcc toolchain
  sudo yum groupinstall -y "C Development Tools and Libraries"
  # git and stuff
  sudo yum groupinstall -y "Development tools"
  # for the nvidia driver
  sudo yum install -y kernel-devel dkms
  # for lspci, locate and wget
  sudo yum install -y pciutils mlocate wget
  # basic X11
  sudo yum install -y xorg-x11-apps xorg-x11-xauth

If you want to make sure you can see the GPU device on your PCI bus, run this:
lspci | grep NVIDIA
My output was:
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)

4) Install CUDA. Grab the CUDA repository RPM for Fedora from: https://developer.nvidia.com/cuda-downloads
I copied the URL and ran this command:
    sudo rpm -Uvh http://developer.download.nvidia.com/compute/cuda/repos/fedora20/x86_64/cuda-repo-fedora20-6.5-14.x86_64.rpm

Also install the RPMFusion repositories for akmods and other good stuff:
    sudo rpm -Uvh http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-20.noarch.rpm
    sudo rpm -Uvh http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-20.noarch.rpm

Install CUDA with this command:
    sudo yum install -y cuda

Make sure the nouveau driver is blacklisted for the latest kernel, which you are about to reboot into. Edit grub.conf (sudo vi /etc/grub.conf) and make sure these parameters are added to the kernel line:
  nouveau.modeset=0 rd.driver.blacklist=nouveau

For example, mine is:
  kernel /boot/vmlinuz-3.16.6-203.fc20.x86_64 ro root=UUID=f1d4c251-e4c9-408b-a7b8-f5a9be8511fd console=hvc0 LANG=en_US.UTF-8 nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off vga=normal
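Before rebooting, a quick grep can confirm both parameters made it onto the kernel line. A minimal sketch, run here against a sample line (with a placeholder UUID) rather than the real /etc/grub.conf:

```shell
# Sample kernel line standing in for the one in /etc/grub.conf; on the
# instance, pipe the real file into the greps instead.
line='kernel /boot/vmlinuz-3.16.6-203.fc20.x86_64 ro root=UUID=xxxx console=hvc0 nouveau.modeset=0 rd.driver.blacklist=nouveau'
if echo "$line" | grep -q 'nouveau\.modeset=0' && \
   echo "$line" | grep -q 'rd\.driver\.blacklist=nouveau'; then
    echo "nouveau is blacklisted"
else
    echo "nouveau blacklist parameters missing" >&2
fi
# prints: nouveau is blacklisted
```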

Add the CUDA libraries to your standard shared library path:
echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda-x86_64.conf

5) Now is a good time to reboot:
sudo reboot

Once you reconnect, if you want to make sure X11 forwarding works (the -X in the ssh command), run the xlogo command and an X window should pop up on your desktop.

If you want to make sure your nvidia kernel driver works, run this command:
nvidia-smi -q | head

My output was:
==============NVSMI LOG==============

Timestamp                           : Fri Oct 31 20:09:04 2014
Driver Version                      : 340.29

Attached GPUs                       : 1
GPU 0000:00:03.0
    Product Name                    : GRID K520
    Product Brand                   : Grid

It's a good idea to build the CUDA samples just to make sure they work:

cd /usr/local/cuda/samples/
sudo make

To see what Device Query returns run:
./1_Utilities/deviceQuery/deviceQuery

Mine returned:

./1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GRID K520"
  CUDA Driver Version / Runtime Version          6.5 / 6.5
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 4096 MBytes (4294770688 bytes)
  ( 8) Multiprocessors, (192) CUDA Cores/MP:     1536 CUDA Cores
  GPU Clock rate:                                797 MHz (0.80 GHz)
  Memory Clock rate:                             2500 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 3
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS

6) Install cuDNN. Download cuDNN from NVIDIA and get the file onto your AWS instance. Since you need to register as an NVIDIA developer and accept the license terms, I downloaded it on my Linux desktop and then copied it to my AWS instance with scp. The cuDNN URL is: https://developer.nvidia.com/cuDNN and once I had the file in my home directory, these are the commands I used to 'install' it:

    tar -xvzf cudnn-6.5-linux-R1.tgz
    sudo cp -var cudnn-6.5-linux-R1/libcudnn* /usr/local/cuda/lib64/
    sudo cp -var cudnn-6.5-linux-R1/cudnn.h /usr/local/cuda/include/

7) Download Caffe. Follow the instructions at: http://caffe.berkeleyvision.org/installation.html

Clone the Caffe source code from GitHub:

git clone https://github.com/BVLC/caffe.git
cd caffe

Then install a bunch of Caffe dependencies (the second group is only needed for the optional Python support):

sudo yum install -y atlas-devel bc
sudo yum install -y protobuf-devel leveldb-devel
sudo yum install -y snappy-devel opencv-devel
sudo yum install -y boost-devel hdf5-devel
sudo yum install -y gflags-devel glog-devel lmdb-devel

sudo yum install -y python-pip python-devel boost-python
sudo yum install -y gcc-gfortran
sudo yum install -y libpng-devel freetype-devel

for req in $(cat python/requirements.txt); do sudo pip install $req; done
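The loop installs each requirement independently, so one failing package doesn't abort the rest. The iteration itself can be sketched against a mock requirements file (the package names here are just examples):

```shell
# Demo of the per-package loop using a mock requirements file; on the
# instance, the loop above reads python/requirements.txt instead.
reqs=$(mktemp)
printf 'numpy\nscipy\nprotobuf\n' > "$reqs"
for req in $(cat "$reqs"); do
    echo "installing: $req"   # stand-in for: sudo pip install "$req"
done
rm -f "$reqs"
# prints:
# installing: numpy
# installing: scipy
# installing: protobuf
```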

8) Build and test Caffe. Create and edit your config:

cp Makefile.config.example Makefile.config

This is all I changed:

USE_CUDNN := 1
BLAS_LIB := /usr/lib64/atlas

Build the source and the tests, then run the tests:
make all
make test
make runtest

Test MNIST for good measure:

pushd data/mnist; ./get_mnist.sh; popd
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh
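To pull the final accuracy out of the solver output, a one-liner with awk works. A sketch, assuming the "accuracy = ..." format Caffe uses in its test-phase log lines (the sample line below is illustrative, not a real capture):

```shell
# Sample solver log line in the shape Caffe prints during the test phase;
# on the instance, grep the real training output instead.
logline='I1031 21:00:00.000000  1234 solver.cpp] Test net output #0: accuracy = 0.9909'
echo "$logline" | awk -F'accuracy = ' '/accuracy/ { print $2 }'
# prints: 0.9909
```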

It took about 47 seconds to train and achieved an accuracy of 0.9909. Don't forget to shut down your instance when you are done:

sudo shutdown -h now

Thursday, August 4, 2011

Andrew Ng

The Future of Robotics and Artificial Intelligence:
http://www.youtube.com/watch?v=AY4ajbu_G3k

Bay Area Vision Meeting: Unsupervised Feature Learning and Deep Learning:
http://www.youtube.com/watch?v=ZmNOAtZIgIk

Deep Learning and Unsupervised Feature Learning
http://www.stanford.edu/class/cs294a/

Thursday, February 5, 2009

Saturday, September 13, 2008

Two more Google Tech Talks

AI & Digital Media by Steve DiPaola
He talks about parametrized spaces and in particular shows a nice application to facial systems.
http://www.youtube.com/watch?v=i78P-K1RhjY&feature=related

http://www.dipaola.org/facespace/


Cognitive and Computational Neuroscience of Categorization by Mark Gluck
Discussion of the Basal Ganglia and the Hippocampus in learning.
http://www.youtube.com/watch?v=2Ei6wFJ9kCc&feature=related

Saturday, May 24, 2008

Misc courses

IIT Courses on Bayesian structures:

http://www.youtube.com/view_play_list?p=6EE0CD02910E57B8&page=3


Stanford lecture on Self-Improving Artificial Intelligence:

http://www.youtube.com/watch?v=omsuTsOmvsc&feature=related

by Stephen M. Omohundro @ http://selfawaresystems.com/

Monday, May 12, 2008

Talk on sentence trees

Modeling Human Sentence Processing:

http://www.youtube.com/watch?v=_kAWu37EDd4&feature=user

by some grad student with ICCS at the University of Edinburgh.