Remote CUDA with Amazon EC2


Setting up an EC2 server to develop CUDA code remotely with Eclipse Nsight

I write code on a virtual machine with Fedora 21 64-bit (kernel: 3.19.3-100.fc20.x86_64). I wanted to write some CUDA programs without having Nvidia hardware or drivers.

Requirements:

  • Computer with Fedora 21 or RHEL 7 64-bit - no Nvidia hardware necessary
  • Amazon Web Services account

Install the CUDA toolkit on your computer

Download the CUDA Toolkit 7. The official Fedora 21 installers are not released yet, but I used the RHEL 7 Network Installer RPM file on my Fedora 21 installation with no problem.

After you have downloaded the file, install it and add the toolkit paths to your ~/.bashrc:

$ sudo yum install cuda-repo-rhel7-7.0-28.x86_64.rpm
$ sudo yum clean expire-cache
$ sudo yum install cuda
$ tail ~/.bashrc
function exportcuda() {
    PATH=$PATH:/usr/local/cuda-7.0/bin
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-7.0/lib64
    export PATH
    export LD_LIBRARY_PATH
}
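The function only takes effect once it is called in the current shell. A minimal sketch of that step, assuming the toolkit was installed to /usr/local/cuda-7.0 as shown above; the function is repeated so the snippet is self-contained, and the final check simply reports whether nvcc can be found:

```shell
# Same function as in ~/.bashrc, repeated here so the snippet stands alone.
exportcuda() {
    PATH=$PATH:/usr/local/cuda-7.0/bin
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-7.0/lib64
    export PATH
    export LD_LIBRARY_PATH
}

# Apply the paths to the current shell session.
exportcuda

# Report whether the CUDA compiler is now reachable through PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found; check that /usr/local/cuda-7.0/bin exists"
fi
```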

Now you have to erase the Nvidia drivers, or else you’ll boot into a black screen (if that happens, you can still run the following steps from a non-graphical tty).

Steps to erase Nvidia drivers:

for pkg in $(rpm -qa | grep -i nvidia); do sudo rpm -e --nodeps "$pkg"; done

After the installation, you will have several new applications, including Nsight Eclipse Edition and NVIDIA Visual Profiler. Try to run Nsight:

nsight

Set up an Amazon EC2 GPU instance

According to EC2 pricing, the g2.2xlarge GPU instance costs $0.650/hr and the g2.8xlarge GPU instance costs $2.60/hr. Launch an instance, and in the security group settings for the instance, open up all traffic between your IP address and the instance.
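Since these instances are billed by the hour, a rough estimate is useful before launching one. A small sketch using the two hourly rates quoted above; the `estimate_cost` helper and the 8-hour session length are illustrative assumptions, and current EC2 rates may differ:

```shell
# estimate_cost RATE HOURS: print RATE * HOURS with two decimal places.
# (Hypothetical helper, not part of any AWS tool.)
estimate_cost() {
    awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

estimate_cost 0.650 8   # g2.2xlarge for an 8-hour session
estimate_cost 2.60 8    # g2.8xlarge for an 8-hour session
```

With the rates above this prints 5.20 and 20.80, so remember to stop the instance when you are done.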

Now execute the following on the EC2 instance:

$ wget https://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
$ sudo yum install epel-release-7-5.noarch.rpm
$ echo "[linuxtech]
> name=LinuxTECH
> baseurl=http://pkgrepo.linuxtech.net/el6/release/
> enabled=1
> gpgcheck=1
> gpgkey=http://pkgrepo.linuxtech.net/el6/release/RPM-GPG-KEY-LinuxTECH.NET
> " | sudo tee /etc/yum.repos.d/linuxtech.repo
$ sudo yum install libvdpau
$ sudo yum install dkms
$ sudo yum install git gcc-c++
$ wget http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-7.0-28.x86_64.rpm

Follow the same installation procedure as earlier, except you shouldn’t erase the Nvidia drivers afterwards.

Add your SSH public key to the EC2 instance so that you no longer need the .pem file to log in.
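One way to do this is to append your public key to the instance's authorized_keys file, authenticating one last time with the .pem file. A minimal sketch; the `add_ssh_key` helper, the file names, and the `ip_of_instance` variable are hypothetical, and it assumes the default ec2-user login:

```shell
# add_ssh_key PEM PUBKEY HOST: append a local public key to the instance's
# authorized_keys file, logging in once with the EC2 .pem key file.
add_ssh_key() {
    pem=$1 pubkey=$2 host=$3
    # The public key is fed to the remote "cat" through stdin.
    ssh -i "$pem" "ec2-user@$host" \
        'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys' \
        < "$pubkey"
}

# Example call (hypothetical file names and host variable):
# add_ssh_key ~/Downloads/my-key.pem ~/.ssh/id_rsa.pub "$ip_of_instance"
```

After this, a plain `ssh ec2-user@$ip_of_instance` should work without the `-i` option, which is also what Nsight's remote connection relies on.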

Find and download the latest Nvidia GRID K520 driver for Linux 64-bit, transfer the file to your instance and run it.

If you run into dkms kernel header errors reported by the NVIDIA installer, verify that your kernel-headers and kernel-devel versions match the running kernel and repair any broken symlinks in /lib/modules/$(uname -r).
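A quick way to spot the broken-symlink case is to test whether the `build` link under the modules directory resolves. A minimal sketch; the `check_kernel_build_link` helper is hypothetical, and it assumes the usual layout where `build` points at the installed kernel-devel tree:

```shell
# check_kernel_build_link [MODULES_DIR]: report whether the "build" symlink
# that DKMS follows to find the kernel headers resolves to a real directory.
check_kernel_build_link() {
    dir=${1:-/lib/modules/$(uname -r)}
    if [ -d "$dir/build" ]; then
        echo "ok: $dir/build -> $(readlink -f "$dir/build")"
    else
        echo "broken or missing: $dir/build"
        return 1
    fi
}

check_kernel_build_link || echo "install kernel-devel for $(uname -r), then re-create the link"
```

If the link is reported as broken, comparing `uname -r` with the output of `rpm -q kernel-devel kernel-headers` shows whether the installed packages match the running kernel.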

After the NVIDIA driver installer exits with a success message, execute nvidia-smi, which should tell you that you have a correct/active Nvidia driver:

nvidia-smi

Set up Nsight with the EC2 instance

Back on your computer, launch Eclipse Nsight:

  • Click on File -> New -> CUDA C/C++ Project
  • Choose a name and a type (I chose CUDA Runtime Project)
  • Click OK on the next dialog
  • In the Target system dialog, click the Manage button next to Remote Connections. Here, you will add the EC2 machine details (user: ec2-user, host: $ip-of-instance)
  • Click Finish. Specify the remote directories. The toolkit is still at “/usr/local/cuda-7.0”. Pick a project path (I chose /home/ec2-user/test)
  • Click OK and Finish in the final dialog

Now we’re ready to build and run the project:

  • Click Project -> Build Configurations -> Set Active -> ec2-user Debug
  • Click Project -> Build All. When this completes, check that the directory you specified on the EC2 machine (/home/ec2-user/test) is populated with files
  • Click on Run -> Run Configurations, choose the remote EC2 configuration, and click Run at the bottom of the dialog. If you get output like ‘gpuSum = 23.335 cpuSum = 23.335’, you’re good to go:

[Screenshot: Nsight output]