Setting Up R in the WRDS Cloud

Learn how to set up your R working environment in the WRDS Cloud

Before You Begin

Before you can work with R and WRDS data directly in the WRDS Cloud, you need a WRDS home directory and an SSH client.

Your WRDS home directory is where you store all your R programs and data files. It is located at /home/[your_group_name]/[your_username].

An SSH client connects your computer to the WRDS Cloud and lets you enter commands and run programs in the native WRDS UNIX environment from a command-line window.

If you are unfamiliar with or need to brush up on your UNIX, see UNIX Quick Reference.
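If you connect with the OpenSSH command-line client, a host alias can save retyping the full login host name. The following is a minimal sketch, not an official WRDS procedure: the helper name wrds_ssh_entry and the alias wrds are hypothetical choices for this example, and the host name is the login host used in the SSH example later on this page.

```shell
# Hypothetical helper: print an ssh_config entry for the WRDS Cloud.
# The alias "wrds" is an arbitrary name chosen for this example.
wrds_ssh_entry() {
  # $1 is your WRDS username
  printf 'Host wrds\n    HostName wrds-cloud.wharton.upenn.edu\n    User %s\n' "$1"
}

# Usage on your own computer (commented out so nothing is modified by accident):
# wrds_ssh_entry your_username >> ~/.ssh/config
# ssh wrds
```

With such an entry in place, ssh wrds would be equivalent to typing the full ssh your_username@wrds-cloud.wharton.upenn.edu command.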

Introduction to R at WRDS

R is a programming language and software environment for statistical computing and research, with robust data analytics and data processing capabilities. It includes powerful tools for analyzing and interpreting data, notably for graphing and plotting, and is popular among data scientists and programmers alike.

WRDS provides a direct interface for R, enabling you to query WRDS data from within your R program. WRDS data is stored in a series of PostgreSQL databases and accessed using an R Postgres driver.

Alternatively, you can run R locally on your computer using RStudio to access WRDS data. For more information, see Accessing WRDS Remotely via R.

Setting Up Your R Working Environment

The first step in connecting to WRDS data from within R in the WRDS Cloud is to set up an .Rprofile file in your WRDS home directory. You only need to do this once.

The .Rprofile file first loads the RPostgres package, then creates a connection to WRDS with all the necessary parameters and saves that connection as wrds. This allows you to use wrds to connect to the WRDS PostgreSQL database servers seamlessly without having to enter the hostname, username and password each time.

To create your .Rprofile file, you first use an SSH session to connect to the WRDS Cloud, and then create a file named .Rprofile in your WRDS Cloud home directory using a text editor.

To create the .Rprofile file:

library(RPostgres)
wrds <- dbConnect(Postgres(), 
                  host='wrds-pgdata.wharton.upenn.edu',
                  port=9737,
                  user='your_username',
                  password='your_password',
                  sslmode='require',
                  dbname='wrds')

Here, your_username is your WRDS username and your_password is your WRDS password.
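If you would rather not paste the file into a text editor, the same content can be written from the shell. The following is a sketch under stated assumptions: write_rprofile is a hypothetical helper name, and the password is left as the your_password placeholder for you to edit afterwards. The umask 077 call means a newly created file is readable and writable by its owner only; it does not change the permissions of a file that already exists.

```shell
# Hypothetical helper: write a template .Rprofile to the given path.
#   $1 = target file (for example "$HOME/.Rprofile")
#   $2 = your WRDS username
write_rprofile() {
  target=$1
  username=$2
  (
    # Inside this subshell only: new files get owner-only permissions (600).
    umask 077
    cat > "$target" <<EOF
library(RPostgres)
wrds <- dbConnect(Postgres(),
                  host='wrds-pgdata.wharton.upenn.edu',
                  port=9737,
                  user='${username}',
                  password='your_password',
                  sslmode='require',
                  dbname='wrds')
EOF
  )
}

# Usage (commented out so an existing .Rprofile is not overwritten by accident):
# write_rprofile "$HOME/.Rprofile" your_username
```

After running the helper, open the file and replace your_password with your WRDS password.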

Because your .Rprofile file contains your WRDS username and password in plain text, you must restrict permissions on it. The following command makes the file readable and writable by your user account only.

To restrict file permissions:

chmod 600 ~/.Rprofile
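To confirm that the permissions took effect, you can check the file's mode from the shell. This is a minimal sketch: is_private is a hypothetical helper name, and it assumes the GNU version of stat (the -c option), which is the usual one on Linux systems.

```shell
# Hypothetical helper: succeed only if the file's mode is exactly 600
# (read/write for the owner, no access for anyone else).
# Assumes GNU stat, where -c '%a' prints the mode in octal.
is_private() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}

# Usage (commented out; run it after creating your .Rprofile):
# is_private ~/.Rprofile && echo "permissions OK" || chmod 600 ~/.Rprofile
```

The usage line fixes the permissions if the check fails, so it is safe to run more than once.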

The following session shows all of these steps together: connecting to the WRDS Cloud over SSH, creating the .Rprofile file, and restricting its permissions.

To connect to the WRDS Cloud and create the .Rprofile file:

my_laptop$ ssh joe@wrds-cloud.wharton.upenn.edu
[joe@wrds-cloud-login2-h ~]$
[joe@wrds-cloud-login2-h ~]$ nano .Rprofile
[joe@wrds-cloud-login2-h ~]$ cat .Rprofile
library(RPostgres)
wrds <- dbConnect(Postgres(), 
                  host='wrds-pgdata.wharton.upenn.edu',
                  port=9737,
                  user='joe',
                  password='mypassword',
                  sslmode='require',
                  dbname='wrds')
[joe@wrds-cloud-login2-h ~]$ chmod 600 ~/.Rprofile
[joe@wrds-cloud-login2-h ~]$ ls -l .Rprofile
-rw------- 1 joe wharton 60 Jan 1  2018 .Rprofile

Next: Submitting Jobs using R

Now that your R environment in the WRDS Cloud is configured to connect to WRDS data, you are ready to move on to Submitting R Programs.
