Linking RepRisk with Compustat

Using WRDS to connect RepRisk ESG data with Compustat (NA) data

Linking with Compustat (NA)

RepRisk is database containing information about environmental, social, governance (ESG) issues in regards to companies.

In WRDS, there are two vintages of RepRisk data:

  1. Legacy files: files in /wrds/reprisk/sasdata/pm or /wrds/reprisk/sasdata/std
  2. Current files: files in /wrds/reprisk/sasdata/v2

The following linking methodology is for the current files.

RepRisk (v2) provides a file containing company identifiers, named V2_COMPANY_IDENTIFIERS. This file, with the relevant variables, is organized as illustrated below:

RepRisk-Table-01

This table contains the RepRisk’s company id ( reprisk_id ). It also contains the primary ISIN (primary_isin) and a list of ISINs for the same company. Be aware that some companies do not have ISINs. In those cases, the only linkable information with other databases are the company names and the url. Our method focuses on the ISIN linkings.

To facilitate linking, WRDS created a simple file that extracted all ISINs from the above table. The name of this file is V2_WRDS_COMPANY_ID_TABLE.

RepRisk-Table-02
V2_WRDS_Company_ID_Table

We use the ISIN variable in this file to link with Compustat. The link is made using CUSIP. The CUSIP is extracted from ISIN (if available) by extracting 9-characters from the third position. CUSIP in Compustat is available in the NAMES file (“/wrds/comp/sasdata/nam/”) that also contains a 9-digit CUSIP.

The following sample program simplifies the process:

data rep;
set reprisk.V2_WRDS_COMPANY_ID_TABLE ;
where primary_isin ne '';
cusip = substr(primary_isin,3,9);
keep reprisk_id primary_isin cusip;
run;
proc sort data=rep out=rep1 nodupkey;
by cusip reprisk_id;
run;
Proc sort data=comp.names(keep=gvkey cusip) out=comp nodupkey;
where cusip ne '';
by cusip gvkey;
run;
data rep_comp;
merge rep1(in=a) comp(in=b);
by cusip;
if a and b;
run;

Top of Section

Linking with Compustat Global

To link RepRisk with Compustat Global, we match both datasets via ISIN. This identifier is included in the RepRisk file named reprisk.V2_WRDS_COMPANY_ID_TABLE (/wrds/reprisk/sasdata/v2). (See the above example.)

Compustat Global file G_NAMES (/wrds/comp/sasdata/global) contains GVKEY and ISIN. So this link can be done directly. This following sample code does the linking:

proc sql;
create table rep_glo as select unique a.gvkey, a.isin, b.reprisk_id
from compg.g_names(where=(isin ne '')) as a, 
     reprisk.V2_WRDS_COMPANY_ID_TABLE (where=(primary_isin ne ''))  as b
where a.isin=b.primary_isin;
quit;

Top of Section

Top