This document illustrates how to link Markit Securities Finance data with CRSP data
Brief Overview of Markit Securities Finance Data
Markit RED (RED) is the market standard for reference data in the credit markets. RED provides a unique 6-digit identifier,
redcode, for each entity in the database. In addition, it also carries several other common entity identifiers, such as 6-digit
ticker as well as company name strings.
Researchers can use these identifiers to link the CDS data with other data sources, such as CRSP and TRACE for equity and fixed income respectively.
Identifiers within MSF
MSF data does not carry separate table containing security description information. Instead, all identifying information are reported daily together with the time series data.
For the case of MSF Analytics data, the identifiers are
- DXLID- MSF unique identifier, known as DataExplorer ID as MSF was previously an independent data company DataExplorer
- CUSIP - 9 digit historical CUSIP
- QUICK - Instrument identifier used in Japan, empty for all non-Japanese securities
- InstrumentName - Security description
- MarketArea - Market classification of security defining the country or currency of trading, security type
Below is a sample security description field from MSF Analytics American Equity 2022 data.
|DX00000021||US98956P1021||2783815||98956P102||Zimmer Biomet Holdings Inc||US Equity (S&P500)|
|DX00000023||US7901481009||2768663||790148100||St Joe Co||US Equity (RUSSELL 2000)|
|DX00000026||US6934751057||2692665||693475105||Pnc Financial Services Group Inc||US Equity (S&P500)|
MFS Institutional carries very similar identifier information:
- DXLID - MSF unique identifier
- CUSIP - 9 digit historical CUSIP
- Quick - Instrument identifier used in Japan, empty for all non-Japanese securities
- StockDesc - Stock Description
- MarketArea - Market Area (Country level)
- BBGID - Bloomberg Global Identifier (FIGI), available starting in 2017
- BB_TICKER - Bloomberg Ticker, available starting in 2017
Below is a sample security description field from MSF Institutional American Equity 2022 data.
|21||US98956P1021||2783815||98956P102||Zimmer Biomet Holdings Inc||
|23||US7901481009||2768663||790148100||St Joe Co||
|26||US6934751057||2692665||693475105||Pnc Financial Services Group Inc||
Notice that though both MSF Analytics and Intuitional data carry MSF unique identifier
DXLID at instrument level, but the format of the variable in respective database is slight different.
Take the security Zimmer Biomet Holdings Inc for instance, in MSF Analytics its
DXLID = DX00000021, while in MSF Institutional its
DXLID = 21. It appears the two identifier systems follow the same internal ordering system, but has different representation. We just want to point this out in this section of discussion in case users would like to link across these two MSF databases.
Connecting with CRSP
We illustrate below using the new CIZ format of CRSP data. The logic is the same for the legacy SIZ format of CRSP data, just with different database syntax. And since CRSP is for North American trading equities , we focus on linking with the portion of the MSF data that is for American equities, and the primary linking key is through historical CUSIP that is common in both databases.
Extract MSF Identifiers
As MSF doesn't carry a separate identifier table, researchers need to extract the relevant information from the actual time series datasets. We illustrate below how to quickly extract the information from both the MSF Analytics and Institutional data.
/* ****************** */ /* MSF Analytics IDs */ /* ****************** */ * No need to assign library if you are running the code on SAS Studio; * libname msfanly ('/wrds/markit/sasdata/msf_analytics_eqty_amer', '/wrds/markit/sasdata/msf_analytics_eqty_euro', '/wrds/markit/sasdata/msf_analytics_eqty_asia', '/wrds/markit/sasdata/msf_analytics_eqty_other'); /* Consolidate all records entity information */; %macro combine; %do yr = 2002 %to 2022; data _&yr.; set msfanly.AmerEqty&yr.; keep datadate dxlid isin sedol cusip quick instrumentname marketarea; run; %end; %mend; %combine; /* Combine ids into one data */; data msfaid; set _2002-_2022; run; proc sql; create table msfaid_nodup as select distinct dxlid, isin, sedol, cusip, instrumentName, min(datadate) as stdt format date9., max(datadate) as enddt format date9. from msfaid group by dxlid, isin, sedol, cusip, instrumentName order by cusip; quit;
/* ********************** */ /* MSF Institutional IDs */ /* ********************** */ * No need to assign library if you are running the code on SAS Studio; *libname msfinst ('/wrds/markit/sasdata/msf_inst_eqty_amer', '/wrds/markit/sasdata/msf_inst_eqty_euro', '/wrds/markit/sasdata/msf_inst_eqty_asia', '/wrds/markit/sasdata/msf_inst_eqty_other'); /* Consolidate all records entity information */; %macro combine; %do yr = 2002 %to 2022; data _&yr.; set msfinst.AmerEqty&yr.; keep datadate dxlid isin sedol cusip StockDesc MarketArea; run; %end; %mend; %combine; /* Combine ids into one data */; data msfiid; set _2002-_2022; run; proc sql; create table msfiid_nodup as select distinct dxlid, isin, sedol, cusip, stockDesc, min(datadate) as stdt format date9., max(datadate) as enddt format date9. from msfiid group by dxlid, isin, sedol, cusip, stockdesc order by cusip; quit;
Link with CRSP through CUSIP
We then link the identifiers from MSF with CRSP through the historical CUSIP as common key.
* No need to assign library if you are running the code on SAS Studio; libname crsp ('/wrds/crsp/sasdata/a_stock_v2'); /* Join with CRSP by matching CUSIP */; * Prepare CRSP data with date range; proc sql; create table crspid as select distinct permno, permco, cusip, cusip9, min(SecInfoStartDt) as crsp_stdt format date9., max(SecInfoEndDt) as crsp_enddt format date9. from crsp.stksecurityinfohist (where=(cusip ne '')) group by permno, permco, cusip, cusip9 order by permno, calculated crsp_stdt ; quit; proc sql; create table _msfa_crsp as select distinct a.dxlid, a.isin, a.sedol, a.cusip, a.instrumentName, a.stdt, a.enddt, b.* from msfaid_nodup as a left join crspid as b on a.cusip = b.cusip9 order by dxlid, crsp_stdt; quit; data _msfa_crsp_link; set _msfa_crsp; where permno ne .; run; * Add CRSP Header Company Name for name comparison; proc sql; create table _msfa_crsp_link as select a.*, b.issuerNm from _msfa_crsp_link as a left join crsp.stksecurityinfohdr as b on a.permno= b.permno order by a.permno, a.stdt; quit;
In the SQL code above,
- We first create a crspid table that contains the CRSP identifiers permno, permco, cusip as well as the start and end date range of such combination in the database
- We then join this crspid table with the MSF identifiers through the common historical CUSIP link
- Lastly, we add the header CRSP company name to the linked database in case researchers would like to compare
The linking logic of MSF Institutional is the same as MSF Analytics, so we are not repeating the code here.
Note: the new CIZ CRSP has different name convention from the legacy SIZ, and in the new format, historical cusip's syntax is CUSIP, not NCUSIP.