This document illustrates how to link IBES to other Refinitiv databases
Identifiers within IBES
IBES is arguably the most widely used estimates data on WRDS platform. It covers the forecasts and actual earnings of both US and International companies. IBES carries two identifier tables:
ibes.id
from the detail history package, andibes.idsum
from the summary history package
These two tables contain the common identifiers used for companies, such as IBES Ticker, official ticker, CUSIP/SEDOL and company names. They can be found on WRDS server at:
/wrds/ibes/sasdata/
While these two tables are very similar, there's slight difference in the coverage. Based on the 2022/09 data, there are 82,995 unique IBES Tickers in the ibes.id
table, compared to 73,920 unique IBES Tickers in the ibes.idsum
table. If we further analyze the overlapping coverage, there are 354 unique IBES Tickers that are only in ibes.idsum
but not in ibes.id
, compared to 9,429 unique IBES Tickers that are only in ibes.id
but not in ibes.idsum
.
Therefore, for the purpose of this demonstration, we will be using ibes.id
table but users are welcome to explore using ibes.idsum
if they deem that approach more appropriate for their research task.
The table below shows sample records of the identifier columns from the ibes.id
table:
TICKER | CUSIP | OFTIC | CNAME | USFIRM |
---|---|---|---|---|
0 | 87482X10 | TLMR | TALMER BANCORP | 1 |
@ZD2 | FJB4ZDR7 | 6620 | MIYAKOSHI HOLDIN | 0 |
HBI | 41034510 | HBI | HANESBRANDS INC | 1 |
Parsing out CUSIP and SEDOL
Researchers can find in this table the IBES unique identifier, TICKER, as well as other common identifiers such as CUSIP, Official TICKER and company names. Here is an important caveat that researchers should be aware of: according to the IBES documentation, the variable CUSIP in the ibes.id
table is actually dubbed as CUSIP or SEDOL:
" For US companies, the value that appears in the CUSIP/SEDOL field will always be a CUSIP. SEDOLs are used for non-US companies."
Therefore, we can rely on this CUSIP column in the ibes.id
table to extract CUSIP information for US companies, and SEDOL information for international firms. Furthermore, IBES explains how one should parse out the SEDOL information from the the CUSIP column in ibes.id
:
"The SEDOL field consists of a country code followed by the first six digits of the official SEDOL (the last digit of a SEDOL is a check digit and is not used by I/B/E/S)."
Hence, we can parse out SEDOL information by taking digit 3 through 8 of this particular column. Below is an example of how SEDOL is parsed out from the CUSIP column:
TICKER | CUSIP | OFTIC | CNAME | USFIRM | Parsed SEDOL |
---|---|---|---|---|---|
@0 | EKB05BBN | UPET | UPET | 0 | B05BBN |
Alternatively, users may consider taking advantage of the COUNTRY column included in the ibes.idsum
table for the pre-parsed two digit country code when slicing out the SEDOL information from the CUSIP column.
Connecting with Refinitiv
To connect the universe of IBES companies with other Refinitiv data, researchers should use CUSIP as the linking key for US companies and SEDOL for international firms.
On the Refintiv front, users can rely on the vw_securitymasterx
table to look up company identifier information such as CUSIP and SEDOL. For a detailed discussion of the Refinitiv master tables and how to link within the Refinitiv data system, please refer to the Linking within Refinitiv documentation.
More on Parsing Out SEDOL
Before diving into the actual reference code, there is yet another caveat of IBES's CUSIP column from ibes.id
table that researchers should be aware of:
If we read in between the lines of the statement defining this variable listed above, while US companies always report CUSIP in this particular column, it is not necessarily the case that all international companies report SEDOL.
As a matter of fact, it is quite common for international companies (USFIRM = 0) to report CUSIP instead of SEDOL in this column (see table below for several examples). Hence, one can not naively rely on the dummy variable USFIRM to determine if the content of the CUSIP column is indeed CUSIP or SEDOL.
TICKER | CUSIP | OFTIC | CNAME | USFIRM |
---|---|---|---|---|
A1G1 | G0259M10 | ALD | ALLIED GOLD PLC | 0 |
AAC1 | 01855R10 | AAC | ALLIANCE COM 'B' | 0 |
ABH1 | 76117W1X | RFP | RESOLUTE FOREST | 0 |
Given this data irregularity, when linking IBES with the rest of Refinitiv, we will opt to first treat all records in the CUSIP column as CUSIP: in other words, we first link all entities using the CUSIP column presuming it is indeed CUSIP. The logic is, given the fact that CUSIP is 8-digit long and a combination of letters and numbers, it is highly unlikely that a SEDOL with leading country code from ibes.id
table could accidentally be mapped with a valid CUSIP from Refinitiv's vw_securitymasterx
table.
Once all the records mapped with presumed CUSIP are taken out, we can then parse the remaining records following the SEDOL convention, assuming they meet the condition of having leading two-digit country code.
Reference SAS Code
Below is SAS code to illustrate the linking logic described above. It is meant as a reference for researchers to start their own linking procedure. Please feel free to modify the code to suite your research design.
Please make sure to select "SAS" in the kernel list when running the code using Jupyter Notebook.