The stock price tables (dsf, msf) are time series: they have one row per period (per day in dsf, per month in msf) for each company. The event tables have one row per event; they record only changes.
There are 5 types of events:
- Names History (NAMES).
- Delisting Event Histories (DELIST).
- Distribution Events (DIST).
- NASDAQ Event Information Histories (NASDIN).
- Shares Event Histories (SHARES).
There are tables for end-of-day and end-of-month frequencies for each event type (e.g. dsenames, msenames) as well as tables of all events together (dse, mse). Additional tables dseall, mseall, and stocknames were created by WRDS and are described below.
The DATE variable in the Monthly Stock Events (mse) file means something different for each event type. In the msenames file it is the Name Date (NAMEDT), in msedist it is the ex-distribution date (EXDT), in msedelist it is the delisting date (DLSTDT), in mseshares it is the shares observation date (SHRSDT), and in msenasdin it is the NASDAQ Traits Date (TRTSDT).
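The mapping above can be kept as a small lookup table so that code reading the event files labels DATE correctly. The following Python sketch is only illustrative: the dictionary name and the helper function are hypothetical and are not part of CRSP or WRDS.

```python
# Meaning of the DATE variable in each monthly event file, as listed above.
# MSE_DATE_MEANING and date_variable_for are hypothetical names for this sketch.
MSE_DATE_MEANING = {
    "msenames": "NAMEDT",    # Name Date
    "msedist": "EXDT",       # ex-distribution date
    "msedelist": "DLSTDT",   # delisting date
    "mseshares": "SHRSDT",   # shares observation date
    "msenasdin": "TRTSDT",   # NASDAQ Traits Date
}


def date_variable_for(table_name):
    """Return the variable that DATE stands for in the given event table."""
    return MSE_DATE_MEANING[table_name.lower()]


print(date_variable_for("msedist"))
```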
The following table shows a slice taken from the mse file for Microsoft.
Table 4: mse
In the mseall file, items associated with a one-time event, such as dividend cash amount (divamt), are not carried forward to the next observation. If there are multiple one-time events within one month (e.g. see the month of November 2004 for Microsoft), multiple observations will appear for the same date (msi.date).
The stocknames file is a cross between dseall and dsenames. It has only the most important identification variables, eliminating much of the noise of dseall. It adds an effective date range for the identifiers (NAMEDT, NAMEENDDT) for each set of NCUSIP, COMNAM, TICKER, and EXCHCD variables, and a date range for price data (ST_DATE, END_DATE). CUSIP and HEXCD are header variables, while SICCD, SHRCD, and SHRCLS reflect the status as of the name start date (NAMEDT). stocknames is the most popular event file.
The example below shows slices taken from the stocknames file for Microsoft and Dell. In 2003, Dell changed its CUSIP from 24702510 to 24702R10; the two rows in stocknames reflect this change.
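The effective date range (NAMEDT, NAMEENDDT) makes it possible to find the identifiers that were in force on a given date. The Python sketch below shows one way to do that lookup on rows already loaded into memory. The two CUSIP values come from the Dell example, but the dates are placeholders chosen for illustration and are not actual CRSP values. The function name is hypothetical.

```python
from datetime import date

# Illustrative rows only: the CUSIP values are those quoted for Dell,
# but the dates are PLACEHOLDERS, not actual CRSP values.
stocknames_rows = [
    {"ncusip": "24702510", "namedt": date(1990, 1, 1), "nameenddt": date(2003, 6, 30)},
    {"ncusip": "24702R10", "namedt": date(2003, 7, 1), "nameenddt": date(2010, 12, 31)},
]


def identifiers_on(rows, as_of):
    """Return the row whose [namedt, nameenddt] range contains as_of, or None."""
    for row in rows:
        if row["namedt"] <= as_of <= row["nameenddt"]:
            return row
    return None


match = identifiers_on(stocknames_rows, date(2002, 3, 15))
print(match["ncusip"] if match else "no identifiers in force on that date")
```

Both ends of the range are treated as inclusive here, which is an assumption of the sketch.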
In CRSP, the variable BEGDAT from table dsfhdr is often used as a rough estimate of the first date of trading after the initial public offering (IPO). The document "WRDS Guide to IPO Databases and Research" discusses this topic in further detail.
The Standard Industrial Classification (SIC) code is used to group companies with similar products or services. The SIC code is an integer between 100 and 9999. The first two digits refer to a major industry group, the third digit identifies an industry group, and the fourth digit indicates the industry.
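Because the code is stored as an integer, values below 1000 lose their leading zero and need to be padded to four digits before the digit positions can be read. A minimal Python sketch of the decomposition described above follows; the function name and the returned keys are hypothetical.

```python
def split_sic(sic):
    """Split an integer SIC code into its hierarchical prefixes."""
    if not 100 <= sic <= 9999:
        raise ValueError(f"SIC code out of range: {sic}")
    code = f"{sic:04d}"  # restore the leading zero, e.g. 100 -> "0100"
    return {
        "major_group": code[:2],     # first two digits
        "industry_group": code[:3],  # first three digits
        "industry": code,            # all four digits
    }


print(split_sic(7372))
print(split_sic(100))
```

Grouping on the two-digit or three-digit prefix is a common way to build the rough industry groupings for which SIC codes are best suited.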
In CRSP, the SIC code variable (SICCD) contains the historical SIC information. The HSICCD variable contains the header, or most recent, SIC code. HSICCD is available in the stock files; for SICCD, however, you will have to look in the event files.
People often report differences between the SIC codes in CRSP, Compustat, and other databases. These arise because CRSP and Compustat obtain SIC codes from different sources. Compustat assigns SIC codes by analyzing a company's 10-K and annual report.
In the December 2009 stock database, CRSP removed SIC codes provided by Mergent from the Stock Databases and replaced them with SIC codes from Interactive Data Corporation (IDC). Mergent was the primary source of SIC codes for NYSE, NYSE MKT, and ARCA securities between August 24, 2001 (20010824) and 2009. IDC has been a continuous alternate source of SIC codes throughout.
SIC codes can be useful for rough groupings of industries. Beyond that, they should be used with caution, as they are not assigned or reviewed under a strict procedure by any government agency. Moreover, most large companies fit more than one SIC code, and a company's classification can change over time. After the initial SIC code assignment when a company goes public, government agencies do not refer to the code or the company again, and quite often a company will report its initial SIC code forever. There have been cases in which companies have obsolete SIC codes from the 1972 numbering scheme in SEC filings from the early 1990s.
In CRSP, market capitalization, the product of price and shares outstanding, is computed as follows: