Archive
Using Merge to select results in an event study
There are times when the database query doesn’t match the data that you are trying to obtain. For example, you have a list of X companies and for each company the date of an event E (e.g. a merger) but the database query has to be in terms of a list of X companies and a date range Ya-Yb.
WRDS is a typical example – you have a list of 100 gvkey – year pairs where you want some accounting data from Compustat NA (North America). The company identifier gvkey is used to identify companies in Compustat so you can use it in WRDS Compustat queries.
Option 1 – Multiple WRDS requests
Sort your list by the year and create a company list for each year, say N lists. You then make N WRDS-Compustat requests using the relevant company list and year. Finally you combine the N results.
This option is good if you have to make a few requests (i.e N is small)
Option 2 – One WRDS request and select the results you want
Calculate the range of years that you need. You then make one WRDS-Compustat request covering all these years.
If you have 1990-2010 you could now have a result set with 2100 observations (21 years times 100 companies) and you need to select the 100 relevant ones.
You can do this selection using the Merge command in a Statistics package (the following uses Stata as an example)
- Decide on the variable names in your WRDS results that are going to identify the observations of interest (for example gvkey and fyear)
- If necessary, Convert the WRDS results into a Stata dataset and save this.
- Create a Stata dataset of your 100 gvkey-year pairs using the same variable names as in your WRDS results dataset (e.g. gvkey and fyear).
(If necessary you may need to note and eliminate duplicates from this dataset – Stata commands duplicates list and drop duplicates) - Merge your gvkey-year dataset with your WRDS results:
merge gvkey fyear using “P:\Projects\BMAN70162\acqwrds.dta”, unique sort
(Note ”P:\Projects\BMAN70162\acqwrds.dta” is the file saved at step 2) - You should have one match for each observation in your gvkey-year dataset. This is indicated by the new Stata variable _merge = 3.
Observations with _merge = 1 indicate there was no matching entry in the WRDS result dataset – this might require further investigation. - Observations with _merge = 2 are all the extra results you want to eliminate.
This technique does not depend on Stata – other Statistics packages provide similar merge functions but the detail of the commands will vary. (Indeed, it is probably possible to do this in Excel.)
Researching Financial Markets – equities, indicies, …
A quick list of our five most-used financial market databases, which I hope will make a little clearer the strengths of each from a research perspective.
WRDS – WRDS (Wharton Research Data Services) is the premier research database for business school academics worldwide. It provides a Web-query interface that is designed by researchers for researchers. At the University of Manchester, our subscriptions include: Compustat (North America), CRSP (stocks, indices, treasuries), CRSP/Compustat Merged (CCM), IBES (US and International), Thomson Reuters US Ownership (mutual funds, institutional-13f, insiders). For a fuller list see - Which databases are available through WRDS? In summary, if you are researching the US financial market then WRDS is well worth a look.
Datastream - Datastream from Thomson Reuters covers financial markets world wide. It is the most used database for quantitative research on UK equities. Licence and software restrictions mean that it is only available on specific PCs. Datastream’s coverage includes equities, equity indices, constituent lists, interest rates, economics. It includes Worldscope for company accounts, and IBES summary data.
Thomson One Banker – Thomson One Banker, also from Thomson Reuters, is a database that is aimed at financial market professionals rather than researchers. Compared with Datastream there is significant overlap, including Worldscope and IBES summary data, but it does not provide as much historical data, does not include most inactive (dead) companies, and does not cover interest rates and economics. However, it does have a Web-interface and this gives access to modules - Deals Analysis, Ownership and Private Equity – that are not part of Datastream.
SDC Platinum – SDC Platinum, again from Thomson Reuters is a database that covers deals: mergers, acquisitions, IPOs (initial public offerings), secondary offerings.
There is significant overlap between SDC Platinum and the Deals Analysis module in Thomson One Banker, but researchers often prefer SDC Platuinum because it doe not have the same restrictions on downloading data. See SDC Platinum and TOB deals for more detail.
Bloomberg - Bloomberg Professional, from Bloomberg, covers financial markets worldwide. It is also aimed at financial market professionals – providing lots of functions that give users an overview of the current state of the market. Like Datastream Bloomberg is only available on specific PCs. Bloomberg provides descriptive information, research and financial statistics on public companies worldwide; financial market information on currencies, commodities, bonds, indices, interest rates, derivatives; economic statistics; and over 3,000 business news stories per day. Bloomberg does not provide as much historical data as more research-oriented databases like WRDS and Datastream.
Professional development perspective
If you are interesting in getting a job in the city then your priorities are different from a researcher- you would probably choose to look at Bloomberg first, then Thomson One Banker, Datastream, WRDS and SDC Platinum.
US company data on WRDS
WRDS (Wharton Research Data Services) is used by researchers worldwide to get financial and accounting data, especially on US companies. WRDS gives access to the respected CRSP and Compustat NA (North America) databases. It also provides a web interface to make it as easy as possible for researchers to download the data they require. and there is help available from other researchers. See About WRDS for more detail.
Example screencast videos of getting data using WRDS:
http://screencast.com/t/DuQmA4j7Id
- Monthly returns from CRSP
http://screencast.com/t/BpijU5FIY6x
- Fama French factors
http://screencast.com/t/2vt8uRiF
- long-term debt from Compustat NA Fundamentals Quarterly
http://screencast.com/t/cr9p3Nv1
- S&P domestic long-term issues credit rating from Compustat NA Ratings
Note: The University of Manchester does not subscribe to all the databases available through WRDS -See FAQ answer on WRDS subscription.
Earlier WRDS post (March 2011) – for full list of WRDS-related post use the tag cloud on the right-hand side.
There is lots of help available when using WRDS: try the “e-learning” and “support” tabs.
Apple users have reported problems with .xls result files. An alternative is to select the .csv (comma separated variables) format. This can be easily imported into Excel. You don’t get the nice .xls formatting but the data is the same.
S&P 500 (Standard and Poor’s 500 Index)
The S&P 500 (Standard and Poor’s 500 Index) is one of the most quoted stock market indices. It is designed to measure performance of the broad US economy through changes in the aggregate market value of 500 stocks representing all major industries. As a widely used index, the S&P 500 is available on several financial databases.
If you are researching US stocks, you may be using WRDS (Wharton Research Data Services) where the S&P 500 is available, but is referred to as S&P’s Composite Index. For example you can get the S&P 500 index level from
- Web Query : CRSP – Annual Update – Index ; S&P 500 Indexes – Index File on S&P 500 : Variable : Level of S&P Composite Index
- Web Query : CRSP – Annual Update – Stock / Security Files – Stock Market indexes : Variable : Level of S&P Composite Index
Note that the “Annual” in CRSP – Annual Update refers to the frequency that the database is updated not the data in the database. You can get monthly and daily data from CRSP – Annual Update. The “Annual” means in Dec 2011 we have access to the data to the end of 2010. There will be an annual update around March 2012 and we will then have access to the data to the end of 2011. (The University of Manchester does not subscribe to the more expensive CRSP Quarterly Update, or CRSP Monthly Update.)
If you want to get recent data values for the S&P 500, or if you are not a WRDS user, there are other source of data on the S&P 500:
- Bloomberg – SPX Index
- Datastream – S&PCOMP (S&P 500 Composite)
- Thomson One Banker – S&PCOMP (S&P 500 Composite)

For related information on S&P 500 constituents or total returns:
Where can I find constituent lists for the S&P indices?
How do I find data on total returns for stocks, bonds, and indexes?
[Thanks to the PhD student who highlighted the difficulties in finding S&P 500 on WRDS.]
Historical Index constituents (e.g FTSE 100)
Most best known stock market indices (e.g. FTSE 100, S&P 500, Nikkei 225, DAX, Shanghai SE, BSE Sensex, Bovespa) have constituents that change over time. For example, the FTSE 100 is the largest companies (by market capitalisation) on the London Main market and will change: as some companies grow faster than others, through merger and aquisition activity, and when large companies list.
While the current constituents are usually readily available on the web, getting the historical constituents often requires a specialist database. Both Thomson One Banker and Datastream provide historical index constituents.
Thomson One Banker (Web Interface) – Indices Module – Lookup the index and select Constituents. The initial results are current – edit date to get historical values. (See screenshot)
Thomson One Banker (Excel add-in) – (Reports) – Index Reports – Thomson Datastream Index Service – Historical Index Constituents – Lookup the TOB Key for the index and select the month you want.
Using Datastream you need a little knowledge of the DS Mnemonics for constituent lists. For example:
- LFTSE100 is the constituent list for the current FTSE100
- LFTSE1000196 is the oldest historical list (Jan 1996 – 0106) and
- LFTSE1000410 much more recent (Apr 2010 – 0410).
Other historical constituent lists follow the same pattern. Not all the historical constituent lists are directly selectable in the Datastream Navigator, you may have to manually edit to get the code for a specific month and year e.g. LFTSE100MMYY.
For a detailed example see Historical FTSE100 Index Constituents on Datastream (July 2012)
Bloomberg also has a member function (MEMB) that can be used to give a list index constituents. Historical information (typically from 2001) is available by using the “Edit” option to change the date.
WRDS provides access to S&P index constituents.
Related questions on Manchester Business Answers 24/7:
WRDS – Wharton Research Data Services
WRDS is regarded as the premier financial research database at business schools worldwide. Strictly, WRDS is not a database itself but a web-based research service. The University of Manchester subscribes to databases (CRSP, Compustat, IBES, etc. ) from their providers, and also to access using WRDS because of the convenience this offers. Several key factors contribute to its popularity:
- Convenient web-based access for databases through WRDS
CCM (CRSP Compustat Merged) example - Web-based queries follow the same style for all the databases making WRDS easier to learn.
- Access to CRSP and Compustat, the top-rated databases for financial research on the US market
- Service designed by researchers for researchers: online documentation, able to cope with large volumes of data, and online support.
In general, WRDS is the first choice for most researchers so long as it has the data they need. Thomson Reuters Datastream is our most-used database for UK and Worldwide financial market research, where WRDS is heavily US-centric.
The Databaser blog has a good range of tips on using WRDS – For example
Note that the University of Manchester WRDS subscription does not include Compustat Global.
Manchester Business Answers 24/7 – WRDS-related questions
Note: WRDS is accessed through either an individual or class username and password, to ensure that you have agreed to the WRDS terms and conditions before using the system. See FAQ answer: How do I get access to Wharton Research Data Services(WRDS)?
Company Credit Ratings
Obtaining company credit ratings can be difficult. The credit ratings agencies traditionally rate the specific bonds, or other debt instuments, of a company rather than the company overall. In some situations this information is combined to give an overall company credit rating.
Where can I find bond credit ratings for a particular company?
For North American companies, there is a Compustat Monthly Updates – Ratings query on WRDS that gives the S&P Domestic Long Term Issuer Credit Rating (SPLTICRM).
Video demo of getting SPLTICRM for S&P 500
http://screencast.com/t/Z2i4CByn7
(using a list of gvkeys for the S&P 500 constituent list for Dec 2010)
Bloomberg’s function CRPR (Company Credit Ratings) gives ratings (where available) from Moodys, Standard and Poors and Fitch.
See also post
http://finabase.blogspot.com/2010/02/databases-offering-debtcredit-ratings.html
which describes tips from Vrije Universiteit.
Thomson One Banker Excel Add-in and IBES
We recently got access to IBES (I/B/E/S – Insitutional Brokers’ Estimate System) from Thomson Reuters on WRDS. This is excellent for researchers who want to access large quantities of brokers’ estimate data, for example when dealing with the detailed estimates collected on an individual broker basis.
![]()
Summary (or Consensus) IBES data is also available on Thomson One Banker and Datastream. This consensus data is gathered monthly so gives a record of how brokers’ expected/expect a company to perform in terms of several variables. (Earnings per Share (EPS) is the most forecast of these.)
Thomson One Banker Excel Add-in IBES example is a spreadsheet that uses the Thomson Reuters PFDL functions to retrieve IBES data. (You need the TOB Excel add-in installed to use this.)
Related resources:
TAQ (intraday) data on WRDS
We now have access to the consolidated TAQ (Trade and Quote) data on WRDS for 1993 – 2010.
“The Trade and Quote (TAQ) database contains consolidated intraday transactions data (trades and quotes) for all securities listed on the New York Stock Exchange (NYSE), American Stock Exchange (AMEX), Nasdaq National Market System (NMS) and SmallCap issues, as well as stocks traded on Arca (formerly Pacific Stock Exchange) and other regional exchanges.”
from Using TAQ Data Efficiently on WRDS – Essential reading for anyone doing research with these very large datasets.
related FAQ answers:
Where do I find intraday quotes for equities and indices?
Note recent intraday data is also available on Bloomberg and FactSet ExcelConnect.
Which databases are available through Wharton Research Data Services (WRDS)?
WRDS website and access
WRDS (Wharton Research Data Service) is not really a database itself. It is a system designed to give researchers access to a range of financial databases (CRSP, Compustat, CCM, Compustat Execucomp) though a common interface, and with integrated support.
WRDS recently produced a new web site, so at the moment some data is available from both the new, improved web interface, and the older classic one.
Access to WRDS is via a username and password. MSc students can use a “class account” – email bids@mbs.ac.uk for login details. PhD students and academic staff can apply for their own username and password (this provides extra capabilities for running SAS programs on the WRDS system.)
WRDS related questions on Manchester Business Answers 24/7





