Home > Business Databases > Using Merge to select results in an event study

Using Merge to select results in an event study

There are times when the database query doesn’t match the data that you are trying to obtain. For example, you have a list of X companies and for each company the date of an event E (e.g. a merger) but the database query has to be in terms of a list of X companies and a date range Ya-Yb.

WRDS-iconWRDS is a typical example – you have a list of 100 gvkey – year pairs where you want some accounting data from Compustat NA (North America). The company identifier gvkey is used to identify companies in Compustat so you can use it in WRDS Compustat queries.

Option 1 – Multiple WRDS requests

Sort your list by the year and create a company list for each year, say N lists. You then make N WRDS-Compustat requests using the relevant company list and year. Finally you combine the N results.

This option is good if you have to make a few requests (i.e N is small)

Option 2 – One WRDS request and select the results you want

Calculate the range of years that you need. You then make one WRDS-Compustat request covering all these years.

If you have 1990-2010 you could now have a result set with 2100 observations (21 years times 100 companies) and you need to select the 100 relevant ones.

You can do this selection using the Merge command in a Statistics package (the following uses Stata as an example)

  1. Decide on the variable names in your WRDS results that are going to identify the observations of interest (for example gvkey and fyear)
  2. If necessary, Convert the WRDS results into a Stata dataset and save this.
  3. Create a Stata dataset of your 100 gvkey-year pairs using the same variable names as in your WRDS results dataset (e.g. gvkey and fyear).
    (If necessary you may need to note and eliminate duplicates from this dataset – Stata commands duplicates list and drop duplicates)
  4. Merge your gvkey-year dataset with your WRDS results:
    merge gvkey fyear using “P:\Projects\BMAN70162\acqwrds.dta”, unique sort
    (Note “P:\Projects\BMAN70162\acqwrds.dta” is the file saved at step 2)
  5. You should have one match for each observation in your gvkey-year dataset. This is indicated by the new Stata variable _merge = 3.
    Observations with _merge = 1 indicate there was no matching entry in the WRDS result dataset – this might require further investigation.
  6. Observations with _merge = 2 are all the extra results you want to eliminate.
Stata Data Browser viewing merge results (click to expand)

Stata Data Browser viewing merge results (click to expand)

This technique does not depend on Stata – other Statistics packages provide similar merge functions but the detail of the commands will vary. (Indeed, it is probably possible to do this in Excel.)

Categories: Business Databases Tags: ,
  1. No comments yet.
  1. 25 July 2013 at 8:05 am

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.