Home > Business Databases > Linking databases on company and date

Linking databases on company and date

Health warning: This post illustrates linking databases using SAS. SAS has many great features but being easy to use for beginners is not one of them. The best software for linking databases will depend on the linking that you want to do, the software packages available and your expertise.

This example is based on discussions with several students who have wanted to do similar linking when gathering their dissertation data. We have a dataset compids containing company ids, an event date, and several other variables. This needs to be linked with a dataset prices of monthly price data covering the duration of the study.

  • First, check that the two datasets have the same company ids. This is essential to link the data.
  • Second, decide the other criteria for linking/matching the two datasets. In this example we want the monthly price for the month before the event date and the latest monthly price before the event date.

The combined dataset is shown in the screenshot below. Notes that the date from prices (label Names Date) is the trading end of month date just before the event date (DA) from compids.


SAS join to link datasets (click to expand)

Now (for those interested in the detail)  the SAS code. The first part creates a new table dataset from compids (a), the event data, and prices (b) the monthly price data.

proc sql;
create table sample1_price as select
b. date, b. prc, b. comnam
from compids as a left join prices as b
on a. permno_a = b. permno and a.da >= b.date;

This proc sql statement links the databases on the company ids,  a. permno_a = b. permno,  and matches each observation in compids with all the monthly prices before the event date,  a.da >= b.date.

The left join was chosen because we are matching each row in the event dataset, compids (a), with many rows in the prices dataset.

The second part is to filter the new sample1_price dataset and keep only the last monthly price before the event date.

proc sort data= sample1_price;
by DA permno_a descending date;

proc sort data= sample1_price nodupkey;
by DA permno_a;

This uses two proc sort statements. The first sorts the dataset by date and company, DA and permno_a, which identify each event, and then by date, with the newest date first. The second sorts by date and company again but this time only keeps the first row (observation) for each DA permno_a pair. The previous sort means that this will be the newest (largest) date before the event date.

The fact that you end a proc sql statement with quit; and a proc sort with run; is just part of the syntax of SAS. You also have to learn to get semicolons (;) in the right places.

In this example the compids dataset was imported into SAS from a .csv file and the prices dataset downloaded directly from CRSP on WRDS in SAS format. The sample1_price dataset created was exported into a .csv file, which can be loaded into Excel. (There was a problem with the export directly in .xls format.)

The prices dataset has labels for all its variables so the dataset in the screenshot has a column labelled Names Date rather than the variable date (which is used in the code). The compids dataset has no labels and so the columns use the variable names. You can use SAS statements to change the labels.

SAS copes well with large datasets. However, do try your code on a small sample set first, our compids was 24 rows,  – it will be quicker in the long run.


You could choose to calculate the month and year of the price date you want from the event date and then link the databases on company id, year and month.

There is a slightly simpler example using this approach in the February post Using Merge to select results in an event study.

Thanks to the anonymous MSc student whose original code was the basis for the example above.

Categories: Business Databases Tags:
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.