Archive

Posts Tagged ‘Compustat’

Getting a linking table from Compustat via WRDS Cloud using SSH and SAS

27 June 2017 Leave a comment

Researchers commonly need to combine data from multiple sources. Even when the sources are offered through the same platform such as Wharton Research Data Service (WRDS), this can still be a difficult process with many choices. Essentially, you need to find a common entity identifier in both sources; if none exists, find a reference table or other third source.

The help provided by WRDS is often sparse in detail and tailored for their more advanced users. I have previously written in detail about using WRDS Cloud and SAS to gain access to resources not available on the WRDS website, in the blog post Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS.

In this post, I will suggest a way to link I/B/E/S data with Compustat North America data. I will summarise the points that are addressed in the aforementioned blog post.

Get a linking table from Compustat North America

You will need the following to begin.

  1. AUTHENTICATION. Your WRDS username and password, specifically an individual PGR or Academic login.
  2. SENDING COMMANDS. Windows users: the Putty program. Mac/Linux users: the Terminal app or command line.
  3. FILE TRANSFER. Windows users: WinSCP or FileZilla. Mac/Linux users: CyberDuck or FileZilla or command line.
  4. FILE EDITING: Notepad, Notepad++ or another text editor. Excel or other spreadsheet application.

The method is very similar to getting the S&P 500 constituents. Essentially, Compustat has a dataset called securities which lists all the identifiers for all the securities. It is in SAS format. It cannot be accessed directly by us, but we can write a SAS script to read its entire contents to a plain text file, then copy the file to our own PCs. This file is kept at location /wrds/comp/sasdata/d_na/security/security.sas7bdat for North American data and is updated often.

Your script to read this file should look like this:

options nosource nodate nocenter nonumber ps=max ls = max;
title ' ';
options source;
proc print data=comp.security noobs width=UNIFORM;

This is different from last time in a few ways.

  1. The ‘ls’ or ‘line size’ option has been changed from ’72’ to ‘max’. This is because there are more columns for Compustat securities than for S&P 500 constituents.
  2. The data table we are reading from is ‘data=comp.security’.
  3. The ‘labels’ option has been removed so we get the clean variable names as the header.
  4. The ‘width’ option is set to ‘UNIFORM’ so the columns are the same number of characters wide on each ‘page’.

The method is as follows; remember to refer to the aforementioned blog post.

  1. Save the SAS script to a new text file on your computer, compsecurity.sas or similar.
  2. Use WinSCP or equivalent to copy this file to your own space on WRDS Cloud.
  3. Use Putty or equivalent to run the script, log in and type qsas compsecurity.sas and press Enter.
  4. Wait 10 seconds and use WinSCP to download the generated files compsecurity.log and compsecurity.lst to your computer. The ‘lst’ file contains the data, if the program was successful (read the ‘log’ file to check).
  5. Use Notepad++ or equivalent to remove the blank lines from the top of compsecurity.lst and save it as compsecurity.txt – then check every 32,764 lines for a few more blank lines and repeat of the headers to be removed (these are ‘page breaks’).
  6. In Excel, open compsecurity.txt (via Data > Get External Data from Text) and use fixed column width option, carefully setting the locations for the column breaks, remembering to set most columns as Text format instead of General (particularly GVKEY and CUSIP). Save as CSV or Excel format.

Compustat securities via WRDS Cloud

Using the linking table

You will need the columns GVKEY and IBTIC (IBES Ticker).

  • If your Compustat and I/B/E/S data are in Stata format, import this table and consider joining the tables.
  • If your Compustat and I/B/E/S data are in Excel format, use VLOOKUP formulas (some more processing may be required).

Still to do: there is not necessarily a one-to-one match between these two identifiers. The same GVKEY will appear on more than one row, but will those rows contain more than one unique ticker? And is the data frequency the same in each source? You will need to find the answers to these questions out for yourself.

Use pairs in one VLOOKUP with historical exchange rates from Datastream

20 July 2016 Leave a comment

One of the most useful Excel functions I help students with is VLOOKUP – extremely useful to automatically bring in data from one table to another. If you have company accounts data from Compustat Global (via WRDS) for multiple companies in multiple currencies, you will need to bring in the change rate for each currency to show every field in US Dollars.

A B C D E
1 Company Currency Total Assets Local Exchange rate Total Assets USD
2 Co. X AUD 100 =VLOOKUP(B2, currencies, 2, FALSE) =C2*D2
3 Co. Y PLN 400 =VLOOKUP(B3, currencies, 2, FALSE) =C3*D3

Where ‘currencies’ is a named range elsewhere in the workbook:

From Currency Rate
AUD 1.3158
PLN 3.95685
USD 1

The VLOOKUP command looks for the currency code in the first column and returns the rate from the second column (use ‘FALSE’ to ensure an exact match). The conversion is simply a multiplication. (Note that you can convert from USD to USD at a rate of 1.0.) But what about making sure that the exchange rate is correct for the date of the data, and what if you have data from multiple years?

Performing VLOOKUP when matching a pair of variables

Thanks to Professor Marie Dutordoir for suggesting this kind of technique!

Usually you will want to look at accounts data over several years and for several companies. If you are working with several currencies then you will need a more complicated solution to this VLOOKUP exchange rate table, considering a day/month/year for each rate-currency pair. You do not need a second lookup table or other Excel functions, you just need to have another variable of date and a variable of currency-date pairs.

Global company accounts data

Global company accounts data. Given column G (currency) and column H (year), create new column F (map = currency&year).

The new column F (mapping) is formed from currency and year, so F2 contains “=G2&H2” (see image above). The ‘currencies’ named range now begins with a different first column produced in the same way (see image below). A VLOOKUP always matches on the first column of its range, and now this one pulls the rate from the fourth column, so rate is “=VLOOKUP(F2, currencies, 4, FALSE)” for row 2.

In the ‘currencies’ named range, I have combined the exchange rate calculation and lookup into one formula (column K). Repeat this for each data type that you wish to convert the currency for. Remember to have an entry for USD at rate 1 for each year.

paired-vlookup-lookup

Stacked historical exchange rates in a named range ‘currencies’. Created column A (map) from B&C.

This example only has the average exchange rate for each year, you may wish to extend this approach for monthly or even daily rates. If you do this, you will need to take extra care with the format of the mapping cells, but this exceeds the scope of the post.

Where to find historical exchange rates?

Use the Datastream Navigator to find exchange rates

Use the Datastream Navigator to find exchange rates. Start by searching one category (Exchange Rates), limit it “To Currency: United States Dollar”, then use the search box for the source currency.

 

US tickers to Datastream mnemonics

31 March 2016 Leave a comment

A handy summary for those wishing to use a list of US companies Compustat in a Datastream query. In summary, a well-chosen prefix to the ticker symbol can be systematically added in Excel before creating a static or time series Datastream request. All you need to know is if the company is listed on Nasdaq or elsewhere. Thanks to EDSC for this post.

EDSC manuals, tips & tricks

If you have US tickers as output  from e.g. Compustat index constituents and you want to get stock prices from Datastream that’s possible. You can use a ticker to create a Datastream mnemonic. You have to add country code, so “U:” would do the trick. You can easily do this in Excel by concatenate U: with the column with ticker.

This solution works fine for all but the Nasdaq firms. Those will need a “@” sign in front not “U:” So just to be sure you have everything and you don’t miss any data do both so a list with “U:” and one with “@”.

Example

Screenshot_USticker_mnemonics_v2

The ones that don’t make sense will give errors but this way you won’t miss any. So with Ford U:F will work and give data, @ F won’t it will give an error. Similarly Micosoft @MSFT will work and give data, U:MSFT won’t it will give an error

View original post

A quick look at Research Quotient: Analyse and measure the effectiveness of a firm’s R&D

19 January 2016 Leave a comment

Research QuotientA new resource has been added to our collection on WRDS called Research Quotient. It collates increase in total revenue with estimated investment in research and development (R&D) between 1974 and 2010. From their publicity materials:

Research Quotient = percentage increase in revenue from a 1% increase in R&D.
RQ is the output elasticity of R&D. RQ offers a universal, uniform, and reliable measure of a firm’s R&D productivity.

It is available now, for researchers at The University of Manchester, for US firms only. It is searchable by Compustat GVKEY and SIC codes, or by looking up companies by name one at a time.

See also: WRDS access and access.

From ISIN codes to Compustat (via Capital IQ using GVKEY, CUSIP and Name)

17 December 2015 Leave a comment

Given a list of North American company ISIN codes from the Markit credit default swap database (via a lookup database), you may want to join in quarterly company fundamentals from Compustat in WRDS. Compustat does not accept ISIN codes; you have to use an intermediate source such as Capital IQ and one or more intermediate data types (GVKEY, CUSIP and Company Name).

Note, you must be using a PC with the Capital IQ Excel add-in set up. This is in the Library Finance Zone.

Method

  1. Use Capital IQ Office Plug-in (CIQ) with a column of ISIN codes in Excel. Put these in column A.
  2. In the next column, B, get the values of column A with a prefix of “I_” so CIQ can interpret them as ISINs. Cell B2 contains =”I_”&A2
  3. Next get the GVKEY code in column C with a CIQ formula. Cell C2 contains =CIQ(B2, “IQ_GVKEY”) — the output will need cleaning up to remove the “GV_” prefix and any multiple matches.
  4. Then the CUSIP codes in column D. Cell D2 contains =CIQ(B2, “IQ_CUSIP”) — this may not work well, I used the Compustat helpdesk here to get these.
  5. Finally from CIQ, the company names. Column E. Cell E2 contains =CIQ(B2, “IQ_COMPANY_NAME”)
  6. Create a new plain text document with the values of column C (GVKEY) and upload this to Compustat in the usual way.
  7. For the gaps, repeat with the values of column D (CUSIP).
  8. For any final gaps, use the Manually enter company codes > Code Lookup tool in Compustat, typing in the first part of the names to obtain any extra GVKEY codes one by one and repeat step 6.
  9. Stitch together the various Compustat output files and check that all the companies are present without duplicates.

Consider that CUSIP codes can change with time, so use them with caution. Remember to keep any column of CUSIP codes with Text format, never General, Number or Scientific.

Compustat CUSIP Converter – limited use tool and some tips

16 December 2015 Leave a comment
Compustat CUSIP Converter

Converts CUSIP codes to CUSIP codes… it’s (somewhat) more useful than it sounds.

Most of our users who need North American company financial information need to link it to another data source, often with a different type of company identifier. A tool has appeared within Compustat via WRDS (which may have been there for some time) allowing you to convert either 8 or 9-digit CUSIP codes into 6, 8 or 9-digit ones. This is a bit of a surprise to me, as Compustat works best with GVKEY codes, but the different length of CUSIP codes help it to interface with other databases as follows.

  • Bloomberg (9-digit CUSIP)
  • Capital IQ (GVKEY has better coverage)
  • CRSP (8-digit CUSIP)
  • Datastream (via Local Code)
  • Eventus (8-digit CUSIP)
  • SDC Platinum (6-digit CUSIP)
  • ThomsonONE.com (9-digit CUSIP when using the Excel add-in)

The databases you join with may not have the full coverage of CUSIP codes, so take care. There will almost always be gaps when going from one database to another.

Take scientific care of CUSIPs in Excel

A final tip when using Excel. Some CUSIP codes can be interpretted as numbers in scientific format (when the code contains numerals and the letter ‘E’, such as Genuity Inc 37248E202 read as 3.72 x 10206). Some CUSIP codes begin with leading zeroes (Apple Inc is ‘037833AL4‘) which can get lost and change meaning.

You must always make sure that a column of CUSIP codes is formatted as Text not General before you paste them in. When opening from a CSV file, consider using the Data > Get External Data > From Text menu command and explicitly set the CUSIP column as Text format.

See also our earlier post on the CRSP converter tool:

Translating CRSP/Compustat company IDs to Datastream IDs

3 July 2015 1 comment

Use WRDS to convert any permno or gvkey to a cusip.
For example, using CCM Security Monthly and eliminating duplicates.

A cusip should be 9 digits and can be converted to a Datastream local code by adding a U at the front and dropping the final check digit.
260543103 (cusip) goes to U2654310 (DS local code).

(Datastream local codes can be used create a Datastream list.)

If the cusip is less than 9 digits the leading 0’s could be missing, and need to be added.
7525108 (cusip) goes to U00752510 (DS local code)
A 6-digit cusip usually indicates a company and 10 is added to give the primary listing.

Converting via a ticker is also possible but seems to be less reliable.

For a cusip 7525108, in Excel cell A2
=CONCATENATE(“0000”,A2) – in B2 adds leading zeros
=RIGHT(B2,9) – in C2 trims to 9 characters
=CONCATENATE(“U”,C2) – in D2 adds leading U
=LEFT(D2,9) – in E2 drops final checkdigit, giving DS local code.