Posts Tagged ‘CRSP’

Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS

2 June 2016 2 comments

The title of this blog post uses far more abbreviations than I would like! Before I get into the purpose of this blog post I shall define them.

  • S&P 500: Standards & Poor’s 500, an index of the 500 largest US companies (by market capitalisation) on the NASDAQ or NYSE stock indices.
  • CRSP: The Center for Research In Security Prices, a database of end-of-day and end-of-month prices for US equity indices.
  • WRDS: Wharton Research Data Services, a platform of financial and company databases with a common user interface, including CRSP.
  • SSH: Secure Socket Shell, a network protocol that provides a secure connection between remote computers, where commands are typed out.
  • SAS: Statistical Analysis System, a software tool for data analysis that can run on remote servers or your local PC.

Outline of this blog post

There are 500 companies that make up the S&P 500 equity index. This list can change from month to month as the market capitalisation of each company changes and the list needs to comprise the largest 500. When conducting historical analysis against this index it is important to have the right list of constituents for that period. There are various ways to obtain historical constituent lists [also link2],  but one way not previous discussed on this blog is via the CRSP master list, available on the WRDS Cloud to SAS programs executed over SSH or PC-SAS. This post shows how to log in remotely, write a short SAS script and download the master list.

Edit 6 June 2016: This post explains how to get the constituent list of equities by PERMNO identifier, a five-digit number that never changes for a company, given by CRSP. You can get the equivalent GVKEY list from Compustat quite easily and from the web-based WRDS access. Armed with the PERMNO list it is easier to interact with CRSP and for example get stock returns for the S&P500 constituents. [Thanks to Alex Taylor for contributing this information.]

Please note, the SSH access route to WRDS is only available to academic research staff and PhD students – sorry to MSc and undergraduates!

Getting into WRDS Cloud using SSH/SFTP/SCP

You will need to use a tool like WinSCP to see the WRDS Cloud remote computer, to “secure copy” files between it and your PC. WinSCP is a free graphical tool, familiar if you have used FTP before; it gives a view of a directory on your computer (left) and a view of a directory on the server (right). (FileZilla and Cyberduck are alternatives for Windows or Mac.)

WinSCP login

The WinSCP login screen. My user name is blacked out for security purposes.

Login with the following details.

  • File protocol: SFTP
  • Host name:
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
WinSCP files

The standard WinSCP view of files on your computer (left) and files on the server (right). The red ringed icon is the Create New Directory button. I have used it to create a directory called “dsp500list”.

Your personal file space on the server will appear on the right, initially none. The directory labelled “..” is a short-cut meaning “the directory one up from this one (parent)”. The path of this directory is “/home/manchester/<wrdsusername>”. Your new SAS script will be saved here. Create a new directory on the server to save your work in, called “dsp500list” (press F7 or the icon hightlighted above).

The WinSCP tool will allow you to easily copy files but will not let you execute commands (run programs). For this, you will need a tool like Putty (or just the Terminal on Mac). Log in with the details below.

  • Connection type: SSH
  • Host name:
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
Putty login

The Putty configuration and terminal screens. You will be prompted for user name and password later (press Enter after each). Commands are typed at the “$” prompt at the bottom of the screen.

You will need to use a few Unix commands to get about here. The most important one to start with is “chdir” to change directory. Type “chdir dsp500list” to change the working directory to the one you created above. We will put the SAS script and results file here.

Writing a SAS script

The SAS script to copy, saved to file

options nosource nodate nocenter nonumber ps=max ls = 72;
title ' ';
options source;
proc print data=crsp.dsp500list noobs label;

It is easiest to write this on your PC in your chosen text editor then secure copy it to the “dsp500list” directory via WinSCP. (You may choose instead to type it directly in the Putty terminal using the command line editor “vim“, but explaining that is out of scope of this post.)

WinSCP copy

On the right of WinSCP, double-click the “dsp500list” directory to enter it, then drag the new SAS script from the local hard drive (left) into it (select “Copy” not “Move”).

This script accesses the SAS dataset stored at the location “/wrds/crsp/sasdata/q_indexes/dsp500list.sas7bdat”, the daily S&P500 list changes. You could use the monthly file “msp500list” instead by changing the “data=crsp.dsp500list” part of the script accordingly.

Getting the data out

Run qsas on the Putty command line to execute the SAS query. It will take a few seconds but then there will be two files produced.

  1. dsp500list_0.log – a log file of the working of SAS, telling you exactly what is happening and any error messages.
  2. dsp500list_0.lst – the output text file, effectively the same data as the daily S&P 500 list.
QSAS results

Running the command “qsas” from the correct directory in Putty will execute the script. The resulting files can be seen with the “ls” command in Putty or by refreshing in WinSCP (green round arrows icon).

The output text file dsp500list_0.lst begins like this, although in plain text with fixed width columns and some blank lines at the start.

CRSP Permanent Number Date when the stock included in S&P500 Index Date when the stock excluded from S&P500 Index
10006 19570301 19840718
10030 19570301 19690108
10049 19251231 19320101

This file can be used in your local SAS scripts, perhaps with a little work converting it back to a true sas7bdat file — this exercise is left for the reader to complete!


This is a lot to take on if you have never worked with Unix systems before, and I would advise the regular WRDS web interface for most users. If you need to use WRDS Cloud to run your SAS scripts to directly access files such as the master CRSP lists, this post shows you how to get started.

See also the WRDS Cloud manual and Intoduction to WRDS Unix system.

Convert CRSP date YYYYMMDD to Excel true date

10 May 2016 1 comment

If you have a data file from CRSP (or many other sources), the date may be recorded as a text string of the form YYYYMMDD (for example, 20151231 for the last New Year’s Eve). If you wish for Excel to interpret this as a date, for the purpose of analysis, this can be done with the four formulae below.

=DATE(LEFT(A2,4), MID(A2,5,2), RIGHT(A2,2))

This reads the following fields.

  • Year from the first 4 characters of cell A2.
  • Month from the middle 2 characters of cell A2 starting from position 5.
  • Day from the last 2 characters of cell A2.

See the image below for an example.

Excel date from CRSP YYYYMMDD

Use Excel’s DATE formula to interpret text or numbers as a date, for analysis purposes. You may need the LEFT, MID and RIGHT formulae to pull out the year, month and day values from another cell.

Now that the dates are recorded this way, Excel can interpret them correctly when doing analysis or when using features like pivot tables.

Categories: Business Databases Tags: ,

WRDS – available databases update

21 March 2014 1 comment

WRDS-iconUniversity of Manchester researchers now have access to additional databases on WRDS (Wharton Research Data Services) following a recent review of the data needed to support current and planned research. These changes are not always easy to see in the menu of “Current Subscriptions” on the left side when you login with a WRDS username and password.

University of Manchester (UoM) main WRDS subscriptions (we don’t subscribe to everything) are:

WRDS University of Manchester subscriptions (click to expand)

WRDS University of Manchester subscriptions (click to expand)

  • Bureau van Dijk – Amadeus (European companies) and Bankscope (banks worldwide)
  • COMPUSTAT (North America) – “S&P Capital IQ’s Compustat North America is a database of U.S. and Canadian fundamental and market information on active and inactive publicly held companies”.
  • COMPUSTAT Execucomp – Executive Compensation for North American companies
  • CRSP – US Stocks and Indices (annual update) – “The Center for Research in Security Prices (CRSP) maintains the most comprehensive collection of security price, return, and volume data for the NYSE, AMEX and NASDAQ stock markets.”
  • CRSP/Compustat MergedCCM – “The gold standard in academic research, The CRSP/COMPUSTAT Merged Database allows for concurrent database access to CRSP’s stock data and Compustat’s fundamental data.”
  • CSMAR – “The China Stock Market & Accounting Research Database is designed and developed by GTA Information Technology – one of major providers of China data. The CSMAR Databases offer data on the China stock markets and the financial statements of China’s listed companies. “
  • EVENTUS – “Eventus performs event studies using data read directly from CRSP stock databases or pre-extracted from any source.”
  • IBES – I/B/E/S from Thomson Reuters – “I/B/E/S International Inc. created their Academic Research Program over 30 years ago to provide both summary and individual analyst forecasts of company earnings, cash flows, and other important financial items, as well as buy-sell-hold recommendations.”
WRDS Not in UoM subscription (click to expand)

WRDS Not in UoM subscription (click to expand)

  • LSPD –  “The London Share Price Database (LSPD) is a unique, comprehensive database of UK stock returns covering over 9,000 UK shares from 1955 to date.”
  • Mergent FISD – The Mergent Fixed Income Securities Database (FISD) is a comprehensive database of publicly-offered U.S. bonds.
  • Option Metrics  – “Ivy DB OptionMetrics is a comprehensive source of historical price and implied volatility data for the US equity and index options markets”
  • Risk Metrics – “RiskMetrics (through ISS Governance Services) is a leader in corporate governance data. IRRC was initially the data provider for WRDS. IRRC, however, was acquired by ISS in 2005.”
  • TAQ -“The Trade and Quote (TAQ) database contains intraday transactions data (trades and quotes) for all securities listed on the New York Stock Exchange (NYSE) and American Stock Exchange (AMEX), as well as Nasdaq National Market System (NMS) and SmallCap issues.”
  • Thomson Reuters ownership ( 13F, Insiders, Mutual Funds ) – “Thomson-Reuters Institutional Holdings (13F) Database provides Institutional Common Stock Holdings and Transactions, as reported on Form 13F filed with the SEC.”
  • Thomson Reuters Dealscan or WRDS-Thomson Reuters LPC DealScan – “WRDS-Thomson Reuters LPC DealScan is the world’s pre-eminent source for extensive and reliable information on the global syndicated bank loan market. It provides users with access to Thomson Reuters LPC’s robust database of detailed terms and conditions on over 240,000 loan transactions. These transactions finance M&A activity, working capital needs and other general corporate purposes for loan participants world-wide.”


The WRDS current subscriptions list also include databases that are free to all WRDS subscribers. A list of these is available on the “What is WRDS?” page. More detailed information on WRDS datasets is available to all on their Our Datasets page (no subscription required)

Most of our WRDS subscriptions are US (North America) only – exceptions include Amadeus (European companies), Bankscope (banks worldwide), CSMAR (China), LSPD (UK) and IBES. (IBES is split into a US file (the default) and an International file.)

WRDS subscriptions are complicated as many providers have multiple databases and the University of Manchester may only subscribe to some. For example, for CRSP we subscribe to US stocks and indices, CRSP/Compustat merged, treasuries and mutual funds, but not to the CRSP Ziman REIT database.

  • CRSP Treasuries – “The CRSP US Treasury and Inflation Series contain returns and index levels on the US Government Bond Fixed Term Index Series, and the Risk Free Rates File.”
  • CRSP Mutual Funds – “As the provider of the only complete database of both active and inactive mutual funds, CRSP leads the way in mutual fund research.”

WRDS does provide some online help for selecting appropriate databases for different types of data, but this help does not take account of our subscriptions. An alternative view of UoM subscriptions is given on the MyWRDS tab and Products page.

Related Posts

US company data on WRDS (posted March 2012)

S&P 500 (Standard and Poor’s 500 Index) (posted December 2011)

WRDS – Wharton Research Data Services (posted March 2011)


Wharton Research Data Services (WRDS). (2014) WRDS Database overview desciptions. [Online]. Available at: WRDS (Accessed: 21 March 2014)

[This reference is adequate since it is probably enough for readers to get access to the information quoted in this post. It does not include the specific page for each quote or cope well with different pages which WRDS being accessed at different dates. However, the alternative of having a whole list of references (WRDS, 2014a), (WRDS, 2014b), (WRDS, 2014c) … looks a worse choice using far more words to give the reader only a little more information.]

US company data on WRDS

7 March 2012 1 comment

WRDS (Wharton Research Data Services) is used by researchers worldwide to get financial and accounting data, especially on US companies. WRDS gives access to the respected CRSP and Compustat NA (North America) databases. It also provides a web interface to make it as easy as possible for researchers to download the data they require. and there is help available from other researchers. See About WRDS for more detail.

Example screencast videos of getting data using WRDS:

Note: The University of Manchester does not subscribe to all the databases available through WRDS -See  FAQ answer on WRDS subscription.

Earlier WRDS post (March 2011) – for full list of WRDS-related post use the tag cloud on the right-hand side.

There is lots of help available when using WRDS: try the “e-learning” and “support” tabs.

Apple users have reported problems with .xls result files. An alternative is to select the .csv (comma separated variables) format. This can be easily imported into Excel. You don’t get the nice .xls formatting but the data is the same.

S&P 500 (Standard and Poor’s 500 Index)

21 December 2011 3 comments

The S&P 500 (Standard and Poor’s 500 Index) is one of the most quoted stock market indices. It is designed to measure performance of the broad US economy through changes in the aggregate market value of 500 stocks representing all major industries. As a widely used index, the S&P 500 is available on several financial databases.

If you are researching US stocks, you may be using WRDS (Wharton Research Data Services) where the S&P 500 is available, but is referred to as S&P’s Composite Index. For example you can get the S&P 500 index level from

  • Web Query : CRSP – Annual Update – Index ; S&P 500 Indexes – Index File on S&P 500 : Variable : Level of S&P Composite Index
  • Web Query : CRSP – Annual Update – Stock / Security Files – Stock Market indexes : Variable :  Level of S&P Composite Index

Note that the “Annual” in CRSP – Annual Update refers to the frequency that the database is updated not the data in the database. You can get monthly and daily data from CRSP – Annual Update. The “Annual” means in Dec 2011 we have access to the data to the end of 2010. There will be an annual update around March 2012 and we will then have access to the data to the end of 2011. (The University of Manchester does not subscribe to the more expensive CRSP Quarterly Update, or CRSP Monthly Update.)

If you want to get recent data values for the S&P 500, or if you are not a WRDS user,  there are other source of data on the S&P 500:

For related information on S&P 500 constituents or total returns:

Where can I find constituent lists for the S&P indices?

How do I find data on total returns for stocks, bonds, and indexes?

[Thanks to the PhD student who highlighted the difficulties in finding S&P 500 on WRDS.]

WRDS – Wharton Research Data Services

11 March 2011 5 comments

WRDS is regarded as the premier financial research database at business schools worldwide. Strictly, WRDS is not a database itself but a web-based research service. The University of Manchester subscribes to databases (CRSP, Compustat, IBES, etc. ) from their providers, and also to access using WRDS because of the convenience this offers. Several key factors contribute to its popularity:

  • Convenient web-based access for databases through WRDS
    CCM (CRSP Compustat Merged) example
  • Web-based queries follow the same style for all the databases making WRDS easier to learn.
  • Access to CRSP and Compustat, the top-rated databases for financial research on the US market
  • Service designed by researchers for researchers: online documentation, able to cope with large volumes of data, and online support.

In general, WRDS is the first choice for most researchers so long as it has the data they need. Thomson Reuters Datastream is our most-used database for UK and Worldwide financial market research, where WRDS is heavily US-centric.

The Databaser blog has a good range of tips on using WRDS – For example

Note that the University of Manchester WRDS subscription does not include Compustat Global.

WRDS website and access

7 July 2010 1 comment

WRDS start page

(An updated version of this post is available: WRDS website and access, updated October 2015)

WRDS (Wharton Research Data Service) is not really a database itself. It is a system designed to give researchers access to a range of financial databases (CRSP, Compustat, CCM, Compustat Execucomp, LSPD) though a common interface, and with integrated support.

Access to WRDS is via a username and password. Some taught modules provide students with a “class account” (such as Advanced Empirical Finance). MSc students, PhD students and academic staff can apply for their own username and password (this provides extra capabilities for running SAS programs on the WRDS system.)

WRDS register pageFor instructions to apply for your own username and password, please see the following guide. (Please note, this is not available for undergraduate students.)

  1. Go to WRDS start page.
  2. Under the login boxes, click on link Register for a WRDS Account.
  3. Fill in the details as follows:
    1. Institution: University of Manchester
    2. Affiliation With Institution: Masters Student
    3. First Name: your first name
    4. Last Name: your last name
    5. Email: must be your University email address
    6. Department: leave blank
    7. Desired Username: University username mxxxxxxx is recommended
    8. When Do You Expect Your Degree?: End date from your library card
  4. Press Submit.
  5. An email will be sent to our library admin staff who should activate your account within 24 hours.
  6. You will receive an email explaining that your account has been activated. It will include a default password that you can change.

Updated 17 June 2015

Beta values – for companies

5 March 2010 3 comments

See also newer post Betas and Company Research (Jan 2011)

Updated answer about historical beta values:

Where can I find historical betas?

There is also How do I find current betas for companies and/or industries?

The Datastream formula (851E) for historical beta is:

where X is the index (e.g. S&PCOMP ), Y is the share id (e.g. 904818 for Boeing), REGB# is the regression beta function, and 60M is the number of months.

Using the expression in a request gives you the most flexibility, say to choose 3 rather than 5 years (36M rather than 60M).

Of course for fullest flexibility, you can get the data and do the regression yourself.