Home > Business Databases > Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS

Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS

The title of this blog post uses far more abbreviations than I would like! Before I get into the purpose of this blog post I shall define them.

  • S&P 500: Standards & Poor’s 500, an index of the 500 largest US companies (by market capitalisation) on the NASDAQ or NYSE stock indices.
  • CRSP: The Center for Research In Security Prices, a database of end-of-day and end-of-month prices for US equity indices.
  • WRDS: Wharton Research Data Services, a platform of financial and company databases with a common user interface, including CRSP.
  • SSH: Secure Socket Shell, a network protocol that provides a secure connection between remote computers, where commands are typed out.
  • SAS: Statistical Analysis System, a software tool for data analysis that can run on remote servers or your local PC.

Outline of this blog post

There are 500 companies that make up the S&P 500 equity index. This list can change from month to month as the market capitalisation of each company changes and the list needs to comprise the largest 500. When conducting historical analysis against this index it is important to have the right list of constituents for that period. There are various ways to obtain historical constituent lists [also link2],  but one way not previous discussed on this blog is via the CRSP master list, available on the WRDS Cloud to SAS programs executed over SSH or PC-SAS. This post shows how to log in remotely, write a short SAS script and download the master list.

Edit 6 June 2016: This post explains how to get the constituent list of equities by PERMNO identifier, a five-digit number that never changes for a company, given by CRSP. You can get the equivalent GVKEY list from Compustat quite easily and from the web-based WRDS access. Armed with the PERMNO list it is easier to interact with CRSP and for example get stock returns for the S&P500 constituents. [Thanks to Alex Taylor for contributing this information.]

Please note, the SSH access route to WRDS is only available to academic research staff and PhD students – sorry to MSc and undergraduates!

Getting into WRDS Cloud using SSH/SFTP/SCP

You will need to use a tool like WinSCP to see the WRDS Cloud remote computer, to “secure copy” files between it and your PC. WinSCP is a free graphical tool, familiar if you have used FTP before; it gives a view of a directory on your computer (left) and a view of a directory on the server (right). (FileZilla and Cyberduck are alternatives for Windows or Mac.)

WinSCP login

The WinSCP login screen. My user name is blacked out for security purposes.

Login with the following details.

  • File protocol: SFTP
  • Host name:  wrds-cloud.wharton.upenn.edu
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
WinSCP files

The standard WinSCP view of files on your computer (left) and files on the server (right). The red ringed icon is the Create New Directory button. I have used it to create a directory called “dsp500list”.

Your personal file space on the server will appear on the right, initially none. The directory labelled “..” is a short-cut meaning “the directory one up from this one (parent)”. The path of this directory is “/home/manchester/<wrdsusername>”. Your new SAS script will be saved here. Create a new directory on the server to save your work in, called “dsp500list” (press F7 or the icon hightlighted above).

The WinSCP tool will allow you to easily copy files but will not let you execute commands (run programs). For this, you will need a tool like Putty (or just the Terminal on Mac). Log in with the details below.

  • Connection type: SSH
  • Host name:  wrds-cloud.wharton.upenn.edu
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
Putty login

The Putty configuration and terminal screens. You will be prompted for user name and password later (press Enter after each). Commands are typed at the “$” prompt at the bottom of the screen.

You will need to use a few Unix commands to get about here. The most important one to start with is “chdir” to change directory. Type “chdir dsp500list” to change the working directory to the one you created above. We will put the SAS script and results file here.

Writing a SAS script

The SAS script to copy, saved to file dsp500list_0.sas:


options nosource nodate nocenter nonumber ps=max ls = 72;
title ' ';
options source;
proc print data=crsp.dsp500list noobs label;

It is easiest to write this on your PC in your chosen text editor then secure copy it to the “dsp500list” directory via WinSCP. (You may choose instead to type it directly in the Putty terminal using the command line editor “vim“, but explaining that is out of scope of this post.)

WinSCP copy

On the right of WinSCP, double-click the “dsp500list” directory to enter it, then drag the new SAS script from the local hard drive (left) into it (select “Copy” not “Move”).

This script accesses the SAS dataset stored at the location “/wrds/crsp/sasdata/q_indexes/dsp500list.sas7bdat”, the daily S&P500 list changes. You could use the monthly file “msp500list” instead by changing the “data=crsp.dsp500list” part of the script accordingly.

Getting the data out

Run qsas dsp500list_0.sas on the Putty command line to execute the SAS query. It will take a few seconds but then there will be two files produced.

  1. dsp500list_0.log – a log file of the working of SAS, telling you exactly what is happening and any error messages.
  2. dsp500list_0.lst – the output text file, effectively the same data as the daily S&P 500 list.
QSAS results

Running the command “qsas dsp500list_0.sas” from the correct directory in Putty will execute the script. The resulting files can be seen with the “ls” command in Putty or by refreshing in WinSCP (green round arrows icon).

The output text file dsp500list_0.lst begins like this, although in plain text with fixed width columns and some blank lines at the start.

CRSP Permanent Number Date when the stock included in S&P500 Index Date when the stock excluded from S&P500 Index
10006 19570301 19840718
10030 19570301 19690108
10049 19251231 19320101

This file can be used in your local SAS scripts, perhaps with a little work converting it back to a true sas7bdat file — this exercise is left for the reader to complete!

Summary

This is a lot to take on if you have never worked with Unix systems before, and I would advise the regular WRDS web interface for most users. If you need to use WRDS Cloud to run your SAS scripts to directly access files such as the master CRSP lists, this post shows you how to get started.

See also the WRDS Cloud manual and Intoduction to WRDS Unix system.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s