Archive

Posts Tagged ‘WRDS’

Reshaping wide and long data in Stata

29 July 2016 1 comment
Reshape from long to wide via Stata

Reshape from long data (left) to wide data (right) via Stata

Downloading data from a platform like WRDS can give you the data you need but not in the layout you desire. This post shows you how to change from long data to wide data; that is, from a stacked column of companies to a column for each company, using date as unique identifier. It requires the use of Stata, available on PCs at the Library Finance Zone. (Version 14 was used in this test but it should work just as well on others.)

Preparation

Excel data (long)

Long data in Excel

In Excel, keep only the essential columns (you can add the others back later using VLOOKUP). In this example, some kind of ID (like PERMNO), Date and Return.The first row of the sheet contains the variable names. Save this file as type CSV (Comma delimited) *.csv, ignoring any warnings about features or multiple sheets. Close the file.

Stata import text delimited

Import text data delimited in Stata

Open Stata, click File > Import > Text data (delimited, *.csv). Click the Browse button to select your CSV file the click OK. You will see a command like the following was performed: import delimited P:\Reshape\long.csv. You can look at a summary of the data by typing list in the Command window.

Performing the command

The reshape command in Stata works in two modes.

  • Wide: instead of repeating dates and companies in long columns, create a new column for return for each company so that there is only one row for each date. This is also called long-to-wide. (You could instead have a column for each year with one row per company.)
  • Long: the reverse process, also called wide-to-long.

Here we are going from wide-to-long, with Date as the unique variable (i). We will lose the current Return column. Each new column will be Return for each company ID (j), with a name that is Return followed by that ID, such as Return100500, Return222622. The command to type in the Command window is:

reshape wide return, i(date) j(id)
Stata reshape

Reshape in Stata (click to enlarge)

After this is typed, you will get a summary of what just happened. If the text is in red, an error occurred, often if there are other variables that were not mentioned in the reshape command. You can list more than one variable to be reshaped (the command might then be reshape wide return volume price, i(date) j(id) for example).

Again, you can type the list command to see the data.

Finishing

Stata export

Export to Excel from Stata

You probably wish to export the data to Excel format to continue with your work. To do this, click File > Export > Data to Excel spreadsheet (*.xls, *.xlsx). Click the Save as button to choose where you wish to save to, and if you want older (pre-2007) or current Excel format. Tick the box “Save variable names to first row in Excel file” and press OK. You will see a command generated like this:  export excel using “P:\Reshape\reshaped.xlsx”, firstrow(variables)

Excel results

Now you have wide data in Excel!

You can now open the new file in Excel to see the reshaped data set.

More information

The Institute for Digital Education and Research at UCLA have two excellent example pages, one for reshaping long-to-wide and one for reshaping wide-to-long.

 

 

Categories: Data Analysis Tags: , ,

Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS

2 June 2016 1 comment

The title of this blog post uses far more abbreviations than I would like! Before I get into the purpose of this blog post I shall define them.

  • S&P 500: Standards & Poor’s 500, an index of the 500 largest US companies (by market capitalisation) on the NASDAQ or NYSE stock indices.
  • CRSP: The Center for Research In Security Prices, a database of end-of-day and end-of-month prices for US equity indices.
  • WRDS: Wharton Research Data Services, a platform of financial and company databases with a common user interface, including CRSP.
  • SSH: Secure Socket Shell, a network protocol that provides a secure connection between remote computers, where commands are typed out.
  • SAS: Statistical Analysis System, a software tool for data analysis that can run on remote servers or your local PC.

Outline of this blog post

There are 500 companies that make up the S&P 500 equity index. This list can change from month to month as the market capitalisation of each company changes and the list needs to comprise the largest 500. When conducting historical analysis against this index it is important to have the right list of constituents for that period. There are various ways to obtain historical constituent lists [also link2],  but one way not previous discussed on this blog is via the CRSP master list, available on the WRDS Cloud to SAS programs executed over SSH or PC-SAS. This post shows how to log in remotely, write a short SAS script and download the master list.

Edit 6 June 2016: This post explains how to get the constituent list of equities by PERMNO identifier, a five-digit number that never changes for a company, given by CRSP. You can get the equivalent GVKEY list from Compustat quite easily and from the web-based WRDS access. Armed with the PERMNO list it is easier to interact with CRSP and for example get stock returns for the S&P500 constituents. [Thanks to Alex Taylor for contributing this information.]

Please note, the SSH access route to WRDS is only available to academic research staff and PhD students – sorry to MSc and undergraduates!

Getting into WRDS Cloud using SSH/SFTP/SCP

You will need to use a tool like WinSCP to see the WRDS Cloud remote computer, to “secure copy” files between it and your PC. WinSCP is a free graphical tool, familiar if you have used FTP before; it gives a view of a directory on your computer (left) and a view of a directory on the server (right). (FileZilla and Cyberduck are alternatives for Windows or Mac.)

WinSCP login

The WinSCP login screen. My user name is blacked out for security purposes.

Login with the following details.

  • File protocol: SFTP
  • Host name:  wrds-cloud.wharton.upenn.edu
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
WinSCP files

The standard WinSCP view of files on your computer (left) and files on the server (right). The red ringed icon is the Create New Directory button. I have used it to create a directory called “dsp500list”.

Your personal file space on the server will appear on the right, initially none. The directory labelled “..” is a short-cut meaning “the directory one up from this one (parent)”. The path of this directory is “/home/manchester/<wrdsusername>”. Your new SAS script will be saved here. Create a new directory on the server to save your work in, called “dsp500list” (press F7 or the icon hightlighted above).

The WinSCP tool will allow you to easily copy files but will not let you execute commands (run programs). For this, you will need a tool like Putty (or just the Terminal on Mac). Log in with the details below.

  • Connection type: SSH
  • Host name:  wrds-cloud.wharton.upenn.edu
  • User name and Password: your regular WRDS account details (PhD and staff only)
  • When prompted, choose “Yes” to save the key to the local machine.
Putty login

The Putty configuration and terminal screens. You will be prompted for user name and password later (press Enter after each). Commands are typed at the “$” prompt at the bottom of the screen.

You will need to use a few Unix commands to get about here. The most important one to start with is “chdir” to change directory. Type “chdir dsp500list” to change the working directory to the one you created above. We will put the SAS script and results file here.

Writing a SAS script

The SAS script to copy, saved to file dsp500list_0.sas:


options nosource nodate nocenter nonumber ps=max ls = 72;
title ' ';
options source;
proc print data=crsp.dsp500list noobs label;

It is easiest to write this on your PC in your chosen text editor then secure copy it to the “dsp500list” directory via WinSCP. (You may choose instead to type it directly in the Putty terminal using the command line editor “vim“, but explaining that is out of scope of this post.)

WinSCP copy

On the right of WinSCP, double-click the “dsp500list” directory to enter it, then drag the new SAS script from the local hard drive (left) into it (select “Copy” not “Move”).

This script accesses the SAS dataset stored at the location “/wrds/crsp/sasdata/q_indexes/dsp500list.sas7bdat”, the daily S&P500 list changes. You could use the monthly file “msp500list” instead by changing the “data=crsp.dsp500list” part of the script accordingly.

Getting the data out

Run qsas dsp500list_0.sas on the Putty command line to execute the SAS query. It will take a few seconds but then there will be two files produced.

  1. dsp500list_0.log – a log file of the working of SAS, telling you exactly what is happening and any error messages.
  2. dsp500list_0.lst – the output text file, effectively the same data as the daily S&P 500 list.
QSAS results

Running the command “qsas dsp500list_0.sas” from the correct directory in Putty will execute the script. The resulting files can be seen with the “ls” command in Putty or by refreshing in WinSCP (green round arrows icon).

The output text file dsp500list_0.lst begins like this, although in plain text with fixed width columns and some blank lines at the start.

CRSP Permanent Number Date when the stock included in S&P500 Index Date when the stock excluded from S&P500 Index
10006 19570301 19840718
10030 19570301 19690108
10049 19251231 19320101

This file can be used in your local SAS scripts, perhaps with a little work converting it back to a true sas7bdat file — this exercise is left for the reader to complete!

Summary

This is a lot to take on if you have never worked with Unix systems before, and I would advise the regular WRDS web interface for most users. If you need to use WRDS Cloud to run your SAS scripts to directly access files such as the master CRSP lists, this post shows you how to get started.

See also the WRDS Cloud manual and Intoduction to WRDS Unix system.

WRDS new interface for selecting output variables

6 April 2016 Leave a comment

Following on from the refresh last summer, the Wharton WRDS web portal has been updated again. It is now easier to find and select the output variables you want to use in your request.

WRDS variables 2016

Finding and selecting output variables in WRDS is now much easier

The screen above uses the CRSP monthly stock database as an example. In the ‘Query Variables’ step, there are now multiple tabs containing categories of variables (which used to be listed in one column of boxes). The first tab contains all the variables, the subsequent tabs are categories like identifiers, balance sheet items, profit & loss items, and so on. There are orange and white arrows to scroll left and right between the tabs.

There is a search field in the first tab where you can type the name of the variable or any substring of it. For example, type “code” to get any variable with “code” in its name, it does not have to begin with “code”. Tick each match that you want to use and it appears in the column on the right, labelled ‘Selected. You can browse through all the tabs and see the same master list of Selected variables appear in each. This is much easier than using the browser’s ‘CTRL+F’ find function that was required before.

New feature: Saved Codelists

WRDS saved lists

You can save lists of codes to use later

Another new feature is the Saved Codelists. Usually we recommend people upload a text file containing the codes for the companies (or other entities) that they wish to run a query on. WRDS lets you type in or select companies one-by-one as well (useful for looking up a small number of companies). There is a blue link to ‘Code Lookup’ in most of the WRDS databases which lets you type the whole or part of the name of a company to find its code.

The new feature is that you can save these short lists to use later. The screen above shows Bureau van Dijk Amadeus. Simply tick the first box highlighted in the image above (to the right of the codes) and type a name for the list. On a later visit to WRDS, you can find the list in the second box highlighted above, ‘Select Saved Codelists’.

Insider trading

16 July 2015 Leave a comment

Definition of Insider Trading (from Investopedia):

The buying or selling of a security by someone who has access to material, nonpublic information about the security.

Companies in any country

Use ThomsonONE.com

  • Search for the company you want.
  • Click on the Company Analysis > Ownership folder, and select Insider Holdings or Insider Transactions report.
  • Click “Click here for full history” for historic holdings (older than 24 months).

Use Capital IQ

  • Search for the company you want.
  • On the long menu down the left, choose Investors > Public Ownership, then the Insider Trading tab.
  • Only the last 12 months are available.

Use Bloomberg (available in the Finance Zone and Precinct Library)

  • Try Bloomberg function PHDC for a specific company (e.g. VOD LN <F8> PHDC <GO>) and select Insiders, sorted by size.
  • You can also look up the biographies of executies/directors (BIO) and then choose Reported Holdings.
  • For stocks that have experienced the most purchases and sales by insiders, type: INSD and hit Go. Use the fields at the top of the screen to select country, time period, type of transaction. Click on a stock to display a graph of its historical prices with date of insider transactions.
  • For more info, type: insider and hit the green Help key <F1>.

Companies in the U.S. only

Use WRDS

  • Choose Thomson Reuters database, and then Insiders Data. Search by company ticker. Check the variables you want in your report.

Others

  • Insider filings with the U.S. SEC are available through Edgar. Enter your company name or ticker symbol or CIK and Include Ownership forms.
  • Insider transactions are available by issuer and reporting owner. Take a look at the website SecForm4.Com

Intraday quotes for equities and indices

Global source

Use Bloomberg.

For Equities and Indices:

Type the stock’s ticker symbol then hit [EQUITY] and hit <GO>

Choose: GIP – Intraday Price Graph

You can change the date and the time interval.

or

Type the index symbol then hit [INDEX], and then type GIP or GIT and hit <GO>

For example, to get intraday price intervals for the S&P 500:

SPX [INDEX] GIT and hit <GO>

Note: limited historical intraday data is available approx 900 trading days for GIP and 50 trading days for GIT.

Other sources

For US stocks, go to Wharton Research Data Services (WRDS) TAQ (New York Stock Exchange Trade and Quote, or NYSE TAQ.

For historical (1983-1992) NYSE and AMEX tick-by-tick data and for historical NASDAQ (1987-1992) tick-by-tick data, click on ISSM – Institute for the Study of Security Markets

TICKDATA provides a commercial service offering intraday data from the UK and other exchanges.

Finally, Thomson Reuters Tick History database offers global intra-day Time and Sales, Time and Quotes and Market Depth content from the 1st January 1996 for an extensive range of Equities, including their derivatives. Access to this resource is limited; please contact us.

Exchange rates

10 July 2015 1 comment

Exchange rates available online:

Several databases also have exchange rates (current/spot and forward):

  • Search Global Financial Database It has daily and monthly exchange rates for many different countries, in some cases going back hundreds of years.
  • Use WRDS and select Foreign Exchange Rates.
  • Use Datastream – select Category as Exchange Rates
  • Use Bloomberg
    Hit the Currency Market Sector key (Yellow) and then hit GO/EnterFor the best and worst performing currencies relative to a selected currency,type WCRS and hit GO

Economic databases also provide exchange rate date: UK Data Service (formerly Economic and Social Data Service) provides IMF monthly, quarterly and annual data.

LSPD on WRDS (three tips)

London Share Price Database (LSPD) is a unique, comprehensive database of UK stock returns covering over 9,000 UK shares available to University of Manchester researchers through our WRDS (Wharton Research Data Services) subscription.

LSPD is firmly aimed at researchers: it only covers UK shares and does not have the variety of data available in Thomson Reuters Datastream or Bloomberg. It concentrates on its historic coverage (from 1955 to date) and the quality of its returns data.  The following three items illustrate LSPD strengths.

1) Historic company names

Finding companies by their previous names can be a headache for researchers, especially if a company is dead or inactive. LSPD uses its own unique company number (G1 in the excellent manual) so that it can trace historic names. You can search by name or company number using the WRDS-LSPD company lookup facility. The screenshot below shows the names for Granada (LSPD_STOCK_ID (G1) is 2222) from 1959 onwards.

WRDS-LSPD lookup for Granada

WRDS-LSPD lookup for Granada

It can be useful to first lookup by name, e.g. “Granada”, and then by the LSPD_STOCK_ID (G1) number where there has been a major name change, e.g. “Granada plc” to “ITV plc”.

Previous posts that demonstrate the problems name changes can cause

An alternative approach to find historical company names is to download the “st_names” dataset and use the find/selection options in your chosen analysis software (Excel/SPSS/Stata/SAS).

2) Main or AIM market

UK companies listed on the London Stock Exchange can be on either the Main Market or the AIM (Alternative Investment Market). LSPD can be used to identify AIM listed companies, including dead/inactive ones.

You can used the LSPD “Stock Annual query” – search the entire database with condition G16 (SEDOL Group) is 95 (Alternative Investment Market).

A recent test gave 2667 AIM listed companies, of which 1092 were active. Of the inactive/dead AIM companies the most popular reasons (from G10 Type of Death) were:

  • G10 = 5 – Acquisition/takeover/merger (approx 600 companies)
  • G10 = 14 – Quotation cancelled for reason unknown. No dealings under rule 163(2) or (3) (approx 400)
  • G10 = 20 – In Administration/Administrative receivership (approx 200)

3) UK monthly risk free rate

Finally LSPD Index 3 is Treasury Bill Rate (90 Day) so you can also get a UK risk free rate from LSPD.

One advantage of the LSPD Index Monthly query is that you get both the annual yield (I10) and the monthly log return (R22) (Screenshot below).

LSPD-T-bill

As previously, it is best to download the LSPD manual for the definitions of the these variables.

Categories: company information Tags: ,