Archive

Author Archive

Getting a linking table from Compustat via WRDS Cloud using SSH and SAS

27 June 2017 Leave a comment

Researchers commonly need to combine data from multiple sources. Even when the sources are offered through the same platform such as Wharton Research Data Service (WRDS), this can still be a difficult process with many choices. Essentially, you need to find a common entity identifier in both sources; if none exists, find a reference table or other third source.

The help provided by WRDS is often sparse in detail and tailored for their more advanced users. I have previously written in detail about using WRDS Cloud and SAS to gain access to resources not available on the WRDS website, in the blog post Getting S&P 500 constituents from CRSP via WRDS Cloud using SSH and SAS.

In this post, I will suggest a way to link I/B/E/S data with Compustat North America data. I will summarise the points that are addressed in the aforementioned blog post.

Get a linking table from Compustat North America

You will need the following to begin.

  1. AUTHENTICATION. Your WRDS username and password, specifically an individual PGR or Academic login.
  2. SENDING COMMANDS. Windows users: the Putty program. Mac/Linux users: the Terminal app or command line.
  3. FILE TRANSFER. Windows users: WinSCP or FileZilla. Mac/Linux users: CyberDuck or FileZilla or command line.
  4. FILE EDITING: Notepad, Notepad++ or another text editor. Excel or other spreadsheet application.

The method is very similar to getting the S&P 500 constituents. Essentially, Compustat has a dataset called securities which lists all the identifiers for all the securities. It is in SAS format. It cannot be accessed directly by us, but we can write a SAS script to read its entire contents to a plain text file, then copy the file to our own PCs. This file is kept at location /wrds/comp/sasdata/d_na/security/security.sas7bdat for North American data and is updated often.

Your script to read this file should look like this:

options nosource nodate nocenter nonumber ps=max ls = max;
title ' ';
options source;
proc print data=comp.security noobs width=UNIFORM;

This is different from last time in a few ways.

  1. The ‘ls’ or ‘line size’ option has been changed from ’72’ to ‘max’. This is because there are more columns for Compustat securities than for S&P 500 constituents.
  2. The data table we are reading from is ‘data=comp.security’.
  3. The ‘labels’ option has been removed so we get the clean variable names as the header.
  4. The ‘width’ option is set to ‘UNIFORM’ so the columns are the same number of characters wide on each ‘page’.

The method is as follows; remember to refer to the aforementioned blog post.

  1. Save the SAS script to a new text file on your computer, compsecurity.sas or similar.
  2. Use WinSCP or equivalent to copy this file to your own space on WRDS Cloud.
  3. Use Putty or equivalent to run the script, log in and type qsas compsecurity.sas and press Enter.
  4. Wait 10 seconds and use WinSCP to download the generated files compsecurity.log and compsecurity.lst to your computer. The ‘lst’ file contains the data, if the program was successful (read the ‘log’ file to check).
  5. Use Notepad++ or equivalent to remove the blank lines from the top of compsecurity.lst and save it as compsecurity.txt – then check every 32,764 lines for a few more blank lines and repeat of the headers to be removed (these are ‘page breaks’).
  6. In Excel, open compsecurity.txt (via Data > Get External Data from Text) and use fixed column width option, carefully setting the locations for the column breaks, remembering to set most columns as Text format instead of General (particularly GVKEY and CUSIP). Save as CSV or Excel format.

Compustat securities via WRDS Cloud

Using the linking table

You will need the columns GVKEY and IBTIC (IBES Ticker).

  • If your Compustat and I/B/E/S data are in Stata format, import this table and consider joining the tables.
  • If your Compustat and I/B/E/S data are in Excel format, use VLOOKUP formulas (some more processing may be required).

Still to do: there is not necessarily a one-to-one match between these two identifiers. The same GVKEY will appear on more than one row, but will those rows contain more than one unique ticker? And is the data frequency the same in each source? You will need to find the answers to these questions out for yourself.

Extending your Excel worksheet formulas via Python, for Capital IQ

17 March 2017 3 comments

Downloading bulk data from financial databases can be a tiring process. You may need panel data structured in a very specific way for analysis in a tool like Stata, and the data provider provides data in a different and limited way; perhaps for one company or one year at a time, leaving you to build it into the shape you need. Each product, like Bloomberg, Datastream, ThomsonONE.com and Capital IQ, provide a different Excel interface. Sometimes you can build a formula for each data item for each year for each company, then fill or copy it across or down to save time. This extended blog post will show you how to do this for the Capital IQ Office Plug-in, and how to save time extending it for panel data using a short Python script.

What kind of data are we looking at?

First, let’s consider the types of variables in turn. In no particular order:

  1. Entities: That is, Companies, Funds, Equities, Indices, Persons…
  2. Time: Usually one of Years, Quarters, Months, Days…
  3. Data items/types: Time series or line items such as Total assets, Turnover, Closing price…

If we consider any one variable or dimension at a time, this is easy; it’s just a list of identifiers or numbers.

  1. AAPL, MSFT, GOOG
  2. 1995, 1996, 1997
  3. TOTAL_ASSETS, NUM_EMPLOYEES, CREDIT_RATING

We can easily create a table to list any two of these dimensions simply.

For example: (fictional) figures for one company:

Item\Year 2001 2002 2003
Total assets ($m) 6.3 7.2 8.9
Net profit ($m) 2.3 4.7 6.1

Or, (fictional) figures for Year 2004:

Company\Item Employees (thousands) Credit rating
AAPL 50 AAA
MSFT 65 AA
GOOG 14 AAA

But if we want to look at three at once, it gets more complicated. More on that later.

Getting data from Capital IQ Office Plug-in

This post assumes you are familiar with using Capital IQ and its Excel add-in. If not, search for other posts on Capital IQ or read the hand-out guides we have written before proceeding.

In short, you can use an Excel formula to pull data from the Capital IQ servers to your spreadsheet. These can be created in the Formula Bulider menu, or typed out manually. For example, to get total assets, the formula will be =CIQ(“IQ18671”, “IQ_TOTAL_ASSETS”, “FY2012”) if the company identifier is IQ18671 and the selected date is financial year 2012. [Note that the double-quote characters appear ‘curved’ here in WordPress and you’ll need to retype them clean in Excel.]

Tip: The data retrieved will only load on a PC with the Capital IQ Excel add-in set up; you will need to copy then paste-as-values, or save as CSV to keep the data for use elsewhere.

Using the fill down/across command

The formula includes several arguments — the options separated by commas — which are “hard coded” – the values for which company, date or item are included as text strings.

Basic Capital IQ formula

The CIQ formula in the yellow cell is shown shown in the formula editor. It references the company in A2, the item “IQ_TOTAL_ASSETS” and the date “FY2012”.

You can replace these arguments with cell references, to a cell that contains the same text, or part of it.

Replacing the item with a cell reference

The yellow cell’s formula now includes a cell reference B$1 for the item type. Note the dollar sign locks the row number when we fill the formula down.

This will allow you to use the Excel feature fill down or fill across, to extend the formula down or right.

Fill the formula down and right

The cell reference for date follows the text “FY” then an ampersand, as the cell B2 contains just the year number. Note the dollar sign in B$2, and now also in the company cell reference $A3, so that the formula fills right correctly.

It will work for any argument, including item or date, although you need to take more care with date. In the example above we use year, and Capital IQ expects the year number to follow the letters “FY” for Financial Year (or “CY” for Calendar Year). We must therefore use the ampersand ‘&’ character to join the strings.

Adding the third dimension

So far we have looked tables to represent two of our three dimensions. There are six permutations for this:

  • Companies down and dates across, for one item
  • Companies down and items across, for one date
  • Dates down and items across, for one company
  • and each of those transposed the other way around.

So if we want to show all three, we have to essentially stack more tables across or down. There are many more ways to arrange the data like this, such as:

  • Companies down, items across, dates across
  • Companies down, date across, items across

It might be easier to show these in a dummy table:

Company ID Item1-2001 Item2-2001 Item1-2002 Item2-2002 Item1-2003 Item2-2003
comp1  a  b  c  d  e  f
comp2  g  h  i  j  k  l

Or

Company ID Item1-2001 Item1-2002 Item1-2003 Item2-2001 Item2-2002 Item2-2003
comp1  a  c  e  b  d  f
comp2  g  i  k  h  j l

You might prefer to stack into panels as below (this is like reshaping from wide to long in Stata). We have to repeat the first column (company ID) for each date… do not be tempted to use merged cells here! If you did, you won’t be able to use data filters, Excel tables or correctly export to other programs.

Company ID Year Item1 Item2
comp1 2001  a  b
comp1 2002  c  d
comp1 2003  e  f
comp2 2001  g  h
comp2 2002  i  j
comp2 2003  k  l

Often we start with a list of thousands of companies. It would be very time consuming to manually insert extra lines for each year after each company! This is where the script comes in…

Extending the process with a Python script

In order to perform a predictable and repetitive task, it can be a good idea to write a short script instead. The method below uses a combination of Excel and an external programming language, Python, and some human intervention. (I assume you have Python installed – the University PCs tend to have it, sometimes as part of SPSS.)

  1. Start with a column of the company IDs in Excel, and save this as a text file. You can do this by copying the column and pasting it into Notepad, remembering to delete any header rows. (You can instead paste it to a new workbook in Excel and save as Text (Tab delimited) (*.txt).) The file must be named ids.txt for the script to work.

    List of company IDs

    Save the list of IDs to a text file ids.txt

  2. Now for the Python script, which is shown at the end of the post for you to copy. Copy and paste this into a text editor like Notepad, although Notepad++ is included on University of Manchester PCs and is much better. If the list of years {1995, 1996, 1997} is not right for you, change it here! Save this as script01.py or similar – note the file extension is .py and not .txt – this is a Python script. Save it to the same directory as the ID file.

    Python code, showing whitespace

    Notepad++ showing Python code (you can copy-paste it from the end of this blog post), showing whitespace/tabs in orange. Python scripts must have consistent line indents, tabs or spaces.

  3. Run the script. To do this, open Windows Explorer, navigate to the folder you have your two new files in and hold Shift + Right-click then choose “Open command window here”. A black command window will open, type “python script01.py” (no quotes) and press enter. Hopefully no error message will show, you’ll get no message if it was successful.

    Run the script

    Shift+Right-click to get “Open command window here”, type “python script01.py” and press Enter. File template.txt is created.

  4. Find the newly created file, template.txt, and open it in Excel. There are several ways to open tab-delimited text files in Excel, any should do.
    Open .txt file in Excel, part 1

    Excel, open file, select type “Text Files (*.prn; *.txt; *.csv)” and choose template.txt. Accept any warning messages.

    Open .txt file in Excel - part 2

    This is a delimited text file, with Tab delimiters only.

  5. You can type the item names in C1, D1, E1 etc., such as “IQ_FINISHED_INV” and “IQ_TOTAL_ASSETS” (no quotes). You can type the template Capital IQ formula in C2, remembering to replace the literal arguments with cell references, and remembering to lock some of the references with dollar signs. For example, =CIQ($A2, C$1, “FY”&$B2)

    Fill the template columns and first formula

    Complete the green cell column headers (item codes). Fill in the yellow cell formula with correct cell references and locks, here it is =CIQ($A2, C$1, “FY”&$B2)

  6. Save the workbook as regular Excel file (*.xlsx). Now select the cell, hover the mouse over the bottom-right corner for the little black square icon, and drag the icon to the right to select that row in all the columns you need. The formula will copy across, and hopefully correctly. You may need to edit it and try again if the dollar sign locks went wrong. Now fill the formula down (just a few rows at first to test, then all the way down by double-clicking the black square icon).

    Final fill out

    Save before you start! Fill the formula right, then down a few rows, then further. Don’t just fill all the way to the bottom if there are thousands of rows, this might overload Capital IQ.

The Python code, which should have indents on the lines in each for loop:

fin = open('ids.txt')
fout = open('template.txt', 'w')
years = {"1995", "1996", "1997"}
# header
fout.write("ID\tYear\n")
#body
for line in fin:
   id = line.rstrip()
   for year in years:
       lineout = id + "\t" + year + "\n"
fout.write(lineout)
fin.close()
fout.close()

Moving on

  • The script is not the best example of Python, but it works! For example, the list of years should really be done better.
  • You could extend the script to include the items and Excel formulas, but for a one-off script, it is probably quicker to do some of the work in Excel by hand (using the fill function).
  • You could chose any other programming language over Python, including VBA which is built into Excel.
  • You could apply this technique to other databases with an Excel interface such as Bloomberg.

Related posts

Bond ISIN to company ISIN — EDSC manuals, tips & tricks

9 November 2016 Leave a comment

Suppose you created a list with bonds and the accompanying ISIN codes. You can’t use these ISIN codes to download information from the issuer of this bond. But here is way to use the ISIN codes you collected so far to find the ISIN code of the issuer An example: the ISIN code US44044KAA97 belongs […]

via Bond ISIN to company ISIN — EDSC manuals, tips & tricks

Finding ESG data in Datastream using the ASSET4 template

25 October 2016 Leave a comment

A detailed source of environmental, social and governance (ESG) data can be found in Thomson Reuters Datastream.

Instead of choosing the appropriate datatypes in a regular Datastream time series request, you can download the ASSET4 template. This is hard to find in the InfoBase support platform (replacement for Extranet) so I suggest you get it from the Datastream Excel add-in toolbar, although this method is not quite as easy as you would like.

Open Datastream in Excel the usual way (remember the desktop icon “DSSetup – Shortcut”). Click on the Datastream ribbon tab, go to the Request Tables group and click Sample Sheets. Tick the button ‘Equities : ESG ASSET4 – Sector Industry  analysis’ then click the Download button.

ASSET4 - find the template

Find the ASSEST4 template with the ‘Sample Sheets’ button and download it.

The Download & Open button does the same as the Download button, it doesn’t open the file.

ASSET4 template location

ASSET4 template downloads to a hidden folder.

The template takes the form of a macro-enabled Excel workbook and is downloaded to a hidden folder. Click on Start > Computer, click in the address bar and type the following:
C:\ProgramData\datastream\datastream advance\User

Note that ‘ProgramData’ is not the same as ‘Program Files’ or ‘Program Files (x86)’, it is a hidden folder. As you type the rest of the path, Windows should suggest auto-completions, which you can accept.

ASSET4 file location

ASSET4 file location, in the hidden C:\ProgramData folder.

Find downloaded ASSET4 (animated)

ASSET4 file location (animated)

From here, open the file and accept any warnings about enabling content.

ASSET4 intro

ASSET4 introduction

There are many sheets (tabs), starting with Home. Click on ESG Filters to select your search criteria. Choose (1) a sector or industry such as Telecommunications Services and (2) criteria across the environmental, social and corporate governance categories such as Water Use Total, Women Managers and Size of Board. When ready, click the Go button.

ASSET4 search criteria

ASSET4 search criteria

Wait a few minutes while the data is downloaded to the Data Table sheet.

ASSET4 data table

ASSET4 data table

The tables and charts of the many other sheets are automatically updated to reflect your search (except the Data Fields definitions sheet). For example, the Bottom 5’s sheet.

ASSET4 example results

ASSET4 example results

If you find a datatype you would like to use in a regular time series request, make a note of its code.

See also earlier post Environmental, Social and Governance (ESG) Data from July 2013.

Advanced support for using the Bloomberg Excel add-in

4 August 2016 2 comments

Introduction

An example Bloomberg Excel template

This post is a collection of frequently and infrequently asked questions about the Bloomberg Professional API, most specifically the Bloomberg Excel Add-in. For the most part, it should be considered for experts, so please don’t be disheartened if it is too advanced for you; it’s effectively everything I know about the topic.

The questions are as follows.

  1. Where is the add-in? I don’t have it.
  2. Which ways can one use Bloomberg and Excel together?
  3. Where can I find the field codes?
  4. I don’t want to be restricted to Excel, can I use Java, C++, .NET, Python or R?
  5. Where can I get more help?

Where is the add-in? I don’t have it.

Install Bloomberg Excel add-in

  • If the ribbon tab labelled “Bloomberg” is not showing in Excel, close it then click on Start > All Programs > Bloomberg > Install Office Add-Ins. Follow the instructions, but be prepared to run the little install program a few times, perhaps opening and closing Excel and running it again.

Which ways can one use Bloomberg and Excel together?

Terminal copy or export

Bloomberg types of copy/export to Excel

Some of the ways to get data from Bloomberg to Excel (remember to work at the terminal though!). Top-left (FA): Red menu > Output > Excel > Current Template (data is live). Top-right (GP): Right-click on chart > Copy/Export Options > Copy Data to Clipboard (then paste in Excel). Bottom-left (MEMB): Red menu > Output > Excel (data is fixed). Bottom-right (CHNG): Not possible to export report data.

  • Various functions in the Bloomberg terminal let you get the on-screen data into Excel in different ways. Look for commands such as “Copy data to clipboard”, “Output > Excel” or similar, by clicking on the red menu bar (Actions) or right-clicking on charts or data.
  • Sometimes the action will download and open a new Excel document, either with the data written in directly or loading later via the Bloomberg API. Sometimes the data is copied to the clipboard for you to paste into a worksheet of your choice.
  • Remember that a sheet which contains Bloomberg formulas to load live data may not load on a PC without Bloomberg unless you save as CSV or copy/paste-as-values. (This is also true for all the Excel options below.)

Excel templates

Excel template library

Bloomberg template library, browsed in Excel. Could also use XLTP function in terminal.

  • Usually most useful if you are looking up one company, bond, exchange rate or commodity, a Bloomberg template will give you a detailed Excel workbook filled with data and visualisations that are updated live from Bloomberg via the Excel API. Amber coloured fields are editable, often to change the company, country, sector, date or other variable. (See the first image in the post.)
  • The templates can be found in the terminal with the XLTP<GO> function, and in Excel under Bloomberg > Explore > Template Library.

Excel import

Bloomberg Excel historical end of day

Use the Import Data menu in Excel to get historical end of data and other data.

  • A commonly used feature that is described in our Bloomberg Workbook (available in the Bloomberg Suite and at the Precinct Library) is the Historical End of Day wizard. In Excel, click Bloomberg > Import > Import Data > Real-Time / Historical > Historical End of Day.
  • The wizard will let you type security identifiers or select from a common index, then choose your data types and data range. It will then produce the results in the cell you selected.

Excel function builder

Bloomberg Excel function builder

Build a function from scratch. The formula in cell B1 is =BDP(“AAPL US EQUITY”, “INDUSTRY_SECTOR”) and the value is Technology.

  • If you want a little more control, use the function builder, found in Bloomberg > Create > Function Builder. This more advanced tool will expose the Bloomberg API to you, starting by asking you to choose one of three major Bloomberg Functions:
    • BDP: (Bloomberg Data Point) Import a single data point of current data.
    • BDH: (Bloomberg Data History) returns the historical data for a selected security.
    • BDS: (Bloomberg Data Series) imports a set of bulk data such as peers.
  • For your chosen function, you will be asked to type in a security (such as “AAPL US EQUITY”), a field (such as “INDUSTRY_SECTOR” or “PX_LAST”) , and dates (depending on the function).
  • The tool will suggest auto-completion if you don’t happen to know the exact security or field code. It will only suggest valid responses.
  • You can add optional extra parameters such as orientation=H|V, currency, or “array=True” which puts all the output data into one cell instead many rows/columns (requires array formulas afterwards). Note that row and column counts will be added as extra parameters automatically after the formula has called.
  • The security, field and dates can be written into the formula or referenced from other cells.
  • Notice that the security ID needs to end with what kind of entity it is, so equities end “EQUITY”, bonds end “CORP” etc.

Excel manual function creation

Bloomberg Excel function builder manual edits

In the formula in cell B1, by replacing the security ID with a cell reference A1, you can then copy the formula down or across.

  • Once you have used the function builder, you will have a working formula that you may wish to copy out for each of your many securities, fields or dates. If you use a cell reference for this variable, you may copy the formula across or down. For example, you can have a list of security identifiers in column A {AAPL US EQUITY, IBM US EQUITY, VOD LN EQUITY, …} and the formula in column B =BDP(A1, “INDUSTRY_SECTOR”) and copy down the formula in column B.
  • What if your formula produces data in two dimensions and you need to leave a gap between each call, for example with amendment history of bonds? I have addressed that problem by writing a Python script to prepare the formulas and spacing. Assuming noblanks.txt is a file with one security ID per line (without the “CORP” bit) and withblanks.txt is our output:
fin = open('noblanks.txt')
fout = open('withblanks.txt', 'w')
for line in fin:
    id = line.rstrip()
    fout.write(id + "\t=BDS(\"" + id + "Corp\",\"AMENDMENT_HISTORY\",\"cols=3;rows=100\")")
    fout.write('\t\n' * 100)
fin.close()
fout.close()

The generated file is tab-delimited and can be opened in Excel for Bloomberg to action. The first column is the security ID (without the “CORP” bit), the second column contains the formulas, and there are 100 blank rows between each Bloomberg call to ensure enough space. There is room to improve this approach!

Where can I find the field codes?

The mnemonic codes for each field are not the same as in the Bloomberg terminal but can be looked up using the FLDS<GO> function. The auto-complete feature of the function builder in Excel is a good alternative.

I don’t want to be restricted to Excel, can I use Java, C++, .NET, Python or R?

  • In theory, yes, that is possible, although you will need technical support and all the necessary development environment to be set up on a Bloomberg terminal. In summary, the APIv3 needs be installed (from the WAPI<GO> function) which provides the necessary libraries for Java, C, C++ and .NET.
  • To use R (or RStudio), you’ll need to connect via Java (with the standard rJava library and Rbbg library from the http://r.findata.org/ repository). Python connects via the C library.
  • This is too advanced to be part of the regular Business Data Service, sorry!

Last year, I worked with PhD candidate Ali Bayat, and we got the following R script working from RStudio with R 3.1.3 and Java 8u22. [Thank you, Ali.]

install.packages("rJava")
install.packages("Rbbg", repos = "http://r.findata.org/")
library("rJava")
library("Rbbg")
conn <- blpConnect()
bdp(conn, "AMZN US Equity", "NAME")

Where can I get more help?

  • As mentioned, our white binder Bloomberg Workbook (at the Bloomberg Suite and Precinct Library) describes how to use the historical end of day import from Excel.
  • There is help throughout the Excel add-in (look for white question mark in a blue circle icons). The templates all have a help sheet (coloured green tab). Remember to press the F1 key at any terminal function to get context-sensitive support.

 

Reshaping wide and long data in Stata

29 July 2016 1 comment
Reshape from long to wide via Stata

Reshape from long data (left) to wide data (right) via Stata

Downloading data from a platform like WRDS can give you the data you need but not in the layout you desire. This post shows you how to change from long data to wide data; that is, from a stacked column of companies to a column for each company, using date as unique identifier. It requires the use of Stata, available on PCs at the Library Finance Zone. (Version 14 was used in this test but it should work just as well on others.)

Preparation

Excel data (long)

Long data in Excel

In Excel, keep only the essential columns (you can add the others back later using VLOOKUP). In this example, some kind of ID (like PERMNO), Date and Return.The first row of the sheet contains the variable names. Save this file as type CSV (Comma delimited) *.csv, ignoring any warnings about features or multiple sheets. Close the file.

Stata import text delimited

Import text data delimited in Stata

Open Stata, click File > Import > Text data (delimited, *.csv). Click the Browse button to select your CSV file the click OK. You will see a command like the following was performed: import delimited P:\Reshape\long.csv. You can look at a summary of the data by typing list in the Command window.

Performing the command

The reshape command in Stata works in two modes.

  • Wide: instead of repeating dates and companies in long columns, create a new column for return for each company so that there is only one row for each date. This is also called long-to-wide. (You could instead have a column for each year with one row per company.)
  • Long: the reverse process, also called wide-to-long.

Here we are going from wide-to-long, with Date as the unique variable (i). We will lose the current Return column. Each new column will be Return for each company ID (j), with a name that is Return followed by that ID, such as Return100500, Return222622. The command to type in the Command window is:

reshape wide return, i(date) j(id)
Stata reshape

Reshape in Stata (click to enlarge)

After this is typed, you will get a summary of what just happened. If the text is in red, an error occurred, often if there are other variables that were not mentioned in the reshape command. You can list more than one variable to be reshaped (the command might then be reshape wide return volume price, i(date) j(id) for example).

Again, you can type the list command to see the data.

Finishing

Stata export

Export to Excel from Stata

You probably wish to export the data to Excel format to continue with your work. To do this, click File > Export > Data to Excel spreadsheet (*.xls, *.xlsx). Click the Save as button to choose where you wish to save to, and if you want older (pre-2007) or current Excel format. Tick the box “Save variable names to first row in Excel file” and press OK. You will see a command generated like this:  export excel using “P:\Reshape\reshaped.xlsx”, firstrow(variables)

Excel results

Now you have wide data in Excel!

You can now open the new file in Excel to see the reshaped data set.

More information

The Institute for Digital Education and Research at UCLA have two excellent example pages, one for reshaping long-to-wide and one for reshaping wide-to-long.

 

 

Categories: Data Analysis Tags: , ,

Use pairs in one VLOOKUP with historical exchange rates from Datastream

20 July 2016 Leave a comment

One of the most useful Excel functions I help students with is VLOOKUP – extremely useful to automatically bring in data from one table to another. If you have company accounts data from Compustat Global (via WRDS) for multiple companies in multiple currencies, you will need to bring in the change rate for each currency to show every field in US Dollars.

A B C D E
1 Company Currency Total Assets Local Exchange rate Total Assets USD
2 Co. X AUD 100 =VLOOKUP(B2, currencies, 2, FALSE) =C2*D2
3 Co. Y PLN 400 =VLOOKUP(B3, currencies, 2, FALSE) =C3*D3

Where ‘currencies’ is a named range elsewhere in the workbook:

From Currency Rate
AUD 1.3158
PLN 3.95685
USD 1

The VLOOKUP command looks for the currency code in the first column and returns the rate from the second column (use ‘FALSE’ to ensure an exact match). The conversion is simply a multiplication. (Note that you can convert from USD to USD at a rate of 1.0.) But what about making sure that the exchange rate is correct for the date of the data, and what if you have data from multiple years?

Performing VLOOKUP when matching a pair of variables

Thanks to Professor Marie Dutordoir for suggesting this kind of technique!

Usually you will want to look at accounts data over several years and for several companies. If you are working with several currencies then you will need a more complicated solution to this VLOOKUP exchange rate table, considering a day/month/year for each rate-currency pair. You do not need a second lookup table or other Excel functions, you just need to have another variable of date and a variable of currency-date pairs.

Global company accounts data

Global company accounts data. Given column G (currency) and column H (year), create new column F (map = currency&year).

The new column F (mapping) is formed from currency and year, so F2 contains “=G2&H2” (see image above). The ‘currencies’ named range now begins with a different first column produced in the same way (see image below). A VLOOKUP always matches on the first column of its range, and now this one pulls the rate from the fourth column, so rate is “=VLOOKUP(F2, currencies, 4, FALSE)” for row 2.

In the ‘currencies’ named range, I have combined the exchange rate calculation and lookup into one formula (column K). Repeat this for each data type that you wish to convert the currency for. Remember to have an entry for USD at rate 1 for each year.

paired-vlookup-lookup

Stacked historical exchange rates in a named range ‘currencies’. Created column A (map) from B&C.

This example only has the average exchange rate for each year, you may wish to extend this approach for monthly or even daily rates. If you do this, you will need to take extra care with the format of the mapping cells, but this exceeds the scope of the post.

Where to find historical exchange rates?

Use the Datastream Navigator to find exchange rates

Use the Datastream Navigator to find exchange rates. Start by searching one category (Exchange Rates), limit it “To Currency: United States Dollar”, then use the search box for the source currency.