Archive

Posts Tagged ‘SPSS’

Converting an SAS dataset into an Excel file

The Statistical package SAS is often used by researchers who have to deal with very large datasets. It is also used by WRDS (Wharton Research Data Services): a WRDS web query runs a SAS program (using the values on the web form as program parameters) and makes the result dataset available.

For some Eventus web queries on WRDS the results are only available as a SAS dataset. You can convert this to Excel (if it is not too large) using the SAS software.

SAS Export

SAS 9.4 Export wizard

SAS 9.4

  1.     Open the SAS dataset file
  2.     Select File and Export Data
  3.     Follow the SAS Export wizard (Excel is the default)
  4.     Select Finish to start the export
  5.     If the export fails, this is probably due to an incompatability between SAS and Microsoft Office/Excel. The simplest workaround is to use a different format: comma separated values (*.csv), Stata file (*.dta), SPSS file (*.sav) etc.

SAS Enterprise Guide

  •     Open the SAS dataset file
  •     Select File and Export
  •     Select .xls as output format

For detail on converting from SAS to Excel see the 64-bit-gotchas post on The SAS Dummy blog.

SPSS Statistics

If SAS is not available on your PC, you can open a SAS data file in SPSS Statistics and then export to Excel.

Getting data in and out of Eventus

13 February 2015 Leave a comment

WRDS top cornerEventus is a tool to perform event studies on US companies, pulling data directly out of the CRSP stock database. It is available through the WRDS portal for postgraduate students and members of staff at The University of Manchester.

The interface of Eventus is similar to the other platforms available through WRDS, but it can be a little more confusing to the newcomer. The post aims to help you get data in and out of Eventus, resulting in a usable Excel or CSV document. It does not aim to explain the mechanics of Eventus or what is really happening under the surface!

Preparing company identifiers and dates

Eventus 1 - source data

Mergers and acquisitions data, including acquirer PERMNO and announcement date

Eventus 2 - prepare columnsTypically, people who decide to use Eventus already have a list of acquiring companies, identified by PERMNO codes, and a corresponding list of dates to study (in YYYYMMDD format). Copy these two columns to a new Excel workbook in this order, then delete the header row if there is one.

Eventus 3 - ready to uploadThis workbook will need to be Saved as type: Text (Tab delimited) (*.txt). Ignore any warnings about the file type not accepting multiple sheets — you only need the active worksheet. Accept any warning about features not compatible with this file format. After the file has saved, close it, ignoring any further warnings about not saving changes.

You should now have a plain text file that begins like this if you open it in Notepad++ or similar.

Using Eventus

Your file is now ready to be uploaded to Eventus. Log onto WRDS, then choose Eventus from the drop-down list Select a Data Set on the top-left. Choose Cross-Sectional Analysis Daily. (Only the cross-sectional analysis options produce data that can be used in the methods described by this blog post.) You will see that CUSIP codes could have been chosen instead of PERMNO, but I suggest PERMNO where possible — there can be confusion over CUSIP codes sometimes, whether you need 6, 8, or 9-digit versions.

Eventus 4 - fill in fields

Launch Eventus Cross-Sectional Analysis, fill in the fields and upload the prepared file

Choose your new text file to upload. Fill in the options as you require, including the event period and alternative windows.

Eventus 5 - download

Download the files produced by Eventus; the first is the most useful (*.sas7bdat)

When ready, click on the Submit Query button. The next screen can take a few seconds to load, but hopefully it will look like this (above). If not, read the error message and check what you entered on the previous screen.

You can download three files, but the first one is the most useful. It has a filename ending in .sas7bdat. This is a SAS dataset.

Opening or converting the results

If you have SAS installed on your computer, you can open the results file directly in that (double-click the file) then choose File > Export and Files of type: Microsoft Excel Workbooks.

Eventus 6 - Open output in SAS

Output from Eventus opened in SAS Enterprise Guide which can be exported to Microsoft Excel

If you have SPSS instead, you can import the results file then export it to Excel format or CSV.

  • To launch SPSS, go to Start > IBM SPSS Statistics > IBM SPSS Statistics 22 (or 20).
  • To import in SPSS, use File > Open > Data and then Files of type: SAS (*.sas7bdat).
  • To export it from SPSS, use File > Save As and then Save as type: Excel 2007 through 2010 (*.xlsx) or Comma delimited (*.csv).
Eventus 6 - SPSS and Excel

Output from Eventus opened in IBM SPSS Statistics and exported to Microsoft Excel

Summary

This guide should help you to upload data to and download data from Eventus, ready for the next stage of your study. If you found this useful, or if you need further help explaining these steps, please leave a comment below.

Updated 20 January 2016 with some new screenshots for WRDS and SAS.

Categories: Data Analysis Tags: , ,

Importing data into SPSS Statistics (part 3)

7 May 2014 1 comment

SPSS logoThere are many ways to bring your data into IBM SPSS Statistics, for whatever manner of analysis or reorganisation you wish to perform. Perhaps you just want to open a very large text document that is too big for Excel. Please read Part 2 of this blog post before proceeding, which covers preparing an Excel document and the direct but limited way to import data into SPSS.

Approach 3: Read text data via Comma-Separated Values format

You may have data in a text file. If so, each line must represents a reading/case/observation, and the variables are delimited (separated) by a tab, comma or other character. The most common of these is the Comma-Separated Values (CSV) file, which may be saved with a *.CSV file extension or a *.TXT one.

Prepared CSV dataAn Excel worksheet can be saved in this format using the Save As command and the option “Save as type: CSV (Comma delimited) (*.csv)”. Doing this will save just the current worksheet and the current text formatting of the data, with a comma between each cell value, and with double quote marks around every cell value that itself contains a comma. Once saved, close the Excel document and dismiss any warnings, assuming you already have saved the document in Excel format as a backup. You can open a CSV file in Notepad or a similar text editor to check the format, but don’t just double-click the file or it will be imported straight back into Excel.

Now you are ready to bring the data into SPSS. Choose the File -> Read Text Data menu, and change the filter to “Files of type: Text (*.txt, *.dat, *.csv)”. Open your CSV file, and the Text Import Wizard will begin.

Text Import Wizard

1Text Import Wizard Step 1 2Text Import Wizard Step 2 3Text Import Wizard Step 3

The wizard asks questions about the format of your text file. It lets you save all the answers you give at the end as a format, so you can skip all the questions on subsequent imports. Step 1 simply asks if you already have a format file to load. If you don’t, just click Next.

Step 2 asks if your data is delimited (values have a comma or other character between them) or fixed width (the number of characters along each line determines where the value boundaries are). A CSV file is delimited. The next question asks if the first row contains variable names or not. The preview at the bottom will update with your answers, then click Next. Don’t worry if it doesn’t look totally right yet.

Step 3 asks about how many lines of data to read in. Usually the default values are correct here, but check with the preview to be sure, then click Next.

Text Import Wizard Step 4 5Text Import Wizard Step 5 6Text Import Wizard Step 6

If your data is delimited, Step 4 asks which delimiter characters you have, and if there are any text qualifiers (quote marks around the values). If you data has been exported from Excel to CSV, choose comma (and space), then double quote mark text qualifier. The preview should be looking better by now. Click Next.

Step 5 gives you the chance to name your variables and set which variable type (data format) each one is. (Comma is used if your number has commas for thousands marks.) You cannot change the number of decimal places — that had to be done in the formatting in Excel before exporting — but variable names that would not be valid in SPSS are automatically fixed. Click Next.

Step 6 is the final stage, which just offers the chance to save all your answers to a format file. Click Finish to see you data appear in SPSS, and remember to save immediately to the native SPSS Statistics format.

You can now add value labels and variable labels. One further thing to note if your data includes CUSIP codes (a type of US company identifier): if one of the entries contains the letter E with numbers and no other letters, it may get converted to a number in scientific format. This is a danger in Excel too, so it might be worth checking for this manually after the data has been imported.

Going further: Create a new query using Database Wizard

You may open from an ODBC database, such as a Microsoft Access file, but that is beyond the scope of this blog post.

Notes

IBM SPSS Statistics version 20 was used in this blog post, but the methods should apply to older and newer versions too. The data used in these examples comes from UK’s carbon footprint, 1997-2011, retrieved 3 April 2014.

Categories: Data Analysis Tags: , , ,

Importing data into SPSS Statistics (part 2)

24 April 2014 2 comments

SPSS logoThere are many ways to bring your data into IBM SPSS Statistics, for whatever manner of analysis or reorganisation you wish to perform. Here are a few approaches to consider, with some of their relative merits and shortcomings. Please read Part 1 of this blog post before proceeding, which covers opening an existing data source in SPSS Statistics format and typing or pasting in data.

Approach 2: Read text data directly from other formats

You can open data files created in other applications, from the File -> Open menu or the File -> Read Text Data menu. (If you spot any difference between these two, please comment below.) The file type choices include:

  • Text (*.txt, *.dat, *.csv) — tab, comma or otherwise-delimited
  • Excel (*.xls, *.xlsx, *.xlsm) — one sheet only
  • SAS (*.sas7bdat or others) — one data set
  • Stata (*.dta)

The plain text or Excel document may have variable names in the first row, but only the first row. It may instead have no variable names at all. It is worth preparing your source data file and checking that it has no surprises for SPSS like a variable type changing from 2 decimal places to no decimal places to text within the same column.

Preparing an Excel document

Excel original data layoutThis example uses UK carbon emissions data across difference streams from 1997 to 2011, starting in an Excel document with several sheets and some formatting.

  • The numerical data has 9 decimal places but is formatted to show none.
  • The variable names are split over two rows and contain space characters.
  • The data does not start in the top-left corner.

Excel file with prepared data layoutThe images to the right show how the data has been copied to a new workbook with one worksheet (tab).

  • The variable names are in the first row only, with no spaces or punctuation (although underscores and dots would be okay).
  • The data starts from the first column (which contains an identifier “year”), and from the second row.

Example: reading text directly from Excel (quick)

SPSS: CO2 import from ExcelIn, SPSS, choose the File -> Read Text Data menu and select your prepared Excel document. You can choose if the file has variable names in the first row or now, choose which sheet to read from, and optionally specify a cell range and maximum column character width for string (text) columns.

SPSS file imported from ExcelAfter clicking on OK, the data will immediately appear in SPSS (see right). The variable types were guessed by SPSS based on the content of the first row of actual data, and the number of decimal places is set to 1 for numerical data. Although you can change the number of decimal places in the Variable View tab, the data past the first place is lost. The identifier variable “year” has had one decimal place added too, which should be removed too.

This method does not allow you to change the variable type or properties, so consider the longer approach outlined in Part 3, going via a plain text file.

This post continues with Part 3.

Notes

IBM SPSS Statistics version 20 was used in this blog post, but the methods should apply to older and newer versions too. The data used in these examples comes from UK’s carbon footprint, 1997-2011, retrieved 3 April 2014.

See also: Research Financial: Using plain text files in Excel

 

Categories: Data Analysis Tags: , , ,

Importing data into SPSS Statistics (part 1)

4 April 2014 1 comment

SPSS logoThere are many ways to bring your data into IBM SPSS Statistics, for whatever manner of analysis or reorganisation you wish to perform. Here are a few approaches to consider, with some of their relative merits and shortcomings. This post is split over three parts.

Before you proceed, you should be at least slightly familiar with the main window in SPSS Statistics, the Data Editor. Specifically, there are two views as identified by the orange tab at the bottom-left of the screen: Data View and Variable View. The former has the variables in columns with observations/readings in rows; the latter has the variables in rows with their meta-data in columns.

SPSS Data Editor showing labels SPSS Data Editor tabs  SPSS Data Editor with Variables View

Best-case scenario: Open an existing data source in SPSS Statistics format

You might be fortunate enough to already have data in the native format to SPSS Statistics (*.sav). This is the format to choose when saving your data while working in SPSS.

Each variable has a Name, and that name cannot contain spaces, punctuation (except dots or underscores) and cannot begin with a number. In older versions of SPSS, variable names could only be 8 characters long. It is good practice to use more explanatory Labels with your variables as well as short-hand Names. This will help you if you come back to your data in the future and cannot remember it as well as you thought (it happens to everyone!) or if you pass on your data to somebody else.

Approach 1: Type in data, or copy and paste

You may create a new, blank document and save it in SPSS Statistics default format. Set up your variables carefully, including the variable type (e.g. Number) and number of decimal places, before you type any data in. This is especially important if you choose to copy and paste your data in from another source such as a spreadsheet, or you risk your data being rounded down to integers.

SPSS Data Editor showing valuesRemember to use a numerical variable type wherever possible, even if your data appears to be in labelled categories such as Yes/No or UK/Europe/World. SPSS works best with numbers, so record your categories as integers (e.g. 0/1 for No/Yes) then assign value labels once it is in SPSS (that’s the Values column in the Variable View).

To toggle the display of the data category labels and the numbers behind them, go to View -> Value Labels when in the Data View tab.

This post continues with Part 2.

Notes

IBM SPSS Statistics version 20 was used in this blog post, but the methods should apply to older and newer versions too. The data in the screenshots come from a British Crime Survey, 2010, and were prepared by The Cathy Marsh Centre for Census and Survey Research.

Categories: Data Analysis Tags: , , ,