Home > Data Analysis > Importing data into SPSS Statistics (part 3)

Importing data into SPSS Statistics (part 3)

SPSS logoThere are many ways to bring your data into IBM SPSS Statistics, for whatever manner of analysis or reorganisation you wish to perform. Perhaps you just want to open a very large text document that is too big for Excel. Please read Part 2 of this blog post before proceeding, which covers preparing an Excel document and the direct but limited way to import data into SPSS.

Approach 3: Read text data via Comma-Separated Values format

You may have data in a text file. If so, each line must represents a reading/case/observation, and the variables are delimited (separated) by a tab, comma or other character. The most common of these is the Comma-Separated Values (CSV) file, which may be saved with a *.CSV file extension or a *.TXT one.

Prepared CSV dataAn Excel worksheet can be saved in this format using the Save As command and the option “Save as type: CSV (Comma delimited) (*.csv)”. Doing this will save just the current worksheet and the current text formatting of the data, with a comma between each cell value, and with double quote marks around every cell value that itself contains a comma. Once saved, close the Excel document and dismiss any warnings, assuming you already have saved the document in Excel format as a backup. You can open a CSV file in Notepad or a similar text editor to check the format, but don’t just double-click the file or it will be imported straight back into Excel.

Now you are ready to bring the data into SPSS. Choose the File -> Read Text Data menu, and change the filter to “Files of type: Text (*.txt, *.dat, *.csv)”. Open your CSV file, and the Text Import Wizard will begin.

Text Import Wizard

1Text Import Wizard Step 1 2Text Import Wizard Step 2 3Text Import Wizard Step 3

The wizard asks questions about the format of your text file. It lets you save all the answers you give at the end as a format, so you can skip all the questions on subsequent imports. Step 1 simply asks if you already have a format file to load. If you don’t, just click Next.

Step 2 asks if your data is delimited (values have a comma or other character between them) or fixed width (the number of characters along each line determines where the value boundaries are). A CSV file is delimited. The next question asks if the first row contains variable names or not. The preview at the bottom will update with your answers, then click Next. Don’t worry if it doesn’t look totally right yet.

Step 3 asks about how many lines of data to read in. Usually the default values are correct here, but check with the preview to be sure, then click Next.

Text Import Wizard Step 4 5Text Import Wizard Step 5 6Text Import Wizard Step 6

If your data is delimited, Step 4 asks which delimiter characters you have, and if there are any text qualifiers (quote marks around the values). If you data has been exported from Excel to CSV, choose comma (and space), then double quote mark text qualifier. The preview should be looking better by now. Click Next.

Step 5 gives you the chance to name your variables and set which variable type (data format) each one is. (Comma is used if your number has commas for thousands marks.) You cannot change the number of decimal places — that had to be done in the formatting in Excel before exporting — but variable names that would not be valid in SPSS are automatically fixed. Click Next.

Step 6 is the final stage, which just offers the chance to save all your answers to a format file. Click Finish to see you data appear in SPSS, and remember to save immediately to the native SPSS Statistics format.

You can now add value labels and variable labels. One further thing to note if your data includes CUSIP codes (a type of US company identifier): if one of the entries contains the letter E with numbers and no other letters, it may get converted to a number in scientific format. This is a danger in Excel too, so it might be worth checking for this manually after the data has been imported.

Going further: Create a new query using Database Wizard

You may open from an ODBC database, such as a Microsoft Access file, but that is beyond the scope of this blog post.

Notes

IBM SPSS Statistics version 20 was used in this blog post, but the methods should apply to older and newer versions too. The data used in these examples comes from UK’s carbon footprint, 1997-2011, retrieved 3 April 2014.

Advertisements
Categories: Data Analysis Tags: , , ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s