
Choosing the Right Software for the Job: an Overview of Statistical Computing at the UO

By Robin High (robinh@oregon.uoregon.edu)

If your UO curriculum requires data analysis and you're new to the field of statistics, you may not know where to start.

A number of good statistical software packages are available at the UO, but which one should you choose? The number and types of statistical procedures have grown so large that it's easy to overlook--or be totally unaware of--all the available methods. This article is designed to help you make the selection that best suits your needs. Before choosing a software package, ask yourself the following questions:

Statistics packages at the UO easily meet most of these requirements. Consider three major state-of-the-art statistics programs--SAS, SPSS, and Splus:

Availability - All are available on OREGON, DARKWING, or the ALPHAcluster.

Cost - SAS, SPSS, and Splus on the UO's large systems are free to all UO students, faculty, and staff.

Up-to-date techniques - SAS, SPSS, and Splus use numerically correct, sophisticated computational routines, and they'll meet just about every statistical computing need.

Technical support - Written documentation is available in both book and web formats, and all three software packages are supported by Computing Center consultants.

Portability. For some, portability is also desirable, especially if you want to work where a modem or network connection is not available. In this situation, you'll need to purchase a suitable statistics program and load it on a personal computer or laptop so you can work at home or while traveling.

PC versions of SAS, SPSS, and Splus are all available, although the cost of acquiring an individual license may be prohibitive and memory requirements may exceed your computer's resources. Less expensive student versions or multiple copy discounts may be an option, but be aware that some procedures may have limited capabilities or even be missing in those versions.

Other Statistics Programs Available at the UO

In addition to SAS, SPSS, and Splus, there are four other statistics programs worth exploring: MINITAB, BMDP, LISREL (for structural equations), and RATS (for time series). MINITAB is used for instructional purposes in several of the UO's introductory statistics classes, and LISREL and RATS are advanced programs used for special applications.

Spreadsheets

Spreadsheet programs like Microsoft Office's EXCEL are also available for personal computers. While I strongly endorse knowing how to use spreadsheets, I rarely recommend them for statistical analyses. Their choice of statistical methods is limited, and they're very awkward to use, especially with large data sets.

On the positive side, spreadsheets are great for data entry and storage, intermediate calculations, and graphical displays. Once data are entered into a spreadsheet, it's a very simple process to access data for use with any of the statistics programs mentioned above.

Statistics Packages on OREGON, DARKWING, ALPHA

Statistics packages on OREGON include SAS, SPSS, PRELIS/LISREL, and LINDO; programs on DARKWING include sas, bmdp, Splus5, eqs, and rats. DARKWING is case sensitive, so to invoke these programs, you must enter their names exactly as shown above.

Even though SPSS is not available on DARKWING, you can use it on the ALPHAcluster. (All files on DARKWING are also accessible on the ALPHAcluster.)

Each of these programs has its own command structure--well worth the time needed to learn.

Running Programs in Batch Mode on OREGON, DARKWING

A direct way of running statistics programs on OREGON and DARKWING is in batch mode, which requires you to write a file to perform the desired tasks. Everything you want the program to do with the data--from input and necessary transformations all the way to the final analysis--is clearly written into the program. This approach is known as the "syntax" method.

Example of Batch Mode on OREGON: Once written with a text editor such as pico, programs can be submitted in batch mode. For example, to run a SAS program on OREGON, enter the program commands into a file called myfile.sas and type

$ sas myfile 

The .sas extension is not necessary when submitting the command. SAS always writes a file called myfile.log, where you can check the execution for error messages and summary information. If the program produced output, you'll find it in a file called myfile.lis. You can also run SAS and SPSS in a more interactive mode through an X-Windows interface.

Example of Batch Mode on DARKWING: To run a SAS program on DARKWING, edit the command file (called myfile.sas) with a text editor, then submit the job by typing

% sas myfile

A file called myfile.log will always be produced, and printed output will appear in the file myfile.lst.
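Because the log file is where errors surface, it can help to scan it after every batch run rather than read it top to bottom. The following is a minimal sketch in Python, which is not one of the packages discussed in this article. It assumes that problem lines in a SAS log begin with the words ERROR or WARNING; the function name find_problems and the sample log text are made up for illustration.

```python
def find_problems(log_text):
    """Return (line_number, text) pairs for lines that look like errors or warnings."""
    problems = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        # Assumption: problem messages start with ERROR or WARNING.
        if line.startswith(("ERROR", "WARNING")):
            problems.append((number, line.rstrip()))
    return problems


# Made-up sample standing in for the contents of myfile.log.
sample_log = """NOTE: The data set WORK.SCORES has 25 observations and 3 variables.
ERROR: Variable AGE not found.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.SUMMARY may be incomplete.
"""

for number, line in find_problems(sample_log):
    print(f"line {number}: {line}")
```

To check a real run, read the contents of myfile.log into a string and pass it to find_problems in place of the sample text.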

Running Computationally Intensive Programs

Both DARKWING and the ALPHAcluster are UNIX-based systems that can be used for computationally intensive programs written in sas, Splus5, minitab, and rats. SPSS is currently available on ALPHA only, but it will soon be available on DARKWING (see "ALPHA Software Migration"). You can log in on the ALPHAcluster with your DARKWING userid and password. All your files and UNIX commands work the same as on DARKWING. To run an SPSS program on ALPHA, enter:

% spss -m myfile.sps > myfile.lis

Getting Started: Plan Ahead and Document Your Work!

Plan ahead so that you can analyze your data as effectively as possible. Always begin with your list of written research questions. This list will not only help you decide which procedures to use; in some cases it may also determine which software is best suited for the job. I cannot emphasize this next point enough: whatever software you select, always, always, always DOCUMENT YOUR WORK! It's extremely important to keep a current record of what you did, and why. Preserving your steps in a syntax file can save you a great deal of time and confusion in the long run.

Beware of 'Quick and Easy' Solutions

While easy "point-and-click" methods are available with some PC versions of statistics programs, these methods do not automatically record your steps in a syntax file. This approach is appealing but treacherous, because you can easily lose track of your sequence of steps or the formulas you applied. It also takes more time to repeat a similar analysis later on or to transfer the data to another user without confusion. This is particularly true of recent versions of SPSS for Windows, where (unlike EXCEL) formulas are not saved in the SPSS spreadsheet.

Advantages of Writing Code: the "Syntax" Method. Assuming you can use a text editor, one of the real strengths of writing code in SAS, SPSS, Splus, and other statistics programs is that this approach documents the entire process. All the steps you take, from initial data input to final analysis, are clearly written into the program. This "syntax" method is also a highly efficient way to proceed if you have a lot of repetitive tasks or many variables to process.

The SAS Advantage. One of the great strengths of SAS is that writing a syntax file gives you an automatically documented program. Many users balk at writing programs, assuming it's too difficult. While the SAS language may at first seem complex, the basics are actually quite easy and the selection of statistical procedures is not nearly as complicated as you might imagine. Also, if you have any type of data collected over time (e.g., repeated measures), or if you need to do programming or merge separate files, SAS is the best choice.

Writing Code in Splus. If you're mathematically inclined, think in matrix terms, and like to write your own programs, Splus offers many nice features and would be a good choice--especially for computer-intensive methods such as the bootstrap.
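To give a sense of what a computer-intensive method like the bootstrap involves, here is a minimal sketch. It is written in Python rather than Splus, purely for illustration. The data values, the function names, and the choice of 1000 resamples are all assumptions. The idea is to resample the observed data with replacement many times, compute the statistic of interest on each resample, and use the spread of those results to judge its uncertainty.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the mean of each of n_resamples resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated exactly
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # same size as the original data
        means.append(statistics.mean(resample))
    return means


def percentile_interval(values, lower=0.025, upper=0.975):
    """Return a simple percentile interval taken from the sorted values."""
    ordered = sorted(values)
    low_index = int(lower * (len(ordered) - 1))
    high_index = int(upper * (len(ordered) - 1))
    return ordered[low_index], ordered[high_index]


# Made-up measurements used only to exercise the functions.
scores = [12.1, 9.8, 11.4, 10.6, 13.0, 8.9, 10.2, 11.7]

means = bootstrap_means(scores)
low, high = percentile_interval(means)
print(f"sample mean: {statistics.mean(scores):.2f}")
print(f"approximate 95% bootstrap interval for the mean: {low:.2f} to {high:.2f}")
```

The loop is the whole method: the cost grows with the number of resamples, which is why the bootstrap benefits from a language that makes such loops easy to write and from a system with processing power to spare.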

Need More Information?

Workshops: Many of the concepts introduced in this article will be discussed in much greater detail in two fall workshops in the Knight Library's Electronics Classroom. Consult the workshop schedule for times and dates.

Web Resources: For detailed product information on SAS, SPSS, and Splus, see the following web sites:

SAS: http://www.sas.com

SPSS: http://www.spss.com

Splus: http://www.mathsoft.com

Other useful information concerning SAS, SPSS, Splus and direct connections to statistical sites is available at http://darkwing.uoregon.edu/~robinh/statistics.html


Fall 1999 Computing News | Computing Center Home Page