Robin High
Statistical Programmer and Consultant
robinh@uoregon.edu
In recent years, the primary resources for statistical computing at the University of Oregon have been Darkwing and Oregon. Most of the statistical software on the large timesharing systems is now available only on Darkwing or Gladstone. Perhaps the biggest change is that statistical programs for desktop and laptop computers now have the computing edge in many situations, as I'll describe below.
Statistical programs for Darkwing and Gladstone. Darkwing and Gladstone are Unix-based systems for computationally intensive programs. Statistical programs available on these systems include SAS Version 8.2 and SPSS Release 6.14 (available only on Darkwing). Note: Although Release 6.14 is the most recent version of SPSS for Unix, it dates back to the mid-1990s and lacks some options that are available with more recent versions of SPSS that currently run on a PC or Mac.
More specialized programs. Other programs that emphasize statistical applications include SPLUS 6.1.2, BMDP 7.1, EQS, MINITAB Release 9.1, and RATS 5.01. All of these programs are available on both Darkwing and Gladstone.
SPLUS has many advanced features which more math-oriented users would feel comfortable learning. MINITAB is an interactive program that is primarily an instructional resource, and RATS is an advanced program for time series applications.
SAS is the only statistical program currently available on Oregon. Unless you have specific reasons for running SAS on this system, it is highly recommended that you make the transition to SAS on Darkwing or a personal computer. (Note that Oregon is being phased out and will no longer be available as of fall 2004.)
For a summary of all the programs we offer, see the "Software on Darkwing and Gladstone" table on page 9. These programs are available for use by all UO students, faculty, and staff and will cover many statistical computing needs.
Statistical Programs for Personal Computers. SAS is currently the only campuswide licensed statistical software we offer for personal computers. This license allows all UO students, faculty, and staff to use SAS on campus and also load it onto personal computers at home.
SPSS for PC or Mac. Unfortunately, we're not able to offer a university-wide license for SPSS for the PC or the Mac. However, you may purchase a license directly from SPSS, obtain a valid license from your department, or run it on a computer located in a campus computing laboratory. Less expensive student versions may be an option, but some advanced procedures are not available and the program may have limited capabilities.
Other programs such as STATA and SCA may be found within specific departments, but are not officially supported at this time by the Computing Center staff.
SAS or SPSS? SAS and SPSS have long been the two primary choices for most users. Both programs handle routine analysis methods quite satisfactorily. Version 8.2 of SAS is particularly good for a broad range of statistical computing applications and is very well suited for any project that requires ongoing data file manipulations. SPSS works well for many basic statistical applications if you don't anticipate needing ongoing data file manipulations.
Whichever program you choose, the first task is to make certain you have a valid license to run it!
In recent years the PC version of SAS has become much more versatile, powerful, and convenient to use. For members of the university community who own a computer, all it costs to acquire this software is the amount of time it takes to install (installation instructions are available at http://sas.uoregon.edu/).
Although PC SAS offers a Windows-based selection of procedures and options, its real power and versatility will be found by learning to enter commands in the editor window. Given the computing power of today's PCs, you can now use SAS to run most data analysis applications typically found in classrooms or research settings. Although SAS is designed for the PC, a few Mac users have loaded and run it successfully with a PC emulator.
Windows interface. Statistical programs on PCs and Macs generally work with a Windows interface, either through commands submitted with a program editor or with pull-down menus.
The recommended process is to enter programs into a text editor. You can run the entire program with a "submit" command or, by highlighting a specific section with the mouse, submit only that portion. Some programs are interactive in that they offer a menu of choices that allow you to point-and-click your way through the analysis process (I'll have more to say on that method below).
Batch mode. Unless you have X-Windows (e.g., Linux or the commercial product Exceed), you'd typically run statistical programs on Darkwing or Gladstone in batch mode.
Batch mode requires you to write a file of program commands with a text editor (such as pico). With this approach (known as the "syntax" method), the tasks you want the program to perform—from data input and transformations all the way to the final analysis—are clearly written into the list of program commands. Once a program is written, you submit it with a command at the system prompt.
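As a rough illustration of what such a command file might contain, the sketch below writes a small SAS program from the shell instead of typing it into pico. The file name myprog.sas, the dataset name scores, the variable names, and the data values are all made up for the example; a real program would read your own data.

```shell
#!/bin/sh
# Write a small, illustrative SAS program to myprog.sas.
# The dataset name, variable names, and data values are hypothetical.
cat > myprog.sas <<'EOF'
* Step 1: read the data into a SAS dataset;
data scores;
   input id score;
   datalines;
1 85
2 90
3 78
;
run;

* Step 2: transform (rescale the score to a proportion);
data scores;
   set scores;
   prop = score / 100;
run;

* Step 3: analyze;
proc means data=scores;
   var score prop;
run;
EOF
```

Each step (input, transformation, analysis) appears in the file in the order it is carried out, with a comment saying what it does, so the file doubles as a record of the analysis.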
Running a SAS Program. Darkwing and Gladstone are both case-sensitive, so to invoke a program you must enter its name in the proper case. For example, to run a SAS program on Darkwing, enter the program statements into a file called myprog.sas and then, at the % prompt, type:
% sas myprog
Note that you don't need the .sas extension when issuing this command.
Running a SAS program always produces an output file called myprog.log where you can check for error messages and summary information. When program output is produced it will be found in a file called myprog.lst.
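Checking the log can be partly automated. The following is a minimal sketch, assuming a Bourne-compatible shell (your login shell on Darkwing may differ, in which case save it as a script and run it with sh). The function name check_sas_log is hypothetical. It relies only on the fact that SAS begins problem lines in the log with ERROR or WARNING.

```shell
#!/bin/sh
# Hypothetical helper: list ERROR and WARNING lines in a SAS log.
# Returns 0 if no ERROR lines were found, 1 if at least one was found,
# and 2 if the log file does not exist.
check_sas_log() {
    log="$1"
    if [ ! -f "$log" ]; then
        echo "no log file: $log" >&2
        return 2
    fi
    # Print matching lines with their line numbers; no match is not a failure.
    grep -n -E '^(ERROR|WARNING)' "$log" || true
    if grep -q '^ERROR' "$log"; then
        return 1
    fi
    return 0
}

# Typical use after a batch run (commented out so the sketch runs without SAS):
#   sas myprog
#   check_sas_log myprog.log && more myprog.lst
```

With this arrangement the listing file is displayed only when the log contains no ERROR lines. Warnings are printed but do not stop you from viewing the output, so it is still worth reading them.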
Advantages of writing code with the "syntax" method. Whether you run programs on a personal computer or submit them in batch mode, it's extremely important to keep a current record of what you did and why. Whatever software you select or system you run it on, always document your work! Recording your data processing and analysis steps in a syntax file with concise and relevant comments can save you a great deal of time and confusion in the long run.
One of the real strengths of writing program code is that it clearly documents the data analysis process. This "syntax" method is also a highly efficient way to proceed if you have many repetitive tasks or many variables to process.
The SAS Advantage. While the SAS system may seem complex (and it is), the basic statements and instructions necessary to get started are easy to learn. Once data have been read into a SAS dataset, invoking specific statistical procedures is not complicated in most cases. Also, SAS is a particularly good choice if you have data collected over time (e.g., longitudinal studies), survey data, projects that require programming, or if you need to merge or update multiple files, among many other advantages. SAS also has a macro feature which makes repeated tasks simple to run once the basic program is written.
While "point-and-click" methods are available with many statistical programs, this approach has the potential to cause problems. Most programs do not "remember" the steps you take unless you write them or paste them to a file as you proceed. The ease of the interactive approach is appealing; however, it can cause confusion later on if you don't remember the sequence of analysis steps you took, specific options you chose, or what particular data transformations you applied. Also, it takes more time to repeat a similar or identical analysis later on. If you transfer a data file to another person, the structure or contents of the data file can be confusing if numerous variables computed over time are included. This is particularly true of SPSS for Windows, where the formulas applied are not saved in the SPSS spreadsheet (unlike Excel).
Spreadsheet programs such as Microsoft Office's Excel are also widely available for desktop and laptop computers. While spreadsheets are strongly endorsed for data entry and storage, they are rarely suitable for statistical analyses. The choice of statistical methods is limited, and they can be very awkward to use, especially if your dataset contains many rows and columns.
Spreadsheets can compute basic summary statistics and make a few helpful graphical displays. Once your data have been entered into a spreadsheet, it's a very simple process to access them with statistical programs such as SAS or SPSS through an IMPORT procedure, direct data exchange (DDE), or saving your data in a delimited text file and having SAS or SPSS read it. More information about using Microsoft Excel as a statistics package is available at http://www.practicalstats.com/Pages/excelstats.html
Workshops: Many of the concepts introduced in this article will be discussed in detail in SAS workshops this fall. Consult the IT workshop schedule (http://libweb.uoregon.edu/it/) for times and dates.
Web Resources: You'll find detailed information concerning statistical programs and direct connections to statistical websites at http://darkwing.uoregon.edu/~robinh/statistics.html
For detailed product information on SAS, SPSS, and SPLUS, visit the vendor websites: