I have been spending some time trying to figure out why R’s read.spss() function won’t read Qualtrics-generated SPSS SAV files. (Qualtrics is a very nice online survey system which we have been using with one of our partners.)
I have to admit that I have no interest in the structure of SPSS files (or most others, for that matter), so I was very glad to find Scott Czepiel’s spssread.pl Perl script to parse and display metadata.
So far I can tell that R’s read.spss() is croaking on null characters (as in ASCII 0) at the end of variable names. What was puzzling is that the open source PSPP seems to read these Qualtrics files just fine, and the read.spss() code was originally based on PSPP.
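For anyone who wants to check a file for this directly, here is a minimal Python sketch (it is not part of spssread.pl or read.spss()) that walks the variable records and reports name fields containing a null byte. It assumes the layout described in the PSPP system file format documentation: a 176-byte file header followed by type-2 variable records, each with an 8-byte name field, all little-endian. The function names are hypothetical.

```python
import struct

HEADER_SIZE = 176  # assumed size of the SAV file header record


def variable_name_fields(data: bytes):
    """Yield the raw 8-byte name field of each variable record (type 2)."""
    pos = HEADER_SIZE
    while pos + 32 <= len(data):
        # Six int32 fields: rec_type, type, has_var_label, n_missing, print, write
        rec_type, var_type, has_label, n_missing, _, _ = struct.unpack_from(
            "<6i", data, pos
        )
        if rec_type != 2:
            break  # the variable records are over
        raw_name = data[pos + 24 : pos + 32]
        pos += 32
        if has_label == 1:
            (label_len,) = struct.unpack_from("<i", data, pos)
            # The label is padded to a multiple of 4 bytes
            pos += 4 + (label_len + 3) // 4 * 4
        pos += 8 * abs(n_missing)  # each missing value takes 8 bytes
        if var_type != -1:  # -1 marks a long-string continuation record
            yield raw_name


def names_with_nul(data: bytes):
    """Return the name fields that contain at least one ASCII 0 byte."""
    return [name for name in variable_name_fields(data) if b"\x00" in name]
```

Something like names_with_nul(open("qualtrics_short.sav", "rb").read()) would then list the suspect name fields, provided the assumptions above hold for the file.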
To read these files from R, I have been reading them into PSPP first and saving new copies.
Thanks to spssread.pl, I can now see that PSPP doesn't like these variable names either. But instead of croaking, PSPP simply assigns new variable names, as spssread.pl -r shows:
$ ./spssread.pl -r qualtrics_short.sav
Name      Type          Label
A_1       String (20)   ResponseID
A_2       String (20)   ResponseSet
A_3       String (255)  Name
A_4       String (255)  ExternalDataReference
A_5       String (255)  Email
A_6       String (255)  IPAddress
A_7       String (255)  StartDate
A_8       String (255)  EndDate
A_9       Numeric       Finished
A_10      Numeric       Many airlines are involved in a continui
A_11      Numeric       Please check which applies to this trip.
A_12      Numeric       About how full was your cabin of the air
A_13      Numeric       What was the primary purpose of this fli
A_14      Numeric       Who made the decision regarding the airp
A_15      Numeric       Please divide 100 points among the five -Schedule convenience
A_16      Numeric       Please divide 100 points among the five -Preference for airline
A_17      Numeric       Please divide 100 points among the five -Frequent flyer/Mileage program
A_18      Numeric       Please divide 100 points among the five -Ticket price
A_19      Numeric       Please divide 100 points among the five -Company policy
A_20      Numeric       How close to the scheduled departure tim
A_21      Numeric       Please rate the services you received fr-Speed in getting through to Agent
A_22      Numeric       Please rate the services you received fr-Helpfulness of Agent
A_23      Numeric       Please rate the services you received fr-Courtesy of Reservation Agent
A_24      Numeric       Please rate the services you received fr-Accuracy of flight information
A_25      Numeric       Please rate the services you received fr-Accuracy of fare information
A_26      Numeric       Please rate the services you received fr-Value for the money
A_27      Numeric       Please rate the services you received fr-Overall rating of the flight
A_28      String (255)  Including this trip how many air trips fBusiness
A_29      String (255)  Including this trip how many air trips fPleasure
A_30      Numeric       For classification purposes are you...
A_31      String (255)  Approximate age:
A_32      Numeric       Occupation
A_33      Numeric       Approximately how many people are employ
A_34      String (255)  City and state of residence:
A_35      Numeric       THANK YOU FOR YOUR COOPERATION.
$ ./spssread.pl -r pspp_short.sav
Name      Type          Label
V1        String (20)   ResponseID
V2        String (20)   ResponseSet
V3        String (255)  Name
V4        String (255)  ExternalDataReference
V5        String (255)  Email
V6        String (255)  IPAddress
V7        String (255)  StartDate
V8        String (255)  EndDate
V9        Numeric       Finished
A22777    Numeric       Many airlines are involved in a continui
A22778    Numeric       Please check which applies to this trip.
A22779    Numeric       About how full was your cabin of the air
A22780    Numeric       What was the primary purpose of this fli
A22781    Numeric       Who made the decision regarding the airp
A22782_1  Numeric       Please divide 100 points among the five -Schedule convenience
A22782_2  Numeric       Please divide 100 points among the five -Preference for airline
A22782_3  Numeric       Please divide 100 points among the five -Frequent flyer/Mileage program
A22782_4  Numeric       Please divide 100 points among the five -Ticket price
A22782_5  Numeric       Please divide 100 points among the five -Company policy
A22783    Numeric       How close to the scheduled departure tim
A_21      Numeric       Please rate the services you received fr-Speed in getting through to Agent
A_22      Numeric       Please rate the services you received fr-Helpfulness of Agent
A_23      Numeric       Please rate the services you received fr-Courtesy of Reservation Agent
A_24      Numeric       Please rate the services you received fr-Accuracy of flight information
A_25      Numeric       Please rate the services you received fr-Accuracy of fare information
A_26      Numeric       Please rate the services you received fr-Value for the money
A_27      Numeric       Please rate the services you received fr-Overall rating of the flight
A22823_0  String (255)  Including this trip how many air trips fBusiness
A22823_1  String (255)  Including this trip how many air trips fPleasure
A22825    Numeric       For classification purposes are you...
A22826_0  String (255)  Approximate age:
A22827    Numeric       Occupation
A22828    Numeric       Approximately how many people are employ
A22829_0  String (255)  City and state of residence:
Q16       Numeric       THANK YOU FOR YOUR COOPERATION.
File header information can be displayed with spssread.pl -h:
$ ./spssread.pl -h qualtrics_short.sav
Record type      $FL2
Product name     @(#) SPSS DATA FILE PHP Writer (c) Qualtrics - 0.9.0
Layout code      2
Case Size        349
Compression      1
Weight index     0
Number of cases  -1
Bias             100.000000
Creation date    21 Jul 10
Creation time    10:28:27
File label

$ ./spssread.pl -h pspp_short.sav
Record type      $FL2
Product name     @(#) SPSS DATA FILE GNU pspp 0.7.6-g55e6e7 - i386-apple-darw
Layout code      2
Case Size        349
Compression      1
Weight index     0
Number of cases  1
Bias             100.000000
Creation date    15 Dec 10
Creation time    12:57:31
File label
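The same header fields can be pulled out with a few lines of Python. This is only a sketch, not how spssread.pl does it: it assumes the 176-byte header layout given in the PSPP system file format documentation and a little-endian file, and parse_header is a hypothetical name.

```python
import struct

# Assumed header layout: record type, product name, five int32 fields,
# the float64 compression bias, creation date and time, file label,
# and 3 bytes of padding. Total: 176 bytes.
HEADER_FORMAT = "<4s60s5id9s8s64s3x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def _text(raw: bytes) -> str:
    """Decode a fixed-width field and strip trailing spaces and NULs."""
    return raw.decode("ascii", "replace").rstrip(" \x00")


def parse_header(raw: bytes) -> dict:
    """Unpack the file header record into a dict of named fields."""
    (rec_type, product, layout, case_size, compression,
     weight_index, n_cases, bias, date, time, label) = struct.unpack(
        HEADER_FORMAT, raw[:HEADER_SIZE]
    )
    return {
        "record_type": _text(rec_type),
        "product_name": _text(product),
        "layout_code": layout,
        "case_size": case_size,
        "compression": compression,
        "weight_index": weight_index,
        "number_of_cases": n_cases,
        "bias": bias,
        "creation_date": _text(date),
        "creation_time": _text(time),
        "file_label": _text(label),
    }
```

Calling parse_header(open("qualtrics_short.sav", "rb").read(176)) should give the same values as the -h listing if those assumptions hold. A layout code that does not come back as 2 would suggest the file uses the other byte order.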
Thanks, Scott. spssread.pl sure beats the heck out of some quality time with od and the SAV file format docs!