SPSS and PSPP binary file access for .SAV files. Does not require spss io DLLs to access. Based on the PSPP specifications.

tonykaralis/Spssly

Spssly

C# SPSS SAV file reader and writer library.

A library that enables reading and writing of SPSS data files (.sav) from and to a Stream. The library is UTF-8 safe.

This project is a fork of SPSS-.NET-Reader by fbiagi (based on spsslib-80132 by elmarj).

Since forking:

  • a lot of bugs have been fixed
  • code has been cleaned up
  • project ported to .NET Standard
  • .NET Core compatible
  • other utilities added to decrease file size

This library has been battle-tested in production on a few large deployments at MMR Research WorldWide.

Installation

Via Package Manager:

Install-Package Spssly

Via .NET CLI:

dotnet add package Spssly

Bloated data files:

Files exported from some platforms are extremely bloated. In that case, use the IDataReallocator to reallocate the data, and if you plan to use the data more than once, write the reallocated data to a new file. This consumes less memory in the long run. See the unit tests for examples. The reduction in file size is drastic, and performance improves noticeably on large datasets.

Example:

  • Third-party platform export -> 230 MB
  • After data reallocation -> 34 MB

The above example file could be compressed further, to around 2 MB, but only by saving it with the official SPSS software.

Reading a data file:

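// Requires: using System; using System.Collections.Generic; using System.IO;
// plus the library's own namespace for SpssReader.
// Open the file. Read-only, sequential access is used here for performance, but other options work too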
using (FileStream fileStream = new FileStream("data.sav", FileMode.Open, FileAccess.Read, FileShare.Read, 2048*10, 
                                              FileOptions.SequentialScan))
{
    // Create the reader, this will read the file header
    SpssReader spssDataset = new SpssReader(fileStream);
    
    // Iterate through all the variables
    foreach (var variable in spssDataset.Variables)
    {
        // Display name and label
        Console.WriteLine("{0} - {1}", variable.Name, variable.Label);
        
        // Display value-labels collection
        foreach (KeyValuePair<double, string> label in variable.ValueLabels)
        {
            Console.WriteLine(" {0} - {1}", label.Key, label.Value);
        }
    }
    
    // Iterate through all data rows in the file
    foreach (var record in spssDataset.Records)
    {
        foreach (var variable in spssDataset.Variables)
        {
            // Use the corresponding variable object to get the values.
            Console.Write(record.GetValue(variable));
            // This returns missing values as null, text without extra spaces,
            // and date values as DateTime.
            // For the original values, use record[variable] or record[int]
        }
    }
}

Writing a data file:

// Create Variable list
var variables = new List<Variable>
{
    new Variable
    {
        Label = "The variable Label",
        ValueLabels = new Dictionary<double, string>
        {
            {1, "Label for 1"},
            {2, "Label for 2"},
        },
        Name = "avariablename_01",
        PrintFormat = new OutputFormat(FormatType.F, 8, 2),
        WriteFormat = new OutputFormat(FormatType.F, 8, 2),
        Type = DataType.Numeric,
        Width = 10,
        MissingValueType = MissingValueType.NoMissingValues
    },
    new Variable
    {
        Label = "Another variable",
        ValueLabels = new Dictionary<double, string>
        {
            {1, "this is 1"},
            {2, "this is 2"},
        },
        Name = "avariablename_02",
        PrintFormat = new OutputFormat(FormatType.F, 8, 2),
        WriteFormat = new OutputFormat(FormatType.F, 8, 2),
        Type = DataType.Numeric,
        Width = 10,
        MissingValueType = MissingValueType.OneDiscreteMissingValue
    }
};
// Set the one special missing value
variables[1].MissingValues[0] = 999;  

// Default options
var options = new SpssOptions();

using (FileStream fileStream = new FileStream("data.sav", FileMode.Create, FileAccess.Write))
{
    using (var writer = new SpssWriter(fileStream, variables, options))
    {
        // Create and write records
        var newRecord = writer.CreateRecord();
        newRecord[0] = 15d;
        newRecord[1] = 15.5d;
        writer.WriteRecord(newRecord);
        
        newRecord = writer.CreateRecord();
        newRecord[0] = null;
        newRecord[1] = 200d;
        writer.WriteRecord(newRecord);
        writer.EndFile();
    }
}

License

Spssly is provided as-is under the MIT license. For more information see LICENSE.
