Another Mystery: sas7bdat != sd2

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that the sd2 format is not the same as, and not compatible with, the sas7bdat format (regular SAS users will probably know this already).

However, the structures of sd2 and sas7bdat formatted files appear superficially similar. For example, the sd2 file seems to have a 'header' followed by 'pages' of data and metadata. Also, the metadata appear to be structured into 'subheaders', much like the metadata of sas7bdat files.
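For anyone who wants to repeat this kind of quick glance, here is a minimal Python sketch that dumps the leading bytes of a file and counts how many leading bytes two files share. It assumes nothing about the actual layout of either format. The helper names (read_prefix, hexdump, common_prefix_len) and the file names in the usage comment are hypothetical, chosen only for illustration.

```python
def read_prefix(path, n=64):
    """Return the first n bytes of the file at path."""
    with open(path, "rb") as fh:
        return fh.read(n)


def hexdump(data, width=16):
    """Format bytes as lines of offset, hex values, and printable ASCII."""
    pad = width * 3 - 1  # width of the hex column when a row is full
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return lines


def common_prefix_len(a, b):
    """Count how many leading bytes two byte strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical usage; substitute the paths of real files:
#   old = read_prefix("example.sd2")
#   new = read_prefix("example.sas7bdat")
#   print("\n".join(hexdump(old)))
#   print("\n".join(hexdump(new)))
#   print(common_prefix_len(old, new), "leading bytes in common")
```

A side-by-side dump like this only shows whether the two files begin the same way. Confirming the 'header', 'page', and 'subheader' structure would take a closer look at offsets deeper in the file.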

Given these similarities, and that [good] software developers will tend to reuse code and concepts, I think the sd2 mystery would crack fairly easily, if someone could devote the effort. Since the format is obsolete, this might be a good project, perhaps for a CS [graduate] student, or another computer-savvy student. I'd be happy to facilitate this if someone is interested.

3 thoughts on “Another Mystery: sas7bdat != sd2”

  1. Q

    I'm confused as to what you mean by saying that .sd2 is incompatible with .sas7bdat. Certainly SAS is happy reading both. Do you mean that your (R?) code that reads in .sas7bdat files is not able to read (is incompatible with) .sd2 files?

  2. BioStatMatt Post author

    I mean that the formats are not identical, nor is one a clear subset or minor modification of the other. Hence, the SAS system must have two distinct mechanisms for reading the two file types.

    The documentation available in the sas7bdat package does not apply to .sd2 files. And, since the read.sas7bdat function implements the format described therein, it can't read .sd2 files.
