Another Mystery: sas7bdat != sd2

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that the sd2 format is incompatible with the sas7bdat format (regular SAS users will probably know this already).

However, the structures of sd2 and sas7bdat formatted files appear superficially similar. For example, the sd2 file seems to have a 'header' followed by 'pages' of data and metadata. Also, the metadata appear to be structured into 'subheaders', much like the metadata of sas7bdat files.
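For anyone who wants to take the same kind of quick look, here is a minimal sketch of a hex dump helper in Python. It assumes nothing about the sd2 layout; it only prints raw bytes next to their printable ASCII, which is usually enough to spot a header and repeating page boundaries by eye. The function names `hexdump` and `peek` are my own, not part of any SAS tool or existing package.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex, and printable-ASCII columns."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' so the columns stay aligned.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def peek(path: str, nbytes: int = 256) -> str:
    """Hex-dump the first nbytes of a file (read in binary mode)."""
    with open(path, "rb") as fh:
        return hexdump(fh.read(nbytes))
```

Calling `print(peek("old_data.sd2"))` and `print(peek("new_data.sas7bdat"))` side by side (both file names are hypothetical) gives a rough first comparison of the two headers.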

Given these similarities, and that [good] software developers will tend to reuse code and concepts, I think the sd2 mystery would crack fairly easily, if someone could devote the effort. Since the format is obsolete, this might be a good project, perhaps for a CS [graduate] student, or another computer-savvy student. I'd be happy to facilitate this if someone is interested.