Access attribute_hidden Functions in R Packages

Maybe the title should have been prepended with "Don't..."

The discussion below was written in the context of R 2.11.0, as distributed by Debian, and of the web resources available at the time. The ideas may not carry over to other distributions, operating systems, or future releases of R.

The source code of R is littered with "attribute_hidden" declarations. These declarations are a mechanism to ensure that the variable or function is only accessible by code in the core R distribution, and not by R extension packages. For example, the C function do_scan is declared attribute_hidden, presumably to promote use of the R function scan.

Package developers can consult the R development community when they want to access attribute_hidden functions. It's much more elegant and useful to convince an R core developer to simply remove the attribute_hidden declaration, or to offer an alternative. In the meantime, there are reasons to want other ways of reaching attribute_hidden symbols at the C level. For one, not every attribute_hidden function or variable has an R-level counterpart. For another, it is currently the only way to get even partial access to the symbols related to R connections, although an old comment in the R source code hints that a public connections API may be available to R packages in the future.

The attribute_hidden declaration is defined in multiple places in the R source code, including src/include/Defn.h:

#define attribute_hidden __attribute__ ((visibility ("hidden")))

This declaration is a compiler extension that affects how the symbol may be accessed. From the gcc documentation, the "hidden" attribute has the following meaning:

Hidden visibility indicates that the symbol will not be
placed into the dynamic symbol table, so no other "module"
(executable or shared library) can reference it directly.
Note that hidden symbols, while they cannot be referenced
directly by other modules, can be referenced indirectly via
function pointers

Clearly, attribute_hidden functions may be accessed using function pointers, which requires knowing the address where the function is loaded in memory when R is executed. In general, this information is not accessible to R packages. However, this information may be available in the R executable file, or the R shared library. The trick is to extract the address of the function we want to use, and then construct a function pointer to use it.
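To make the gcc passage concrete, here is a minimal sketch in plain C. It is not taken from the R sources: the names hidden_answer, get_hidden_answer and call_hidden are made up for illustration, and only the attribute_hidden macro is the one quoted above. It assumes gcc or a compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as quoted from Defn.h (a gcc-style extension). */
#define attribute_hidden __attribute__ ((visibility ("hidden")))

/* Hypothetical internal function, kept out of the dynamic symbol table. */
attribute_hidden int hidden_answer(int n)
{
    return n + 1;
}

typedef int (*hidden_fn)(int);

/* Hypothetical accessor: code inside the same module can still take
   the function's address and hand it out as an ordinary pointer. */
hidden_fn get_hidden_answer(void)
{
    return &hidden_answer;
}

/* A caller that holds only the pointer, not the symbol name. */
int call_hidden(hidden_fn f, int n)
{
    assert(f != NULL);
    return (*f)(n);
}
```

Here call_hidden(get_hidden_answer(), 41) evaluates to 42. In a real shared library, another module could not link against hidden_answer by name, but it could call it through a pointer obtained this way. R offers no such accessor for its hidden functions, which is why the rest of this note looks for the address elsewhere.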

Suppose we want to use the attribute_hidden function getConnection_no_err, defined in the file src/main/connections.c:

attribute_hidden Rconnection getConnection_no_err(int n);

The first step is to find the address of this function in memory when R is executed. As mentioned above, this information may be stored in the R executable file, usually located at $(R_HOME)/bin/exec/R. The objdump program in the binutils package may be used to extract such information, when present:

$ objdump -t `R RHOME`/bin/exec/R | grep getConnection_no_err
0000000000509eb0 l     F .text	0000000000000024              .hidden getConnection_no_err

In this example, objdump outputs several pieces of information associated with the function getConnection_no_err, the first of which is the (hex) address where the function will be loaded in memory when R is executed. We can isolate this bit of information with an additional command:

$ objdump -t `R RHOME`/bin/exec/R | grep getConnection_no_err | awk '{print $1}'
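If we would rather do the extraction in C than in awk, the leading field of the objdump line can be parsed with strtoull. This is only a sketch: parse_symbol_address is a hypothetical helper, and it assumes the line has the format shown above, with the address as the first hexadecimal field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse the leading hexadecimal field of one `objdump -t` line.
   Returns 0 when the line does not start with a hex number. */
uintptr_t parse_symbol_address(const char *line)
{
    char *end = NULL;
    unsigned long long value;

    assert(line != NULL);
    value = strtoull(line, &end, 16);
    if (end == line)
        return 0;               /* no hex digits were consumed */
    return (uintptr_t) value;
}
```

Given the objdump line from the example above, this returns 0x509eb0. Returning 0 for an unparsable line lets the caller refuse to build a function pointer from garbage.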

The next step is to construct a function pointer to which we can assign this address, and thereby call the function. In our package source code, we would use the following declarations

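#include <Rinternals.h>
/* Pointer to a function that takes an int and returns an Rconnection */
typedef Rconnection (* FUNP)(int);
/* Address reported by objdump for this particular R binary */
FUNP getConnection_no_err = (FUNP) 0x0000000000509eb0;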

and then we could call the function with a statement like

Rconnection con = (*getConnection_no_err)(0);
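The same cast can be tried out without R. The sketch below uses a made-up function, stand_in, in place of the hidden one, and takes its address at run time instead of hard-coding it, since a fixed address is only valid for one particular binary. The names stand_in, stand_in_address and call_at_address are hypothetical. Converting between integers and function pointers is implementation-defined in C, so this assumes a platform where the round trip works, as it does with gcc on Linux.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*FUNP)(int);

/* Hypothetical stand-in for the hidden function. */
static int stand_in(int n)
{
    return 2 * n;
}

/* The address as a plain integer, much as objdump would report it. */
uintptr_t stand_in_address(void)
{
    return (uintptr_t) &stand_in;
}

/* Rebuild a callable pointer from the integer address and call it. */
int call_at_address(uintptr_t address, int n)
{
    FUNP f;

    assert(address != 0);
    f = (FUNP) address;
    return (*f)(n);
}
```

Here call_at_address(stand_in_address(), 21) evaluates to 42, the same result as calling stand_in(21) directly.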

However, in this particular case, we wouldn't be able to do much with the Rconnection pointer other than pass it to another function. To get any useful information about the connection, we would first need to copy the struct Rconn declaration from src/include/Rconnections.h into our package source code. Of course, this is not considered "good" programming practice, but Rconnections.h is not a public header, and until it is made public there is little alternative.
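The idea of copying a private declaration can be sketched as follows. The layout here is hypothetical and is not the real struct Rconn; the names lib_conn, pkg_conn_copy, get_opaque_conn and conn_mode are made up. The point is only that the copy must match the original field for field, because the compiler has no way to check it.

```c
#include <assert.h>
#include <stddef.h>

/* "Library" side: the real definition stays private. This layout is
   hypothetical and is NOT the actual struct Rconn. */
struct lib_conn {
    const char *description;
    char mode[5];
};

static struct lib_conn the_stdin = { "stdin", "r" };

/* All the outside world receives is an opaque pointer. */
void *get_opaque_conn(void)
{
    return &the_stdin;
}

/* "Package" side: a hand-copied declaration that must match the
   library's layout field for field. */
struct pkg_conn_copy {
    const char *description;
    char mode[5];
};

/* Read the mode string through the copied declaration. */
const char *conn_mode(void *opaque)
{
    const struct pkg_conn_copy *con;

    assert(opaque != NULL);
    con = (const struct pkg_conn_copy *) opaque;
    return con->mode;
}
```

With matching layouts, conn_mode(get_opaque_conn()) points at the string "r". If the private declaration changes in a later release and the copy does not, the same code will silently read the wrong bytes.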

This is the gist of how attribute_hidden symbols may be accessed in package code. There are various tricks for automating the collection of the symbol address from the R executable. For those interested, I have prepared a small R extension package, example_1.0.tar.gz, containing a single function get_mode that is passed a connection description and returns the mode (e.g. "rw") by accessing the internal Rconnection pointer. For example:

> library(example)
> get_mode("stdin")
[1] "r"

Pay special attention to the file. This is where most of the work of finding the address for getConnection_no_err occurs. Also, note that this package passes R CMD check under R-2.11.0.

The trick of accessing attribute_hidden functions is not a "robust" method, as we might say in statistics. That is, it is easily broken. If objdump, grep, or awk is not installed, the installation will fail. Also, if an incorrect address is found, installation may fail, or get_mode may cause a segmentation fault or produce other odd behavior. Please leave feedback if you try it out.