3.3.1 Initialization

When a user calls the R function file, the .Internal calling convention passes the arguments description, open, blocking, encoding, and raw to the C function do_url, defined in src/main/connections.c (the “file” entry in src/main/names.c confirms this mapping). Briefly, these arguments represent the URL, the read/write mode, whether read/write operations are blocking, the character encoding of text to be read from or written to the connection, and whether the connection should be a ‘raw’ connection. Other R functions that create connections typically call other C functions named according to the pattern do_connection, where connection stands for the connection type.

The do_url function first checks the arguments for validity, for example, that description is an object of type STRSXP and is not empty. The character encoding of description is checked and converted, if necessary, to the current locale encoding. In addition, the NextConnection function is used to get the index of the next available Rconnection in the Connections array, raising an error if none is available. Additional tests determine whether the file given by description is a regular system file, an internet URL specifying the transfer protocol (e.g. http:// or ftp://), or one of the supported compressed file types, such as a gzip compressed file.
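The classification of description can be pictured as a prefix test followed by a check of the file's leading bytes. The following is only a minimal sketch under stated assumptions, not the code in connections.c: the enum desc_kind, the helper has_prefix, and the function classify_description are hypothetical names, and only the gzip magic bytes (0x1f 0x8b) are tested.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical categories; the names are illustrative only. */
typedef enum { DESC_REGULAR, DESC_URL, DESC_GZIP } desc_kind;

/* Return nonzero if s starts with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Classify a description string (assumed helper, not part of R). */
static desc_kind classify_description(const char *description)
{
    assert(description != NULL);

    /* URLs are recognized by their transfer-protocol prefix. */
    if (has_prefix(description, "http://") ||
        has_prefix(description, "ftp://"))
        return DESC_URL;

    /* A gzip stream begins with the two bytes 0x1f 0x8b. */
    FILE *fp = fopen(description, "rb");
    if (fp != NULL) {
        unsigned char magic[2];
        size_t n = fread(magic, 1, sizeof magic, fp);
        fclose(fp);
        if (n == 2 && magic[0] == 0x1f && magic[1] == 0x8b)
            return DESC_GZIP;
    }

    /* Anything else, including a file that does not exist yet,
       is treated as a regular file in this sketch. */
    return DESC_REGULAR;
}
```

A caller would then dispatch on the result, for example passing a DESC_REGULAR description on to the regular-file constructor.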

Assuming the file is a regular (uncompressed) file, the do_url arguments are converted to their C equivalents (e.g. const char *) and passed to the newfile function, which dynamically allocates and initializes an instance of struct Rconn. The return value is an Rconnection pointer to the newly allocated connection. The newfile function is specific to regular file connections and tailors the struct Rconn accordingly. Other connection types have functions that perform similar tasks, following the naming convention newconnection.

The newfile function dynamically allocates an instance of struct Rconn, then passes it to the function init_con to be generically initialized. Code in the newfile function then assigns the file connection input/output methods to the function pointer members of struct Rconn. Finally, the private member is set to point to a dynamically allocated instance of struct fileconn, which holds a FILE stream pointer and other members used by the file connection methods. A pointer to the new struct Rconn, along with program control, is then returned to do_url.
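The allocate, initialize, and specialize sequence described above can be sketched in plain C. The types conn_sketch and fileconn_sketch are simplified stand-ins for struct Rconn and struct fileconn, with far fewer members than the real structures, and every function name ending in _sketch is hypothetical. For simpler error handling the sketch performs all allocations up front, which is an assumption of the example rather than a description of the R source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct Rconn (illustrative members only). */
typedef struct conn_sketch conn_sketch;
struct conn_sketch {
    char *description;
    char mode[5];
    int is_open;
    int (*open)(conn_sketch *);    /* type-specific methods */
    void (*close)(conn_sketch *);
    void *private_data;            /* type-specific state */
};

/* Simplified stand-in for struct fileconn. */
typedef struct { FILE *fp; } fileconn_sketch;

static int file_open_sketch(conn_sketch *con)
{
    fileconn_sketch *fc = con->private_data;
    fc->fp = fopen(con->description, con->mode);
    con->is_open = (fc->fp != NULL);
    return con->is_open;
}

static void file_close_sketch(conn_sketch *con)
{
    fileconn_sketch *fc = con->private_data;
    if (fc->fp != NULL) fclose(fc->fp);
    fc->fp = NULL;
    con->is_open = 0;
}

/* Generic initialization shared by all connection types
   (the role the text assigns to init_con). */
static void init_con_sketch(conn_sketch *con, char *description,
                            const char *mode)
{
    con->description = description;
    snprintf(con->mode, sizeof con->mode, "%s", mode);
    con->is_open = 0;
    con->open = NULL;
    con->close = NULL;
    con->private_data = NULL;
}

/* File-specific constructor (the role the text assigns to newfile). */
static conn_sketch *newfile_sketch(const char *description, const char *mode)
{
    assert(description != NULL && mode != NULL);

    conn_sketch *con = malloc(sizeof *con);
    char *desc = malloc(strlen(description) + 1);
    fileconn_sketch *fc = malloc(sizeof *fc);
    if (con == NULL || desc == NULL || fc == NULL) {
        free(con); free(desc); free(fc);
        return NULL;
    }
    strcpy(desc, description);

    init_con_sketch(con, desc, mode);   /* generic part */
    con->open = file_open_sketch;       /* file-specific methods */
    con->close = file_close_sketch;
    fc->fp = NULL;                      /* stream is opened later */
    con->private_data = fc;
    return con;
}

/* Release everything newfile_sketch allocated. */
static void destroy_conn_sketch(conn_sketch *con)
{
    if (con == NULL) return;
    if (con->is_open) con->close(con);
    free(con->private_data);
    free(con->description);
    free(con);
}
```

Keeping the type-specific state behind a void pointer lets the generic structure stay the same for every connection type, while the function pointers select the behavior.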

On return to do_url, a pointer to the newly allocated struct Rconn is copied to the Connections array, in the position returned by NextConnection. Other members of the struct Rconn structure are also modified at this point, including the blocking indicator and character encoding name encname.
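The bookkeeping in this step amounts to finding a free slot in a fixed-size table of pointers and storing the new connection there. The sketch below illustrates that idea only; the table size, the reduced conn_sketch structure, and the functions next_connection_sketch and register_connection_sketch are assumptions made for the example. In particular, the sketch signals a full table by returning -1, whereas the text states that NextConnection raises an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NCONNECTIONS_SKETCH 8   /* table size chosen for the example */

/* Reduced stand-in for struct Rconn: only the members set here. */
typedef struct {
    int blocking;
    char encname[101];
} conn_sketch;

/* Table of connection pointers; NULL marks a free slot. */
static conn_sketch *Connections_sketch[NCONNECTIONS_SKETCH];

/* Return the index of the first free slot, or -1 if the table is full. */
static int next_connection_sketch(void)
{
    for (int i = 0; i < NCONNECTIONS_SKETCH; i++)
        if (Connections_sketch[i] == NULL)
            return i;
    return -1;
}

/* Store con in the next free slot and fill in the per-call settings.
   Returns the slot index, or -1 if no slot is available. */
static int register_connection_sketch(conn_sketch *con, int blocking,
                                      const char *encoding)
{
    assert(con != NULL && encoding != NULL);

    int idx = next_connection_sketch();
    if (idx < 0)
        return -1;

    Connections_sketch[idx] = con;
    con->blocking = blocking;
    snprintf(con->encname, sizeof con->encname, "%s", encoding);
    return idx;
}
```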

By convention, R connections are not opened immediately unless a valid open argument is provided to the R-level function that creates the connection. If one is provided to do_url, the connection is opened using the open function pointer that the newfile function set in the newly created struct Rconn.
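The deferred-open convention can be illustrated with a small dispatch through the open function pointer. This is a sketch under stated assumptions: conn_sketch is a reduced stand-in for struct Rconn, fake_open is a placeholder method that only sets a flag, and maybe_open_sketch is a hypothetical helper. An empty mode string is treated as "do not open", matching the default open = "" of the R function file.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced stand-in for struct Rconn. */
typedef struct conn_sketch conn_sketch;
struct conn_sketch {
    char mode[5];
    int is_open;
    int (*open)(conn_sketch *);   /* set by the type-specific constructor */
};

/* Placeholder open method: marks the connection open and reports success. */
static int fake_open(conn_sketch *con)
{
    con->is_open = 1;
    return 1;
}

/* Open the connection only when a mode was requested.
   Returns 1 if the connection was opened, 0 if it was left closed,
   and -1 if the open method reported failure. */
static int maybe_open_sketch(conn_sketch *con, const char *open_mode)
{
    assert(con != NULL && con->open != NULL && open_mode != NULL);

    if (open_mode[0] == '\0')
        return 0;                 /* no mode given: stay closed */

    snprintf(con->mode, sizeof con->mode, "%s", open_mode);
    if (!con->open(con))
        return -1;                /* caller should clean up and report */
    return 1;
}
```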

The final steps of the do_url function register finalization code to deallocate memory associated with the connection via the external pointer mechanism (see Finalization), and build a return value. The return value is an INTSXP with the ‘conn_id’ attribute set to the external pointer, the class attribute c("file", "connection"), and a value equal to the index of the Rconnection pointer to the newly created struct Rconn in the Connections array. This index is used by subsequent R function calls to identify the new connection. Special care has been taken here to avoid exposing references to the connection structures to R code.
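Returning an index rather than a pointer means that callers hold only an integer, which is resolved against the table on every use. The sketch below shows that lookup in isolation; it omits the external pointer and finalizer entirely, and the names conn_sketch, set_connection_sketch, and get_connection_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NCONNECTIONS_SKETCH 8   /* table size chosen for the example */

/* Reduced stand-in for struct Rconn. */
typedef struct { int is_open; } conn_sketch;

/* Table of connection pointers; NULL marks an unused slot. */
static conn_sketch *Connections_sketch[NCONNECTIONS_SKETCH];

/* Install (or clear, with con == NULL) the entry for a valid index. */
static void set_connection_sketch(int id, conn_sketch *con)
{
    assert(id >= 0 && id < NCONNECTIONS_SKETCH);
    Connections_sketch[id] = con;
}

/* Resolve an integer handle to a connection.
   Returns NULL for an out-of-range index or an unused slot,
   so a stale or forged handle never yields a wild pointer. */
static conn_sketch *get_connection_sketch(int id)
{
    if (id < 0 || id >= NCONNECTIONS_SKETCH)
        return NULL;
    return Connections_sketch[id];
}
```

Because every access goes through the lookup, clearing a slot is enough to invalidate all outstanding handles to that connection.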