Using external compilers with R

These web pages describe compiler-specific details about writing external code to use with the current version of R, from a Windows programming perspective. For instructions for other platforms, and non-compiler-specific details, see the Writing R Extensions manual in the docs subdirectory of the R installation.

General advice and frequent problems

Calling conventions
Register preservation
Differences between .C() and .Fortran()
is.loaded() returning FALSE

Calling conventions

Calling conventions are the protocols used by the compiler when passing arguments to functions. R always uses the "cdecl" calling convention, which passes all arguments on the stack, pushing the rightmost one first; the caller is responsible for restoring the stack afterwards. For example, the call

.C("foo", as.integer(1), as.double(2), PACKAGE="bar")

will do the following:
  1. push a pointer to a vector containing the floating point value 2
  2. push a pointer to a vector containing the integer 1
  3. call the function foo in library bar.dll
  4. when it returns, restore the stack pointer to its original value by adding 8 to it (the size of two 32-bit pointers).
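On the C side, the routine reached by this call could look like the sketch below. The body of foo and the BAR_CDECL macro are illustrative assumptions, not part of R; the one fixed point is that .C() hands each R vector to the routine as a pointer. Writing __cdecl explicitly only matters when the compiler's default convention is something else.

```c
/* BAR_CDECL is a hypothetical macro: it spells out the cdecl convention on
   32-bit Windows compilers and expands to nothing everywhere else. */
#if defined(_WIN32) && !defined(_WIN64)
#define BAR_CDECL __cdecl
#else
#define BAR_CDECL
#endif

/* Illustrative body only. n points at the integer vector and x at the
   double vector; the result is written back in place, because routines
   called through .C() return void. */
void BAR_CDECL foo(int *n, double *x)
{
    x[0] = x[0] + n[0];
}
```

With the arguments from the call above, the element returned in the second vector would be 3.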
The standard calling convention in Windows is "stdcall", which is similar except that the routine that is called will remove the parameters from the stack. There are many other calling conventions used by different compilers: some or all parameters passed in CPU registers, parameters in the reverse order, etc. Symptoms of mismatched calling conventions are:

Register preservation

When R calls a function, it assumes that the EBP, EBX, EDI and ESI registers will be returned unchanged. In addition, the direction flag must be preserved. External code may assume that R preserves the same registers and the direction flag when that code calls into R.

Differences between .C() and .Fortran()

The two R functions .C("foo", ...) and .Fortran("foo", ...) differ in the following respects:
  1. .C("foo", ...) looks for the symbol "foo" in the external library, whereas .Fortran("foo", ...) looks for the symbol "foo_" (which is how g77 would export the subroutine "foo").
  2. .C("foo", arg=as.character(c("red","blue","green"))) passes the character mode argument as a pointer to an array of pointers to the strings, whereas .Fortran("foo", arg=as.character(c("red","blue","green"))) passes only a pointer to a 255 character buffer containing the first string, "red". In both cases the strings are null-terminated.
  3. Both .C and .Fortran allow arbitrary objects to be passed, but only C code which includes the R.h header file is likely to be able to read anything but simple vectors.
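The first two differences can be sketched from the C side as follows. The routine names strlens and dscale_ are hypothetical, and the second routine assumes the g77-style trailing underscore described in point 1, so that .Fortran("dscale", ...) would look for the symbol "dscale_".

```c
#include <string.h>

/* Intended for .C("strlens", ...): a character vector arrives as an array
   of pointers to null-terminated strings, one pointer per element. */
void strlens(char **strings, int *n, int *lengths)
{
    int i;
    for (i = 0; i < *n; i++)
        lengths[i] = (int) strlen(strings[i]);
}

/* Intended for .Fortran("dscale", ...): the exported symbol carries the
   trailing underscore that .Fortran() appends to the name it is given. */
void dscale_(double *x, int *n, double *factor)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] *= *factor;
}
```

Both routines take every argument as a pointer and return nothing, which is the only argument-passing style that .C() and .Fortran() use.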

is.loaded() returning FALSE

When R uses dyn.load() to load a DLL, it relies on the DLL's export table to find functions. Many compilers use fairly obscure methods to get a function name into the export table. If you don't follow them exactly, your function won't be available to R.

Some compilers (e.g. g77, as mentioned above) make changes to the function names before putting them in the export table. If you specify the original name, R may not be able to find the entry point.
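One common way to get a C function into the export table under its plain name is the __declspec(dllexport) attribute, which both Microsoft Visual C++ and the MinGW port of gcc accept. The sketch below assumes that approach; the macro BAR_EXPORT and the routine vecsum are hypothetical names, and other compilers may need a module definition file or different keywords instead.

```c
/* BAR_EXPORT is a hypothetical macro: it marks a function for export when
   building a Windows DLL and expands to nothing on other platforms. */
#ifdef _WIN32
#define BAR_EXPORT __declspec(dllexport)
#else
#define BAR_EXPORT
#endif

#ifdef __cplusplus
extern "C" {   /* keep a C++ compiler from mangling the exported name */
#endif

/* Illustrative routine: sums the n elements of x into *total. */
BAR_EXPORT void vecsum(double *x, int *n, double *total)
{
    int i;
    *total = 0.0;
    for (i = 0; i < *n; i++)
        *total += x[i];
}

#ifdef __cplusplus
}
#endif
```

If the name reaches the export table unchanged, is.loaded("vecsum") should report TRUE after the DLL has been loaded with dyn.load().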

Specific instructions on both of these issues are compiler dependent. However, you can diagnose the causes of errors by examining the export table of your DLL. There are a number of ways to do this:


C/C++

MinGW tools

If your code is written in reasonably portable C, then the easiest way to compile it is to use the R toolset together with the distributed Makefile (Makefile.packages in the source distribution). Instructions are in the readme.packages file.

Microsoft Visual C++

The readme.packages file contains instructions for compiling and linking VC++ code.

Cross-compiling from Linux

If you normally do your development on Linux, then it may be easiest to compile your Windows DLLs there. Instructions are available on CRAN in the document Building Microsoft Windows Versions of R and R packages under Intel Linux by Jun Yan and A.J. Rossini.

Last modified: April 15, 2003, by Duncan Murdoch