Using external compilers with R
These web pages describe compiler-specific details about writing external code
to use with the current version of R, from a Windows programming perspective.
For instructions for other platforms, and non-compiler-specific details,
see the Writing R Extensions manual in the docs
subdirectory of the R installation.
General advice and frequent problems
Calling conventions
Register preservation
Differences between .C() and .Fortran()
is.loaded() returning FALSE
Calling conventions
Calling conventions are the protocols used by the compiler when passing arguments to functions. R
always uses the "cdecl" calling convention, which passes all arguments on the stack,
pushing the rightmost one first; the caller is responsible for
restoring the stack afterwards.
For example, the call
.C("foo", as.integer(1), as.double(2), PACKAGE="bar")
will do the following:
- push a pointer to a vector containing the floating point value 2
- push a pointer to a vector containing the integer 1
- call the function foo in library bar.dll
- when it returns, restore the stack pointer to its original value by adding 8 (the size of
two pointers) to it.
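On the C side, a routine matching this call might look like the following minimal sketch. The body is hypothetical (nothing above says what foo actually does); the point is that each R vector arrives as a pointer, in the same left-to-right order as in the .C() call.

```c
/* foo.c: hypothetical routine for the .C() call above.
 * MinGW gcc and Visual C++ both default to cdecl for a plain C
 * function like this one, so no extra keyword is needed here. */
void foo(int *n, double *x)
{
    /* n points at the integer 1, x at the floating point value 2.
     * .C() hands results back through the vectors it passed in, so
     * the illustrative "work" is an in-place update of x. */
    x[0] = x[0] + n[0];
}
```

With the arguments shown in the example call, x[0] would hold 3 when the routine returns.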
The standard calling convention in Windows is "stdcall", which is similar except that the routine that is
called will remove the parameters from the stack. There are many other calling conventions used
by different compilers: some or all parameters passed in CPU registers, parameters in the reverse order,
etc.
Symptoms of mismatched calling conventions are:
- If your function uses stdcall instead of cdecl, there will likely be a crash when you return, because
R will try to remove the parameters from the stack after your function has already removed them.
- If your function uses any other calling convention, it will likely see garbage values in its arguments.
- If your code calls a function in R using stdcall, extra values will be left on the stack after each call; this is often
harmless, but may eventually cause a stack overflow.
- If your code calls a function in R using some other calling convention, R will see garbage parameter values.
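One way to guard against these mismatches is to state the calling convention explicitly rather than rely on compiler defaults. The following is a minimal sketch, assuming a compiler that accepts the __cdecl keyword on 32-bit Windows (Visual C++ and MinGW gcc both do). The macro name EXT_CDECL and the routine scale_vec are made up for illustration.

```c
/* Expand to __cdecl only on 32-bit x86 Windows builds; elsewhere the
 * keyword is unnecessary (and may not exist), so expand to nothing. */
#if defined(_M_IX86) || (defined(_WIN32) && defined(__i386__))
#  define EXT_CDECL __cdecl
#else
#  define EXT_CDECL
#endif

/* Hypothetical routine, callable from R as
 *   .C("scale_vec", x = as.double(x), n = as.integer(length(x)),
 *      factor = as.double(2))
 * The explicit EXT_CDECL keeps it cdecl even if the project-wide
 * default has been switched to stdcall. */
void EXT_CDECL scale_vec(double *x, int *n, double *factor)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] *= *factor;
}
```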
Register preservation
When R calls a function, it assumes that the EBP, EBX, EDI and ESI registers will be returned unchanged.
In addition, the direction flag must be preserved. Programs may assume R follows these conventions across
calls as well.
Differences between .C() and .Fortran()
The two R functions .C("foo", ...) and .Fortran("foo", ...) differ in the following
respects:
- .C("foo", ...) looks for the symbol "foo" in the external library, whereas
.Fortran("foo", ...) looks for the symbol "foo_" (which is how g77
would export the subroutine "foo").
- .C("foo", arg=as.character(c("red","blue","green")))
passes the character mode argument as a pointer to an array of
pointers to the strings, whereas .Fortran("foo", arg=as.character(c("red","blue","green")))
passes only a pointer to a
255 character buffer containing the first string, "red". In both cases the
strings are null-terminated.
- Both .C and .Fortran allow arbitrary objects to be passed, but only C code which
includes the R.h header file is likely to be able to read anything but simple vectors.
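Seen from the receiving side, these differences can be sketched as follows. The routine names and bodies are hypothetical. The trailing underscore follows the g77 convention described above, and the second routine is written in C only to show which symbol .Fortran() would look up and what it would receive.

```c
#include <string.h>

/* Hypothetical target of
 *   .C("strlengths", arg = c("red", "blue", "green"),
 *      n = as.integer(3), len = integer(3))
 * A character vector arrives as an array of pointers to
 * null-terminated strings. */
void strlengths(char **arg, int *n, int *len)
{
    int i;
    for (i = 0; i < *n; i++)
        len[i] = (int) strlen(arg[i]);
}

/* Hypothetical target of
 *   .Fortran("firstlen", arg = c("red", "blue", "green"), len = integer(1))
 * Note the trailing underscore in the exported name, and that only a
 * single buffer holding the first string is received. */
void firstlen_(char *arg, int *len)
{
    *len = (int) strlen(arg);
}
```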
is.loaded() returning FALSE
When R uses dyn.load() to load a DLL, it relies on the DLL's export
table to find functions. Many compilers use fairly obscure methods to get a function name into the export
table. If you don't follow them exactly, your function won't be available to R.
Some compilers (e.g. g77, as mentioned above)
make changes to the function names before putting them in the export table. If
you specify the original name, R may not be able to find the entry point.
Specific instructions on both of these issues are compiler dependent. However,
you can diagnose the causes of errors by examining the export table of your DLL.
There are a number of ways to do this:
- Use "objdump -x foo.dll" and search the output for the export tables. The
useful one is the one headed "[Ordinal/Name Pointer] Table", which lists the names
of the exported functions. Objdump is available in Brian Ripley's tool set; see
readme.packages for details.
- Use the Quick View utility that was distributed with some versions of Windows.
The "Export Table" lists the names of exports.
- Use "tdump -ee foo.dll", where tdump.exe is a utility that is
distributed with Borland compilers.
- Use the equivalent utility that came with your compiler. (Please send me
details if yours is not listed here.)
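To illustrate what should end up in the export table, here is a minimal sketch of a source file that exports one routine under an unmodified name. It assumes a compiler that supports __declspec(dllexport) (Visual C++ and MinGW gcc both do); other compilers may need a .def file or a different keyword, so check your compiler's documentation. The macro name EXT_EXPORT and the routine addone are hypothetical.

```c
/* Mark a function for the DLL export table on Windows; expand to
 * nothing elsewhere so the file still compiles on other platforms. */
#ifdef _WIN32
#  define EXT_EXPORT __declspec(dllexport)
#else
#  define EXT_EXPORT
#endif

/* A C++ compiler would otherwise mangle the name, so the export
 * table would no longer contain plain "addone". */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical routine, callable from R as
 *   .C("addone", x = as.double(x), n = as.integer(length(x))) */
EXT_EXPORT void addone(double *x, int *n)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] += 1.0;
}

#ifdef __cplusplus
}
#endif
```

If is.loaded("addone") still returns FALSE after dyn.load(), compare the name listed in the export table with the name passed to .C().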
C/C++
MinGW tools
If your code is written in reasonably portable C, then the easiest way to compile it is to
use the R toolset together with the distributed Makefile (Makefile.packages in the source distribution).
Instructions are in the readme.packages file.
Microsoft Visual C++
The readme.packages file contains instructions for compiling and linking
VC++ code.
Cross-compiling from Linux
If you normally do your development on Linux, then it may be easiest to compile your
Windows DLLs there. Instructions are available on CRAN
in the document
Building Microsoft Windows Versions of R and R packages under Intel
Linux by Jun Yan and A.J. Rossini.
Last modified: April 15, 2003, by Duncan Murdoch