18.7 Advanced topic: callbacks from C to Caml

18.7.1. Applying Caml closures from C
18.7.2. Registering Caml closures for use in C functions
18.7.3. Registering Caml exceptions for use in C functions
18.7.4. Main program in C
18.7.5. Embedding the Caml code in the C code

So far, we have described how to call C functions from Caml. In this section, we show how C functions can call Caml functions, either as callbacks (Caml calls C which calls Caml), or because the main program is written in C.

18.7.1 Applying Caml closures from C

C functions can apply Caml functional values (closures) to Caml values. The following functions are provided to perform the applications:

  • caml_callback(f, a) applies the functional value f to the value a and returns the value returned by f.
  • caml_callback2(f, a, b) applies the functional value f (which is assumed to be a curried Caml function with two arguments) to a and b.
  • caml_callback3(f, a, b, c) applies the functional value f (a curried Caml function with three arguments) to a, b and c.
  • caml_callbackN(f, n, args) applies the functional value f to the n arguments contained in the array of values args.

If the function f does not return, but raises an exception that escapes the scope of the application, then this exception is propagated to the next enclosing Caml code, skipping over the C code. That is, if a Caml function f calls a C function g that calls back a Caml function h that raises a stray exception, then the execution of g is interrupted and the exception is propagated back into f.

If the C code wishes to catch exceptions escaping the Caml function, it can use the functions caml_callback_exn, caml_callback2_exn, caml_callback3_exn, caml_callbackN_exn. These functions take the same arguments as their non-_exn counterparts, but catch escaping exceptions and return them to the C code. The return value v of the caml_callback*_exn functions must be tested with the macro Is_exception_result(v). If the macro returns "false", no exception occurred, and v is the value returned by the Caml function. If Is_exception_result(v) returns "true", an exception escaped, and its value (the exception descriptor) can be recovered using Extract_exception(v).

18.7.2 Registering Caml closures for use in C functions

The main difficulty with the callback functions described above is obtaining a closure to the Caml function to be called. For this purpose, Objective Caml provides a simple registration mechanism, by which Caml code can register Caml functions under some global name, and then C code can retrieve the corresponding closure by this global name.

On the Caml side, registration is performed by evaluating Callback.register n v. Here, n is the global name (an arbitrary string) and v the Caml value. For instance:

let f x = print_string "f is applied to "; print_int x; print_newline()
let _ = Callback.register "test function" f

On the C side, a pointer to the value registered under name n is obtained by calling caml_named_value(n). The returned pointer must then be dereferenced to recover the actual Caml value. If no value is registered under the name n, the null pointer is returned. For example, here is a C wrapper that calls the Caml function f above:

void call_caml_f(int arg)
{
    caml_callback(*caml_named_value("test function"), Val_int(arg));
}

The pointer returned by caml_named_value is constant and can safely be cached in a C variable to avoid repeated name lookups. On the other hand, the value pointed to can change during garbage collection and must always be recomputed at the point of use. Here is a more efficient variant of call_caml_f above that calls caml_named_value only once:

void call_caml_f(int arg)
{
    static value * closure_f = NULL;
    if (closure_f == NULL) {
        /* First time around, look up by name */
        closure_f = caml_named_value("test function");
    }
    caml_callback(*closure_f, Val_int(arg));
}

18.7.3 Registering Caml exceptions for use in C functions

The registration mechanism described above can also be used to communicate exception identifiers from Caml to C. The Caml code registers the exception by evaluating Callback.register_exception n exn, where n is an arbitrary name and exn is any value of the exception to register; only the exception identifier is recorded, so the argument carried by exn does not matter. For example:

exception Error of string
let _ = Callback.register_exception "test exception" (Error "any string")

The C code can then recover the exception identifier using caml_named_value and pass it as the first argument to the functions caml_raise_constant, caml_raise_with_arg, and caml_raise_with_string (described in section 18.4.5) to actually raise the exception. For example, here is a C function that raises the Error exception with the given argument:

void raise_error(char * msg)
{
    caml_raise_with_string(*caml_named_value("test exception"), msg);
}

18.7.4 Main program in C

In normal operation, a mixed Caml/C program starts by executing the Caml initialization code, which then may proceed to call C functions. We say that the main program is the Caml code. In some applications, it is desirable that the C code plays the role of the main program, calling Caml functions when needed. This can be achieved as follows:

  • The C part of the program must provide a main function, which will override the default main function provided by the Caml runtime system. Execution will start in the user-defined main function just like for a regular C program.
  • At some point, the C code must call caml_main(argv) to initialize the Caml code. The argv argument is a C array of strings (type char **), terminated with a NULL pointer, which represents the command-line arguments, as passed as the second argument to main. The Caml array Sys.argv will be initialized from this parameter. For the bytecode compiler, argv[0] and argv[1] are also consulted to find the file containing the bytecode.
  • The call to caml_main initializes the Caml runtime system, loads the bytecode (in the case of the bytecode compiler), and executes the initialization code of the Caml program. Typically, this initialization code registers callback functions using Callback.register. Once the Caml initialization code is complete, control returns to the C code that called caml_main.
  • The C code can then invoke Caml functions using the callback mechanism (see section 18.7.1).
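The steps above can be sketched as a complete main function. This is a minimal illustration, not the only possible layout: it assumes a Unix-like system, and it assumes that the Caml initialization code registers a function under the name "test function", as in section 18.7.2. The argument 42 and the exit code 2 are arbitrary.

```c
#include <stdio.h>
#include <caml/mlvalues.h>
#include <caml/callback.h>

int main(int argc, char ** argv)
{
    const value * closure;

    (void) argc;  /* caml_main relies on the NULL terminator of argv */

    /* Initialize the runtime and run the Caml initialization code,
       which is assumed to call Callback.register "test function" f. */
    caml_main(argv);

    /* Control is back in C: look up the registered closure. */
    closure = caml_named_value("test function");
    if (closure == NULL) {
        fprintf(stderr, "\"test function\" was not registered\n");
        return 2;
    }

    /* Call into Caml through the callback mechanism. */
    caml_callback(*closure, Val_int(42));
    return 0;
}
```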

18.7.5 Embedding the Caml code in the C code

The bytecode compiler in custom runtime mode (ocamlc -custom) normally appends the bytecode to the executable file containing the custom runtime. This has two consequences. First, the final linking step must be performed by ocamlc. Second, the Caml runtime library must be able to find the name of the executable file from the command-line arguments. When using caml_main(argv) as in section 18.7.4, this means that argv[0] or argv[1] must contain the executable file name.

An alternative is to embed the bytecode in the C code. The -output-obj option to ocamlc is provided for this purpose. It causes the ocamlc compiler to output a C object file (.o file) containing the bytecode for the Caml part of the program, as well as a caml_startup function. The C object file produced by ocamlc -output-obj can then be linked with C code using the standard C compiler, or stored in a C library.

The caml_startup function must be called from the main C program in order to initialize the Caml runtime and execute the Caml initialization code. Just like caml_main, it takes one argv parameter containing the command-line parameters. Unlike with caml_main, this argv parameter is used only to initialize Sys.argv, not to find the name of the executable file.

The native-code compiler ocamlopt also supports the -output-obj option, causing it to output a C object file containing the native code for all Caml modules on the command-line, as well as the Caml startup code. Initialization is performed by calling caml_startup as in the case of the bytecode compiler.

For the final linking phase, in addition to the object file produced by -output-obj, you will have to provide the Objective Caml runtime library (libcamlrun.a for bytecode, libasmrun.a for native-code), as well as all C libraries that are required by the Caml libraries used. For instance, assume the Caml part of your program uses the Unix library. With ocamlc, you should do:

ocamlc -output-obj -o camlcode.o unix.cma other .cmo and .cma files
cc -o myprog C objects and libraries \
   camlcode.o -L/usr/local/lib/ocaml -lunix -lcamlrun

With ocamlopt, you should do:

ocamlopt -output-obj -o camlcode.o unix.cmxa other .cmx and .cmxa files
cc -o myprog C objects and libraries \
   camlcode.o -L/usr/local/lib/ocaml -lunix -lasmrun

Warning:

On some ports, special options are required in the final linking phase that links together the object file produced by the -output-obj option and the remainder of the program. Those options are shown in the configuration file config/Makefile generated during compilation of Objective Caml, as the variables BYTECCLINKOPTS (for object files produced by ocamlc -output-obj) and NATIVECCLINKOPTS (for object files produced by ocamlopt -output-obj). Currently, the only ports that require special attention are:

  • Alpha under Digital Unix / Tru64 Unix with gcc: object files produced by ocamlc -output-obj must be linked with the gcc options -Wl,-T,12000000 -Wl,-D,14000000. This is not necessary for object files produced by ocamlopt -output-obj.
  • Windows NT: the object files produced by Objective Caml have been compiled with the /MT flag, and therefore all other object files linked with them should also be compiled with /MT.