This section describes how Caml data types are encoded in the value type.
Caml type | Encoding |
---|---|
int | Unboxed integer values. |
char | Unboxed integer values (ASCII code). |
float | Blocks with tag Double_tag. |
string | Blocks with tag String_tag. |
int32 | Blocks with tag Custom_tag. |
int64 | Blocks with tag Custom_tag. |
nativeint | Blocks with tag Custom_tag. |
Tuples are represented by pointers to blocks, with tag 0.
Records are also represented by zero-tagged blocks. The ordering of labels in the record type declaration determines the layout of the record fields: the value associated with the label declared first is stored in field 0 of the block, the value associated with the label declared next goes in field 1, and so on.
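Because field positions follow declaration order, C code can name the indices once instead of scattering literal numbers. The sketch below is a self-contained toy model and does not use the runtime headers: `toy_block`, `TOY_FIELD`, `toy_make_point` and the record type `point` are hypothetical stand-ins for a real block, the Field macro, and an application record.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a heap block: a tag, a size, and word-sized fields.
   This model stores plain C integers in the fields; a real block would
   store runtime values. */
typedef struct {
    unsigned tag;
    size_t size;
    intptr_t fields[3];
} toy_block;

/* Hypothetical analogue of the Field macro. */
#define TOY_FIELD(b, i) ((b)->fields[(i)])

/* Field indices for a hypothetical record
     type point = { x : int; y : int; id : int }
   in declaration order: first label -> field 0, next -> field 1, ... */
enum { POINT_X = 0, POINT_Y = 1, POINT_ID = 2 };

/* Build the toy representation of a point: a block of size 3, tag 0. */
static toy_block toy_make_point(intptr_t x, intptr_t y, intptr_t id)
{
    toy_block b;
    b.tag = 0;
    b.size = 3;
    TOY_FIELD(&b, POINT_X) = x;
    TOY_FIELD(&b, POINT_Y) = y;
    TOY_FIELD(&b, POINT_ID) = id;
    return b;
}
```

If the labels of the record type are later reordered, only the enumeration needs to change.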
As an optimization, records whose fields all have static type float are represented as arrays of floating-point numbers, with tag Double_array_tag. (See the section below on arrays.)
Arrays of integers and pointers are represented like tuples, that is, as pointers to blocks tagged 0. They are accessed with the Field macro for reading and the modify function for writing.
Arrays of floating-point numbers (type float array) have a special, unboxed, more efficient representation. These arrays are represented by pointers to blocks with tag Double_array_tag. They should be accessed with the Double_field and Store_double_field macros.
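The practical consequence of the two array representations is that C code must look at the block tag before choosing an accessor. The following toy model sketches that dispatch without the runtime headers; `toy_array`, `toy_get_as_double` and the two `TOY_` tag constants are hypothetical names, and real code should test against Double_array_tag and use Field or Double_field.

```c
#include <stddef.h>

/* Model-only tag constants; real code should use the constants
   defined by the runtime headers. */
enum { TOY_PLAIN_TAG = 0, TOY_DOUBLE_ARRAY_TAG = 254 };

/* Toy array block: the tag says which member of `data` is meaningful. */
typedef struct {
    unsigned tag;
    size_t size;
    union {
        const long *words;      /* tag 0: one word per element */
        const double *doubles;  /* float array: unboxed doubles */
    } data;
} toy_array;

/* Read element i as a double, choosing the accessor from the tag,
   in the same spirit as choosing between Field and Double_field. */
static double toy_get_as_double(const toy_array *a, size_t i)
{
    if (a->tag == TOY_DOUBLE_ARRAY_TAG)
        return a->data.doubles[i];     /* analogue of Double_field */
    return (double) a->data.words[i];  /* analogue of Field */
}
```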
Constructed terms are represented either by unboxed integers (for constant constructors) or by blocks whose tag encodes the constructor (for non-constant constructors). The constant constructors and the non-constant constructors for a given concrete type are numbered separately, starting from 0, in the order in which they appear in the concrete type declaration. Constant constructors are represented by unboxed integers equal to the constructor number. Non-constant constructors declared with an n-tuple as argument are represented by a block of size n, tagged with the constructor number; the n fields contain the components of the tuple argument. Other non-constant constructors are represented by a block of size 1, tagged with the constructor number; field 0 contains the value of the constructor argument. Example:
Constructed term | Representation |
---|---|
() | Val_int(0) |
false | Val_int(0) |
true | Val_int(1) |
[] | Val_int(0) |
h::t | Block with size = 2 and tag = 0; first field contains h, second field t |
As a convenience, caml/mlvalues.h defines the macros Val_unit, Val_false and Val_true to refer to (), false and true.
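The separate numbering of constant and non-constant constructors is easy to get wrong by one, so it may help to see it as a small procedure. The sketch below is a toy model only: `toy_repr` and `toy_constructor_repr` are hypothetical names, and the function merely counts constructors in declaration order; it does not touch any runtime value.

```c
#include <stddef.h>

/* Toy description of how one constructor is represented. */
typedef struct {
    int is_block;  /* 0: unboxed integer, 1: block */
    int number;    /* the integer value, or the block tag */
    size_t size;   /* block size (0 for unboxed integers) */
} toy_repr;

/* arity[i] is the number of fields of the i-th constructor in
   declaration order: 0 for a constant constructor, n for an n-tuple
   argument, 1 for any other single argument.
   Returns the representation of constructor number `which`. */
static toy_repr toy_constructor_repr(const size_t *arity, size_t which)
{
    int constants = 0, blocks = 0;
    toy_repr r = { 0, 0, 0 };
    size_t i;

    /* Count the earlier constructors of each kind: the two kinds are
       numbered independently, each starting from 0. */
    for (i = 0; i < which; i++) {
        if (arity[i] == 0)
            constants++;
        else
            blocks++;
    }
    if (arity[which] == 0) {
        r.number = constants;   /* unboxed integer */
    } else {
        r.is_block = 1;
        r.number = blocks;      /* block tag */
        r.size = arity[which];
    }
    return r;
}
```

For a hypothetical type `t = A | B of int | C | D of int * string`, described by the arities `{0, 1, 0, 2}`, this gives `A` and `C` the integers 0 and 1, and gives `B` and `D` blocks tagged 0 and 1 with sizes 1 and 2. For lists, described by `{0, 2}`, it gives `[]` the integer 0 and `::` a block of size 2 with tag 0, in agreement with the table above.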
Objects are represented as blocks with tag Object_tag. The first field of the block refers to the object class and associated method suite, in a format that cannot easily be exploited from C. The second field contains a unique object ID, used for comparisons. The remaining fields of the object contain the values of the instance variables of the object. It is unsafe to access instance variables directly, as the type system provides no guarantee about the instance variables contained by an object.
One may extract a public method from an object using the C function caml_get_public_method (declared in <caml/mlvalues.h>). Public method tags are hashed in the same way as variant tags, and methods are functions taking self as first argument. Therefore, to perform the method call foo#bar from the C side, you should call:
callback(caml_get_public_method(foo, hash_variant("bar")), foo);
Like constructed terms, values of polymorphic variant types are represented either as integers (for variants without arguments), or as blocks (for variants with an argument). Unlike constructed terms, variant constructors are not numbered starting from 0, but identified by a hash value (a Caml integer), as computed by the C function hash_variant (declared in <caml/mlvalues.h>): the hash value for a variant constructor named, say, VConstr is hash_variant("VConstr").
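To give an idea of what such a hash looks like, here is a sketch of a string hash. It rests on an assumption that this section does not state: that the hash multiplies an accumulator by 223 for each character and reduces the result to a signed 31-bit integer. The name `toy_hash_variant` is hypothetical, the function returns a plain C integer rather than a Caml value, and real code should always call hash_variant rather than rely on this.

```c
#include <stdint.h>

/* Sketch of a variant-name hash, ASSUMING a multiply-by-223 scheme
   reduced to 31 bits. Not a replacement for hash_variant. */
static int32_t toy_hash_variant(const char *name)
{
    uint32_t accu = 0;

    for (; *name != '\0'; name++)
        accu = 223u * accu + (unsigned char) *name;  /* wraps mod 2^32 */

    accu &= 0x7FFFFFFFu;  /* keep the low 31 bits */

    /* Treat bit 30 as a sign bit so that the result is the same on
       32-bit and 64-bit platforms. */
    if (accu > 0x3FFFFFFFu)
        return (int32_t) ((int64_t) accu - ((int64_t) 1 << 31));
    return (int32_t) accu;
}
```

Under this assumption, a one-character name hashes to its character code.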
The variant value `VConstr is represented by hash_variant("VConstr"). The variant value `VConstr(v) is represented by a block of size 2 and tag 0, with field number 0 containing hash_variant("VConstr") and field number 1 containing v.
Unlike constructed values, variant values taking several arguments are not flattened. That is, `VConstr(v, v') is represented by a block of size 2, whose field number 1 contains the representation of the pair (v, v'), rather than a block of size 3 containing v and v' in fields 1 and 2.
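The difference between the nested and the flattened layout can be made concrete with a toy model. Nothing below uses the runtime headers: `toy_value`, `toy_block`, `toy_variant` and `toy_pair` are hypothetical names, the model marks integers and pointers with an explicit flag, and the hash is passed in as a parameter instead of being computed.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct toy_block toy_block;

/* Toy value: either a small integer or a pointer to a toy block. */
typedef struct {
    int is_block;
    intptr_t num;          /* meaningful when is_block == 0 */
    const toy_block *ptr;  /* meaningful when is_block == 1 */
} toy_value;

struct toy_block {
    unsigned tag;
    size_t size;
    toy_value fields[2];
};

static toy_value toy_int(intptr_t n)
{
    toy_value v = { 0, n, NULL };
    return v;
}

static toy_value toy_ptr(const toy_block *b)
{
    toy_value v = { 1, 0, b };
    return v;
}

/* A variant with one argument: size 2, tag 0,
   field 0 = hash of the constructor name, field 1 = the argument. */
static toy_block toy_variant(intptr_t hash, toy_value arg)
{
    toy_block b;
    b.tag = 0;
    b.size = 2;
    b.fields[0] = toy_int(hash);
    b.fields[1] = arg;
    return b;
}

/* The pair (v, v'): an ordinary tuple block of size 2, tag 0. */
static toy_block toy_pair(toy_value a, toy_value b)
{
    toy_block p;
    p.tag = 0;
    p.size = 2;
    p.fields[0] = a;
    p.fields[1] = b;
    return p;
}
```

In this model, a variant applied to a pair is built as `toy_variant(h, toy_ptr(&pair))`, where `pair` comes from `toy_pair`: the variant block keeps size 2, and the two components are reached through field 1.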