Writing Ruby Extensions

Last Revision: 12/10/2008 by Giancarlo Bellido

Hello World

This is a simple example of how to extend Ruby with C code.

First we need to create the extconf.rb which will generate the Makefile.

		# file: extconf.rb
		require 'mkmf'
		create_makefile('test')
	

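Now we create the source file of the extension. We will call it test.c. mkmf compiles the C source files it finds in the directory, and the name of the Init_test function below must match the name passed to create_makefile.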

		/* File: test.c */
		/* Include the main Ruby header. <ruby.h> is the portable spelling; on Ruby 1.9 and later it includes <ruby/ruby.h>. */
		#include <ruby.h>

		/* Init_test will be called by Ruby when the extension is loaded. */
		void Init_test(void)
		{
			/* Define a new module called ExtensionTest. (The name has to start with a capital letter.) */
			VALUE module = rb_define_module("ExtensionTest");

			/* Define a new constant Value. (Remember constants in Ruby start with a capital letter.) */
			rb_define_const(module, "Value", rb_str_new2("Hello World"));
		}
	

To generate the Makefile we run ruby extconf.rb on the command line. The output should be creating Makefile.

Then we run make and wait. If everything is OK the compiler will produce a test.so file (test.bundle on macOS). We can test it like this: ruby -I. -e "require 'test'; puts ExtensionTest::Value". The -I. option adds the current directory to the load path, which Ruby 1.9.2 and later no longer search by default. The output should be Hello World.

Modules

Creating a module:

VALUE module = rb_define_module("name");

Adding methods/functions to a module:

rb_define_module_function(module, "name", c_function, argc);

Now the tricky part is that Ruby passes the receiver (here, the module) as the first argument when calling the function. c_function would look like this if it were to accept 1 parameter:

VALUE c_function(VALUE module, VALUE arg);

In this case the argc argument passed when defining the function should be 1, not 2 as you might expect, because the receiver is not counted.

Define a constant:

rb_define_const(module_or_class, "NAME", value);

value is an object of type VALUE.

Data Conversion

Ruby -> C
	char* RSTRING_PTR(VALUE); /* Pointer to the bytes of a Ruby String */
	long RSTRING_LEN(VALUE); /* String length in bytes */
	char* StringValuePtr(VALUE); /* char* of a String; other objects are converted with to_str, or TypeError is raised */
	VALUE StringValue(VALUE); /* Ruby String; other objects are converted with to_str, or TypeError is raised */
	int NUM2INT(VALUE); /* Number to int */
	double NUM2DBL(VALUE); /* Number to double */
	VALUE* RARRAY_PTR(VALUE); /* Returns the pointer to the values of a Ruby Array */
	long RARRAY_LEN(VALUE); /* Returns the length of the Array */
	
C -> Ruby
	/* Integer to Ruby Number */
	INT2NUM(number);
	UINT2NUM(number);
	DBL2NUM(double); /* Or rb_float_new(double) */
	VALUE rb_str_new(const char*, long len); /* Returns a Ruby String from char* and length */
	VALUE rb_str_new2(const char*); /* Returns a Ruby String from a null-terminated char* */
	
Generating Arrays
	VALUE array = rb_ary_new(); /* Create the array */
	rb_ary_push(array, VALUE); /* Push data into the array */
	
Generating a Hash
	VALUE hash = rb_hash_new();
	rb_hash_aset(hash, key, value); /* key and value are both of type VALUE */
	
Get the type of a Ruby VALUE object
rb_type(VALUE);
Possible Types:
	#define T_NONE   RUBY_T_NONE
	#define T_NIL    RUBY_T_NIL
	#define T_OBJECT RUBY_T_OBJECT
	#define T_CLASS  RUBY_T_CLASS
	#define T_ICLASS RUBY_T_ICLASS
	#define T_MODULE RUBY_T_MODULE
	#define T_FLOAT  RUBY_T_FLOAT
	#define T_STRING RUBY_T_STRING
	#define T_REGEXP RUBY_T_REGEXP
	#define T_ARRAY  RUBY_T_ARRAY
	#define T_HASH   RUBY_T_HASH
	#define T_STRUCT RUBY_T_STRUCT
	#define T_BIGNUM RUBY_T_BIGNUM
	#define T_FILE   RUBY_T_FILE
	#define T_FIXNUM RUBY_T_FIXNUM
	#define T_TRUE   RUBY_T_TRUE
	#define T_FALSE  RUBY_T_FALSE
	#define T_DATA   RUBY_T_DATA
	#define T_MATCH  RUBY_T_MATCH
	#define T_SYMBOL RUBY_T_SYMBOL
	#define T_RATIONAL RUBY_T_RATIONAL
	#define T_COMPLEX RUBY_T_COMPLEX
	#define T_VALUES RUBY_T_VALUES
	#define T_UNDEF  RUBY_T_UNDEF
	#define T_NODE   RUBY_T_NODE
	#define T_MASK   RUBY_T_MASK
	
Iterate through a Ruby Hash
We iterate with the function
rb_hash_foreach(hash, iterator_c_function, extra_value);
iterator_c_function is of the form:
int iterator(VALUE key, VALUE value, VALUE extra);
Remember that VALUE is the same size as a void*, so extra_value can carry a pointer cast to VALUE. To tell rb_hash_foreach() to continue, the iterator function returns ST_CONTINUE. It can also return ST_STOP, ST_DELETE, or ST_CHECK.

Internal Structures

To access the internal structure of any Ruby object, 'ruby.h' defines the following macros:
	#define RBASIC(obj)  (R_CAST(RBasic)(obj))
	#define ROBJECT(obj) (R_CAST(RObject)(obj))
	#define RCLASS(obj)  (R_CAST(RClass)(obj))
	#define RMODULE(obj) RCLASS(obj)
	#define RFLOAT(obj)  (R_CAST(RFloat)(obj))
	#define RSTRING(obj) (R_CAST(RString)(obj))
	#define RREGEXP(obj) (R_CAST(RRegexp)(obj))
	#define RARRAY(obj)  (R_CAST(RArray)(obj))
	#define RHASH(obj)   (R_CAST(RHash)(obj))
	#define RDATA(obj)   (R_CAST(RData)(obj))
	#define RSTRUCT(obj) (R_CAST(RStruct)(obj))
	#define RBIGNUM(obj) (R_CAST(RBignum)(obj))
	#define RFILE(obj)   (R_CAST(RFile)(obj))
	#define RRATIONAL(obj) (R_CAST(RRational)(obj))
	#define RCOMPLEX(obj) (R_CAST(RComplex)(obj))
The most important are:
	struct RFloat {
		struct RBasic basic;
		double float_value;
	};

	struct RBignum {
		struct RBasic basic;
		union {
			struct {
				long len;
				BDIGIT *digits;
			} heap;
			BDIGIT ary[RBIGNUM_EMBED_LEN_MAX];
		} as;
	};

	struct RString {
		struct RBasic basic;
		union {
			struct {
				long len;
				char *ptr;
				union {
					long capa;
					VALUE shared;
				} aux;
			} heap;
			char ary[RSTRING_EMBED_LEN_MAX];
		} as;
	};
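To access the value of a float number you would use:
RFLOAT(ruby_value)->float_value;
or, more portably, the RFLOAT_VALUE(ruby_value) macro.
Fixnums (small Integers) are not stored in a structure. The number is encoded in the VALUE itself, shifted left by one bit with the lowest bit set:
	10 as a Ruby VALUE  => (VALUE >> 1) in C
	10 in C             => (10 << 1 | 1) as a Ruby VALUE (0x15)
	-10 as a Ruby VALUE => (VALUE >> 1) in C, using an arithmetic (sign-preserving) shift
	-10 in C            => (-10 << 1 | 1) as a Ruby VALUE (0xffffffed on a 32-bit system)
The INT2FIX() and FIX2LONG() macros perform these conversions.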
	
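Integers that do not fit in a Fixnum are stored as Bignums, which hold as many BDIGIT digits as the number needs (see struct RBignum above). NUM2LL() converts one to a C long long when the value fits.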
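The Fixnum tagging scheme can be checked without Ruby at all. The following is a minimal sketch in plain C. demo_value, demo_int2fix, demo_fix2long and demo_is_fixnum are hypothetical names standing in for VALUE, INT2FIX, FIX2LONG and FIXNUM_P, and the sketch assumes the compiler shifts negative numbers arithmetically, as common compilers do.

```c
#include <stdint.h>

/* Stand-in for Ruby's VALUE so the sketch compiles without ruby.h. */
typedef uintptr_t demo_value;

/* Same idea as INT2FIX: shift left by one and set the low tag bit. */
static demo_value demo_int2fix(long n)
{
    return ((demo_value)n << 1) | 1;
}

/* Same idea as FIX2LONG: an arithmetic shift right drops the tag bit. */
static long demo_fix2long(demo_value v)
{
    return (long)((intptr_t)v >> 1);
}

/* Same idea as FIXNUM_P: a Fixnum always has its lowest bit set. */
static int demo_is_fixnum(demo_value v)
{
    return (int)(v & 1);
}
```

With these helpers, demo_int2fix(10) is 0x15, and the low 32 bits of demo_int2fix(-10) are 0xffffffed.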

Blocks

Executing the Block
rb_yield(Qnil);
Replace Qnil with the value you wish to pass to the block; to pass several values use rb_yield_values(n, value1, ..., valuen). The result of the block is returned.
rb_block_given_p();
This function tells you whether a block was passed. Blocks are not counted as parameters, so if your function only accepts a block you need to specify 0 as the number of parameters.

Classes

To create a class:
VALUE klass = rb_define_class("Name", parent_class);
VALUE klass = rb_define_class_under(module_or_class, "Name", parent_class);
Use rb_cObject as parent_class when the class has no other superclass.
To add/define/overwrite a method for a class:
rb_define_method(klass, "name", c_function, argc);
argc is the number of arguments of the method, not including the first parameter of c_function, which is the instance (self). A negative argc selects a different calling convention, which is how optional arguments are implemented: -1 passes the arguments as a C array, and -2 passes them as a single Ruby Array.
"name" is the name of the method in Ruby. For setter methods, add the = to the name just like in Ruby (e.g. "name="). Operators work the same way: for the [] operator use the string "[]" or "[]=".
The format of a C function accepting a variable number of arguments (argc = -1) is:
VALUE function(int argc, VALUE* args, VALUE instance);
argc is the number of arguments passed, args is a C array containing the arguments, and instance is the receiver.
The format of a C function accepting a Ruby Array of arguments (argc = -2) is:
VALUE function(VALUE instance, VALUE array_of_arguments);
To access instance variables, use these two functions:
rb_iv_get(instance, "@variable");
rb_iv_set(instance, "@variable", new_value);
The name must include the leading @ for the variable to be visible from Ruby code; a name without it is only reachable from C.

Errors and Exceptions

Built-In Exceptions

rb_raise creates an instance of exception_class, using a printf-style format string (plus any arguments) as the error message, and raises it. It does not return to the caller.

rb_raise(exception_class, "format string", ...);

Some of the built-in exceptions that can be raised are:

rb_eArgError
rb_eEOFError
rb_eException
rb_eFatal
rb_eFloatDomainError
rb_eIndexError
rb_eInterrupt
rb_eIOError
rb_eKeyError
rb_eLoadError
rb_eLocalJumpError
rb_eNameError
rb_eNoMemError
rb_eNoMethodError
rb_eNotImpError
rb_eRangeError
rb_eRegexpError
rb_eRuntimeError
rb_eScriptError
rb_eSecurityError
rb_eSignal
rb_eStandardError
rb_eStopIteration
rb_eSyntaxError
rb_eSysStackError
rb_eSystemCallError
rb_eSystemExit
rb_eThreadError
rb_eTypeError
rb_eZeroDivError
	

Custom Exceptions

You can create your own exceptions by extending the Exception class or, more commonly, StandardError (rb_eStandardError), so that a bare rescue catches them.

rb_define_class("CustomErrorClass", rb_eException);

Instances

To create a new instance of the class Class we do this:
	 VALUE obj;

	 obj = rb_obj_alloc(Class); // Allocate space and create new instance
	 rb_obj_call_init(obj, 0, 0); // We call the initialize method for the object  
	

Remember that rb_obj_alloc() only allocates the object; initialize has not run yet. If the class wraps C data that initialize sets up, using the object before calling rb_obj_call_init() can cause a segmentation fault.

The 2nd and 3rd parameters of rb_obj_call_init() are the argument count (argc) and the constructor parameters array (VALUE*). rb_class_new_instance(argc, argv, Class) performs both steps in one call.

Wrapping C Structures

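To keep a C structure inside a Ruby object we need to wrap it in a DATA (T_DATA) object. To do this we use the following macros:

	VALUE Data_Wrap_Struct(VALUE class, void (*mark)(void*), void (*free)(void*), void *ptr);
	VALUE Data_Make_Struct(VALUE class, type, void (*mark)(void*), void (*free)(void*), type *ptr);
	Data_Get_Struct(VALUE obj, type, type *ptr);

Data_Wrap_Struct wraps an existing pointer. Data_Make_Struct allocates a zeroed structure of the given type, stores its address in ptr, and wraps it. Data_Get_Struct stores the wrapped pointer in ptr. mark is called during garbage collection so that the structure can mark any Ruby objects it references, and free is called when the object is collected; either may be 0.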
	

To get the pointer of the data stored in the object instance we use:

	DATA_PTR(object);
	
Suggestions and Corrections to giancarlo.bellido @@ gmail.