
A Technique to Call Icon from C under Icon Version 9
	Carl Sturtivant, 2008/2/20 Confidential Draft #1
	

1. Summary.

A new Icon function written in C with a special interface may be 
dynamically loaded from a shared object using the built-in function 
loadfunc [GT95]. We show how such a function may in turn call an Icon 
procedure using the technique described below, provided that the 
procedure call itself does not suspend, but only returns or fails. Note 
that this does not impose constraints of any kind upon other procedures 
executed as a consequence of calling the original procedure. In 
particular, the Icon procedure called from C may in turn lead to a call 
of another Icon function written in C calling Icon recursively. The 
technique described has been implemented and briefly tested with Icon 
9.51(?).


2. Overview.

If the body of an Icon function written in C is to call an Icon 
procedure that does not suspend and retrieve its return value, all 
without modifying iconx, then there are a number of hurdles to jump. 
The procedure descriptor, and those of its arguments must be pushed 
onto the Icon stack, and the interpreter induced to believe it needs to 
execute an  icode instruction to invoke it, one that is not present in 
the icode it loaded. Once the procedure returns (or fails) the 
interpreter must be induced to return control to C just after the point 
where the attempt to call it occurred, rather than simply to go on to 
the next icode instruction. Then the result of the call needs to be 
popped off the Icon stack so that it is in the same state as before the 
call, since C does not normally modify the Icon stack. (Other details 
of the state of the interpreter will be restored by the mechanism 
whereby a procedure is called in Icon.) In all other respects, the main 
interpreter loop must continue to behave as before.

These hurdles are insurmountable, so long as the code of the main 
interpreter loop is inviolate. The code of that loop as it is 
incorporated into iconx is inviolate, since a design goal is that the 
technique should work with the existing implementation. Therefore, we 
take a copy of that loop, and modify it to the ends above, and execute 
it only in order to call Icon. (The original interpreter continues to 
be used for all other purposes.) Dynamic linking allows the new 
interpreter loop to refer to all C globals and functions in iconx, and 
so nothing else need be copied, these things are merely referred to. In 
fact it takes very little modification of the copy to achieve these 
goals, and the result is a C function called icall to which the 
procedure and its arguments are passed to effect the call to Icoan. To 
simplify this interface, the arguments are passed as a single Icon 
list. The resulting function then has similar semantics as the binary 
"!" operator in Icon, (which we henceforth call 'apply' as it applies a 
procedure to its argument list) except that it may be called from C.


3. Implementation.

The main interpreter loop written in RTL resides in the file 
src/runtime/interp.r in the Icon distribution. This was translated into 
the corresponding C file xinterp.c by the RTL translator rtt with the 
command 'rtt -x interp.r'. Now this C file is edited into a file 
compiled into a single C function called icall, taking two descriptors 
(a procedure and a list of arguments) and returning an integer code. 
The effect of calling icall is to apply the procedure to its arguments, 
and restore the state of the interpreter, leaving the result of the 
call just beyond the stack pointer for retrieval.

The contents of xinterp.c consist of some global variables and a 
function interp containing the interpreter loop. The global variable 
declarations are all modified by prefixing them with 'extern', so that 
they now simply refer to those used by the interpreter loop inside 
iconx. The function interp that returns an integer signal and has two 
parameters: an integer fsig used when the interpreter is called 
recursively to simulate suspension, and cargp, a pointer into the Icon 
stack. The function interp is renamed icall. 

Examination of src/runtime/init.r indicates that the signal 0 is passed 
to interp when it is initially called non-recursively to start up 
iconx. So fsig is removed from the parameter list and made a local 
variable initialized to 0. Similarly cargp is made a local variable, 
and icall is given two parameters theProc and arglist used to pass that 
necessary for the call to Icon. Immediately after the initial 
declarations inside icall, the Icon stack pointer sp is used to 
initialize cargp to refer to the first descriptor on the stack beyond 
sp, which is assigned the procedure desciptor parameter of icall. The 
desciptor beyond that is assigned the argument list descriptor 
parameter, and the stack pointer augmented to refer to its last word. A 
new local variable result_sp is initialized to location on the stack of 
the last word of the procedure descriptor. This is used by the 
mechanism to return to C described below. Now the details of pushing 
the procedure descriptor and the argument list descriptor onto the 
stack are complete.

The body of interp consists of some straight-line code followed by the 
interpreter loop, which contains some code to get the next icode 
instruction followed by a switch to jump to the correct code to execute 
it, all inside and endless loop. Just before the loop starts, an 
unconditional goto is inserted, jumping to a newly inserted label 
called aptly 'apply' which is placed just after the switch label 
(called Op_Apply in interp.r) which precedes the code to implement the 
icode 'apply' instruction, that implements the apply operator (binary 
"!") in Icon. This instruction expects to find a procedure descriptor 
and a list descriptor on the stack, and then causes the icode 
instructions of the procedure to be accordingly invoked. Now the 
details of calling the procedure are complete. What is left to insert 
is the mechanism to return to C.

When the procedure that we called returns or fails, it will execute a 
'pret' instruction or a 'pfail' instruction. However, these 
instructions may also be executed by Icon procedures called from the 
one we called. At the end of the code for 'pret' inside the switch in 
the interpreter is a 'break' to leave the switch and go round to get 
the next icode instruction. Just before that 'break' we can tell if our 
procedure call is the one returning by comparing the Icon stack pointer 
sp to the one we saved, result_sp, which our procedure call will have 
restored sp to when it overwrote the procedure descriptor with the 
result of the call. So if they are equal, we can clean up (decrement 
ilevel, move sp just before the former procedure descriptor) and 
return, finishing the call to icall. Now C can retrieve the result of 
the call just beyond the stack pointer. The 'pfail' code is similar, 
just before a jump to efail, which we do not execute since the context 
of our call is not an Icon expression. C can determine success or 
failure from the integer code returned. This completes the mechanism to 
return to C.


4. Conclusions

Overall this mechanism depends upon few things, mainly upon the fact 
that when a procedure is called, the Icon stack below the part used for 
the call is not modified during the call. Our copy of the interpreter 
loop is identical to the original with the exception of the code added 
for the C return mechanism, which is only exceptionally executed. And 
the Icon procedure call mechanism itself will save and restore the 
interpreter state apart from the stack pointer which we abuse at the 
start and restore at the end. The compiled result with gcc was about 10 
Kbyte. A simple test confirmed that call and return occur in the 
correct order, from Icon to C to Icon returning to C returning to Icon.

