This article is out-of-date! Please look at this article, or at articles written after June 2016.

Hey guys,

In my last post about inlining, I described how to maintain the SSA property of the intermediate representation while inlining, by adding a phi node at each return point. I also showed, through some inlining examples, how to maintain and simplify the control flow graph.

In this post I will focus on non local returns. Then I will discuss how they relate to exceptions and how they differ from them.

Let’s do it example-driven.

Regular message send:

callerOfRegularMessageSend
    'before callerOfRegularMessageSend' logCr.
    self regularMessageSend.
    'after callerOfRegularMessageSend' logCr.

regularMessageSend
    'before regularMessageSend' logCr.
    self messageSend.
    'after regularMessageSend' logCr.

messageSend
    ^ 42

Here is a typical message send example. I want you to notice an important point: in the method named #regularMessageSend, after executing the message #messageSend, the flow of execution goes back to 'after regularMessageSend' logCr in #regularMessageSend.

The output of MyClass new callerOfRegularMessageSend is:

'before callerOfRegularMessageSend'
'before regularMessageSend'
'after regularMessageSend'
'after callerOfRegularMessageSend'

That’s obvious, isn’t it?

Let’s look at a more complex example.

Complex message send:

callerOfComplexMessageSend
    'before callerOfComplexMessageSend' logCr.
    self complexMessageSend.
    'after callerOfComplexMessageSend' logCr.

complexMessageSend
    'before complexMessageSend' logCr.
    self complexMessageSend: [ :bool | bool ifTrue: [ ^ 1 ] ifFalse: [ 2 ] ].
    'after complexMessageSend' logCr.

complexMessageSend: aBlock
    'before complexMessageSend:' logCr.
    aBlock value: #(true false) atRandom.
    'after complexMessageSend:' logCr.

The output of MyClass new callerOfComplexMessageSend is either:

'before callerOfComplexMessageSend'
'before complexMessageSend'
'before complexMessageSend:'
'after complexMessageSend:'
'after complexMessageSend'
'after callerOfComplexMessageSend'

or:

'before callerOfComplexMessageSend'
'before complexMessageSend'
'before complexMessageSend:'
'after callerOfComplexMessageSend'

depending on whether the non local return is taken. If you don’t get it, you should probably read the chapter of Deep into Pharo about BlockClosures (IV Languages 14: Blocks: a detailed analysis).

If we look carefully at how the block passed to #complexMessageSend: in #complexMessageSend is evaluated, we notice that, depending on whether the non local return is taken, the execution flow continues either at 'after complexMessageSend:' logCr or at 'after callerOfComplexMessageSend' logCr.

OK, so here is the problem: how do we inline a message that can return to different places?

In addition, how is that related to exceptions?

Continuation-based inliner

The problem is solved by supporting continuations. When inlining a message send, the inliner detects where the inlined method may return. Right now it knows that a message send can return just after the send and that a BlockClosure may also return to the sender of its homeContext, but it could learn more later (such as how to optimize exception handling). These return points are called continuation points in the inliner. They are then used to patch the control flow correctly and to propagate the method’s return values to the right place.
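To make the two kinds of continuation points concrete, here is the complex example again with comments marking where the block can hand control back. This is only my reading of the description above, written as source-level annotations; the optimizer itself works on the intermediate representation, not on comments like these.

```smalltalk
callerOfComplexMessageSend
    'before callerOfComplexMessageSend' logCr.
    self complexMessageSend.
    "Non local continuation point: [ ^ 1 ] returns from its home context
     (complexMessageSend), so execution resumes here, in the sender of the
     home context."
    'after callerOfComplexMessageSend' logCr.

complexMessageSend
    'before complexMessageSend' logCr.
    self complexMessageSend: [ :bool | bool ifTrue: [ ^ 1 ] ifFalse: [ 2 ] ].
    'after complexMessageSend' logCr.

complexMessageSend: aBlock
    'before complexMessageSend:' logCr.
    aBlock value: #(true false) atRandom.
    "Message send continuation point: a local return of the block (the
     value 2) resumes here, just after the send of value:."
    'after complexMessageSend:' logCr.
```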

I tried to optimize #callerOfComplexMessageSend and here is the output of the optimizer:

[Screenshot: optimizer output for #callerOfComplexMessageSend]

When the block was inlined, two phi nodes were created to receive the results of the non local return (1) and of the local return (self): one at the messageSend continuation point, and the other at the non local continuation point. The non local return had its return value and control flow pointing toward the non local continuation phi node, and the same applied to the local return and the messageSend continuation phi node. The results of the phi nodes were never used, so the phi nodes were removed. However, the local and non local returns still change the control flow, because the execution flow does not go to the same continuation point. That’s why the flow is split in two: one path writes two additional lines on the Transcript.
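For readers who prefer source code to a control flow graph, here is a hand-written sketch of what the fully inlined method is equivalent to. It is not the optimizer's actual output (which is bytecode), and the selector callerOfComplexMessageSendInlined is a name I made up for the illustration.

```smalltalk
callerOfComplexMessageSendInlined
    'before callerOfComplexMessageSend' logCr.
    'before complexMessageSend' logCr.
    'before complexMessageSend:' logCr.
    #(true false) atRandom
        ifFalse: [
            "Local return path: the block answered 2, so the rest of
             complexMessageSend: and complexMessageSend still runs."
            'after complexMessageSend:' logCr.
            'after complexMessageSend' logCr ].
    "Both paths join here. On the true path the non local return skipped
     the two lines above and came straight to this point."
    'after callerOfComplexMessageSend' logCr.
```

The values 1, 2 and self never appear because nothing uses them, which is the source-level counterpart of the unused phi nodes being removed.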

Note: in some cases, even if the inliner works fine, it is difficult to represent the new execution flow in the bytecode (if some blocks are inlined but others are not, you may want the execution flow to return to a particular context and then jump to a specific point). Some optimizations are therefore cancelled due to an “unpatchable non local return”. Those are very rare: in my experiments it happened in 17 out of 5173 cases.

How is that related to exceptions?

Right now I don’t optimize exceptions because it is not worth it (the overhead due to exception handling is currently negligible compared to other performance issues). However, they would be optimized the same way. Basically, the inliner would notice additional return points for the optimized methods: the places where the exception is caught, and the unwind blocks. Now the main difference between exceptions and non local returns is the following:

  • a non local return has the execution flow going to the return point of the homeContext messageSend. Therefore, the pc of the homeContext sender already has the correct value. Basically, for a non local return, you need to find the correct stack frame to return to, but you can then just resume the execution: the instruction pointer of that stack frame is already at the correct place.
  • an exception has the control flow going to the exception handling code in the context that caught it. Therefore, after returning to the correct stack frame, the execution of the code cannot just resume: the pc first needs to be set to the first pc of the exception handling block. This means, for example, that optimizing exceptions would require the native code of the method to support additional entry points for exception handling blocks and unwinding blocks, as in the HipHop VM, which is not needed by non local returns.
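The two cases in the list above can be put side by side in plain Pharo code. This is a small sketch with selectors I invented for the illustration (they are not part of the optimizer), assuming both methods are compiled on MyClass; it only shows where control goes, not how the VM implements it.

```smalltalk
findFirstEvenWithNonLocalReturn: aCollection
    "Non local return: once the home context's frame is found, execution
     simply resumes in its sender, whose pc is already correct."
    aCollection do: [ :each | each even ifTrue: [ ^ each ] ].
    ^ nil

findFirstEvenWithException: aCollection
    "Exception: control first has to enter the handler block of the
     context that caught the error; only then can on:do: answer a value."
    | found |
    ^ [ aCollection do: [ :each |
            each even ifTrue: [
                found := each.
                Error new signal ] ].
        nil ]
        on: Error
        do: [ :ex | found ]
```

Both methods answer the same thing, for example 4 for #(1 3 4 5) and nil for #(1 3 5), but the second one needs an extra entry point (the handler block) in the catching method.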

As we see, a non local return is in fact a simplified exception. A standard exception would need further VM support to work with maximum performance. Relatedly, the rare cases where I cannot optimize non local returns (mentioned at the end of the Continuation-based inliner section) would be solved if the virtual machine provided support for exceptions.

But the optimizer already has the infrastructure to support optimizing exception handling…

I’m sorry, these aspects are really difficult for me to explain in English; I know I am not always easy to understand…

Hope you enjoyed the post :-).
