A previous SO question raised the issue about which idiom is better in time of execution efficency terms:
[ (var := exp) > 0 ] whileTrue: [ ... ]
versus
[ var := exp.
var > 0 ] whileTrue: [ ... ]
Intuitively it seems the first form could be more efficient during execution, because it saves fetching one additional statement (second form). Is this true in most Smalltalks?
Trying with two stupid benchmarks:
| var acc |
var := 10000.
[ [ (var := var / 2) < 0 ] whileTrue: [ acc := acc + 1 ] ] bench.
| var acc |
var := 10000.
[ [ var := var / 2. var < 0 ] whileTrue: [ acc := acc + 1 ] ] bench
Reveals no major differences between both versions.
Any other opinions?
So the question is: What should I use to achieve a better execution time?
temp := <expression>.
temp > 0
or
(temp := <expression>) > 0
In cases like this one, the best way to arrive at a conclusion is to go down one step in the level of abstraction. In other words, we need a better understanding of what's happening behind the scenes.
The executable part of a CompiledMethod
is represented by its bytecodes. When we save a method, what we are doing is compiling it into a series of low level instructions for the VM to be able to execute the method every time it is invoked. So, let's take a look at the bytecodes of each one of the cases above.
Since <expression>
is the same in the same in both cases, let's reduce it drastically to eliminate noise. Also, let's put our code in a method so to have a CompiledMethod
to play with
Object >> m
| temp |
temp := 1.
temp > 0
Now, let's look CompiledMethod
and its superclasses for some message that would show us the bytecodes of Object >> #m
. The selector should contain the subword bytecodes, right?
...
Here it is #symbolicBytecodes
! Now let's evaluate (Object >> #m) symbolicBytecodes
to get:
pushConstant: 1
popIntoTemp: 0
pushTemp: 0
pushConstant: 0
send: >
pop
returnSelf
Note by the way how our temp
variable has been renamed to Temp: 0
in the bytecodes language.
Now repeat with the other and get:
pushConstant: 1
storeIntoTemp: 0
pushConstant: 0
send: >
pop
returnSelf
The difference is
popIntoTemp: 0
pushTemp: 0
versus
storeIntoTemp: 0
What this reveals is that in both cases temp
is read from the stack in different ways. In the first case, the result of our <expression>
is popped into temp
from the execution stack and then temp
is pushed again to restore the stack. A pop
followed by a push
of the same thing. In the second case, instead, no push
or pop
happens and temp
is simply read from the stack.
So the conclusion is that in the first case we will be generating two cancelling instructions pop
followed by push
.
This also explains why the difference is so hard to measure: push
and pop
instructions have direct translations into machine code and the CPU will execute them really fast.
Note however, that nothing prevents the compiler to automatically optimize the code and realize that in fact pop + push
is equivalent to storeInto
. With such an optimization both Smalltalk snippets would result in exactly the same machine code.
Now, you should be able to decide which form do you prefer. I my opinion such a decision should only take into account the programming style that you like better. Taking into consideration the execution time is irrelevant because the difference is minimal, and could be easily reduced to zero by implementing the optimization we just discussed. By the way, that would be an excellent exercise for those willing to understand the low level realms of the unparalleled Smalltalk language.