Sitemap

Explaining invokedynamic. Number multiplication (almost) complete example. Part IV

alex_ber
6 min readSep 7, 2020

This is forth part of mini-series of Explaining invokedynamic. This is the full list of all articles:

Introduction. Part I

Toy example. Part II

Bootstrap method . Part III

Number multiplication (almost) complete example. Part IV

Dynamical hashCode implementation. Part V

Java 9 String concatenation. Part VI

Lambda. Part VII

Records. Part VII

Final notes. Part IX

On the previous part I’ve started to explain what bootstrap method is. In this part I will continue to do this, but I’m going to do this together with example.

Note: All code here is tested on Java 8, but the code should be able to be run in classpath on later version also (with some minor tweaks, that I’m describing).

Number multiplication (almost) complete example

Suppose that we want to write a method to multiply two long numbers and return BigInteger. Nothing difficult, it's a one-liner:

Next we benchmark it and see that it works on some hardware, say, 40 ns. Now we notice that in many cases it has unnecessary overhead. In practice, very often a result of two longs multiplication fits the long type as well. In such cases it’s unnecessary to construct two original BigInteger objects. Instead we could just multiply two longs and wrap the result like this:

This method works roughly twice as faster, about 20 ns. But who cares how fast is it if it does not work? If multiplication actually overflows, we will have an incorrect result. Can we detect overflow and only in that case switch to the slower version? Looks like we can: there’s Math.multiplyExact method, which multiplies two longs throwing an exception on overflow. Its Java implementation is non-trivial, but you should not look there. Actually this method is a JVM-intrinsic: JIT-compiler can convert its call to the sequence of assembly instructions. On x86 platform it’s imul (multiply) и jo (jump if overflow), and then it depends on how we handle an exception. The great thing is that if we have no overflow, it costs almost nothing. Let's rewrite our method this way:

Now if we call it with small numbers, it actually works 20 ns. However if overflow occurs, the running time is bigger than 20 and 40, it’s thousands of nanoseconds. Just because exceptions are quite expensive.

Well, ok, let’s start with faster version, but if overflow occurs at least once, then we will permanently switch to the slow version:

Fine, our benchmark shows that this takes 20 ns for small numbers and 40 ns for big. However synthetic benchmark is not the real application. Normally you would multiply in many different places in the code (emphasizing is mine). Usually at most of them the result always fits the long type and overflow occurs in some specific places only. For example, consider this code:

Due to program logic, in the first multiplication we have sometimes huge numbers, while in the second they are always small. Unfortunately our multiplier will switch to the slow implementation in both places which is disappointing.

Let’s wrap our boolean flag with object:

Now we can use different DynamicMultiplier in different places, and it will check whether overflow occurs in this particular place:

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

This DYNAMIC1and DYNAMIC2 represents places in the code where compiler would normally emit invokevirtual. But, this is not the end.

We’re already close. This implementation is not very convenient. We have to declare separate static field per every multiplication in our code. Moreover, we don’t want having dangling DynamicMultiplier objects in memory until we actually call a method doing multiplication (maybe we will never call it). So we need these static fields to be initialized lazily (and better to do it in thread-safe manner). Roughly this is what invokedynamic can do for us. It binds a hidden static field with every invokedynamic occurrence in the source code and performs lazy thread-safe initialization. This hidden static field type is (CallSite). In fact it's just a link to the executable code accessible via MethodHandle. (emphasis is mine).

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

The nice thing is that the call site could be mutable and target MethodHandle can be replaced when necessary. Mutable call site could be created either with MutableCallSite or with VolatileCallSite, if you need strict guarantees about visibility of your changes in other threads.

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

If you don’t want an ability to change target MethodHandlein the future, you can use ConstantCallSite. This type of CallSite would never change after setup. Known features implemented in Java Platform are implemented using ConstantCallSite (we will see some of them below). MutableCallSite and VolatileCallSite were originally designed to support dynamic languages on top of JVM.

Note: With dynamic languages, you can use the invokedynamic to link CallSite to MethodHandlewith specific JVM-method. Than your execution go through the exact same point in the program but with different argument types with invokedynamic that consults CallSite. Now, you should invoke another implementation, because JVM method signature has changed (and maybe even in non-compatible way from JVM point of view). Again, because method signature has changed we should relink our MethodHandle to another JVM-method. We should relink only if the variable type has changed. So, we can store last used method type and compare it to the current one. If it match, we can use fast path, we can use linked JVM-method, if not, we should relink. This very simmillar to the process described above (with notable exception, that we can relink multiple times, not only once).

Usually it’s convenient to extend one of these classes and define wanted behavior. So let’s make a call site to solve our problem. It’s somewhat verbose, but we will try:

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

Adding asType() conversions is necessary if actual arguments at the call site have different, yet compatible type (e.g. instead of long an int value is passed).

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

Note: Alternatively, you can use invoke method. More on this later.

Now we can actually use this call site without an indy, similarly to what we’ve done before:

Here dynamicInvoker is dynamic MethodHandle, which redirects to the current target of the call site. Despite the verbosity this works with the same speed as our previous example with DynamicMultiplier, because JIT-compiler knows very much about MethodHandle's and can inline them pretty well.

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

Note: if smallNum and smalNum are int if we’ll remove asType() calls in the commented lines, this code will fail (it will still work with invoke() call).

Note (again): Alternatively, you can use invoke method. More on this later.

I want to refactor code a bit. Let’s add the following method:

This is so-callled bootstrap method which will be by JVM called to construct a call site.

…The problem is that even in Java 9 one cannot write a Java-program which will generate a custom indy-instruction in bytecode. The indy is used in a few concrete places mentioned above: lambdas, method references, string concatenation [also, records from Java 14 is in preview mode-my addition]. We could use our MultiplyCallSite with invokedynamic only if we generate bytecode using some bytecode generation library like ASM. We cannot just write a Java program and compile it.

https://medium.com/@tagir_valeev/what-is-jep-303-or-reinventing-invokedynamic-4c43d84626eb

This is a missing part in this example. This bootsrap method should be somehow hooked into JVM. Than Java should be somehow told to use in appropriate places to create MULTIPLER on demand when needed. On those places compiler should emit invokedynamic bytecode. For more details, see final notes.

Note: In this case we have static Java code as implementation logic (fastand slowmethods that is bind to MethodHandle). In the real world example, this code is actually generated in runtime. For more details, see final notes.

You can add later any optimization you want. For example, you can check, if (int, int) were actually used at the call site, then you know that the result of multiplication will always fit the long type. So you can have a special ConstantCallSite for such case which will just multiply two ints without any overflow checks. You can add such optimization after your code is released and client code can take benefit from it without even recompilation. I will talk more about this later.

As of time of writing this article JEP 303 is still not implemented, it has Candidate status, so I’m skipping that part from quoting.

--

--

alex_ber
alex_ber

Written by alex_ber

Senior Software Engineer at Pursway

No responses yet