This is the first part of the mini-series Explaining invokedynamic. This is the full list of all articles:
Number multiplication (almost) complete example. Part IV
Dynamical hashCode implementation. Part V
Introduction
Java 7 added a new bytecode instruction, invokedynamic (or indy). It was introduced as part of JSR 292, around 2011. It was originally designed to support dynamically typed languages, as the JSR's name states, so it was ignored by the vast majority of Java developers.
Java 8 introduced lambdas. The lambda was itself a concept foreign to Java (more on this in a separate article later), so most of the attention went to what it is, why the existing nested and inner classes are not enough, and why SAM is a good way to go. When somebody did talk about invokedynamic, the talk was aimed at advanced users, the ones who write compilers and can read bytecode instructions almost as easily as Java code. For most developers, including myself back then, this information was inaccessible. Things became even worse because the MethodHandle API was also new to almost everybody.
In this article I will try to explain what invokedynamic (or indy) is and how it works, using a concrete example and without getting too far inside the JVM. Let’s start with a short informal explanation of the invokestatic and invokevirtual bytecodes.
Note: All code here was tested on Java 8, but it should also run from the classpath on later versions (with some minor tweaks that I describe).
invokestatic
How does Java call a static method? Well, this sounds pretty simple: the compiler converts your source code to bytecode, where it puts an instruction naming exactly which static method should be called. This is almost true, but recall that Java supports method overloading. Let’s consider the following concrete code example:
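A minimal sketch of such an example follows. The class name OverloadDemo and the String return values are assumptions made for illustration, and the comments mark the three call sites that the text refers to as lines 16, 17 and 18 (the sketch's own line numbers differ).

```java
// Illustrative sketch only: class name and return values are assumptions.
class OverloadDemo {
    // Overload chosen when the argument's static type is int.
    static String print(int value) {
        return "int: " + value;
    }

    // Overload chosen when the argument's static type is long.
    static String print(long value) {
        return "long: " + value;
    }

    public static void main(String[] args) {
        int a = 1;
        long b = 2L;
        long c = 3; // static type is long, even though the value fits in an int

        System.out.println(print(a)); // "line 16": links to print(int)
        System.out.println(print(b)); // "line 17": links to print(long)
        System.out.println(print(c)); // "line 18": links to print(long)
    }
}
```

Running javap -c on the compiled class should show three invokestatic instructions, each naming one specific overload by its descriptor.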
We have 2 overloaded versions of the print function here. How does the compiler know which variant to choose?
Note: I’m simplifying the whole process, focusing only on the relevant part.
Recall that Java is a statically typed language. This means that variable types are determined at compile time. So, for a, the compiler sees that its static type is int. When it generates bytecode for line 16, it sees 2 candidate functions: print that receives int and print that receives long. Because it has to pass variable a, which has (static) type int, the compiler chooses the more specific method, namely the one that receives int.
There are exact rules for how such ambiguity has to be resolved. The crucial point is that this is done at compile time. Once the compiler has chosen the method to link to, it emits bytecode that calls this method directly.
When the compiler gets to line 17, it sees the 2 candidate print functions again. Now it has to pass variable b, which has (static) type long, so the more specific print method is the one that receives long, and the compiler emits bytecode that calls this method directly.
When the compiler gets to line 18, it sees the 2 candidate print functions again. Now it has to pass variable c, which has (static) type long. It is indeed true that at run time it will hold a value that fits in an int, but there is no way for the compiler to know this in advance (there is a theoretical justification that I will skip here). It could be resolved only at run time, which is what all dynamically typed languages do. As I’ve said above, Java is a statically typed language, so the compiler sees the (static) type long (and in the general case it can’t “see” that at run time the variable will actually hold an int value). So the print method that receives long is chosen as the more specific one, and the compiler emits bytecode that calls this method directly.
To summarize: the compiler emits bytecode that links the call sites (lines 16, 17, 18 above) with the correct implementation of the print method. Choosing the correct method is hard-wired into the compiler.
invokevirtual
How does Java call an instance method? Well, the compiler again converts your source code to bytecode that names the method to call, but the implementation that actually runs is selected through the dispatch mechanism of the object the method is called on. Because only that one object takes part in the selection, this is called single dispatch.
Let’s look at a concrete example:
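A minimal sketch of such an example follows. The class VirtualDemo and the String return values are assumptions made for illustration; Parent, Child, show, time, obj1, obj2 and obj3 are the names used in the walkthrough, and the comments mark the call sites it refers to as lines 23 to 30.

```java
// Illustrative sketch only: VirtualDemo and the return values are assumptions.
class Parent {
    String show() {
        return "Parent's show()";
    }

    String time() {
        return "now";
    }
}

class Child extends Parent {
    @Override
    String show() {
        return "Child's show()";
    }
    // time() is not declared here; it is inherited from Parent.
}

class VirtualDemo {
    public static void main(String[] args) {
        Parent obj1 = new Parent();
        System.out.println(obj1.show()); // "line 23"
        System.out.println(obj1.time()); // "line 24"

        Child obj2 = new Child();
        System.out.println(obj2.show()); // "line 26"
        System.out.println(obj2.time()); // "line 27": found in the superclass

        Parent obj3 = new Child();       // static type Parent, actual class Child
        System.out.println(obj3.show()); // "line 29": runs Child's override
        System.out.println(obj3.time()); // "line 30"
    }
}
```

All six calls compile to invokevirtual. The static type of the variable decides which method the instruction names, and the actual class of the object decides which implementation runs.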
What does the compiler do on lines 23, 24, 26, 27, 29 and 30?
Note: I’m simplifying the whole process, focusing only on the relevant part.
- Line 23:
As far as the compiler is concerned, the type of obj1 is Parent, so first of all it checks that Parent.class has a method show. It does. Now, from the compiler's perspective, the “real” signature of the show method is:
public static void show(Parent this)
so the compiler emits bytecode that calls this method and passes obj1 to it (as the first implicit argument).
When the program runs, “Parent’s show()” is printed.
- Line 24:
As with line 23, as far as the compiler is concerned, the type of obj1 is Parent, so first of all it checks that Parent.class has a method time. It does. Now, from the compiler's perspective, the “real” signature of the time method is:
public static void time(Parent this)
so the compiler emits bytecode that calls this method and passes obj1 to it (as the first implicit argument).
When the program runs, “now” is printed.
- Line 26:
As far as the compiler is concerned, the type of obj2 is Child, so first of all it checks that Child.class has a method show. It does. Now, from the compiler's perspective, the “real” signature of the show method is:
public static void show(Child this)
so the compiler emits bytecode that calls this method and passes obj2 to it (as the first implicit argument).
When the program runs, “Child’s show()” is printed.
- Line 27:
It starts as on line 26: as far as the compiler is concerned, the type of obj2 is Child, so first of all it checks whether Child.class has a method time. This time, it doesn’t. So the compiler looks for this method in its superclass (a class that is not an interface can have at most one parent class). It looks at Parent.class. This time it finds the time method. Now, from the compiler's perspective, the “real” signature of the time method is:
public static void time(Parent this)
so the compiler emits bytecode that calls this method and passes obj2 to it (as the first implicit argument).
Note that obj2 is really a Child, but the time function above has access to it only as a Parent (for example, fields that are defined in Child are inaccessible to it).
When the program runs, “now” is printed.
- Line 29:
As far as the compiler is concerned, the type of obj3 is Parent, so first of all it checks that Parent.class has a method show. It does. Now, from the compiler's perspective, the “real” signature of the show method is:
public static void show(Parent this)
so the compiler emits bytecode that calls this method and passes obj3 to it (as the first implicit argument).
Note that obj3 is really a Child. The compiler can use it only as a Parent (for example, members that are declared only in Child are not accessible through obj3), but the call is virtual: at run time the JVM looks at the actual class of the object and, because Child overrides show, runs Child's version.
When the program runs, “Child’s show()” is printed.
- Line 30:
It starts as on line 29: as far as the compiler is concerned, the type of obj3 is Parent, so first of all it checks that Parent.class has a method time. It does. Now, from the compiler's perspective, the “real” signature of the time method is:
public static void time(Parent this)
so the compiler emits bytecode that calls this method and passes obj3 to it (as the first implicit argument).
Note that obj3 is really a Child, but the time function above has access to it only as a Parent (for example, fields that are defined in Child are inaccessible to it).
When the program runs, “now” is printed.
To summarize: the compiler emits bytecode that links the call sites (lines 23, 24, 26, 27, 29, 30) with the show/time method it found for the static type of the receiver. Choosing the method is hard-wired: the compiler does its part at compile time, and for an overridden method the JVM picks the implementation from the actual class of the object by a fixed rule. The algorithm is different from the previous one, but it is equally fixed and known in advance. Note also that the object on which the method is called is passed to the method as an implicit first parameter (this).
MethodHandle, CallSite and MethodHandles.Lookup
A CallSite
is a holder for a variable MethodHandle.
One way took on it is handle to the method (constructor, field, or similar low-level operation). This handle can change overtime. An invokedynamic
instruction delegates all calls to it’s MethodHandle.
A MethodHandle is an object that stores metadata about a method (constructor, field, or similar low-level operation), such as the name of the method, the signature of the method, etc. One way to look at it is as the destination of a pointer to a method (a de-referenced method, constructor, field, or similar low-level operation).
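To make "a holder for a variable MethodHandle" concrete, here is a minimal sketch in plain Java. It does not show how the JVM links an invokedynamic instruction; it only shows a MutableCallSite whose target is swapped between two calls. The class CallSiteDemo and its hello/bye methods are hypothetical names, and the handles are obtained through MethodHandles.Lookup, which is described next.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Illustrative sketch only: CallSiteDemo, hello and bye are hypothetical names.
class CallSiteDemo {
    static String hello() { return "hello"; }
    static String bye() { return "bye"; }

    // Calls through the same invoker before and after the target is swapped.
    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class); // ()String
            MethodHandle hello = lookup.findStatic(CallSiteDemo.class, "hello", type);
            MethodHandle bye = lookup.findStatic(CallSiteDemo.class, "bye", type);

            MutableCallSite site = new MutableCallSite(hello);
            // The invoker always delegates to the site's current target.
            MethodHandle invoker = site.dynamicInvoker();

            String before = (String) invoker.invokeExact();
            site.setTarget(bye); // the handle held by the call site changes
            String after = (String) invoker.invokeExact();
            return new String[] {before, after};
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = run();
        System.out.println(results[0] + ", then " + results[1]);
    }
}
```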
Java code can create a method handle that directly accesses any method, constructor, or field that is accessible to that code. This is done via a reflective, capability-based API called MethodHandles.Lookup. For example, a static method handle can be obtained from Lookup.findStatic. There are also conversion methods from Core Reflection API objects, such as Lookup.unreflect.
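Both routes can be sketched in a few lines. The class LookupDemo and its method names are hypothetical; Math.max and String.length are used only because they are public methods that every JDK has.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Illustrative sketch only: LookupDemo and its method names are hypothetical.
class LookupDemo {
    // Route 1: ask the Lookup directly for a static method by name and type.
    static int viaFindStatic() {
        try {
            MethodHandle max = MethodHandles.lookup().findStatic(
                    Math.class, "max",
                    MethodType.methodType(int.class, int.class, int.class));
            return (int) max.invokeExact(3, 7);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Route 2: start from a Core Reflection Method and convert it.
    static int viaUnreflect() {
        try {
            Method length = String.class.getMethod("length");
            MethodHandle handle = MethodHandles.lookup().unreflect(length);
            // For an instance method the receiver becomes the first argument.
            return (int) handle.invokeExact("indy");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaFindStatic());
        System.out.println(viaUnreflect());
    }
}
```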
It is important to understand 2 key differences between the Core Reflection API and MethodHandle.
- With MethodHandle the access check is done only once, at construction time; with the Core Reflection API it is done on every call to the invoke method (and the Security Manager is invoked each time, slowing down performance).
- The Core Reflection API invoke method is a regular method. In MethodHandle all invoke* variants are signature polymorphic methods.
Basically, an access check determines whether you can access a method (constructor, field, or similar low-level operation). For example, if the method (constructor, field, or similar low-level operation) is private, you normally can’t invoke it (or read the value of the field) from outside its class. More on this in just a second.
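The "checked once, at creation" behaviour can be sketched as follows. Vault and AccessDemo are hypothetical classes: an outsider is refused a handle to Vault's private method, while a handle that Vault creates with its own Lookup can be called by anyone it is handed to, with no further check.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustrative sketch only: Vault and AccessDemo are hypothetical classes.
class Vault {
    private static String secret() { return "42"; }

    // Created with Vault's own Lookup, so the private method is accessible.
    static MethodHandle secretHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                Vault.class, "secret", MethodType.methodType(String.class));
    }
}

class AccessDemo {
    // The access check happens here, when an outsider tries to create the handle.
    static boolean outsiderLookupFails() {
        try {
            MethodHandles.lookup().findStatic(
                    Vault.class, "secret", MethodType.methodType(String.class));
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // No access check happens here: the handle was already checked at creation.
    static String callThroughSharedHandle() {
        try {
            MethodHandle handle = Vault.secretHandle();
            return (String) handle.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(outsiderLookupFails());
        System.out.println(callThroughSharedHandle());
    }
}
```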
What is a signature polymorphic method? Quote from the JavaDoc:
Signature polymorphism
The unusual compilation and linkage behavior of invokeExact and plain invoke is referenced by the term signature polymorphism. As defined in the Java Language Specification, a signature polymorphic method is one which can operate with any of a wide range of call signatures and return types.
In source code, a call to a signature polymorphic method will compile, regardless of the requested symbolic type descriptor. As usual, the Java compiler emits an invokevirtual instruction with the given symbolic type descriptor against the named method. The unusual part is that the symbolic type descriptor is derived from the actual argument and return types, not from the method declaration.
When the JVM processes bytecode containing signature polymorphic calls, it will successfully link any such call, regardless of its symbolic type descriptor. (In order to retain type safety, the JVM will guard such calls with suitable dynamic type checks, as described elsewhere.)
Bytecode generators, including the compiler back end, are required to emit untransformed symbolic type descriptors for these methods. Tools which determine symbolic linkage are required to accept such untransformed descriptors, without reporting linkage errors.
https://docs.oracle.com/javase/8/docs/api/java/lang/invoke/MethodHandle.html
I will get back to signature polymorphic methods later.
Now, let’s see how we can convert Core Reflection API objects to MethodHandle.
Quote:
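In the meantime, a small sketch shows what "derived from the actual argument and return types" means in practice. PolymorphicDemo and twice are hypothetical names. The same handle is called three times, and the descriptor the compiler emits differs each time according to the argument and the cast on the result.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Illustrative sketch only: PolymorphicDemo and twice are hypothetical names.
class PolymorphicDemo {
    static long twice(long x) { return 2 * x; }

    static MethodHandle twiceHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                PolymorphicDemo.class, "twice",
                MethodType.methodType(long.class, long.class)); // (long)long
    }

    // Call site descriptor (J)J matches the handle's type exactly.
    static long exact() {
        try {
            return (long) twiceHandle().invokeExact(21L);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Call site descriptor (I)J differs; plain invoke adapts int to long.
    static long adapted() {
        try {
            return (long) twiceHandle().invoke(21);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Same (I)J descriptor, but invokeExact refuses to adapt it.
    static boolean exactRejectsInt() {
        try {
            long ignored = (long) twiceHandle().invokeExact(21);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

The mismatch in the last method compiles without complaint and is only detected when the call runs, which is the dynamic type check the quote mentions.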
A lookup object is a factory for creating method handles, when the creation requires access checking. Method handles do not perform access checks when they are called, but rather when they are created. Therefore, method handle access restrictions must be enforced when a method handle is created. The caller class against which those restrictions are enforced is known as the lookup class.
A lookup class which needs to create method handles will call MethodHandles.lookup to create a factory for itself. When the Lookup factory object is created, the identity of the lookup class is determined, and securely stored in the Lookup object. The lookup class (or its delegates) may then use factory methods on the Lookup object to create method handles for access-checked members. This includes all methods, constructors, and fields which are allowed to the lookup class, even private ones…
Access checks are applied in the factory methods of Lookup, when a method handle is created. This is a key difference from the Core Reflection API, since java.lang.reflect.Method.invoke performs access checking against every caller, on every call.
All access checks start from a Lookup object, which compares its recorded lookup class against all requests to create method handles. A single Lookup object can be used to create any number of access-checked method handles, all checked against a single lookup class.
A Lookup object can be shared with other trusted code, such as a metaobject protocol. A shared Lookup object delegates the capability to create method handles on private members of the lookup class. Even if privileged code uses the Lookup object, the access checking is confined to the privileges of the original lookup class.
A lookup can fail, because the containing class is not accessible to the lookup class, or because the desired class member is missing, or because the desired class member is not accessible to the lookup class, or because the lookup object is not trusted enough to access the member. In any of these cases, a ReflectiveOperationException will be thrown from the attempted lookup.
https://docs.oracle.com/javase/8/docs/api/java/lang/invoke/MethodHandles.Lookup.html
Note:
1. MethodHandles has a package-private IMPL_LOOKUP field that can access anything (you can get it with the Core Reflection API, at least on JDK 8; see my Java Platform Module System article on how you can access it in later versions from the classpath).
/** Package-private version of lookup which is trusted. */
static final Lookup IMPL_LOOKUP = new Lookup(Object.class, TRUSTED);
2. You can call java.lang.reflect.Method.setAccessible(true) before passing the method to Lookup.unreflect. Once we call field.setAccessible(true); it doesn’t matter which MethodHandles.Lookup object is used (again, I’m talking about the classpath case); this call makes such a field effectively public. So we will succeed in creating the MethodHandle. Internally, Lookup.unreflect will use the IMPL_LOOKUP object, which has unrestricted access to everything, instead of the MethodHandles.Lookup you obtained, to get the MethodHandle.
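The second option can be sketched as follows, assuming the code runs from the classpath as discussed above. Hidden and UnreflectDemo are hypothetical classes; publicLookup() is used on purpose, to show that once setAccessible(true) has been called even the least privileged Lookup can produce the handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

// Illustrative sketch only: Hidden and UnreflectDemo are hypothetical classes.
class Hidden {
    private String whisper() { return "psst"; }
}

class UnreflectDemo {
    static String callPrivate() {
        try {
            Method m = Hidden.class.getDeclaredMethod("whisper");
            m.setAccessible(true); // suppress the access check on the Method object

            // publicLookup() alone could not see a private method; the
            // accessible flag on the Method is what lets unreflect succeed.
            MethodHandle handle = MethodHandles.publicLookup().unreflect(m);
            return (String) handle.invokeExact(new Hidden());
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callPrivate());
    }
}
```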
So, you can think of MethodHandle as a more powerful alternative to the Core Reflection API. Framework developers often use the Core Reflection API to load a class at run time, to get a method/field, or to call a method at run time. But the Core Reflection API has a performance cost, as it does the security check each time. MethodHandle can be seen as an alternative approach (but again, it is much more powerful than this; we will see some of that power below).