WEBVTT

00:00.000 --> 00:14.080
All right, hello everybody, I'm glad to be back here again, finally.

00:14.080 --> 00:17.280
I don't practice my talks, so this might fit and it might not, so we're just going to

00:17.280 --> 00:19.480
get going a little early on this.

00:19.480 --> 00:23.800
I'm going to talk today a little bit about how we use invokedynamic in JRuby: the things

00:23.800 --> 00:28.720
that work, the things that don't, places where we see that there's possible improvement,

00:28.720 --> 00:33.800
and hopefully you'll learn a little bit about how invokedynamic works for us.

00:33.800 --> 00:39.840
Basic contact information for me; probably the most interesting thing there: up until

00:39.840 --> 00:44.480
July, I was working for Red Hat; they were funding the development of the project.

00:44.480 --> 00:50.120
Since then, we have been building our own open source support company to help keep JRuby

00:50.120 --> 00:55.840
going, and to help fund other open source projects, just finding a way to connect up commercial

00:55.840 --> 01:00.960
concerns with open source projects like JRuby so that we can keep the developers funded

01:00.960 --> 01:04.040
and give the companies the support they need.

01:04.040 --> 01:08.120
So I've got stickers and business cards, if you're into that kind of stuff: JRuby stickers, and some

01:08.120 --> 01:10.800
information about the company too.

01:10.800 --> 01:17.640
So my role today is kind of two things. I am still one of the JRuby leads, so I do a lot

01:17.640 --> 01:22.160
of the core development, a lot of the research, experimentation with all of these different

01:22.160 --> 01:27.360
JDK projects, I do most of the community outreach, make sure that pull requests are going

01:27.360 --> 01:31.000
through, make sure we're finding the right people in the community to help develop the

01:31.000 --> 01:32.720
features we need.

01:32.720 --> 01:38.080
Also, a co-founder of the new business, Headius Enterprises. We are focusing right now on

01:38.080 --> 01:43.520
providing commercial support to JRuby users, and it turns out there's a lot of large applications

01:43.520 --> 01:49.560
out there running on JRuby that are very excited to have us on their team, essentially.

01:49.560 --> 01:52.320
And hopefully trying to bring that to other open source projects.

01:52.320 --> 01:57.400
So if you or someone else has an open source project that you know a lot of companies are

01:57.400 --> 02:03.400
relying on, we can probably find a way to help get some funding arrangements

02:03.400 --> 02:04.400
happening.

02:04.400 --> 02:08.320
It's early days for us, but that's what we're looking to do.

02:08.320 --> 02:12.080
So JRuby, pretty straightforward; hopefully everybody's heard of it by now.

02:12.080 --> 02:18.040
It's Ruby on the JVM, of course. We focus very, very tightly on making it as much like

02:18.040 --> 02:21.400
the standard Ruby experience as possible.

02:21.400 --> 02:27.440
All the same command lines work, all the same libraries work.

02:27.440 --> 02:31.640
If it's pure Ruby, everything should just run great, like Ruby on Rails and all the different

02:31.640 --> 02:34.960
database frameworks that are available for Ruby.

02:34.960 --> 02:39.160
And if we have native libraries, we even have FFI support so that we can do the same native

02:39.160 --> 02:43.240
calls and native integration that C Ruby does.

02:43.240 --> 02:46.880
It's also just bringing the best of the JVM to Ruby.

02:46.880 --> 02:50.240
We have been pushing the edge of Ruby performance.

02:50.240 --> 02:57.520
Ruby concurrency, and scaling for 15, 20 years now, of JRuby being able to run all these applications.

02:57.520 --> 03:01.680
And that means leveraging all of these different OpenJDK projects.

03:01.680 --> 03:06.480
Every single thing that you would see in this room or at the JVM Language Summit is absolutely

03:06.480 --> 03:13.320
useful and important for JRuby and for us to make a better Ruby experience on the JVM.

03:13.320 --> 03:16.800
This question always comes up, so I'm going to just answer it right now.

03:16.800 --> 03:21.240
What about this TruffleRuby thing; I thought that was the way that JVM Ruby was going

03:21.240 --> 03:22.240
to go.

03:22.240 --> 03:25.360
TruffleRuby has very different goals from us.

03:25.360 --> 03:29.560
For those of you not familiar, TruffleRuby is a Ruby implementation using the Truffle

03:29.560 --> 03:32.560
framework on top of GraalVM.

03:32.560 --> 03:35.480
JRuby is a JVM implementation of Ruby.

03:35.480 --> 03:38.680
We want to run on every JVM, not just GraalVM.

03:38.680 --> 03:42.320
We want to run on embedded environments and Android and everything.

03:42.320 --> 03:48.640
So we focus on doing as good a job on implementing Ruby as we can with what the JVM provides,

03:48.640 --> 03:51.560
with what the JDK has for us.

03:51.560 --> 03:56.560
TruffleRuby, being limited to GraalVM, being kind of focused on just two major platforms,

03:56.560 --> 04:00.200
macOS and Linux, is too limited for what we want to do.

04:00.200 --> 04:02.840
And there are trade-offs with the Truffle framework.

04:02.840 --> 04:07.760
You can get some amazing long-term stable performance out of it.

04:07.760 --> 04:14.120
There's a lot of startup overhead, warm-up overhead, and a much larger memory footprint.

04:14.120 --> 04:16.040
So again, very different goals.

04:16.040 --> 04:20.440
JRuby is focused on being a JVM-targeted Ruby implementation.

04:20.440 --> 04:25.120
I'm not going to talk much about method handles here, but in essence they're the other half

04:25.120 --> 04:32.200
of invokedynamic: the Java side of the API to build up these method graphs and adaptations.

04:32.200 --> 04:35.560
My talk from FOSDEM 2018 is still very relevant.

04:35.560 --> 04:41.040
It gives an intro to method handles and an intro to an API I wrote, called InvokeBinder, that

04:41.040 --> 04:45.480
makes it easier to work with method handles, and actually shows that you can implement an entire

04:45.480 --> 04:49.800
little language in method handles, and it will compile and optimize really well.

04:49.800 --> 04:52.960
So check that out if you want to learn about that side.

04:52.960 --> 04:55.360
So, invokedynamic in JRuby.

04:55.360 --> 04:58.320
Invokedynamic really makes JRuby possible.

04:58.360 --> 05:00.640
It makes things optimize the way we want.

05:00.640 --> 05:05.640
It allows us to do all of the different call forms and adaptations that we need to do.

05:05.640 --> 05:09.600
It allows us to shrink down the amount of bytecode that we generate so that the optimizations

05:09.600 --> 05:11.800
of the JVM work better.

05:11.800 --> 05:18.400
It really is an essential part of the JVM for getting a language like Ruby running.

05:18.400 --> 05:23.320
Unfortunately, we also still need to support a non-invokedynamic mode in JRuby.

05:23.320 --> 05:26.920
Indy takes a little bit longer to warm up because of all of the method handle logic

05:26.920 --> 05:27.920
that goes into it.

05:28.040 --> 05:31.520
There's more profiling that's required to optimize and inline things.

05:31.520 --> 05:34.520
So we still support running without invokedynamic.

05:34.520 --> 05:38.040
Hopefully, as we go forward, that will be less and less the case.

05:38.040 --> 05:45.040
And if any folks in the room are working on improving method handle and lambda form performance

05:45.040 --> 05:49.160
and memory footprint, that's what we're really looking for here.

05:49.160 --> 05:53.280
So the first area we're going to talk about is method calls; it's kind of the obvious thing

05:53.280 --> 05:55.680
that we use invokedynamic for in JRuby.

05:55.760 --> 05:59.640
And it was the earliest case for us to use invokedynamic.

05:59.640 --> 06:05.160
JRuby was pretty much the first dynamic language on the JVM to make use of invokedynamic.

06:05.160 --> 06:11.240
We actually were integrating it into JRuby before it was finalized and released in Java 7

06:11.240 --> 06:16.040
and helped drive a lot of the invokedynamic development at that point.

06:16.040 --> 06:20.880
This is not as straightforward as it seems; it's not just calling into reflection and getting a method pointer.

06:20.880 --> 06:26.080
We have lots of different targets: calls from Ruby to Ruby, Ruby to Java.

06:26.080 --> 06:29.000
You can call from Ruby into native code through our FFI.

06:29.000 --> 06:34.800
Hey, hey, did the room sound work?

06:34.800 --> 06:39.720
And okay, yeah, yeah.

06:39.720 --> 06:46.920
So lots of different types of calls, different paths from Ruby to Java, Ruby to native.

06:46.960 --> 06:52.600
Validation and binding of all these calls: Ruby classes and method tables can be mutated at runtime.

06:52.600 --> 06:55.480
It's part of the dynamic characteristics of the language.

06:55.480 --> 06:57.520
So we can't just bind it once.

06:57.520 --> 06:59.200
We have to be able to fall back.

06:59.200 --> 07:02.880
We have to be able to detect changes in the Ruby class structures.

07:02.880 --> 07:06.240
And then we need to be able to bind into all the different overloads on the Java side.

07:06.240 --> 07:11.200
We need to take calls from a dynamic language, turn them into a static call to a Java thing

07:11.200 --> 07:13.600
and try to make that all wire up correctly.

07:13.600 --> 07:16.200
And then all of the adaptations along the way.

07:16.280 --> 07:22.560
Ruby can support optional arguments, variable length argument lists, keyword arguments.

07:22.560 --> 07:29.200
Ideally, we don't have to box all of these things and throw them into a hash table for keywords or an array for

07:29.200 --> 07:30.520
varargs.

07:30.520 --> 07:35.560
Ideally, we can get those to go straight through on the stack without doing any extra allocation

07:35.560 --> 07:38.520
or putting more load on the JIT to optimize for us.
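As a quick illustration of the argument shapes being described, here is a plain-Ruby sketch (the method name is hypothetical) mixing required, optional, variable-length, and keyword arguments in one signature:

```ruby
# One signature exercising the forms a call site must adapt: a required
# argument, an optional argument with a default, a *splat collecting the
# rest, and a keyword argument.
def call_forms(req, opt = 2, *rest, key: 10)
  [req, opt, rest, key]
end

minimal = call_forms(1)                    # => [1, 2, [], 10]
full    = call_forms(1, 5, 6, 7, key: 0)   # => [1, 5, [6, 7], 0]
```

Each of these call shapes ideally maps onto a plain argument list on the JVM stack instead of a boxed array or hash.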

07:38.520 --> 07:44.200
So indy really does allow all of these adaptations to happen in JRuby and to inline and optimize

07:45.200 --> 07:46.200
well.

07:46.200 --> 07:47.200
Here's the eye chart.

07:47.200 --> 07:49.200
I'm not going to go through all of this.

07:49.200 --> 07:50.200
A few key things here.

07:50.200 --> 07:55.960
I mentioned that we have to invalidate if the type structures change, if method tables change.

07:55.960 --> 07:58.360
That is an active invalidation.

07:58.360 --> 08:05.200
So whenever we cache a method at a call site, we grab a SwitchPoint, which is basically an abstraction

08:05.200 --> 08:11.640
around JVM safe points to say if this class changes, if a new class is introduced or a module

08:11.640 --> 08:16.800
is introduced, if a method table change happens, go and invalidate all those call sites

08:16.800 --> 08:22.080
and the next time we go back through, we'll get the new methods and re-cache them.

08:22.080 --> 08:28.280
The more complex the binding is, the more adaptations we need to do, the longer the chain

08:28.280 --> 08:32.080
of method handles from our call site to the target method.

08:32.080 --> 08:37.520
Usually those will all get compressed down and inlined and optimized away, but there are thresholds

08:37.520 --> 08:38.520
that we can crash into.

08:38.520 --> 08:44.080
If we create too much complexity in a given call site, we may end up not inlining those

08:44.080 --> 08:45.080
pieces.

08:45.080 --> 08:50.280
Now we're actually performing worse with invokedynamic, because we've actually broken the

08:50.280 --> 08:52.920
inlining process.

08:52.920 --> 08:55.080
And there's obviously a lot more opportunities here.

08:55.080 --> 09:00.960
I mentioned some special types of calls in Ruby, like superclass calls or refined

09:00.960 --> 09:05.800
calls, which are methods that are patched within a certain scope.

09:05.800 --> 09:09.800
Those don't do any optimization currently; we're still adding those pieces, and hopefully

09:09.800 --> 09:15.040
in upcoming versions of JRuby we'll finish those pieces as well.

09:15.040 --> 09:20.680
So a simple example in Ruby: we've got a foo method, the foo method calls bar, and bar calls

09:20.680 --> 09:24.920
the dump_stack method, so we can see where we actually are in our execution.

09:24.920 --> 09:31.880
And I'm running this a couple times in a loop and we don't have the extra overhead there.

09:32.840 --> 09:38.480
JRuby supports an interpreter; we're actually a mixed-mode runtime on top of the JVM.

09:38.480 --> 09:43.960
So most code will run in our interpreter for a while; eventually we will JIT it to JVM

09:43.960 --> 09:44.960
bytecode.

09:44.960 --> 09:50.840
In the interpreter, this is what it looks like.

09:50.840 --> 09:55.320
You see these specially named methods, interpret block, interpret method.

09:55.320 --> 10:01.480
This is how we take our interpreter frames and the JVM stack frames splice them together

10:01.480 --> 10:04.320
and produce a normal stack trace.

10:04.320 --> 10:09.280
It's kind of a complicated way of doing things, but it allows us to avoid the overhead

10:09.280 --> 10:14.360
of compiling everything to JVM bytecode when a large portion of it will be called never or

10:14.360 --> 10:17.160
maybe only once.

10:17.160 --> 10:22.520
If we actually turn on the JIT compiler that produces JVM bytecode, now we see that our Ruby

10:22.520 --> 10:26.560
methods here have actually turned into JVM stack frames.

10:26.600 --> 10:32.080
So at the bottom, the block that we used in the times loop, and then above there the

10:32.080 --> 10:34.240
full method and the bar method.

10:34.240 --> 10:37.000
And this is actual JVM stack frames here.

10:37.000 --> 10:43.080
I use a special little syntax to encode Ruby information about the method.

10:43.080 --> 10:45.560
So I know it's a def, it's a method.

10:45.560 --> 10:51.400
The marker identifies it as being one of our Ruby frames that we want to pull out.

10:51.440 --> 10:57.160
The number here is basically which index of that name in this particular script, so that

10:57.160 --> 11:03.520
if you have two classes in the same file with the same method, we don't overlap on those.

11:03.520 --> 11:07.880
Now if we keep going with this, this is the version that doesn't use invokedynamic, and you can

11:07.880 --> 11:13.480
see we've got just a caching call site, basically an inline cache that we use.

11:13.480 --> 11:17.320
If we're not using indy, we just go through all of our own plumbing, and then there's multiple

11:17.320 --> 11:22.600
layers of adaptation that goes through our utility classes and finally we make the call.

11:22.600 --> 11:28.480
If we wire that all up with indy, this whole stack trace reduces down to that.

11:28.480 --> 11:33.440
All of the stuff in between the call and the receiver is done as method handles which

11:33.440 --> 11:36.480
turns into lambda forms in the JVM.

11:36.480 --> 11:40.600
Those get compressed down; it generates bytecode, inlines the whole thing.

11:40.600 --> 11:45.040
Now we can see we're actually doing these calls directly and we can get the optimizations

11:45.040 --> 11:50.120
we would expect from inlining foo and bar together, for example.

11:50.120 --> 11:52.400
But this actually hides the complexity.

11:52.400 --> 11:56.880
If you actually force the JVM to do a stack dump, you're going to see something that

11:56.880 --> 11:59.040
looks more like this.

11:59.040 --> 12:04.880
These are the layers of lambda forms that secretly exist between the call and the receiver

12:04.880 --> 12:09.800
to do all of those adaptations and they have these horrible names because it's little bits

12:09.800 --> 12:15.760
of generated code that are created to let the JVM do its normal sort of optimizations.

12:15.760 --> 12:21.280
The same optimizations it does for regular JVM byte code, but as part of this invoked

12:21.280 --> 12:26.960
dynamic method handle adaptation; then it can just use the same JIT process,

12:26.960 --> 12:30.680
the same profiling, optimizing and inlining it all together.

12:30.680 --> 12:34.840
But this is a challenge for us when we deal with things like profiling tools which I'll

12:34.840 --> 12:36.960
talk about later.

12:37.040 --> 12:41.800
Going back here, changing the structure of these methods a little bit, here we're calling

12:41.800 --> 12:46.640
on the foo side; we're calling with one argument, and it's going into a version of bar that

12:46.640 --> 12:51.720
has a variable-length argument list. It still inlines correctly, and all those adaptations

12:51.720 --> 12:57.640
of turning the one argument into an array of arguments works just fine.

12:57.640 --> 13:01.120
We start running into problems if we get more complex than this.

13:01.120 --> 13:06.400
So here is the same stack trace, but showing where we have the times call; that will be

13:06.400 --> 13:12.000
Fixnum#times, the loop we did; that calls back into the Ruby code.

13:12.000 --> 13:17.200
Now we have these extra things, because there's no way to do invokedynamic calls from Java

13:17.200 --> 13:18.200
directly.

13:18.200 --> 13:23.320
So we have to call through our block interface; that has to do some juggling of arguments

13:23.320 --> 13:28.480
and moving things around, and then finally it gets back into our compiled Ruby code.

13:28.480 --> 13:33.080
This is an area we're looking to improve, try to find a better way that we can do invoked

13:33.080 --> 13:38.280
dynamic calls from Java, so that we can get the Ruby code inlined back into the Java code, just

13:38.280 --> 13:41.720
like vice versa.

13:41.720 --> 13:47.800
Another example: adapting Ruby's many different ways of doing argument lists to a target

13:47.800 --> 13:48.800
Java method.

13:48.800 --> 13:54.000
Here we're calling the same dump_stack method, but because we don't know how many arguments

13:54.000 --> 13:59.080
there might be in this incoming array, we're splatting this argument array, we have to go

13:59.080 --> 14:03.800
through some additional adaptations; we can't do the inlining all the way to the

14:03.800 --> 14:08.840
target method. This is more of just working in JRuby to try and bind these two sides

14:08.840 --> 14:14.640
together, try and avoid having to go back into our utility code and make sure it's all

14:14.640 --> 14:16.840
invokedynamic.
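The splat case being described looks like this in plain Ruby (the method name is hypothetical); the call site cannot know the arity until the array is unpacked at runtime:

```ruby
# *arr splats the array into separate arguments at the call site, so the
# receiver sees three arguments; passing arr directly passes one array.
def dump_args(*args)
  args
end

arr = [1, 2, 3]
splatted = dump_args(*arr)   # => [1, 2, 3]
direct   = dump_args(arr)    # => [[1, 2, 3]]
```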

14:16.840 --> 14:20.320
And then there are other call forms that we're still working on exploring. I mentioned doing dynamic

14:20.320 --> 14:26.720
calls from Java; I'm open to suggestions about ways we can rewrite the Java implementations

14:26.720 --> 14:32.480
of these methods to do dynamic calls. I don't have a good pattern for this right now.

14:32.480 --> 14:36.440
We also have the same problem that Java does with lambdas.

14:36.440 --> 14:41.680
If you call a single method with many different lambdas, well, that lambda's dispatch becomes

14:41.680 --> 14:47.920
megamorphic; you can't inline through all of that, and the JVM currently does not profile

14:47.920 --> 14:50.720
across that megamorphic call.

14:50.720 --> 14:54.920
We're looking at doing some of our own manual specialization where if we know that it's

14:54.920 --> 14:59.920
a simple method that receives a simple block, well, let's emit a new copy of it, let's

14:59.920 --> 15:04.880
actually specialize and split that method into another version, then the JVM can see through

15:04.880 --> 15:05.880
it.

15:05.880 --> 15:08.840
We really don't want to have to do this on our own.

15:08.840 --> 15:14.000
We would like to be able to hint to the JVM that you should specialize this path based

15:14.000 --> 15:19.360
on the lambda or the block we passed in rather than each piece along the way, find that

15:19.360 --> 15:22.360
common path, and inline it.
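The megamorphic-block situation can be reproduced in a few lines of plain Ruby (names are made up): one yield site inside a method sees a different block body on every call, which is exactly the shape that defeats inlining:

```ruby
# The single yield site in map_twice dispatches to three different block
# bodies below; in JRuby that dispatch point goes megamorphic.
def map_twice(x)
  [yield(x), yield(x)]
end

r1 = map_twice(3) { |v| v + 1 }    # => [4, 4]
r2 = map_twice(3) { |v| v * 2 }    # => [6, 6]
r3 = map_twice(3) { |v| v.to_s }   # => ["3", "3"]
```

Splitting map_twice per block, as described above, would give each copy a monomorphic yield site the JVM can see through.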

15:22.440 --> 15:28.360
We've always thought about and wanted to try to do numeric unboxing; the lure of partial

15:28.360 --> 15:34.400
escape analysis has kind of teased us for so many years that we wouldn't have to worry

15:34.400 --> 15:36.440
about all these boxes.

15:36.440 --> 15:41.760
Now possibly with Valhalla, we'll have value types, and we can double up our arguments and

15:41.760 --> 15:45.200
be able to pass a native version and the boxed version.

15:45.200 --> 15:47.680
But still, we don't do any unboxing.

15:48.000 --> 15:52.160
We're kind of hoping that the JVM will catch up with what we need to be able to represent

15:52.160 --> 16:00.000
objects, or represent primitives as objects, and optimize them.

16:00.000 --> 16:04.560
So the next area where we started using invokedynamic in JRuby is for handling Ruby instance

16:04.560 --> 16:07.000
variables or fields basically.

16:07.000 --> 16:08.560
So here's a simple class.

16:08.560 --> 16:13.400
We assign two instance variables name and number to the values passed in to the initialize

16:13.480 --> 16:15.480
constructor here.

16:15.480 --> 16:19.960
attr_accessor is a Ruby feature to just add accessors, getters and setters, for those

16:19.960 --> 16:21.760
two fields.

16:21.760 --> 16:26.800
And then an example of how you can dynamically add new fields to a class.

16:26.800 --> 16:32.120
This is all at runtime and we need to be able to efficiently represent objects in memory

16:32.120 --> 16:38.080
even though the set of fields they contain might change while we run.
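Here is a plain-Ruby sketch (the class name is hypothetical) of the shape-guessing problem: two instance variables visible statically, plus one added at runtime that would land in the spill storage:

```ruby
# @name and @number appear in the source, so they can be assigned real
# Java fields; @extra only appears at runtime and must go to the spill
# array carried alongside the object.
class Person
  attr_accessor :name, :number
  def initialize(name, number)
    @name = name
    @number = number
  end
end

person = Person.new("a", 1)
person.instance_variable_set(:@extra, 42)       # shape changes at runtime
extra = person.instance_variable_get(:@extra)   # => 42
```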

16:38.080 --> 16:42.440
So instance variables are basically dynamically allocated object fields in a typical Ruby

16:42.440 --> 16:44.080
implementation.

16:44.080 --> 16:49.000
The way that we get around having this be a separate box of values, a separate array we carry

16:49.000 --> 16:54.760
along, is that we statically look through the method table, look for all the different variable

16:54.760 --> 16:59.440
accesses we see, and then make a best guess about what the shape of this object is going

16:59.440 --> 17:00.440
to be.

17:00.440 --> 17:05.440
That allows us to put most Ruby instance variables directly into Java fields, even though

17:05.440 --> 17:10.680
technically there's no declaration syntax for an instance variable.

17:10.680 --> 17:14.640
And then if it turns out that we're wrong, we still have our spill array that gets

17:14.640 --> 17:16.680
carried along with an object.

17:16.680 --> 17:20.520
Most of the time objects will be pretty well behaved and won't do a lot of dynamic instance

17:20.520 --> 17:22.280
variables.

17:22.280 --> 17:27.680
And then invokedynamic can actually let us wire up our access of this named field,

17:27.680 --> 17:32.720
this instance variable, straight into the Java field on the object. We cut out all of that

17:32.720 --> 17:37.840
access lookup, all of the validation of it, and actually go straight to the memory location

17:37.840 --> 17:41.840
to get the Ruby field.

17:41.840 --> 17:47.640
We also use invokedynamic heavily for managing Ruby constants and globals, which, oddly enough,

17:47.640 --> 17:51.440
both of these are mutable, even the constant values.

17:51.440 --> 17:58.160
So here we have the debug constant being set to true, a debug global variable being set to true.

17:58.160 --> 18:03.160
When you declare modules and classes in Ruby that's actually just assigning new constants

18:03.160 --> 18:06.200
to a module object or a class object.

18:06.200 --> 18:14.360
And then at the bottom there is a fully qualified access of that Baz class in the middle.

18:14.360 --> 18:20.040
So for constants and globals, constants are scoped, lexically scoped, and then also scoped

18:20.040 --> 18:23.440
within the class hierarchy.

18:23.440 --> 18:28.600
They can be reassigned, but it's typically not done; it's considered bad form in a typical

18:28.600 --> 18:32.280
Ruby application and usually you'll get warnings about it.

18:32.280 --> 18:36.560
Globals, on the other hand, can be modified all the time; there's no warnings for that,

18:36.560 --> 18:40.920
but usually they end up falling into one of two buckets: either constantly being mutated or never

18:40.920 --> 18:47.480
mutated, like a debug variable, probably not going to be turned on and off at runtime.

18:47.480 --> 18:50.280
The indy call sites actually work really well for this.

18:50.280 --> 18:52.880
We look up our constant value.

18:52.880 --> 18:58.360
We use a global invalidator based on the location of the constant or the name of the global

18:58.360 --> 18:59.760
variable.

19:00.080 --> 19:07.000
We can get that value to fold in as a constant using the existing invokedynamic features.

19:07.000 --> 19:12.960
Similarly, we do the same thing with global variables, but we usually have a fallback in case

19:12.960 --> 19:16.240
it's a variable that's actually being used to mutate quite a bit.

19:16.240 --> 19:20.320
If it's being changed a lot, we fall back on a slow path so that we're not constantly

19:20.320 --> 19:27.880
throwing out code and invalidating an entire call graph just because it's a mutable value.

19:27.880 --> 19:29.800
There are places to improve here.

19:29.800 --> 19:35.120
Foo::Bar::Baz is currently, in JRuby, three separate constant lookups, but it's

19:35.120 --> 19:37.320
always going to produce the same result.

19:37.320 --> 19:42.040
And if we're not doing a lot of changing of those constants in a typical application, it

19:42.040 --> 19:43.960
really should be one lookup.

19:43.960 --> 19:47.680
We can shove that all behind invokedynamic to reduce bytecode size.

19:47.680 --> 19:53.120
So this is another area that we're working to improve in the future.
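The lookup pattern in question reads like this in plain Ruby (module names are made up); each :: segment is a separate constant lookup today, even though the result never changes in a typical application:

```ruby
DEBUG = true    # a "constant" that is nonetheless mutable at runtime
$debug = true   # a global variable

module Foo
  module Bar
    class Baz; end   # defining a class just assigns a constant on Bar
  end
end

# Fully qualified access: resolves Foo, then Foo::Bar, then Foo::Bar::Baz.
klass = Foo::Bar::Baz
```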

19:53.120 --> 19:57.200
We actually use invokedynamic for creating Ruby literal values.

19:57.200 --> 20:03.680
Ruby has a much richer set of literals than what we can store in a constant pool in Java.

20:03.680 --> 20:09.080
We have our numerics, of course, but we have a literal big integer, a Bignum format, that

20:09.080 --> 20:14.800
we need to be able to have as a literal value without constructing it every time.

20:14.800 --> 20:17.160
Ruby strings are not Java strings.

20:17.160 --> 20:20.560
We represent a string as a byte array and an encoding.

20:20.600 --> 20:27.400
So we need a way to re-constitute that into a Ruby string object and ideally cache it

20:27.400 --> 20:32.800
in place like it's a constant value like it was a Java string literal, similarly with regular

20:32.800 --> 20:35.200
expression literals.

20:35.200 --> 20:37.560
There are cases where we're going a little bit beyond this.

20:37.560 --> 20:43.040
If we have arrays or hashes where it's all literal values, we should be able to just

20:43.040 --> 20:48.240
tell invokedynamic to create an array that looks like this, and ideally share some of that

20:48.480 --> 20:53.200
store, not have to recreate the entire structure of the array every time because we know

20:53.200 --> 20:56.480
it has immutable values in it.

20:56.480 --> 21:00.800
Other things that we're still working on: things like more complex hash

21:00.800 --> 21:07.080
formats, composite types like Complex and Rational; but most of this we can easily put into

21:07.080 --> 21:13.560
invokedynamic call sites and have one instruction to create even a large structure like

21:13.560 --> 21:15.560
a hash.

21:15.560 --> 21:19.200
Here's what those bytecodes look like if you look in JRuby.

21:19.200 --> 21:23.520
So at the top we have our Fixnum, which is a long; we just call into our Fixnum

21:23.520 --> 21:28.520
site that does a bootstrap, goes out and creates a Ruby Fixnum object, and then it's

21:28.520 --> 21:34.040
cached at that point in the code forever.

21:34.040 --> 21:38.520
The next one is a frozen string, Ruby has both mutable and immutable strings.

21:38.520 --> 21:43.320
So here we have our hello string, we know that it's going to be UTF8 encoding.

21:43.320 --> 21:50.320
The 16 here is basically a code range flag that says this particular string is only seven

21:50.320 --> 21:56.720
bit characters, so we can optimize accesses to it, and then Ruby also supports debugging

21:56.720 --> 22:02.040
if someone accidentally goes and tries to mutate a frozen string, we can print out an error

22:02.040 --> 22:05.800
that says where that string was allocated and let them know that they're not supposed

22:05.800 --> 22:08.000
to be modifying it.
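A minimal sketch of the frozen-string behavior (using an explicit freeze in place of a frozen literal, so it runs the same on any Ruby):

```ruby
# Mutating a frozen string raises FrozenError; in debug modes the error
# can additionally report where the frozen string was created.
s = "hello".freeze
error = begin
  s << " world"   # attempt to mutate the frozen string
  nil
rescue FrozenError => e
  e
end
```

`error` ends up holding the FrozenError and `s` is untouched.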

22:08.240 --> 22:14.840
Similarly, the regular expression here: the expression is /foo/, the encoding is UTF-8, and

22:14.840 --> 22:19.240
the 512 is basically just regular expression flags that are embedded in it.

22:19.240 --> 22:24.960
I mentioned the array full of literals; so this is an array that has two literal numeric

22:24.960 --> 22:28.200
values, the Fixnum 1 and Fixnum 2.

22:28.200 --> 22:32.680
That gets shoved into a structure, passed out to invokedynamic, and we can quickly create

22:32.680 --> 22:35.960
that array without too much trouble.

22:35.960 --> 22:42.000
Bignums, of course, we have here too; we embed that as a string, turn it into a BigInteger

22:42.000 --> 22:46.120
that's inside of our Bignum object, and only have to do that allocation once.

22:46.120 --> 22:50.280
And then of course there's a range from one to ten as well.

22:50.280 --> 22:54.880
One of the newer areas where we're playing with invokedynamic is using it to encode string

22:54.880 --> 22:55.880
interpolation.

22:55.880 --> 23:02.440
Similar to the way that the JVM now does string concatenation using invokedynamic, reducing

23:02.440 --> 23:05.560
the byte code, making it a little bit more optimal.

23:05.560 --> 23:11.360
We do the same thing, so we embed some additional information, the constant, the static

23:11.360 --> 23:17.160
parts of that string interpolation plus a sort of map that says here we pull a static

23:17.160 --> 23:21.720
string, now we need a dynamic string, now we need a static string.

23:21.720 --> 23:25.520
Based on the size and structure of that, we do it a few different ways.

23:25.520 --> 23:29.560
We can call a number of different overloads that will stitch it together.

23:29.560 --> 23:34.520
We can just loop over all those values or we just fall back to a slow case, so we don't

23:34.520 --> 23:40.400
create such a giant tree of method handles and overload the JVM that way.

23:40.400 --> 23:45.880
In this example of the string here, the value a was passed in.

23:45.880 --> 23:50.480
We embed our two static strings into the invokedynamic call site.

23:50.480 --> 23:57.000
Then the second-to-last value here is a bitmap that shows where the static

23:57.000 --> 23:59.880
values and the dynamic values should go.

23:59.880 --> 24:03.840
And then it's just a matter of telling invokedynamic to stitch all those pieces together,

24:03.840 --> 24:05.640
and a string comes back out.

24:05.640 --> 24:09.600
But then we only have this one instruction in our bytecode rather than what would have

24:09.600 --> 24:14.720
been dozens of instructions to create these strings, put in the dynamic values, and then turn

24:14.720 --> 24:18.320
it into one string at the end.
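The decomposition being described can be sketched in plain Ruby: the interpolation splits into static parts and dynamic parts, and stitching them is all the call site has to do:

```ruby
# "one #{a} two" decomposes into static parts ["one ", " two"] and one
# dynamic part (a.to_s); JRuby hands that decomposition to a single
# invokedynamic instruction instead of dozens of bytecodes.
a = 42
interpolated = "one #{a} two"

statics  = ["one ", " two"]
dynamics = [a.to_s]
manual   = statics[0] + dynamics[0] + statics[1]   # the same stitching, by hand
```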

24:18.320 --> 24:22.200
We also use it internally just for some of our own runtime plumbing.

24:22.200 --> 24:25.520
So when we create a block, we don't want to have to do that over again.

24:25.520 --> 24:29.080
We use invokedynamic to cache it in place.

24:29.080 --> 24:36.080
We have heap-based local variables; blocks can close over a scope and still mutate

24:36.080 --> 24:38.960
values outside of them, unlike Java lambdas.
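The difference from Java lambdas is easy to show in plain Ruby:

```ruby
# A Ruby block can write to a local in the enclosing scope, so that
# local must live in a heap-allocated scope object rather than purely
# on the JVM stack.
counter = 0
increment = proc { counter += 1 }   # closes over and mutates counter

3.times(&increment)
# counter is now 3
```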

24:38.960 --> 24:42.760
So we need to maintain a separate heap structure for that.

24:42.760 --> 24:46.840
We use invokedynamic to try and speed up the process of digging into that heap structure

24:46.840 --> 24:49.480
and finding those local variables.

24:49.480 --> 24:54.320
We also have to support Ruby's ability to interrupt threads at any time.

24:54.320 --> 24:59.240
So there's a poll that we would do to check and see, have I been interrupted?

24:59.240 --> 25:00.840
Am I supposed to raise an error?

25:00.840 --> 25:04.360
Is this thread supposed to kill itself now?

25:04.360 --> 25:07.760
We don't want to constantly be polling a memory location for that.

25:07.760 --> 25:09.920
So we do it with safepoints.

25:09.920 --> 25:16.360
If the thread is interrupted, we flip the safepoint, the code de-optimizes, we go and do

25:16.360 --> 25:18.680
our interrupt, and then we go back in.
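[Editor's note: this safepoint pattern can be sketched with `SwitchPoint`, which is the standard JVM mechanism for this kind of invalidation. The path names are illustrative, not JRuby's code.]

```java
import java.lang.invoke.*;

public class Safepoint {
    static String fastPath() { return "fast"; }
    static String slowPath() { return "interrupted"; }

    // Guard the fast path behind a SwitchPoint: no per-call memory read; the
    // JIT folds the still-valid branch away entirely until it is invalidated.
    static MethodHandle makeGuarded(SwitchPoint safepoint) throws Exception {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType t = MethodType.methodType(String.class);
        MethodHandle fast = l.findStatic(Safepoint.class, "fastPath", t);
        MethodHandle slow = l.findStatic(Safepoint.class, "slowPath", t);
        return safepoint.guardWithTest(fast, slow);
    }

    public static void main(String[] args) throws Throwable {
        SwitchPoint safepoint = new SwitchPoint();
        MethodHandle guarded = makeGuarded(safepoint);
        System.out.println((String) guarded.invokeExact()); // fast
        // An interrupt flips the switch; every guarded site de-optimizes to
        // the slow path, which handles the interrupt before re-linking.
        SwitchPoint.invalidateAll(new SwitchPoint[]{ safepoint });
        System.out.println((String) guarded.invokeExact()); // interrupted
    }
}
```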

25:18.680 --> 25:22.520
That's obviously very heavy because we throw out a lot of code.

25:22.520 --> 25:28.120
But in general, these are shutdown cases, critical cases in a Ruby application, where

25:28.120 --> 25:32.360
you want to cleanly tear down a thread, the only way to do it is to cause it to raise an

25:32.360 --> 25:35.160
error or kill itself.

25:35.160 --> 25:38.760
We also have other little constant dynamic things.

25:38.760 --> 25:42.320
There are places where we still use our old inline caching call sites.

25:42.320 --> 25:46.680
We create that with invoke dynamic and cache it in place just to save some trouble.

25:46.680 --> 25:50.360
Now the last area here I want to talk about, this is kind of a cool thing we're doing

25:50.360 --> 25:52.320
with method handles.

25:52.320 --> 25:58.840
There are certain cases where we can turn a Ruby method into a chain of method handles just

25:58.840 --> 26:00.720
to make it inline a little bit better.

26:00.720 --> 26:05.920
For example, object construction in Ruby involves going to the class, telling it to allocate

26:05.920 --> 26:10.240
an object, and then calling initialize on that object.

26:10.240 --> 26:15.600
That actually is a method, Class#new, that we call; it allocates the

26:15.600 --> 26:20.840
object and calls the constructor. We don't want to have that megamorphic new method

26:20.840 --> 26:23.360
that we're calling through for every allocation.

26:23.360 --> 26:28.680
So we can turn that into a chain of method handles that inlines all the way through the constructor

26:28.680 --> 26:32.200
and not have to worry about it not inlining.

26:32.200 --> 26:39.800
Similarly, Ruby in recent versions added the ability to override not-equals, which by default is the

26:39.800 --> 26:42.160
negation of the equals method.

26:42.160 --> 26:46.760
We don't want to have to dispatch through the not equals to get to equals, so we do a

26:46.760 --> 26:51.120
little extra magic with method handles to inline all of that.
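[Editor's note: one way to do this composition is `MethodHandles.filterReturnValue`, sketched below under assumed helper names; this is not JRuby's actual dispatch code.]

```java
import java.lang.invoke.*;

public class NotEquals {
    static boolean eq(Object a, Object b) { return a.equals(b); }
    static boolean not(boolean b) { return !b; }

    // Compose != directly as not(eq(a, b)): a single handle chain, with no
    // extra dynamic dispatch through a separate not-equals method.
    static MethodHandle makeNe() throws Exception {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle eq = l.findStatic(NotEquals.class, "eq",
                MethodType.methodType(boolean.class, Object.class, Object.class));
        MethodHandle not = l.findStatic(NotEquals.class, "not",
                MethodType.methodType(boolean.class, boolean.class));
        return MethodHandles.filterReturnValue(eq, not);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle ne = makeNe();
        System.out.println((boolean) ne.invokeExact((Object) "a", (Object) "b")); // true
        System.out.println((boolean) ne.invokeExact((Object) "a", (Object) "a")); // false
    }
}
```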

26:51.120 --> 26:52.680
There are other places we're looking at doing this.

26:52.680 --> 26:58.200
If we know we're calling one of the core loop methods, we can turn that into IR, and then

26:58.200 --> 27:03.440
inline everything back and have our loop inlined with the method that we call.

27:03.440 --> 27:06.560
And again, my FOSDEM 2018 talk shows an example of this.

27:06.560 --> 27:11.640
You can basically compile anything into method handles and it will all execute and inline

27:11.640 --> 27:12.640
properly.

27:12.880 --> 27:17.880
We're using it in small cases to sort of intrinsify some of these Ruby features.

27:17.880 --> 27:23.680
So there's an example of the new method in the class, with allocate and initialize.

27:23.680 --> 27:29.840
We turn that into a call from the foo method to a method handle chain that does allocate

27:29.840 --> 27:37.560
and initialize as one operation and now it inlines and has the characteristics we need.
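[Editor's note: the allocate-plus-initialize fusion can be sketched with `MethodHandles.foldArguments`. The `RubyObjectish` class and method names are hypothetical; JRuby's real chain is more involved.]

```java
import java.lang.invoke.*;

public class AllocInit {
    static class RubyObjectish { String state = "uninitialized"; }

    static RubyObjectish allocate() { return new RubyObjectish(); }
    static RubyObjectish initialize(RubyObjectish self, String arg) {
        self.state = arg;
        return self;
    }

    // Fuse allocate + initialize into one handle chain: foldArguments runs
    // `allocate` first, then feeds its result in as `initialize`'s receiver.
    static MethodHandle makeNewHandle() throws Exception {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodHandle alloc = l.findStatic(AllocInit.class, "allocate",
                MethodType.methodType(RubyObjectish.class));
        MethodHandle init = l.findStatic(AllocInit.class, "initialize",
                MethodType.methodType(RubyObjectish.class,
                        RubyObjectish.class, String.class));
        return MethodHandles.foldArguments(init, alloc);
    }

    public static void main(String[] args) throws Throwable {
        RubyObjectish o = (RubyObjectish) makeNewHandle().invokeExact("hello");
        System.out.println(o.state); // hello
    }
}
```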

27:37.560 --> 27:39.920
All right, I'm basically done here.

27:39.920 --> 27:45.200
The big challenges that we have are with lambda forms: stack traces end up looking pretty

27:45.200 --> 27:47.400
horrendous when you do a thread dump.

27:47.400 --> 27:52.040
We need better ways to deal with that stack trace and more importantly, we need ways to

27:52.040 --> 27:56.120
identify that those are not interesting for profiling purposes.

27:56.120 --> 28:00.440
You might have a call from foo to bar that goes through a couple of different lambda form paths;

28:00.440 --> 28:05.640
those should be considered the same piece in a profile.

28:05.640 --> 28:09.400
The more we use Lambda forms, the more complex this gets and we're putting a lot of load

28:09.400 --> 28:10.400
on the JVM, too.

28:10.400 --> 28:15.400
We're also seeing a lot of java.lang.invoke objects that are taking up additional

28:15.400 --> 28:16.400
memory.

28:16.400 --> 28:20.480
The more complex our call sites, the more of these objects are in memory and allocated,

28:20.480 --> 28:22.840
and that impacts startup and runtime.

28:22.840 --> 28:26.040
I'll skip these two samples.

28:26.040 --> 28:30.440
Again, the last thing: no indy from Java makes it very difficult for us to get the

28:30.440 --> 28:33.760
same characteristics calling back into Ruby from Java.

28:33.760 --> 28:38.120
We're hoping that the JVM is able to help us with this more in the future.

28:38.120 --> 28:40.000
So future things that we're going to be doing.

28:40.000 --> 28:44.360
We are going to be filling in a lot of these different invoke dynamic gaps in the JVM, doing

28:44.360 --> 28:49.720
better adaptations, getting wired up with Panama for native calls, really looking forward

28:49.720 --> 28:54.240
to working with all of the different OpenJDK projects to try and leverage as much as possible

28:54.240 --> 28:59.880
in the JVM and give you feedback about what we use and how it can be improved for languages

28:59.880 --> 29:00.880
in the future.

29:00.880 --> 29:01.880
That's all I've got.

29:01.880 --> 29:02.880
Thank you.

29:03.280 --> 29:13.400
We've got room for one question if you want.

29:13.400 --> 29:14.400
[Audience question, partly inaudible:]

29:14.400 --> 29:17.040
Do you use dynamic constants at all?

29:17.040 --> 29:18.040
Dynamic constants.

29:18.040 --> 29:19.640
Do we use dynamic constants?

29:19.640 --> 29:24.280
They can be useful for us for some of our runtime stuff that does not depend on the current

29:24.280 --> 29:25.280
JRuby runtime.

29:25.280 --> 29:32.360
But in most cases, we need to know what instance of JRuby we're running, what we're mixing

29:32.360 --> 29:33.360
into some class.

29:33.360 --> 29:37.320
There's a lot of runtime data that we need for those dynamic constants that's hard

29:37.320 --> 29:40.480
to get into constant dynamic right now.

29:40.480 --> 29:45.520
So it's good for plumbing, but not really for the core of Ruby.

29:45.520 --> 29:46.520
Yeah.

29:46.520 --> 29:47.520
All right?

29:47.520 --> 29:48.520
Thank you.

29:50.520 --> 29:51.520
We're moving on to work.

