WEBVTT 00:00.000 --> 00:14.080 All right, hello everybody, I'm glad to be back here again, finally. 00:14.080 --> 00:17.280 I don't practice my talks, so this might fit and it might not, so we're just going to 00:17.280 --> 00:19.480 get going a little early on this. 00:19.480 --> 00:23.800 I'm going to talk today a little bit about how we use invokedynamic in JRuby, the things 00:23.800 --> 00:28.720 that work, the things that don't, places where we see that there's possible improvement, 00:28.720 --> 00:33.800 and hopefully you'll learn a little bit about how invokedynamic works for us. 00:33.800 --> 00:39.840 Basic contact information for me; probably the most interesting thing there is that up until 00:39.840 --> 00:44.480 July, I was working for Red Hat, and they were funding the development of the project. 00:44.480 --> 00:50.120 Since then, we have been building our own open source support company to help keep JRuby 00:50.120 --> 00:55.840 going and to help fund other open source projects, just finding a way to connect up commercial 00:55.840 --> 01:00.960 concerns with open source projects like JRuby so that we can keep the developers funded 01:00.960 --> 01:04.040 and give the companies the support they need. 01:04.040 --> 01:08.120 So I've got stickers and business cards, if you're into that kind of stuff, JRuby stickers, and some 01:08.120 --> 01:10.800 information about the company too. 01:10.800 --> 01:17.640 So my role today is now kind of two things. I am still one of the JRuby leads, so I do a lot 01:17.640 --> 01:22.160 of the core development, a lot of the research and experimentation with all of these different 01:22.160 --> 01:27.360 JDK projects, and I do most of the community outreach, making sure that pull requests are going 01:27.360 --> 01:31.000 through, making sure we're finding the right people in the community to help develop the 01:31.000 --> 01:32.720 features we need. 01:32.720 --> 01:38.080 I'm also a co-founder of the new business, Headius Enterprises. We are focusing right now on 01:38.080 --> 01:43.520 providing commercial support to JRuby users, and it turns out there are a lot of large applications 01:43.520 --> 01:49.560 out there running on JRuby that are very excited to have us on their team, essentially. 01:49.560 --> 01:52.320 And hopefully we'll bring that to other open source projects as well. 01:52.320 --> 01:57.400 So if you or someone else has an open source project that you know a lot of companies are 01:57.400 --> 02:03.400 relying on, we can probably find a way to help get some funding organization, some funding arrangements 02:03.400 --> 02:04.400 happening. 02:04.400 --> 02:08.320 It's early days for us, but that's what we're looking to do. 02:08.320 --> 02:12.080 So JRuby: pretty straightforward, hopefully everybody's heard of it by now. 02:12.080 --> 02:18.040 It's Ruby on the JVM, of course. We focus very, very tightly on making it as much like 02:18.040 --> 02:21.400 the standard Ruby experience as possible. 02:21.400 --> 02:27.440 All the same command-line tools work, all the same libraries work. 02:27.440 --> 02:31.640 If it's pure Ruby, everything should just run great: Ruby on Rails, all the different 02:31.640 --> 02:34.960 database frameworks that are available for Ruby. 02:34.960 --> 02:39.160 And if we have native libraries, we even have FFI support so that we can do the same native 02:39.160 --> 02:43.240 calls and native integration that C Ruby does. 
02:43.240 --> 02:46.880 It's also about bringing the best of the JVM to Ruby. 02:46.880 --> 02:50.240 We have been pushing the edge of Ruby performance, 02:50.240 --> 02:57.520 Ruby concurrency, and scaling for 15, 20 years now, with JRuby being able to run all these applications. 02:57.520 --> 03:01.680 And that means leveraging all of these different OpenJDK projects. 03:01.680 --> 03:06.480 Every single thing that you would see in this room or at the JVM Language Summit is absolutely 03:06.480 --> 03:13.320 useful and important for JRuby and for us to make a better Ruby experience on the JVM. 03:13.320 --> 03:16.800 This question always comes up, so I'm going to just answer it right now. 03:16.800 --> 03:21.240 What about this TruffleRuby thing? I thought that was the way that JVM Ruby was going 03:21.240 --> 03:22.240 to go. 03:22.240 --> 03:25.360 TruffleRuby has very different goals from us. 03:25.360 --> 03:29.560 For those of you not familiar, TruffleRuby is a Ruby implementation using the Truffle 03:29.560 --> 03:32.560 framework on top of GraalVM. 03:32.560 --> 03:35.480 JRuby is a JVM implementation of Ruby. 03:35.480 --> 03:38.680 We want to run on every JVM, not just GraalVM. 03:38.680 --> 03:42.320 We want to run on embedded environments and Android and everything. 03:42.320 --> 03:48.640 So we focus on doing as good a job of implementing Ruby as we can with what the JVM provides, 03:48.640 --> 03:51.560 with what the JDK has for us. 03:51.560 --> 03:56.560 TruffleRuby being limited to GraalVM, being kind of focused on just two major platforms, 03:56.560 --> 04:00.200 macOS and Linux, that's too limited for what we want to do. 04:00.200 --> 04:02.840 And there are trade-offs with the Truffle framework. 04:02.840 --> 04:07.760 You can get some amazing long-term stable performance out of it. 04:07.760 --> 04:14.120 There's a lot of startup overhead, warm-up overhead, and a much larger memory footprint. 04:14.120 --> 04:16.040 So again, very different goals. 04:16.040 --> 04:20.440 JRuby is focused on being a JVM-targeted Ruby implementation. 04:20.440 --> 04:25.120 I'm not going to talk much about method handles, but in this context they're kind of the other half 04:25.120 --> 04:32.200 of invokedynamic, the Java side of the API to build up these method graphs and the adaptations. 04:32.200 --> 04:35.560 My talk from FOSDEM 2018 is still very relevant. 04:35.560 --> 04:41.040 It gives an intro to method handles and an intro to an API I wrote, called InvokeBinder, that 04:41.040 --> 04:45.480 makes it easier to work with method handles, and actually shows that you can implement an entire 04:45.480 --> 04:49.800 little language in method handles, and it will compile and optimize really well. 04:49.800 --> 04:52.960 So check that out if you want to learn about that side. 04:52.960 --> 04:55.360 So, invokedynamic in JRuby. 04:55.360 --> 04:58.320 Invokedynamic really makes JRuby possible. 04:58.360 --> 05:00.640 It makes things optimize the way we want. 05:00.640 --> 05:05.640 It allows us to do all of the different call forms and adaptations that we need to do. 05:05.640 --> 05:09.600 It allows us to shrink down the amount of bytecode that we generate so that the optimizations 05:09.600 --> 05:11.800 of the JVM work better. 05:11.800 --> 05:18.400 It really is an essential part of the JVM for getting a language like Ruby running. 05:18.400 --> 05:23.320 Unfortunately, we also still need to support a non-invokedynamic mode in JRuby. 
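To make the method handle side concrete before moving on, here is a minimal Java sketch of the kind of adaptation the java.lang.invoke API supports (pre-binding arguments and loosening types); the class and method names are illustrative assumptions, not JRuby's or InvokeBinder's actual code.

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class HandleAdaptationSketch {
        // A plain Java target we want to expose through a different call shape.
        static String greet(String name, int times) {
            return ("hello " + name + " ").repeat(times).trim();
        }

        public static void main(String[] args) throws Throwable {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle target = lookup.findStatic(
                    HandleAdaptationSketch.class, "greet",
                    MethodType.methodType(String.class, String.class, int.class));

            // Adapt: pre-bind the trailing int argument, leaving a (String)String handle.
            MethodHandle greetOnce = MethodHandles.insertArguments(target, 1, 1);

            // Adapt again: accept and return Object, the shape a dynamic caller wants.
            MethodHandle generic = greetOnce.asType(
                    MethodType.methodType(Object.class, Object.class));

            System.out.println(generic.invoke((Object) "world"));  // hello world
        }
    }

Each adaptation is another handle wrapping the previous one, which is exactly the kind of chain the JIT later collapses and inlines.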
05:23.320 --> 05:26.920 Indy takes a little bit longer to warm up because of all of the method handle logic 05:26.920 --> 05:27.920 that goes into it. 05:28.040 --> 05:31.520 There's more profiling that's required to optimize and inline things. 05:31.520 --> 05:34.520 So we still support running without invokedynamic. 05:34.520 --> 05:38.040 Hopefully, as we go forward, that will be less and less the case. 05:38.040 --> 05:45.040 And if any folks in the room are working on improving method handle and lambda form performance 05:45.040 --> 05:49.160 and memory footprint, that's what we're really looking for here. 05:49.160 --> 05:53.280 So the first area we're going to talk about is method calls; it's kind of the obvious thing 05:53.280 --> 05:55.680 that we use invokedynamic for in JRuby. 05:55.760 --> 05:59.640 And it was the earliest case for us to use invokedynamic. 05:59.640 --> 06:05.160 JRuby was pretty much the first dynamic language on the JVM to make use of invokedynamic. 06:05.160 --> 06:11.240 We actually were integrating it into JRuby before it was finalized and released in Java 7 06:11.240 --> 06:16.040 and helped drive a lot of the invokedynamic development at that point. 06:16.040 --> 06:20.880 This is not as straightforward as it seems; it's not just calling into reflection and getting a method pointer. 06:20.880 --> 06:26.080 We have lots of different targets, calls from Ruby to Ruby, Ruby to Java. 06:26.080 --> 06:29.000 You can call from Ruby into native code through our FFI. 06:29.000 --> 06:39.720 [brief pause while the room audio gets sorted out] 06:39.720 --> 06:46.920 So lots of different types of calls, different paths from Ruby to Java, Ruby to native. 06:46.960 --> 06:52.600 Then there's validation and binding: all of these Ruby classes and method tables can be mutated at runtime. 06:52.600 --> 06:55.480 It's part of the dynamic characteristics of the language. 06:55.480 --> 06:57.520 So we can't just bind it once. 06:57.520 --> 06:59.200 We have to be able to fall back. 06:59.200 --> 07:02.880 We have to be able to detect changes in the Ruby class structures. 07:02.880 --> 07:06.240 And then we need to be able to bind into all the different overloads of Java. 07:06.240 --> 07:11.200 We need to take calls from a dynamic language, turn them into a static call to a Java thing, 07:11.200 --> 07:13.600 and try to make that all wire up correctly. 07:13.600 --> 07:16.200 And then all of the adaptations along the way. 07:16.280 --> 07:22.560 Ruby supports optional arguments, variable-length argument lists, keyword arguments. 07:22.560 --> 07:29.200 Ideally, we don't have to box all of these things and throw them into a hash table for keywords or an array for 07:29.200 --> 07:30.520 args. 07:30.520 --> 07:35.560 Ideally, we can get those to go straight through on the stack without doing any extra allocation 07:35.560 --> 07:38.520 and putting more load on the JIT to optimize for us. 07:38.520 --> 07:44.200 So indy really does allow all of these adaptations to happen in JRuby and to inline and optimize 07:45.200 --> 07:46.200 well. 07:46.200 --> 07:47.200 Here's the eye chart. 07:47.200 --> 07:49.200 I'm not going to go through all of this. 07:49.200 --> 07:50.200 A few key things here. 07:50.200 --> 07:55.960 I mentioned that we have to invalidate if the type structures change, if method tables change. 07:55.960 --> 07:58.360 That is an active invalidation. 
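As a rough illustration of the call-site pattern described here, the sketch below shows a fallback-based invokedynamic bootstrap in plain Java; the names (DynamicCallBootstrap, fallback) and the assumption that the call site's first argument is the receiver are illustrative, not JRuby's actual implementation.

    import java.lang.invoke.*;

    // Sketch of a fallback-based dynamic call site: the first call resolves the
    // target, and a real runtime would then install a faster guarded target.
    public class DynamicCallBootstrap {
        public static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                                         MethodType type) throws Exception {
            MutableCallSite site = new MutableCallSite(type);
            MethodHandle fallback = lookup.findStatic(
                    DynamicCallBootstrap.class, "fallback",
                    MethodType.methodType(Object.class, MutableCallSite.class,
                                          String.class, Object.class, Object[].class));
            // Bind the site and method name, then collect the trailing args into Object[].
            MethodHandle bound = MethodHandles.insertArguments(fallback, 0, site, name)
                    .asCollector(Object[].class, type.parameterCount() - 1)
                    .asType(type);
            site.setTarget(bound);
            return site;
        }

        static Object fallback(MutableCallSite site, String name,
                               Object receiver, Object[] args) {
            // In a real runtime this would look the method up in the receiver's class,
            // guard on the class, and install a faster target via site.setTarget(...).
            // Here we just pretend every call resolves to toString().
            return receiver.toString();
        }
    }

The point is that the slow path and the fast path live behind the same call site, so the JVM can treat the whole thing as a single optimizable unit.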
07:58.360 --> 08:05.200 So whenever we cache a method at a call site, we grab a SwitchPoint, which is basically an abstraction 08:05.200 --> 08:11.640 around JVM safepoints, to say if this class changes, if a new class is introduced or a module 08:11.640 --> 08:16.800 is introduced, if a method table change happens, go and invalidate all those call sites, 08:16.800 --> 08:22.080 and the next time we go back through, we'll get the new methods and re-cache them. 08:22.080 --> 08:28.280 The more complex the binding is, the more adaptations we need to do, the longer the chain 08:28.280 --> 08:32.080 of method handles from our call site to the target method. 08:32.080 --> 08:37.520 Usually those will all get compressed down, inlined, and optimized away, but there are thresholds 08:37.520 --> 08:38.520 that we can cross. 08:38.520 --> 08:44.080 If we create too much complexity in a given call site, we may end up not inlining those 08:44.080 --> 08:45.080 pieces. 08:45.080 --> 08:50.280 Now we're actually performing worse with invokedynamic because we've actually broken the 08:50.280 --> 08:52.920 inlining process. 08:52.920 --> 08:55.080 And there are obviously a lot more opportunities here. 08:55.080 --> 09:00.960 There are some special types of calls in Ruby, like superclass calls or refined 09:00.960 --> 09:05.800 calls, which are methods that are patched within a certain scope. 09:05.800 --> 09:09.800 Those don't do any optimization currently; we're still adding those pieces, and hopefully 09:09.800 --> 09:15.040 upcoming versions of JRuby will finish those pieces as well. 09:15.040 --> 09:20.680 So a simple example in Ruby: we've got a foo method, the foo method calls bar, and bar calls 09:20.680 --> 09:24.920 the dump_stack method so we can see where we actually are in our execution. 09:24.920 --> 09:31.880 And I'm running this a couple of times in a loop so we don't have the extra one-time overhead there. 09:32.840 --> 09:38.480 JRuby supports an interpreter; we're actually a mixed-mode runtime on top of the JVM. 09:38.480 --> 09:43.960 So most code will run in our interpreter for a while, and eventually we will JIT it to JVM 09:43.960 --> 09:44.960 bytecode. 09:44.960 --> 09:50.840 In the interpreter, this is what it looks like. 09:50.840 --> 09:55.320 You see these specially named methods, interpret block, interpret method. 09:55.320 --> 10:01.480 This is how we take our interpreter frames and the JVM stack frames, splice them together, 10:01.480 --> 10:04.320 and produce a normal stack trace. 10:04.320 --> 10:09.280 It's kind of a complicated way of doing things, but it allows us to avoid the overhead 10:09.280 --> 10:14.360 of compiling everything to JVM bytecode when a large portion of it will be called never or 10:14.360 --> 10:17.160 maybe only once. 10:17.160 --> 10:22.520 If we actually turn on the JRuby JIT compiler that produces bytecode, now we see that our Ruby 10:22.520 --> 10:26.560 methods here have actually turned into JVM stack frames. 10:26.600 --> 10:32.080 So at the bottom, the block that we used in the times loop, and then above there the 10:32.080 --> 10:34.240 foo method and the bar method. 10:34.240 --> 10:37.000 And these are actual JVM stack frames here. 10:37.000 --> 10:43.080 We use a special little syntax to encode Ruby information about the method. 10:43.080 --> 10:45.560 So I know it's a def, it's a method. 10:45.560 --> 10:51.400 That part of the name marks it as being one of our Ruby frames that we want to pull out. 
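Here is a minimal Java sketch of the SwitchPoint-guarded caching just described: the fast path stays installed until a class or method-table change invalidates that generation. The class and method names are illustrative, not JRuby's.

    import java.lang.invoke.*;

    public class SwitchPointGuardSketch {
        public static void main(String[] args) throws Throwable {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, Object.class);

            // "Cached" fast path and the slow re-lookup path.
            MethodHandle fast = lookup.findStatic(SwitchPointGuardSketch.class,
                    "cachedTarget", type);
            MethodHandle slow = lookup.findStatic(SwitchPointGuardSketch.class,
                    "relookup", type);

            // One SwitchPoint per class/method-table generation.
            SwitchPoint generation = new SwitchPoint();
            MethodHandle guarded = generation.guardWithTest(fast, slow);

            System.out.println(guarded.invoke(new Object()));  // fast path

            // A method redefinition would invalidate the generation:
            SwitchPoint.invalidateAll(new SwitchPoint[] { generation });
            System.out.println(guarded.invoke(new Object()));  // now falls back
        }

        static String cachedTarget(Object self) { return "cached method"; }
        static String relookup(Object self)     { return "re-looked-up method"; }
    }

Until invalidation, the guard costs essentially nothing on the fast path, because it is backed by the JVM's safepoint machinery rather than a per-call check.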
10:51.440 --> 10:57.160 The number here is basically the index of that name in this particular script, so that 10:57.160 --> 11:03.520 if you have two classes in the same file with the same method, we don't overlap on those. 11:03.520 --> 11:07.880 Now if we keep going with this, the version that doesn't use invokedynamic: you can 11:07.880 --> 11:13.480 see we've got just a caching call site, basically an inline cache that we use. 11:13.480 --> 11:17.320 If we're not using indy, we just go through all of our own plumbing, and then there are multiple 11:17.320 --> 11:22.600 layers of adaptation that go through our utility classes, and finally we make the call. 11:22.600 --> 11:28.480 If we wire that all up with indy, this Java stack trace reduces down to that. 11:28.480 --> 11:33.440 All of the stuff in between the call and the receiver is done as method handles, which 11:33.440 --> 11:36.480 turn into lambda forms in the JVM. 11:36.480 --> 11:40.600 Those get compressed down; it generates a bit of code and inlines the whole thing. 11:40.600 --> 11:45.040 Now we can see we're actually doing these calls directly, and we can get the optimizations 11:45.040 --> 11:50.120 we would expect from inlining foo and bar together, for example. 11:50.120 --> 11:52.400 But this actually hides the complexity. 11:52.400 --> 11:56.880 If you actually force the JVM to do a stack dump, you're going to see something that 11:56.880 --> 11:59.040 looks more like this. 11:59.040 --> 12:04.880 These are the layers of lambda forms that secretly exist between the call and the receiver 12:04.880 --> 12:09.800 to do all of those adaptations, and they have these horrible names because it's little bits 12:09.800 --> 12:15.760 of generated code that are created to let the JVM do its normal sort of optimizations. 12:15.760 --> 12:21.280 The same optimizations it does for regular JVM bytecode, but as part of this invoke- 12:21.280 --> 12:26.960 dynamic method handle adaptation; then it can just use the same JIT process, 12:26.960 --> 12:30.680 the same profiling, optimization, and inlining altogether. 12:30.680 --> 12:34.840 But this is a challenge for us when we deal with things like profiling tools, which I'll 12:34.840 --> 12:36.960 talk about later. 12:37.040 --> 12:41.800 Going back here, changing the structure of these methods a little bit: here we're calling 12:41.800 --> 12:46.640 on the foo side, we're calling with one argument, it's going into a version of bar that 12:46.640 --> 12:51.720 has a variable-length argument list, and it still all inlines correctly; all those adaptations 12:51.720 --> 12:57.640 of turning the one argument into an array of arguments work just fine. 12:57.640 --> 13:01.120 We start running into problems if we get more complex than this. 13:01.120 --> 13:06.400 So here is the same stack trace, but showing where we have the times call, that will be 13:06.400 --> 13:12.000 Fixnum#times, the loop we did, that calls back into the Ruby code. 13:12.000 --> 13:17.200 Now we have these extra things because there's no way to do invokedynamic calls from Java 13:17.200 --> 13:18.200 directly. 13:18.200 --> 13:23.320 So we have to call through our block interface, which has to do some juggling of arguments 13:23.320 --> 13:28.480 and moving things around, and then finally it gets back into our compiled Ruby code. 
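The one-argument-into-varargs adaptation mentioned above can be sketched directly with method handles; this is a simplified illustration with made-up names, assuming a target that takes an Object[] the way a Ruby rest-args method would.

    import java.lang.invoke.*;

    public class VarargsAdaptSketch {
        // Stand-in for a Ruby method defined as `def bar(*args)`.
        static String bar(Object[] args) {
            return "bar got " + args.length + " arg(s)";
        }

        public static void main(String[] argv) throws Throwable {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    VarargsAdaptSketch.class, "bar",
                    MethodType.methodType(String.class, Object[].class));

            // Caller passes exactly one argument; collect it into the Object[]
            // the target expects. The JIT can still see through this adaptation.
            MethodHandle oneArg = target.asCollector(Object[].class, 1);
            System.out.println(oneArg.invoke("only"));   // bar got 1 arg(s)

            // A different arity just gets a different collector at the call site.
            MethodHandle twoArg = target.asCollector(Object[].class, 2);
            System.out.println(twoArg.invoke("a", "b")); // bar got 2 arg(s)
        }
    }

The reverse direction, splatting an already-built array onto a fixed-arity target, uses asSpreader and is the harder case discussed next.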
13:28.480 --> 13:33.080 This is an area we're looking to improve: try to find a better way that we can do invoke- 13:33.080 --> 13:38.280 dynamic from Java so that we can get the Ruby code inlined back into the Java code, just 13:38.280 --> 13:41.720 like the other direction. 13:41.720 --> 13:47.800 Another example: adapting Ruby's many different ways of doing argument lists to a target 13:47.800 --> 13:48.800 Java method. 13:48.800 --> 13:54.000 Here we're calling the same dump_stack method, but because we don't know how many arguments 13:54.000 --> 13:59.080 there might be in this incoming array, we're splatting this argument array, so we have to go 13:59.080 --> 14:03.800 through some additional adaptations. We can't do the inlining all the way to the 14:03.800 --> 14:08.840 target method, and this is more of just working in JRuby to try and bind these two sides 14:08.840 --> 14:14.640 together, try and avoid having to go back into our utility code, and make sure it's all 14:14.640 --> 14:16.840 invokedynamic. 14:16.840 --> 14:20.320 And then there are other call forms that we're still exploring. I mentioned doing dynamic 14:20.320 --> 14:26.720 calls from Java; I'm open to suggestions about ways we can rewrite the Java implementations 14:26.720 --> 14:32.480 of these methods to do dynamic calls. I don't have a good pattern for this right now. 14:32.480 --> 14:36.440 We also have the same problem that Java does with lambdas. 14:36.440 --> 14:41.680 If you call a single method with many different lambdas, well, that lambda dispatch becomes 14:41.680 --> 14:47.920 megamorphic, you can't inline through all of that, and the JVM currently does not profile 14:47.920 --> 14:50.720 across that megamorphic call. 14:50.720 --> 14:54.920 We're looking at doing some of our own manual specialization where, if we know that it's 14:54.920 --> 14:59.920 a simple method that receives a simple block, well, let's emit a new copy of it, let's 14:59.920 --> 15:04.880 actually specialize and split that method into another version; then the JVM can see through 15:04.880 --> 15:05.880 it. 15:05.880 --> 15:08.840 We really don't want to have to do this on our own. 15:08.840 --> 15:14.000 We would like to be able to hint to the JVM that it should specialize this path based 15:14.000 --> 15:19.360 on the lambda or the block we passed in, rather than each piece along the way: find that 15:19.360 --> 15:22.360 common path and inline it. 15:22.440 --> 15:28.360 We've always thought about and wanted to try to do numeric unboxing; the lure of partial 15:28.360 --> 15:34.400 escape analysis has kind of teased us for so many years, that we wouldn't have to worry 15:34.400 --> 15:36.440 about all these boxes. 15:36.440 --> 15:41.760 Now possibly with Valhalla we'll have value types, and we can double up our arguments and 15:41.760 --> 15:45.200 be able to pass a native version and the boxed version. 15:45.200 --> 15:47.680 But still, we don't do any unboxing. 15:48.000 --> 15:52.160 We're kind of hoping that the JVM will catch up with what we need to be able to represent 15:52.160 --> 16:00.000 objects, or represent primitives as objects, and optimize it. 16:00.000 --> 16:04.560 So the next area where we started using invokedynamic in JRuby is for handling Ruby instance 16:04.560 --> 16:07.000 variables, or fields, basically. 16:07.000 --> 16:08.560 So here's a simple class. 16:08.560 --> 16:13.400 We assign two instance variables, @name and @number, to the values passed in to the initialize 16:13.480 --> 16:15.480 constructor here. 
16:15.480 --> 16:19.960 attr_accessor is a Ruby feature to just add accessors, getters and setters, for those 16:19.960 --> 16:21.760 two fields. 16:21.760 --> 16:26.800 And then an example of how you can dynamically add new fields to a class. 16:26.800 --> 16:32.120 This is all at runtime, and we need to be able to efficiently represent objects in memory 16:32.120 --> 16:38.080 even though the set of fields they contain might change while we run. 16:38.080 --> 16:42.440 So instance variables are basically dynamically allocated object fields in a typical Ruby 16:42.440 --> 16:44.080 implementation. 16:44.080 --> 16:49.000 The way that we get around having this be a separate box of values, a separate array we carry 16:49.000 --> 16:54.760 along, is that we statically look through the method table, look for all the different variable 16:54.760 --> 16:59.440 accesses we see, and then make a best guess about what the shape of this object is going 16:59.440 --> 17:00.440 to be. 17:00.440 --> 17:05.440 That allows us to put most Ruby instance variables directly into Java fields, even though 17:05.440 --> 17:10.680 technically there's no declaration syntax for an instance variable. 17:10.680 --> 17:14.640 And then if it turns out that we're wrong, we still have our spill array that gets 17:14.640 --> 17:16.680 carried along with the object. 17:16.680 --> 17:20.520 Most of the time objects will be pretty well behaved and won't do a lot of dynamic instance 17:20.520 --> 17:22.280 variables. 17:22.280 --> 17:27.680 And then invokedynamic can actually have us wire up our access of this named field, 17:27.680 --> 17:32.720 this instance variable, straight into the Java field in the object. We cut out all of that 17:32.720 --> 17:37.840 access lookup, all of the validation of it, and actually go straight to the memory location 17:37.840 --> 17:41.840 to get the Ruby field. 17:41.840 --> 17:47.640 We also use invokedynamic heavily for managing Ruby constants and globals, which, oddly enough, 17:47.640 --> 17:51.440 are both mutable "constant" values. 17:51.440 --> 17:58.160 So here we have the debug constant being set to true, and a debug global variable being set to true. 17:58.160 --> 18:03.160 When you declare modules and classes in Ruby, that's actually just assigning new constants 18:03.160 --> 18:06.200 to a module object or a class object. 18:06.200 --> 18:14.360 And then at the bottom there is a fully qualified access of that Baz class in the middle. 18:14.360 --> 18:20.040 So for constants and globals: constants are scoped, lexically scoped, and then also scoped 18:20.040 --> 18:23.440 within the class hierarchy. 18:23.440 --> 18:28.600 They can be reassigned, but it's typically not done; it's considered bad form in a typical 18:28.600 --> 18:32.280 Ruby application, and usually you'll get warnings about it. 18:32.280 --> 18:36.560 Globals, on the other hand, can be modified all the time, and there are no warnings for that, 18:36.560 --> 18:40.920 but usually they end up falling into one of two cases: either constantly being mutated, or never 18:40.920 --> 18:47.480 mutated, like a debug variable that's probably not going to be turned on and off at runtime. 18:47.480 --> 18:50.280 The indy call sites actually work really well for this. 18:50.280 --> 18:52.880 We look up our constant value. 18:52.880 --> 18:58.360 We use a global invalidator based on the location of the constant or the name of the global 18:58.360 --> 18:59.760 variable. 
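A minimal sketch of the idea of binding an instance variable read straight to a Java field: a call site for @name could install a handle obtained from findGetter on the generated field, with the spill array used only for variables the shape guess missed. The layout class and field names here are illustrative assumptions; JRuby's real object-shape machinery is more involved.

    import java.lang.invoke.*;

    public class IvarAccessSketch {
        // Stand-in for a generated Ruby object layout: @name and @number became
        // real Java fields, plus a spill array for variables we didn't predict.
        public static class RubyObjectExample {
            public Object var0;          // @name
            public Object var1;          // @number
            public Object[] spill;       // dynamically added instance variables
        }

        public static void main(String[] args) throws Throwable {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // What a call site for an @name read could bind to: a direct field getter.
            MethodHandle getName = lookup.findGetter(
                    RubyObjectExample.class, "var0", Object.class);

            RubyObjectExample obj = new RubyObjectExample();
            obj.var0 = "Ruby";
            System.out.println(getName.invoke(obj));  // Ruby
        }
    }

Because the getter handle folds down to a plain field load after inlining, the guarded fast path costs about the same as a Java field read.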
19:00.080 --> 19:07.000 We can get that value to fold in as a constant using the existing invokedynamic features. 19:07.000 --> 19:12.960 Similarly, we do the same thing with global variables, but we usually have a fallback in case 19:12.960 --> 19:16.240 it's a variable that's actually mutated quite a bit. 19:16.240 --> 19:20.320 If it's being changed a lot, we fall back on a slow path so that we're not constantly 19:20.320 --> 19:27.880 throwing out code and invalidating an entire call graph just because it's a mutable value. 19:27.880 --> 19:29.800 There are places to improve here. 19:29.800 --> 19:35.120 Foo::Bar::Baz is currently, in JRuby, three separate constant lookups, but it's 19:35.120 --> 19:37.320 always going to produce the same result. 19:37.320 --> 19:42.040 And if we're not doing a lot of changing of those constants in a typical application, it 19:42.040 --> 19:43.960 really should be one lookup. 19:43.960 --> 19:47.680 We can shove that all behind invokedynamic and reduce the bytecode size. 19:47.680 --> 19:53.120 So this is another area that we're working to improve in the future. 19:53.120 --> 19:57.200 We actually use invokedynamic for creating Ruby literal values. 19:57.200 --> 20:03.680 Ruby has a much richer set of literals than what we can store in a constant pool in Java. 20:03.680 --> 20:09.080 We have our numerics, of course, but we have a literal big integer, a Bignum format, that 20:09.080 --> 20:14.800 we need to be able to have as a literal value without constructing it every time. 20:14.800 --> 20:17.160 Ruby strings are not Java strings. 20:17.160 --> 20:20.560 We represent a string as a byte array and an encoding. 20:20.600 --> 20:27.400 So we need a way to reconstitute that into a Ruby string object and ideally cache it 20:27.400 --> 20:32.800 in place like a constant value, as if it were a Java string literal; similarly with regular 20:32.800 --> 20:35.200 expression literals. 20:35.200 --> 20:37.560 There are cases where we're going a little bit beyond this. 20:37.560 --> 20:43.040 If we have arrays or hashes where it's all literal values, we should be able to just 20:43.040 --> 20:48.240 tell invokedynamic, create an array that looks like this, ideally share some of that 20:48.480 --> 20:53.200 storage, and not have to recreate the entire structure of the array every time, because we know 20:53.200 --> 20:56.480 it has immutable values in it. 20:56.480 --> 21:00.800 Other things that we're still working on: things like other, more complex hash 21:00.800 --> 21:07.080 formats, composite types like Complex and Rational, but most of this we can easily put into 21:07.080 --> 21:13.560 invokedynamic call sites and have one instruction to create even a large structure like 21:13.560 --> 21:15.560 a hash. 21:15.560 --> 21:19.200 Here's what those bytecodes look like if you look at JRuby's output. 21:19.200 --> 21:23.520 So at the top we have our Fixnum, which is a long; we just call into our Fixnum 21:23.520 --> 21:28.520 call site that does a bootstrap, goes out and creates a Ruby Fixnum object, and then it's 21:28.520 --> 21:34.040 cached at that point in the code forever. 21:34.040 --> 21:38.520 The next one is a frozen string; Ruby has both mutable and immutable strings. 21:38.520 --> 21:43.320 So here we have our hello string, and we know that it's going to be UTF-8 encoding. 
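A minimal sketch of a literal bootstrap along these lines: the expensive value is built once at link time and installed behind a ConstantCallSite, so later executions just reuse it. The bootstrap name and the choice of a Bignum-style BigInteger are illustrative, not JRuby's actual bootstrap code.

    import java.lang.invoke.*;
    import java.math.BigInteger;

    // Sketch of a literal bootstrap: build the value once, then fold it into the
    // call site as a constant so later executions pay nothing.
    public class LiteralBootstrapSketch {
        public static CallSite bignum(MethodHandles.Lookup lookup, String name,
                                      MethodType type, String digits) {
            // Construct the heavy value a single time, at link time.
            BigInteger value = new BigInteger(digits);
            // constant(...) produces a handle that always returns this exact object.
            MethodHandle constant = MethodHandles.constant(BigInteger.class, value)
                    .asType(type);
            return new ConstantCallSite(constant);
        }
    }

A frozen string or regexp literal would follow the same shape, with the bytes, encoding, and flags passed as extra bootstrap arguments instead of the digit string.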
21:43.320 --> 21:50.320 The 16 here is basically a flag that says this particular string contains only seven- 21:50.320 --> 21:56.720 bit characters, so we can optimize accesses to it. And then Ruby also supports debugging: 21:56.720 --> 22:02.040 if someone accidentally goes and tries to mutate a frozen string, we can print out an error 22:02.040 --> 22:05.800 that says where that string was allocated and let them know that they're not supposed 22:05.800 --> 22:08.000 to be modifying it. 22:08.240 --> 22:14.840 Similarly, the regular expression here: the expression is foo, the encoding is UTF-8, and 22:14.840 --> 22:19.240 the 512 is basically just regular expression flags that are embedded in it. 22:19.240 --> 22:24.960 I mentioned the array full of all literals, so this is an array that has two literal numeric 22:24.960 --> 22:28.200 values, the Fixnum 1 and Fixnum 2. 22:28.200 --> 22:32.680 That gets shoved into a structure passed off to invokedynamic, and we can quickly create 22:32.680 --> 22:35.960 that array without too much trouble. 22:35.960 --> 22:42.000 Bignums of course we have here too; we embed that as a string, turn it into a BigInteger 22:42.000 --> 22:46.120 that's inside of our Bignum object, and only have to do that allocation once. 22:46.120 --> 22:50.280 And then of course there's a range from one to ten as well. 22:50.280 --> 22:54.880 One of the newer areas where we're playing with invokedynamic is using it to encode string 22:54.880 --> 22:55.880 interpolation. 22:55.880 --> 23:02.440 Similar to the way that the JVM now does string concatenation using invokedynamic, reducing 23:02.440 --> 23:05.560 the bytecode, making it a little bit more optimal. 23:05.560 --> 23:11.360 We do the same thing: we embed some additional information, the constant, static 23:11.360 --> 23:17.160 parts of that string interpolation, plus a sort of map that says here we pull a static 23:17.160 --> 23:21.720 string, now we need a dynamic string, now we need a static string. 23:21.720 --> 23:25.520 Based on the size and structure of that, we do it a few different ways. 23:25.520 --> 23:29.560 We can call a number of different overloads that will stitch it together. 23:29.560 --> 23:34.520 We can just loop over all those values, or we just fall back to a slow case, so we don't 23:34.520 --> 23:40.400 create such a giant tree of method handles and overload the JVM that way. 23:40.400 --> 23:45.880 In this example of the string here, the value a was passed in. 23:45.880 --> 23:50.480 We embed our two static strings into the invokedynamic call site. 23:50.480 --> 23:57.000 Then the second-to-last value here is a bitmap that shows where the static 23:57.000 --> 23:59.880 values and where the dynamic values should go. 23:59.880 --> 24:03.840 And then it's just a matter of telling invokedynamic to stitch all those pieces together, 24:03.840 --> 24:05.640 and a string comes back out. 24:05.640 --> 24:09.600 But then we only have this one instruction in our bytecode rather than what would have 24:09.600 --> 24:14.720 been dozens of instructions to create these strings, put in the dynamic values, and then turn 24:14.720 --> 24:18.320 it into one string at the end. 24:18.320 --> 24:22.200 We also use it internally just for some of our own runtime plumbing. 24:22.200 --> 24:25.520 So when we create a block, we don't want to have to do that over again. 24:25.520 --> 24:29.080 We use invokedynamic to cache it in place. 
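The interpolation call site can be sketched in the same spirit as the JDK's StringConcatFactory: the static segments arrive as bootstrap arguments, and only the dynamic values flow through the call site at runtime. The recipe here (a simple alternation of static and dynamic pieces) and all names are illustrative assumptions rather than JRuby's actual encoding, which uses a bitmap and byte-array-backed strings.

    import java.lang.invoke.*;

    // Sketch of an interpolation bootstrap: static segments and an implied recipe
    // are bootstrap arguments; the call site receives only the dynamic values.
    public class InterpolationBootstrapSketch {
        public static CallSite interpolate(MethodHandles.Lookup lookup, String name,
                                           MethodType type, String... staticParts)
                throws Exception {
            MethodHandle stitcher = lookup.findStatic(
                    InterpolationBootstrapSketch.class, "stitch",
                    MethodType.methodType(String.class, String[].class, Object[].class));
            MethodHandle bound = MethodHandles.insertArguments(stitcher, 0,
                            (Object) staticParts)
                    .asCollector(Object[].class, type.parameterCount())
                    .asType(type);
            return new ConstantCallSite(bound);
        }

        // Alternate static and dynamic pieces: static[0], dyn[0], static[1], dyn[1]...
        static String stitch(String[] staticParts, Object[] dynamicParts) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < staticParts.length; i++) {
                out.append(staticParts[i]);
                if (i < dynamicParts.length) out.append(dynamicParts[i]);
            }
            return out.toString();
        }
    }

The generated bytecode then carries one invokedynamic instruction per interpolated string instead of a long run of builder calls.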
24:29.080 --> 24:36.080 We have heap-based local variables: blocks can close over a scope and still mutate 24:36.080 --> 24:38.960 values outside of them, unlike Java lambdas. 24:38.960 --> 24:42.760 So we need to maintain a separate heap structure for that. 24:42.760 --> 24:46.840 We use invokedynamic to try and speed up the process of digging into that heap structure 24:46.840 --> 24:49.480 and finding those local variables. 24:49.480 --> 24:54.320 We also have to support Ruby's ability to interrupt threads at any time. 24:54.320 --> 24:59.240 So there's a poll that we would do to check and see, have I been interrupted? 24:59.240 --> 25:00.840 Am I supposed to raise an error? 25:00.840 --> 25:04.360 Is this thread supposed to kill itself now? 25:04.360 --> 25:07.760 We don't want to constantly be polling a memory location for that. 25:07.760 --> 25:09.920 So we do it with a SwitchPoint over the JVM's safepoints. 25:09.920 --> 25:16.360 If the thread is interrupted, we flip the SwitchPoint, the code deoptimizes, we go and do 25:16.360 --> 25:18.680 our interrupt, and then we go back in. 25:18.680 --> 25:22.520 That's obviously very heavy because we throw out a lot of code. 25:22.520 --> 25:28.120 But in general, these are shutdown cases, critical cases in a Ruby application where 25:28.120 --> 25:32.360 you want to cleanly tear down a thread; the only way to do it is to cause it to raise an 25:32.360 --> 25:35.160 error or kill itself. 25:35.160 --> 25:38.760 We also have other little constant-dynamic things. 25:38.760 --> 25:42.320 There are places where we still use our old inline-caching call site. 25:42.320 --> 25:46.680 We create that with invokedynamic and cache it in place just to save some trouble. 25:46.680 --> 25:50.360 Now the last area here I want to talk about: this is kind of a cool thing we're doing 25:50.360 --> 25:52.320 with method handles. 25:52.320 --> 25:58.840 There are certain cases where we can turn a Ruby method into a chain of method handles just 25:58.840 --> 26:00.720 to make it inline a little bit better. 26:00.720 --> 26:05.920 For example, object construction in Ruby involves going to the class, telling it to allocate 26:05.920 --> 26:10.240 an object, and then calling initialize on that object. 26:10.240 --> 26:15.600 That actually is a method, Class#new, that we call; it allocates the 26:15.600 --> 26:20.840 object and calls the constructor. We don't want to have that megamorphic new method 26:20.840 --> 26:23.360 that we're calling through for every allocation. 26:23.360 --> 26:28.680 So we can turn that into a chain of method handles, inline all the way through the constructor, 26:28.680 --> 26:32.200 and not have to worry about it not inlining. 26:32.200 --> 26:39.800 Similarly, Ruby in recent versions added the ability to override not-equals, which by default is 26:39.800 --> 26:42.160 the negation of the equals method. 26:42.160 --> 26:46.760 We don't want to have to dispatch through the not-equals to get to equals, so we do a 26:46.760 --> 26:51.120 little extra magic with method handles to inline that all. 26:51.120 --> 26:52.680 There are other places we're looking at doing this. 26:52.680 --> 26:58.200 If we know we're calling one of the core loop methods, we can turn that into IR, and then 26:58.200 --> 27:03.440 inline all the things back and have our loop inline with the method that we call. 27:03.440 --> 27:06.560 And again, my FOSDEM 2018 talk shows an example of this. 
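As a rough illustration of the allocate-plus-initialize chain described here, the sketch below folds "construct the object, call initialize on it, return it" into one method handle using foldArguments; the Obj class and its initialize method are stand-ins, not JRuby's actual allocator code.

    import java.lang.invoke.*;

    // Sketch of folding "allocate, then call initialize, then return the object"
    // into a single method handle chain, roughly what a Class#new fast path needs.
    public class AllocateInitSketch {
        public static class Obj {
            String name;
            void initialize(String name) { this.name = name; }
        }

        public static void main(String[] args) throws Throwable {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            MethodHandle allocate = lookup.findConstructor(
                    Obj.class, MethodType.methodType(void.class));
            MethodHandle initialize = lookup.findVirtual(
                    Obj.class, "initialize",
                    MethodType.methodType(void.class, String.class));

            // initialize(obj, name) returns void; keep obj as the result instead.
            MethodHandle initReturningSelf = MethodHandles.foldArguments(
                    MethodHandles.dropArguments(
                            MethodHandles.identity(Obj.class), 1, String.class),
                    initialize);

            // Allocate first (ignoring the ctor arg), then feed the new object in.
            MethodHandle newAndInit = MethodHandles.foldArguments(
                    initReturningSelf,
                    MethodHandles.dropArguments(allocate, 0, String.class));

            Obj o = (Obj) newAndInit.invoke("Ruby");
            System.out.println(o.name);  // Ruby
        }
    }

Because the whole sequence is one handle, the JIT can inline from the caller through allocation and initialize without a megamorphic new in the middle.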
27:06.560 --> 27:11.640 You can basically compile anything into method handles and it will all execute and inline 27:11.640 --> 27:12.640 properly. 27:12.880 --> 27:17.880 We're using it in small cases to sort of indify some of these Ruby features. 27:17.880 --> 27:23.680 So there's an example of the new method on the class, with the allocate and the initialize. 27:23.680 --> 27:29.840 We turn that into a call from the foo method to a method handle chain that does allocate 27:29.840 --> 27:37.560 and initialize as one operation, and now it inlines and has the characteristics we need. 27:37.560 --> 27:39.920 All right, I'm basically done here. 27:39.920 --> 27:45.200 The big challenges that we have are with lambda forms: stack traces end up looking pretty 27:45.200 --> 27:47.400 horrendous when you do a thread dump. 27:47.400 --> 27:52.040 We need better ways to deal with that stack trace and, more importantly, we need ways to 27:52.040 --> 27:56.120 identify that those are not interesting for profiling purposes. 27:56.120 --> 28:00.440 You might have a call from foo to bar that goes through a couple of different lambda form paths; 28:00.440 --> 28:05.640 it should all be considered the same piece in a profile. 28:05.640 --> 28:09.400 The more we use lambda forms, the more complex this gets, and we're putting a lot of load 28:09.400 --> 28:10.400 on the JVM, too. 28:10.400 --> 28:15.400 We're also seeing a lot of java.lang.invoke objects that are taking up additional 28:15.400 --> 28:16.400 memory. 28:16.400 --> 28:20.480 The more complex our call sites, the more of these objects are in memory and the more are allocated, 28:20.480 --> 28:22.840 and that impacts startup and runtime. 28:22.840 --> 28:26.040 I'll skip these two examples. 28:26.040 --> 28:30.440 And the last thing: no indy from Java makes it very difficult for us to get the 28:30.440 --> 28:33.760 same characteristics calling back into Ruby from Java. 28:33.760 --> 28:38.120 We're hoping that the JVM is able to help us with this more in the future. 28:38.120 --> 28:40.000 So, future things that we're going to be doing. 28:40.000 --> 28:44.360 We are going to be filling in a lot of these different invokedynamic gaps in JRuby, doing 28:44.360 --> 28:49.720 better adaptations, getting wired up with Panama for native calls, really looking forward 28:49.720 --> 28:54.240 to working with all of the different OpenJDK projects to try and leverage as much as possible 28:54.240 --> 28:59.880 in JRuby, and giving you feedback about what we use and how it can be improved for languages 28:59.880 --> 29:00.880 in the future. 29:00.880 --> 29:01.880 That's all I've got. 29:01.880 --> 29:02.880 Thank you. 29:03.280 --> 29:13.400 We've got room for one question if you want. 29:13.400 --> 29:14.400 Yes, right there. 29:14.400 --> 29:17.040 Do you use dynamic constants at all? 29:17.040 --> 29:18.040 Dynamic constants. 29:18.040 --> 29:19.640 Do we use dynamic constants? 29:19.640 --> 29:24.280 They can be useful for us for some of our runtime stuff that does not depend on the current 29:24.280 --> 29:25.280 JRuby instance. 29:25.280 --> 29:32.360 But in most cases, we need to know what instance of JRuby we're running, what's being mixed 29:32.360 --> 29:33.360 into some class. 29:33.360 --> 29:37.320 There's a lot of runtime data that we need for those dynamic constants that's hard 29:37.320 --> 29:40.480 to get into constant dynamic right now. 29:40.480 --> 29:45.520 So it's good for plumbing, but not really for the core of Ruby. 29:45.520 --> 29:46.520 Yeah. 29:46.520 --> 29:47.520 All right? 
29:47.520 --> 29:50.520 Thank you. 29:50.520 --> 29:51.520 We're moving on to work.