WEBVTT 00:00.000 --> 00:14.000 So, today's presentation is going to be focused on that JMH benchmark that you see on the left-hand side here. 00:14.000 --> 00:18.000 We're trying to figure out how fast it is to run a String.charAt. 00:18.000 --> 00:22.000 So, today we're dealing with two types of string: a string that is Latin-1, 00:22.000 --> 00:27.000 so a standard string with just ISO-8859-1 charset characters, 00:27.000 --> 00:31.000 and the other one is a UTF-16 string. 00:31.000 --> 00:37.000 And for today, we also want to remember a little bit how String.charAt works. 00:37.000 --> 00:41.000 The way it works is: charAt does a basic check to see whether this is a Latin-1 string. 00:41.000 --> 00:47.000 If it is, it delegates to StringLatin1.charAt; otherwise, to StringUTF16.charAt, 00:47.000 --> 00:53.000 where each character is two bytes instead of one. 00:54.000 --> 00:59.000 Something that might be worth remembering is how StringLatin1.charAt works. 00:59.000 --> 01:05.000 All it really does is check that the index is within the boundaries of the string, 01:05.000 --> 01:08.000 for the character I'm trying to locate. 01:08.000 --> 01:13.000 And then all it does is extract the character. It's simple. 01:13.000 --> 01:17.000 StringUTF16.charAt is very similar. 01:17.000 --> 01:19.000 It's a little more complicated. 01:19.000 --> 01:24.000 We've got two bytes per character, but for today, that's not so important. 01:24.000 --> 01:26.000 But it's important to remember how this works: 01:26.000 --> 01:28.000 the StringLatin1.charAt. 01:28.000 --> 01:36.000 And what you see on the right-hand side, that is the code that JMH generates for an average-time benchmark. 01:36.000 --> 01:42.000 So, basically, it wraps your benchmark in some code 01:42.000 --> 01:46.000 where it starts by taking the start time, 01:46.000 --> 01:50.000 then it goes into a loop and invokes your method all the time, 01:50.000 --> 01:53.000 counting the number of operations, until it's done. 01:53.000 --> 01:54.000 "It's done": 01:54.000 --> 02:00.000 that's basically a boolean variable here. 02:00.000 --> 02:02.000 A boolean, a volatile boolean. 02:02.000 --> 02:04.000 And then it calculates the stop time. 02:04.000 --> 02:06.000 It's got the time, it's got the operation count. 02:06.000 --> 02:08.000 It can calculate the score. Simple. 02:08.000 --> 02:13.000 So, what we're going to do next is just run it. 02:13.000 --> 02:16.000 So, let's go here. 02:16.000 --> 02:17.000 This is big enough. 02:17.000 --> 02:20.000 So, let's run this benchmark. 02:20.000 --> 02:24.000 We're going to run it in a slightly novel way. 02:24.000 --> 02:27.000 Novel, because this is the first time 02:27.000 --> 02:30.000 that I'm showing this to the outside world. 02:30.000 --> 02:35.000 What you see is, we're benchmarking 02:35.000 --> 02:37.000 a GraalVM native image. 02:37.000 --> 02:39.000 Basically, before this talk, 02:39.000 --> 02:43.000 I wrapped the benchmark into a GraalVM native image. 02:43.000 --> 02:45.000 And what I'm doing here is I'm basically 02:45.000 --> 02:49.000 invoking the native binary, which is at target/benchmarks. 02:49.000 --> 02:53.000 And you can see from the VM version that there are some clues here. 02:53.000 --> 02:55.000 We run on Substrate VM. 02:55.000 --> 02:56.000 So, this is not HotSpot. 02:56.000 --> 02:59.000 This is the VM that runs GraalVM native images. 02:59.000 --> 03:02.000 And we see we're running GraalVM Community Edition.
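For reference, the String.charAt dispatch described above looks roughly like this in the OpenJDK sources (paraphrased and simplified here, not verbatim JDK code):

```java
// java.lang.String (simplified): pick the implementation based on the coder
public char charAt(int index) {
    if (isLatin1()) {                            // compact form, 1 byte per char
        return StringLatin1.charAt(value, index);
    }
    return StringUTF16.charAt(value, index);     // 2 bytes per char
}

// StringLatin1.charAt (simplified): a bounds check, a load, a widening
public static char charAt(byte[] value, int index) {
    checkIndex(index, value.length);             // index within boundaries?
    return (char) (value[index] & 0xff);         // zero-extend byte to char
}
```

And the average-time loop that JMH generates can be sketched like this (a minimal paraphrase of the generated stub; this is not the actual JMH code, and the names are illustrative):

```java
class AvgTimeLoopSketch {
    volatile boolean isDone;              // flipped by a harness timer thread
    String latin1 = "benchmark";
    int index = 3;

    double measureNsPerOp() {
        long operations = 0;
        char sink = 0;                        // stand-in for JMH's Blackhole
        long startTime = System.nanoTime();
        do {
            sink ^= latin1.charAt(index);     // invoke the method under test
            operations++;
        } while (!isDone);                    // volatile read every iteration
        long stopTime = System.nanoTime();
        return (double) (stopTime - startTime) / operations;
    }
}
```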
03:02.000 --> 03:06.000 This is the first time this is shown live outside of my team. 03:06.000 --> 03:09.000 And we can see some numbers, okay? 03:09.000 --> 03:13.000 So, here at the bottom you can see some numbers. 03:13.000 --> 03:16.000 Let me put them right away a little higher up. 03:16.000 --> 03:20.000 So, there are a few questions we can ask about these numbers. 03:20.000 --> 03:24.000 First one: are these numbers fast or slow? 03:24.000 --> 03:26.000 That's a little bit hard for us to know, 03:26.000 --> 03:31.000 because first, you don't know what the specs of this machine are. 03:31.000 --> 03:36.000 And second, you don't have any point of reference to compare with. 03:36.000 --> 03:39.000 And the other question is, you might want to know who I am. 03:39.000 --> 03:41.000 So, let's go to the first. 03:41.000 --> 03:46.000 Who am I? I'm an engineer on the OpenJDK team at Red Hat. 03:46.000 --> 03:48.000 I work on GraalVM native image, 03:48.000 --> 03:51.000 and also HotSpot, the JIT compilers. 03:51.000 --> 03:55.000 And for today, the most important thing is I am the creator of the 03:55.000 --> 03:58.000 JMH extension that you're seeing in action there. 03:58.000 --> 04:01.000 What this extension allows 04:01.000 --> 04:05.000 is essentially to benchmark Java code when it's running inside 04:05.000 --> 04:07.000 a GraalVM native image. 04:07.000 --> 04:12.000 So, let's go back to our numbers. 04:12.000 --> 04:17.000 What we wanted to answer first is whether this was fast or slow. 04:17.000 --> 04:22.000 But before we get there, there's something a little bit odd going on here. 04:23.000 --> 04:30.000 We've got Latin-1 being slower than UTF-16, by quite a bit. 04:30.000 --> 04:32.000 And this is slightly surprising. 04:32.000 --> 04:35.000 If anything, these should be roughly about the same in performance. 04:35.000 --> 04:36.000 One would expect... 04:36.000 --> 04:40.000 If anything, one would expect maybe UTF-16 to be slightly slower, 04:40.000 --> 04:43.000 because it's a more complex implementation. 04:43.000 --> 04:45.000 But this is surprising. 04:45.000 --> 04:46.000 So, what do we do? 04:46.000 --> 04:48.000 Let's profile it. 04:48.000 --> 04:49.000 How can we profile it? 04:50.000 --> 04:53.000 We can profile it by hooking in a profiler. 04:53.000 --> 04:57.000 So, this profiler here, do you see? 04:57.000 --> 05:01.000 This is one that we've created specifically for this work, for 05:01.000 --> 05:02.000 GraalVM native image. 05:02.000 --> 05:05.000 What it does is wrap the JMH 05:05.000 --> 05:08.000 invocation, the native invoker here, 05:08.000 --> 05:10.000 inside a perf record invocation. 05:10.000 --> 05:13.000 And we added the call-graph parameter 05:13.000 --> 05:18.000 in order to use the DWARF debugging symbols available for 05:18.000 --> 05:22.000 native images, in order to then 05:22.000 --> 05:26.000 be able to match what 05:26.000 --> 05:30.000 the assembly is with what code we're running. 05:30.000 --> 05:32.000 The numbers here are not so important. 05:32.000 --> 05:35.000 The most important thing is that out of each benchmark, 05:35.000 --> 05:38.000 we get a perf binary output here. 05:38.000 --> 05:42.000 And then what we can do is inspect it. 05:42.000 --> 05:43.000 How can we inspect it? 05:43.000 --> 05:46.000 We can basically run perf annotate, 05:46.000 --> 05:47.000 and we're going to open it.
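(As for the benchmark being profiled: it is not spelled out above, but a hypothetical reconstruction could look like the following. The class, field, and method names are assumptions, not the actual code from the talk.)

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class CharAtBenchmark {
    // Only ISO-8859-1 characters: stored compactly, one byte per char
    String latin1 = "benchmark";
    // Contains a char above 0xFF, forcing the two-byte UTF-16 representation
    String utf16 = "benchmark\u4e16";
    int index = 3;

    @Benchmark
    public char latin1CharAt() { return latin1.charAt(index); }

    @Benchmark
    public char utf16CharAt() { return utf16.charAt(index); }
}
```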
05:47.000 --> 05:49.000 Can you see that at the bottom here? 05:49.000 --> 05:50.000 And basically we're going to go... 05:50.000 --> 05:52.000 I'm going to start with Latin-1. 05:52.000 --> 05:54.000 Now, we're going to see some assembly. 05:54.000 --> 05:56.000 I'm not going to go into a lot of depth. 05:56.000 --> 05:58.000 I've got backup slides explaining things in 05:58.000 --> 05:59.000 greater detail. 05:59.000 --> 06:03.000 I'm going to try to explain a little bit the flow of what's going on 06:03.000 --> 06:05.000 when we go here. 06:05.000 --> 06:08.000 So, what we see is that for the Latin-1, 06:08.000 --> 06:11.000 it tells us that the first method it jumps to is 06:11.000 --> 06:12.000 StringLatin1.charAt. 06:12.000 --> 06:14.000 Okay, as we said earlier, this is the method that gets 06:14.000 --> 06:17.000 called for a String.charAt. 06:17.000 --> 06:21.000 Okay, this is the implementation. 06:21.000 --> 06:23.000 We see that there's a check, and this call. 06:23.000 --> 06:25.000 This is the call that we saw earlier. 06:25.000 --> 06:27.000 It's actually not exactly the same call, 06:27.000 --> 06:29.000 but it's one that is underneath it. 06:29.000 --> 06:32.000 And that's pretty much what we need to know at this stage. 06:32.000 --> 06:34.000 We can move around a little bit. 06:34.000 --> 06:37.000 Here, what we see now, this is the actual JMH stub 06:37.000 --> 06:42.000 that is calling into our charAt benchmark. 06:42.000 --> 06:43.000 Good. 06:43.000 --> 06:49.000 And then here, what we see here is the String... 06:49.000 --> 06:52.000 No, this is... 06:52.000 --> 06:55.000 Oh, something interesting 06:55.000 --> 06:56.000 has happened here. 06:56.000 --> 06:57.000 Oh, yeah. 06:57.000 --> 06:58.000 No. 06:58.000 --> 07:01.000 This is our charAt benchmark, 07:01.000 --> 07:05.000 the charAt Latin-1, which is calling into String.charAt. 07:05.000 --> 07:08.000 And then here's what I wanted to get to. 07:08.000 --> 07:11.000 This is basically String.charAt. 07:11.000 --> 07:15.000 Basically, what it's doing, the most important thing here, 07:15.000 --> 07:19.000 is calling into StringLatin1.charAt. 07:19.000 --> 07:22.000 Okay, so we see, we followed the chain of all the calls, 07:22.000 --> 07:24.000 albeit in a slightly different way, 07:24.000 --> 07:26.000 but we see our benchmark calling String.charAt, 07:26.000 --> 07:30.000 String.charAt calling into StringLatin1.charAt, et cetera. 07:30.000 --> 07:31.000 Okay. 07:31.000 --> 07:33.000 That's how it works for Latin-1. 07:33.000 --> 07:38.000 What about UTF-16? 07:38.000 --> 07:39.000 Okay. 07:39.000 --> 07:40.000 So let's go here. 07:40.000 --> 07:41.000 Where are we? 07:41.000 --> 07:43.000 We're in String.charAt. 07:43.000 --> 07:45.000 Okay. 07:45.000 --> 07:46.000 So what do we have down here? 07:46.000 --> 07:49.000 We see a lot of nops, a lot of things. 07:49.000 --> 07:51.000 We've got the checkIndex. 07:51.000 --> 07:54.000 But there's nothing else. 07:54.000 --> 07:59.000 So what you see on the screen is that the StringUTF16 07:59.000 --> 08:05.000 charAt implementation has been inlined into String.charAt. 08:05.000 --> 08:08.000 So we can make a theory here. 08:08.000 --> 08:11.000 I can make a theory saying the reason why UTF-16 08:11.000 --> 08:14.000 was performing better than Latin-1 was because 08:14.000 --> 08:19.000 StringUTF16.charAt was inlined into String.charAt. 08:19.000 --> 08:20.000 Okay. 08:20.000 --> 08:22.000 That's the theory I have.
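For comparison, the StringUTF16.charAt that the theory says was inlined can be sketched like this (a simplified paraphrase; the real JDK code delegates to an intrinsified getChar, so the explicit byte-order handling below is an assumption):

```java
// StringUTF16.charAt, simplified sketch
public static char charAt(byte[] value, int index) {
    checkIndex(index, value.length >> 1);    // value.length is in bytes
    int hi = value[index << 1] & 0xff;       // two bytes per char
    int lo = value[(index << 1) + 1] & 0xff;
    return (char) ((hi << 8) | lo);          // big-endian layout assumed
}
```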
08:22.000 --> 08:24.000 Now, how can I prove it? 08:24.000 --> 08:26.000 Can I prove it? 08:26.000 --> 08:28.000 Let's prove it. 08:29.000 --> 08:30.000 What can we do? 08:30.000 --> 08:32.000 We're going to rebuild the native binary. 08:32.000 --> 08:35.000 We're going to rebuild it passing in a parameter called 08:35.000 --> 08:38.000 MaxNodesInTrivialMethod, set to 40. 08:38.000 --> 08:39.000 Okay. 08:39.000 --> 08:41.000 So let me explain. 08:41.000 --> 08:42.000 What are we doing 08:42.000 --> 08:43.000 with this parameter here? 08:43.000 --> 08:44.000 While that builds, 08:44.000 --> 08:47.000 I'm going to explain what's going on. 08:47.000 --> 08:52.000 The Graal compiler will inline... one of the conditions for inlining 08:52.000 --> 08:56.000 a method is when a method is considered trivial. 08:56.000 --> 08:59.000 What does it mean for a method to be trivial? 08:59.000 --> 09:04.000 It means that inside the method, the compiler graph has 20 nodes 09:04.000 --> 09:05.000 or less. 09:05.000 --> 09:08.000 So what I'm doing here is increasing that. 09:08.000 --> 09:09.000 Go and inline 09:09.000 --> 09:13.000 those methods that have got 40 nodes, instead of 20. 09:13.000 --> 09:14.000 Okay. 09:14.000 --> 09:17.000 So I'm basically giving it more budget to inline bigger methods. 09:17.000 --> 09:20.000 And we're going to see what happens when we do that. 09:20.000 --> 09:21.000 So this is native image. 09:21.000 --> 09:23.000 So it does the build. 09:23.000 --> 09:26.000 And then basically it eventually comes down to the bottom. 09:26.000 --> 09:27.000 Okay. 09:27.000 --> 09:29.000 We've got it running. 09:29.000 --> 09:30.000 Yes. 09:30.000 --> 09:31.000 Let's run it. 09:31.000 --> 09:34.000 Let's see what we see now. 09:34.000 --> 09:35.000 Let's start again. 09:35.000 --> 09:37.000 We see our benchmark is running, 09:37.000 --> 09:38.000 the trivial-method build. 09:38.000 --> 09:40.000 Well, we can see it at the top. 09:40.000 --> 09:41.000 It's running the benchmarks. 09:41.000 --> 09:43.000 We start to see numbers. 09:43.000 --> 09:47.000 We see the numbers are considerably faster than what we saw before. 09:47.000 --> 09:52.000 But the interesting thing we're going to see now is that 09:52.000 --> 09:57.000 the numbers between UTF-16 and Latin-1 are pretty much the same now. 09:57.000 --> 09:59.000 So the theory seems to have legs: 09:59.000 --> 10:03.000 the reason why things improved was because of inlining. 10:03.000 --> 10:05.000 What we can do is go further. 10:05.000 --> 10:07.000 We can look at the profiling data. 10:07.000 --> 10:11.000 Obviously, I'm not going to go through the entire steps because I've only got 20 minutes. 10:11.000 --> 10:14.000 But we're going to look at it directly here. 10:14.000 --> 10:19.000 So we're going to do perf annotate. 10:19.000 --> 10:23.000 Well, sorry, if I can type. 10:23.000 --> 10:27.000 Well, now we see String.charAt for the Latin-1. 10:27.000 --> 10:30.000 We basically go into the profiling data for Latin-1. 10:30.000 --> 10:32.000 And we see String.charAt. 10:32.000 --> 10:33.000 We see the jump. 10:33.000 --> 10:36.000 This is probably the jump for the coder check. 10:36.000 --> 10:38.000 And then we jump down to here. 10:38.000 --> 10:41.000 And then we see... the checkIndex? 10:41.000 --> 10:43.000 The checkIndex is gone. 10:43.000 --> 10:44.000 The call? 10:44.000 --> 10:47.000 We also don't see a StringLatin1 call anymore. 10:48.000 --> 10:51.000 So we can see inlining has successfully happened. 10:51.000 --> 10:54.000 And here is basically the final instruction.
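(Before looking at that final instruction: for reference, the rebuild that produced these numbers amounts to roughly the following invocation. This is a sketch: the option name follows what is said in the talk, and the jar and image paths are assumptions.)

```sh
# Rebuild with a doubled "trivial method" budget: methods whose compiler
# graph has up to 40 nodes (instead of the default 20) count as trivial
# and therefore get inlined.
native-image -H:MaxNodesInTrivialMethod=40 -jar target/benchmarks.jar target/benchmarks
```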
10:54.000 --> 10:55.000 This is the instruction: 10:55.000 --> 10:56.000 the movzbl. 10:56.000 --> 10:59.000 It's a move with a zero extend. 10:59.000 --> 11:00.000 Zero extend. 11:00.000 --> 11:02.000 Basically, it converts the byte into a char, 11:02.000 --> 11:06.000 in a way that it can then be returned. 11:06.000 --> 11:08.000 So we answered one question: 11:08.000 --> 11:11.000 why UTF-16 was faster than Latin-1. 11:11.000 --> 11:17.000 Now, I posed another question earlier, 11:17.000 --> 11:22.000 which is: those numbers we saw earlier, 11:22.000 --> 11:25.000 even the ones here, are they fast or slow? 11:25.000 --> 11:28.000 And obviously, a very simple thing we can do is see 11:28.000 --> 11:32.000 how things look with HotSpot, with the standard JDK. 11:32.000 --> 11:33.000 So let's do that. 11:33.000 --> 11:35.000 So we're going to package it. 11:35.000 --> 11:41.000 We're going to package it in JVM mode. 11:41.000 --> 11:42.000 Great. 11:42.000 --> 11:45.000 Now we're going to run it. 11:45.000 --> 11:49.000 And we're going to focus now... 11:49.000 --> 11:51.000 we're going to focus on the Latin-1. 11:51.000 --> 11:52.000 Okay? 11:52.000 --> 11:54.000 We're going to leave UTF-16 aside. 11:54.000 --> 11:59.000 And we're also going to add a profiler: 11:59.000 --> 12:00.000 perfasm. 12:01.000 --> 12:07.000 That allows us to see what the assembly looks like for this particular case. 12:07.000 --> 12:08.000 So we start running. 12:08.000 --> 12:10.000 Obviously, the VM version has changed. 12:10.000 --> 12:13.000 You can see it here. 12:13.000 --> 12:17.000 We now have an OpenJDK version, and the invoker is java. 12:17.000 --> 12:20.000 So this is a clear difference from what we were doing before. 12:20.000 --> 12:22.000 And we start to see numbers. 12:22.000 --> 12:26.000 We see we've got 1.7 nanoseconds per operation, 12:26.000 --> 12:28.000 which is faster than the AOT. 12:28.000 --> 12:29.000 Okay? 12:29.000 --> 12:30.000 That's not big news. 12:30.000 --> 12:35.000 I mean, it's something that we would all expect to happen: 12:35.000 --> 12:40.000 AOT cannot make the same optimizations as HotSpot can. 12:40.000 --> 12:41.000 Or is that it? 12:41.000 --> 12:43.000 Well, let's have a look. 12:43.000 --> 12:47.000 And I'm going to make this slightly smaller, 12:47.000 --> 12:49.000 just so that it's a bit more clear. 12:49.000 --> 12:52.000 So the layout is going to be a little bit confusing. 12:53.000 --> 12:57.000 Maybe. Hopefully not. 12:57.000 --> 12:58.000 Okay. 12:58.000 --> 13:00.000 That's big enough. 13:00.000 --> 13:04.000 Once again, I'm not going to try to go into a lot of detail. 13:04.000 --> 13:08.000 What we can see is that the first thing is that the hottest method is the JMH-generated 13:08.000 --> 13:09.000 stub. 13:09.000 --> 13:12.000 In the very last one, we saw that String.charAt was the hottest method. 13:12.000 --> 13:15.000 So obviously, the inlining that we achieved with the previous option, 13:15.000 --> 13:18.000 where we increased the budget to 40 nodes, 13:18.000 --> 13:22.000 is not as good as the inlining the JIT, the HotSpot compiler, can do. 13:22.000 --> 13:27.000 Obviously, HotSpot can see what is hot and can basically optimize the inlining. 13:27.000 --> 13:32.000 So we've got a lot more inlining happening here. 13:32.000 --> 13:36.000 And one of the things that is interesting to see as well is that, 13:36.000 --> 13:40.000 essentially, what we see is a lot of inlined assembly.
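For reference, the JVM-mode run being examined here boils down to roughly this (a sketch: the jar path and the benchmark filter are assumptions; -prof perfasm is JMH's standard profiler for annotating the hot JIT-compiled assembly):

```sh
# Package the benchmarks, then run only the Latin-1 benchmark on HotSpot
# with the perfasm profiler attached.
mvn package
java -jar target/benchmarks.jar Latin1 -prof perfasm
```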
13:40.000 --> 13:43.000 And then we get back to the bottom. 13:43.000 --> 13:47.000 And basically after this there is a loop back up, which we don't see here. 13:47.000 --> 13:51.000 Sometimes you see it, sometimes you don't, but there is a loop back up. 13:51.000 --> 13:54.000 So basically we run one iteration, we loop back up. 13:54.000 --> 13:58.000 Now, I have one more thing to show you today. 13:58.000 --> 14:04.000 AOT: we all expected that it was going to be slower than the JIT. 14:04.000 --> 14:07.000 What about GraalVM and PGO? 14:07.000 --> 14:12.000 So GraalVM PGO is a proprietary technology from Oracle, 14:12.000 --> 14:17.000 but it allows you to... basically, it's called profile-guided optimization. 14:17.000 --> 14:21.000 The idea is you build your native binary with some 14:21.000 --> 14:23.000 instrumentation. 14:23.000 --> 14:26.000 Then you run it through your training run, or your benchmark, or whatever. 14:26.000 --> 14:29.000 And then out of that, you get some profiling data. 14:29.000 --> 14:30.000 You take that profiling data, 14:30.000 --> 14:35.000 and you use it to rebuild your native image with this data. 14:35.000 --> 14:40.000 And you've basically got something akin to a JIT, 14:40.000 --> 14:43.000 but basically you've done it offline, with just some training. 14:43.000 --> 14:46.000 The question is, would that be faster than HotSpot or not? 14:46.000 --> 14:51.000 Who thinks that it is HotSpot that is going to be faster than PGO? 14:51.000 --> 14:54.000 Can you raise your hands? 14:54.000 --> 15:00.000 Two people. Who thinks PGO is going to be faster than HotSpot? 15:00.000 --> 15:03.000 Six or seven. Okay. 15:03.000 --> 15:05.000 Not a lot of participation. Let's have a look. 15:05.000 --> 15:13.000 So here, I have already prepared things ahead of this presentation. 15:13.000 --> 15:16.000 So let's run it. 15:16.000 --> 15:19.000 This is running slightly differently now. 15:19.000 --> 15:22.000 Well, the first noticeable difference is the VM version has changed. 15:22.000 --> 15:26.000 We're running with Oracle GraalVM; that's the proprietary version. 15:26.000 --> 15:29.000 The VM invoker has slightly changed. 15:29.000 --> 15:34.000 But the change that you see on the screen, about the VM invoker name, is not so relevant. 15:34.000 --> 15:39.000 What is relevant is that this VM invoker initially is a PGO-instrumented invoker. 15:39.000 --> 15:46.000 What we do behind the scenes is we inject a warm-up fork, so that it runs on the instrumented binary. 15:46.000 --> 15:55.000 Then, when that completes, we take the profiling data that comes out of the instrumentation and we rebuild the native image with that, 15:55.000 --> 15:58.000 which is what's happening right now here. 15:58.000 --> 16:07.000 When that completes, we basically execute the benchmark with the optimized native binary. 16:07.000 --> 16:11.000 And we start to see the numbers. 16:11.000 --> 16:16.000 And we see it takes about 1.4 nanoseconds per operation. 16:16.000 --> 16:21.000 Well, we focus on Latin-1; 16:21.000 --> 16:27.000 I leave aside the UTF-16, because I have only told it to run Latin-1. 16:27.000 --> 16:35.000 Now the question is why? Why is PGO faster than HotSpot? Let's have a look. 16:35.000 --> 16:43.000 So once again, I've done this a little bit ahead of time, so I don't have to repeat all the steps here.
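Those steps, for reference: the extension drives this instrument-train-rebuild cycle automatically behind the scenes; done by hand with GraalVM's PGO options, the cycle would look roughly like this (a sketch with assumed paths, using the standard --pgo-instrument and --pgo flags):

```sh
# 1. Build an instrumented image, then use the benchmark itself as training.
native-image --pgo-instrument -jar target/benchmarks.jar target/benchmarks-instr
./target/benchmarks-instr        # the run writes profiling data to default.iprof

# 2. Rebuild an optimized image from the collected profile, and re-run.
native-image --pgo=default.iprof -jar target/benchmarks.jar target/benchmarks-pgo
./target/benchmarks-pgo
```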
16:43.000 --> 16:48.000 But we can look at the profiling data from PGO. 16:48.000 --> 16:53.000 This is going to take a little bit more of an exercise. 16:53.000 --> 17:00.000 But if we go to the very top, if I can... there, we can see it here. 17:00.000 --> 17:04.000 The hottest method is the JMH-generated code. 17:04.000 --> 17:07.000 But then something really different starts to happen. 17:07.000 --> 17:10.000 It gets to here. That's where things start to repeat. 17:10.000 --> 17:15.000 And it appears one time, two times, three times, four times. 17:15.000 --> 17:21.000 So what you see here is something that PGO does that HotSpot cannot do today. 17:21.000 --> 17:26.000 PGO can unroll a loop that is not counted: 17:26.000 --> 17:32.000 a loop that checks a boolean value for whether it's done. It can unroll it. 17:32.000 --> 17:39.000 Something that, from talking to my fellow engineers 17:39.000 --> 17:44.000 on the OpenJDK team, the JIT in HotSpot cannot do yet. 17:44.000 --> 17:51.000 But still, PGO keeps most of the original structure. For example, 17:51.000 --> 17:58.000 this line here that you see there, that's essentially extracting the coder field out of a String 17:58.000 --> 18:00.000 and checking if it's Latin-1. 18:00.000 --> 18:04.000 How do I know that field 0xC of a String is the coder? 18:04.000 --> 18:09.000 Well, I can just pause to look at the String structure. 18:10.000 --> 18:14.000 And we see the coder is at field offset 12, so that's field 0xC. 18:14.000 --> 18:21.000 So this thing here is basically reading the coder, okay? 18:21.000 --> 18:28.000 And this field, number 4: that's the byte array. 18:28.000 --> 18:36.000 Then this field 4, we put it into "array", and obviously, before we get to the byte array, 18:37.000 --> 18:44.000 the coder goes into EBP, and test %bpl basically checks whether the coder is Latin-1 or not. 18:44.000 --> 18:59.000 Then "array", R28, is basically then... here, where is it going? 19:00.000 --> 19:03.000 That's the byte array, the read. 19:03.000 --> 19:13.000 And then here is where we eventually extract the char out of that. 19:13.000 --> 19:18.000 This R13 here, this one: so we extracted the array; 19:18.000 --> 19:22.000 basically, we're taking the byte array value out of the String, 19:22.000 --> 19:28.000 putting it into R13, and here, from R13, we basically extract the char at the index, the RBP index. 19:28.000 --> 19:32.000 So we see that the structure of the code is still pretty much the same. 19:32.000 --> 19:37.000 Still, the performance is slightly better. 19:37.000 --> 19:42.000 This is where we are at this stage of this investigation. 19:42.000 --> 19:46.000 Obviously, we're going to do more investigation on this to understand how the 19:47.000 --> 19:51.000 unrolling works, and whether the unrolling is the reason why the performance increased. 19:51.000 --> 19:52.000 Yep? 19:52.000 --> 19:57.000 [Audience] Still, it means that Graal has optimized the benchmark, 19:57.000 --> 20:02.000 but not the code, not the charAt itself. 20:02.000 --> 20:05.000 Well, this is the implementation of the charAt. 20:05.000 --> 20:09.000 [Audience] Yeah, but the method it optimized is the JMH-generated method. 20:09.000 --> 20:10.000 Yeah. 20:10.000 --> 20:13.000 [Audience] And it has optimized the loop in that method. 20:13.000 --> 20:18.000 So what it made faster is the benchmark harness, not the charAt. 20:18.000 --> 20:20.000 Yeah.
20:20.000 --> 20:24.000 [Audience] And that's where you can add the CompilerControl annotations, to tell it to never inline. 20:24.000 --> 20:29.000 Yeah, and that's... I mean, in the process of adding that, I wasn't starting there right away. 20:29.000 --> 20:37.000 See, the reason why JMH does what it does is because we know that what the compilers do keeps changing. 20:38.000 --> 20:39.000 Yeah. 20:39.000 --> 20:41.000 But it is something that is interesting to know. 20:41.000 --> 20:45.000 It's something that we need to understand: whether the unrolling 20:45.000 --> 20:48.000 is something that makes things faster or not. 20:48.000 --> 20:50.000 Still, it's something valuable. 20:50.000 --> 20:51.000 Lessons to learn, 20:51.000 --> 20:53.000 a few, out of that. 20:53.000 --> 20:58.000 Yeah. 20:58.000 --> 21:03.000 Obviously, things are still in progress. 21:03.000 --> 21:08.000 This is basically, as you can see, the first time we're speaking about this. 21:08.000 --> 21:12.000 So, this is still work in progress, but we're learning things. 21:12.000 --> 21:15.000 So, that's all I really had for today. 21:15.000 --> 21:19.000 These are my slides; they've got the things that I went through today 21:19.000 --> 21:23.000 at greater length, and some more details on the assembly as well. 21:23.000 --> 21:28.000 And I want to leave you with this slide. 21:28.000 --> 21:31.000 Let's just say I'm finishing with a couple of links. 21:31.000 --> 21:36.000 The first one, that's the repo where you can find the JMH extension 21:36.000 --> 21:39.000 to do what I've been doing today. 21:39.000 --> 21:42.000 We don't yet have a release of it. 21:42.000 --> 21:46.000 We're trying to figure that out; the license has been already agreed. 21:46.000 --> 21:50.000 We're using the same license as JMH, because obviously it heavily relies on JMH, 21:50.000 --> 21:53.000 which is GPLv2 with the Classpath exception. 21:53.000 --> 21:56.000 But we haven't done a Maven release, for example. 21:56.000 --> 21:58.000 But you can still check it out. 21:58.000 --> 22:01.000 You can build it, and there are instructions on how to use it. 22:01.000 --> 22:04.000 So, that's on this link. 22:04.000 --> 22:06.000 The other link is basically what I've done today, 22:06.000 --> 22:09.000 so you can go through it in your own time. 22:09.000 --> 22:12.000 And with that, that's all I wanted to say. 22:12.000 --> 22:13.000 Questions, yeah? 22:13.000 --> 22:16.000 [Audience] I saw how you launched it, with Maven. 22:16.000 --> 22:18.000 But still, it's a Java Maven task. 22:18.000 --> 22:21.000 So is this the magic of the JMH extension, or...? 22:21.000 --> 22:23.000 Well, the thing is... 22:23.000 --> 22:29.000 The thing is, you need to understand 22:29.000 --> 22:32.000 how JMH works to understand what's going on. 22:32.000 --> 22:35.000 JMH is not a single Java process. 22:35.000 --> 22:38.000 Normally, JMH has got a Java process, 22:38.000 --> 22:41.000 but then it launches the benchmarks in a fork. 22:41.000 --> 22:45.000 There's no reason for that forked benchmark process to be Java. 22:45.000 --> 22:49.000 What we've done is: what we are launching is native. 22:49.000 --> 22:52.000 What I mean is that you get the same experience 22:52.000 --> 22:55.000 as you get normally with the Java version of JMH, 22:55.000 --> 22:57.000 which is like launching a jar. 22:57.000 --> 23:00.000 But underneath, instead of launching a Java process, 23:00.000 --> 23:02.000 what I'm launching is a native process.
23:02.000 --> 23:04.000 [Audience] And that's the Maven setup in your JMH project? 23:04.000 --> 23:05.000 Yeah, yeah. 23:05.000 --> 23:07.000 It's part of the pom.xml. 23:07.000 --> 23:11.000 Basically, I made sure that the same thing just works. 23:11.000 --> 23:13.000 Yeah. 23:13.000 --> 23:15.000 More questions? 23:15.000 --> 23:18.000 Thank you. 23:18.000 --> 23:19.000 Thanks for attending.