WEBVTT 00:00.000 --> 00:18.000 So our first talk today is going to be Ben Sully. He's going to be taking us through the augurs time series toolkit for Rust. Take it away. 00:18.000 --> 00:21.000 Thanks very much. 00:21.000 --> 00:28.000 Thank you. Welcome to the Rust room, exciting. 00:28.000 --> 00:33.000 I was a bit apprehensive before, and now here I am, first on stage. 00:33.000 --> 00:40.000 So thanks for coming. I'm going to be talking about augurs, which is a time series toolkit for Rust. 00:40.000 --> 00:46.000 It's focused on time series analysis, and it also has Python and JavaScript bindings. 00:46.000 --> 00:54.000 The last of which is already being used in Grafana's frontend, which is pretty cool. I'll talk a little bit about that in a bit. 00:54.000 --> 00:59.000 So who am I? 00:59.000 --> 01:02.000 Oh, sorry. Is it too loud? 01:02.000 --> 01:11.000 Okay. Any better? 01:11.000 --> 01:15.000 Even higher. 01:15.000 --> 01:17.000 This is as much as we can do. 01:17.000 --> 01:19.000 I'll try and speak louder. 01:19.000 --> 01:21.000 So who am I, first of all? 01:21.000 --> 01:24.000 I'm a software engineer at Grafana Labs, based in the UK. 01:24.000 --> 01:31.000 In my spare time I like to do bouldering and running ultramarathons, which is a normal thing to do. 01:31.000 --> 01:34.000 I've been at Grafana Labs for about four years. 01:34.000 --> 01:39.000 I tend to use Rust for personal projects and hackathons as much as I can. 01:39.000 --> 01:44.000 But recently it's been getting squeezed into various different bits of Grafana Labs, which is exciting. 01:44.000 --> 01:49.000 My background is kind of in statistics and machine learning. 01:49.000 --> 01:54.000 Oh. 01:54.000 --> 01:57.000 What do you think we need to do? 01:57.000 --> 02:00.000 I can lower it and you're just going to have to project. 02:00.000 --> 02:03.000 Okay. 02:03.000 --> 02:05.000 Yeah, we're going to lower it. 02:05.000 --> 02:08.000 We still need the microphone for the audio of the video. 02:08.000 --> 02:10.000 But he's going to try to project. 02:10.000 --> 02:11.000 I'll do my best. 02:11.000 --> 02:13.000 Is that any better? Can you hear me? 02:13.000 --> 02:16.000 Somewhat, slightly? 02:16.000 --> 02:22.000 I have a backup of this one. Hey, that one's working. 02:22.000 --> 02:25.000 Okay. 02:25.000 --> 02:29.000 There's a lot of echo. 02:29.000 --> 02:33.000 Okay. Well, I'll try and speak as loud as possible. 02:33.000 --> 02:38.000 So yeah, this is a quick summary of the talk. Hopefully the technical issues will be fine. 02:38.000 --> 02:41.000 First I'm going to talk about what augurs is and what it can do. 02:41.000 --> 02:49.000 And then the second part is going to be lessons that I've learned while translating various different ML algorithms to Rust from different languages. 02:49.000 --> 02:55.000 Some of them in languages like C++ or Fortran or Python; they could have been written 30 or 40 years ago. 02:55.000 --> 03:02.000 So there's lots of work involved in even finding decent source implementations. 03:02.000 --> 03:04.000 Hopefully that leaves me time for questions. 03:04.000 --> 03:08.000 And then there's a bonus section; this was meant to be a main part of the talk, but I ran out of time.
03:08.000 --> 03:12.000 If you want to check it out, feel free to download the slides and have a look. 03:12.000 --> 03:21.000 There's some content there on things I ran into and trade-offs that had to be made, exposing a JavaScript interface using WebAssembly. 03:22.000 --> 03:25.000 So first of all, what's the deal with the name? 03:25.000 --> 03:30.000 To augur is a verb, and an augur is a noun, meaning to predict. 03:30.000 --> 03:36.000 So I think this might be one of the few times in my life I've actually named something well, because it means the right thing. 03:36.000 --> 03:40.000 But I don't know, hopefully I'll get lucky when I have a kid. I'm not sure. 03:40.000 --> 03:46.000 Maybe I'll name them using the same method I used here, which was domain-name-driven development. 03:46.000 --> 03:55.000 Basically, look through a thesaurus for a word that ends in R, hope that the .rs domain is available, and then write the project later. 03:55.000 --> 04:00.000 So, to quickly summarize what a time series is. Probably most people know this, but it's pretty straightforward. 04:00.000 --> 04:04.000 It's a measurement taken repeatedly, generally at the same interval. 04:04.000 --> 04:08.000 Usually it's something numeric, either a counter or a floating point number. 04:08.000 --> 04:12.000 So computers are really good at working with them, because they're just numbers. 04:12.000 --> 04:21.000 There are fun optimizations you can do, both with compression and storage, whether that's in memory or on disk. 04:21.000 --> 04:27.000 And also processing them is fun; you've got lots of optimizations you can do. 04:27.000 --> 04:32.000 In the real world, there are some examples of where you can see them. They're kind of all over the place. 04:32.000 --> 04:35.000 Your Fitbit: here's my heart rate plotted over a day. 04:35.000 --> 04:39.000 Or environmental sensors. You'll see them everywhere. 04:39.000 --> 04:44.000 So, the sorts of things we do with them. We visualize them, right? 04:44.000 --> 04:54.000 I mean, Grafana is definitely used for visualizing things; whether on dashboards or in the Explore view, you can see lots of time series there. 04:54.000 --> 05:02.000 We like to set thresholds for them and get alerted when things exceed certain boundaries. 05:02.000 --> 05:05.000 For example, disk usage going above 90%. 05:05.000 --> 05:12.000 Or maybe you want your service to auto-scale if requests per pod go too high, something like that. 05:12.000 --> 05:22.000 And you can do that with a static threshold, or using something a little bit more advanced, like anomaly detection or outlier detection. 05:22.000 --> 05:25.000 And this is kind of where augurs comes in. 05:25.000 --> 05:31.000 So augurs is, as I mentioned, a time series toolkit for Rust. Right? 05:31.000 --> 05:34.000 It's designed to help you with all of these previous tasks.
05:34.000 --> 05:39.000 More specifically, as someone on Reddit pointed out quite early on, it's a time series analysis toolkit. 05:39.000 --> 05:44.000 We don't do all of the things you might imagine a time series library doing. 05:44.000 --> 05:48.000 We don't do things like resampling yet, or storage. 05:48.000 --> 05:52.000 Instead, we implement various different machine learning algorithms. 05:52.000 --> 05:58.000 So forecasting is the most obvious one. That's where you want to predict the future, or you even want to predict now. 05:58.000 --> 06:04.000 And you often want to do that with confidence intervals or prediction intervals, so that you know how accurate your predictions are. 06:04.000 --> 06:14.000 Clustering: you want to group lots of series together. You have hundreds of series and you want to find the groups within those, the series that are behaving similarly. 06:14.000 --> 06:17.000 Outlier detection is a bit of a confusing one. 06:17.000 --> 06:27.000 But we refer to this as when you have lots of different series, you expect them all to behave the same, and you want to identify the ones that aren't behaving similarly to the group. 06:27.000 --> 06:35.000 And change point detection is where you're looking across time, and you want to see where the behavior of your time series changes. 06:35.000 --> 06:43.000 Whether that's changes in magnitude or changes in variance, there are various different properties you can detect there. 06:43.000 --> 06:46.000 Just a little example of each of those. 06:46.000 --> 06:49.000 Forecasting is the most obvious one. You just want a prediction, right? 06:49.000 --> 06:57.000 You maybe want to account for things like seasonality: you have daily seasonality or weekly seasonality, or you depend on things happening on weekends. 06:57.000 --> 07:03.000 And you might want to forecast, say, usage or capacity. So that's, yeah, very straightforward. 07:03.000 --> 07:05.000 In augurs, we have three algorithms for this. 07:05.000 --> 07:15.000 Very roughly, from left to right, they go from simpler to more complex, and from faster to slower. Extremely reductive. 07:15.000 --> 07:19.000 And as you get more advanced, they can support things like holidays. 07:19.000 --> 07:31.000 Maybe you want to model COVID separately to the rest of the normal world, or things like Christmas, if you wanted to model that in your time series, 07:31.000 --> 07:34.000 and account for that in your predictions. 07:35.000 --> 07:39.000 Onto outlier detection: this is used to identify when one or more series is behaving differently. 07:39.000 --> 07:48.000 So this example here, you can't see it very clearly sadly, but it's my Pi-hole at home blocking ads. 07:48.000 --> 07:51.000 And you'd kind of expect most of the domains to remain the same. 07:51.000 --> 07:58.000 But actually, whatever it is, beacons.gvt2.com is absolutely miles and away higher than anything else. 07:58.000 --> 08:00.000 It's a nightmare.
08:01.000 --> 08:04.000 You can also imagine pods in a Kubernetes deployment. 08:04.000 --> 08:08.000 You expect them to have the same CPU usage, if they're being load balanced correctly. 08:08.000 --> 08:11.000 And you want to flag it in case that isn't the case. 08:11.000 --> 08:15.000 There are two algorithms that we use. Median absolute deviation: 08:15.000 --> 08:20.000 this is for when series are roughly expected to be constant, and the same. 08:20.000 --> 08:25.000 And it's really simple: you just flag whenever the median of each series is different from the group. 08:26.000 --> 08:31.000 And DBSCAN, which you can use when series have more complex patterns, like seasonality. 08:31.000 --> 08:35.000 But you still expect them to move similarly, in the same way. 08:35.000 --> 08:43.000 It's a more complex algorithm, a little bit slower, but it can handle these more complicated cases. 08:43.000 --> 08:51.000 The reason you might use median absolute deviation, I guess, is because it's more intuitive and easy to explain. 08:52.000 --> 08:53.000 Clustering. 08:53.000 --> 08:56.000 You might use clustering if you want to identify groups of series. 08:56.000 --> 09:03.000 So here in the screenshot, we've got one big band across the top and then clearly two separate bands a little bit lower down. 09:03.000 --> 09:06.000 This is useful if you want to group these together. 09:06.000 --> 09:12.000 So in this case, we would flag those little groups, which is useful in quite a lot of instances. 09:12.000 --> 09:14.000 A little bit about the way that works. 09:14.000 --> 09:19.000 It's quite a fun algorithm, and it's probably the coolest name of any of the algorithms I've seen: 09:19.000 --> 09:22.000 dynamic time warping. 09:22.000 --> 09:25.000 Why is it called that? I don't know, but it's awesome. 09:25.000 --> 09:34.000 The GIF kind of gives it away here, but you're calculating distances between each of your series, each pair of series in your data set. 09:34.000 --> 09:40.000 The naive way you would do it, like Euclidean distance, is to compare the values at the same timestamp. 09:40.000 --> 09:49.000 With dynamic time warping, instead you're optimizing to find pairs of values that minimize the overall distance, to account for shifts in time between those series. 09:49.000 --> 09:55.000 Naively, that would be really, really slow, because you've got to do N squared comparisons, as the top right shows. 09:55.000 --> 10:02.000 But there are lots of little optimizations you can do to speed that up, by limiting the window that you allow the series to differ by. 10:02.000 --> 10:04.000 So that's quite fun to implement. 10:04.000 --> 10:09.000 And there are lots of options you can tweak in augurs to do that. 10:09.000 --> 10:18.000 So the way that works: you use dynamic time warping to find distances between each pair of time series. 10:18.000 --> 10:23.000 And after that, you feed that into an algorithm called DBSCAN. 10:23.000 --> 10:30.000 Here's a distance matrix. In this case, we've got one series at the top, and then you can see some of the distances are a little bit higher. 10:30.000 --> 10:36.000 And then by feeding that into DBSCAN, you just get a simple marker of whether a series is in one cluster or another. 10:36.000 --> 10:44.000 So the API is pretty simple and easy to use, and it clusters them into groups. A rough sketch of the idea is below.
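As a concrete illustration of the idea above, here's a minimal sketch of windowed dynamic time warping. This is not the augurs API, just the textbook recurrence with a window constraint (a Sakoe-Chiba-style band); the resulting pairwise distances are what you'd assemble into the distance matrix that gets fed to DBSCAN.

```rust
/// A minimal sketch of dynamic time warping with a window constraint.
/// Not the augurs API; just the standard recurrence, for illustration.
fn dtw_distance(a: &[f64], b: &[f64], window: usize) -> f64 {
    let (n, m) = (a.len(), b.len());
    // cost[i][j] = best cumulative distance aligning a[..i] with b[..j].
    let mut cost = vec![vec![f64::INFINITY; m + 1]; n + 1];
    cost[0][0] = 0.0;
    for i in 1..=n {
        // Only consider j within `window` of i: this is the optimization
        // that avoids the full N-squared comparison.
        let lo = i.saturating_sub(window).max(1);
        let hi = (i + window).min(m);
        for j in lo..=hi {
            let d = (a[i - 1] - b[j - 1]).abs();
            let best = cost[i - 1][j].min(cost[i][j - 1]).min(cost[i - 1][j - 1]);
            cost[i][j] = d + best;
        }
    }
    cost[n][m]
}

fn main() {
    // The second series is the first one shifted in time; DTW still
    // finds a small distance where a point-by-point comparison would not.
    let a = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0];
    let b = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0];
    println!("dtw = {}", dtw_distance(&a, &b, 3));
}
```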
10:44.000 --> 10:46.000 Now, some stuff that augurs doesn't do. 10:46.000 --> 10:50.000 I thought it would be useful to know when you shouldn't use something. 10:50.000 --> 10:52.000 Some of these might be obvious, but maybe not. 10:52.000 --> 10:57.000 So we don't do plotting; plotting isn't really time series specific, necessarily. 10:57.000 --> 11:05.000 There are libraries in Rust, so plotters is really good, and you can use uPlot from JavaScript, or there are loads of options like matplotlib in Python. 11:05.000 --> 11:09.000 We don't do time series data structures. 11:09.000 --> 11:16.000 I kind of touched on that earlier, but we tend to work with just simple Vecs of floats; things are quite straightforward. 11:16.000 --> 11:22.000 And I don't know if there is a crate for that right now, so there's scope to add it to augurs or to other libraries later. 11:22.000 --> 11:26.000 And storage and compression are not really our concern either. 11:26.000 --> 11:29.000 It's not a database, and it's not really designed for database usage. 11:29.000 --> 11:37.000 So you wouldn't use it for storing these things; instead you'd use a time series database like Prometheus or InfluxDB, something like that. 11:37.000 --> 11:41.000 Why should you use it? In my opinion: 11:41.000 --> 11:45.000 first of all, you need to have some time series. That's fairly obvious. 11:45.000 --> 11:50.000 But also, the code is clean and it's relatively fresh. 11:50.000 --> 11:58.000 I mean, it's been idiomatically converted from the original algorithms and languages, which makes it nice and easy to contribute to. 11:58.000 --> 12:03.000 It's Rust only, so it's nice and portable; you can use it wherever you want to compile it for. 12:03.000 --> 12:06.000 And we have JavaScript and Python bindings. 12:06.000 --> 12:09.000 And as you might expect, it's relatively quick. 12:09.000 --> 12:16.000 I haven't done extensive benchmarking against other languages, but I've made sure to profile things, to make sure it's not crazily slow. 12:16.000 --> 12:20.000 So it is generally faster than other implementations. 12:20.000 --> 12:26.000 Even compared to NumPy, which you would expect to be fast, to have things very highly optimized. 12:27.000 --> 12:35.000 I think largely because of the extra control you get in Rust, and lazy operations like iterators compiling down into very optimized code, 12:35.000 --> 12:39.000 you get fast implementations by default. 12:39.000 --> 12:43.000 Okay, so that was a very high level overview of augurs. 12:43.000 --> 12:45.000 There's a lot I didn't have time to go through. 12:45.000 --> 12:52.000 Please do check out the docs and the demo and the code, and I'll touch on those later.
12:53.000 --> 13:03.000 So next up, section two: I want to talk about the process of actually converting an ML algorithm from another language to Rust. 13:03.000 --> 13:08.000 And there are a lot of lessons that came from that which I'd like to pass on. 13:12.000 --> 13:16.000 So this is the kind of process that you might expect to go through, 13:16.000 --> 13:21.000 if you're converting an algorithm that you've found in a different language. 13:22.000 --> 13:24.000 It looks fairly sensible. It's quite nice, right? 13:24.000 --> 13:28.000 It's probably what the Working Effectively with Legacy Code book says. I'm not sure. 13:28.000 --> 13:34.000 However, things don't usually go to plan, and especially here. 13:34.000 --> 13:38.000 So when you're looking for source implementations, there'll be many. 13:38.000 --> 13:42.000 Especially if it's a popular algorithm, you'll find several in different languages. 13:42.000 --> 13:47.000 Each of them will have different trade-offs, and you have to choose which one to go for. 13:48.000 --> 13:54.000 You'll be very lucky if you find tests for these things. That would be lovely, wouldn't it? But they don't really exist. 13:54.000 --> 13:59.000 People will tend to maybe write examples and blog posts, but you don't tend to get tests. 13:59.000 --> 14:04.000 These algorithms are often written by researchers or scientists who have better things to do. 14:06.000 --> 14:11.000 The implementations that you've found will all disagree, which you won't realize until a little bit later. 14:11.000 --> 14:13.000 That'll be fun to figure out as well. 14:14.000 --> 14:17.000 As you're translating, you might think: this is awful. 14:17.000 --> 14:23.000 What are they doing all of this manual indexing for? We should definitely rewrite that a little bit better. 14:24.000 --> 14:29.000 And as you're refactoring, you will think: yeah, we should do this much sooner. Let's shift this earlier in the process. 14:29.000 --> 14:36.000 And then you'll end up with this kind of half-translated, half-refactored Frankenstein's monster of a code base. 14:37.000 --> 14:44.000 And who doesn't love optimizing that? That will come way sooner too. You'll definitely start that when you're 20% through. There's no way you're waiting till the end. 14:44.000 --> 14:49.000 So the reality looks a little bit more like this. 14:49.000 --> 14:53.000 You find a lot of implementations. You'll be swapping back and forth between them. 14:53.000 --> 14:57.000 You'll translate some functions line by line. Others will be done differently. 14:57.000 --> 15:02.000 It will be a hot mess, I'm going to be honest. Get familiar with the debugger. 15:02.000 --> 15:05.000 But fortunately, there are some things that you can do to improve this process. 15:05.000 --> 15:10.000 So I'm going to go through and give you a little bit of advice, in case this is something you plan on doing.
15:11.000 --> 15:14.000 So first of all, finding the source implementations. 15:14.000 --> 15:16.000 Nothing's going to be perfect. 15:16.000 --> 15:21.000 The whole point of you doing this is that it isn't perfect: it's not in Rust, right? So there's no way it's perfect. 15:21.000 --> 15:27.000 But also, it's going to have problems; there are just going to be slow things in there. 15:27.000 --> 15:30.000 You're going to have to accept that and make it better as you go. 15:30.000 --> 15:35.000 Try and find things with published papers. It feels obvious once you see it. 15:35.000 --> 15:45.000 It's really useful to have a paper or a book or something concrete that you can then refer to for motivations, extra comments, that kind of thing. 15:45.000 --> 15:49.000 Ideally, you want something that uses as few languages as possible. 15:49.000 --> 15:55.000 I mean, even NumPy drops down to C, so you're probably going to be struggling. 15:55.000 --> 16:03.000 For example, the ETS implementation that I translated, which we use in augurs for forecasting, is written in R. 16:03.000 --> 16:09.000 But it calls some of R's built-in Fortran functions, so you have to go and figure out what they're doing. 16:09.000 --> 16:15.000 And then it also has a manual C++ implementation of an optimizer, so we had to go and figure out what that was doing too. 16:15.000 --> 16:17.000 It's not recommended. 16:17.000 --> 16:23.000 Ideally, you'll find something on crates.io. That would be the dream, and often that is the case. 16:23.000 --> 16:28.000 You need it to be open source. Ideally, it would have tests, but as I said, that's unlikely. 16:28.000 --> 16:34.000 And if you have a responsive author or a recent commit, that's great, because you can talk to them and ask questions. 16:34.000 --> 16:37.000 So, where do you get tests? 16:37.000 --> 16:41.000 As I mentioned, you're not going to find much, but take what you can get. 16:41.000 --> 16:53.000 If you see any reproducible examples, whether in papers, blog posts, or books, you can turn those into examples, integration tests, or snapshot tests in your Rust crate. 16:53.000 --> 16:59.000 This might be controversial, but you're probably going to want to test some of the implementation details too. 16:59.000 --> 17:07.000 As you go, there are a lot of low-level things that you're not going to be confident in, so write tests for them. 17:07.000 --> 17:11.000 If you have to throw them away later, that's fine. That's not a big deal. 17:11.000 --> 17:20.000 You're also going to want, and this is a bit facetious, a big screen, because you're going to want those two implementations side by side, to be able to step through each line by line. 17:20.000 --> 17:25.000 So yeah, Rust, as I mentioned, has a test framework that makes this all very nice; something like the sketch below.
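To make that concrete, here's a minimal sketch of turning a worked example from a paper or blog post into a Rust integration test. The function name `fit_and_forecast` and the numbers are hypothetical stand-ins for whatever your crate and source material actually provide.

```rust
// tests/paper_example.rs — a sketch of an integration test built from a
// reproducible example. Everything named here is a hypothetical stand-in.

fn fit_and_forecast(data: &[f64], horizon: usize) -> Vec<f64> {
    // Stand-in for the real model; imagine this is your crate's API.
    let last = *data.last().unwrap();
    vec![last; horizon]
}

#[test]
fn reproduces_published_example() {
    // Input series copied from the worked example in the source material.
    let data = [10.0, 12.0, 13.0, 12.0, 15.0];
    let forecast = fit_and_forecast(&data, 3);

    // Expected values copied from the paper. Compare with a tolerance:
    // floating point results rarely match another language exactly.
    let expected = [15.0, 15.0, 15.0];
    for (got, want) in forecast.iter().zip(expected.iter()) {
        assert!((got - want).abs() < 1e-6, "got {got}, want {want}");
    }
}
```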
17:25.000 --> 17:30.000 I tend to make everything that I find as an example into an integration test and a benchmark. 17:30.000 --> 17:35.000 And exercise your public API, to make sure it makes sense once it's been converted. 17:35.000 --> 17:43.000 Assertions are great. Make sure you're using debug asserts. They're helpful. 17:43.000 --> 17:53.000 So when you get to the translating bit, my advice would be to start with iterators and slices, and just the standard library, and work with those as much as you can. 17:53.000 --> 18:02.000 A low dependency count is like the new hot thing, so if you can get by without adding any of these kind of array dependencies, that'll be great. 18:02.000 --> 18:11.000 And also, if you learn the adapters and work with the functional style, there's often enough in there to reproduce even really obscure NumPy functions. 18:11.000 --> 18:13.000 I'll show you an example in a minute. 18:13.000 --> 18:18.000 But the other advantage is that this gets really heavily optimized by rustc. 18:18.000 --> 18:23.000 So you end up with really fast code without even realizing that you've done it. 18:23.000 --> 18:28.000 And you might even get things like SIMD for free. It's fantastic. 18:28.000 --> 18:34.000 As an example, as much as we can see it here: the left example is the NumPy code. 18:34.000 --> 18:38.000 And the right is iterator adapters in Rust. 18:38.000 --> 18:43.000 This is exactly the same algorithm, and it's literally taken out of the two code bases. 18:43.000 --> 18:49.000 There are exactly two allocations in the Rust version. God knows what's going on in the left side. 18:49.000 --> 18:55.000 Performance-wise, there's a lot of manual indexing, and loads of copying going on there. 18:55.000 --> 19:07.000 So you get a lot of power just by using these standard iterator adapters as much as possible; a sketch in the same spirit follows. There's another example of that in the optimization section.
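For illustration, here's a sketch in the same spirit as that slide, not the actual augurs code: a median absolute deviation computed with standard iterator adapters, with exactly two allocations.

```rust
// An illustrative sketch (not the augurs source): median absolute
// deviation via iterator adapters. Exactly two allocations: one sorted
// copy of the input, and one Vec of absolute deviations.

fn median(sorted: &[f64]) -> f64 {
    let n = sorted.len();
    if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
}

fn median_abs_deviation(data: &[f64]) -> f64 {
    // Allocation 1: a sorted copy, since we must not mutate the input.
    let mut sorted = data.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let med = median(&sorted);

    // Allocation 2: absolute deviations from the median, via adapters.
    let mut devs: Vec<f64> = data.iter().map(|x| (x - med).abs()).collect();
    devs.sort_by(|a, b| a.partial_cmp(b).unwrap());
    median(&devs)
}

fn main() {
    let x = [1.0, 1.0, 2.0, 2.0, 4.0, 6.0, 9.0];
    println!("mad = {}", median_abs_deviation(&x)); // prints 1
}
```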
19:07.000 --> 19:19.000 However, if you end up working with 2D arrays and matrices, anything that has more dimensions, things get problematic. 19:19.000 --> 19:23.000 Vecs of Vecs are inefficient. They're not great for various reasons. 19:23.000 --> 19:26.000 You have to do two pointer loads to get to the actual value. 19:26.000 --> 19:31.000 And they really don't play well with Rust's auto-deref. 19:31.000 --> 19:38.000 So especially if you're using them as arguments to your public functions, it's not ideal. 19:38.000 --> 19:42.000 So in this example we have a function that takes a 1D array. 19:42.000 --> 19:45.000 It takes a slice of floats. That's great. 19:45.000 --> 19:50.000 You can pass a reference to a vector to it, it will get auto-deref'd, and everything works. 19:50.000 --> 19:55.000 As soon as we have the 2D function, we want to take a slice of slices, because we don't need to own the values. 19:55.000 --> 20:00.000 But now we can't pass a Vec of Vecs. If you try to pass that, everything blows up, 20:00.000 --> 20:04.000 because the inner Vec can't be deref'd. 20:04.000 --> 20:06.000 And ndarray has got you covered here. 20:06.000 --> 20:14.000 It has really efficient representations of 2D and multi-dimensional arrays, so it's really, really useful. 20:14.000 --> 20:17.000 It basically has a lot of the functionality that NumPy does. 20:17.000 --> 20:25.000 And there's even an "ndarray for NumPy users" doc that you can use there. 20:25.000 --> 20:30.000 So yeah, the function works. It's great. There's a sketch of this just below.
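Here's a minimal sketch of that problem and the ndarray fix; the function names are made up for illustration.

```rust
// A sketch of the 1D vs 2D argument problem, assuming the ndarray crate.
use ndarray::{array, ArrayView2};

// 1D case: a slice works fine; &Vec<f64> auto-derefs to &[f64].
fn mean_1d(data: &[f64]) -> f64 {
    data.iter().sum::<f64>() / data.len() as f64
}

// 2D case: a &[&[f64]] parameter would NOT accept &Vec<Vec<f64>>,
// because the inner Vec<f64> can't be deref'd element by element.
// An ndarray view sidesteps this, and is one contiguous allocation.
fn mean_2d(data: ArrayView2<f64>) -> f64 {
    data.sum() / data.len() as f64
}

fn main() {
    let v = vec![1.0, 2.0, 3.0];
    println!("{}", mean_1d(&v)); // auto-deref: just works

    let m = array![[1.0, 2.0], [3.0, 4.0]];
    println!("{}", mean_2d(m.view()));
}
```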
20:30.000 --> 20:35.000 So what if you run into a huge dependency? 20:35.000 --> 20:39.000 And actually, as I was thinking about this on the train, it doesn't really just apply to ML. 20:39.000 --> 20:47.000 If you're rewriting a code base in a different language, and it has a huge dependency, what can you do about that? 20:47.000 --> 20:50.000 You don't really have time to rewrite it; that's too much to expect. 20:50.000 --> 20:55.000 So this happened in augurs when we were translating the Prophet algorithm. 20:55.000 --> 21:06.000 Prophet basically does a bunch of data manipulation, and then hands everything off to a library called Stan, which is a framework for doing Bayesian analysis, basically. 21:07.000 --> 21:09.000 Rewriting Stan would be impossible. 21:09.000 --> 21:15.000 It's a best-in-class framework, written by experts in the field. 21:15.000 --> 21:20.000 It's a huge amount of C++, with something like 17,000 commits. 21:20.000 --> 21:27.000 And the way that it's called is that you compile a binary that represents your model, and pass data to it in files. 21:27.000 --> 21:33.000 So that's not going to be rewritten. That's going to be too hard. 21:33.000 --> 21:38.000 Instead, we thought, well, how can we still use this? 21:38.000 --> 21:43.000 And we want to use it in the browser as well. How are we going to be able to do that? 21:43.000 --> 21:47.000 We thought, maybe we could compile Stan to WebAssembly, right? 21:47.000 --> 21:54.000 That's a bit of a wild idea. Will some really old, ten-year-old C++ code base compile to WebAssembly? 21:54.000 --> 22:00.000 Surely there are dependencies there, like system calls and various different APIs, that WebAssembly doesn't support. 22:00.000 --> 22:08.000 And also, how are you going to pass the data? WebAssembly only has numbers, and we need to pass much more complicated things to it. 22:08.000 --> 22:15.000 Fortunately, the WebAssembly System Interface (WASI) now exists, and the component model. 22:15.000 --> 22:21.000 They're a bit nascent, but these things handle exactly this. Exactly these use cases. 22:21.000 --> 22:30.000 I'm not going to go into too much detail because I'm short on time, but basically the model is: you write an IDL that represents the kind of work that you want to do. 22:30.000 --> 22:36.000 And you do that in a language called WIT, which is broadly like protobuf: a kind of standard IDL. 22:36.000 --> 22:38.000 There's an example in a second. 22:38.000 --> 22:44.000 And then you write a tiny bit, I promise it's not much, a tiny bit of C++ to implement that, 22:44.000 --> 22:48.000 using the Stan libraries and calling into Stan as needed. 22:48.000 --> 22:53.000 And then you can compile that to a WebAssembly component, which is basically a WebAssembly module: 22:53.000 --> 23:00.000 self-contained, portable, and it can be run using, in theory, any WebAssembly runtime. 23:00.000 --> 23:04.000 So the IDL looks a little bit like this. 23:04.000 --> 23:06.000 Sorry for the dark mode. 23:06.000 --> 23:09.000 You have records and structs and all of these things. 23:09.000 --> 23:17.000 The component model's tooling has a binding generator to convert that into idiomatic code for your language. 23:17.000 --> 23:18.000 It doesn't have to be Rust. 23:18.000 --> 23:25.000 So this will get translated into idiomatic Rust, idiomatic Go, or TypeScript, anything you need. 23:26.000 --> 23:30.000 It also automatically handles conversion of those types. 23:33.000 --> 23:37.000 So if we've got that, we then need to use it from Rust. 23:37.000 --> 23:46.000 So we're using a variant of bindgen that comes along with the component model, which basically takes the path to your IDL file, 23:46.000 --> 23:52.000 and generates a bunch of structs and traits and everything for your code. 23:52.000 --> 24:01.000 And you can then call that from inside Rust as if it were a struct, because it is a struct, and it has functions and methods. 24:01.000 --> 24:05.000 And you can pass everything in there as if it were something you'd written yourself. 24:05.000 --> 24:09.000 In this example, we're embedding the WebAssembly that we just compiled. 24:09.000 --> 24:13.000 So that's in the binary; there are no runtime dependencies whatsoever. 24:13.000 --> 24:20.000 Setting up the WebAssembly runtime is just a bunch of machinery, really. 24:20.000 --> 24:25.000 And then we can call the optimize function just like it was a regular function with regular structs. 24:25.000 --> 24:29.000 And it runs inside WebAssembly at near-native speed. 24:29.000 --> 24:37.000 We don't have any build-time dependencies on any C compiler or any C++ compiler, and everything is purely embedded in the binary. 24:37.000 --> 24:42.000 So that means we can use it in WebAssembly if we need to. A rough sketch of the host side is below.
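As a rough sketch of what that host-side code can look like: the WIT world `optimizer`, its `optimize` function, and the file paths here are all hypothetical, and the exact `bindgen!` syntax and generated names depend on the wasmtime version you're using.

```rust
// A sketch only: assumes a hypothetical WIT world called `optimizer`
// exporting an `optimize` function, and wasmtime's component support.
use wasmtime::component::{bindgen, Component, Linker};
use wasmtime::{Engine, Store};

// Generates host-side types (here, an `Optimizer` struct) from the WIT.
bindgen!({
    world: "optimizer",
    path: "wit",
});

fn main() -> wasmtime::Result<()> {
    let engine = Engine::default();
    // Embed the compiled component in our own binary: no runtime deps.
    let component = Component::new(&engine, include_bytes!("optimizer.wasm"))?;
    let linker = Linker::new(&engine);
    let mut store = Store::new(&engine, ());

    // The generated bindings give us a plain struct with methods.
    let optimizer = Optimizer::instantiate(&mut store, &component, &linker)?;
    let result = optimizer.call_optimize(&mut store, &[1.0, 2.0, 3.0])?;
    println!("{result:?}");
    Ok(())
}
```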
24:42.000 --> 24:48.000 A few caveats: like I said, it is new, and there's a lot of playing around involved. 24:48.000 --> 24:54.000 This is some of the repos in the Bytecode Alliance org. Huge respect to the Bytecode Alliance: 24:54.000 --> 24:58.000 there's just so much going on there, I don't know how they produce so many things. 24:58.000 --> 25:03.000 It's early days, and you have to do a lot of hacking around, but it's fun. 25:03.000 --> 25:06.000 There are a lot of tools to learn about, that kind of thing. 25:06.000 --> 25:17.000 Another downside is that WebAssembly doesn't support certain features, like exceptions, and we just have to hard abort in those cases, which is not the end of the world, but it's not necessarily that pretty. 25:18.000 --> 25:25.000 And at the minute only wasmtime supports this, but that's fine: wasmtime is written in Rust, so we can easily just embed that. 25:28.000 --> 25:37.000 Okay, so we're moving on to refactoring now, and I've only really got a little bit of advice here, because I had to cut this section down. 25:37.000 --> 25:42.000 But when it comes to writing idiomatic code, you have loads of options. 25:42.000 --> 25:45.000 This one is about using the type system responsibly. 25:46.000 --> 25:51.000 And what I mean by that is, you can do some really clever things with Rust's type system. 25:51.000 --> 25:57.000 It's probably better than the source language that you're coming from. 25:57.000 --> 26:05.000 So you can use things like typestate to avoid things being misused, where you embed a state machine in your type system, 26:05.000 --> 26:12.000 to make sure that people can't, for example, fit a model twice, or predict using an unfitted model. 26:12.000 --> 26:18.000 And that's really powerful when you combine it with ownership, so states are consumed, and that state machine flows really nicely. 26:18.000 --> 26:24.000 The downside is that your users will probably want some way of doing this at runtime. 26:24.000 --> 26:32.000 And if they want to do it at runtime, then they have to store these two different structs anyway, with their different types. 26:32.000 --> 26:41.000 So they're going to have to use an enum anyway; otherwise they'd have to store these things in two separate Options, which is a poor way of doing things. 26:41.000 --> 26:50.000 You will notice this when you're writing your bindings, if you write some for Python or JavaScript, because you are basically your own user at that point. 26:50.000 --> 26:55.000 And anything can change at runtime when you're writing Python or JavaScript. 26:55.000 --> 27:05.000 So here's an example. In machine learning you often have a concept of an unfit model, where it's just been created with some hyperparameters; then you pass it data, 27:05.000 --> 27:08.000 and it turns into an actual model that you can then make predictions with. 27:09.000 --> 27:18.000 So the typestate way of doing this might be to have an unfit model, which has a fit method, and then a fit model, which has a predict method, and you can't misuse that. 27:18.000 --> 27:26.000 However, when your user wants to have a button that turns one into the other, they're going to have to have some way of representing that, and they're going to need some kind of tag, 27:26.000 --> 27:32.000 which ends up being probably this enum that we have, with an unfit and a fit variant. A sketch of both follows.
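Here's a minimal sketch of that typestate pattern, with hypothetical names. `fit` takes `self` by value, so ownership enforces the state machine at compile time, and the enum at the bottom is the runtime-friendly wrapper that bindings often end up needing anyway.

```rust
// A sketch of the typestate idea; all names here are made up.
struct UnfitModel {
    smoothing: f64, // hyperparameters only
}

struct FitModel {
    smoothing: f64,
    level: f64, // state learned from data
}

impl UnfitModel {
    // Consumes self: you can't fit twice, or predict before fitting.
    fn fit(self, data: &[f64]) -> FitModel {
        let level = data.iter().sum::<f64>() / data.len() as f64;
        FitModel { smoothing: self.smoothing, level }
    }
}

impl FitModel {
    fn predict(&self) -> f64 {
        self.level * self.smoothing
    }
}

// The runtime-tagged alternative: "some model, fit or not" as one type.
#[allow(dead_code)]
enum Model {
    Unfit(UnfitModel),
    Fit(FitModel),
}

fn main() {
    let model = Model::Fit(UnfitModel { smoothing: 0.5 }.fit(&[1.0, 2.0, 3.0]));
    if let Model::Fit(m) = &model {
        println!("{}", m.predict());
    }
}
```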
27:33.000 --> 27:42.000 So you can consider that when you're writing your APIs: think about how they're going to be used; you may just want to offer this enum instead of the typestate model. 27:42.000 --> 27:47.000 Obviously, this turns compile-time errors into runtime errors, so everything now returns a Result. 27:47.000 --> 27:50.000 That's just the way it is, unfortunately. 27:51.000 --> 27:55.000 Finally, a little section on optimization. 27:55.000 --> 27:59.000 So as I mentioned, we want to use real-life examples as benchmarks. 27:59.000 --> 28:03.000 Make them as realistic as possible; if you have your own data, then that's even better. 28:03.000 --> 28:07.000 And split them up into the sorts of things that your users are likely to do. 28:07.000 --> 28:14.000 In ML, that's going to be things like pre-processing, fitting, predicting, or clustering, that kind of thing. 28:15.000 --> 28:20.000 Use Criterion, and read about the timing loop as well. 28:20.000 --> 28:27.000 There are various ways of passing your data into Criterion: iter, iter_batched, iter_batched_ref, and so on. 28:27.000 --> 28:38.000 These differ based on how you have to use the data that you pass into the benchmark, whether you take it by reference or mutable reference or own it as an object; there's a sketch of this below. 28:42.000 --> 28:47.000 For profiling, I kind of made this slide for myself as much as anything else, 28:47.000 --> 28:53.000 because the internet recommends various profilers, and I always forget which one it is that works best. 28:53.000 --> 28:57.000 The answer is samply. It's always samply; I always forget that. 28:57.000 --> 28:59.000 It's great, it's perfect. 28:59.000 --> 29:07.000 The invocation is tricky, because you're calling samply, which then needs to call cargo bench, 29:07.000 --> 29:12.000 which needs its own arguments, which then needs to call Criterion, which has got its own arguments. 29:12.000 --> 29:15.000 So there's a lot of argument manipulation. 29:15.000 --> 29:19.000 Basically, I would recommend copying this and using it everywhere. 29:20.000 --> 29:28.000 By default, this uses the Firefox Profiler: it pops it open in a browser, and then you can use it to debug things immediately. 29:28.000 --> 29:37.000 It's really fantastic. It has everything you would expect from a profiler, with call trees, flame graphs, source code and line counts, that kind of thing, 29:37.000 --> 29:39.000 including the assembly itself. 29:39.000 --> 29:45.000 You can even share profiles with other people, and upload them into a GitHub issue really easily. 29:46.000 --> 29:57.000 In the notes for this slide, if you want to download it, there are more tips on performance, specifically from Nicholas Nethercote, whose Rust performance work is great.
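Going back to the Criterion point above, here's a rough sketch of a benchmark using `iter_batched`; `fit_model` is a hypothetical stand-in for a real fitting function, and you'd register this file as a bench target in Cargo.toml.

```rust
// benches/fit.rs — a sketch of the Criterion setup described above.
use criterion::{criterion_group, criterion_main, BatchSize, Criterion};

// Hypothetical stand-in for your crate's fitting function.
fn fit_model(data: Vec<f64>) -> f64 {
    data.iter().sum::<f64>() / data.len() as f64
}

fn bench_fit(c: &mut Criterion) {
    // Realistic input: ideally real data your users would feed in.
    let data: Vec<f64> = (0..10_000).map(|i| (i as f64 * 0.01).sin()).collect();

    c.bench_function("fit", |b| {
        // iter_batched clones the input outside the timing loop, so the
        // benchmark measures fitting, not allocation of the test data.
        // Use iter_batched_ref instead if the routine takes &mut.
        b.iter_batched(|| data.clone(), fit_model, BatchSize::SmallInput)
    });
}

criterion_group!(benches, bench_fit);
criterion_main!(benches);
```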
30:00.000 --> 30:08.000 A few more hints on optimization. Specifically, your slowness is likely to be in allocations. 30:08.000 --> 30:12.000 Generally, you want to pre-allocate with with_capacity. 30:13.000 --> 30:21.000 If that doesn't help, and you're still finding a lot more allocations than you expect, you're probably not reusing the same backing buffer. 30:21.000 --> 30:28.000 This sounds obvious when you say it, but a lot of these original algorithms, as you're translating them, won't have the same level of control that Rust offers. 30:28.000 --> 30:34.000 So as you're translating, you might not notice that you're actually reallocating every single time in a loop. 30:34.000 --> 30:39.000 If you can, reuse those same backing buffers, and try to keep that out of your public APIs. 30:39.000 --> 30:44.000 Using an allocator like jemalloc can speed things up nicely too. 30:44.000 --> 30:48.000 And be aware of how Vec::from_iter allocates. 30:48.000 --> 31:00.000 The docs for this are actually under the implementation of FromIterator for Vec, which is, I would say, about two levels away from where you would expect them to be, perhaps. 31:00.000 --> 31:11.000 But it underlies the collect method on Iterator, which you use a lot more than you might think, and it has some nuances around how and when it allocates and preallocates things. 31:11.000 --> 31:14.000 This should get you quite far; with these tips you'll probably be okay. 31:14.000 --> 31:19.000 But if you want to squeeze everything out, you may want to do some more unusual things. 31:19.000 --> 31:27.000 So one thing I found really helpful is to replace any explicit indexing that you're doing with Iterator::zip or itertools' izip. 31:27.000 --> 31:38.000 So if you're iterating over a lot of things and you need to assign to an output array or anything like that, rather than indexing: on the left here, this old example used get_unchecked. 31:38.000 --> 31:43.000 So you can use unsafe get_unchecked and pass it an index. 31:43.000 --> 31:47.000 But then you're adding unsafe into your code, which ideally you would avoid doing. 31:47.000 --> 31:50.000 And you can avoid doing that by using zip. 31:50.000 --> 31:53.000 And the compiler knows exactly what's going on there. 31:53.000 --> 31:56.000 It can optimize things; it can eliminate the bounds checks anyway. 31:56.000 --> 32:02.000 And you get better cache locality, because it knows how to load everything in cache lines. 32:02.000 --> 32:11.000 Another one that we found quite often: division and square root are surprisingly slow. 32:11.000 --> 32:14.000 So you need to avoid doing them in loops. 32:14.000 --> 32:22.000 Often you can store a scaling factor instead. So in this case we were doing some multiplication and division on every single call to an iterator. 32:22.000 --> 32:25.000 In an Iterator::next implementation, right? 32:25.000 --> 32:35.000 We can actually store the scale factor instead, and then we only have to do the division once, and we just do multiplications on every iteration. 32:35.000 --> 32:40.000 Another example of this, by the way, was square root. 32:40.000 --> 32:45.000 In the implementation of the dynamic time warping that we talked about earlier, 32:45.000 --> 32:48.000 we calculate the Euclidean distance. 32:48.000 --> 32:53.000 That involves squaring something and then taking the square root for every calculation. 32:53.000 --> 32:59.000 Instead of that, you can just square everything, sum all the squares, and then do one square root at the end. 32:59.000 --> 33:00.000 It's pretty obvious.
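Here's a small sketch showing both of those tricks on the Euclidean distance just mentioned: iterate with zip instead of indexing, and hoist the square root out so it happens once at the end. The data is made up.

```rust
// Indexing version: bounds-checked on every access (or unsafe if you
// reach for get_unchecked), written the way a direct translation often is.
fn euclidean_indexed(a: &[f64], b: &[f64]) -> f64 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum.sqrt()
}

// zip version: no unsafe, the compiler can see the iteration bounds,
// eliminate the bounds checks, and vectorize; one sqrt at the very end.
fn euclidean_zipped(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 6.0, 3.0];
    assert_eq!(euclidean_indexed(&a, &b), euclidean_zipped(&a, &b));
    println!("{}", euclidean_zipped(&a, &b)); // prints 5
}
```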
33:00.000 --> 33:07.000 Well, it's obvious when you look at it outside of the code, less so inside it, but it helps loads, and it's a huge optimization we did. 33:07.000 --> 33:09.000 Okay, that was a lot. 33:09.000 --> 33:11.000 I hope there were some useful tips there. 33:11.000 --> 33:16.000 I think we're going to take questions. How are we doing for time? 33:17.000 --> 33:19.000 That's including questions. 33:19.000 --> 33:20.000 Okay. 33:20.000 --> 33:24.000 I'll do questions first, because I don't know if I'll get through everything otherwise. 33:24.000 --> 33:27.000 So, yeah, there's a little summary. 33:27.000 --> 33:29.000 Please do try augurs. 33:29.000 --> 33:34.000 If you have time series, and you want to try analyzing them or working with them at all, give it a go. 33:34.000 --> 33:39.000 We have, as I said, Python bindings, and JavaScript bindings on npm. 33:39.000 --> 33:42.000 It's being used in Grafana's frontend, which is exciting. 33:42.000 --> 33:50.000 We use it for outlier detection already, and there are also APIs for doing forecasting and change point detection there. 33:50.000 --> 33:51.000 Give it a go. 33:51.000 --> 33:55.000 Try porting some algorithms. It's fun. 33:55.000 --> 34:02.000 You can have a lot of fun optimizing things and speeding things up, and writing some really nice APIs piece by piece. 34:02.000 --> 34:10.000 And I would say, explore WebAssembly, both on the complex dependencies side, maybe using WebAssembly components, maybe not. 34:10.000 --> 34:20.000 And for the JavaScript bindings, there is a whole section of this talk, which I've probably not got time to cover, on the different trade-offs that you can make when using things like 34:20.000 --> 34:24.000 wasm-pack and wasm-bindgen to generate your Wasm bindings. 34:24.000 --> 34:28.000 And that's usable from anywhere, whether it's in the browser or from different languages at runtime. 34:28.000 --> 34:30.000 So, give that a go. 34:30.000 --> 34:35.000 I'm going to take some questions, if anybody has any. 34:35.000 --> 34:39.000 Otherwise, I don't know if we'll have time to do anything else. 34:39.000 --> 34:41.000 Yeah, sure. 34:41.000 --> 34:43.000 Question on the mic. 34:43.000 --> 34:51.000 I did already check online, but ndarray doesn't support mixed types of columns. 34:51.000 --> 34:57.000 So it's like a 2D array or 3D array, but not two arrays of different types. 34:57.000 --> 35:05.000 Do you happen to have any recommendations of such mixed-type libraries? 35:05.000 --> 35:09.000 So I think the question is about mixed-type arrays. 35:09.000 --> 35:15.000 I would suggest using something that's more of a data frame approach. 35:15.000 --> 35:20.000 So, in Python, you would traditionally use pandas or something like that. 35:20.000 --> 35:23.000 Polars is the kind of Rust version of pandas. 35:23.000 --> 35:25.000 And it does have Rust APIs. 35:25.000 --> 35:27.000 I have to say I haven't played with them much. 35:27.000 --> 35:34.000 And I don't know how well documented they are, because last time I checked, it was mainly the Python bit that was documented. 35:34.000 --> 35:38.000 But I imagine that would be the way to go: to use something like Polars.
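For what it's worth, a minimal sketch of that data frame approach using the polars crate; the column names are made up, and the `df!` macro here is part of Polars' Rust API.

```rust
// A sketch of mixed-type columns with Polars (add the `polars` crate).
use polars::prelude::*;

fn main() -> PolarsResult<()> {
    // Mixed column types in one structure: integers, floats, strings.
    let df = df!(
        "timestamp" => &[1_i64, 2, 3],
        "value" => &[1.5_f64, 2.0, 2.5],
        "label" => &["a", "b", "a"]
    )?;
    println!("{df}");
    Ok(())
}
```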
35:38.000 --> 35:40.000 Yeah. 35:40.000 --> 35:43.000 Hello, thank you for the talk. 35:43.000 --> 35:49.000 I'm curious how you exchange large amounts of data with your Wasm component. 35:49.000 --> 35:54.000 Basically, how do you transfer large quantities of data in and out of it? 35:54.000 --> 36:00.000 Do you use any kind of serialization format like Apache Arrow or something like that? 36:00.000 --> 36:04.000 I can see what you were saying about the microphone; it's not been great up here. 36:04.000 --> 36:08.000 So I think the question is about passing things in and out of WebAssembly. 36:08.000 --> 36:09.000 Yeah. 36:09.000 --> 36:21.000 So, in both cases, when you're using wasm-bindgen and when you're using the component model, they basically use the Float64Array type on the JavaScript side. 36:21.000 --> 36:26.000 And it gets converted into a linear memory block, which is then memcpy'd into a Vec. 36:26.000 --> 36:31.000 So it's not done using any kind of chunking or any kind of library like Arrow. 36:31.000 --> 36:35.000 I haven't noticed any performance bottlenecks in that side of things whatsoever. 36:35.000 --> 36:38.000 It's pretty much a memcpy, I think. So, yeah. 36:38.000 --> 36:43.000 As long as everything is sized correctly, you can do that efficiently already. 36:43.000 --> 36:56.000 Thank you for the talk. 36:56.000 --> 37:03.000 On the topic of optimizations, did you try speeding up bottlenecks using SIMD? 37:03.000 --> 37:08.000 And if so, what's your story for doing that in stable Rust? 37:08.000 --> 37:13.000 I have an open issue to try and do some of these things using SIMD. 37:13.000 --> 37:14.000 But I haven't got around to it. 37:14.000 --> 37:17.000 I think the outlier detection one would be a really nice candidate. 37:17.000 --> 37:24.000 Because, actually, if you look at the NumPy code, it already is kind of vectorized; it's written in a vectorized way. 37:24.000 --> 37:26.000 So we could try doing that. 37:26.000 --> 37:33.000 I haven't checked whether the compiler has already added those SIMD optimizations in there. 37:33.000 --> 37:37.000 And I would love someone to come along and show me how to do it, because I've never actually worked with it. 37:37.000 --> 37:46.000 But when I looked into using the portable SIMD in the Rust standard library, it's still nightly only, I think. 37:46.000 --> 37:50.000 So I was never quite sure how to do it using stable Rust. 37:50.000 --> 37:56.000 I'm not sure that's such a big concern for this, because, well, it would be for Rust users. 37:56.000 --> 38:04.000 But for the Python and the JavaScript bindings, we can happily just use a nightly compiler to create those bindings in the first place. 38:04.000 --> 38:06.000 So that would be usable. 38:06.000 --> 38:13.000 So, if there are no more questions, can we thank Ben? 38:13.000 --> 38:16.000 Thank you.