So, thank you for sharing the lunch time with me. I will present a framework for building digital communication chains, which is called AFF3CT.

To give a little bit of background: AFF3CT is a software package developed to provide solutions for software-defined radio. It was started a little more than 10 years ago, and the goal, as you know, is to replace hardware components with software components.

I will start with a very basic chain. AFF3CT's goal is to handle this kind of chain, where you have a source, several layers of encoding algorithms, and then a channel that adds some noise. It can be used in two ways. Either for simulating error-correction algorithms, for the purpose of validating them and showing that they have the properties we expect: when we run a simulation, we launch a lot of random messages, we add some controlled noise, and we check that what we get at the output matches what we sent at the input. Or it can be used in real communication chains; for instance, it has been used for satellite transmission and for video processing.

It involves three academic teams in France, two of them in Bordeaux. There is the IMS laboratory in Bordeaux, which works mostly on digital communications; the STORM team at Inria in Bordeaux, which is more on computer science and HPC; and the ALSOC team at LIP6. I will detail this a little more. The STORM team, which is the team I belong to, initially worked only on computer science and HPC, so quite far from radio and SDR: we develop techniques for optimizing code and programming models for large supercomputers, but now we also target smaller devices for radio. The team from the IMS laboratory in Bordeaux covers three domains: analog and mixed circuits, algorithms, and digital communications, which is the main speciality of that team. And there is a team in Paris, at the LIP6 laboratory, which works at the interface between the two domains, classical computer science and radio.

AFF3CT is organized as follows. There is the top layer, where you have all the management of the modules: the communication modules, the simulator, and the building blocks for the error-correcting codes.
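As an aside, the simulation principle described above (random messages, controlled noise, compare the output with the input) can be sketched in a few lines of standalone C++. This is not AFF3CT code; all names and the uncoded BPSK-over-AWGN setup are purely illustrative.

```cpp
// Standalone Monte Carlo bit-error-rate sketch: random bits, controlled noise,
// compare the output to the input. Illustrative only, not AFF3CT code.
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <random>

int main()
{
    const std::size_t n_bits  = 1'000'000;                        // random messages to send
    const double      ebn0_db = 4.0;                              // controlled noise level (Eb/N0 in dB)
    const double      ebn0    = std::pow(10.0, ebn0_db / 10.0);
    const double      sigma   = std::sqrt(1.0 / (2.0 * ebn0));    // AWGN std dev for unit-energy BPSK

    std::mt19937                     gen(42);
    std::bernoulli_distribution      bit(0.5);
    std::normal_distribution<double> noise(0.0, sigma);

    std::size_t n_errors = 0;
    for (std::size_t i = 0; i < n_bits; i++)
    {
        const int    tx_bit = bit(gen) ? 1 : 0;      // source: random bit
        const double symbol = tx_bit ? -1.0 : +1.0;  // modulation: BPSK mapping
        const double rx     = symbol + noise(gen);   // channel: add controlled noise
        const int    rx_bit = rx < 0.0 ? 1 : 0;      // demodulation: hard decision
        if (rx_bit != tx_bit) n_errors++;            // monitor: compare output to input
    }

    std::printf("Eb/N0 = %.1f dB, BER = %.3e\n", ebn0_db, double(n_errors) / double(n_bits));
    return 0;
}
```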
AFF3CT then sits on top of two other pieces of software: StreamPU, which is the runtime part, in charge of handling the tasks, mapping them onto the CPU cores, scheduling and optimizing; and the MIPP library, which is dedicated to SIMD programming, giving access to the lower-level SIMD instruction sets and intrinsics.

Let me detail these components a little more. First, the MIPP library. This is a C++ library, what we call a wrapper library, which means it is a header-only library that is included at compile time by the compiler, through templates. It provides an abstraction layer in which you express your program. For instance, here we have a MIPP register, which symbolizes a vector register of the SIMD instruction set, that we load with several elements, and then we do some arithmetic on it. When you compile this piece of code, the abstract operations are translated to the instruction set that you target. For instance, for Intel AVX: the assignment that you see here is translated into the corresponding AVX2 intrinsic from Intel; likewise, the plus operation here is translated into the AVX2 addition intrinsic. And if you compile the same code on an ARM architecture, it is translated into the NEON instruction set instead. The interest is that you keep your code largely independent of the architecture you are targeting, so it stays quite portable while you still have a lot of control over the instructions that you use. The library is also evolving: we are working on SVE, the new generation of SIMD instructions on ARM, and on supporting RISC-V and its V (vector) extension.

The second big part of AFF3CT is the StreamPU runtime, which is in charge of handling the work and mapping that work onto the CPU cores. It runs in user space and uses threads to execute the code and the work submitted by the tasks in the communication chain. For that, the programmer uses an API that is a kind of embedded language in C++ for building these chains. Here is a very basic example with just four tasks. The tasks are the elements that act on the flow of data; you can imagine that, on a flow of radio information, they work on each packet or frame. So you create the components here.
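Coming back to the MIPP part for a moment, the kind of vectorized code described above looks roughly like the following. This is a minimal sketch written in the style of MIPP's Reg<T> interface as I understand it; the exact spelling in the library may differ slightly, so treat the names as assumptions.

```cpp
// Element-wise add of two arrays through a SIMD wrapper, MIPP-style sketch.
#include <mipp.h> // MIPP is header-only, included at compile time

void add_vectors(const float *A, const float *B, float *C, int n)
{
    // mipp::Reg<float> symbolizes one SIMD register of the target ISA;
    // mipp::N<float>() is the number of floats that fit in that register.
    // n is assumed to be a multiple of mipp::N<float>() in this sketch.
    mipp::Reg<float> rA, rB, rC;
    for (int i = 0; i < n; i += mipp::N<float>())
    {
        rA.load(&A[i]);   // becomes an AVX2 load on x86, a NEON load on ARM, ...
        rB.load(&B[i]);
        rC = rA + rB;     // the '+' maps to the corresponding add intrinsic
        rC.store(&C[i]);
    }
}
```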
To continue with the chain-building example: there are two notions here. A task is the actual routine that will be executed, and a module is a slightly higher-level abstraction that groups tasks together when they are related; for instance, if you have a coding algorithm, the corresponding decoding algorithm will be part of the same module from the user's point of view. So you create your module objects and then you build your chain: you bind the tasks. The binding operation manages both the memory allocation for the buffers and the passing of data from one task to the next. Then you wrap that into a sequence and you execute the sequence here, which does the actual processing of the data.

That is the simple version. You can also have more complex aggregations of tasks; for instance, here you have several inputs, which corresponds more to the actual chains that you have to build. You also have conditionals, so you can have branching, and you can select different paths according to the data that you find in the frames that you process; it is the kind of if/else or switch/case that you would have in a classical program. You have loops as well, of course; this is useful if you have some kind of synchronization in your communication chain, to synchronize on the signal: you can repeat a part until some condition is reached.

Then, what allows us to actually use parallelism is the notion of pipeline. Here you can see that the tasks can be grouped into several stages, and each stage is mapped onto CPU cores. Stage one here is mapped onto one core; here we have two or three tasks in the second stage that are mapped onto the same core; and then another stage here. So, if you want, you can control everything about the mapping of the stages and how the tasks are grouped. We also provide ways to automate the mapping of this work onto the cores.

The first step is to automatically replicate the stages that can be run on several threads. The condition for that is that the stage must be stateless: it must not keep state from one frame to the next. If the stage is stateless, that means, for instance, that frame one can be processed on one thread while the second frame coming in is processed in parallel on a second thread, and so on. If there is internal state kept between frames, this is not possible, because then you have a dependency and you cannot replicate that part.
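Here is a standalone sketch of the stateless/stateful distinction just described. It is not the AFF3CT/StreamPU API, just the idea: a stateless stage can be replicated over threads because any thread can take any frame, while a stateful stage has to see the frames in order on a single thread.

```cpp
// Why a stateless stage can be replicated across threads but a stateful one cannot.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Stateless stage: the result depends only on the input frame.
static int stateless_stage(int frame) { return frame * 2; }

// Stateful stage: keeps an accumulator across frames, so frame i depends on
// frame i-1 and the stage must process frames sequentially.
struct StatefulStage
{
    long long acc = 0;
    long long process(int frame) { acc += frame; return acc; }
};

int main()
{
    const int n_frames = 8, n_threads = 4;
    std::vector<int> out(n_frames);

    // Replicate the stateless stage: frames are distributed over the threads.
    std::atomic<int> next{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; t++)
        workers.emplace_back([&] {
            for (int f = next.fetch_add(1); f < n_frames; f = next.fetch_add(1))
                out[f] = stateless_stage(f);
        });
    for (auto &w : workers) w.join();

    // The stateful stage must consume the frames in order, on one thread.
    StatefulStage stage;
    for (int f = 0; f < n_frames; f++)
        std::printf("frame %d -> stateless %d, stateful %lld\n", f, out[f], stage.process(f));
    return 0;
}
```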
Just to show you a little more with a small example: you can see here the pipeline that is replicated, and then its mapping onto the CPU cores. Until now, everything is done by hand: you can see in the code here that we build the stages, we build the mapping onto the threads, and we also indicate on which core we map each thread.

To give more flexibility and more automation, we have some work on automatically mapping the network of tasks onto the multicore architecture. There is a PhD thesis by Diane Orhan in Bordeaux on this topic; she is in the last year of her PhD now, and she designed an algorithm to map these tasks and stages onto the CPU cores.

Again, we distinguish two kinds of parallelism. First, pipeline parallelism, which is simply parallelism between the stages: we have one thread here and one thread here, which is what we use when we know (or cannot rule out) that there is a dependency between the stages. And when we know that there are no dependencies within some stages, the algorithm allows replication.

The idea is as follows. We consider two kinds of tasks: stateless tasks and stateful tasks. We have some information about the duration of each task, and the assumption we make is that a task has the same duration for every frame; this is the assumption of the algorithm for now. A frame enters the process, goes through stage one, then stage two, then stage three. Once stage one is complete, we can let the second frame enter stage one, and you can see that frame one is now executing stage two on a different core, and so on. You can also see that we get some parallelism here: we can process task one here and tasks five and six here at the same time.

The results are presented in a paper that you can find on the web, which describes the algorithm used for this, called OTAC. It has been proven optimal for chains under the assumption that tasks have constant duration: for a given stage, each task takes the same time on every frame.

As for how it behaves, we compared it with other existing mapping algorithms, for instance the algorithm that was considered the best until now, referred to as Nicol here. In the experiment, we have on the X axis the number of resources (cores) that we use, and on the Y axis the throughput, in megabits per second, that we are able to reach. And we have two kinds of points.
The dots are the measured metrics, and the solid lines are the simulated ones, so you can see that we were close to what we expected in terms of results. And you can see that OTAC gets the best results: it is able to reach the highest throughput while using the lowest number of resources. Some alternatives, this one here for instance, are able to reach the same level of throughput, but they use more resources for it, which is wasteful. Here, in the table, the numbers correspond to the graphs: what is simulated, what is measured, and the number of resources used to reach the best throughput.

Now some examples of using AFF3CT in a real chain. This was joint work between Airbus and the IMS laboratory, building a chain for transmitting video over satellite; it is the DVB-S2 standard. It is a quite complex chain: as you can see, we have eight stages, and some stages contain several tasks, five tasks here. This is the receiving path. We measured, at a fine-grained level, the performance of all these stages. You can see in red the stage that costs the most and which is stateful, so we cannot parallelize this one. And you can see that stage seven here is the one that costs the most but is parallelizable: it has no internal state, so we can replicate it on several cores.

Here is what we obtain using parallelism. If we use a purely sequential version, we get this throughput, about 4 Mb/s; but if we run the same computation on a multicore, using replication of the stages that are expensive, then we are able to reach 55 Mb/s for the same computation.

More recently, we have also been working on targeting heterogeneous processors. In this example I am not talking about GPUs, but about processors with different kinds of cores, big.LITTLE for instance. Here is an example on one such processor; we also did similar tests on other machines. The way this machine is laid out, you have four high-performance cores, and you have some low-power cores, which are these cores here, and we ran the AFF3CT computation on top of this machine. First, we made three tests, the first one using the OS scheduler.
So we let Linux choose how to map the threads. The colors here correspond to the occupancy of each core: you can see that we have four blue cores here, which correspond to this cluster here; likewise we have green; and we have the low-power cores in light yellow and brown here. This first test uses the Linux scheduler, and then we used different pinnings by hand: pinning for throughput, or pinning for maximizing energy savings. And we can see that we get some interesting gains. For instance, with the pinning for throughput we get a slight gain in throughput, and also a small gain in energy consumption. And if we can accept lowering the throughput a little bit, we can get a significant energy-efficiency gain, about 20% here.

OK, we have also been working on providing easier ways to program with AFF3CT. As I said, AFF3CT is in C++, and by default you have to write C++ to program it. It is not very advanced C++ that we use, but it can still be difficult for people who are not used to it, or who usually work with MATLAB, for instance, to design error-correction algorithms. So we also provide access to AFF3CT from Python, with two modules: py_aff3ct, which is simply a translation of the API into Python, and a companion module that is used to write custom C++ modules and link them into py_aff3ct. So you get the benefit of both tools: the efficiency of C++ when it is needed, and the ease of use of Python when that is preferred. This allows, for instance, having a notebook that you can use to experiment with parameters and immediately check the results. Here is an example of how it looks: you can see that we set the parameters here, which is very similar to how you would write it in MATLAB; then we build the modules of the chain; and then we have the execution here and we can run the simulation. This lets you iterate quickly with different parameters and check different kinds of solutions.

We are also experimenting with Julia, for the same purpose, but with the benefit that Julia is compiled, just-in-time compiled, so we can potentially get more performance than with Python. Also, if you are familiar with Python, you know that there is a global lock in Python that prevents efficient multithreading; even though there is a lot of effort to remove it, it is still there for now. So we are experimenting with Julia and interfacing it with AFF3CT. We have tried different packages that provide interfacing between Julia and C++, and for now it seems that one of these packages is the most interesting choice in terms of thread support and the quality of its execution support.
So we call C++ from Julia, and vice versa. We ran some experiments to measure the overhead of using Julia instead of programming directly in C++. It is a very simple case that we used here, just enough to have some workload, on an AMD Ryzen platform, and this is the chain that we used. Here are the results we obtained. In black you have the C++ latency; the two C++ versions correspond to using a copy or not between the stages, because that is an important point, and I will explain why. You can see that with Julia there is quite a gap in terms of latency. On the X axis is the data size, which gives an idea of the length of the tasks, and of the point at which it becomes acceptable to use Julia instead of C++, depending on the frame size for instance. The reason for this gap is mainly the error management in Julia, which keeps things safe. If, for your purposes, you can afford to run in an unsafe manner, then you can lower the cost of using Julia quite significantly, as you can see here, if that makes sense for your usage and your target. You can also see the difference between using a copy, which is in red here, and using a zero-copy transfer, in blue here. This is still experimental, and on the Julia side the interfacing package is itself experimental and evolving, as is Julia, so the results may well be better in the coming years. This is ongoing work for us.

One almost-last point: we are also building a cluster, for AFF3CT users and to experiment with different kinds of architectures. We are very interested in using different kinds of small boards, very similar, I think, to what was presented just before. We are planning to open this platform, at least partially, to the community, so there will be a possibility to get an account on that machine. Right now it is made of four partitions: we have some AMD Ryzen and some Intel Core machines of different generations, and also different kinds of accelerators. I did not mention it today, but we have also worked on accelerators; we have AMD and Intel parts, and we will probably add more over time as we integrate these components. There is also some information about the network here, but it is quite standard. We try to use boards that are reasonably powerful but with moderate power consumption, so we try to find a trade-off here. If you want access, you can contact Adrien Cassagne at LIP6.
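Before moving on, here is a small standalone sketch of the copy versus zero-copy binding mentioned above. This is not the AFF3CT API: it only illustrates that a zero-copy binding forwards a pointer to the producer's buffer, while a copy binding converts the data into a buffer matching the next task's input type.

```cpp
// Zero-copy binding (forward a pointer) versus copy binding (convert the data).
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

int main()
{
    // Producer task writes soft values (LLRs) into its output buffer.
    std::vector<float> llrs = {+2.1f, -0.3f, -1.7f, +0.9f};

    // Zero-copy binding: the next task just receives a view on the same memory.
    const float      *view = llrs.data();
    const std::size_t len  = llrs.size();

    // Copy binding: the next task expects another type (hard bits here), so
    // the data is converted into a buffer matching its input.
    std::vector<std::uint8_t> bits(len);
    for (std::size_t i = 0; i < len; i++)
        bits[i] = view[i] < 0.f ? 1 : 0; // hard decision done while copying

    for (std::size_t i = 0; i < len; i++)
        std::printf("llr %+.1f -> bit %u\n", double(view[i]), unsigned(bits[i]));
    return 0;
}
```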
The contact details are in the slides, and the slides are online as well, so you can get the address if you need it.

OK, one last point in my talk is the online BER/FER comparator. BER/FER stands for bit error rate and frame error rate, and it is a comparator for different kinds of error-correction algorithms. You can find it on the website; let me see if I have a network connection so I can show it. That seems to work. This is the AFF3CT web page, and you have the BER/FER comparator here. It loads a database of past experiments with different kinds of error-correction algorithms; it should appear in a moment, I think. Then you can compare the results of these measurements. For instance, you choose one set of parameters, you can see the bit error rate and the frame error rate for that set of parameters, and you can compare it with other algorithms or other sets of parameters. Also, if you want, you can upload your own file; it is not uploaded to our machine, it stays in the website, in the JavaScript instance, so you can compare your own results with the literature. So it lets you see what the state of the art is, and how your own results compare to the state of the art.

OK, so, to conclude. As I said, this has been an effort of about 12 years now, building an open-source framework for software-defined radio environments, with a focus on parallelism: multicore parallelism and also vector parallelism. It is completely free, under an MIT license, and all the components are open source as well, under an MIT license. You have all the links here. And, as I said, there is this cluster that will also be opened, at least in part, to the public. OK, so, if you have any questions... Thank you.

Q: Hi, thanks a lot for your work. I actually reused your LDPC encoder and decoder code in my own project. I needed to build an LDPC matrix and optimize it, so I took your LDPC code and wrote a GPU version of it, because I have a lot of GPU power available but not so much CPU power. Would you be interested in integrating some of this code back into the project?

A: Sure, definitely yes; that would be welcome. This is why our code is on GitHub, so that we can integrate that.
And it is also why we use an open-source license. We are really interested in making that connection. Just send me an email and we will put you in contact with the right people.

There is some GPU work, yes, but it is not the main focus for part of the team, at least for the people working more on the communication layer, who tend to use CPUs or FPGAs. But GPU remains an option, because for simulation, GPU is probably the best choice anyway.

Q: OK, perfect, thank you.

Q: [Question, partly inaudible, about how AFF3CT relates to other existing projects.]

A: That is a good point. Of course, we are aware of that. The main idea is to have a very small code base that we can hack easily when we need to; there is not much more to it than that, just a little more control over what we do. This is one project next to other projects, and there are also exchanges between the projects.

Q: You mentioned the pipeline mapping; the idea is that it allows you to map the processing blocks onto cores. Is this still static, decided ahead of time?

A: For now, yes, but we are working on dynamic building as well. In the team, we also have other runtime systems that already do dynamic scheduling.

Q: Does the tool analyze the code itself to decide how to map the work onto the different cores?

A: For now, we use prior knowledge: we know, for instance, that one task will be large and another will be small.

Q: So when you develop, you need to know this, and you map your sources and the processing based on your own knowledge of the application, and this is still done at design time?

A: Yes, but it can be reconfigured quite easily; you just have to recompile.

Q: My question was, the point I had in mind was: you have a processing chain...
Q (cont.): ...the chain, one of the blocks has a failure or the conditions change, and you want to adapt it?

A: Yes, we are also working on that: being able to have more control at runtime. For instance, tasks that may not always have the same duration; or, for instance, if you are in a situation where there is a lot of noise and then you move and there is less noise, then some of the computation would probably be lighter. So we are working on a way to adapt the execution. [Unintelligible exchange.]

Q: So, from the way I understand it, you stop, change the code, recompile, and rerun. Are there tools to steer what is going on?

A: Yes, there are tools to collect metrics and display the cost of each task, so you can see which task is taking a lot of time, and then you can tune things such as the number of threads.

Q: I am curious about MIPP. You developed this library, and now C++26 is coming with its SIMD support; is there an influence one way or the other? Did you already look into how it will look, and whether it can be used for that, or is it still far out in the future?

A: So, MIPP was started well before that. The standard version does a nice job of mapping the instructions to the vector extensions.
The space of things it covers is smaller for now, but it will grow. Yes, at some point, if the C++ standard provides all that we need, we can simply use it instead of MIPP. We did the work on MIPP basically because, when we started working on AFF3CT, we did not have a fitting alternative, and we needed one. But as soon as the need is completely fulfilled by standard C++, we just move to C++ and that's it.

Q: The other thing is: in the past there have been a lot of projects, and parts of different projects, especially around reuse. This piece is really quite nice, and you may just want to have, say, only the encoder or the decoder. Do you have a workflow, or an idea of how the pieces can be reused, so that it is easy to integrate them into other projects?

A: Actually, all the modules are now separated. At first, the StreamPU part was part of AFF3CT, it was just one piece of code, and then we extracted it, and now StreamPU can be used on its own. I did not present it here, but we have work on using StreamPU, for instance, for processing astronomy data: there has already been some work on detecting meteors, with a chain that uses the DVB-S2 system to collect video and detect meteors in that video. And since it is a chain of tasks, what is inside a task does not have to be an SDR or communications component; you can put anything you need in there, it could be AI, for instance, or whatever.

Q: [Partly inaudible.] I was curious about the benchmarks: basically, I would want to strip away all the surrounding machinery, the monitoring and printing and so on, and just use the decoder on its own.

A: You can do that. If you do not want the whole of AFF3CT: AFF3CT is now a set of modules, so there is a series of schemes that are implemented and you can just target the schemes directly. If you look at the way the chains are built, you can see that there is very little glue in the AFF3CT part itself; almost everything is in the StreamPU part. In AFF3CT you just have routines and classes: you feed your data to the classes and methods and you get the results.
I think it is not very difficult. I do not know if there are examples on exactly that point, but I think it is not very difficult to do.

Q: [Question about implementing filters and feedback in a chain; inaudible.]

A: On that part I will not be able to answer precisely, because I am really on the computer-science side. But from the little I know about filters, I think that would be OK, because that is the purpose: the idea is really to pass the data along, feed some information and reuse it, or loop over some things. Again, these are just blocks and tasks, and what is inside a task is quite free.

Q: [Question about how data is passed between tasks; inaudible.]

A: You have two possibilities: either you use a pointer that is forwarded, and then it is zero-copy; or, if you need to change the data, because you decode and the output is larger, or it is not the same type, then you can change the type, so that it matches the type expected at the input of the next task. So it is completely free.

Q: [Question about the sources of overhead; inaudible.]

A: In the C++ part, there is only the synchronization between the blocks. In the Julia part, you have the overhead of calling Julia and entering the Julia context, and, if you are in the safe version, you also have the cost of Julia's error management, which is quite significant; that is the main source of overhead there. In the C++ part, the synchronization is essentially all you have: just the locks between the tasks. You cannot really avoid that, unless you go to atomics or other means of synchronization, but I think there are not many alternatives for passing the data.

OK, so I think it is time for a break, and time for lunch as well.