WEBVTT

00:00.000 --> 00:10.000 The next speaker is going to be talking to us about "Macros gone wild".
00:10.000 --> 00:12.000 Good morning.
00:12.000 --> 00:24.000 The original alternative title for this talk was "You are not expected to understand this", after a famous comment that appeared in the Sixth Edition of Research Unix.
00:24.000 --> 00:30.000 How many of you are familiar with what the C preprocessor does? Okay.
00:30.000 --> 00:38.000 Most of you, so I will skip that. It performs source file inclusion, macro replacement, conditional compilation, and other things, so we will move straight into the problems.
00:38.000 --> 00:46.000 Bjarne Stroustrup described the preprocessor as a constant problem to programmers, maintainers, people porting code, and tool builders.
00:46.000 --> 00:54.000 And this is indeed the case, because it is a phase separate from the compiler, and the code is therefore difficult to analyze: when you see a syntax error in the compiler,
00:54.000 --> 00:58.000 you don't know how the preprocessor changed the code to make that error appear.
00:58.000 --> 01:00.000 It's difficult to reason about.
01:00.000 --> 01:06.000 It confuses the C grammar and semantics: you can have macro bodies that are distinct from the C grammar.
01:06.000 --> 01:12.000 Conditional compilation complicates testing, and the preprocessor is associated with many traps and pitfalls.
01:12.000 --> 01:22.000 Books on the use of the C programming language explain how to use the preprocessor properly; otherwise you will get burnt and crash.
01:22.000 --> 01:30.000 So what I will look at is how the preprocessor is used in the Linux kernel, and specifically examine four things.
01:30.000 --> 01:34.000 First of all, the usage characteristics: the extent to which it is used in the Linux kernel.
01:34.000 --> 01:40.000 Second, and this is where I will focus most, I will discuss the technical debt it introduces.
01:40.000 --> 01:44.000 We'll see how this has changed over time.
01:44.000 --> 01:52.000 And finally we'll discuss the feasibility of reducing this technical debt, the things in the preprocessor that make our life difficult and make it hard to switch away from it,
01:52.000 --> 02:02.000 especially the feasibility of using Rust as an alternative method for doing many of the things that macros do nowadays.
02:02.000 --> 02:08.000 For this study I used a tool called CScout. This is a refactoring browser for C code.
02:08.000 --> 02:16.000 It ingests C code and allows you to study it, taking into account the full semantics and token handling of the preprocessor.
02:16.000 --> 02:24.000 So when you see the code and you click on a token, it goes back to the token's definition, in a macro for example,
02:24.000 --> 02:30.000 even if it has been defined in a way that involves the preprocessor.
02:30.000 --> 02:34.000 So it performs both syntactic and semantic analysis of the code.
02:34.000 --> 02:44.000 And I have extended it to collect a number of metrics on what happens before the preprocessor and after the preprocessor, both at the end of each function and at the end of each file.
02:44.000 --> 02:52.000 I also added functionality to measure keywords and some other metrics that have to do with the complexity of the code.
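To make the grammar-confusion point concrete, here is a minimal sketch (hypothetical code, not from the kernel or the talk's slides) of macros whose bodies are not complete C constructs; the program only becomes parseable after expansion, so a mistake surfaces as a compiler error far from its cause.

```c
#include <stdio.h>

/* Hypothetical example: each macro body is an unbalanced fragment,
 * so the source is only grammatically valid after expansion. */
#define BEGIN_GUARDED(flag) if (flag) {   /* dangling open brace */
#define END_GUARDED()       }             /* the matching close brace */

int main(void)
{
    int enabled = 1;
    BEGIN_GUARDED(enabled)
        puts("guarded work");
    END_GUARDED()   /* omit this line and the compiler reports an error
                       at the end of the file, nowhere near the cause */
    return 0;
}
```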
02:52.000 --> 03:00.000 I've analyzed the three kernel versions which you see here, spaced about ten years apart.
03:00.000 --> 03:08.000 The most recent one is 6.10, from July last year.
03:08.000 --> 03:20.000 And as you see there, the number of files has risen from 4,000 to 23,000 C files nowadays.
03:20.000 --> 03:32.000 And similarly the number of lines has risen from 5 million to 24 million, of which I analyzed 20 million, because I analyzed only a single architecture
03:32.000 --> 03:36.000 and one configuration, the most complete configuration, allyesconfig.
03:36.000 --> 03:50.000 As you can see at the bottom, this required considerable resources: more than a week of processing time for the latest kernel, and a large amount of memory, 113 gigabytes.
03:50.000 --> 03:54.000 The analysis was not easy for two of the versions.
03:54.000 --> 04:02.000 The early versions cannot be analyzed with a modern GCC, and installing an old GCC on a modern Linux was impractical.
04:02.000 --> 04:06.000 Also, the 32-bit RAM capacity was insufficient to run CScout.
04:06.000 --> 04:10.000 For this I used QEMU with a hypervisor accelerator.
04:10.000 --> 04:18.000 I had to force the use of deprecated crypto in order to be able to connect over SSH into the QEMU guest,
04:18.000 --> 04:24.000 and I used archived packages, because otherwise I could not install what was required.
04:24.000 --> 04:30.000 So I compiled it under QEMU, and then I analyzed it on a powerful host.
04:30.000 --> 04:38.000 The recent 6.10 version required more than a week of processing, which, running it again and again, would take months,
04:38.000 --> 04:42.000 and also a lot of RAM, so for this I utilized a supercomputer.
04:42.000 --> 04:50.000 I split the work into 32 tasks running in parallel on several supercomputer nodes.
04:50.000 --> 05:00.000 Then I needed a procedure to merge the results on a powerful node. I tried a number of different ways to do it: through SQL recursive queries
05:00.000 --> 05:04.000 I couldn't do it; I could not merge the graphs.
05:04.000 --> 05:10.000 In the end I developed a command in CScout to merge the parts, as you see here.
05:10.000 --> 05:14.000 I performed a binary tournament merge, 32 processes running in parallel.
05:14.000 --> 05:24.000 These were reduced an hour and a half later, and again in two hours, and in three hours the merged results were almost ready.
05:24.000 --> 05:32.000 So now I'll describe the findings for the most recent version, 6.10.1.
05:32.000 --> 05:36.000 Later I will give some examples of what happened before that time.
05:36.000 --> 05:40.000 So, how is the preprocessor used? Extensively.
05:40.000 --> 05:46.000 33% of the defined functions are defined as macros: it looks like a function, but it's a macro.
05:46.000 --> 05:50.000 72% of what is defined as an identifier is a macro identifier.
05:50.000 --> 05:58.000 And when you look at the usage, in 44% of the cases when you see a function call, there's actually a macro behind it.
05:58.000 --> 06:04.000 And if you see an identifier, again in 44% of the cases it is a macro identifier.
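A minimal sketch (hypothetical code, not from the kernel) of why so many "functions" and "function calls" are really macros: at the call site the two forms are indistinguishable, which is exactly what the before/after-preprocessing counts measure.

```c
#include <stdio.h>

/* A function-like macro and a real function doing the same job. */
#define square_m(x) ((x) * (x))                      /* handled by the preprocessor */
static inline int square_f(int x) { return x * x; }  /* seen by the compiler */

int main(void)
{
    /* Both uses look like ordinary function calls in the source. */
    printf("%d %d\n", square_m(3), square_f(3));
    return 0;
}
```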
06:04.000 --> 06:08.000 Also interesting: 94% of the macro identifiers are never used.
06:08.000 --> 06:18.000 This is not that bad, because most of these definitions have to do with hardware constants, and I think it's good to define them for the sake of completeness,
06:18.000 --> 06:24.000 rather than leaving gaps and having people wonder whether we've forgotten something or not.
06:24.000 --> 06:30.000 The distribution of preprocessor directives varies among the various areas of the kernel.
06:30.000 --> 06:42.000 So we see that in the main part, under the kernel directory, there is a large number of conditionals, probably because the kernel has to serve many purposes, many architectures, and many configurations.
06:42.000 --> 06:50.000 We see that under drivers, conditionals are used very little, probably because each driver targets a very specific configuration.
06:50.000 --> 06:58.000 And in the architecture part, arch, we have a large number of file inclusions. I'm not sure why.
06:58.000 --> 07:10.000 If we look at what the preprocessor's expansion does, we see that the number of tokens doubles, from 2,000 to 4,000 per file,
07:10.000 --> 07:14.000 and there are some explosions, up to 3 million tokens post-expansion.
07:14.000 --> 07:22.000 Similarly, the number of statements or declarations that the compiler sees rises from 170 to 300,
07:22.000 --> 07:28.000 and the number of operators from 300 to 760.
07:28.000 --> 07:36.000 Also, the if statements increase from 23 to 36 per compilation unit examined.
07:36.000 --> 07:44.000 And there is also a huge increase in the number of goto labels, but this happens for a very specific purpose, so I know why:
07:44.000 --> 07:52.000 u32 and u16 also end up counted as parts of labels, so the metric conflates those things.
07:52.000 --> 07:56.000 Why is this bad? Let me give you some reasons.
07:56.000 --> 08:06.000 Namespace pollution: at the beginning of each function we have 106 global namespace occupants, identifiers that are visible there. That's the median value.
08:06.000 --> 08:10.000 Also, each macro is used in 81 files, so macros are used a lot.
08:10.000 --> 08:16.000 And the ten most frequently defined macro names are defined 30,000 times.
08:16.000 --> 08:22.000 So the same macro name is defined again and again in various places: 30,000 times.
08:22.000 --> 08:28.000 And these are used 152,000 times in 2,000 files.
08:28.000 --> 08:34.000 Another thing that happens is namespace confusion. Look at this one example, though there are many of these.
08:34.000 --> 08:42.000 This defines BCH_ALLOC_FIELDS as a macro, doing something, x, with read_time and some value.
08:42.000 --> 08:50.000 Later on this is redefined: x is defined to create a name out of the name passed to it,
08:50.000 --> 08:54.000 and the new names it creates become part of an enumeration,
08:54.000 --> 08:57.000 as you see here, by invoking this macro.
08:57.000 --> 09:05.000 And later on again, x is defined in a different way, and BCH_ALLOC_FIELDS is invoked again as a macro,
09:05.000 --> 09:08.000 and this time it accesses members of a structure.
09:08.000 --> 09:23.000 So at the same time, the same identifier must be the name of a structure member and also part of a name that is generated dynamically through token pasting.
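A condensed sketch of this X-macro pattern (hypothetical names modeled on the description above, not the actual bcachefs code): one list macro whose elements become, under successive definitions of x, pasted-together enumerators and then structure-member accesses.

```c
#include <stdio.h>

/* Hypothetical sketch of the pattern: one list of field names. */
#define ALLOC_FIELDS()  x(read_time)  x(write_time)

/* First expansion: token pasting turns each name into an enumerator. */
#define x(name) FIELD_##name,
enum alloc_field { ALLOC_FIELDS() FIELD_NR };
#undef x

struct alloc_info { unsigned read_time, write_time; };

/* Second expansion: the very same names are now structure members. */
static void print_fields(const struct alloc_info *a)
{
#define x(name) printf(#name " = %u\n", a->name);
    ALLOC_FIELDS()
#undef x
}

int main(void)
{
    struct alloc_info a = { 1, 2 };
    print_fields(&a);   /* prints read_time = 1 and write_time = 2 */
    return 0;
}
```

Renaming a structure member here silently changes an enumerator as well, because both are generated from the same token; this is the cross-namespace coupling the talk describes.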
09:23.000 --> 09:31.000 I've looked at all the areas of such confusion, and you see here that macros also end up as members of enumerations,
09:31.000 --> 09:39.000 as parts of labels, parts of structure or union members, structure or union tags, or ordinary identifiers.
09:39.000 --> 09:47.000 So these are conflated across things that the C programming language considers, in theory, separate namespaces.
09:47.000 --> 09:49.000 But macros confuse them.
09:49.000 --> 09:56.000 So if you change a structure or union tag, you may also need to change the corresponding goto label.
09:56.000 --> 10:02.000 Another thing: in the kernel's coding style these practices are actually forbidden, but they happen anyway.
10:02.000 --> 10:07.000 Macros should not affect control flow: it's not good to return from inside a macro.
10:07.000 --> 10:13.000 And macros defined outside a function should not access local variables.
10:13.000 --> 10:16.000 And yet, here's an example of scoping confusion.
10:16.000 --> 10:20.000 This macro here uses bch, and it is defined outside the function.
10:20.000 --> 10:23.000 200 lines later, we have a definition of bch.
10:23.000 --> 10:29.000 And 21 lines later, this macro is invoked, using that bch value.
10:30.000 --> 10:39.000 This happens 3,000 to 7,700 times. Next, control flow confusion.
10:39.000 --> 10:44.000 You see here again a macro that returns something.
10:44.000 --> 10:51.000 77 lines later it is invoked, and this return happens invisibly.
10:51.000 --> 10:54.000 This doesn't happen a lot: I found 12 instances of continue,
10:54.000 --> 11:03.000 40 of break, 80 of goto, which are rather troubling, and 97 return statements in such macros defined outside functions.
11:03.000 --> 11:09.000 When I gave this talk at ETH, the kernel developers said: yes, yes, but it's the people working on drivers who do that;
11:09.000 --> 11:14.000 we don't do it in the kernel directory. I actually checked after that.
11:14.000 --> 11:22.000 And you see here where these violations happen in the kernel: the invisible variable scope one actually happens more under the kernel directory.
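A minimal sketch (hypothetical, not kernel code) of the control-flow confusion just described: the macro contains a return, so the calling function can exit at a line that shows no return at all.

```c
#include <stdio.h>

/* Hypothetical sketch: the return statement hides inside the macro,
 * which is defined far away from the functions that use it. */
#define CHECK_ARG(p)                \
    do {                            \
        if ((p) == NULL)            \
            return -1;              \
    } while (0)

static int process(const char *arg)
{
    CHECK_ARG(arg);                 /* the function may return -1 here */
    printf("processing %s\n", arg);
    return 0;
}

int main(void)
{
    printf("%d\n", process(NULL));  /* prints -1: the hidden return fired */
    return process("data");         /* returns 0 */
}
```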
11:22.000 --> 11:27.000 Another thing is hybrid call paths.
11:27.000 --> 11:34.000 These are the cases where we don't have C functions calling C functions, which is the most common, or C functions calling macros,
11:34.000 --> 11:38.000 but more complicated arrangements: we have instances of macros calling other macros.
11:38.000 --> 11:45.000 And there are almost half a million length-3 chains of a C function calling another C function via a macro.
11:45.000 --> 11:48.000 If you try to look at this in the debugger, you will not find it.
11:48.000 --> 11:54.000 If you create the call graph using the object file definitions, you will not find these calls going through the macro;
11:54.000 --> 12:00.000 you will find them as direct calls, inexplicably so, because they will not appear that way in the source code.
12:00.000 --> 12:05.000 Another thing, which is actually what made me study all this, is expansion explosion.
12:05.000 --> 12:12.000 About 500 files expand to more than 1,000 percent of their size; the median expansion is 87%.
12:12.000 --> 12:21.000 And I found 30 outliers that take 14 seconds to compile, whereas most files compile in less than 2 seconds.
12:21.000 --> 12:24.000 Let me show you an example here.
12:24.000 --> 12:29.000 This is the file setup.c from the x86 Xen code.
12:29.000 --> 12:33.000 It's about 1,000 lines, 26 kilobytes.
12:33.000 --> 12:38.000 When expanded, it becomes 50 megabytes, 88,000 lines.
12:38.000 --> 12:43.000 It takes about 7 minutes to compile and 3 gigabytes of RAM.
12:43.000 --> 12:46.000 And I will try to zoom in here.
12:46.000 --> 12:53.000 So what you see here is the file; on the right we see where I am, a very small dot.
12:53.000 --> 13:04.000 And now I'm zooming, moving a bit up and zooming again, and zooming again; you see the red square getting larger.
13:04.000 --> 13:09.000 Okay, we see some code here.
13:10.000 --> 13:29.000 And actually, a few weeks after I found it, this was fixed with this commit.
13:29.000 --> 13:34.000 The excessive expansion was due to calls to the min3 macro.
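A toy sketch of how such expansion explosions arise (a hypothetical simplification; the kernel's real min() adds type-checking machinery that multiplies the effect much further): each level of nesting textually duplicates its arguments, so the expansion grows geometrically.

```c
#include <stdio.h>

/* Hypothetical sketch of expansion growth.  MIN mentions each argument
 * twice, so in MIN3 the tokens of a and b appear multiple times; each
 * further nesting level multiplies the expansion again. */
#define MIN(a, b)        ((a) < (b) ? (a) : (b))
#define MIN3(a, b, c)    MIN(MIN(a, b), c)
#define MIN4(a, b, c, d) MIN(MIN3(a, b, c), d)

int main(void)
{
    /* One short source line; run `gcc -E` to see how large it becomes. */
    printf("%d\n", MIN4(4, 2, 8, 6));   /* prints 2 */
    return 0;
}
```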
13:34.000 --> 13:40.000 There are also complexity metrics that computer scientists use to study how difficult it is to understand code.
13:40.000 --> 13:48.000 One is called cyclomatic complexity, having to do with the branches in the control-flow graph of the code,
13:48.000 --> 13:50.000 jumping from one place to the other.
13:50.000 --> 13:54.000 This, the cyclomatic complexity, increases from 4 to 7.
13:54.000 --> 13:59.000 And the Halstead volume, having to do with the identifiers visible at a given point:
13:59.000 --> 14:02.000 its median also increases, from 85 to 180,
14:02.000 --> 14:07.000 which means that it's more difficult to reason about the code and to test it.
14:07.000 --> 14:13.000 Other things: there are composite identifiers pieced together through macros, about 150,000 of them.
14:13.000 --> 14:24.000 Extensive include hierarchies: there are some outlier compilation units that include 1.5 million lines, each compilation unit.
14:24.000 --> 14:29.000 So each thing we compile brings in about 2,000 files.
14:29.000 --> 14:33.000 Also, 36 include-file outliers have a nesting depth of 12:
14:33.000 --> 14:36.000 something includes something else, which includes something else, 12 levels deep.
14:36.000 --> 14:40.000 And I also found several cyclic include dependencies.
14:40.000 --> 14:45.000 In total 170,000 of them, 7 per compilation unit.
14:45.000 --> 14:51.000 The longest one consists of 10 elements; it's this cycle here.
14:51.000 --> 14:57.000 Of course, this does not lead to infinite recursion, because we protect include files from re-including themselves.
14:57.000 --> 15:04.000 But nevertheless, when we find such things, it's difficult to break them apart; it's difficult to reason about what's happening here.
15:04.000 --> 15:08.000 How has this evolved over time?
15:08.000 --> 15:11.000 So here are the three kernels I looked at.
15:11.000 --> 15:14.000 Two things are good: conditional directives are falling,
15:14.000 --> 15:17.000 and include directives are also falling a bit.
15:17.000 --> 15:24.000 Conditional directives especially make it difficult to test things, so it's good that they are being reduced.
15:24.000 --> 15:30.000 But other things, you see, are still increasing: namespace confusion, the use of the concatenation operator.
15:30.000 --> 15:32.000 So how can we reduce this technical debt?
15:32.000 --> 15:41.000 First of all, one practical thing: I found that of the roughly 5 million object-like macros, the large majority can simply be rewritten,
15:41.000 --> 15:48.000 either as a static const value — and I've verified that GCC compiles this so that it doesn't take any memory at all;
15:48.000 --> 16:03.000 it's not a problem: it compiles the code as if the value had been defined as a macro,
16:03.000 --> 16:09.000 provided it's used as a macro would be, so without taking its address, for example, or assigning to it —
16:09.000 --> 16:16.000 or they can also be defined as an enumeration member, which makes it possible to use them as a compile-time constant,
16:16.000 --> 16:22.000 for instance to define the size of an array.
16:22.000 --> 16:26.000 And this is possible for 77% of the macros.
16:26.000 --> 16:31.000 For the rest, either the value is probably not a compile-time constant, about 1 million of them;
16:31.000 --> 16:38.000 or the value is used in token concatenation or stringization, about 90,000 of them;
16:38.000 --> 16:47.000 or the value is used in a preprocessor conditional, so it appears in an #if or #ifdef and so on, for 23,000 of them.
16:47.000 --> 16:51.000 But for the large majority, we could actually do it.
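A minimal sketch (hypothetical values, not kernel code) of the two conversions just described, including the case where the enumeration form is needed because an integer constant expression is required.

```c
#include <stdio.h>

#define BUF_SIZE_MACRO 4096          /* the classic object-like macro */

/* Conversion 1: a typed constant that is visible in the debugger; when
 * it is only read, the compiler can fold it and allocate no storage. */
static const unsigned long buf_size = 4096;

/* Conversion 2: an enumeration member is an integer constant
 * expression, so it can still size arrays or label switch cases. */
enum { BUF_SHIFT = 12 };
static char buf[1 << BUF_SHIFT];

_Static_assert(BUF_SIZE_MACRO == sizeof(buf), "sizes agree");

int main(void)
{
    printf("%lu %zu\n", buf_size, sizeof(buf));   /* 4096 4096 */
    return 0;
}
```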
16:52.000 --> 16:59.000 Well, now let's move on to something more difficult: the function-like macros, about 100,000 of them.
16:59.000 --> 17:06.000 I've calculated, by looking at the cases that cannot be easily converted, that about half of them could be converted into C.
17:06.000 --> 17:15.000 And for the rest — we've heard about it in a number of sessions at this conference so far — Rust could promise an answer,
17:15.000 --> 17:19.000 because, with its more powerful type system, it allows for typed, syntactically correct, complete macros,
17:19.000 --> 17:27.000 and its macros can process code declaratively by manipulating the syntax, which allows them to do more complex things.
17:27.000 --> 17:42.000 So, as something that the Germans very elegantly call a Gedankenexperiment, a thought experiment, let's think about what it would mean to use Rust to change existing function-like macros into Rust code.
17:42.000 --> 17:47.000 It's not really feasible, because each change would have to happen together with the rest of the code,
17:47.000 --> 17:51.000 but let's look at what it would involve.
17:51.000 --> 17:59.000 32,000 function-like macros are not used as functions: they create data structures or code dynamically.
17:59.000 --> 18:05.000 But they could be converted into Rust macros.
18:05.000 --> 18:13.000 9,500 of them use token concatenation; for this we could use Rust's concatenation facilities.
18:13.000 --> 18:21.000 5,000 use non-object parameters, so they take a parameter that's not an object, like a function name or a variable name;
18:21.000 --> 18:25.000 for this we could use the metavariables of Rust macros.
18:25.000 --> 18:32.000 1,700 of them use stringization; Rust has stringify!, so this could also work.
18:32.000 --> 18:34.000 200 of them affect control flow.
18:34.000 --> 18:45.000 For these we could use Rust macros, or ideally we should refactor them not to affect control flow, so not to return from inside the macro.
18:45.000 --> 18:51.000 33 use typeof; for these we could use type traits or generic parameters.
18:51.000 --> 19:04.000 Again, I suspect that more of them use generic types, so this number may be larger; but Rust gives a very good solution for this, and it could also improve the code's quality.
19:04.000 --> 19:10.000 And 88 have incomplete syntax, so they have a dangling open brace or a dangling close brace.
19:10.000 --> 19:15.000 These, I think, we should refactor; they are also very few.
19:15.000 --> 19:27.000 So overall there are 24,000 to 43,000 macros, about 44%, that could be handled by using the more powerful features of Rust.
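A sketch of the typeof case (hypothetical code in the kernel's style, relying on the GNU C statement-expression and __typeof__ extensions): this is the kind of generic macro whose job would map to generic functions or traits in Rust.

```c
#include <stdio.h>

/* Hypothetical sketch of a typeof-based generic macro in the kernel's
 * style (GNU C extensions).  Each argument is evaluated only once, and
 * the macro works for any type supporting the < operator. */
#define generic_min(a, b) ({            \
        __typeof__(a) _a = (a);         \
        __typeof__(b) _b = (b);         \
        _a < _b ? _a : _b;              \
})

int main(void)
{
    int    i = generic_min(3, 5);
    double d = generic_min(2.5, 1.5);
    printf("%d %.1f\n", i, d);          /* prints 3 1.5 */
    return 0;
}
```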
19:28.000 --> 19:35.000 So, to see the overall picture: these are all the object-like macros;
19:35.000 --> 19:42.000 these are the ones with C constant values that could be moved directly to C objects;
19:42.000 --> 19:48.000 and these are the rest of the object-like ones, things that cannot be directly defined as C constants.
19:48.000 --> 19:59.000 And of the function-like macros, some would require Rust, and another large part can probably be converted into C directly.
19:59.000 --> 20:02.000 So, to conclude, what have we seen here?
20:02.000 --> 20:12.000 We have seen that the use of the C preprocessor in the Linux kernel is extensive, and that it introduces technical debt in all preprocessor dimensions;
20:12.000 --> 20:18.000 so no matter how you use the preprocessor, in many cases something bad happens.
20:18.000 --> 20:27.000 The usage is still growing in a number of areas, and it can be quite expensive to address; there is no simple solution that can help us here.
20:27.000 --> 20:35.000 So, for the short term, what could we do? Fix macro expansion explosions, as for example has already happened here.
20:35.000 --> 20:40.000 We can correct frequently occurring cyclic dependencies.
20:40.000 --> 20:45.000 I fixed one as an experiment over the summer; it has already been merged.
20:45.000 --> 20:48.000 I think it would be good to reduce the other ones.
20:48.000 --> 20:55.000 And also consider converting those 77% of the object-like macros that can be converted into C constants;
20:55.000 --> 21:02.000 doing this has the benefit that they will appear in the debugger, and it will be easier to reason about them.
21:02.000 --> 21:08.000 In the longer term, we can prioritize refactoring the function-like macros, either into C, where this is possible,
21:08.000 --> 21:16.000 or into Rust, when modules are converted to Rust or created in Rust from the beginning.
21:17.000 --> 21:21.000 This brings me to the end. I hope you found it useful. Thank you.
21:31.000 --> 21:33.000 All right. Any questions?
21:39.000 --> 21:41.000 I'm sorry.
21:42.000 --> 21:48.000 So, I think you're all wondering why people are doing that, right?
21:48.000 --> 21:51.000 If something can easily be written as a C function, why do we write a macro?
21:51.000 --> 21:55.000 Because that's how it was done since the 1980s, or what is your theory?
21:55.000 --> 21:57.000 So, why are we using macros?
21:57.000 --> 22:04.000 Especially after being told that macros are a problem, I looked back and asked why people did things like assuming sizeof(int) equals the size of a pointer.
22:04.000 --> 22:11.000 When can you do it? Can you do it at all? You'd say: don't do it.
22:11.000 --> 22:16.000 One answer was: they didn't have children at the time, so nobody had told them "don't do it".
22:16.000 --> 22:20.000 So, more seriously now, why are we doing it?
22:20.000 --> 22:22.000 I think there's a historical precedent.
22:22.000 --> 22:27.000 Early compilers didn't have the optimization capabilities that modern compilers have.
22:27.000 --> 22:32.000 So, for example, they could not inline functions that were very small,
22:32.000 --> 22:36.000 whereas with a macro this happens automatically; so you helped the compiler.
22:36.000 --> 22:37.000 The same with constants.
22:37.000 --> 22:47.000 Nowadays GCC recognizes that something is not used and doesn't need to allocate memory for it, but old compilers were very primitive and would allocate memory for that.
22:47.000 --> 22:51.000 And const wasn't even available in the first version of C.
22:51.000 --> 23:00.000 So a large part, I think, of macro usage has to do with helping compilers that were not very sophisticated.
23:00.000 --> 23:08.000 But there are also other uses: defining generic functions that can be used with multiple types, such as minimum;
23:08.000 --> 23:14.000 conditional compilation, which we explored before, for which there is no alternative;
23:14.000 --> 23:22.000 and also, C doesn't have a powerful module system, so for this we use the include directive and header files.
23:22.000 --> 23:25.000 Thank you.
23:25.000 --> 23:29.000 Anything else? Yeah.
23:41.000 --> 23:49.000 In your thought experiment for conversion to Rust, what would you see as the unit of conversion? Because obviously we are not going to...
23:49.000 --> 23:51.000 Sorry, a bit louder, please.
23:51.000 --> 24:02.000 In your thought experiment, to convert into Rust, what would you see as the minimum unit of conversion to Rust? Because obviously we are not going to convert individual macros.
24:02.000 --> 24:05.000 Exactly; this is why it was a thought experiment.
24:05.000 --> 24:12.000 The other thing, if I understood the question correctly, is what to do with the rest of the code, because the code would need to be converted into Rust.
24:12.000 --> 24:15.000 Is that the question? Yes.
24:15.000 --> 24:19.000 So, this is a much larger question, which I cannot answer as part of this study.
24:19.000 --> 24:27.000 People are looking at ways to convert C into safe Rust, because converting into Rust is not difficult;
24:27.000 --> 24:30.000 converting into safe and readable Rust is difficult.
24:30.000 --> 24:35.000 And there are a number of tools that help with that, such as C2SaferRust.
24:35.000 --> 24:38.000 LLMs can help in small areas.
24:38.000 --> 24:48.000 But that's a huge project, and I'm not sure the community even sees it as something that is universally a good thing to do.
24:48.000 --> 24:52.000 So it's beyond the scope of this study.
24:52.000 --> 24:54.000 There's another question there.
25:00.000 --> 25:03.000 Hello. Thank you for the presentation.
25:03.000 --> 25:10.000 One thing that I would like to find out from your research: over the last years we've been moving into Rust;
25:10.000 --> 25:15.000 the kernel is moving into Rust gradually, and there is a lot of funding there. Is this the right direction?
25:15.000 --> 25:16.000 So, that's also a very good question.
25:16.000 --> 25:19.000 Whether we're moving in the right direction is difficult to say beforehand.
25:19.000 --> 25:24.000 Rust has many advantages, because it was clearly created as a systems programming language,
25:24.000 --> 25:45.000 and it offers us the ability to write safe code in many areas, while still being near the hardware.
25:45.000 --> 25:53.000 I think that C is becoming more and more difficult to use in advanced cases.
25:53.000 --> 25:55.000 And there's also the matter of mind share.
25:55.000 --> 26:01.000 It's difficult for younger generations to learn C, because it is no longer taught in many universities,
26:01.000 --> 26:08.000 and this makes it difficult to have a community of new developers who will contribute to the kernel.
26:08.000 --> 26:13.000 So, I agree: maybe Rust has problems, maybe it's more difficult to learn than C,
26:13.000 --> 26:18.000 but we need to move forward and bring new people into our community,
26:18.000 --> 26:23.000 and maybe Rust is a part of the solution space.
26:23.000 --> 26:28.000 All right, thanks a lot. Thank you.