WEBVTT 00:00.000 --> 00:10.240 Hello and welcome to my talk. Compared with the previous talks, where people talked 00:10.240 --> 00:16.720 about... I cannot speak louder, sorry, that's my voice. 00:16.720 --> 00:23.200 The other presenters were talking about their company efforts or the community efforts. This 00:23.200 --> 00:30.640 time, I will present what happens in a smaller community that is at the bottom of the supply 00:30.640 --> 00:36.640 chain, more precisely in the Apache NuttX real-time operating system. 00:36.640 --> 00:44.320 My name is Alin Jerpelea. I'm an open source advocate and a member of several communities. Currently 00:44.320 --> 00:52.120 I am the Apache NuttX real-time operating system chair, and I work as an open source architect 00:52.120 --> 00:58.840 in the Sony OSPO, Europe. You can contact me on LinkedIn. Due to the time limit, I will 00:58.840 --> 01:05.720 run through the slides, so the disclaimer is there, I will not read it. Most of you probably 01:05.720 --> 01:11.320 don't know about the NuttX real-time operating system, so I will give a small introduction 01:12.440 --> 01:19.960 so that we set the context. Apache NuttX is a small-footprint real-time operating system. 01:20.680 --> 01:28.360 It uses standards and is focused on compliance. It runs on systems, microcontrollers, 01:28.360 --> 01:37.640 that are between 8 and 64 bits. It is available on more than 400 boards, and it provides documentation 01:37.640 --> 01:45.000 and a welcoming community. Also, it has been used and is used in commercial products. 01:45.000 --> 01:52.440 You can see a few examples of development boards. The one on the right is from Sony; it's called Sony 01:52.440 --> 02:00.120 Spresense. History, again running quite fast: the project was released under the permissive 02:00.120 --> 02:08.280 BSD license by Gregory Nutt.
That was in 2007; in 2019 it was donated by Gregory Nutt to the Apache 02:08.280 --> 02:15.240 Software Foundation (thank you, Gregory, for this donation), and we graduated in 2022 as a top-level 02:15.240 --> 02:24.200 project. In 2024, and currently, the governance of the project is an open governance. 02:24.200 --> 02:33.240 We have 24 members in the committee and we vote on all the big decisions. We have many products using 02:33.240 --> 02:40.040 NuttX. You see a few pictures. I hope that they are right, since we discover what companies and 02:40.040 --> 02:47.080 products use NuttX only when companies come forward to us; otherwise we have no idea what people are doing 02:47.080 --> 02:54.280 with NuttX, and you will soon see an even more interesting one. Digital recorders, 02:54.440 --> 03:02.120 headphones, drones, robots, protection equipment. You can find all those examples in the 03:02.120 --> 03:08.840 recordings of the NuttX International Workshop conference, where people come forward and talk about 03:08.840 --> 03:22.760 their use cases. Also, Xiaomi is using NuttX in its IoT platform, the platform is called Vela, 03:22.760 --> 03:33.160 and they are contributing a lot to the project. Sure. Also, NuttX 03:33.160 --> 03:40.280 was used on a small robot that landed on the Moon, by the JAXA space agency, using Sony's 03:40.280 --> 03:47.080 Spresense development board, exactly the board that you saw previously. So you have the full announcement 03:47.160 --> 03:56.440 and you can Google for it. Now that we have the use cases for this operating system, 03:56.440 --> 04:03.880 you probably realize why it is important for us to be compliant and why we want to provide, 04:03.880 --> 04:14.120 as a community, the SBOM. So, during our transition from a BSD project to an Apache project, which 04:14.200 --> 04:18.840 changes the licenses, you can see the progression and the versions at the bottom. 04:20.040 --> 04:26.360 Really, it is less important.
It is roughly two years of work. We had to scan every bit of code 04:26.360 --> 04:32.200 and make sure that the licenses are compliant and that we do a proper license transition. And then, 04:32.200 --> 04:40.280 because we identified licenses which were not Apache, we ended up having, especially on the application 04:41.160 --> 04:49.400 side, a menu item. As you see, it's a fairly common view for a kernel, where people 04:49.400 --> 04:56.840 can select their licenses. It is impossible to build NuttX without the BSD and the MIT components, 04:56.840 --> 05:04.680 because they are used in the core, but you can exclude everything else if you don't want it in your 05:04.680 --> 05:13.160 applications. And then we wanted to see: is that really true? So we wanted to have an SBOM to 05:13.160 --> 05:21.960 get a list of what's actually built. We will go back to that later. What is an SBOM? Because everybody 05:21.960 --> 05:30.440 is talking about SBOMs, and we as a community were a bit confused. An SBOM is a software 05:30.440 --> 05:37.320 bill of materials: a list of all components present in the code base, including license and version 05:37.320 --> 05:44.760 metadata, which allows a security team to quickly identify security risks. The definition is really nice, 05:44.760 --> 05:55.960 but which SBOM do we need? According to the CISA listing there are six. Do we need all of them? 05:56.760 --> 06:05.240 We went through them. Design? The software is there. We don't design anything. People contribute to it. 06:05.240 --> 06:14.520 We have no control. Source? We can see the whole sources, but would that help the companies 06:14.520 --> 06:20.040 that use our product to really know what they put in their product? Because they will get a list 06:20.040 --> 06:26.520 of all the licenses, files and correlations for the whole of NuttX. They will build a subset.
06:28.520 --> 06:37.480 So we were thinking that a build SBOM would be what would make some users, some of our users, 06:37.480 --> 06:45.000 our companies, happy. So we decided to go with source, to provide a full index of everything that we 06:45.000 --> 06:51.720 have for every release, and a build SBOM, which means that when you build you get exactly what you have. 06:53.560 --> 07:00.360 And then we were looking even further, because there was little information when we started this 07:00.360 --> 07:10.120 discussion two, three years ago that we need SBOMs. Most information that you find is for products 07:10.120 --> 07:16.440 that have an SBOM for packages, and this is how you would have it: you download the package, 07:16.440 --> 07:22.440 you get the package data, you get the SBOM data, you assemble it and you put a product on the market 07:22.440 --> 07:29.960 with the SBOM data, or at least this is our own understanding. But in our case you get code; 07:29.960 --> 07:39.720 it has nothing. The company has to generate an SBOM from whatever, and then you put the SBOM 07:39.720 --> 07:47.240 and the software release on the market. So, thinking about this picture, we were thinking, okay, 07:47.240 --> 07:54.680 that means that we have to integrate the SBOM generation in our build tools, so that when you build 07:54.680 --> 08:05.320 your binary you also get the SBOM for exactly the product that you're building. And to put a bit of 08:05.320 --> 08:13.160 salt on the wound, we had no idea how to do that. At that point we had no idea about SPDX; 08:13.160 --> 08:22.360 there were a few mentions of it, but we had licenses in clear text. So we started looking at 08:22.360 --> 08:29.000 SPDX. I joined the SPDX community, and it's a really nice community, a welcoming community, 08:29.160 --> 08:39.400 sharing information.
And we decided that we will use this. Also a bit confusing in our research: 08:40.120 --> 08:46.280 which version? Should we go for the latest and the greatest, or should we go for what the others 08:46.280 --> 08:55.640 seem to go for? So we decided to start with version 2.3, to be more exact. And then, when we see 08:55.720 --> 09:04.840 others using the newer format, we will just update the tools. We decided to go as early as possible 09:04.840 --> 09:14.120 and then we will see. The SPDX website provides many tools. I started looking at all of them, 09:14.120 --> 09:20.840 and from my findings, I couldn't use them for generating an SBOM for 09:20.840 --> 09:31.800 C code for a real-time operating system. So okay, then we started looking at the whole landscape of 09:32.680 --> 09:42.120 operating systems that have almost the same goals, from communities, from companies, active or 09:42.440 --> 09:50.440 not. And we saw that we have FreeRTOS, which implemented a build SBOM quite recently. 09:51.320 --> 09:57.960 This slide was updated last week; when we did the initial investigation three years ago, the 09:57.960 --> 10:07.400 landscape was different. Then we have Zephyr RTOS, which provides both source and build SBOMs, 10:07.960 --> 10:14.920 and we have NuttX, which hopefully will have the first release with a source SBOM, and a build 10:14.920 --> 10:23.800 SBOM at build time, in March in the next release. So now we have an idea of what is available, 10:24.360 --> 10:34.520 an idea of what other communities are doing, an idea of how we can do it. So we started working. 10:34.520 --> 10:42.840 And this was the painful part. Okay, we are an Apache project under the Apache Foundation. 10:42.840 --> 10:48.840 We added the SPDX license identifier and we were done, or so we thought. 10:50.680 --> 11:00.600 Yeah, old code. BSD0, identified with FossID. I know it's a commercial tool. I used what I had.
11:00.840 --> 11:11.720 A long list of copyright owners, a long list of contributors; information is quite scarce there. 11:13.480 --> 11:22.680 Or we ended up with code that has multiple licenses. Really nice. Also identified by the scanning tools. 11:23.000 --> 11:31.880 Unknown license. I saw this license for the first time. I don't know how many people are 11:31.880 --> 11:35.880 familiar with it, but I couldn't identify it from the text. I needed help. 11:38.600 --> 11:45.480 And also the tools can be misleading, as you can see here. Quite nice help from the Xiaomi guys, 11:45.480 --> 11:53.640 because my tool gave a false positive. I was unsure. They identified the code as BSD0. 11:54.920 --> 12:04.440 My tool identified it as GPL. So we documented everything. If somebody has anything to say later, 12:04.440 --> 12:11.160 that the license was not properly checked, we have the answer from two conflicting tools. And 12:11.960 --> 12:20.760 yeah, I mean, help. The community is based on help. We help each other to make things work. 12:23.960 --> 12:32.760 Also, there are SPDX identifiers that we cannot find. Initially I was looking for something called 12:32.760 --> 12:38.200 public domain. Public domain is handled differently in different jurisdictions, so there is no 12:38.200 --> 12:46.440 identifier. We defined our own NuttX public domain identifier, and I plan to submit it to the SPDX list, 12:47.080 --> 12:54.600 so that every time an SBOM is handled by somebody, they can see that it is defined 12:54.600 --> 13:04.280 and properly recognized and propagated in the SBOM. And then we had all the information that we 13:04.280 --> 13:12.840 needed. We had SPDX headers, identifiers. And we started defining what we want to do. We wanted 13:12.840 --> 13:20.040 to make it fully automatic. We wanted to reuse our build system, which is Make-based, 13:20.840 --> 13:29.800 and collect information. What information?
Our build system looks mostly like this, and it 13:29.800 --> 13:37.800 would make sense. So basically we have the NuttX image. We have the files. Then we get the sources 13:37.800 --> 13:44.760 and the headers, and the headers are reused and used by many. I tried to simplify the whole thing 13:44.760 --> 13:52.360 because it's quite massive. So the build system will give us dep files for each library. As you 13:52.360 --> 13:58.040 can see, those are the ones that I've been using for the demo. And then for each library you have the 13:58.120 --> 14:04.520 C files that will go in the library, and the C files get the headers for each one of them, 14:04.520 --> 14:11.720 including the system ones. So all the dependencies are aggregated in a big file, and then we clean 14:11.720 --> 14:20.920 the paths so that it can be reused. We populate the file with the proper files and paths. 14:21.880 --> 14:34.120 Yeah, this one I already jumped over. SPDX to the rescue. You get really nice documentation 14:34.120 --> 14:41.320 on what fields you actually need: what fields you have to have, what fields are nice to have, 14:41.320 --> 14:50.040 what the format is for them. So if you have any SPDX doubts, just go and read the manual. It's 14:50.120 --> 14:59.400 wonderfully done. Also, they provide examples. So you can just take a look at their examples 15:00.040 --> 15:07.400 and then you figure out what would work for you and what not. So, having also this extra information, 15:07.400 --> 15:15.000 we were able to finally set our objectives and define what we need. We decided to go with SPDX 15:15.080 --> 15:21.640 2.3. We decided to go with the JSON format because it's easy to parse and easy to convert 15:21.640 --> 15:27.640 to any other format. There are multiple converters out there. You can also convert it to 15:27.640 --> 15:34.520 CycloneDX, if that is what you need.
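The dependency collection described a little earlier (per-library dep files that list the C files and the headers each one pulls in, merged into one list with cleaned paths) can be sketched roughly as follows. This is a minimal sketch, not the NuttX implementation: it assumes the dep files use ordinary make rule syntax (`target: prerequisites`, with backslash line continuations), and the function names and the paths in `SAMPLE` are hypothetical.

```python
import posixpath

# Hypothetical dep-file content in ordinary make rule syntax (an assumption).
SAMPLE = (
    "sched.o: sched/sched.c include/nuttx/sched.h \\\n"
    "  include/./nuttx/config.h\n"
    "mm.o: mm/mm.c include/nuttx/config.h\n"
)


def parse_dep_text(text):
    """Parse make-style rules into {target: [normalized prerequisite paths]}."""
    deps = {}
    # Join continuation lines (backslash + newline) so each rule is one line.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        # Normalizing removes "./" and "x/../" segments so paths compare equal.
        paths = [posixpath.normpath(p) for p in prereqs.split()]
        deps.setdefault(target.strip(), []).extend(paths)
    return deps


def aggregate(deps):
    """Return every file that any target depends on, sorted and de-duplicated."""
    return sorted({path for paths in deps.values() for path in paths})


if __name__ == "__main__":
    for path in aggregate(parse_dep_text(SAMPLE)):
        print(path)
```

Shared headers appear once in the aggregated list even when several objects include them, which is the property an SBOM file list needs.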
We decided to collect file hashes, licenses and relationships 15:34.520 --> 15:44.520 between files for both sources and build artifacts. So that is how our header looks. I'm not 15:44.600 --> 15:52.360 100% sure that this is all that we need, but this is what we have at this moment. So it's a simple 15:52.360 --> 16:04.280 snippet from what we did. I generated the header on the 24th. Then for source files we also collect 16:04.600 --> 16:14.120 file name, file type, the ID, SHA-1 sum, licenses, also the license from the file and the license 16:14.120 --> 16:21.160 concluded. If the file has no license, the concluded license is NOASSERTION. If the file has multiple 16:21.160 --> 16:28.600 licenses, the concluded license will be either Apache or, if you have a GPL, it will be GPL. 16:29.560 --> 16:39.800 Okay. Same thing for headers. We use our own headers and we also use system headers. Everything 16:39.800 --> 16:49.080 is documented with the SHAs from each of the files. Some of them have no license information, 16:49.080 --> 17:01.880 due to obvious reasons. And we thought that things were done, but they are not. As you would 17:01.880 --> 17:08.520 have dependencies in your package manager, we have dependencies on source code. So we are getting, 17:08.520 --> 17:17.640 at build time, different git clones, basically, from other projects. And in our case, we are using 17:18.520 --> 17:25.800 (there are many more) roughly 50 projects. So in our folder, we only have the rules to build, 17:25.800 --> 17:31.640 but we do not have the sources. And the problem is that those projects either do not have 17:31.640 --> 17:41.640 SPDX identifiers or we cannot trust them. So the logical thing to do for us as a community 17:41.640 --> 17:50.520 is to help them do a license check, help them get SPDX identifiers on all the projects which 17:50.520 --> 17:56.920 are our dependencies, and contribute upstream, because they may need the helping hand. They may be 17:56.920 --> 18:05.480 a one-man project.
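The document header and the per-file record described earlier in this part of the talk (file name, file type, ID, SHA-1, license found in the file, concluded license) might look roughly like this in SPDX 2.3 JSON. This is a sketch under assumptions, not the NuttX tool: the function names, the `Tool: sbom-sketch` creator string and the `example.invalid` namespace are placeholders, the license rule is one reading of what the talk states (no license found gives NOASSERTION; several licenses give GPL if one is present, otherwise Apache), and relationships between files are omitted for brevity.

```python
import hashlib
import json
import re
from datetime import datetime, timezone

# Matches a simple tag such as "SPDX-License-Identifier: Apache-2.0".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def conclude(found):
    """Apply one reading of the talk's rule to the licenses found in a file."""
    if not found:
        return "NOASSERTION"
    if len(found) == 1:
        return found[0]
    gpl = [lic for lic in found if lic.startswith("GPL")]
    return gpl[0] if gpl else "Apache-2.0"


def file_entry(name, data, index):
    """Build one SPDX 2.3 JSON file record from a path and the file's bytes."""
    text = data.decode("utf-8", errors="replace")
    found = sorted(set(SPDX_TAG.findall(text)))
    return {
        "fileName": "./" + name,
        "SPDXID": f"SPDXRef-File-{index}",
        "fileTypes": ["SOURCE"],
        "checksums": [
            {"algorithm": "SHA1", "checksumValue": hashlib.sha1(data).hexdigest()}
        ],
        "licenseInfoInFiles": found or ["NOASSERTION"],
        "licenseConcluded": conclude(found),
        "copyrightText": "NOASSERTION",
    }


def document(name, namespace, files):
    """Wrap file records in a minimal SPDX 2.3 document header."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": name,
        "documentNamespace": namespace,  # placeholder URL, must be unique per document
        "creationInfo": {"created": created, "creators": ["Tool: sbom-sketch"]},
        "files": files,
    }


if __name__ == "__main__":
    src = b"/* SPDX-License-Identifier: Apache-2.0 */\nint main(void) { return 0; }\n"
    doc = document(
        "example-build",
        "https://example.invalid/spdx/example-build",
        [file_entry("hello.c", src, 1)],
    )
    print(json.dumps(doc, indent=2))
```

Hashing the raw bytes rather than decoded text keeps the checksum independent of the file's encoding.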
The alternate fix, which I do not suggest anyone do, is to have patches 18:06.360 --> 18:13.160 near the build rules for each file and add the licenses locally. It is ugly and it will not 18:13.160 --> 18:23.640 help the community. So yeah. And my final slide before some Q&A; I think that we still have a few minutes. 18:23.880 --> 18:36.520 This journey was almost three years. For me, it started with a lot of information. It was 18:36.520 --> 18:45.160 overwhelming information. People were mentioning SPDX, but without any reference to an embedded 18:45.160 --> 18:52.600 operating system. SPDX is great, but it needs some kind of definition for somebody that is new. 18:53.000 --> 18:59.800 You need an SBOM for packages, an SBOM for cloud, an SBOM for your project. 19:02.040 --> 19:10.200 OpenChain and SPDX are wonderful communities. If you have any doubt on anything, just go 19:10.200 --> 19:19.720 there, read, ask. People will help. Looking at our dependencies, SPDX identifiers may be missing. 19:20.520 --> 19:27.720 I bet that is the case for all of you here and all of you online. Maybe it's good to just 19:28.520 --> 19:35.640 talk with the maintainers of those projects to make them aware that you need SPDX headers. 19:37.880 --> 19:43.720 It may be good to try to help them. Maybe they do not have the resources for doing this work. 19:44.680 --> 19:53.480 Also, coming from both a company and an open source background, I can say: join the open source 19:53.480 --> 20:01.640 community. If you are interested in a specific area, in this case OpenChain, SPDX, join those 20:01.640 --> 20:08.440 communities. They will help you get better, and you can help them understand what the problem is. 20:08.840 --> 20:12.840 Thank you. 20:18.840 --> 20:22.840 Questions for Alin, who was kind enough to leave time? 20:22.840 --> 20:31.000 Yeah, great talk. You spent a lot of time on the source SPDX. Really good what you do.
20:31.000 --> 20:34.920 But actually, was there a request, a real request, for the source SPDX? 20:36.040 --> 20:44.120 We as a community... Yeah, so the question was if there was a request for source SPDX. 20:44.760 --> 20:50.840 No, we as a community started discussing this because we are at the bottom of the supply chain. 20:51.480 --> 20:57.800 So if we don't provide a build SPDX, the company building the software cannot do it. 20:58.760 --> 21:01.400 It has to be implemented in the build system. 21:03.000 --> 21:08.840 They will get a full source SPDX, they can do that from the scanning tools, but that is not what 21:08.840 --> 21:18.760 their product has. It's, yeah, you have a subset of all the sources. Thank you for the question. 21:18.840 --> 21:19.400 Yes. 21:32.680 --> 21:35.560 The question was, how do we deal with external... 21:36.520 --> 21:39.480 Have a test version of some library in there that you want to see? 21:45.480 --> 21:52.040 The question was regarding projects that are not your direct dependencies. 21:52.040 --> 21:58.600 They may be tools or test projects that will end up in your SBOM, but they are not used. 21:58.600 --> 22:02.600 The answer is, for this project, we don't have them. 22:02.840 --> 22:06.280 We only have what we actually use and what we actually build. 22:08.680 --> 22:10.680 We don't get them in our source. 22:12.280 --> 22:18.120 But this again is our case. In other cases, probably there are other answers. 22:33.560 --> 22:39.320 Sorry, I'm sorry. I don't know how to handle it for Java. 22:41.320 --> 22:45.080 There is one last question, I think, or maybe two. 22:54.600 --> 23:01.000 Yes, the source... the question was if there is a difference between source and build SBOM. 23:01.080 --> 23:06.600 The source SBOM will be delivered at packaging time for each release, 23:06.600 --> 23:10.920 and it contains all the information for all sources in the project.
23:11.480 --> 23:15.320 While the build SBOM, let's say that you have a Raspberry Pi Pico, 23:16.200 --> 23:21.560 will contain only the files that end up in the final image on your Raspberry Pi Pico, 23:23.560 --> 23:27.080 with the applications and everything that you selected in the menu. 23:27.960 --> 23:31.960 We have time for, I think, one last question. Yes. 23:38.520 --> 23:40.200 In the next generation? 23:40.200 --> 23:46.360 I have, yeah, the question was, to how many projects did we contribute SBOM information? 23:46.360 --> 23:50.520 At this moment, zero; we are just fixing our own SBOM. 23:51.240 --> 23:56.920 But the aim is to help our dependencies and then see how we can help. 23:58.040 --> 23:59.800 So, thank you very much. 23:59.800 --> 24:01.800 Thank you.
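The difference between the two SBOMs discussed in the Q&A, a source SBOM covering every file in a release and a build SBOM covering only what ends up in the image, can be illustrated as a simple filter. This is a sketch under assumptions: `build_subset` and the sample paths are hypothetical, and it presumes that the list of files actually compiled is already available from the build system.

```python
def build_subset(source_files, built_paths):
    """Select source-SBOM entries for built files; report paths with no entry."""
    by_name = {entry["fileName"]: entry for entry in source_files}
    included = [by_name[path] for path in built_paths if path in by_name]
    # Paths without an entry are typically sources fetched at build time.
    missing = [path for path in built_paths if path not in by_name]
    return included, missing


if __name__ == "__main__":
    # Hypothetical source SBOM entries for a release.
    source_sbom = [
        {"fileName": "./sched/sched.c", "licenseConcluded": "Apache-2.0"},
        {"fileName": "./drivers/unused.c", "licenseConcluded": "Apache-2.0"},
    ]
    # Hypothetical list of files that went into one image.
    built = ["./sched/sched.c", "./external/lib/foo.c"]
    included, missing = build_subset(source_sbom, built)
    print([entry["fileName"] for entry in included])
    print(missing)
```

The `missing` list is where externally fetched dependencies would show up, which matches the point made in the talk that their license information has to come from upstream.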