WEBVTT 00:00.000 --> 00:12.000 So, our final talk of the day, rounding us off, is David Runge. 00:12.000 --> 00:16.000 I'm feeling sorry. 00:16.000 --> 00:19.000 Maybe that's the mic. 00:19.000 --> 00:26.000 Talking about adventures in oxidizing Arch Linux package management, which is going to be very 00:27.000 --> 00:29.000 very exciting. 00:29.000 --> 00:30.000 Take it away. 00:30.000 --> 00:32.000 Yeah, thanks. 00:38.000 --> 00:41.000 First off, it's really nice to still see a bunch of people here. 00:41.000 --> 00:45.000 I know it's been a very long day, for me too, I'm pretty tired. 00:45.000 --> 00:50.000 But, yeah, let's have a look at what we're currently working on. 00:51.000 --> 00:58.000 So, yeah, I'm trying to give like a grand overview of what Arch Linux package management actually 00:58.000 --> 00:59.000 means. 00:59.000 --> 01:00.000 Sorry. 01:00.000 --> 01:02.000 Is it not loud enough? 01:02.000 --> 01:04.000 All right. 01:08.000 --> 01:11.000 Just try to hook it up a bit higher. 01:13.000 --> 01:15.000 Maybe that helps. 01:15.000 --> 01:16.000 I don't know. 01:16.000 --> 01:17.000 Is that good? 01:17.000 --> 01:20.000 I'm sorry. 01:20.000 --> 01:27.000 So, I'll try to give an overview of what Arch Linux package management actually is in this context. 01:27.000 --> 01:32.000 And, yeah, talk a bit about the motivation behind this entire project. 01:32.000 --> 01:37.000 And what we're currently tackling, or working on. 01:37.000 --> 01:41.000 So, yeah, first off, a little bit of background about me. 01:41.000 --> 01:44.000 I'm a freelance software developer. 01:44.000 --> 01:54.000 I have been with Arch Linux for quite some time now, as a package maintainer, and developer, and signing key holder, etc. 01:54.000 --> 02:02.000 I do a bunch of Rust, pro audio things, and there was lots of Python in the past as well. 02:02.000 --> 02:10.000 I have mostly spent my time with the installation process and packaging, of course, like way too many packages.
02:10.000 --> 02:18.000 And, well, infrastructure topics, and the ALPM project is somewhat of an infrastructure topic itself as well. 02:18.000 --> 02:21.000 So, to start with the obligatory: 02:21.000 --> 02:25.000 How many people in the room are actually using Arch Linux right now? 02:25.000 --> 02:27.000 Holy shit, that's a lot. 02:27.000 --> 02:28.000 Okay. 02:28.000 --> 02:29.000 That's great. 02:29.000 --> 02:32.000 That's probably 70% roughly in the room. 02:32.000 --> 02:36.000 For those that are listening in and have no clue, 02:36.000 --> 02:39.000 I'll try to give like a brief introduction. 02:39.000 --> 02:46.000 So, when we think of Arch Linux, we mostly think of pacman, I guess, or often we think of pacman as the project. 02:46.000 --> 02:49.000 This is very central to the distribution. 02:49.000 --> 02:53.000 It consists of a package manager that is written in C. 02:53.000 --> 03:02.000 We have makepkg, which is a package build tool that is written in Bash, with which you build the packages that you then install later on with pacman. 03:02.000 --> 03:09.000 We have tooling, like repo-add, that you can use to, like, in a very rudimentary way, 03:09.000 --> 03:12.000 yeah, deal with package repositories. 03:12.000 --> 03:24.000 And we have pacman-key, which is a thin wrapper around GnuPG, for handling our pacman-specific GnuPG keyring. 03:24.000 --> 03:31.000 When we think of distribution packaging, so for Arch Linux itself, actually, then we're usually talking about devtools, 03:31.000 --> 03:40.000 which is, like, a collection of scripts and, well, a more unified experience, basically, to build in a clean chroot. 03:40.000 --> 03:55.000 And we have dbscripts, which basically wraps the aforementioned repo-add with many more bells and whistles, basically to deal with our repositories. 03:55.000 --> 04:02.000 And, yeah, many of you probably know the AUR, which is, like, a platform of package scripts.
04:02.000 --> 04:10.000 And there's a set of, like, unofficial user repositories, of course, that offer pre-built packages. 04:10.000 --> 04:19.000 And we have a bunch of AUR helpers that probably a few of you use, I guess, to build and install things. 04:19.000 --> 04:32.000 But when we look into ALPM, so that's the short form for Arch Linux Package Management, then we actually look into, yeah, plain package building, I guess, in the beginning. 04:32.000 --> 04:38.000 We usually talk about, like, source repositories, where we have the PKGBUILD, which is the build script. 04:38.000 --> 04:48.000 We clone that thing, we build it, we get a package, which we then sign, and then we basically can also create a .SRCINFO, which is, like, a representation of 04:48.000 --> 05:06.000 the PKGBUILD that is parsable, because, yeah, having Bash and the metadata in the PKGBUILD is not really nice in certain contexts, right? If you want to show that on the website, you don't want to use Bash for that, I guess. 05:06.000 --> 05:14.000 This is a little bit of an example. As I mentioned, like, this is all Bash. 05:14.000 --> 05:25.000 So, PKGBUILDs are literally just Bash scripts that are evaluated, and that then build packages or lead to actual packages. 05:25.000 --> 05:35.000 So, if you're familiar with Bash, you will feel right at home. It's fairly easy to read usually, it's literally just build instructions, installation instructions and things like that. 05:36.000 --> 05:45.000 Then we have the .SRCINFO file, which is, well, that's what most, yeah, we can actually scroll. 05:45.000 --> 05:53.000 That is a bit of, like, an INI-style representation of the metadata that is in the PKGBUILD. 05:53.000 --> 06:07.000 It doesn't really contain any other information than the one that you will find in the PKGBUILD, basically, so very static info, actually.
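To illustrate the INI-style `key = value` layout described for the .SRCINFO file, here is a minimal Rust sketch. It is a simplification under stated assumptions: the function name `parse_key_value_lines` is hypothetical, the handling of blank lines and `#` comments is an assumption of this sketch, and nothing here is the API of the project's own parser libraries.

```rust
/// Parse simplified `key = value` lines, as in an INI-style metadata file.
///
/// Assumptions of this sketch (not taken from any official specification):
/// leading and trailing whitespace is ignored, blank lines and lines starting
/// with `#` are skipped, and every remaining line must contain ` = `.
fn parse_key_value_lines(input: &str) -> Result<Vec<(String, String)>, String> {
    let mut pairs = Vec::new();
    for (index, raw_line) in input.lines().enumerate() {
        let line = raw_line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        match line.split_once(" = ") {
            // Keep the pairs in file order, since a key may occur several times.
            Some((key, value)) if !key.trim().is_empty() => {
                pairs.push((key.trim().to_string(), value.trim().to_string()));
            }
            // Report the 1-based line number so the error is actionable.
            _ => {
                return Err(format!(
                    "line {}: expected `key = value`, got {:?}",
                    index + 1,
                    line
                ))
            }
        }
    }
    Ok(pairs)
}
```

Returning an ordered list of pairs rather than a map keeps repeated keys (for example several dependency entries) intact, which a real parser for this kind of file would also need to do.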
06:07.000 --> 06:17.000 You may wonder, I mean, even if you have been using Arch, like, what is a package actually? It's literally just a tar file, it's not really that magic. 06:17.000 --> 06:37.000 It contains all the files that you want to install on your system, but where it actually becomes interesting is when it comes to the files that describe the metadata about the package and the scripts that may be running, well, on your host when you install it or uninstall it. 06:37.000 --> 06:45.000 And this package metadata is largely comprised of a .BUILDINFO file, which describes the build environment in which the package has been built, 06:45.000 --> 06:53.000 the .MTREE file, which is used only for very limited purposes, but literally is just a compressed libarchive mtree file, 06:53.000 --> 07:02.000 and .PKGINFO, which literally describes all the package metadata that is then used by the package management system. 07:03.000 --> 07:18.000 We also have, well, install scripts that basically can run, yeah, it's like, can run predefined functions on the host, as I said, on installation, update or removal. 07:18.000 --> 07:32.000 These all run as root, so it's a bit scary, actually, that we have this, but that's how many distributions actually deal with these post-installation scenarios to modify the system. 07:32.000 --> 07:47.000 Yeah, this is a nice example of what the .BUILDINFO file looks like. As you see down here, it has some info about what it used to build from, which build tool it used. 07:47.000 --> 08:05.000 You see devtools, which we use for package building. We do have the long list of packages that are installed in that build environment, and this is how you reproduce that state on Arch Linux, basically using this file, and that is pretty simple. 08:06.000 --> 08:17.000 .MTREE is not super interesting, I guess. You can also look it up, it's pretty well defined, I would say.
It literally describes metadata about the files that are contained in that package. 08:18.000 --> 08:46.000 More interesting is actually the .PKGINFO, because it literally gives you very detailed information, and also dynamic information, about the package. If you look at the provides declaration over there, it literally gives you a versioned soname dependency, which you don't have when you're just looking at the static data from the source files, basically. 08:46.000 --> 08:56.000 So yeah, this is literally what pacman then relies on to evaluate what information to display and how to compare packages to one another. 08:57.000 --> 09:09.000 A package repository basically just contains the repository metadata, which you can see up there, it's the repo.db and repo.files files, more on that later. 09:09.000 --> 09:26.000 And pacman just syncs the state of that package repository, which is basically described by these metadata files, and then downloads any package that it wants to update, validates it and installs it. 09:26.000 --> 09:54.000 So we have two types that describe two things. Basically we have the default one that describes packages, and here you see an example of just the package in version one. It has a desc file, to which we will come in a sec, and the lower one also has an additional files file, which contains all the files. 09:56.000 --> 10:24.000 Literally, yeah, that's basically the short form of that. A nice example is here: this desc file describes the state of that package in that repository. It has some additional metadata, such as an OpenPGP signature, but this is also kind of optional, it doesn't need to be in there anymore. 10:24.000 --> 10:35.000 It's just something that we still require for tooling reasons at the moment, but literally it contains a lot of the data that you've seen in the .PKGINFO before.
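The desc files mentioned here use `%SECTION%` header lines, each followed by one value per line and closed by a blank line. A minimal Rust sketch of reading that layout could look as follows. The function name `parse_desc` is hypothetical and the sketch ignores validation entirely; it is not the project's own parser.

```rust
use std::collections::BTreeMap;

/// Collect the values of each `%SECTION%` block of a desc-style file.
///
/// Assumption of this sketch: a section starts at a `%NAME%` line, its values
/// follow one per line, and a blank line ends the section.
fn parse_desc(input: &str) -> BTreeMap<String, Vec<String>> {
    let mut sections: BTreeMap<String, Vec<String>> = BTreeMap::new();
    let mut current: Option<String> = None;

    for line in input.lines() {
        let line = line.trim_end();
        if line.is_empty() {
            // A blank line closes the current section.
            current = None;
        } else if line.len() > 2 && line.starts_with('%') && line.ends_with('%') {
            // Strip the surrounding `%` characters to get the section name.
            let name = line[1..line.len() - 1].to_string();
            sections.entry(name.clone()).or_default();
            current = Some(name);
        } else if let Some(name) = &current {
            sections
                .get_mut(name)
                .expect("section was inserted when its header was read")
                .push(line.to_string());
        }
        // Values outside of any section are silently ignored in this sketch.
    }
    sections
}
```

A parser meant for real use would reject unknown sections and stray values instead of ignoring them, which is the kind of strictness the talk argues for.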
10:36.000 --> 10:43.000 It helps the package management system to make sense of, like, what is there remotely and what it can upgrade to, basically. 10:43.000 --> 10:52.000 The files files are super, super simple, they literally just contain a list of files and that's it. 10:53.000 --> 11:06.000 When we look at the local system, the user's system, then we will find the same files, the desc files, the files files, also the mtree file, and they will basically describe the same thing. 11:06.000 --> 11:14.000 But there's a catch: the local desc file is different from the one that is in the repository metadata. 11:15.000 --> 11:30.000 You will have certain extras, such as validation, and you will have a reason why this is installed, et cetera, so there's some extra metadata encoded in these local desc files. 11:30.000 --> 11:38.000 And this is already part of the user system's database, basically, the state of the system that someone currently has on their system. 11:39.000 --> 11:44.000 You'll find that in /var/lib/pacman, you can find these files in there. 11:45.000 --> 11:51.000 Yeah, the files file is the same, the mtree is literally the same, and it's the other one. 11:52.000 --> 12:08.000 So, yeah, having gone through all of these metadata files, which are like a dry topic, I guess, the question could be, like, what's the motivation behind this entire thing, like why would you want to improve, or what would you want to improve, and why are we looking into this. 12:08.000 --> 12:20.000 So, one of the topics is that what we are using on Arch is a system that is, by now quite old, has roughly half of this. 12:21.000 --> 12:26.000 Thanks to left who actually sits in the audience here, thanks for doing good info. 12:26.000 --> 12:35.000 Yeah, we do have artifact validation, as I mentioned earlier, based on a custom GnuPG keyring.
12:36.000 --> 12:46.000 This is quite painful because it's brittle, it's stateful, and GnuPG is no longer OpenPGP compliant actually, so that's a pain. 12:47.000 --> 12:50.000 We need to do something about that. 12:51.000 --> 13:14.000 We do have a few closed loops within the context. If you're looking at pacman as something that you want to consume as an outside project, then, well, I mean, it's nice that pacman encompasses pacman and makepkg, so the loop of the creation and the consumption is literally in one project. It's great, in some way, but 13:15.000 --> 13:27.000 the changes to the internal file formats and so on that are used by pacman in C are introduced in makepkg in Bash, so it's not very pretty. 13:28.000 --> 13:35.000 And that also means that the changes to these internal file formats, they are defined by 13:36.000 --> 13:47.000 pacman releases, so if you're relying on a certain version or certain behavior of an internal file format, you might be broken by a pacman update, 13:48.000 --> 13:53.000 if you're writing a piece of software that relies on our package ecosystem, basically. 13:54.000 --> 14:06.000 We do have, yeah, some file formats, or most of them, that are not really clearly specified or defined. They don't really have versioning and no deprecation. 14:07.000 --> 14:18.000 That is kind of complex, as you can imagine, and it also means that certain behavior is literally just an implementation detail of pacman or makepkg, potentially. 14:19.000 --> 14:42.000 As I outlined earlier, yeah, producing this in Bash is hard and leads to errors that you can't really guard against easily. It's very hard without really strong unit and integration tests, which we don't really have in a good way, I think. 14:42.000 --> 14:52.000 Yeah, so these file formats, they're not documented, which we want to do something about.
14:53.000 --> 15:05.000 The concepts surrounding them are also not necessarily clearly documented. They exist sometimes as footnotes in other documentation that relates to them, but they're not clearly defined. 15:05.000 --> 15:16.000 And that often makes it implementation specific, so it might be that you have a parser, and it behaves slightly differently because there's no spec. 15:18.000 --> 15:32.000 This means that we don't really have anything else but libalpm to link against currently, but that doesn't help us with all the file formats that we want to consume for metadata reasons. 15:32.000 --> 15:43.000 That leads us to either wrap Bash tooling in our own projects or to reimplement the wheel, basically. 15:44.000 --> 15:59.000 As an example of the former, you will see, like, dbscripts, which is also written in Bash, also for historical reasons, but yeah, that makes it hard to do certain things right, 15:59.000 --> 16:07.000 because it's very hard to do transactions properly, deal with rollbacks and things like that. 16:08.000 --> 16:28.000 A project that tried to literally reimplement the wheel to some degree is repod, which tried to improve on the concept of dbscripts by implementing, yeah, rollback and also transactional behavior for dealing with our package 16:29.000 --> 16:52.000 repositories. It basically needed to implement parsers and specs for all these file formats, again, because either they were not properly defined or only exist in untyped, yeah, contexts, basically. 16:52.000 --> 17:13.000 As I mentioned, we do have an issue with validation in that way. For example, I actually brought this up before, I think in 2023, when I first started working on this project: yeah, you basically can stuff a lot of things into our version comparison and it would not complain about it. 17:14.000 --> 17:37.000 Our packages are linted after the fact, so after building them, we have a tool that literally lints over them.
I think lintian is a very similar approach in Debian's package ecosystem, but, yeah, this could happen earlier in the process and less after the fact. 17:38.000 --> 17:52.000 We do have, because of this fact, a lot of existing parsers now that have varying degrees of compatibility and that implement certain aspects of certain file formats, but not all of them, and they're not official either. 17:52.000 --> 18:16.000 That brings us to the oxidation part, which is fun, I hope. As I mentioned earlier, we want to have more specifications for all these file formats, or better specifications for existing ones, because several versions of some of these file formats actually exist already, and they are still out there, basically. 18:16.000 --> 18:23.000 If you update to a newer version, pacman may not necessarily be able to consume all the versions. 18:23.000 --> 18:41.000 I spoke about the GnuPG topic earlier already. We do want to do something about this and move to something that is stateless and is actually totally agnostic of what you're using it for, 18:41.000 --> 18:52.000 ideally even cross-technology. That's why we've been working on this UAPI spec that is currently under review. 18:52.000 --> 19:05.000 I implore you to have a look at our approach at basically having a very generic way of providing verifiers for operating system artifacts. 19:05.000 --> 19:23.000 Currently only a generic library for the lookup and for the use with OpenPGP exists, but this is an extensible format, basically, and maybe you find this interesting. 19:23.000 --> 19:42.000 When we think about ALPM as a project, as something that is a Rust project that has extensible directions, then we have source, package management, repository and package, roughly, as topics, basically.
19:42.000 --> 20:03.000 In 2023, I started writing this library called alpm-types, which was supposed to contain a lot of common types that we use across all of these metadata files, basically, to be able to validate them properly. 20:03.000 --> 20:10.000 By now, it also contains a lot of documentation for common concepts and also some file formats. 20:10.000 --> 20:18.000 The alpm-parsers library is something quite new. 20:18.000 --> 20:24.000 I think Orhun is probably also somewhere, I think over there somewhere. 20:24.000 --> 20:32.000 Orhun started working on a lot of parsers for all these file types together with Arne, who isn't here anymore today. 20:32.000 --> 20:59.000 But this is mostly winnow-based and has improved quite substantially over what the current status is, where we actually get useful error messages for users of these libraries when they try to consume certain file types, certain file formats, and so on, which actually gives you a meaningful response to what you're doing wrong. 20:59.000 --> 21:12.000 We do have a library that is just internal, for testing. Basically it allows us to integration test against all the live data that we have, that is, the entire package set, all of it. 21:12.000 --> 21:19.000 Literally, it is pretty fast. 21:19.000 --> 21:27.000 What we're currently looking into or trying to close up on is dealing with the ALPM source side of things. 21:27.000 --> 21:41.000 There we are mostly looking into having a clear specification for the SRCINFO file format, and a library that allows us to parse and serialize the SRCINFO file format. 21:41.000 --> 21:48.000 It's currently under review, but it's like 90% done, basically. 21:49.000 --> 21:56.000 When we're looking into the package domain, then we have achieved quite a bit over the last few months already. 21:56.000 --> 22:05.000 We have a specification for the install scriptlet, for the mtree format, for the BUILDINFO and the PKGINFO format.
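The idea of common, validated types shared across all metadata files can be sketched with a small newtype. This is only an illustration: `PackageName` and `PackageNameError` are hypothetical names and not the actual alpm-types API. The rule enforced here follows the PKGBUILD manual page (alphanumerics plus `@ . _ + -`, not starting with a hyphen or a period).

```rust
use std::fmt;

/// A package name that can only be constructed from valid input.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PackageName(String);

/// The ways in which constructing a `PackageName` can fail.
#[derive(Debug, PartialEq, Eq)]
pub enum PackageNameError {
    Empty,
    InvalidStart(char),
    InvalidChar(char),
}

impl PackageName {
    /// Validate `input` once, so every later user can rely on the invariant.
    pub fn new(input: &str) -> Result<Self, PackageNameError> {
        let first = input.chars().next().ok_or(PackageNameError::Empty)?;
        if first == '-' || first == '.' {
            return Err(PackageNameError::InvalidStart(first));
        }
        // Report the first offending character to give a useful error message.
        if let Some(bad) = input
            .chars()
            .find(|c| !(c.is_ascii_alphanumeric() || "@._+-".contains(*c)))
        {
            return Err(PackageNameError::InvalidChar(bad));
        }
        Ok(Self(input.to_string()))
    }
}

impl fmt::Display for PackageName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
```

Because the inner string is private, a parser for any of the file formats can hand out `PackageName` values and downstream code never has to re-check them.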
22:05.000 --> 22:16.000 Likewise, we do have parsers and serializers for these formats now that are already functional. That's pretty nice. 22:19.000 --> 22:26.000 There's still a ton of work to be done, as you can imagine. I mean, there's lots of ground to cover. 22:26.000 --> 22:40.000 We do want to upstream a lot of the stuff that we currently work on. Those are literally changes to makepkg, and to repo-add in the future. 22:40.000 --> 22:48.000 For makepkg, that is very specifically for things like SRCINFO, for PKGINFO and also for BUILDINFO 22:48.000 --> 23:05.000 creation, because then we can actually have validated file formats in the packages that are following some form of design standard, in a way, and that would give you a proper error message when it fails. 23:06.000 --> 23:14.000 We do want, although that is more like on the back burner at the moment, we do want to have exports for other languages, 23:14.000 --> 23:19.000 as we do have some tooling integration, for instance, for libalpm. 23:19.000 --> 23:24.000 That's called pyalpm. That's a wrapper around this. 23:24.000 --> 23:37.000 Obviously, at the end of all of this work, we do have the plan to provide a drop-in replacement for libalpm in the future. 23:37.000 --> 23:47.000 For this, we would need to integrate the VOA specification, that's basically the one for the verification of artifacts. 23:47.000 --> 23:51.000 We need to implement a lot of stuff for that, and repo syncing and package download, 23:51.000 --> 23:58.000 the installation, upgrade, removal, etc. That needs to be compliant with what is currently there. 23:58.000 --> 24:05.000 But as, for instance, libalpm links against GPGME, we don't really want to do that at all. 24:05.000 --> 24:16.000 So drop-in replacement in this case is meant in quotes, because we're not going to link against GPGME.
24:17.000 --> 24:28.000 In the future, we would also like to look into the possibility of unifying some of these existing file formats, because they do share a lot of commonalities. 24:28.000 --> 24:40.000 They have a bunch of overlap that we may just be able to better describe in a structured data format going forward, because that's way easier to parse. 24:40.000 --> 24:51.000 Writing the parsers was quite challenging, I would say, due to little hoops here and there in these file formats. 24:51.000 --> 24:58.000 They're not as trivial as they may seem when you look at them for the first time, because they're not quite INI. 24:58.000 --> 25:04.000 They're not quite this or that, and that makes it very hard. 25:04.000 --> 25:24.000 Given that we want to have the creation part for these files also covered for the repository, this would actually allow us to have better tooling and improve our tooling around repository handling in the future. 25:24.000 --> 25:36.000 This is more like a mid- to long-term goal, I would say, maybe more like next year, although some of this stuff may already be added this year. 25:36.000 --> 25:53.000 Locally this means that, yeah, we have a bunch of libraries that still need to be written and file types that need to be described, et cetera, et cetera. A lot of fun, but also, yeah, lots of interesting work to be done. 25:53.000 --> 26:17.000 Funnily enough, this was nicely funded by the Sovereign Tech Agency. In, I think, October we started working on this with a team of four people, and this funding will go on until the end of this year. 26:17.000 --> 26:39.000 We hope to cover a lot of ground, literally, to be able to provide something that is also sustainable for the future and is able to improve what we use for packaging and how we deliver packages to our users in the future. 26:39.000 --> 26:45.000 You can contact us. Well, you can first of all read a lot of documentation, if you want to, on the website.
26:45.000 --> 26:55.000 We have a repository that combines a bunch of crates, and we do hang out on IRC, you can also join that on Matrix if you want. 26:56.000 --> 27:00.000 Yeah, there's my mail address, et cetera, and here's social, 27:00.000 --> 27:13.000 if you're on the Fediverse. Other than that, here's a slides link, if you want. I'll put that on the website later, but this is, yeah, with a QR code, if you want to read the thing. 27:13.000 --> 27:20.000 There's lots of links in there, so if you're interested, it's probably quite nice. 27:20.000 --> 27:27.000 But that is also literally it, I was probably a bit fast. 27:27.000 --> 27:35.000 If you have any questions, I'll be glad to answer them. 27:35.000 --> 27:57.000 Let's see if I can hear it actually. 27:57.000 --> 28:12.000 Yes, so you said that you want to upstream the changes and provide a drop-in replacement, but you also said that, for example, the file formats, they change with pacman updates. 28:12.000 --> 28:24.000 So what if, for example, there's a big pacman change planned, and then, suddenly, it will not be compatible with the current implementation? 28:24.000 --> 28:37.000 And would it be possible to basically have two buckets, like one written in C and one in Rust, for pacman or the library? 28:37.000 --> 28:41.000 I'm not entirely sure I got the second, the last sentence properly. 28:41.000 --> 28:44.000 I mean, like, what would it be going forward? 28:44.000 --> 28:51.000 It's a little unclear, to be honest, because we're also still collecting information and collecting ideas around this. 28:51.000 --> 28:59.000 As you can imagine, I mean, a lot of this work has been done over the last 20 years, and it's an accumulated process, right?
28:59.000 --> 29:12.000 So even if you're going in the wrong direction, you might not necessarily realize that right away, you might only realize that like 10 years later, and you're like, wait a second, this may not be the greatest idea, why did we actually do this. 29:12.000 --> 29:35.000 And the thing here is that while we're accumulating these ideas on how to improve the situation, how to unify it, it definitely becomes clear that there's overlap. For instance, if you look at BUILDINFO and PKGINFO, there's some overlap, but there's also, like, the idea of separation of concerns for these things. 29:35.000 --> 29:45.000 So in an ideal world, you won't need, like, special super tooling to make sense of a metadata file for your own purpose. 29:45.000 --> 30:01.000 So we need to kind of weigh the pros and cons here as well, but I believe that nonetheless a structured data format that is relying on an already existing standard would be better here for sure. 30:01.000 --> 30:08.000 Do you also try to replace makepkg and the makepkg scripts? 30:08.000 --> 30:14.000 Well, that's not really on the agenda for the work that is currently sponsored. 30:14.000 --> 30:30.000 I mean, there are actually some proofs of concept by Morgan, who has been part of the pacman team for some time, that worked on, 30:31.000 --> 30:43.000 basically, replacing the makepkg tool with a Rust implementation. That has been around like one and a half years or so, I think. 30:43.000 --> 30:59.000 Our current idea is actually to first of all replace, or make it possible to replace, the creation of these metadata files, because they're currently created in Bash and there might be just general trash in there that we don't want. 31:00.000 --> 31:09.000 And we've seen this in some package files that we were testing, and it would be nicer to be more robust against that for sure.
31:09.000 --> 31:28.000 And it is easier to just replace, like, executables in makepkg itself and call them differently depending on how you configure the build, but as is, there's no, like, direct action from our side to replace makepkg. 31:29.000 --> 31:33.000 Hello, sorry, it's just maybe a silly question. 31:33.000 --> 31:35.000 Can you speak up a bit? 31:35.000 --> 31:36.000 Can you hear me? 31:36.000 --> 31:37.000 Yes. 31:37.000 --> 31:38.000 Okay. 31:38.000 --> 31:46.000 So I was going to ask you maybe a silly question, but I wanted to know if libalpm is going to be a dynamic library 31:46.000 --> 31:50.000 when you're going to rewrite it in Rust. 31:51.000 --> 32:09.000 Yeah, I mean, the idea is to have a drop-in replacement that would also be a shared library, literally offering the same C API, with air quotes because of the GnuPG specifics. 32:09.000 --> 32:25.000 We do have some specific buy-in, basically, for GnuPG, and that's something that we need to look into, but that's something that we do not want to reproduce because it doesn't fit VOA at all. 32:25.000 --> 32:26.000 Okay. 32:26.000 --> 32:27.000 Thank you. 32:35.000 --> 32:38.000 Okay, so if there's no more questions, thank you David. 32:38.000 --> 32:39.000 Yep. 32:39.000 --> 32:40.000 Thanks.