WEBVTT 00:00.000 --> 00:10.000 It's about one o'clock, which means the next presentation will start here in the Post-Kaz 00:10.000 --> 00:11.000 Dev Room. 00:11.000 --> 00:18.000 So, let's give a hand to Gultin Genetly Neck, who is about to talk about Post-KazQL locks, 00:18.000 --> 00:20.000 table-level locks. 00:20.000 --> 00:38.000 Okay, how is it now? 00:38.000 --> 00:39.000 Yeah. 00:39.000 --> 00:40.000 Good. 00:40.000 --> 00:45.000 So, today's topic is it says in the presentation, we will talk about table-level locks, 00:45.000 --> 00:46.000 main idea locks. 00:46.000 --> 00:48.000 So, let's start. 00:49.000 --> 00:52.000 A short, like who am I? 00:52.000 --> 00:53.000 Page. 00:53.000 --> 00:56.000 I work in a startup called Zeta. 00:56.000 --> 01:00.000 They do Post-Kaz service and we have our own Post-Kaz tools. 01:00.000 --> 01:02.000 And one of them today, I will mention. 01:02.000 --> 01:09.000 I am also recently recognized a Post-Kaz project as Post-Kaz contributor. 01:09.000 --> 01:15.000 And I have been using Post-Kaz for a long time already and happy to be part of the community. 01:15.000 --> 01:16.000 Thank you for coming. 01:16.000 --> 01:17.000 I am visiting our bedroom. 01:17.000 --> 01:25.000 I also do, like, I run the user group for the product because I live in product for almost eight years now. 01:25.000 --> 01:29.000 So, if you are around and if you are close by to product, you can just visit every month. 01:29.000 --> 01:33.000 We meet and talk Post-Kaz with two talks a month. 01:33.000 --> 01:35.000 And there are other things I do as well. 01:35.000 --> 01:37.000 I have, I am Turkish. 01:37.000 --> 01:40.000 So, I have communities in Turkey that I run as well. 01:40.000 --> 01:46.000 We have this community in Turkey that I have been part of running for ten years now. 01:46.000 --> 01:49.000 And now we are also organizing our own conference. 01:49.000 --> 01:53.000 If you happen to be Istanbul in summer, feel free to join. 01:53.000 --> 01:57.000 And recently also I became a part of the diversity community. 01:57.000 --> 02:02.000 And we are trying to make Post-Kaz welcoming for everybody. 02:02.000 --> 02:06.000 And yeah, that's enough of me. 02:06.000 --> 02:10.000 And I like to come to Post-Kaz a lot. 02:10.000 --> 02:13.000 It is somehow, you know, I don't know how you feel. 02:13.000 --> 02:16.000 But over the years you see that your friends develop your projects. 02:16.000 --> 02:18.000 They develop and you make new friends. 02:18.000 --> 02:21.000 So, yeah, things change, but something stays same. 02:21.000 --> 02:23.000 And that's what I think makes Post-Kaz. 02:23.000 --> 02:27.000 We know that it will be every year in January where February will get together. 02:27.000 --> 02:31.000 And then have this nice conference that we can just talk. 02:31.000 --> 02:33.000 And so, I appreciate it. 02:33.000 --> 02:38.000 I want to share, like, how much I love being here. 02:38.000 --> 02:42.000 And today's talk will be mainly about, like, these are the talk points, 02:42.000 --> 02:44.000 not exactly how the order of the talk will be, 02:44.000 --> 02:47.000 but we'll be talking about MSDC design. 02:47.000 --> 02:49.000 We will have mainly covered a video, 02:49.000 --> 02:51.000 looks, and how the local works in Post-Kaz. 02:51.000 --> 02:55.000 And then how we can arrange the impact of the looks 02:55.000 --> 02:59.000 that happens during the production environments. 02:59.000 --> 03:03.000 And then we will introduce project open source project 03:03.000 --> 03:07.000 that I'm related with SP0. 03:07.000 --> 03:11.000 Okay, so, what is actually, like, 03:11.000 --> 03:14.000 let's go, like, maybe generic and the concept of look. 03:14.000 --> 03:16.000 When I was thinking of writing something about looks, 03:16.000 --> 03:20.000 I was like, I always tend to, like, go to the words 03:20.000 --> 03:24.000 etymology, like, where does the look come from and so on. 03:24.000 --> 03:27.000 So, I had ordered, like, Persian looks, 03:27.000 --> 03:30.000 and history of locking and lock picking and so on. 03:30.000 --> 03:32.000 And then eventually, I noticed, like, 03:32.000 --> 03:35.000 there are some parallels that I can draw between these things 03:35.000 --> 03:37.000 now that I had this books. 03:37.000 --> 03:40.000 To be able to pick a look, you have to understand, like, 03:40.000 --> 03:43.000 how it's inner workings, inner mechanisms work, like, 03:43.000 --> 03:46.000 how the pins interact, what is the, like, tumbler, the wing. 03:46.000 --> 03:49.000 And if you know a position, like, if you know the correct position 03:49.000 --> 03:51.000 of these things, when you enter, when the interactions 03:51.000 --> 03:55.000 of these mechanisms, you can actually open a look, 03:55.000 --> 03:57.000 any type of look without a key, right? 03:57.000 --> 03:59.000 It's, like, the kind of how look picking works. 03:59.000 --> 04:02.000 And the same actually applies for databases as well, 04:02.000 --> 04:05.000 to be able to understand this locking mechanism in databases 04:05.000 --> 04:07.000 in general, in postgres. 04:07.000 --> 04:09.000 You need to understand some of the inner mechanisms 04:09.000 --> 04:11.000 that are intervent here. 04:11.000 --> 04:13.000 And mainly in postgres, this is about, 04:13.000 --> 04:17.000 uh, concurrency, because in the real life in the physical world, 04:17.000 --> 04:20.000 the locking is usually for about security, 04:20.000 --> 04:22.000 securing a save or securing a door and so on. 04:22.000 --> 04:26.000 But in postgres, it's not something we do for securing it, 04:26.000 --> 04:28.000 and it's not in the later security, 04:28.000 --> 04:31.000 but how we will have, how we handle high concurrency, 04:31.000 --> 04:34.000 so that I want to, like, explain a little bit. 04:34.000 --> 04:37.000 And it is not something only postgres uses, obviously, 04:37.000 --> 04:40.000 and it is used everywhere. 04:40.000 --> 04:45.000 And, but how do we, how postgres actually handles it a little bit? 04:45.000 --> 04:48.000 So that, that's why it brings us to topic of MECC, 04:48.000 --> 04:52.000 it's like, I believe everybody knows about it. 04:52.000 --> 04:56.000 Is there anybody who doesn't know what the MECC is in the room? 04:56.000 --> 04:58.000 Yeah. 04:58.000 --> 05:03.000 Okay, but if it refers to for people who actually maybe heard this first time, 05:03.000 --> 05:06.000 it means multi-version concurrency control is so mouthful. 05:06.000 --> 05:08.000 But what does it mean? 05:08.000 --> 05:10.000 When something writes, 05:10.000 --> 05:14.000 when we are writing and updating the data, some data, 05:14.000 --> 05:17.000 it doesn't really change directly in place. 05:17.000 --> 05:21.000 So instead, we do copies of the version, 05:21.000 --> 05:24.000 we multiply the versions of the rows basically, 05:24.000 --> 05:26.000 and then we keep those versions. 05:26.000 --> 05:27.000 And what does it do? 05:27.000 --> 05:29.000 While, let's say, 05:29.000 --> 05:32.000 a query seeing a snapshot of what we have, 05:32.000 --> 05:36.000 and the other queries can get the other versions of the same thing, 05:36.000 --> 05:38.000 without actually blocking each other. 05:38.000 --> 05:41.000 That's the whole concept in a way that we can actually, 05:41.000 --> 05:43.000 when we are writing, we are not blocking the reads, 05:43.000 --> 05:44.000 that's happening at the same time, 05:44.000 --> 05:46.000 and the reads are not blocking the writes. 05:46.000 --> 05:49.000 So this is why it's called like multi-version, 05:49.000 --> 05:53.000 because we have the multi-version of the same row. 05:53.000 --> 05:55.000 And this is very good, 05:55.000 --> 05:57.000 because most of the, 05:57.000 --> 05:59.000 we avoid most of the looking, 05:59.000 --> 06:01.000 thanks to this architecture that we have, 06:01.000 --> 06:03.000 we are using in post-fresh. 06:03.000 --> 06:04.000 And if we didn't have this, 06:04.000 --> 06:06.000 we would probably have to, 06:06.000 --> 06:08.000 we couldn't achieve this level, 06:08.000 --> 06:11.000 we could have probably have to use logs more. 06:11.000 --> 06:12.000 But for example, 06:12.000 --> 06:14.000 for the most of the DML operations, 06:14.000 --> 06:15.000 we are covered, 06:15.000 --> 06:17.000 because it allows like the updates, 06:17.000 --> 06:18.000 delete, merge, 06:18.000 --> 06:19.000 step in, 06:19.000 --> 06:20.000 by using, 06:20.000 --> 06:21.000 utilizing this. 06:21.000 --> 06:22.000 But it is like, 06:22.000 --> 06:24.000 there are some logs that we can't, 06:24.000 --> 06:27.000 we can't avoid some of the looking, 06:27.000 --> 06:29.000 because even though we are able to, 06:29.000 --> 06:32.000 thanks to this multi-version view that we do, 06:32.000 --> 06:34.000 we can actually avoid some of the looking, 06:34.000 --> 06:35.000 but not everything. 06:35.000 --> 06:37.000 So reads still have to look some of the, 06:37.000 --> 06:40.000 we still have to look some of the objects. 06:40.000 --> 06:43.000 So that's why it's important that this bit is, 06:43.000 --> 06:46.000 quite crucial for the, 06:46.000 --> 06:48.000 for how we do the concurrency, 06:48.000 --> 06:49.000 how we do the isolation, 06:49.000 --> 06:51.000 but we can't really avoid not, 06:51.000 --> 06:53.000 not having any looking. 06:57.000 --> 06:59.000 Okay, so, 06:59.000 --> 07:01.000 when I just mentioned like, 07:01.000 --> 07:02.000 in theory, 07:02.000 --> 07:03.000 in Mississippi, 07:03.000 --> 07:05.000 it might not really be clear perhaps. 07:05.000 --> 07:07.000 So that's why I put an example here. 07:07.000 --> 07:09.000 I hope it is readable everywhere in the back. 07:09.000 --> 07:11.000 So what happens? 07:11.000 --> 07:14.000 When a transaction updates data, 07:14.000 --> 07:17.000 as we said, Postgres creates a new version. 07:17.000 --> 07:19.000 And this new version of the role, 07:19.000 --> 07:22.000 but it still has the old version of the role as well. 07:22.000 --> 07:23.000 So we have two versions. 07:23.000 --> 07:26.000 And there are some system, 07:26.000 --> 07:29.000 system columns that it is actually happening at the meantime. 07:29.000 --> 07:31.000 It's not visible to the users, 07:31.000 --> 07:32.000 but if you query you will see it. 07:32.000 --> 07:34.000 So I think two of them, 07:34.000 --> 07:36.000 if you knew two of them that will be useful, 07:36.000 --> 07:38.000 in this example for sure, 07:38.000 --> 07:40.000 X-Men and X-Max columns. 07:40.000 --> 07:41.000 So in the X-Men, 07:41.000 --> 07:44.000 it is the which transaction ID that creates this version. 07:44.000 --> 07:45.000 That's where we market this, 07:45.000 --> 07:47.000 where we put X-Men information. 07:47.000 --> 07:48.000 And the X-Max is like, 07:48.000 --> 07:52.000 this transaction ID that either deletes this version, 07:52.000 --> 07:53.000 or actually updated it, 07:53.000 --> 07:54.000 or if it's empty, 07:54.000 --> 07:55.000 if it's now, 07:55.000 --> 07:56.000 it is the current one that we will see, 07:56.000 --> 08:00.000 which we will check in the example soon. 08:00.000 --> 08:02.000 And here, 08:02.000 --> 08:07.000 this example is based on the default behavior of the isolation level in Postgres. 08:07.000 --> 08:08.000 It is read, read, 08:08.000 --> 08:11.000 so we have multiple level of transaction isolation, 08:11.000 --> 08:13.000 which you can see the details, 08:13.000 --> 08:16.000 but this is the one that is default behavior to Postgres, 08:16.000 --> 08:20.000 and I will explain this example based on the read committed isolation level. 08:20.000 --> 08:23.000 So what happens when, 08:23.000 --> 08:25.000 when Postgres creates this, 08:25.000 --> 08:27.000 every time a statement starts, 08:27.000 --> 08:29.000 we create a snapshot. 08:29.000 --> 08:31.000 So what has this snapshot in high level? 08:31.000 --> 08:33.000 It keeps the older transaction ID, 08:33.000 --> 08:34.000 is the tar running gate to moment. 08:34.000 --> 08:36.000 It has the latest committed transaction ID, 08:36.000 --> 08:39.000 and it has the list of in-progress transactions as well. 08:39.000 --> 08:43.000 And now we can check the example that will be much easier. 08:43.000 --> 08:44.000 So in the first, 08:44.000 --> 08:45.000 there are two transactions, 08:45.000 --> 08:47.000 you see like there is transaction ID, 08:47.000 --> 08:48.000 100 and transaction ID, 08:48.000 --> 08:49.000 120. 08:49.000 --> 08:50.000 When we see the select, 08:50.000 --> 08:52.000 there is like employee of Alice, 08:52.000 --> 08:54.000 and it's salary is 50,000. 08:54.000 --> 08:55.000 And then it starts, 08:55.000 --> 08:56.000 it gets the snapshot, 08:56.000 --> 08:58.000 that is the view that it has. 08:58.000 --> 08:59.000 It has this information, 08:59.000 --> 09:01.000 and it sees salary at 50,000. 09:01.000 --> 09:03.000 And you see there, 09:03.000 --> 09:06.000 the rows X max is the null. 09:06.000 --> 09:08.000 And it means that this is where we are, 09:08.000 --> 09:09.000 that's what we see. 09:09.000 --> 09:10.000 And the transaction ID, 09:10.000 --> 09:12.000 99 is already committed. 09:12.000 --> 09:14.000 That's the transaction before the transaction, 09:14.000 --> 09:16.000 that's how the view is, 09:16.000 --> 09:17.000 that's how we see it. 09:17.000 --> 09:20.000 And then the second transaction, 09:20.000 --> 09:21.000 the 101, 09:21.000 --> 09:25.000 is updates to salary to 60,000 in the another transaction. 09:25.000 --> 09:27.000 And then that, 09:27.000 --> 09:30.000 that this one more version of the row. 09:30.000 --> 09:34.000 And now we see that the all version is marked with the 100,000, 09:34.000 --> 09:35.000 because this is the transaction, 09:35.000 --> 09:37.000 that the second transaction updating it, 09:37.000 --> 09:38.000 because it's like, 09:38.000 --> 09:39.000 it kind of makes it, 09:39.000 --> 09:40.000 it says, 09:40.000 --> 09:42.000 okay, I'm updating this now, 09:42.000 --> 09:44.000 and I created another transaction, 09:44.000 --> 09:47.000 another row with the X mean, 09:47.000 --> 09:49.000 being the same transaction ID. 09:49.000 --> 09:52.000 So it marks it as a 100,000 there. 09:52.000 --> 09:53.000 And then why will we, 09:53.000 --> 09:54.000 this is happening, 09:54.000 --> 09:56.000 unless the transaction ID, 09:56.000 --> 09:57.000 second transaction, 09:57.000 --> 09:58.000 100,000, 09:59.000 --> 10:02.000 when I have another select coming with the transaction 100, 10:02.000 --> 10:04.000 I will still see the 50,000 salary, 10:04.000 --> 10:05.000 because nothing is committed. 10:05.000 --> 10:08.000 But by the time that the update is actually committed, 10:08.000 --> 10:10.000 the second transaction, 10:10.000 --> 10:13.000 since each statement gets an upstairs snapshot, 10:13.000 --> 10:14.000 this time, 10:14.000 --> 10:15.000 the third select on the left column, 10:15.000 --> 10:19.000 you will see the current salary as 60,000. 10:19.000 --> 10:20.000 At that point, 10:20.000 --> 10:22.000 we will be actually being informed, 10:22.000 --> 10:24.000 this transaction is committed, 10:24.000 --> 10:27.000 and then we have the new snapshot with this information, 10:27.000 --> 10:29.000 it's not anymore visible, 10:29.000 --> 10:32.000 like when you start a new snapshot with this statement, 10:32.000 --> 10:35.000 you will not be able to see the old version anymore. 10:35.000 --> 10:36.000 So this is how, 10:36.000 --> 10:37.000 thanks to this, 10:37.000 --> 10:38.000 we are actually managing, 10:38.000 --> 10:39.000 let's say, 10:39.000 --> 10:41.000 there's a long-running transaction in Postgres, 10:41.000 --> 10:42.000 and we are still doing some operations, 10:42.000 --> 10:44.000 it can still has its own version of, 10:44.000 --> 10:45.000 its own snapshot, 10:45.000 --> 10:47.000 its own view of world basically. 10:47.000 --> 10:48.000 And until it is done, 10:48.000 --> 10:51.000 we still have this rows like being kept. 10:51.000 --> 10:52.000 By the time, 10:52.000 --> 10:53.000 when the time that this is actually, 10:53.000 --> 10:54.000 not anymore in use, 10:54.000 --> 10:56.000 that's why it's important to have, 10:56.000 --> 10:57.000 like, 10:57.000 --> 10:58.000 you can easily think of that, 10:58.000 --> 11:00.000 these rows are getting collected. 11:00.000 --> 11:03.000 So we have to think about the vacuum operations, 11:03.000 --> 11:05.000 we have to tune our system for that, 11:05.000 --> 11:07.000 you know, the knowledge of this architecture, 11:07.000 --> 11:09.000 we should be aware of, 11:09.000 --> 11:10.000 like, you know, 11:10.000 --> 11:11.000 I have to think about the regular vacuuming, 11:11.000 --> 11:12.000 I need to tune things, 11:12.000 --> 11:14.000 I need to monitor my long-running queries, 11:14.000 --> 11:15.000 because the more the query is long-running, 11:15.000 --> 11:18.000 the more we have to keep this version of raw, 11:18.000 --> 11:21.000 that we can get rid of basically. 11:21.000 --> 11:22.000 So there are, like, 11:22.000 --> 11:23.000 somethings like, 11:23.000 --> 11:24.000 of course, 11:24.000 --> 11:25.000 that's implied with this architecture, 11:25.000 --> 11:26.000 but thanks to it, 11:26.000 --> 11:28.000 we don't need to look all the time. 11:28.000 --> 11:29.000 So there are, like, 11:29.000 --> 11:32.000 we can easily cover most of the ML operations, 11:32.000 --> 11:33.000 thanks to this architecture, 11:33.000 --> 11:35.000 but we can't avoid everything. 11:35.000 --> 11:36.000 So we still have to have logs. 11:36.000 --> 11:37.000 And this is what, 11:37.000 --> 11:38.000 topic of today will be, 11:38.000 --> 11:39.000 we'll just cover, 11:39.000 --> 11:40.000 like, 11:40.000 --> 11:41.000 video logs, 11:41.000 --> 11:43.000 and we can't avoid most of them. 11:43.000 --> 11:45.000 Okay. 11:45.000 --> 11:47.000 And so also, 11:47.000 --> 11:48.000 like, 11:48.000 --> 11:50.000 I want you to think a little bit like, 11:50.000 --> 11:51.000 okay, 11:51.000 --> 11:52.000 this is how it works, 11:52.000 --> 11:53.000 but why do we look? 11:53.000 --> 11:54.000 Because, like, 11:54.000 --> 11:56.000 it doesn't seem like nature, 11:56.000 --> 11:57.000 like, 11:57.000 --> 11:58.000 because it brings you, 11:58.000 --> 11:59.000 what does it bring? 11:59.000 --> 12:00.000 It reduces our throughput, 12:00.000 --> 12:01.000 you know, 12:01.000 --> 12:02.000 that it will increase latency, 12:02.000 --> 12:03.000 probably, 12:03.000 --> 12:05.000 it means a lot of performance, 12:05.000 --> 12:06.000 and like, 12:06.000 --> 12:07.000 why do we have to look at it? 12:07.000 --> 12:08.000 But then, 12:08.000 --> 12:09.000 if my intention is, 12:09.000 --> 12:10.000 like, 12:10.000 --> 12:11.000 I want to make data correct, 12:11.000 --> 12:12.000 and, 12:12.000 --> 12:14.000 and every time somebody queries the database, 12:14.000 --> 12:17.000 I want them to get correct results. 12:17.000 --> 12:20.000 So if I have to deliver correct results, 12:20.000 --> 12:22.000 without getting things messed up, 12:22.000 --> 12:24.000 you have to have an agreement sort of that's, 12:24.000 --> 12:25.000 like, 12:25.000 --> 12:26.000 okay, 12:26.000 --> 12:27.000 I have to give up something, 12:27.000 --> 12:28.000 and that's why, 12:28.000 --> 12:29.000 actually, 12:29.000 --> 12:30.000 how these transaction isolations 12:30.000 --> 12:32.000 and concurrency model works, 12:32.000 --> 12:35.000 because I could just show something randomly, 12:35.000 --> 12:37.000 I could just deliver wrong results, 12:37.000 --> 12:38.000 but we tried to avoid it. 12:38.000 --> 12:41.000 So that's the whole point of why we actually look, 12:41.000 --> 12:44.000 because we try to keep things in order and incorrect, 12:44.000 --> 12:45.000 and it will take time, 12:45.000 --> 12:47.000 it doesn't come free. 12:47.000 --> 12:48.000 But, 12:48.000 --> 12:49.000 imagine a scenario, 12:49.000 --> 12:50.000 like, 12:50.000 --> 12:51.000 two edges of something, 12:51.000 --> 12:52.000 like, 12:52.000 --> 12:53.000 the scenario that we never looked, 12:53.000 --> 12:54.000 the same, 12:54.000 --> 12:55.000 that is lifted off, 12:55.000 --> 12:56.000 we are in this, 12:56.000 --> 12:57.000 like, 12:57.000 --> 12:58.000 on our keyword, 12:58.000 --> 12:59.000 there's nothing looked, 12:59.000 --> 13:00.000 about what happened. 13:00.000 --> 13:03.000 In one example, 13:03.000 --> 13:05.000 maybe easy that I had prepared for this, 13:05.000 --> 13:07.000 like a financial report is running, 13:07.000 --> 13:08.000 like the year, 13:08.000 --> 13:10.000 and salary upgrades, 13:10.000 --> 13:13.000 and the HR is calculating the averages, 13:13.000 --> 13:16.000 and maybe somebody is creating an audit column. 13:16.000 --> 13:17.000 Normally, 13:17.000 --> 13:19.000 if you didn't visit the set of log, 13:19.000 --> 13:20.000 post-press, 13:20.000 --> 13:21.000 when you are trying to add a column, 13:21.000 --> 13:22.000 it will have to, 13:22.000 --> 13:24.000 it will have access exclusive log. 13:24.000 --> 13:25.000 But imagine in this scenario, 13:25.000 --> 13:27.000 we don't have something like this, 13:27.000 --> 13:29.000 so we are actually, 13:29.000 --> 13:30.000 we are definitely creating, 13:30.000 --> 13:32.000 like, corruption issues, 13:32.000 --> 13:33.000 our schema changes, 13:33.000 --> 13:35.000 and then we can't actually do the query. 13:35.000 --> 13:37.000 So it will be messy. 13:37.000 --> 13:39.000 And that's why we are trying to avoid it, 13:39.000 --> 13:42.000 why having these logings in where we have to do, 13:42.000 --> 13:44.000 when we have to change the data definition, 13:44.000 --> 13:45.000 basically. 13:45.000 --> 13:48.000 And the another scenario that we are looking, 13:48.000 --> 13:49.000 let's say everything, 13:49.000 --> 13:51.000 like every transaction is waiting for each other. 13:51.000 --> 13:52.000 When there's a query, 13:52.000 --> 13:54.000 we are looking to a whole table. 13:54.000 --> 13:55.000 This could also work. 13:55.000 --> 13:57.000 If there is, let's say, 13:57.000 --> 14:00.000 one user using one table or like few users using, 14:00.000 --> 14:02.000 that they never use the same resources, 14:02.000 --> 14:04.000 same rows and same tables at the same time. 14:04.000 --> 14:05.000 That could also work, 14:05.000 --> 14:06.000 but in real life, 14:06.000 --> 14:07.000 it doesn't work like that. 14:07.000 --> 14:09.000 So we have to have at least a few users, 14:09.000 --> 14:10.000 at least that, 14:10.000 --> 14:12.000 and they are always sharing the same resources. 14:12.000 --> 14:14.000 So that's why we actually need to log stuff, 14:14.000 --> 14:16.000 because in real life, this rarely happens. 14:16.000 --> 14:18.000 That's why I think when you think about the two edges, 14:18.000 --> 14:19.000 you understand, 14:19.000 --> 14:20.000 okay, this is reasonable, 14:20.000 --> 14:22.000 that we have to, for the correctness. 14:22.000 --> 14:24.000 We have to have this isolation levels, 14:24.000 --> 14:27.000 and we have to actually add you to some standard. 14:27.000 --> 14:32.000 Okay. 14:32.000 --> 14:36.000 And there are like different type of log-mosts in postgres. 14:36.000 --> 14:37.000 These are not, 14:37.000 --> 14:39.000 we don't only have table-level rocks. 14:39.000 --> 14:41.000 We have also roll-level logs. 14:41.000 --> 14:43.000 We have advisor logs that you can actually use. 14:44.000 --> 14:46.000 Your self or your kind of application. 14:46.000 --> 14:49.000 There are like other logs that like weight logs that are in the, 14:49.000 --> 14:50.000 you don't usually see the material. 14:50.000 --> 14:52.000 So there are lots of type of logs that we can see. 14:52.000 --> 14:53.000 But today, 14:53.000 --> 14:56.000 the most that we will talk about is like table-level logs 14:56.000 --> 14:59.000 that are video logs that I will cover. 14:59.000 --> 15:02.000 So, why I think that is important to understand, 15:02.000 --> 15:04.000 because schemas that, 15:04.000 --> 15:07.000 your database schemas rarely stay static, 15:07.000 --> 15:08.000 they evolve. 15:08.000 --> 15:11.000 So you might add a column, 15:11.000 --> 15:14.000 you may change the, I don't know, type of the column. 15:14.000 --> 15:16.000 You can add a constraint, 15:16.000 --> 15:17.000 drop a partition. 15:17.000 --> 15:18.000 I don't know. 15:18.000 --> 15:20.000 So there are things that you can't, 15:20.000 --> 15:21.000 you have to do these things. 15:21.000 --> 15:23.000 And while you do these operations, 15:23.000 --> 15:24.000 you have to have some sort of knowledge, 15:24.000 --> 15:26.000 like what kind of impact this operation 15:26.000 --> 15:28.000 will have in my system. 15:28.000 --> 15:30.000 And that's where with the idea, 15:30.000 --> 15:33.000 you probably most of the author-table comments, 15:33.000 --> 15:35.000 and their different levels of supplements. 15:35.000 --> 15:37.000 We'll have the exclusive logs, 15:37.000 --> 15:39.000 access, exclusive logs. 15:39.000 --> 15:41.000 And that's why it will create 15:41.000 --> 15:43.000 for potential issues with your application, 15:43.000 --> 15:45.000 timeouts, and even downtime. 15:45.000 --> 15:47.000 So that's why it's important to understand 15:47.000 --> 15:48.000 how this thing's happened, 15:48.000 --> 15:50.000 and what kind of looks better acquired, 15:50.000 --> 15:52.000 and how you can actually minimize 15:52.000 --> 15:55.000 this issues that might be related to the logs. 15:55.000 --> 15:59.000 Yeah, I think I covered everything, 15:59.000 --> 16:01.000 but for example, 16:01.000 --> 16:02.000 when you run a DDL, 16:02.000 --> 16:05.000 you assume that maybe that might 16:05.000 --> 16:06.000 block other DDLs, 16:06.000 --> 16:08.000 but it will also block the manipulation, 16:08.000 --> 16:11.000 as well, like updates and even select. 16:11.000 --> 16:12.000 So that is very important, 16:12.000 --> 16:14.000 because I think that is not naturally 16:14.000 --> 16:16.000 what comes to mind when you are running 16:16.000 --> 16:17.000 an author-table, 16:17.000 --> 16:18.000 and I already should be looking 16:18.000 --> 16:19.000 some select for example. 16:19.000 --> 16:23.000 So we have to be aware that running this DDL 16:23.000 --> 16:26.000 commands will actually impact all sorts of queries, 16:26.000 --> 16:29.000 and we will probably block all the access to your table. 16:29.000 --> 16:31.000 That's why we will have to understand, 16:31.000 --> 16:33.000 like, okay, running one author-table 16:33.000 --> 16:35.000 with different supplements will impact 16:35.000 --> 16:37.000 the system differently than with another one. 16:37.000 --> 16:40.000 That's what we will go through deeper later. 16:40.000 --> 16:44.000 And I divided the presentation with 16:44.000 --> 16:46.000 some kind of checkpoints. 16:46.000 --> 16:48.000 So if you don't take anything from the presentation, 16:48.000 --> 16:50.000 just take away this points that I will show, 16:50.000 --> 16:51.000 and there are, I think, 16:51.000 --> 16:53.000 six or seven of them. 16:53.000 --> 16:55.000 I hope this will be useful. 16:55.000 --> 16:57.000 So for up until this is now, 16:57.000 --> 16:58.000 up until this point, 16:58.000 --> 17:01.000 I think this is the important takeaway that we have. 17:01.000 --> 17:03.000 So we talk about MECC, 17:03.000 --> 17:05.000 this is the approach that Postgres makes it 17:05.000 --> 17:08.000 highly efficient for the concurrent DM operations, 17:08.000 --> 17:10.000 because we are not modifying data in place. 17:10.000 --> 17:12.000 We are just writing in copies of data. 17:12.000 --> 17:14.000 So this will allow us to, 17:14.000 --> 17:16.000 as we said, write, not blocking reads, 17:16.000 --> 17:18.000 and read, not blocking writes. 17:18.000 --> 17:20.000 But this, even in this system, 17:20.000 --> 17:21.000 some level of looking, 17:21.000 --> 17:23.000 we will not be able to avoid. 17:23.000 --> 17:26.000 That's why I think it's important to know 17:26.000 --> 17:29.000 different versions of different DDL commands. 17:29.000 --> 17:31.000 We'll have different levels of looks. 17:31.000 --> 17:33.000 So we have to be aware of this 17:33.000 --> 17:35.000 where I level of strengths of the looks, 17:35.000 --> 17:40.000 that will be the rest of the presentation about. 17:40.000 --> 17:42.000 Okay, so I keep saying like, 17:42.000 --> 17:44.000 looks and what are those. 17:44.000 --> 17:46.000 The list that I'm showing is like, 17:46.000 --> 17:48.000 the category in the table level, 17:48.000 --> 17:49.000 looks in Postgres. 17:49.000 --> 17:53.000 I marked the first three is a bit more lighter, 17:53.000 --> 17:56.000 and the other ones are a bit more darker. 17:56.000 --> 18:00.000 So it starts from the more restrictive, 18:01.000 --> 18:03.000 less restrictive one to the more restrictive 18:03.000 --> 18:06.000 when you go from top to bottom. 18:06.000 --> 18:09.000 Because, as you see, for example, 18:09.000 --> 18:11.000 in the access share, 18:11.000 --> 18:14.000 this is when selects are actually acquiring the slope, 18:14.000 --> 18:16.000 and they are not CDL, right? 18:16.000 --> 18:17.000 And then roll share, 18:17.000 --> 18:18.000 these are like select for share, 18:18.000 --> 18:20.000 update also requiring this, 18:20.000 --> 18:22.000 and then the roll of exclusive is required 18:22.000 --> 18:23.000 by the DMN operations, 18:23.000 --> 18:26.000 so these are not only required by the DDL looks, 18:26.000 --> 18:27.000 that's why I kind of like, 18:27.000 --> 18:29.000 logically separate the middle of the slope 18:29.000 --> 18:32.000 from the maintenance operations. 18:32.000 --> 18:35.000 I make the mark as like more restrictive, 18:35.000 --> 18:38.000 there are more scale and stronger looks. 18:38.000 --> 18:41.000 So you can see all this, 18:41.000 --> 18:44.000 like, which statement gets what type of look 18:44.000 --> 18:46.000 from the Postgres documentation, 18:46.000 --> 18:49.000 and there's I think some website even check the Postgres code 18:49.000 --> 18:51.000 and like shows it like when you type which command 18:51.000 --> 18:53.000 it will take what kind of look. 18:53.000 --> 18:54.000 So this bit is, 18:54.000 --> 18:56.000 I think, is kind of important to understand, 18:56.000 --> 18:58.000 but it's not as important, 18:58.000 --> 19:02.000 I think, to understand how these different modes of looks 19:02.000 --> 19:04.000 interact with each other. 19:04.000 --> 19:06.000 Like when I'm trying to acquire some look 19:06.000 --> 19:08.000 and there's already an existing look, 19:08.000 --> 19:10.000 how this will acquire, 19:10.000 --> 19:12.000 how they will be behaving against each other. 19:12.000 --> 19:13.000 So they are connection, 19:13.000 --> 19:15.000 because not every look, 19:15.000 --> 19:18.000 every mode will block the other type of mode, 19:18.000 --> 19:21.000 some of them will allow each other to work in parallel in multiple, 19:21.000 --> 19:22.000 at the same time, 19:22.000 --> 19:24.000 but some of them will block basically 19:24.000 --> 19:26.000 if you have the some type of look, 19:26.000 --> 19:29.000 it will be blocking the others that are trying to acquire the same look. 19:29.000 --> 19:31.000 So it's important to understand this relationship, 19:31.000 --> 19:34.000 and Postgres is a matrix that shows it clearly. 19:34.000 --> 19:36.000 So I recommend you to check the documentation 19:36.000 --> 19:38.000 about exclusive looking page, 19:38.000 --> 19:40.000 I think that you can see this matrix 19:40.000 --> 19:43.000 how they are interacting. 19:43.000 --> 19:47.000 Okay, so that will bring us to, 19:47.000 --> 19:48.000 sorry, 19:48.000 --> 19:53.000 to what kind of things I should understand from this list that I show. 19:54.000 --> 19:57.000 What is the very important of course, 19:57.000 --> 20:01.000 access to exclusive will not only block the other, 20:01.000 --> 20:04.000 like DDLs and other rights, 20:04.000 --> 20:06.000 but like it will also block the reads. 20:06.000 --> 20:09.000 That means that that table that we are trying to access 20:09.000 --> 20:11.000 will be blocked for everything. 20:11.000 --> 20:15.000 So that's really important when you are considering how to avoid this, 20:15.000 --> 20:18.000 that's why we are trying to minimize the impact of it. 20:18.000 --> 20:22.000 And some of the key things that understand it, 20:22.000 --> 20:24.000 Postgres, if it is possible, 20:24.000 --> 20:28.000 it will try to always acquire the less restrictive look. 20:28.000 --> 20:31.000 There are optimizations that are already included, 20:31.000 --> 20:34.000 so it will not do something that if it can avoid. 20:34.000 --> 20:35.000 So that's good to know, 20:35.000 --> 20:42.000 and it's not like it's trying to make it the most out of this looking. 20:42.000 --> 20:48.000 And yeah, at the end, I mentioned something like, 20:48.000 --> 20:51.000 a transaction when a transaction takes a look, 20:51.000 --> 20:54.000 it will keep it until the end of the transaction. 20:54.000 --> 20:57.000 And this will bring me to actually the most critical lesson 20:57.000 --> 20:59.000 from this to take away is that, 20:59.000 --> 21:02.000 imagine a transaction that you started and it has multiple statements, 21:02.000 --> 21:04.000 and it took, let's say, access exclusive. 21:04.000 --> 21:06.000 But you are doing, 21:06.000 --> 21:09.000 while you do this DDL change, 21:09.000 --> 21:11.000 you're also doing some, I don't know, 21:11.000 --> 21:13.000 update some DML that you are running. 21:13.000 --> 21:16.000 Normally, if you didn't put these things in the same transaction, 21:16.000 --> 21:18.000 this state must in the same transaction, 21:18.000 --> 21:20.000 you would be actually taking access exclusive, 21:20.000 --> 21:22.000 maybe with the first operation, 21:22.000 --> 21:23.000 and then in the second transaction, 21:23.000 --> 21:26.000 you will have the DML operation that you wouldn't need to, 21:26.000 --> 21:27.000 have access exclusive, 21:27.000 --> 21:29.000 maybe it will have the raw exclusive look. 21:29.000 --> 21:31.000 By putting them in the same transaction, 21:31.000 --> 21:34.000 you are actually blocking the whole thing with access exclusive, 21:34.000 --> 21:37.000 until all the state must in the transaction finishes. 21:37.000 --> 21:39.000 There's something that I think not everybody realizes, 21:39.000 --> 21:41.000 because I see that it is kind of a pattern 21:41.000 --> 21:43.000 that all the things put together, 21:43.000 --> 21:46.000 and the DLs and the MS people use them is mixed. 21:46.000 --> 21:50.000 So I would recommend to not mix this different level, 21:50.000 --> 21:52.000 acquiring statements. 21:52.000 --> 21:54.000 So when you are running a transaction, 21:54.000 --> 21:55.000 try to minimize this impact, 21:55.000 --> 21:56.000 that if you otherwise, 21:56.000 --> 21:58.000 the until all the things finishes, 21:58.000 --> 22:01.000 the transaction will keep the look that it's already true. 22:01.000 --> 22:04.000 So I recommend people to don't mix this comments. 22:04.000 --> 22:08.000 And then it will be minimizing this looking 22:08.000 --> 22:11.000 for operation that is needlessly taken by this transaction 22:11.000 --> 22:14.000 until the end of this transaction. 22:16.000 --> 22:21.000 Okay, and this works as also the how postpress handles the look you, 22:21.000 --> 22:24.000 how does it work in postpress. 22:24.000 --> 22:28.000 So I said like what important is not one statement 22:28.000 --> 22:30.000 is getting one type of look, 22:30.000 --> 22:33.000 but how they are actually relating to each other. 22:33.000 --> 22:37.000 Because when I'm when a transaction is requesting a look 22:37.000 --> 22:40.000 that's trying to require a acquire a look, 22:40.000 --> 22:43.000 and then it's already there's a transaction already held, 22:43.000 --> 22:46.000 there's a look that already held by another transaction. 22:46.000 --> 22:49.000 This one that we are trying to acquire will have to wait. 22:49.000 --> 22:51.000 And then the one after that will come, 22:51.000 --> 22:53.000 that we will be blocked by after this as well. 22:53.000 --> 22:55.000 So we will just keep adding to this queue, 22:55.000 --> 22:59.000 and this will like make a cascading impact on the system. 22:59.000 --> 23:04.000 And again, maybe this is something not easily be understood, 23:04.000 --> 23:07.000 and that's why I think show you some example 23:07.000 --> 23:09.000 the colleague actually works. 23:09.000 --> 23:12.000 And if you didn't specify a look timeout, 23:12.000 --> 23:16.000 this queue will the transaction that gets into the queue 23:16.000 --> 23:18.000 will wait indefinitely. 23:18.000 --> 23:19.000 So that's really important. 23:19.000 --> 23:21.000 So if you didn't set up anything, 23:21.000 --> 23:26.000 you will notice that this will definitely go on 23:26.000 --> 23:28.000 and on until you get probably applications slowed down, 23:29.000 --> 23:30.000 and maybe down time. 23:30.000 --> 23:31.000 So it's really important, 23:31.000 --> 23:33.000 especially when you're running details, 23:33.000 --> 23:36.000 to make sure that you have some look timeout set up, 23:36.000 --> 23:37.000 which we will talk later. 23:41.000 --> 23:43.000 And maybe one thing that I should also mention, 23:43.000 --> 23:46.000 normally, we would expect that we have the system, 23:46.000 --> 23:47.000 you'll peachy looks, 23:47.000 --> 23:50.000 and you would expect to see this looks in peachy looks clearly, 23:50.000 --> 23:52.000 but it is not there. 23:52.000 --> 23:57.000 So you have to actually use this PG blocking PID function 23:57.000 --> 24:00.000 to see which backhand is blocking what backhand is, 24:00.000 --> 24:03.000 which looks like, by the blocking what process. 24:03.000 --> 24:05.000 So that's also important, you know, 24:05.000 --> 24:07.000 which is not so straightforward, 24:07.000 --> 24:09.000 like to understand what's blocking what's just, 24:09.000 --> 24:12.000 you can't really see this in the PG looks. 24:15.000 --> 24:16.000 Okay. 24:18.000 --> 24:21.000 As I said, if you didn't set up the look timeout, 24:21.000 --> 24:25.000 we would have this waiting transactions will wait forever, 24:25.000 --> 24:27.000 that's why it's important to set it. 24:27.000 --> 24:29.000 But then again, it's not like something, 24:29.000 --> 24:31.000 you can just say, okay, I set the look timeout, 24:31.000 --> 24:34.000 all this covered now, I will have no issues. 24:34.000 --> 24:35.000 When you set the look timeout, 24:35.000 --> 24:37.000 then you will maybe get the timeout, right, 24:37.000 --> 24:39.000 when you're running this commands. 24:39.000 --> 24:40.000 And what happens? 24:40.000 --> 24:45.000 You have to think about how to handle this timeout gracefully, 24:45.000 --> 24:48.000 and you have to make a mechanism to retry mechanism, 24:48.000 --> 24:50.000 or if somebody's actually running these things, 24:50.000 --> 24:52.000 they have to try this retry logic implemented, 24:52.000 --> 24:55.000 to otherwise pick the application will probably get errors, 24:55.000 --> 24:57.000 and we don't understand what's happening. 24:57.000 --> 24:59.000 So database is giving some error and application, 24:59.000 --> 25:00.000 not understand it. 25:00.000 --> 25:03.000 So we have to make sure, like if you are sitting this look timeout, 25:03.000 --> 25:06.000 you should also be doing something in case things get timeout. 25:06.000 --> 25:11.000 They need to retry it or revert it through back and so on. 25:11.000 --> 25:12.000 Okay. 25:15.000 --> 25:20.000 And now that we kind of covered the theory of how the looking queue works, 25:20.000 --> 25:22.000 the look-contention post-press. 25:22.000 --> 25:25.000 I wanted to show a simple, like, few query examples 25:25.000 --> 25:28.000 that to display you how it works in real life. 25:28.000 --> 25:32.000 So imagine a select query that is like running for a very long time, 25:32.000 --> 25:34.000 and it is having an access share look. 25:34.000 --> 25:36.000 The access share look is the one that's selects day. 25:36.000 --> 25:38.000 It is the less disruptive one. 25:38.000 --> 25:41.000 It can be used by multiple transactions at the same time. 25:41.000 --> 25:45.000 It almost conflicts with nothing besides this access exclusive look. 25:45.000 --> 25:49.000 But there is this thing that before I started my schema change, 25:49.000 --> 25:51.000 and this select is running. 25:51.000 --> 25:54.000 And I'm trying to run this second comment, 25:54.000 --> 25:56.000 author-table-detached partition. 25:56.000 --> 25:58.000 This specific example is given, 25:58.000 --> 26:00.000 because it's happened in the production, 26:00.000 --> 26:02.000 which has a giant payment system, 26:02.000 --> 26:05.000 because the tool that they were using didn't 26:05.000 --> 26:08.000 it concurrently at the end for this comment, 26:08.000 --> 26:11.000 and it really like co-stop time. 26:11.000 --> 26:13.000 So that's why this particular author-table-detached partition 26:13.000 --> 26:15.000 that you might not think, 26:15.000 --> 26:18.000 the partition might actually influence the main table, 26:18.000 --> 26:19.000 but it did. 26:19.000 --> 26:23.000 So let's say that I'm trying to acquire this will 26:23.000 --> 26:26.000 actually normally need to be the brief access exclusive look. 26:26.000 --> 26:29.000 But it can be that because the select that is running already 26:29.000 --> 26:31.000 for a long time, it has the access share look, 26:31.000 --> 26:34.000 and this access exclusive that I'm trying to acquire, 26:34.000 --> 26:36.000 it actually conflicts with it. 26:36.000 --> 26:37.000 So I can't take it. 26:37.000 --> 26:38.000 So what happens? 26:38.000 --> 26:41.000 The author-table is placed in this lock queue that I mentioned, 26:41.000 --> 26:44.000 and then imagine there are other 20, 30, 26:44.000 --> 26:45.000 they can just try and do, 26:45.000 --> 26:47.000 they also try to like simple primary key lookups, 26:47.000 --> 26:48.000 sub-selects, 26:48.000 --> 26:50.000 but they can't do anything either. 26:50.000 --> 26:53.000 So now they have to queue behind this author-detached partition 26:53.000 --> 26:54.000 command. 26:54.000 --> 26:56.000 So they are all conflicting, 26:56.000 --> 26:59.000 and so what happens that they go back behind this command, 26:59.000 --> 27:02.000 and then until the long-running select finishes, 27:02.000 --> 27:05.000 and until this then author-table finishes, 27:05.000 --> 27:07.000 then they will get their chance. 27:07.000 --> 27:11.000 So we just have this like-custating impact of the effects. 27:11.000 --> 27:13.000 So you would normally not think probably, 27:13.000 --> 27:17.000 because I don't know if all of you check before you run the DDL, 27:17.000 --> 27:19.000 is there some long-running select. 27:19.000 --> 27:20.000 People watching, 27:20.000 --> 27:21.000 yeah, simple select, 27:21.000 --> 27:22.000 somebody's running a view somewhere, 27:22.000 --> 27:24.000 it just runs like an array for, 27:24.000 --> 27:25.000 I don't know, this is how it runs. 27:25.000 --> 27:27.000 But before we actually do this scheme, 27:27.000 --> 27:29.000 it changes or maintenance operations, 27:29.000 --> 27:30.000 like welcome and so on. 27:30.000 --> 27:31.000 We have to be careful, 27:31.000 --> 27:32.000 that's why I put this information. 27:32.000 --> 27:33.000 So as simple select, 27:33.000 --> 27:35.000 the wrong-running select can actually cause this issue, 27:35.000 --> 27:40.000 if you are trying to do the modifications to your schema. 27:41.000 --> 27:44.000 That's why we had this like, you know, 27:44.000 --> 27:47.000 before we said that we had to set the log time out, 27:47.000 --> 27:49.000 and we need to be also ready for, 27:49.000 --> 27:50.000 if it times out, 27:50.000 --> 27:53.000 we have to have some sort of retry logic. 27:56.000 --> 27:58.000 Okay, and I already mentioned, 27:58.000 --> 28:01.000 but this is the important takeaway to just note it again, 28:01.000 --> 28:05.000 any type of long-running query can actually cause issues. 28:05.000 --> 28:09.000 Especially when you have a high read-write workload, 28:09.000 --> 28:11.000 production environment, that's risky. 28:11.000 --> 28:13.000 So please, at least for EGL operations, 28:13.000 --> 28:14.000 set this log time out, 28:14.000 --> 28:15.000 because normally, 28:15.000 --> 28:19.000 I think when you think of an application regular application, 28:19.000 --> 28:22.000 the ideas are not really triggered by users, right? 28:22.000 --> 28:23.000 But users do, 28:23.000 --> 28:24.000 they just make up the same data, 28:24.000 --> 28:26.000 they query stuff, 28:26.000 --> 28:27.000 but maybe they do some deletes, 28:27.000 --> 28:29.000 but they don't rarely probably, 28:29.000 --> 28:31.000 if it depends on their application type, 28:31.000 --> 28:32.000 but adding a column more, 28:32.000 --> 28:34.000 changing something like that, 28:34.000 --> 28:35.000 it doesn't really happen. 28:35.000 --> 28:36.000 So if you can cover the EGL operations, 28:36.000 --> 28:38.000 at least with this log time out, 28:38.000 --> 28:39.000 you should be pretty okay. 28:39.000 --> 28:42.000 You shouldn't be worried about this log time out, 28:42.000 --> 28:45.000 impacting you when you are running this standard DML. 28:47.000 --> 28:49.000 Okay, so we, so far, 28:49.000 --> 28:51.000 talk about kind of like what happens, 28:51.000 --> 28:52.000 what can go wrong, 28:52.000 --> 28:54.000 but there are things that you can actually do 28:54.000 --> 28:57.000 to minimize this looking impact. 28:57.000 --> 29:00.000 Now let's see it over some queries. 29:00.000 --> 29:01.000 For example, 29:01.000 --> 29:03.000 if I'm running some query, 29:03.000 --> 29:06.000 and that query option has concurrently, 29:07.000 --> 29:08.000 try to use that, 29:08.000 --> 29:09.000 like because if, 29:09.000 --> 29:12.000 it will surely will be less impacting 29:12.000 --> 29:15.000 than running the same command without concurrently. 29:15.000 --> 29:17.000 But it doesn't mean that like, 29:17.000 --> 29:18.000 I run the, 29:18.000 --> 29:19.000 how that I have this like, 29:19.000 --> 29:20.000 creating this concurrently, 29:20.000 --> 29:22.000 I use this detached partition concurrently, 29:22.000 --> 29:23.000 all my issues solved. 29:23.000 --> 29:25.000 I don't need to think about anything anymore. 29:25.000 --> 29:26.000 I will not block anything. 29:26.000 --> 29:27.000 It doesn't work like that. 29:27.000 --> 29:29.000 It will probably make it a bit less restrictive, 29:29.000 --> 29:31.000 because concurrently has its, 29:31.000 --> 29:34.000 it devises the things in the different transactions, 29:34.000 --> 29:35.000 actually internally. 29:35.000 --> 29:38.000 So they will maybe briefly take the access exclusive, 29:38.000 --> 29:39.000 and then do the other thing, 29:39.000 --> 29:43.000 and then it will try to make it less impactful. 29:43.000 --> 29:44.000 But this will take longer, 29:44.000 --> 29:46.000 that's one thing that you need to remember, 29:46.000 --> 29:47.000 it will take longer, 29:47.000 --> 29:49.000 and it might not complete successfully, 29:49.000 --> 29:51.000 so that it might leave stuff in like, 29:51.000 --> 29:52.000 not completed, 29:52.000 --> 29:54.000 and so you need to be also aware of, 29:54.000 --> 29:55.000 you need to check them, 29:55.000 --> 29:56.000 like you can't just say, 29:56.000 --> 29:57.000 okay, I run concurrently, 29:57.000 --> 29:58.000 it should be okay. 29:58.000 --> 30:00.000 So you need to be aware that there are, 30:00.000 --> 30:02.000 some risks associated with that too, 30:02.000 --> 30:04.000 but as long as you are aware of it, 30:04.000 --> 30:05.000 you can, 30:05.000 --> 30:06.000 that are actually the comments, 30:06.000 --> 30:07.000 for example, in this partition, 30:07.000 --> 30:09.000 the touch partition thing, 30:09.000 --> 30:10.000 there's a, 30:10.000 --> 30:11.000 a comment called finalize, 30:11.000 --> 30:12.000 you can actually fix these things, 30:12.000 --> 30:15.000 if they are kind of somehow disrupted. 30:15.000 --> 30:18.000 So it's important that as long as you know, 30:18.000 --> 30:20.000 what can cause also another thing that is very important 30:20.000 --> 30:21.000 when you use concurrently, 30:21.000 --> 30:23.000 the other comments are transactional, 30:23.000 --> 30:24.000 but when you use concurrently, 30:24.000 --> 30:25.000 these are not transactional, 30:25.000 --> 30:26.000 but do I mean by that, 30:26.000 --> 30:28.000 I can't roll back these things easily. 30:28.000 --> 30:29.000 So I have to start over, 30:29.000 --> 30:31.000 I have to do it again, 30:31.000 --> 30:33.000 and I have to run the finalize or something, 30:33.000 --> 30:34.000 you can't really like reverted, 30:34.000 --> 30:35.000 you can't really like, 30:35.000 --> 30:36.000 you know, 30:36.000 --> 30:38.000 it can't be in the transaction below. 30:38.000 --> 30:40.000 So these are information that if you have this knowledge, 30:40.000 --> 30:42.000 you can use the concurrently, 30:42.000 --> 30:45.000 but use them by knowing what you are getting into. 30:48.000 --> 30:50.000 Okay, the second thing, 30:50.000 --> 30:53.000 how actually we can do more 30:53.000 --> 30:56.000 by just utilizing postgres. 30:56.000 --> 30:59.000 So one of the things that I think comes to mind, 30:59.000 --> 31:01.000 that we can split this large, 31:02.000 --> 31:04.000 the most impact in the videos, 31:04.000 --> 31:06.000 to something some different steps 31:06.000 --> 31:09.000 that will be less restrictive in the system. 31:09.000 --> 31:10.000 So what do I mean by that, 31:10.000 --> 31:11.000 for example, 31:11.000 --> 31:13.000 I have this first command here, 31:13.000 --> 31:14.000 after table, 31:14.000 --> 31:16.000 I'm going to add a not null with default, 31:16.000 --> 31:18.000 with a volatile function, 31:18.000 --> 31:20.000 it's like clock time step. 31:20.000 --> 31:21.000 So if I have this, 31:21.000 --> 31:23.000 like it will block everything, 31:23.000 --> 31:24.000 even the select, 31:24.000 --> 31:26.000 so it will have the access exclusive, 31:26.000 --> 31:27.000 it will block everything, 31:27.000 --> 31:28.000 even the selects, 31:28.000 --> 31:30.000 it will hold this look for the entire duration of the table, 31:30.000 --> 31:31.000 right? 31:31.000 --> 31:33.000 Because it has to rewrite the entire table. 31:33.000 --> 31:35.000 So if your table is really large, 31:35.000 --> 31:36.000 this can actually, 31:36.000 --> 31:38.000 can need to significant downtime. 31:38.000 --> 31:39.000 That's why I'm thinking, 31:39.000 --> 31:42.000 how actually I can't avoid having this access exclusive 31:42.000 --> 31:43.000 for a very long. 31:43.000 --> 31:44.000 So what can I do? 31:44.000 --> 31:46.000 So I can actually have some, 31:46.000 --> 31:49.000 we can have some simple strategies to apply here first. 31:49.000 --> 31:51.000 I'm having it like, 31:51.000 --> 31:53.000 I'm adding the nullable column with a default, 31:53.000 --> 31:54.000 default, 31:54.000 --> 31:56.000 instead of having it not null at the beginning. 31:56.000 --> 31:57.000 So I'm adding like, 31:57.000 --> 31:58.000 I'm adding a variable, 31:58.000 --> 31:59.000 let's call them, 31:59.000 --> 32:00.000 just I had the default, 32:00.000 --> 32:01.000 and I have this default function. 32:01.000 --> 32:02.000 And then, 32:02.000 --> 32:03.000 after that, 32:03.000 --> 32:05.000 I'm trying to update the table with 32:05.000 --> 32:06.000 populate all the data, 32:06.000 --> 32:08.000 and covering the null values. 32:08.000 --> 32:10.000 So I'm trying to go through the, 32:10.000 --> 32:11.000 this update, 32:11.000 --> 32:13.000 and my populating goal in the null values, 32:13.000 --> 32:15.000 and you should do this in batches, 32:15.000 --> 32:17.000 because if you run all this at once, 32:17.000 --> 32:19.000 you remember in the long-running queries, 32:19.000 --> 32:20.000 are scary and dangerous, 32:20.000 --> 32:23.000 so you have to remember to do this in batches. 32:23.000 --> 32:24.000 And then, 32:24.000 --> 32:25.000 after that, 32:25.000 --> 32:26.000 I actually covered all this null values, 32:26.000 --> 32:29.000 and I already had my column, 32:29.000 --> 32:31.000 it's like created before, 32:31.000 --> 32:32.000 it's a default value, 32:32.000 --> 32:34.000 so I covered everything now. 32:34.000 --> 32:36.000 And after this adding and not now concentrate, 32:36.000 --> 32:38.000 it will be much easier, 32:38.000 --> 32:39.000 because then, 32:39.000 --> 32:41.000 when I split this, 32:41.000 --> 32:42.000 what I did, 32:42.000 --> 32:45.000 I reduced the time that I need to use access exclusive. 32:45.000 --> 32:47.000 I actually avoid, 32:47.000 --> 32:49.000 I actually avoid doing this update, 32:49.000 --> 32:52.000 so I kind of make sure that new inserts are impacted, 32:52.000 --> 32:53.000 but the existing, 32:54.000 --> 32:56.000 we are not actually blocking those things, 32:56.000 --> 32:58.000 so I actually did, 32:58.000 --> 33:00.000 like a smart operation, 33:00.000 --> 33:01.000 and at the end, 33:01.000 --> 33:04.000 then I am trying to add the not now. 33:04.000 --> 33:05.000 That's like, 33:05.000 --> 33:08.000 it already reduced the impact a lot. 33:08.000 --> 33:10.000 Okay. 33:10.000 --> 33:11.000 And then, 33:11.000 --> 33:13.000 I want to show actually more, 33:13.000 --> 33:15.000 what actually could be done better, 33:15.000 --> 33:18.000 remember the last commented I run in the previous example, 33:18.000 --> 33:20.000 while setting the not null column like this, 33:20.000 --> 33:22.000 and set the not null. 33:22.000 --> 33:23.000 But there are, 33:23.000 --> 33:25.000 this still needs to take a long time. 33:25.000 --> 33:26.000 It will still block the right, 33:26.000 --> 33:27.000 so what can we do, 33:27.000 --> 33:29.000 that it's still better, 33:29.000 --> 33:31.000 because it doesn't actually block increase, 33:31.000 --> 33:32.000 because the previous one, 33:32.000 --> 33:34.000 when you're adding with default and with volatile function, 33:34.000 --> 33:36.000 it was already blocking reach too. 33:36.000 --> 33:37.000 But this is a little bit better, 33:37.000 --> 33:39.000 but it's still blocking the reach. 33:39.000 --> 33:40.000 I'm blocking the right. 33:40.000 --> 33:41.000 So what I did here, 33:41.000 --> 33:44.000 that I utilize the check constraints, 33:44.000 --> 33:47.000 so while I'm adding in the last three, 33:47.000 --> 33:48.000 you could see, 33:48.000 --> 33:52.000 I'm first adding the not null as a not valid. 33:52.000 --> 33:55.000 So it doesn't really check the ones 33:55.000 --> 33:56.000 that are existing already, 33:56.000 --> 33:58.000 not for it means that it doesn't mean that, 33:58.000 --> 34:01.000 because I think it confuses people sometimes I see, 34:01.000 --> 34:04.000 this check constraints is applied for the new data 34:04.000 --> 34:05.000 that is being inserted, 34:05.000 --> 34:07.000 so it doesn't apply for the data that's already there. 34:07.000 --> 34:09.000 So what I'm trying to achieve is that don't, 34:09.000 --> 34:11.000 don't trigger for like checking for all the things 34:11.000 --> 34:12.000 that I have in there. 34:12.000 --> 34:15.000 So I'm trying to make sure this check constraint that I'm adding, 34:15.000 --> 34:18.000 is actually applicable for the new data, 34:18.000 --> 34:19.000 but it's not, 34:19.000 --> 34:21.000 it's still in the normal state. 34:21.000 --> 34:22.000 And after that, 34:22.000 --> 34:24.000 I am trying to validate it, 34:24.000 --> 34:27.000 and then after I validated, 34:27.000 --> 34:29.000 I can actually add the constraint, 34:29.000 --> 34:32.000 if I want to not null constraint. 34:32.000 --> 34:34.000 And then once you actually postrate this smart enough 34:34.000 --> 34:37.000 that it knows that it actually check constraint is fulfilled, 34:37.000 --> 34:40.000 then adding a not null is like a simple material operation, 34:40.000 --> 34:41.000 so it is easy. 34:41.000 --> 34:44.000 It will still have the access exclusive. 34:44.000 --> 34:46.000 It will be really brief, 34:46.000 --> 34:49.000 and you have to remove now the other constraint that you created, 34:49.000 --> 34:51.000 if you want it to have not now already there. 34:51.000 --> 34:53.000 So remove the redundant thing, 34:53.000 --> 34:56.000 it will still again have a brief access exclusive. 34:56.000 --> 34:57.000 But like, 34:57.000 --> 34:59.000 as you consider our methodologies, 34:59.000 --> 35:01.000 that we are trying to reduce the content, 35:01.000 --> 35:02.000 the time of it, 35:02.000 --> 35:06.000 and how much that I hold this access exclusive, 35:06.000 --> 35:07.000 that looks everything even reads. 35:07.000 --> 35:09.000 So these are writing important, 35:09.000 --> 35:12.000 and it's just an example of it. 35:12.000 --> 35:14.000 There are multiple different operations, 35:14.000 --> 35:16.000 for each operations you might need to probably think about it 35:16.000 --> 35:17.000 a little bit more, 35:17.000 --> 35:19.000 how can I avoid this. 35:22.000 --> 35:25.000 Okay, so that will bring us to the fifth takeaway, 35:25.000 --> 35:27.000 find something that is less, 35:27.000 --> 35:29.000 that does a less looking, 35:29.000 --> 35:31.000 and sometimes, 35:31.000 --> 35:34.000 like, of course, you can't avoid some looking, 35:34.000 --> 35:36.000 and that it can be hard, 35:36.000 --> 35:37.000 and also it will, 35:37.000 --> 35:39.000 of course, require some sort of expertise, 35:39.000 --> 35:40.000 which is really clear, 35:40.000 --> 35:42.000 that it's an imposterous documentation. 35:42.000 --> 35:43.000 So if you check this, 35:43.000 --> 35:45.000 how we can avoid these are not something we discovered, 35:45.000 --> 35:46.000 or I discovered, 35:46.000 --> 35:47.000 is clear, 35:47.000 --> 35:48.000 like, check the check constants, 35:48.000 --> 35:49.000 how the invalid words, 35:49.000 --> 35:51.000 how the concurrent the commands are, 35:51.000 --> 35:54.000 so just you can find this information. 35:55.000 --> 35:56.000 And again, 35:56.000 --> 35:59.000 this is the slight that I like the most. 35:59.000 --> 36:01.000 Postgres keeps getting better. 36:01.000 --> 36:03.000 So it doesn't stay again like the same. 36:03.000 --> 36:04.000 So, 36:04.000 --> 36:05.000 some operations before, 36:05.000 --> 36:06.000 maybe in the past, 36:06.000 --> 36:09.000 was like requiring maybe more restrictive looking, 36:09.000 --> 36:12.000 but some of them sure already just made the changes. 36:12.000 --> 36:14.000 And so, for example, 36:14.000 --> 36:18.000 it's a very simple example that I wanted to add here, 36:18.000 --> 36:20.000 adding it like column, 36:20.000 --> 36:22.000 not now with default value one. 36:22.000 --> 36:23.000 Before, 36:23.000 --> 36:25.000 this will still require an access exclusive, 36:25.000 --> 36:26.000 and it will block everything, 36:26.000 --> 36:28.000 but in postgres, 36:28.000 --> 36:29.000 it will be very, 36:29.000 --> 36:30.000 now in the modern postgres, 36:30.000 --> 36:31.000 it will be very quick, 36:31.000 --> 36:33.000 because this is a constant value. 36:33.000 --> 36:35.000 So postgres recognizes that this is constant, 36:35.000 --> 36:36.000 it is one, 36:36.000 --> 36:37.000 it can be stored as metadata. 36:37.000 --> 36:38.000 I don't need to rewrite the table, 36:38.000 --> 36:39.000 but before, 36:39.000 --> 36:41.000 this would be requiring a rewrite, 36:41.000 --> 36:43.000 as in the example that we had with this volatile function, 36:43.000 --> 36:45.000 when we were having the same command. 36:45.000 --> 36:47.000 And this location became minimal, 36:47.000 --> 36:50.000 it will make this operation much less disruptive now. 36:50.000 --> 36:52.000 But if you are running an older version of postgres, 36:52.000 --> 36:55.000 this might be having the same impact on your system. 36:55.000 --> 36:57.000 So please try to upgrade, 36:57.000 --> 36:58.000 if possible, 36:58.000 --> 37:00.000 and make use of it. 37:00.000 --> 37:02.000 So then you will have more, 37:02.000 --> 37:04.000 more developments to do how, 37:05.000 --> 37:07.000 how long they will take, 37:07.000 --> 37:09.000 and more concurrently comments as well, 37:09.000 --> 37:12.000 in the example that we see. 37:12.000 --> 37:13.000 Yeah. 37:13.000 --> 37:17.000 So this is used latest version of postgres, 37:17.000 --> 37:19.000 if you can. 37:19.000 --> 37:23.000 And this is the kind of like final part of the presentation. 37:23.000 --> 37:25.000 I will briefly discuss the tool, 37:25.000 --> 37:26.000 that is, 37:26.000 --> 37:28.000 we are developing in the company that I work for. 37:28.000 --> 37:30.000 It is open source applied to license, 37:30.000 --> 37:31.000 so if you try it, 37:31.000 --> 37:33.000 let us know it is in the version 0.8, 37:33.000 --> 37:35.000 and we are trying to, 37:35.000 --> 37:37.000 you know, first major version. 37:37.000 --> 37:39.000 Soon, that is being planned. 37:39.000 --> 37:40.000 So if you see something, 37:40.000 --> 37:41.000 if you try it, 37:41.000 --> 37:42.000 let me know, 37:42.000 --> 37:43.000 because yeah, 37:43.000 --> 37:45.000 we need more users to try it, 37:45.000 --> 37:46.000 and maybe more contributors, 37:46.000 --> 37:48.000 if you are interested. 37:48.000 --> 37:50.000 So that must be motivation. 37:50.000 --> 37:52.000 I think the motivation is up until now, 37:52.000 --> 37:53.000 it is very clear. 37:53.000 --> 37:54.000 So these things are happening. 37:54.000 --> 37:56.000 We have to sort it out. 37:56.000 --> 37:58.000 And there are some impact on our system 37:58.000 --> 38:00.000 that we can't maybe foresee. 38:00.000 --> 38:02.000 So I have to plan how the looking issues work, 38:02.000 --> 38:03.000 and we have to make sure that there is, 38:03.000 --> 38:05.000 if you don't want to create an impact, 38:05.000 --> 38:06.000 your system, 38:06.000 --> 38:08.000 you have to do some data backfilling back and forth. 38:08.000 --> 38:10.000 And what Pedro, how we do it, 38:10.000 --> 38:14.000 is that we are trying to utilize a pattern called 38:14.000 --> 38:16.000 expand contract pattern, 38:16.000 --> 38:18.000 which is hard to say about. 38:18.000 --> 38:19.000 What does it do? 38:19.000 --> 38:20.000 I think it's a software pattern. 38:20.000 --> 38:22.000 It's not something we discovered. 38:22.000 --> 38:24.000 We have to do it. 38:24.000 --> 38:25.000 We have to do it. 38:25.000 --> 38:27.000 We have to do it. 38:27.000 --> 38:29.000 We have to do it. 38:29.000 --> 38:31.000 We have to do it. 38:32.000 --> 38:34.000 While you are doing the migration, 38:34.000 --> 38:36.000 you don't touch the existing column. 38:36.000 --> 38:37.000 So you are trying to, 38:37.000 --> 38:38.000 in the same physical table, 38:38.000 --> 38:40.000 you are trying to add another column. 38:40.000 --> 38:41.000 You do all these operations there. 38:41.000 --> 38:43.000 Let's say you are trying to add a constraint. 38:43.000 --> 38:44.000 You do it there. 38:44.000 --> 38:46.000 And you don't do anything on the old one. 38:46.000 --> 38:48.000 And then we have multiple views, 38:48.000 --> 38:50.000 because we duplicate schema. 38:50.000 --> 38:53.000 So we people who are application client applications. 38:53.000 --> 38:56.000 They will have their own different versions of schema, 38:56.000 --> 38:58.000 which one of them will be the old schema version, 38:58.000 --> 39:02.000 with the physical table that only shows the existing column. 39:02.000 --> 39:05.000 And we also add the other columns and create 39:05.000 --> 39:06.000 another schema version. 39:06.000 --> 39:08.000 And they will all, when they connect, 39:08.000 --> 39:11.000 they will all see a different view of the old schema. 39:11.000 --> 39:14.000 So that they will see that the new one will access 39:14.000 --> 39:16.000 with the one that it is, 39:16.000 --> 39:18.000 which are not now constraint, let's say. 39:18.000 --> 39:20.000 And the other one will don't have this, 39:20.000 --> 39:22.000 it's not now constraint. 39:22.000 --> 39:23.000 But it will keep serving. 39:23.000 --> 39:25.000 So both of the versions live at the same time. 39:25.000 --> 39:30.000 This is the most useful thing about PGLO. 39:30.000 --> 39:32.000 So you can serve both of the applications. 39:32.000 --> 39:33.000 You are not disrupting anything. 39:33.000 --> 39:35.000 And then once you are done with the old one, 39:35.000 --> 39:36.000 you can actually mark it. 39:36.000 --> 39:39.000 And in the meantime, you can test your changes 39:39.000 --> 39:42.000 while both of the versions are at the same time available. 39:42.000 --> 39:44.000 But how we do is, of course, impact is like, 39:44.000 --> 39:46.000 you have to do dual rights. 39:46.000 --> 39:48.000 You have to keep everything. 39:48.000 --> 39:51.000 You need to, if somebody entered, let's say, 39:51.000 --> 39:53.000 a value that is actually null value. 39:53.000 --> 39:55.000 And you are trying to enforce a not null. 39:55.000 --> 39:58.000 So you have to change, modify this null value, 39:58.000 --> 40:01.000 that fits in your new version of you. 40:01.000 --> 40:02.000 So you have to modify this. 40:02.000 --> 40:04.000 And if somebody like added any value, 40:04.000 --> 40:08.000 you have to push this to down as well with the update 40:08.000 --> 40:09.000 the old version too. 40:09.000 --> 40:12.000 So you have to keep all these both versions intact. 40:12.000 --> 40:15.000 So this is what we call the tool rights. 40:15.000 --> 40:17.000 And yeah, this is mainly, I think, 40:17.000 --> 40:19.000 the theory behind it. 40:19.000 --> 40:21.000 We are utilizing triggers here. 40:21.000 --> 40:26.000 And I will show more later on. 40:26.000 --> 40:29.000 And of course, when you have both versions at the same time, 40:29.000 --> 40:30.000 that means that if you don't like something, 40:30.000 --> 40:31.000 you can roll back. 40:31.000 --> 40:34.000 And normally, I see, in my experience, 40:34.000 --> 40:36.000 when I see people that have migration plans, 40:36.000 --> 40:38.000 and the sigma migrations, 40:38.000 --> 40:40.000 there are, like, tendons that will have, 40:40.000 --> 40:42.000 like, the sigma migrations scripts that type pile up. 40:42.000 --> 40:45.000 And there's also the scrollback scripts that type pile up. 40:45.000 --> 40:46.000 They version them. 40:46.000 --> 40:49.000 And be changing anything in the database schema, 40:49.000 --> 40:51.000 become becomes a problem. 40:51.000 --> 40:53.000 Multiple teams are involved. 40:53.000 --> 40:55.000 I see that some companies have their own scripts 40:55.000 --> 40:57.000 of how to do these things, what to run, 40:57.000 --> 40:58.000 and what to change afterwards. 40:58.000 --> 41:00.000 And this application and database 41:00.000 --> 41:02.000 changes, arranging those things. 41:02.000 --> 41:04.000 It becomes quite problematic. 41:04.000 --> 41:07.000 And that's why I think this tool might help. 41:07.000 --> 41:08.000 Okay. 41:08.000 --> 41:11.000 I briefly explained how it first, 41:11.000 --> 41:14.000 but maybe if we go like a more step by step, 41:14.000 --> 41:16.000 it allows higher level operations. 41:16.000 --> 41:20.000 Like, when we were seeing how I can actually run this SQL 41:20.000 --> 41:22.000 comment, I show you that I split this thing to, 41:22.000 --> 41:24.000 let's say, to three steps. 41:24.000 --> 41:26.000 And I try to handle the, 41:26.000 --> 41:27.000 how the locking would work, 41:27.000 --> 41:29.000 which kind of order I should run. 41:29.000 --> 41:31.000 Tool allows you to say, 41:31.000 --> 41:33.000 let's say, run an alter table, 41:33.000 --> 41:35.000 and then it actually handles this in the back word. 41:35.000 --> 41:40.000 So you can use a JSON file to adjust what you want to do. 41:40.000 --> 41:43.000 How do you want to do this casting between the different columns 41:43.000 --> 41:44.000 and different data? 41:44.000 --> 41:47.000 And as I say, we use this expank contract, 41:47.000 --> 41:48.000 method, which is actually, 41:48.000 --> 41:50.000 once you do the migration, 41:50.000 --> 41:52.000 both of the versions are available, 41:52.000 --> 41:53.000 you expand the things. 41:53.000 --> 41:55.000 And once you agree that, okay, 41:55.000 --> 41:56.000 I'm not rolling this pack. 41:56.000 --> 41:58.000 I'm just completed the migration. 41:58.000 --> 42:00.000 You actually drop all the other things, 42:00.000 --> 42:02.000 then you'll just keep the new version. 42:02.000 --> 42:05.000 So that is the phase that we call contract phase. 42:05.000 --> 42:08.000 And yeah, you have the multiversions of schematics. 42:08.000 --> 42:14.000 As I said, it allows the higher level operations. 42:14.000 --> 42:17.000 So we convert this operations through the escrow 42:17.000 --> 42:19.000 commands that it will run in a way 42:19.000 --> 42:22.000 that it will not probably cause issues or looking. 42:22.000 --> 42:25.000 And we have a long time out already set for the video operations. 42:25.000 --> 42:27.000 So you don't need to worry about it. 42:27.000 --> 42:30.000 And it will also handle the back filling. 42:30.000 --> 42:31.000 And you can actually change, 42:31.000 --> 42:33.000 however you want it to be done. 42:33.000 --> 42:35.000 And you can actually add, 42:35.000 --> 42:37.000 when you do this back filling, 42:37.000 --> 42:39.000 you can do it in batches. 42:39.000 --> 42:41.000 You can adjust the size of the batch. 42:41.000 --> 42:43.000 You can actually put delays to it. 42:43.000 --> 42:45.000 So there are multiple ways that you can actually 42:45.000 --> 42:48.000 have more control over stuff that maybe it wouldn't be easier. 42:48.000 --> 42:52.000 Just if you try to do this all directly in escrow. 42:52.000 --> 42:55.000 Okay, I think we don't need to talk about this part. 42:55.000 --> 42:57.000 We already said like, yeah, 42:57.000 --> 43:00.000 we are adding the columns to the table and we are just hiding 43:00.000 --> 43:03.000 or showing through different views how they look. 43:03.000 --> 43:06.000 And when no longer needed, we are just dropping them. 43:06.000 --> 43:07.000 We are all back. 43:07.000 --> 43:10.000 Or we are just going more into the new version. 43:10.000 --> 43:12.000 Okay. 43:12.000 --> 43:15.000 So I think this is a good image to show like during the, 43:15.000 --> 43:18.000 there is a time that is kind of in the middle of between 43:18.000 --> 43:21.000 version one of application and version two of their application. 43:21.000 --> 43:24.000 This at the same time, this is in the red kind of color. 43:24.000 --> 43:26.000 They leave simultaneous data same time, 43:26.000 --> 43:27.000 which makes it super easy. 43:27.000 --> 43:29.000 If you want to do this library of green deployments, 43:29.000 --> 43:31.000 you can start the migration, 43:31.000 --> 43:32.000 have tests, everything, 43:32.000 --> 43:34.000 and make sure that all the things are done. 43:34.000 --> 43:36.000 And then either you roll back to the operation, 43:36.000 --> 43:37.000 if you don't want it, 43:37.000 --> 43:38.000 or you finalize the migration. 43:38.000 --> 43:40.000 And then at that point, this common, 43:40.000 --> 43:42.000 the distinct finishes. 43:42.000 --> 43:47.000 We contracted, I may continue with the new version of the application. 43:47.000 --> 43:49.000 And if you are interested, 43:49.000 --> 43:51.000 there is, I added some links to the, 43:51.000 --> 43:52.000 most of the slides, 43:52.000 --> 43:55.000 and you can go read more how it's actually done. 43:55.000 --> 43:57.000 A bit more details. 43:57.000 --> 43:59.000 And I like this example. 43:59.000 --> 44:00.000 I don't know. 44:00.000 --> 44:02.000 It's like it's not so easy to understand before the back 44:02.000 --> 44:03.000 or the back story effect, 44:03.000 --> 44:06.000 because I took it from one of the blocks that we created. 44:06.000 --> 44:08.000 I think here, for example, 44:08.000 --> 44:09.000 is a good example. 44:09.000 --> 44:11.000 When you have created a user stable, 44:11.000 --> 44:12.000 you had this full name. 44:12.000 --> 44:14.000 And then by time you decided that full name is not covering, 44:14.000 --> 44:16.000 I want to have a first name, 44:16.000 --> 44:17.000 and I want to have a last name. 44:17.000 --> 44:18.000 I want to divide this thing. 44:18.000 --> 44:20.000 And I want to have the H, 44:20.000 --> 44:21.000 I see that there are nulls in H. 44:21.000 --> 44:23.000 I don't want any nulls in the H column, 44:23.000 --> 44:27.000 so I want to add the null constraint of the H column. 44:27.000 --> 44:28.000 So what we do is, 44:28.000 --> 44:29.000 in the user stable, 44:29.000 --> 44:30.000 we have this full name, 44:31.000 --> 44:32.000 and I'm adding this like, 44:32.000 --> 44:34.000 Pigeral is not null column, 44:34.000 --> 44:36.000 Pigeral is this is the last name. 44:36.000 --> 44:38.000 And while I do this, 44:38.000 --> 44:40.000 I'm trying to make sure that if somebody is entering full name, 44:40.000 --> 44:43.000 I'm trying to make sure it is up and down, 44:43.000 --> 44:45.000 pardon the JSON that I show. 44:45.000 --> 44:46.000 I can actually see, 44:46.000 --> 44:48.000 okay, you make this transformation, 44:48.000 --> 44:49.000 and fill the last name, 44:49.000 --> 44:50.000 fill the first name, 44:50.000 --> 44:52.000 and add this column without actually 44:52.000 --> 44:53.000 disrupting the view. 44:53.000 --> 44:55.000 So all version will be, 44:55.000 --> 44:57.000 it will still be showing the same one. 44:57.000 --> 44:59.000 And the new version will start to already 44:59.000 --> 45:00.000 in the old and new entries, 45:00.000 --> 45:03.000 and will actually behave differently. 45:03.000 --> 45:05.000 And then once I'm done with it, 45:05.000 --> 45:07.000 I'm just actually able to continue. 45:07.000 --> 45:08.000 And how we do it, 45:08.000 --> 45:11.000 we just changed the searchpad setting. 45:11.000 --> 45:12.000 So I'm setting, 45:12.000 --> 45:13.000 for example, 45:13.000 --> 45:14.000 in the first one, 45:14.000 --> 45:16.000 it doesn't allow null in the name, 45:16.000 --> 45:18.000 fill let's say we have a name fill. 45:18.000 --> 45:20.000 So I'm trying to insert the null while you 45:20.000 --> 45:21.000 in the name fill. 45:21.000 --> 45:22.000 It's just say, 45:22.000 --> 45:23.000 okay, you can't enter it, 45:23.000 --> 45:25.000 because now I'm having the view, 45:25.000 --> 45:27.000 that is not allowing me to enter. 45:27.000 --> 45:29.000 So it is why I wait in this not null constraint, 45:29.000 --> 45:31.000 but when I switch back to the old schema, 45:31.000 --> 45:32.000 it will allow the nulls, 45:32.000 --> 45:35.000 but this nulls will be actually transforming the back, 45:35.000 --> 45:37.000 so that the new version of the columns 45:37.000 --> 45:39.000 will actually be filled with this, 45:39.000 --> 45:41.000 whatever transformation that we are trying to do, 45:41.000 --> 45:43.000 and it will continue as this. 45:43.000 --> 45:45.000 So users will not see any disruption. 45:45.000 --> 45:49.000 And yeah, 45:49.000 --> 45:50.000 based this up and down, 45:50.000 --> 45:52.000 that's how we are waking the casting. 45:52.000 --> 45:55.000 So I'm just transforming things here. 45:55.000 --> 45:59.000 I think this operation is a column change operation. 45:59.000 --> 46:02.000 I'm changing the column from, 46:02.000 --> 46:04.000 I don't know, 46:04.000 --> 46:06.000 integer to text or text, 46:06.000 --> 46:07.000 or something. 46:07.000 --> 46:09.000 So I'm trying to backhand for, 46:09.000 --> 46:10.000 use expressions, 46:10.000 --> 46:12.000 and I can control the bed size and the rate 46:12.000 --> 46:14.000 while I'm doing these operations. 46:14.000 --> 46:16.000 Okay, 46:16.000 --> 46:17.000 so I think again, 46:17.000 --> 46:18.000 this is showing the similar thing. 46:18.000 --> 46:21.000 I think one benefit of having this dual right, 46:21.000 --> 46:22.000 so it will, 46:22.000 --> 46:24.000 we are actually moving this, 46:24.000 --> 46:26.000 a problem from the application layer, 46:26.000 --> 46:27.000 directly into database layer. 46:27.000 --> 46:28.000 It actually, 46:28.000 --> 46:29.000 I think, 46:29.000 --> 46:31.000 most helpful in that regard. 46:31.000 --> 46:33.000 So migration logic doesn't end up between, 46:33.000 --> 46:36.000 like the split between the database and the application code. 46:36.000 --> 46:38.000 We are taking these things by using 46:38.000 --> 46:39.000 triggers to the, 46:39.000 --> 46:42.000 directly to the database layer. 46:42.000 --> 46:44.000 And if you want to do 46:44.000 --> 46:45.000 similar way that, 46:45.000 --> 46:47.000 without serving both of the versions, 46:47.000 --> 46:50.000 you will have to do some sort of similar logic anyway, 46:50.000 --> 46:51.000 so it's not, 46:51.000 --> 46:53.000 it's not a very double thing. 46:53.000 --> 46:54.000 If you want to keep both, 46:54.000 --> 46:56.000 you have to have the stool right in any way. 46:56.000 --> 46:57.000 And remember, 46:57.000 --> 47:00.000 when I was saying that when you have the locked timeout, 47:00.000 --> 47:02.000 you have to make sure that you are handling the, 47:02.000 --> 47:04.000 a lot of timeout errors with gracefully, 47:04.000 --> 47:06.000 so there's already, 47:06.000 --> 47:09.000 we deal with this already in the code. 47:09.000 --> 47:10.000 We set the locked timeout, 47:10.000 --> 47:13.000 and we have also this retry logic built in. 47:13.000 --> 47:15.000 Okay, 47:15.000 --> 47:18.000 and we are almost at the end. 47:19.000 --> 47:21.000 So I wanted to basically show, 47:21.000 --> 47:22.000 showcase this tool, 47:22.000 --> 47:24.000 and to give you the, 47:24.000 --> 47:25.000 you know, chance to see it, 47:25.000 --> 47:27.000 and you don't necessarily use it, 47:27.000 --> 47:29.000 but at least take away the, 47:29.000 --> 47:32.000 the same mechanisms that I described, 47:32.000 --> 47:33.000 that is like, 47:33.000 --> 47:35.000 it doesn't need to be part of the tool, 47:35.000 --> 47:37.000 but try to apply in your life, 47:37.000 --> 47:39.000 how, what kind of mechanisms that we can have 47:39.000 --> 47:41.000 directly in progress. 47:41.000 --> 47:43.000 Okay. 47:43.000 --> 47:45.000 Yeah, 47:45.000 --> 47:47.000 so importance is like we need to balance this looking, 47:48.000 --> 47:49.000 and we need to, 47:49.000 --> 47:51.000 because between looking, 47:51.000 --> 47:52.000 performance and the correctness, 47:52.000 --> 47:53.000 there is a balance. 47:53.000 --> 47:55.000 You can't really like avoid, 47:55.000 --> 47:57.000 you can't keep everything at the same level, 47:57.000 --> 47:59.000 so you have to sacrifice something, 47:59.000 --> 48:01.000 but to be able to make sure that we are correct, 48:01.000 --> 48:03.000 and we are actually having multi-current, 48:03.000 --> 48:04.000 like, 48:04.000 --> 48:05.000 concurrence, 48:05.000 --> 48:06.000 the high-current, 48:06.000 --> 48:07.000 the same environments, 48:07.000 --> 48:09.000 we have to be aware of looking. 48:09.000 --> 48:10.000 And yeah, 48:10.000 --> 48:11.000 you can also use tools, 48:11.000 --> 48:13.000 if it helps. 48:13.000 --> 48:16.000 If you like to read more about this topic, 48:16.000 --> 48:20.000 I have a few blocks that I write in pgroll.com, 48:20.000 --> 48:23.000 or you can go to the GitHub repository itself, 48:23.000 --> 48:25.000 and try to contribute already. 48:25.000 --> 48:27.000 And yeah, 48:27.000 --> 48:30.000 this was the block that I started the block first, 48:30.000 --> 48:33.000 and then the block turned to this talk. 48:33.000 --> 48:35.000 I hope you enjoyed this so far. 48:35.000 --> 48:37.000 And yeah, 48:37.000 --> 48:38.000 some people are taking picture, 48:38.000 --> 48:40.000 I will wait for it. 48:40.000 --> 48:42.000 Thank you. 48:42.000 --> 48:48.000 So the technical part of the talk finished, 48:48.000 --> 48:51.000 but I wanted to share something very personal to me now, 48:51.000 --> 48:52.000 and I don't know, 48:52.000 --> 48:54.000 it feels a bit vulnerable, 48:54.000 --> 48:57.000 but last year I lost someone that is really important to me, 48:57.000 --> 48:59.000 and it is really hard to be around 48:59.000 --> 49:01.000 and not having to see my rounds, 49:01.000 --> 49:02.000 but if you don't know his name, 49:02.000 --> 49:03.000 please check it out. 49:03.000 --> 49:06.000 He was really important for the Postgres community, 49:06.000 --> 49:08.000 and for me personally. 49:08.000 --> 49:09.000 So, 49:09.000 --> 49:10.000 yeah, 49:10.000 --> 49:11.000 if you want to remember him, 49:11.000 --> 49:12.000 because we lost him, 49:12.000 --> 49:14.000 watch this talk that he, 49:14.000 --> 49:15.000 it was his last talk, 49:15.000 --> 49:18.000 he was doing the keynote in Prague, 49:18.000 --> 49:20.000 and he was a visionary for Postgres, 49:20.000 --> 49:22.000 he contributed to Postgres a lot, 49:22.000 --> 49:24.000 and I think his vision will lead us 49:24.000 --> 49:26.000 for a long time already. 49:26.000 --> 49:27.000 So just, 49:27.000 --> 49:29.000 I remember him basically. 49:29.000 --> 49:30.000 Thank you. 49:30.000 --> 49:32.000 Thank you.