WEBVTT 00:00.000 --> 00:11.680 So, our next speakers are Jérôme Gorin and Amandine Jambert. 00:11.680 --> 00:15.320 And it is about auditing web trackers. 00:15.320 --> 00:18.320 So I hand over the microphone. 00:18.320 --> 00:19.320 Welcome. 00:19.320 --> 00:21.320 Thank you very much. 00:21.320 --> 00:22.320 Thank you. 00:22.320 --> 00:27.600 Thank you very much for having us. 00:27.600 --> 00:28.600 So I'm Amandine Jambert. 00:28.600 --> 00:33.600 I'm working at the EDPB secretariat as a technology expert. 00:33.600 --> 00:39.920 And we are giving this talk with Jérôme, who is a researcher in privacy at a French university 00:39.920 --> 00:42.800 that's called Unila Sen. 00:42.800 --> 00:50.080 And so, just for those who don't know, the EDPB is the European Data Protection Board. 00:50.080 --> 00:56.960 The idea is a European board which includes all the national data protection 00:56.960 --> 01:07.600 authorities in Europe, plus the three from the EFTA countries, which are Liechtenstein, Norway, and Iceland. 01:07.600 --> 01:08.760 And also the EDPS. 01:08.760 --> 01:11.080 So we are often confused with the EDPS. 01:11.080 --> 01:20.040 So the EDPS is the supervisory authority for the EU institutions, while the EDPB is the board 01:20.080 --> 01:28.920 which helps all the data protection authorities to agree among themselves, to discuss, to move forward. 01:28.920 --> 01:34.200 In practice, I am working in the secretariat, and, to not help with the confusion, 01:34.200 --> 01:37.440 the secretariat is technically provided by the EDPS. 01:37.440 --> 01:41.040 But it just means that on my pay slip it's written EDPS. 01:41.040 --> 01:47.400 But in practice, my boss is the chair of the EDPB, who is for now the head of the 01:47.400 --> 01:51.320 Finnish SA, Anu Talus.
01:51.320 --> 01:58.160 And so in practice at the EDPB, we have a lot of different tools to make all the data protection 01:58.160 --> 02:02.680 authorities thrive, and one of them is the Support Pool of Experts. 02:02.680 --> 02:07.080 The idea of this programme is that we have a call for expressions of interest for experts in 02:07.080 --> 02:14.320 legal and technical fields, and they can say, okay, we are interested in having a contract 02:14.320 --> 02:15.320 with you. 02:16.240 --> 02:18.160 Actually, we have a lot of people in the field. 02:18.160 --> 02:25.080 Yes, we have 30% technical and 75% legal, but it just means that we also have people who are 02:25.080 --> 02:27.320 both technical and legal. 02:27.320 --> 02:32.640 And we have all those people whom we can pick for our projects. 02:32.640 --> 02:39.640 Since July 2022, we have had 22 projects, out of which 18 have been completed, and we try to 02:39.680 --> 02:48.800 make them public, so we are publishing them once they are ready, as the slide says. 02:48.800 --> 02:51.960 And most of these projects are reports. 02:51.960 --> 02:57.680 For example, we have the one-stop-shop ones: the idea is that we take one thematic, we 02:57.680 --> 03:03.120 pick a lawyer or law professor, and we look at all the decisions that have 03:03.120 --> 03:07.640 been taken on this subject by all authorities, and we make a report. 03:07.640 --> 03:15.360 We also have technical reports, like the standardized messenger audit, where the professor 03:15.360 --> 03:21.640 looked at what kind of requirements we could have to do an audit of a messenger app. 03:21.640 --> 03:24.760 So we have all those reports, and we have the exception. 03:24.760 --> 03:30.480 And the exception is the EDPB Website Auditing Tool, which is software that has been 03:30.480 --> 03:32.640 developed by Jérôme. 03:32.640 --> 03:35.880 And so what was the idea of the WAT? 03:35.880 --> 03:38.440 So the internet is full of cookies everywhere.
03:38.440 --> 03:44.600 We have cookies, for example, when you are authenticating your user: you want to be sure 03:44.600 --> 03:48.400 that it's the same person on one page and the next one, so you need a cookie. But you also 03:48.400 --> 03:53.080 have tracker cookies, so you have a lot of different kinds of cookies. 03:53.080 --> 04:02.360 And the thing is, whenever you are using a cookie to talk about one individual, then you 04:02.360 --> 04:10.240 have personal data, and so the GDPR applies. 04:10.240 --> 04:14.920 You also have another piece of legislation that applies, that is the ePrivacy Directive, 04:14.920 --> 04:19.520 and in fact, not all data protection authorities, but some of them, are also responsible 04:19.520 --> 04:21.720 for the application of ePrivacy. 04:21.720 --> 04:27.440 In any case, at least for the first case, authorities will need, when they audit a website, 04:27.440 --> 04:35.280 to check whether the cookies are, you know, legit or not, well done or not. 04:35.280 --> 04:38.480 And so what do they have to look at? 04:38.480 --> 04:44.280 The first thing is, they have to understand what this cookie is, what it is used for, okay? 04:44.280 --> 04:46.840 So that's what we call the purpose. 04:46.840 --> 04:52.000 And it's very important because it has a legal impact. 04:52.000 --> 04:57.080 If you have a technical cookie, and it's just here so your website functions, then you 04:57.080 --> 04:59.840 don't have to ask consent for it. 04:59.840 --> 05:05.640 But if you are, for example, using a cookie for advertising purposes, then you need to ask, 05:05.640 --> 05:09.280 you know, your user to consent to get this cookie. 05:09.280 --> 05:11.160 And then we are talking about consent. 05:11.160 --> 05:16.640 So if we are talking about consent, the first thing is, I have cookies there, okay? 05:16.640 --> 05:21.120 I can't ask you to eat them before asking you if you want them, agree?
05:21.120 --> 05:27.080 So the first thing is you have to check whether the cookie arrives on the website before or 05:27.080 --> 05:30.360 after the consent. That's the first thing you have to check. 05:30.360 --> 05:36.920 Second is, the user should be able to choose which kind of cookie they agree to eat. 05:36.920 --> 05:41.560 If I have a nice milk cookie like this one, maybe you will agree; maybe if I have a mint 05:41.560 --> 05:45.800 chocolate one, you will, you know, disagree, I don't know. 05:45.960 --> 05:54.160 The last one is, you know, any good consent should be something that you can change, okay? 05:54.160 --> 05:58.560 I can tell you, yes, I want the cookie now, and I taste it, and I don't like it. 05:58.560 --> 06:01.920 And I should be allowed to put it in the bin, okay? 06:01.920 --> 06:04.800 It's the same with the cookies in your browser. 06:04.800 --> 06:11.000 So you should be able to withdraw your consent and it should not be hard, okay? 06:11.000 --> 06:16.000 And so all of that is included in any audit that you are doing as, you know, 06:16.000 --> 06:20.760 a data protection authority whenever you are looking at a website. 06:20.760 --> 06:30.960 So we wanted to do that in a tool, and our dream tool would be easy to use, 06:30.960 --> 06:35.760 because in our authorities most officers are legal officers, okay? 06:35.760 --> 06:40.800 The technical officers are a very rare resource. 06:40.800 --> 06:48.080 So we want the legal officers to be able to do most or even all of the assessment. 06:48.080 --> 06:53.000 So we want something that is easy to use, where they can interact with the websites, 06:53.000 --> 06:56.840 because, depending on the circumstances, 06:56.840 --> 07:01.720 you may need to click on buttons on the website or this kind of thing.
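Editor's sketch, not from the talk: the first check described above (did a cookie arrive before or after consent?) can be expressed as a small function. The data model, the field names and the `cookiesSetBeforeConsent` helper are all hypothetical and are not taken from the EDPB tool; the sketch also assumes that each cookie already has a purpose assigned and a timestamp for when it was stored.

```typescript
// Hypothetical data model for illustration only; not the tool's own types.
type Purpose = "technical" | "analytics" | "advertising" | "unknown";

interface ObservedCookie {
  name: string;
  domain: string;
  purpose: Purpose;
  setAt: number; // milliseconds since the page started loading
}

interface Finding {
  cookie: ObservedCookie;
  issue: string;
}

// Flag every non-technical cookie that was stored before consent was given.
// consentAt is null when the visitor never interacted with the banner, in
// which case every non-technical cookie is flagged. Cookies with an
// "unknown" purpose are flagged too, so that a human reviews them.
function cookiesSetBeforeConsent(
  cookies: ObservedCookie[],
  consentAt: number | null
): Finding[] {
  return cookies
    .filter((c) => c.purpose !== "technical")
    .filter((c) => consentAt === null || c.setAt < consentAt)
    .map((c) => ({
      cookie: c,
      issue: `${c.name} (${c.purpose}) was stored before any consent`,
    }));
}

// Made-up observations on reserved example domains.
const observed: ObservedCookie[] = [
  { name: "session", domain: "shop.example", purpose: "technical", setAt: 120 },
  { name: "_ad_id", domain: "ads.example", purpose: "advertising", setAt: 300 },
  { name: "_stats", domain: "shop.example", purpose: "analytics", setAt: 2100 },
];

// The visitor clicked "accept" 1500 ms after the page started loading.
const findings = cookiesSetBeforeConsent(observed, 1500);
console.log(findings.map((f) => f.issue));

// Same observations, but the visitor never touched the banner.
const findingsNoConsent = cookiesSetBeforeConsent(observed, null);
console.log(findingsNoConsent.length);
```

The two other checks from the talk (granular choice and easy withdrawal) depend on interacting with the banner, so they are left out of this sketch.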
07:01.720 --> 07:08.720 And then we want to simplify our lives, because we need to do the best with our resources: 07:08.720 --> 07:15.880 to be in the same software to do the audits, generate the reports, send the reports, 07:15.880 --> 07:18.760 do the evaluation and so on. 07:18.760 --> 07:26.160 And we want to reuse knowledge, both for efficiency and for legal certainty. 07:26.160 --> 07:31.560 I mean, if we take a technical decision at one time, we want to be sure that our colleagues, 07:31.560 --> 07:36.520 or we in the future, will take the same decision, and for that we need to be able to 07:36.520 --> 07:40.960 reuse knowledge and also to work with colleagues. 07:40.960 --> 07:46.560 And the last one, when we did the brainstorming about that, is that we need to create 07:46.560 --> 07:50.560 open software. 07:50.560 --> 07:57.760 So at the EDPB level, we wanted open software for transparency reasons, but inside some 07:57.760 --> 08:04.560 of the data protection authorities it was also a requirement, for mainly two reasons. The first 08:04.560 --> 08:10.720 one was that they want public, free and open source software, to be precise, because they 08:10.720 --> 08:16.720 want, whenever they do an audit, the auditee to be able to redo the audit and check 08:16.720 --> 08:19.560 what they have found and say, okay, you know. 08:19.560 --> 08:24.200 And, as we said, we don't want just a browser, but we want to be able to do the evaluation 08:24.200 --> 08:30.400 and the report and so on, so we want to be able to prove that the entire software doesn't 08:30.400 --> 08:31.920 mess with the browser. 08:31.920 --> 08:39.200 So that's what we say we have is that effectively what is it.
08:39.200 --> 08:45.000 And so we looked at what was existing, and we had a few things that, we could say, were 08:45.000 --> 08:52.080 quite nice, and other things that were, you know, not answering our needs, maybe 08:52.080 --> 08:55.640 because they were using machine learning and so had false positives, which is something 08:55.640 --> 09:00.480 that you cannot have in audits that have legal consequences behind them. 09:00.480 --> 09:05.320 So you want to be sure that what you are saying is true. 09:05.320 --> 09:09.000 So we decided to develop the Website Auditing Tool. 09:09.000 --> 09:17.800 So with the SPE programme, with Jérôme, who nicely subscribed to our call, as free and 09:17.800 --> 09:22.920 open source software under the EUPL, and based on different other FOSS projects. 09:22.920 --> 09:28.040 And we just take one minute to say that we, in particular, used the EDPS WEC. 09:28.040 --> 09:34.200 So you see, EDPS this time. It's a complementary tool; the idea of the WEC is really 09:34.200 --> 09:35.880 to do bulk audits. 09:35.880 --> 09:45.000 So it's also a website and cookie auditing tool, but the thing is, it is script-based 09:45.000 --> 09:50.800 and you can bulk audit, but you cannot do a long website. 09:50.800 --> 09:59.600 And so, in a way, you know, we could say that the WAT is now far, far 09:59.600 --> 10:04.000 away from the WEC, but version zero was very, very close to it. 10:04.000 --> 10:09.960 So we are very thankful to the colleagues who are working on this one. 10:09.960 --> 10:15.080 Now I will give the floor to Jérôme, who will talk about the Website Auditing Tool and how 10:15.080 --> 10:16.080 it works. 10:16.080 --> 10:18.520 Thank you very much. 10:18.520 --> 10:24.200 So let's talk about the technical side of this project. 10:24.200 --> 10:31.360 So there are many requirements in it, so this slide will speak to the developers in this room.
10:31.360 --> 10:39.800 So one of the requirements was to have a browser that acts the same way as the majority 10:39.840 --> 10:41.240 of browsers. 10:41.240 --> 10:48.840 So we decided to use Electron, because it has Chromium inside, and Chromium is now used 10:48.840 --> 10:53.040 in the majority of browsers on the market. 10:53.040 --> 10:59.280 And the interesting thing with Electron is that you have a Node.js process that is able to analyze 10:59.280 --> 11:08.440 everything that is happening inside the browser and also everything going out of and into the browser. 11:08.440 --> 11:16.440 So we do our analysis and then we display this analysis in a nice Angular interface. 11:16.440 --> 11:23.240 So if you want to have a look at what the tool looks like, you can go to this URL and 11:23.240 --> 11:30.440 you will have just a subset of the interface, a subset that doesn't include the browser itself. 11:30.440 --> 11:37.280 So to summarize, the tool is open source, and it's only based on widely used frameworks. 11:37.280 --> 11:44.960 It's cross-platform, you can use it on Linux, Windows and macOS, and it's all written in TypeScript. 11:44.960 --> 11:51.840 So here is what the tool looks like. I designed the tool, I developed it, and when I 11:51.840 --> 11:58.240 started the project, what I wanted to do was to take the user by the hand, because this 11:58.240 --> 12:06.880 tool targets legal and technical auditors, but it can also be used by data controllers and 12:06.880 --> 12:14.320 data processors to inspect their own websites. I mean, it can be used by anybody, by everybody. 12:14.320 --> 12:16.040 So that's it. 12:16.040 --> 12:21.040 So when you start, you have extensive documentation about how the tool works, just 12:21.040 --> 12:27.760 so you are not lost inside the tool, and on the left you have all the functionality, which 12:27.760 --> 12:30.600 is categorized into three sections.
12:30.600 --> 12:37.080 So what the tool is mainly doing is producing analyses, so it will store analyses inside 12:37.080 --> 12:38.080 the tool. 12:38.080 --> 12:44.320 The first section is to manage your analyses, and we will get in-depth into it. 12:44.320 --> 12:50.600 When you are running an analysis, you will see a session inside the running 12:50.600 --> 12:56.080 sessions category, and then you have editors that will help you to create 12:56.080 --> 13:01.600 your own knowledge base and to edit templates to generate pretty-printed reports, and the 13:01.600 --> 13:05.640 other sections are for settings and documentation. 13:05.640 --> 13:10.640 So if you click on browse, then you will get a normal browser, I mean you will have all 13:10.640 --> 13:20.080 the tools, all the buttons from Chrome, but the specificity of this tool is that 13:20.080 --> 13:25.040 it comes with many analyses, like this. 13:25.040 --> 13:31.520 So the idea is to have something very modular, so you can easily add 13:31.520 --> 13:38.400 some analyses. So, for instance, I click on the first one, which is 13:38.400 --> 13:44.800 about cookies. Maybe you will be disappointed, because I didn't go after the bad guys 13:44.800 --> 13:52.280 and only went to the European Data Protection Board website. I click on Consent, and 13:52.280 --> 13:58.640 on the right you will see that there are three cookies which are used by this website when 13:58.640 --> 14:05.960 you click on Consent: you have one technical cookie and two which are used for analytics 14:05.960 --> 14:11.560 purposes, and on each of these lines you will get more in-depth information when you click 14:11.560 --> 14:12.560 on it. 14:12.560 --> 14:19.040 So the goal of this panel is to dive a little bit into each piece of 14:19.040 --> 14:20.040 this information.
14:20.040 --> 14:24.920 So there are several tabs, you see, and on the first one you get some details 14:24.920 --> 14:30.320 about, for instance, the cookie: the name, the value. So this one is not 14:30.320 --> 14:38.160 used for tracking. When you click on log, then you will get more in-depth 14:38.160 --> 14:43.440 information to do your investigation, because something which is hard when you are trying 14:43.440 --> 14:52.040 to classify the purpose of a cookie is to really know who is setting it and for what. 14:52.040 --> 14:56.960 So if you have, for instance, a JavaScript cookie, then you will get the full call stack of 14:56.960 --> 15:03.240 every script, so everything that was involved in the writing process inside the browser, 15:03.240 --> 15:08.120 and if it's an HTTP request cookie, sorry, one made by a Set-Cookie header, 15:08.120 --> 15:14.000 you will see the first request which stored this cookie. 15:14.000 --> 15:20.280 And the last tab is maybe the most important one: it's the dynamic matching functionality 15:20.280 --> 15:21.600 for cookies. 15:21.600 --> 15:29.040 So the idea is that the tool, when it gets a cookie, is always trying to find matching 15:29.040 --> 15:36.520 characteristics in the database to see for which purpose this cookie is usually stored. 15:36.520 --> 15:40.480 You see, you have several ways to do the matching. 15:40.480 --> 15:45.360 If you have an exact match, it means that the domain which usually stores this cookie with 15:45.360 --> 15:51.440 this name has this specific purpose, I mean according to the knowledge base.
15:51.440 --> 15:58.000 If you have a domain match, it means that this domain usually stores cookies for this purpose, 15:58.040 --> 16:06.240 so this purpose could be the objective of this cookie. And then the final one that you 16:06.240 --> 16:12.440 can see is the name match, because usually you have a lot of first-party cookies, 16:12.440 --> 16:19.760 for instance analytics cookies, and so it will try to identify the name of the cookie 16:19.760 --> 16:22.720 with regard to some SDK, you know. 16:22.720 --> 16:29.920 So that's it. When you click on browse, it's just to have a preview 16:29.920 --> 16:35.400 of a website. Then, if you want to store your analysis in the tool, you have to create 16:35.400 --> 16:41.520 a new analysis, that's why you have this button, and all these analyses will be attached 16:41.520 --> 16:47.920 to scenarios. You know that you have to interact with the website itself to see 16:47.920 --> 16:53.760 its behaviour when you consent, when you reject cookies, or when you're just visiting it 16:53.760 --> 17:02.600 without interacting with the cookie banner, so all of these are scenarios attached to an analysis. 17:02.600 --> 17:09.520 So when you store an analysis, you will be able to mark every piece of information to assess 17:09.520 --> 17:15.120 whether it is compliant or not with the current regulation, so that's why you have 17:15.200 --> 17:22.320 all these ticks and these buttons, and then you can explain 17:22.320 --> 17:23.920 your evaluation. 17:23.920 --> 17:28.920 So that's it. Once you have finished an analysis, you can share it with 17:28.920 --> 17:36.720 others, and you can export it as a pretty-printed report using some templates, and you can 17:36.720 --> 17:47.760 export it in many formats like PDF, tachyx, etc.
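Editor's sketch, not from the talk: the three matching modes described above (exact match, domain match, name match) could look roughly like this. The `KnowledgeEntry` shape, the trailing `*` wildcard in name patterns and the `matchCookie` helper are assumptions made for illustration; the tool's real knowledge base format and matching rules may differ.

```typescript
// Hypothetical knowledge base format; not the tool's actual schema.
type MatchKind = "exact" | "domain" | "name";

interface KnowledgeEntry {
  domain: string; // domain that usually stores the cookie
  namePattern: string; // cookie name; a trailing "*" is a wildcard (assumption)
  purpose: string;
}

interface Match {
  kind: MatchKind;
  purpose: string;
}

// Compare a cookie name against a pattern with an optional trailing wildcard.
function nameMatches(pattern: string, name: string): boolean {
  return pattern.endsWith("*")
    ? name.startsWith(pattern.slice(0, -1))
    : pattern === name;
}

// Return every knowledge base hit for a cookie, strongest kind first:
// exact (domain and name), then domain only, then name only.
function matchCookie(
  name: string,
  domain: string,
  knowledgeBase: KnowledgeEntry[]
): Match[] {
  const matches: Match[] = [];
  for (const entry of knowledgeBase) {
    const sameDomain = entry.domain === domain;
    const sameName = nameMatches(entry.namePattern, name);
    if (sameDomain && sameName) {
      matches.push({ kind: "exact", purpose: entry.purpose });
    } else if (sameDomain) {
      matches.push({ kind: "domain", purpose: entry.purpose });
    } else if (sameName) {
      matches.push({ kind: "name", purpose: entry.purpose });
    }
  }
  const rank: Record<MatchKind, number> = { exact: 0, domain: 1, name: 2 };
  return matches.sort((a, b) => rank[a.kind] - rank[b.kind]);
}

// Made-up knowledge base on reserved example domains.
const knowledgeBase: KnowledgeEntry[] = [
  { domain: "analytics.example", namePattern: "_ga*", purpose: "analytics" },
  { domain: "shop.example", namePattern: "session", purpose: "technical" },
];

// A first-party cookie whose name looks like one set by an analytics SDK.
const firstPartyHits = matchCookie("_ga_ABC", "shop.example", knowledgeBase);
console.log(firstPartyHits);

// A cookie whose domain and name are both known.
const exactHits = matchCookie("session", "shop.example", knowledgeBase);
console.log(exactHits);
```

Only an exact hit ties a purpose to both the domain and the name; domain-only and name-only hits are suggestions that the auditor still has to confirm, which is how the talk presents them.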
So you remember that one of the requirements 17:47.760 --> 17:55.040 of the mission was to foster the reuse of knowledge regarding cookies, so there is 17:55.040 --> 18:04.400 a full editing tool to edit your own database. So when you install this tool, this is maybe 18:04.480 --> 18:09.840 the most complex thing: when you are not in a data protection authority, you don't have 18:09.840 --> 18:16.800 any knowledge base, mostly because most of the knowledge bases which are used by data protection 18:16.800 --> 18:23.920 authorities are covered by the secrecy of investigations, you know. But the tool allows you to create 18:23.920 --> 18:31.440 your own database and to reuse it, and I have made some examples of how to 18:31.520 --> 18:37.520 translate some existing databases to the format of the tool itself. So this is a project 18:37.520 --> 18:44.480 that I've put in one of my own repositories, you will find the URL there, and that makes the 18:44.480 --> 18:52.960 link to my research. I mean, there are a lot of research perspectives opened by this tool. 18:53.520 --> 19:00.720 One thing that could be done by any researcher, if you 19:00.800 --> 19:09.040 have worked on the cookie topic: you see, there are many databases which exist, but 19:10.080 --> 19:15.840 there is no common methodology to build these databases, to find the purposes in these databases. 19:15.840 --> 19:25.680 Some are using machine learning, and others are just reading the cookie notice, you know. 19:25.760 --> 19:30.640 So the idea is, if you have worked on it, then I will be happy to discuss with you to see whether 19:30.640 --> 19:36.320 we can translate it, and whether we can set a confidence level on it. 19:38.560 --> 19:46.320 Another research perspective: browsers are now enabling tracking protection by default, 19:46.960 --> 19:54.160 so it seems, I mean, there are a lot of research papers that show that there is some
alternative 19:54.240 --> 20:02.400 to cookies that websites use to try to track the user without using cookies themselves, 20:02.400 --> 20:08.800 or try to hide their own cookies by putting them under a first-party domain. So I'm thinking about 20:08.800 --> 20:14.480 CNAME cloaking, for instance, just to hide, you know, a URL, or taking a fingerprint of your 20:14.480 --> 20:22.320 terminal. So there are many research papers, there are many scripts, and I will be really interested 20:22.320 --> 20:29.760 to speak with anyone who has worked on this, to see if we can integrate their algorithms into the tool. 20:30.560 --> 20:39.200 And the last one is also regarding browsers: they are now trying to provide alternatives to cookies, 20:39.200 --> 20:45.440 like the privacy initiatives, Privacy Sandbox for Google, Privacy-Preserving 20:45.440 --> 20:55.680 Attribution for Firefox. So all these initiatives are, I mean, most of the DPAs say that 20:55.680 --> 21:05.760 they still require consent, so I will be very curious to see how popular these initiatives are, 21:05.760 --> 21:13.360 and if they are really compliant with all the expectations of the regulation. 21:14.240 --> 21:22.160 And of course, if you have any other research ideas, if you have any initiative on this project, 21:22.160 --> 21:28.560 I will be very happy to discuss with you after this presentation, and you can also find me later 21:28.560 --> 21:34.080 in the audience. I will be there today and also tomorrow, because I'm also speaking 21:34.080 --> 21:41.520 on another track, so contact me. And now, for the SPE part, I'll give the floor again to Amandine.
21:42.160 --> 21:50.480 So, for the project in particular, the first thing is that the EDPB is committing to continue 21:50.480 --> 21:57.760 to maintain it in the future, even to continue to develop it. For information, we have started a new 21:57.840 --> 22:05.600 SPE project this year to make a server version. We picked Jérôme again, hoping he will do a 22:05.600 --> 22:13.440 good job again, we will see. The idea is to make it collaborative, and we hope it will provide a way 22:13.440 --> 22:18.400 to store knowledge on the server side. And of course, if you have 22:18.400 --> 22:23.280 ideas you want to discuss, we are open to it. Thank you very much. 22:23.280 --> 22:27.280 Thank you. 22:29.280 --> 22:34.720 Do we have any questions? One question? I think this one works first. 22:38.160 --> 22:41.840 Hi, I'm Martin. It's good that you work on cookie 22:41.840 --> 22:46.880 compliance, like automated analysis of all those banners. I 22:46.880 --> 22:52.560 don't know of any work before to analyze those cookie banners, so that's pretty good. 22:53.280 --> 22:58.080 And the next question is, is there any work on the vendors themselves, the so-called vendors, 22:58.080 --> 23:03.360 the companies behind, because if you click on those links and you see, it's always bullshit. 23:03.680 --> 23:06.560 So, like, they're pointing at each other and it's a disaster. 23:09.360 --> 23:12.000 I'm sorry, I didn't get the question. 23:14.640 --> 23:20.560 Is it working? Okay, so the so-called vendors, the companies behind this tracking: is there any, 23:20.560 --> 23:26.240 like, a database analyzing their purposes? They have their own policies, which are usually, you know, 23:27.200 --> 23:29.440 generic and stuff. Thank you.
23:29.440 --> 23:35.840 It's a question of confidence in the information, so could we trust this kind of vendor 23:35.840 --> 23:42.320 information? Because you know that you have some technical cookies, and also that, I mean, 23:42.320 --> 23:51.120 one cookie can have multiple purposes, and all the information about the 23:51.120 --> 23:57.200 purposes of cookies is not always available, you know. So, yeah, I hope that 23:57.200 --> 24:01.760 answers your question. And at the same time, if you look at the two databases that 24:02.640 --> 24:08.560 Jérôme had on this slide, the first one was the EDPB one, so that's, you know, information that is 24:08.560 --> 24:14.160 shared on a need-to-know basis between SAs, but the second one is built upon a 24:14.160 --> 24:19.120 free software project that is completely not vetted by the EDPB. I'm not saying that, you know, 24:19.920 --> 24:25.760 we would do the same things on the legal part, but it is a base, I mean, 24:25.760 --> 24:29.200 it exists, and people are contributing to it. 24:31.760 --> 24:34.480 I'm sorry, I think that's all we have time for today, but thank you so much, 24:34.480 --> 24:40.400 thank you for your time. I'll be outside in front. And I do have cookies for those who want.