WEBVTT 00:00.000 --> 00:10.960 Okay, hi everyone, I am Pratikshaya, and thank you for joining this session, especially 00:10.960 --> 00:16.080 my talk. Today in my presentation I will be highlighting the open source components 00:16.080 --> 00:20.560 that we have to offer in the Copernicus Data Space Ecosystem. 00:20.560 --> 00:25.320 Within the presentation, we will start from the definition of remote sensing and 00:25.400 --> 00:31.080 Earth observation, which in a geospatial session might not be that relevant, but still 00:31.080 --> 00:36.360 I thought it would be nice to relate to Earth observation because prior to this presentation 00:36.360 --> 00:41.000 I have seen most of you talking about different geospatial applications but not exactly 00:41.000 --> 00:45.880 with the focus on Earth observation itself, so I thought it would be nice to start with 00:45.880 --> 00:53.160 that. Then I will introduce you to the Copernicus Data Space Ecosystem with the different components 00:53.240 --> 00:58.920 that we have to offer, including data, APIs and applications, but with the focus on the open source 00:58.920 --> 01:03.720 tools, and I will conclude by highlighting how we support the community and how 01:03.720 --> 01:11.400 the community can support us. 
Starting with the definition of remote sensing, just to connect 01:11.400 --> 01:19.000 among ourselves: naturally, when we want to observe anything we do that with our eyes, so remote sensing 01:19.080 --> 01:24.280 is something similar, in that there are sensors placed at a remote location and they try to 01:24.280 --> 01:31.320 observe something at a distance. That is exactly what remote sensing is. However, sensors 01:31.320 --> 01:36.280 have the added value that they can see beyond what the human eye can, so there is a range of the 01:36.280 --> 01:42.360 spectrum which they can identify and we cannot. So in short, this can be defined as the remote sensing 01:42.360 --> 01:48.680 process, where a sensor placed at a certain location tries to detect something to understand 01:48.760 --> 01:54.680 its characteristics and behavior. Connecting that with Earth observation: Earth observation is 01:54.680 --> 02:01.000 simply a remote sensing process where different satellites or aircraft placed at a higher 02:01.000 --> 02:07.480 location are trying to observe our Earth, whether it be above, below or on the surface of the Earth, 02:07.480 --> 02:12.200 and accordingly try to tell us about the physical characteristics or different related 02:12.200 --> 02:18.200 phenomena that are going on at the surface. This kind of information can be very helpful in creating 02:19.160 --> 02:25.880 spatio-temporal global maps or, as someone said earlier, maybe some pretty maps or some useful 02:25.880 --> 02:31.880 maps that can be used in making decisions or for various other research and planning purposes. These can 02:31.880 --> 02:38.600 also be very helpful to monitor different human activities at all scales. And if I talk about the 02:38.600 --> 02:44.280 different data sets, as well as what can be done with these data sets, it can be whether you want 02:44.360 --> 02:51.080 to simply monitor the ground or monitor the Earth surface, like in this case the global land cover 02:51.080 --> 02:55.640 
map that is getting popular in order to address food security and for other reasons, so that could 02:55.640 --> 03:01.800 be very helpful with the Earth observation data set, or even to monitor water bodies, not just 03:01.800 --> 03:07.560 in Alaska but maybe small rivers or lakes; that is also possible with the Earth observation 03:07.880 --> 03:14.440 data set. You can also analyze or observe the temporal variation 03:14.440 --> 03:19.640 that is going on throughout the year. We talk about global warming and everything, but there 03:19.640 --> 03:23.720 should be a pattern that we want to observe through the satellite data; that's possible with this 03:23.720 --> 03:29.320 Earth observation technique. There are also different kinds of data, that would be radar 03:29.320 --> 03:34.280 or optical data sets, based on which you can not only 03:35.240 --> 03:40.440 analyze what is going on at the Earth surface but also what's going on below. For example, in the case of 03:41.160 --> 03:47.240 maybe an earthquake or a landslide, they use radar-based images in order to analyze what is happening 03:47.240 --> 03:55.160 in the different related phenomena. So those are a few applications of remote sensing 03:55.160 --> 04:02.520 data sets. However, in the past, well, not just in the past, even every day, terabytes of data are 04:02.520 --> 04:08.600 being recorded, and it has always been kind of a complication how we store this data or 04:08.600 --> 04:15.320 how to manage this data. One task that, as a geospatial engineer, always came to my table used to 04:15.320 --> 04:21.160 be: okay, the data set is available out there, but how do I access it? And suppose I want to get 04:21.160 --> 04:26.040 the optical data as well as the radar data, but they are both located at two different 04:26.040 --> 04:31.640 locations, so do I have to take the steps in a different manner? 04:31.640 --> 
04:37.640 So that has always been kind of a discussion. So there came the Copernicus Data Space 04:37.640 --> 04:43.880 Ecosystem, where we are trying to centralize the data in one place with a set of different tools 04:43.880 --> 04:49.160 that can be used to directly access the data there, process it there and also analyze it 04:49.160 --> 04:55.560 using the different visualization tools that are available. That is, in summary, what the Copernicus 04:55.560 --> 05:01.400 Data Space Ecosystem is. So again, if I have to highlight it, it is a user-centric platform which 05:01.400 --> 05:07.320 allows you to access terabytes of Earth observation data sets; it gives you tools to 05:07.880 --> 05:15.720 access and process them and also a platform to visualize them. Within the CDSE there are different 05:15.720 --> 05:22.760 components, starting with the Sentinel data sets. I assume most of you have heard the term 05:23.640 --> 05:29.160 Copernicus. So Copernicus provides you with these Sentinel data sets, and the Copernicus Data Space 05:29.240 --> 05:35.240 Ecosystem will be the authoritative source where you will find the first-hand data sets. There 05:35.240 --> 05:39.720 are the Copernicus Contributing Missions data sets as well, which you will now find within this platform, 05:40.360 --> 05:45.160 and there will be additional EO data sets that would be available maybe at European scale 05:45.160 --> 05:52.280 or at global scale, it will depend. For example, a recently developed one is the WorldCover map; you can 05:52.280 --> 05:59.000 find this data set as well in the same single platform. And all of this you can access 05:59.400 --> 06:04.840 maybe directly from the website or using different streamlined data access and API tools, or even, 06:04.840 --> 06:11.480 there are different catalog APIs that are supported within the ecosystem; one example could be the STAC 06:11.480 --> 06:18.360 API itself. There is the Copernicus Browser, which allows you to even visualize how does this 
data 06:18.360 --> 06:24.760 look like, or how does this data look over time, so you can directly go and visualize it. And I 06:24.840 --> 06:30.600 think for those who are not yet sure how to get started with handling these data sets, the 06:30.600 --> 06:36.760 Copernicus Browser could be a good point to get yourself started. Then there are several open 06:36.760 --> 06:42.920 code repositories that we provide, along with cloud computing capacity; there is a certain 06:42.920 --> 06:47.880 amount of cloud computing capacity available for everyone who wants to use these data sets. 06:49.000 --> 06:54.600 And there are online code labs and interfaces, and there is also a concept called federation 06:54.600 --> 07:01.000 and user identity service. Suppose you have a set of data sets which you want to share with a global 07:01.000 --> 07:06.520 audience: you can reach out to the Copernicus Data Space team and accordingly share it with a wide 07:06.520 --> 07:12.920 audience. That's where the federation concept comes in. Like for Belgium, there is 07:12.920 --> 07:19.080 the entity called Terrascope, which provides you with different data sets more focused on 07:19.080 --> 07:24.200 Belgium, so that is one of the federated parts of the Copernicus Data Space Ecosystem. 07:24.600 --> 07:28.760 Similarly, if you have your own data set that you prepared and you want to provide, then that could 07:28.760 --> 07:35.800 also be part of it, and it makes the whole an open ecosystem within the ecosystem itself. 07:37.000 --> 07:41.800 So those were the different components that I talked about, what we have to offer in the Copernicus 07:41.800 --> 07:47.720 Data Space Ecosystem, most of them being open source. However, there is one very interesting 07:47.800 --> 07:55.400 component, which is openEO. To just simplify the definition of what exactly openEO is: 07:55.400 --> 08:03.080 openEO is an open source API in the form of source code, which you can 
use not only 08:03.080 --> 08:10.200 to access the data but also directly to perform analyses or do some processing 08:10.200 --> 08:20.200 with this satellite data on the cloud itself. There are a few pillars that are always used 08:20.200 --> 08:25.480 in defining what exactly openEO does. As I said earlier, there used to be different 08:25.480 --> 08:30.520 platforms that used to provide you with different satellite data, and it always used to be 08:30.520 --> 08:35.960 complicated: how many accounts do I create, or how many platforms should I visit in order 08:35.960 --> 08:41.800 to fetch the data? So now with openEO it becomes easier: you can just use your one account, 08:41.800 --> 08:47.160 and no matter in which backend the data is saved, you can access it. Additionally, I can also say 08:47.160 --> 08:53.960 that if your data is in STAC format and it is hosted somewhere, then using openEO you can also 08:53.960 --> 09:00.280 access that data. If the satellite data is not already there in the ecosystem, 09:00.280 --> 09:05.720 but there is some raster data that you want to use within your workflow, then you can access that 09:05.720 --> 09:12.280 using openEO. That would be a simple data access and processing workflow, and no matter what kind of 09:12.280 --> 09:16.600 workflow you develop using openEO, it will be scalable and efficient for processing. 09:17.960 --> 09:24.920 Since openEO is developed using open source code and supports the open 09:24.920 --> 09:30.360 community, we can also already say that it supports FAIR and open science principles. 09:31.000 --> 09:37.000 However, one more thing I would like to highlight about the open science and FAIR principles that 09:37.000 --> 09:42.760 openEO supports is that the code that you have developed is independent of the underlying technology, 09:42.760 --> 09:48.440 whether you develop using R or Python or even JavaScript, so openEO supports 
09:48.440 --> 09:52.600 whichever library you want to use, and it is independent of the underlying technologies. 09:53.000 --> 10:01.080 And the work that you prepare, you can reuse and share it with 10:01.080 --> 10:05.800 the global audience; I will later show you how you can do that. If you develop it using an openEO 10:05.800 --> 10:10.920 workflow, then you will just be provided with a namespace which you can 10:10.920 --> 10:18.520 access with any of the different APIs. What an openEO workflow looks like is here: I am trying to connect to 10:18.520 --> 10:25.800 the Copernicus Data Space backend, so I just connect using authenticate_oidc, and then I 10:25.800 --> 10:31.480 just provide what data I want to access, what the temporal extent is, what the spatial 10:31.480 --> 10:37.240 extent is. In this case I was trying to get Sentinel-1, which is radar data, so I am just trying 10:37.240 --> 10:45.000 to get two polarization bands out of it. In the next step I just showed a very simple 10:45.560 --> 10:51.960 process which is already built into openEO, min_time, a minimum over time, because we have 10:51.960 --> 10:56.920 asked for data over a temporal range, so we just want to reduce it to one time frame; 10:56.920 --> 11:02.840 that is why I just use one simple process here. However, there could be a scenario where, okay, 11:02.840 --> 11:08.280 there is a very complex workflow and the processes are not already there in openEO. 11:08.840 --> 11:14.040 Then what you can do is use this concept called user-defined functions, which are basically Python scripts, 11:15.320 --> 11:20.760 sorry, which are basically Python scripts that you can bring to openEO and 11:20.760 --> 11:27.160 make use of within your workflow. It does not have to be that all the processes are in there. And at 11:27.160 --> 11:34.040 the end you just, yeah, you just 
get the result. When you get the result, since, I am sorry, 11:34.040 --> 11:39.080 I just missed this part: openEO uses the concept called a data cube, because we have to deal 11:39.080 --> 11:44.200 with a large amount of data over different time frames, so it could be multi-spectral data as well as 11:44.520 --> 11:49.880 multi-temporal data. So we use the concept of a data cube, and the data cube is executed at the 11:49.880 --> 11:56.760 end of the whole workflow. You can also do it in an intermediate step; however, I am showing here in the 11:56.760 --> 12:03.400 example the last step. So that is what it would look like, and I think I wanted to show it in a 12:03.400 --> 12:10.200 demo. But before that, I mentioned earlier that you can share your code, and how, I just wanted to 12:10.200 --> 12:16.360 highlight already here. We have something called the openEO Algorithm Plaza, where if you have 12:16.360 --> 12:22.360 a workflow that you developed, and it should include the openEO component in it, you can already 12:22.360 --> 12:29.000 publicize it here so that everyone, a wide audience, can use it. One example is 12:30.920 --> 12:38.040 this PyEOGPR, which was recently published by one of the users, which is 12:38.040 --> 12:43.320 basically a Python machine learning library that tries to predict biophysical 12:43.320 --> 12:49.080 traits using Gaussian process regression. So similarly, if you have something, it does not have to be very 12:49.080 --> 12:55.400 complex, it can be very simple; you can already see a few examples of band math calculations 12:55.400 --> 13:00.920 that are done here. It can be from as simple as that to a very complex workflow; you can already share it 13:01.080 --> 13:09.560 here and, yeah, it will be used by a wide audience. That is the idea of shareability 13:09.560 --> 13:14.760 that we support in openEO. And I just quickly wanted to show you a demo in the JupyterLab 13:14.760 --> 13:20.200 environment that is offered to everyone in 
the Copernicus Data Space Ecosystem, so you can access it. 13:20.200 --> 13:26.840 You have three different flavors to choose from when using the JupyterLab environment, 13:26.920 --> 13:32.040 and you have R and Python kernels installed. I'm not sure how many here would be interested in 13:32.040 --> 13:37.720 using the R kernel; hopefully yes with the Python kernel, so I will show an example with Python itself. 13:38.920 --> 13:45.960 Let me quickly, I just wanted to show you what it looks like in the workflow. 13:46.840 --> 13:50.840 I hope I can 13:54.840 --> 13:56.840 it has not come 14:05.160 --> 14:07.160 already 14:07.160 --> 14:11.320 you open display, like in your settings, see if I can see the extra 14:12.440 --> 14:14.760 or what, I think it's fine 14:16.600 --> 14:20.440 you might have a button you can press, like F8, to switch display 14:20.440 --> 14:45.800 Okay, okay, so this is the Jupyter environment. If you register yourself in the Copernicus 14:45.800 --> 14:52.360 ecosystem you will get access to this for free, and then 14:52.360 --> 14:57.800 you can choose any of the kernels that you want to use. There is a separate openEO kernel provided 14:57.800 --> 15:03.960 for you as well. The idea of different kernels with different names is that we have already installed 15:03.960 --> 15:10.040 a few libraries that you might need; that's the idea. In addition to that, there are some sample 15:10.120 --> 15:16.680 examples provided here, so just go to openEO; there are a few to get started. For example, if I go 15:16.680 --> 15:26.280 inside one of these, what it looks like is basically, as I said earlier, that you just have to authenticate 15:26.280 --> 15:31.960 yourself and provide the different parameters that you want to, and in this case, yeah, use the process 15:31.960 --> 15:39.400 that was already built into openEO. But there can be cases where I want to do more and 15:39.400 --> 15:45.000 openEO does not have this 
process. So in this case I have shown it in the form of a string, which is not an 15:45.000 --> 15:50.120 ideal form, I can understand that, so you can just write it in your Python file, 15:50.840 --> 15:56.440 and what you have to do is just import it from there using the from UDF command of openEO 15:58.600 --> 16:03.000 and you just get the result. What it looks like at the end is this: it is a very cloudy image. 16:03.720 --> 16:08.920 So this is just a sample example of what you can do with Sentinel-2 data, just downloading 16:08.920 --> 16:14.680 and visualizing the RGB image, but you can do similarly with the different raster data sets you have 16:14.680 --> 16:20.040 available. As I said earlier, if you have your very high resolution data available in STAC, 16:20.840 --> 16:25.400 that is, in a STAC-compliant format, then you can directly fetch it using openEO and 16:25.400 --> 16:31.400 perform the analysis on it. It can be from as simple as just visualizing it to as complex as maybe 16:32.360 --> 16:37.080 running a machine learning model or using the trained model to get the inference. 16:38.200 --> 16:53.000 And in order to save, I think I have an example I wanted to show on a UDP, which is the 16:53.000 --> 16:58.760 user-defined process. The one that I showed you earlier was the user-defined function, which is 16:58.760 --> 17:05.080 a function that you define; user-defined processes are those which you want to save as a process for others 17:05.080 --> 17:11.480 to use or for yourself to use later. So all you have to do is define the input parameters 17:11.480 --> 17:16.280 and in which format you want them. In this case, for the temporal one I gave the schema that, okay, it should 17:16.280 --> 17:22.920 be a temporal interval and an array, and for the spatial extent a bounding box, or maybe a simple spatial 17:23.000 --> 17:29.400 extent which supports both a bounding box as well as a feature collection. Then you have to 17:29.400 --> 17:35.640 define the workflow as it was, but 
this time pass in the parameters, not the actual values, and 17:35.640 --> 17:40.280 at the end you just save it with whatever name you want to give it. 17:41.320 --> 17:48.680 The ID can be anything you want; however, we request it to be as unique as possible. 17:48.760 --> 17:54.840 In any case, it will use the user ID that is associated with your account, so the namespace itself will 17:54.840 --> 18:00.680 be unique, and that will help in identifying the process that you have created. And then you just share it: 18:00.680 --> 18:07.320 you will get a link, and that is what you share in the openEO Algorithm Plaza. So that was 18:08.600 --> 18:09.800 a quick demo as well. 18:10.680 --> 18:25.960 I am back to the action, but before that I wanted to also show you how you can contribute. 18:28.520 --> 18:34.600 We have everything provided on GitHub, so it is open source, of course. So if you want to, maybe 18:34.600 --> 18:39.080 if you already have an idea that, okay, this feature might be a good one to have in openEO as well, you can 18:39.160 --> 18:44.360 already come and maybe create issues or pull requests, anything like that. However, there is another 18:44.360 --> 18:51.880 repository called community examples, where we try to create examples, from basic to very complex ones, 18:51.880 --> 18:57.080 for users to have an idea of what they can do with remote sensing data sets and also with 18:57.080 --> 19:03.880 openEO. You can take a reference out of this, and it could be useful in some of your applications. Like 19:03.880 --> 19:10.120 in here, there is also an example of how to load your STAC data: data that is in STAC you can 19:10.120 --> 19:15.880 directly load in openEO, and there is an example of how to do that. There is how to do large 19:15.880 --> 19:23.080 scale processing, how to do multi-backend processing, and you can also find a whole range of different 19:23.080 --> 19:30.040 application examples in here. And you can also, if you already have 
your use case, you can directly 19:30.120 --> 19:35.400 come and create a pull request; if it is valid, I think that it will surely be 19:35.400 --> 19:42.520 accepted and merged. So I think that was more or less what I had to say, so do do do do 19:49.080 --> 19:52.360 maybe coming back to the presentation 19:53.160 --> 19:57.800 I think my time is also almost over 20:08.440 --> 20:15.000 So yeah, that was about community support. And when using Copernicus data specifically, if you 20:15.000 --> 20:21.400 run into any issue, then please feel free to post the issue in the forum or create a ticket. 20:21.400 --> 20:27.560 However, it is mostly encouraged to post it in the forum, because the issue that you ran into, someone 20:27.560 --> 20:35.000 else might also have the same problem, so it is always helpful in that scenario. And to summarize 20:35.000 --> 20:41.160 again about the Copernicus Data Space Ecosystem: it is a free public platform where you can share 20:41.160 --> 20:46.120 your data, you can use the data sets that are already there, use the tools that are provided to 20:46.760 --> 20:52.600 you, or you can also share the tools that you have developed. That was more or less all. Thank you for 20:52.600 --> 21:05.480 your attention on the app of every round so yeah, I also have my team member from the Copernicus 21:05.480 --> 21:09.640 Data Space Ecosystem here, as well as the openEO core team, so if you have any in-depth 21:09.640 --> 21:26.840 questions as well, please feel free to raise them. Yes, please. Yes, everything happens in the cloud. 21:32.200 --> 21:33.560 Yes, yes, that as well. 21:40.200 --> 21:46.120 I don't think so; it is not publicly shown, for sure. The user-defined function, even for the 21:46.120 --> 21:51.640 UDP, you're just posting the URL; you don't show the whole code, and everything is handled as the 21:51.640 --> 21:59.320 process wrapped in a JSON format. So yeah, your UDP, UDF, it won't be publicly announced. I don't know 21:59.320 --> 22:04.680 from the backend 
side. yeah I am to that you have to tell to be that in existence will be able to 22:04.680 --> 22:10.280 use the spot system that's ready to run in and watch so if you are running the tasks on the 22:10.280 --> 22:15.160 set of that data you're fighting for the global learning of service if you need to be in a 22:15.160 --> 22:26.280 speed up especially during the course and then what's the speed up maybe this was can I mine 22:34.280 --> 22:39.960 yes please. Yeah, three years ago this data would be offered free to download for almost 22:39.960 --> 22:46.440 everybody on ESA's SciHub platform and also in national archives such as France's PEPS program, 22:47.320 --> 22:53.400 both of which were decommissioned rather recently, moving towards this cloud-based platform 22:54.600 --> 22:59.320 on which it is much more difficult to export the data out to process on a local machine; 22:59.320 --> 23:04.840 you kind of have to use the given platform and given machines, given virtual 23:04.920 --> 23:13.640 machines, by the service. Is there any, why is such, why can I know the 23:13.640 --> 23:19.800 export of data regarding the question of, like, yeah, there was SciHub and everything that allowed 23:19.800 --> 23:25.080 you to download the data, but now why cannot you download it locally? I don't yet but not 23:25.080 --> 23:30.840 no longer in high volumes. I think it should be possible in high volume as well. With regards to 23:30.840 --> 23:36.280 openEO, it is more about processing, so it is not only about downloading the data, but with regards 23:36.280 --> 23:41.800 to downloading high volumes, that also should be possible, unless you go beyond, yeah, beyond too much. 23:42.360 --> 23:46.920 That should not, yeah, there is certainly a limit, because everyone around the world, we have given certain 23:46.920 --> 23:53.080 quotas, so that's the thing. And SciHub is now the Copernicus Browser, that's there, so I think the 23:53.080 --> 23:59.160 feature 
that was there in SciHub we still preserve now, that's there, and I think it should be 23:59.240 --> 24:07.000 possible. But yeah, how about the tooling that was also deprecated around the same time, the Python interfaces 24:07.000 --> 24:14.120 to SNAP? With regards to SNAP, I don't know how it would be related with this one. Yes, I'm sorry. 24:17.160 --> 24:24.760 Other things, like, are the data cubes based on xarray data cubes? They are, to some extent, 24:25.400 --> 24:31.240 but yeah, compared to the data cube that is handled in xarray, this one is slightly different in the part 24:32.280 --> 24:39.880 that there is not only multi-spectral but also temporal and, yeah, in cluster, yeah, I think it is divided accordingly, 24:39.880 --> 24:41.880 unless I don't want to add something 24:55.320 --> 25:02.600 okay so thank you 25:06.840 --> 25:10.840 thank you
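
NOTE The talk says that an openEO workflow (load a collection, reduce it over time, then turn it into a user-defined process by passing parameters instead of actual values) is handled as a process wrapped in JSON. The sketch below is not the speaker's notebook. It is a minimal standard-library illustration of what such a process graph could look like, following the node layout of the openEO API (process_id, arguments, from_node, from_parameter). The collection ID, band names, extents, the dimension name "t", the output format and the process ID "s1_min_composite" are all assumptions chosen for illustration.

```python
import json


def build_process_graph(collection_id, spatial_extent, temporal_extent, bands):
    """Build an openEO-style process graph: load data, take the minimum over time, save."""
    return {
        "load1": {
            "process_id": "load_collection",
            "arguments": {
                "id": collection_id,
                "spatial_extent": spatial_extent,
                "temporal_extent": temporal_extent,
                "bands": bands,
            },
        },
        "reduce1": {
            "process_id": "reduce_dimension",
            "arguments": {
                "data": {"from_node": "load1"},
                "dimension": "t",  # assumed name of the temporal dimension
                "reducer": {
                    "process_graph": {
                        "min1": {
                            "process_id": "min",
                            "arguments": {"data": {"from_parameter": "data"}},
                            "result": True,
                        }
                    }
                },
            },
        },
        "save1": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "reduce1"}, "format": "GTiff"},
            "result": True,
        },
    }


# Concrete values, as in the first part of the demo (all values hypothetical).
concrete = build_process_graph(
    collection_id="SENTINEL1_GRD",
    spatial_extent={"west": 5.0, "south": 51.2, "east": 5.1, "north": 51.3},
    temporal_extent=["2023-06-01", "2023-06-30"],
    bands=["VV", "VH"],
)

# User-defined process (UDP): same workflow, but parameters replace the values.
udp = {
    "id": "s1_min_composite",  # hypothetical process ID
    "parameters": [
        {
            "name": "spatial_extent",
            "description": "Area of interest.",
            "schema": {"type": "object", "subtype": "bounding-box"},
        },
        {
            "name": "temporal_extent",
            "description": "Start and end date.",
            "schema": {"type": "array", "subtype": "temporal-interval"},
        },
    ],
    "process_graph": build_process_graph(
        "SENTINEL1_GRD",
        {"from_parameter": "spatial_extent"},
        {"from_parameter": "temporal_extent"},
        ["VV", "VH"],
    ),
}

print(json.dumps(udp, indent=2))
```

In practice the openEO client libraries generate this JSON for you, so a sketch like this is mainly useful for seeing why a shared UDP exposes only a process ID and its parameters rather than the original script.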