WEBVTT 00:00.000 --> 00:26.120 Thank you. Yes, probably not a good idea to not write my name there, but it'll come 00:26.120 --> 00:33.240 later. So, yeah, I'm going to talk about OpenCloud and software-defined storage and the 00:33.240 --> 00:39.160 aspects of storage that we are consuming, and for that I would really like 00:39.160 --> 00:45.880 to ask a few questions, but they're not here. Oh, I've started on the wrong slide. So, before I get 00:45.880 --> 00:52.880 started: who is using some kind of software-defined storage at home? At home, 00:52.880 --> 01:00.840 really? I don't think... Okay, like 15 people. Okay, who's using it on the job? 01:00.840 --> 01:06.840 Everyone, or nearly everyone — okay, maybe two hands stayed down. Okay, cool. Who here knows 01:06.840 --> 01:14.840 what OpenCloud is? That's pretty good, okay, like 12 people, 13. Who of you knows 01:14.840 --> 01:25.800 that OpenCloud has different storage drivers? 1, 2, 3, 4, 5. Okay, who of you knows that OpenCloud 01:25.800 --> 01:31.480 can actually synchronize files when they are changed on the storage, on the software-defined 01:31.480 --> 01:38.280 storage? The guy from CERN — and you are from... I didn't know, okay, but, okay, cool. So maybe 01:38.280 --> 01:44.400 you will get to know a few more things about what OpenCloud can do. Who am I? My name's 01:44.400 --> 01:52.000 Jörn. I was employee number seven at ownCloud. That was ages ago, and I stayed 01:52.000 --> 02:01.280 there until ownCloud was bought by Kiteworks, and then found myself at OpenCloud, where 02:01.280 --> 02:08.200 we continue the work on ownCloud Infinite Scale, which is a rewrite of the backend of 02:08.280 --> 02:14.960 ownCloud that was started by CERN, not by us.
So, they just made an offer that we couldn't 02:14.960 --> 02:22.800 refuse, and we actually tried to collaborate on that and continue it, and adapt it to our 02:22.800 --> 02:29.720 needs, to the needs of our end users. And now I'm trying to deploy this in Kubernetes contexts and 02:29.720 --> 02:37.720 find a good software-defined storage that works well in that scenario. Okay, so our end users 02:37.800 --> 02:43.400 really have a few things that they got used to, which you all know from the days when 02:43.400 --> 02:49.400 Dropbox started: when somebody changes something, the file should be synchronized, the desktop client 02:49.400 --> 02:54.280 should pick up the files — that's the normal synchronization stuff, and it has some requirements 02:54.280 --> 02:59.720 on the storage, as we will see later. And the other thing: if I send a public link to 02:59.720 --> 03:06.040 someone, that link should not break when somebody renames files in the storage. Okay, so we cannot 03:06.120 --> 03:13.640 do path-based sharing links. So an ID is important. We have the typical things that people 03:13.640 --> 03:19.800 see, so we currently show the full tree size of a folder with all the files in it, which is 03:19.800 --> 03:25.560 not what you see in a POSIX file system, but that's what people got used to. We have a trash 03:25.560 --> 03:31.560 that works automatically. We have per-file revisions, so if you write anything, you can 03:31.560 --> 03:37.960 always restore a version, and our end users are actually used to that now. And it's all about self-service 03:37.960 --> 03:44.040 and enabling end users to take care of their files without having to talk to an administrator.
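The point about links surviving renames comes down to resolving shared links by a stable file ID rather than by path. A minimal sketch of that idea in Python (illustrative only — the `IdRegistry` class and its method names are hypothetical, not OpenCloud's actual API; OpenCloud itself is written in Go):

```python
import uuid

class IdRegistry:
    """Maps stable file IDs to current paths, so a public link that
    stores only the ID keeps working after the file is renamed."""

    def __init__(self):
        self._path_by_id = {}
        self._id_by_path = {}

    def register(self, path):
        """Assign a permanent ID to a path; the ID is what a share link stores."""
        file_id = str(uuid.uuid4())
        self._path_by_id[file_id] = path
        self._id_by_path[path] = file_id
        return file_id

    def rename(self, old_path, new_path):
        """A rename only updates the path side; the ID never changes."""
        file_id = self._id_by_path.pop(old_path)
        self._id_by_path[new_path] = file_id
        self._path_by_id[file_id] = new_path

    def resolve(self, file_id):
        """Link resolution: ID in, current path out."""
        return self._path_by_id[file_id]
```

With path-based links, the rename would have broken the share; here the ID indirection absorbs it.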
03:45.640 --> 03:51.720 And the last thing is, okay, users add metadata like tags, favorites, and whatnot, 03:51.720 --> 03:59.320 and they want to be able to search and find things based on these metadata tags or based on 03:59.320 --> 04:06.840 file content, so we actually have to index files and look inside them. Now, if you recognize this — 04:07.720 --> 04:13.480 I hope so — I mean, this looks like the average software-defined storage. You have some kind of 04:13.480 --> 04:19.240 metadata storage that is replicated. You have some kind of bucket storage where all the blocks are, 04:19.240 --> 04:25.160 you have some kind of system that's translating all this stuff, and some side services. They all look 04:26.120 --> 04:30.680 the same. And actually, ownCloud 10 looks like this: you have your Apache server, your PHP code, 04:31.640 --> 04:37.800 a Galera cluster, or Oracle, or Postgres or whatever, and you have a POSIX storage, 04:37.800 --> 04:41.640 or even an S3 storage. So all these systems look the same. The next slide looks like this. 04:43.000 --> 04:46.760 This one looks like this, this one looks like this, this one looks like this. 04:47.240 --> 04:54.120 [inaudible] also looks like this. And Samba, I think, also looks like this. 04:56.920 --> 05:03.480 He's nodding, okay. Okay. So basically, you can press all your software-defined 05:03.480 --> 05:11.640 storage solutions into this kind of architecture. And I find it pretty sad that we are all 05:11.640 --> 05:17.240 kind of re-implementing these things. And as OpenCloud, we don't really want to become the 05:17.240 --> 05:23.960 next software-defined storage, but in effect, we are. And I would like to change that and just 05:23.960 --> 05:28.120 expose whatever software-defined storage is there and make it accessible. That's the whole idea. 05:28.120 --> 05:33.000 We should not be a storage; we should make storage accessible.
05:35.400 --> 05:40.120 So, of the different storage drivers, I will only use two as examples. We have 05:40.200 --> 05:44.040 decomposedFS, which kind of turns the POSIX file system on its head, 05:44.600 --> 05:51.400 because, as I said, we need to address files by their ID. And posixfs, which tries to integrate 05:51.400 --> 05:59.720 with the underlying storage as well as it can. So, back to the properties that we need, or that our 05:59.720 --> 06:07.800 end users are used to. Our sync algorithm is based on ETag changes, and we need the storage 06:08.120 --> 06:14.440 side to propagate the ETag from a leaf that changed up to the root, because that's how our 06:14.440 --> 06:20.920 sync client detects where it should navigate to to actually find the changes and synchronize them. 06:22.520 --> 06:29.400 In decomposedFS, or when we use posixfs without watching, we have everything in our control, 06:29.400 --> 06:35.400 all the requests go through our stack. We can propagate the changes ourselves, everything's nice. 06:35.880 --> 06:42.360 But if we want to allow other processes to interact with our files in the POSIX scenario, 06:42.360 --> 06:48.040 we need to be notified somehow. I mean, CephFS has a recursive change time. We can use that: 06:49.240 --> 06:53.960 it internally propagates slowly, and then we can use it to make synchronization work. 06:54.600 --> 07:01.240 EOS has a tree modification time, a tmtime property, that was actually built for 07:01.240 --> 07:06.280 ownCloud 10 back in the day, because they have their own backend. But local filesystems — well, 07:07.720 --> 07:13.080 I'm not aware of any local filesystem that has this kind of property. Of course, 07:13.080 --> 07:20.600 you can use change notifications, like inotify, that allow us to then do the ETag propagation 07:20.600 --> 07:26.360 ourselves.
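The ETag propagation described here can be sketched in a few lines. This is illustrative Python, not OpenCloud's implementation (which is in Go); the dict-of-ETags model and the `propagate_etag` function are assumptions for the sketch:

```python
import hashlib
import posixpath

def propagate_etag(etags, path, new_etag):
    """Assign a new ETag to `path`, then recompute every ancestor's ETag
    up to the root, so a sync client can start at '/' and descend only
    into directories whose ETags changed since the last sync."""
    etags[path] = new_etag
    while path != "/":
        path = posixpath.dirname(path)
        # a directory's ETag is derived from its children's ETags,
        # so any change in a subtree bubbles all the way up to the root
        children = sorted(etag for p, etag in etags.items()
                          if p != path and posixpath.dirname(p) == path)
        etags[path] = hashlib.sha1("".join(children).encode()).hexdigest()
```

A client that cached the root ETag only needs one comparison to know whether anything anywhere in the tree changed.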
We also need it for other things, like the tree size aggregation: if something 07:26.360 --> 07:30.440 changes, we can actually calculate the new size and propagate that up to the root, so that end 07:30.440 --> 07:40.360 users see in the web UI: this has so many bytes in it. But inotify only works on one machine. 07:41.560 --> 07:48.760 There are integrations like Samba that expose this, but if we now have 07:48.760 --> 07:53.240 two or three instances of our software that all mount the same storage, they will all see 07:53.240 --> 07:59.240 these notifications. So we get them two or even three times. It would be nicer if there was 07:59.240 --> 08:05.720 some kind of queue that we could digest to see what happens on the storage. And that already exists: 08:05.720 --> 08:11.640 GPFS has a Kafka queue. They have a file audit log that we could also consume; that 08:11.640 --> 08:19.400 is implemented. For CephFS, there is a company, Croit, that built an event queue based on the metadata 08:19.480 --> 08:28.680 service that we can consume. The guys from Leil Storage, SaunaFS, also have something for the metadata 08:29.400 --> 08:36.360 service. Samba has notify events that we could use. So there are things — a generic 08:36.360 --> 08:42.120 notification stream that we can consume. But yeah, this is something that we would really 08:42.120 --> 08:46.680 like to have: being notified when something changes, so that we can actually propagate it ourselves. 08:47.640 --> 08:54.280 The second most crucial thing, which goes back to being able to share links, is that files 08:54.280 --> 09:00.760 need to be addressable by ID. We need to be able to look them up by ID, not only for links, 09:00.760 --> 09:07.160 but also for metadata.
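One reason a shared event queue beats raw inotify here: with several instances mounting the same storage, the same change is observed several times, so events need stable IDs and deduplication before they drive propagation. A minimal sketch under those assumptions (the event-dict shape with an `"id"` field is invented for illustration):

```python
def dedupe_events(raw_events):
    """Collapse duplicate change notifications. When two or three instances
    each forward the inotify events they see on the same mounted storage,
    the same change arrives multiple times; stable event IDs let a consumer
    process each change exactly once."""
    seen = set()
    unique = []
    for event in raw_events:
        if event["id"] in seen:
            continue  # already handled, e.g. reported by another mount
        seen.add(event["id"])
        unique.append(event)
    return unique
```

A proper queue (Kafka, NATS) with consumer groups gives the same exactly-once-per-group behavior without client-side bookkeeping, which is why the talk asks storage systems to provide one.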
So all the data that we index has a file ID, and we need to be able to look 09:07.160 --> 09:15.000 up the file of a search result from the search index by ID, because if somebody moves files around, 09:15.080 --> 09:23.800 these links should still work. There are some more details here, but I think I don't need to go 09:23.800 --> 09:31.320 into the difference between decomposedFS and posixfs. Maybe a little bit: decomposedFS, 09:31.320 --> 09:37.480 since it stores files by their ID, works without anything else, 09:37.800 --> 09:45.320 and because all the requests go through our stack, we don't need to rely on any other system. 09:45.320 --> 09:52.840 We don't need to be notified of anything. But if there is an external file system, we cannot use 09:52.840 --> 09:58.760 the native properties of POSIX files, because inodes are typically reused. If you think about it, 09:58.760 --> 10:03.320 we could just use the inode — no: if you delete a file and then create a new file, it might be 10:03.320 --> 10:10.600 that the new file gets an old file ID that has been shared, and now you would point to a new file 10:10.600 --> 10:16.600 with completely different data, so that's not an option. So what we do is write an extra 10:16.600 --> 10:23.000 ID as an extended attribute on the file, and that's how we keep track of these files. Then we use 10:23.000 --> 10:29.320 the change notifications to detect if a file was renamed, or if it's new, or if it was deleted, 10:29.320 --> 10:38.360 and then we clean up our cache. Our cache, which we need to populate on start, because it's really 10:38.360 --> 10:45.800 a cache, it's not persisted; we just keep it in NATS for now. We have ideas to actually 10:45.800 --> 10:51.800 make it persistent, but then we would really be a software-defined storage, because then we would keep 10:51.880 --> 10:59.080 all the metadata and we would no longer need to access the files at all.
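The "extra ID as an extended attribute" approach can be sketched like this. The attribute name `user.oc.fileid`, the function name, and the in-memory fallback (for filesystems without user xattr support) are all assumptions of this sketch, not OpenCloud's actual scheme:

```python
import os
import uuid

ATTR = b"user.oc.fileid"  # hypothetical attribute name for this sketch
_fallback = {}            # stand-in store for filesystems without user xattrs

def ensure_file_id(path):
    """Return the stable ID stored on the file, assigning a fresh UUID on
    first sight. Unlike the inode number, this ID is never reused when a
    file is deleted and another one is created."""
    getx = getattr(os, "getxattr", None)  # Linux-only in the stdlib
    setx = getattr(os, "setxattr", None)
    if getx:
        try:
            return getx(path, ATTR).decode()
        except OSError:
            pass  # attribute not present yet, or xattrs unsupported
    if path in _fallback:
        return _fallback[path]
    file_id = str(uuid.uuid4())
    if setx:
        try:
            setx(path, ATTR, file_id.encode())
            return file_id
        except OSError:
            pass  # e.g. a filesystem that rejects user.* attributes
    _fallback[path] = file_id
    return file_id
```

Because the ID travels with the file's metadata, a rename detected via change notifications only needs a cache update, never a re-share.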
We have trash, file 10:59.080 --> 11:05.320 revisions, snapshots — and in decomposedFS, since we are in full control, 11:05.320 --> 11:09.880 we can write new files to a new location, we can keep track of older versions, that's all easy, 11:09.880 --> 11:16.600 and we can keep track of trash. In a POSIX file system, yes, we can follow the free- 11:16.600 --> 11:21.960 desktop.org trash specification, so we put files in a special location, and that follows standards, 11:21.960 --> 11:29.800 that works. But file revisions — the only file system that has file revisions is Windows NTFS. 11:31.800 --> 11:36.600 So if somebody knows a file system that has revisions — looking at you, Samba. No? 11:37.720 --> 11:43.080 Okay, that would be cool. But a lot of file systems support snapshots. 11:44.040 --> 11:49.720 CephFS — actually, you can use CephFS with snapshots without us doing anything, if you know how 11:49.720 --> 11:55.400 that works, but it might be a good idea for us to expose that: oh, by the way, you could do a snapshot 11:55.400 --> 12:00.360 on this file system. That's something that we haven't explored yet, because end users are still 12:00.360 --> 12:07.800 used to the per-file revisions that we have. Tree size aggregation is the idea of being able 12:07.800 --> 12:15.640 to see how many bytes are below this tree, and again, in decomposedFS, since 12:15.640 --> 12:21.000 we are in control, everything's fine, but in posixfs we have to actually maintain that. 12:21.000 --> 12:28.600 We can do that with the change notifications, which we need anyway for the ETag or change 12:28.600 --> 12:33.560 propagation, or the tmtime propagation, and we do it in the same step. So when something changes, 12:33.640 --> 12:38.600 we not only propagate the change time, but also the new size of the folder up to the root.
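The freedesktop.org trash layout mentioned above is simple enough to sketch: the file body goes under `files/` and a `.trashinfo` record with the original path and deletion date goes under `info/`, which is what makes restore possible. Illustrative Python (name-collision handling, which the spec also covers, is omitted here):

```python
import datetime
import os
import shutil
from urllib.parse import quote

def trash_file(path, trash_dir):
    """Move `path` into a freedesktop.org-style trash directory:
    body to <trash>/files/<name>, metadata to <trash>/info/<name>.trashinfo."""
    files_dir = os.path.join(trash_dir, "files")
    info_dir = os.path.join(trash_dir, "info")
    os.makedirs(files_dir, exist_ok=True)
    os.makedirs(info_dir, exist_ok=True)
    name = os.path.basename(path)
    when = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
    # the original path is percent-encoded, per the spec
    info = "[Trash Info]\nPath={}\nDeletionDate={}\n".format(
        quote(os.path.abspath(path)), when)
    with open(os.path.join(info_dir, name + ".trashinfo"), "w") as f:
        f.write(info)
    shutil.move(path, os.path.join(files_dir, name))
```

Restoring is the reverse: read the `.trashinfo`, move the body back to `Path`, delete both entries.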
12:40.520 --> 12:48.840 CephFS and EOS both have a recursive change time and a recursive size — what do they call it? 12:48.840 --> 12:53.240 CephFS calls it recursive counting, and there's a mount option that you have to set, but then it 12:53.240 --> 12:59.960 actually does that. So a lot of software-defined storages have these properties already out there, 13:00.520 --> 13:06.440 and we are kind of re-implementing them. That's annoying. The last point, metadata indexing, 13:06.440 --> 13:11.160 is probably not something that software-defined storages will implement, because it might be out of 13:11.160 --> 13:19.560 their domain, but whenever we find a file change, we actually index it. We look at the metadata, 13:19.560 --> 13:25.800 and eventually we can index the content via Tika, and extract content, even do OCR, which takes a while 13:25.800 --> 13:30.120 if you have thousands of pictures, but that's actually nice, because we can search for anything. 13:31.720 --> 13:37.160 But it would be great if the software-defined storage — if there is a software-defined storage 13:37.160 --> 13:43.080 that has some kind of metadata index that we can query. I mean, yeah, right now we are using 13:43.960 --> 13:50.760 BadgerDB, or OpenSearch, which was forked from Elasticsearch, for this reverse lookup and metadata, 13:51.480 --> 13:57.320 but I guess we could combine there. If there is something in the storage system, 13:57.320 --> 14:04.600 if there is a reverse index somewhere, maybe we can integrate there.
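The "reverse lookup" the talk keeps returning to is an inverted index: token in, set of file IDs out. A toy version in Python, just to fix the idea (real systems like OpenSearch add tokenization, ranking, and persistence on top):

```python
from collections import defaultdict

def build_index(docs):
    """Minimal inverted index: maps each whitespace token (lowercased)
    to the set of file IDs whose extracted text contains it."""
    index = defaultdict(set)
    for file_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(file_id)
    return index

def search(index, query):
    """AND-search: return the file IDs containing every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

Note the results are file IDs, not paths — which is exactly why the ID-addressability requirement above also applies to search.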
And then there is wishful thinking, 14:04.600 --> 14:11.800 I thought. Because our end users are using the web UI, they can probably only see 30 files or so, 14:12.520 --> 14:20.200 which is a performance bottleneck when you have a directory with 10,000 files — 14:20.200 --> 14:27.160 it doesn't sound like much, but in the web UI we have to fetch that via WebDAV, with dozens of properties, 14:27.160 --> 14:34.280 and then this request of "please give me this folder" becomes megabytes big, and that is really annoying. 14:34.280 --> 14:38.920 So I would really like to save some time and only get the files that we need — to have a 14:38.920 --> 14:47.000 paginated directory listing. And I already heard that Samba can do that, so I will look 14:47.000 --> 14:53.240 into that. But are there any other software-defined storage solutions that can give me a paginated 14:54.120 --> 15:03.080 directory listing, maybe even with only specific properties, sorted by name, size, or change time? 15:03.160 --> 15:10.840 That would be my wishful thinking. So if there is nothing out there, where would I propose that? 15:10.840 --> 15:15.240 Where would I go? I can come here next year, so if you have any ideas, ping me. 15:18.200 --> 15:21.880 And then the question is, how do we interact with these systems? Currently, we are using 15:22.520 --> 15:31.720 POSIX syscalls, which is kind of slow, at least when we compare it — 15:31.720 --> 15:37.160 when we're talking about network file systems, because they have to do caching, and then we get cache 15:37.160 --> 15:45.160 consistency issues, which we don't want. So we actually cache all the metadata of files in NATS, and 15:45.160 --> 15:49.560 when doing a directory listing, doing 1,000 calls to NATS is faster than 15:49.560 --> 15:57.640 doing 1,000 syscalls in our case.
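The paginated, sorted directory listing in this wish can be sketched against a local directory. Illustrative Python (the `list_page` function and its parameters are invented for the sketch — the whole point of the talk is that the server has to sort all entries once anyway, which is the work the speaker wants pushed into the storage system):

```python
import os

def list_page(path, offset=0, limit=30, sort_key="name"):
    """Return one page of a directory listing, sorted server-side, so a
    web UI that shows ~30 entries never transfers the whole 10,000-entry
    folder. Sorting still requires scanning every entry once."""
    keys = {
        "name":  lambda e: e.name,
        "size":  lambda e: e.stat().st_size,
        "mtime": lambda e: e.stat().st_mtime,
    }
    entries = sorted(os.scandir(path), key=keys[sort_key])
    return [e.name for e in entries[offset:offset + limit]]
```

A storage-native version could keep the directory sorted internally and answer each page in O(limit), which is what makes it wishful thinking for POSIX.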
And I think — is there a software-defined storage that 15:57.640 --> 16:03.480 actually has some other kind of API? EOS from CERN, for example, has a gRPC API that we could use. 16:06.040 --> 16:13.640 I think CephFS has libcephfs, which we can use. So there are other options, and I would really 16:13.640 --> 16:27.800 like to hear of those. Yeah, maybe the idea was: if somebody is working on software-defined 16:27.800 --> 16:33.400 storage, shouldn't we all organize, or shouldn't we come together and define some kind of API that 16:33.400 --> 16:38.760 we all implement — or that you all implement, that we then can consume, not only we, but maybe others as 16:38.760 --> 16:46.600 well — and be notified of change notifications, and these things? Something like a "filesystemd", maybe, I'd call it. 16:47.400 --> 16:54.040 I don't know — maybe the CS3 API, which is an API that is gRPC based, that's developed in 16:54.040 --> 16:59.400 the scientific world, is an option there, and I can only invite you to come to the talks 16:59.400 --> 17:06.920 at the CS3 conference, to try to find ideas and ways forward to better connect these things 17:06.920 --> 17:14.520 together. So, yeah, we don't want to become the next software-defined storage, because there are 17:14.520 --> 17:20.440 good solutions out there, and I really want to expose them and make them accessible to the outside 17:20.440 --> 17:28.280 world with a nice UI, and there are sync clients, mobile clients. So yeah, let's get in contact — 17:28.280 --> 17:34.040 talk to me, find me. You will find @butonic basically everywhere; if you see @butonic, that's me, 17:34.440 --> 17:41.240 and my email address is here as well, so you will know how to find me. Does that 17:41.240 --> 17:46.040 make sense? Thank you. 17:48.440 --> 17:52.440 Happy to dive into any questions. 17:52.440 --> 17:55.240 [Audience] Have you talked about sharing?
17:55.240 --> 18:02.600 Only about — yeah, I haven't talked about sharing. I was mentioning public links; that's the 18:02.600 --> 18:09.720 part that works with every POSIX file system, because we can always make files accessible 18:09.720 --> 18:13.800 via public links, and they mustn't break if something changes on the 18:13.800 --> 18:20.600 POSIX file system, if paths change. If you're talking about sharing between users on an instance, 18:20.680 --> 18:25.480 the interesting thing is that that requires even more integration with the underlying storage — 18:25.480 --> 18:32.440 not only the storage, but the user management system. So we can integrate and allow people — 18:33.720 --> 18:42.120 we can actually support using the UID and GID of a user to chown the files that are written by 18:42.120 --> 18:47.880 them. So you can configure a space — or you can configure OpenCloud — to always chown 18:47.880 --> 18:54.520 the files in a specific space to be owned by the owner of the root of that space, so that 18:54.520 --> 19:00.200 all the files that are written in that space will always be owned by the same user. But that 19:00.200 --> 19:05.240 always requires a deeper integration with the underlying storage system and the user management. 19:05.240 --> 19:12.840 And for guest users, the question then is: how do you organize that when a user is provisioned? 19:13.800 --> 19:19.160 Do we have the permission to create a new user in the customer's 19:19.160 --> 19:25.800 identity management system on the fly, kind of?
That's rarely the case. They usually have processes 19:25.800 --> 19:30.440 that they follow to onboard new users, but then it's possible to actually integrate and take 19:30.440 --> 19:36.520 existing users and make the files on the file system actually be owned by the user and group 19:36.600 --> 19:43.320 that exist in the system. But it's the full circle: the identity provider needs to be 19:43.320 --> 19:48.840 tied to the LDAP server, needs to be tied to the storage system, needs to be available in the systems. 19:49.800 --> 19:58.280 So that's cool, it does work, but yeah, it's challenging. Thank you. More questions? 19:58.280 --> 20:16.280 So the question is whether we can work without NATS. We don't absolutely require NATS, but we require some kind of 20:16.280 --> 20:22.120 cache. And I'm aware of the Jepsen report, yes. I'm aware of the Jepsen report on NATS, yes. 20:22.360 --> 20:29.720 Thank you. What is NATS, actually? NATS is a key-value store — actually, it's a streaming 20:29.720 --> 20:36.840 solution that is native to Kubernetes, to cloud-native environments, and on top of streaming they implemented 20:36.840 --> 20:45.000 a key-value store and an object store, and we only use it for caching. We could have used — 20:45.080 --> 20:52.200 we are using it like Redis, and we are using it as our event queue internally, and 20:52.200 --> 21:00.280 it's really nice, it works really well in Kubernetes, and if you deploy via Compose, 21:00.280 --> 21:06.600 then we just use an embedded NATS — you can embed it — so for us it's a really nice solution. 21:10.440 --> 21:11.240 Any more questions? 21:11.240 --> 21:18.840 Yes, please.
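Since the answer stresses that NATS is replaceable as long as some cache exists, here is the minimal contract such a metadata cache has to meet, sketched as a tiny TTL key-value store. This is a stand-in for illustration only — NATS's actual KV bucket API, persistence, and replication are far richer:

```python
import time

class TTLCache:
    """Tiny TTL key-value cache: a stand-in for the NATS KV bucket that
    holds file metadata, so a directory listing can be answered from
    memory instead of thousands of stat() syscalls."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable for testing
        self._data = {}

    def put(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires = item
        if self.clock() > expires:
            del self._data[key]  # stale entry: fall back to the storage
            return None
        return value
```

The TTL is the crude consistency knob; the change notifications discussed earlier are what let a real deployment invalidate entries precisely instead of waiting them out.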
[Audience] I think that's more of a comment: for 21:18.840 --> 21:24.840 the directory listings, I can recommend looking at SMB, the so-called directory leases, which means 21:24.840 --> 21:31.240 that you open a handle on the directory to list it, and you hold on to this handle and get any changes — 21:31.240 --> 21:40.040 but that's again caching. — Nice. Yeah, we should definitely talk, right? 21:40.760 --> 21:47.880 Yeah, the comment was that if I want to look into 21:47.880 --> 21:52.840 caching and notifications, Samba has even more nice things. That's really cool. Yeah, thank you. 21:56.040 --> 21:59.480 Anything else? Any other software-defined storage solutions that I don't know of? 22:04.760 --> 22:08.360 Okay, I guess that's it then. 22:10.040 --> 22:12.040 Thank you.