So, hello everybody. My name is Martin Nečas, I work at Red Hat, and I work there as a technical team lead and a maintainer on a project called Forklift. It is an open source project for migrating VMs from various places to Kubernetes and KubeVirt.

I think most of you here are familiar with the VMware situation, but for those who are not: okay, this is what happened. VMware was acquired by Broadcom; it's not that well visible. I love this image. And immediately after that, we started getting articles. These are from Red Hat, from The Register, and from Business Insider. There are many more, and from what I heard from the statistics, I think some of you were also affected by this. So it feels a bit like being held for ransom. So, how to escape this ransom? The answer could be virt-v2v, which I think many of you know, and Forklift. And that's it. Thank you.

Okay, so let's go into the details. virt-v2v is a really well-known tool; it has been developed by Richard Jones for many, many years, focusing lately on the migration from VMware to KVM. Why virt-v2v, why did we go with this choice? The main thing is guest conversion. The VMs running on VMware have their own drivers and their own devices, and we want to use the proper KubeVirt and virtio drivers. So what virt-v2v does is: it removes the VMware Tools, it installs the virtio drivers, installs the guest agent, and does a lot of editing inside the VM and its configuration. It even rebuilds the initramfs, and much more; I'm not the perfect guy for this, Richard Jones would be perfect for it. I'm just a user of virt-v2v.

How does virt-v2v work? Inside it there is a tool called nbdkit. I highly recommend the FOSDEM talk from Richard from a few years back. nbdkit allows you to attach remote devices to your machine as if they were in your local laptop. It allows you to do many things with them, and the NBD protocol itself is straightforward: it allows you to do reads and writes. I have a quick demo, which highly depends on the Wi-Fi, so we'll see.

For example, I have here a Fedora raw.xz file, and using nbdkit I can... oh, that's not that well visible. I apply the curl plugin to the image itself. The file is xz-compressed, so the xz filter uncompresses blocks on demand; it doesn't download the whole file and unpack it, it reads only the blocks which it needs. Then I apply a cow (copy-on-write) filter on top of it, so all writes are done on the local machine, and I can do whatever I want with it.
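Roughly, an invocation of that kind looks like this, assuming the stock curl plugin and the xz and cow filters that ship with nbdkit; the URL, device, and partition number are placeholders:

    # Serve a remote xz-compressed raw image over NBD: curl fetches byte
    # ranges, the xz filter decompresses blocks on demand, and the cow
    # filter keeps all writes in a local overlay, so the remote image is
    # never modified.
    nbdkit --filter=cow --filter=xz curl https://example.com/Fedora.raw.xz

    # In another terminal: attach it as a local block device and inspect it.
    sudo modprobe nbd
    sudo nbd-client localhost /dev/nbd0
    sudo fdisk -l /dev/nbd0          # list the partitions
    sudo mount /dev/nbd0p1 /mnt      # writes land in the local cow overlay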
I can, let me just copy-paste it... I can see the partitions. That Wi-Fi... I can mount it, and now I will not have /tmp... please... I have /tmp anyway. Yeah, I can mount it and do whatever I want with it. The Wi-Fi is slow, so let's move on.

So how does virt-v2v work? That was nbdkit. In virt-v2v we have the VMDK on the VMware side; that is the format through which we connect to nbdkit. VMware has a library called VDDK, which is used to manipulate their disks, and nbdkit has a plugin for it, which allows nbdkit to do the same things I was trying to show with curl. On top of nbdkit there is another tool, called libguestfs, which attaches the nbdkit export to an appliance VM; inside it runs a daemon that allows you to manipulate disks securely and quite stably, and it runs a lot of the conversion scripts. So we have the VMDK, and all the changes that we make are written to a copy-on-write cache. First we do the conversion, so we fail fast and see what goes wrong with the conversion itself. It also allows us to use a tool like fstrim, so we throw away the unnecessary files and unnecessary blocks; they are written as zeroes to the cow cache, so we don't even need to transfer those blocks from the VMDK, and we improve the migration time. Then we do the copy, reading first from the cache and then the remaining blocks from the remote VMDK, and we write to the destination.

The advantages: it fails fast, and it's really good at that if something goes wrong. It supports many operating systems, even various really old Windows versions (that part depends on which virtio drivers are available), and it supports Ubuntu, CentOS, Fedora, and many, many more. And it manages the disk transfers for you: all you need to do is run virt-v2v and you have it, you don't need to do anything fancy with it. The disadvantages: it has high downtime, because the VM needs to be turned off throughout the whole process; otherwise you would get disk corruption. And, again, it manages the disk transfer, so when Forklift came, we wanted to do some tricks around the transfer and we needed to get rid of those parts.
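For comparison, a plain cold conversion pulled straight from vCenter looks roughly like this; the URI, password file, thumbprint, and guest name are placeholders, and the VDDK library itself has to be obtained separately from VMware:

    # Convert a powered-off guest, reading its disks through the
    # nbdkit VDDK plugin, and write the converted result locally.
    virt-v2v \
      -ic 'vpx://administrator@vcenter.example.com/Datacenter/esxi01?no_verify=1' \
      -ip /tmp/vcenter-password \
      -it vddk \
      -io vddk-libdir=/opt/vmware-vix-disklib-distrib \
      -io vddk-thumbprint=AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD \
      "my-guest" \
      -o local -os /var/tmp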
Okay, so that was virt-v2v. What about Forklift? Forklift is a tool around KubeVirt, and it focuses on migrating from VMware, oVirt, OpenStack, and OVAs over to KubeVirt itself. The whole migration process: you have a configured VMware infrastructure as your source, and you first need to configure the Kubernetes side itself. We have some projects and some POCs to do it for you, but nothing concrete yet. So the administrators need to go and create the storage classes, and they need to create the networking. The system administrators know these the best, so we are relying on them.

Then the user needs to add a provider to Forklift. It immediately scans the infrastructure, lists all the VMs, and sees which networks are used by the VMs and which storage datastores are used. Then we have internal validations: we tell users which features are available and which are not, and whether the VM migration would fail. For example, if a VM uses something we do not support right now, we let the user know immediately. Next, the user needs to create the network and storage mappings. We have a VM with some specific storage, and they need to tell us from which datastore it should be migrated to which storage class, and the same goes for the networks.

We have two migration types, called cold and warm. The cold one is the virt-v2v flow: it shuts down the VM, migrates it, and then boots it up. The downtime is as high as with virt-v2v; it is actually using virt-v2v under the hood and then starting the migration. More interesting is the warm migration. We again have the VMDK on the VMware side, but we have moved the nbdkit part out of virt-v2v into a separate project, the KubeVirt Containerized Data Importer (CDI), which manages the transfer. This allows us to use additional VMware features such as Changed Block Tracking. So we create a snapshot on top of the VM, and we migrate the underlying disk. For example, the blue one could be 500 gigs; we transfer it, and then we transfer only the changes which happened in the meantime, so the orange one can be five gigabytes. Then we do another one, and the red one can be even less, because the migration time of five gigabytes is much shorter than the migration time of 500 gigs. We do this periodically, until the user sets the cutover time. At that point we shut down the VM and do the conversion, using virt-v2v in place, on the already transferred disk. So there is still downtime; it's not live migration, but it's warm, so something in between.

The advantage is low downtime, much lower than with the cold migrations. It removes the virt-v2v disk transfer, and that allows us additional features. virt-v2v will not continue the migration if the guest conversion fails; that's how it was designed. Here, if users want to try booting with emulation instead of the proper drivers and tools, they can still try it. This is work in progress, but the split allows it. Additionally, it allows us to handle shared disks: virt-v2v looks at all the disks attached to the VM and transfers all of them, while here we can select the disks independently. And we are also starting to work on offloading, where we would copy the disks within the storage arrays; right now, with nbdkit, everything goes over the network, and that can take some time. The disadvantages: it fails slow, and it requires Changed Block Tracking enabled on the VM. And it can take a bit longer overall, because there are more steps: we create snapshots, we delete them, and do everything that's needed.
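Under the hood this flow is driven by a handful of custom resources. A rough sketch of a vSphere provider and a warm migration plan could look like this; all the names, namespaces, and URLs here are placeholders, and the Forklift documentation has the authoritative schema:

    # Register a vSphere provider and describe a warm migration plan.
    kubectl apply -f - <<'EOF'
    apiVersion: forklift.konveyor.io/v1beta1
    kind: Provider
    metadata:
      name: my-vsphere
      namespace: konveyor-forklift
    spec:
      type: vsphere
      url: https://vcenter.example.com/sdk
      secret:
        name: my-vsphere-credentials
        namespace: konveyor-forklift
    ---
    apiVersion: forklift.konveyor.io/v1beta1
    kind: Plan
    metadata:
      name: demo-plan
      namespace: konveyor-forklift
    spec:
      warm: true                  # CBT-based warm migration
      targetNamespace: demo-vms
      provider:
        source:
          name: my-vsphere
          namespace: konveyor-forklift
        destination:
          name: host              # the local KubeVirt cluster
          namespace: konveyor-forklift
      map:
        network:
          name: demo-network-map
          namespace: konveyor-forklift
        storage:
          name: demo-storage-map
          namespace: konveyor-forklift
      vms:
        - name: my-vm
    EOF

    # Later, the cutover of a running warm migration can be triggered by
    # setting a timestamp on the Migration object (name is illustrative).
    kubectl patch migration demo-migration -n konveyor-forklift \
      --type merge -p '{"spec":{"cutover":"2025-02-02T15:00:00Z"}}'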
And I have a quick demo. Yes, the Wi-Fi is working. So I create this; this is in the OpenShift UI, but it can also be done in the OKD UI. I create a vSphere provider and enter the credentials. Now it has pulled all the information from the VMware side: I see all my VMs, and I see the validations on top of them; there is a warning that the VM is running.

I create a migration plan with the source provider, I choose the storage mappings, and I need to name it. I can choose which destination namespace the VM should be migrated to, so it can be isolated for separate administrators; we can choose wherever we want to go. There are a lot of other configurations and settings. I enable the warm migration with Changed Block Tracking and start the plan. Forklift creates the resources on the cluster to which we will migrate.

I'll go to the VM. The migration is happening in the background. I write a simple file to it as a test; it should also be on the destination. Then I hit cutover and set it to start immediately. I can see the progress of the transfers. Then it runs the conversion, and at the end it creates the VM. If the VM was turned off, we will not turn it on; if it was running, we turn it on again. We are trying to keep it as consistent as possible. Now I can see in KubeVirt that the VM is running. I'll look into it again... and I have my demo file.
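The same progress can also be followed from the CLI; as a rough sketch, the disk copies show up as CDI DataVolumes in the target namespace (the resource names below are illustrative):

    # Watch the plan status and the DataVolumes carrying the disk copies.
    kubectl get plans.forklift.konveyor.io -n konveyor-forklift
    kubectl get datavolumes -n demo-vms -w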
How is it at large scale? We have tested it with many hundreds of VMs, huge disks, and even high-I/O databases. Those were a bit tricky, but we got there in the end, and now it works without a problem; it works quite well. We are working on the mapping of the networks and trying to add additional operating systems, on the storage offloading, so that we would do the copying not over the network but within the storage arrays, and on some improvements to the transfer speeds. Maybe additional source providers: we have also been asked to add something like AWS and other hyperscalers, but right now we are not working on them. But feel free to go and try it yourself; we have it on GitHub. I really need to make it a bit better. That's it. Thank you very much, everybody.

Yes, please. [Audience question.] The question was whether it is already available in OpenShift. It is available in OpenShift already: it is in OperatorHub, so you can install it as an operator, which maintains and installs everything for you.

Yes, please. [Audience: What are the common problems during the migration?] The question was what the common problems are during the migrations. Most often it is what we have been hitting with various operating systems and various configurations; there is no standardization. Any administrator can do whatever they want with a VM, so we are getting a lot of strange configurations, for which we are creating custom scripts. Another problem is the devices themselves, because between VMware and QEMU there are so many differences. So we need to inject some new udev rules to keep, for example, device names persistent across the migration.
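As an illustration of that kind of injected rule (the MAC address and interface name are placeholders, not what Forklift actually writes), a guest-side udev rule pinning a NIC name could look like:

    # Keep the NIC name stable after the virtual hardware changes,
    # by matching on the MAC address that survives the migration.
    cat > /etc/udev/rules.d/70-persistent-net.rules <<'EOF'
    SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:ff", NAME="eth0"
    EOF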
Yes, in the back. [Audience: Can you install the virtio drivers while the machine is still on VMware, without a reboot, and then move it?] So the question was whether we can install the virtio drivers first on the VMware side, and then shut the VM down and transfer it. You can do it, and it would be much better for us, for example if the conversion were to fail. But we are trying to do this for the users automatically. [Audience: Isn't it good to keep the VMware drivers in place, just in case? What is the problem with removing them?] When we need to remove the VMware Tools and VMware drivers, we are working on a copy, so we are not touching the source VM. The source VM is still staying there. If the migration process fails, or something happens, or you decide it's not working for you, the VM is still there until you delete it.

Yes? [Audience question about exporting VMs out of KubeVirt, partly inaudible.] The question was about exporting from KubeVirt itself. I am not sure if KubeVirt has something like that right now, exporting VMs from KubeVirt, but we do have work that allows the import of OVAs. [Audience: I experimented with the OVA import, but I couldn't find the opposite of it; sometimes you want to take the VM image and continue from there.] Yeah, it's a pity. Exactly.

[Audience: What kind of problems did you have with really high-workload systems, like database systems? Are there limitations?] The question was what problems we encountered with very high-workload systems, like databases with high IOPS. We had some problems with CDI: it wasn't correctly querying the changes using the VMware Changed Block Tracking, and we fixed that a few months ago. It returns not all the changes at once, so we needed to do some additional queries for them. But that was just a technical problem.

Yes? [Audience: Slightly off topic, but if Broadcom had acquired VMware, say, 18 months earlier, do you think Red Hat would still have dropped oVirt?] I have worked on oVirt, and I'm not the right person to answer this, or to say whether they would have revived it or not. [Audience comment, partly inaudible.] Just for the recording, the question was: if Broadcom had announced the VMware acquisition earlier, would Red Hat have kept oVirt? Yeah, you could hear the answers.

Yes? [Audience: Have you tried it on other Kubernetes distributions, like Rancher?] So, whether we have tried it on other Kubernetes distributions, like Rancher: personally, I have not, sorry. Would it work? We are not using anything OpenShift-specific, so I think it would work. And if not, please open an issue on GitHub.
Yes? [Audience: Can you order or group the VMs during a migration? Partly inaudible.] The question was whether we can order or group the VMs throughout the migration. That is a feature which would actually cost us a lot; we have been thinking about it, but not right now. What you can do is create separate plans: you can group VMs together within the plan itself, and then you can create another plan for additional VMs. Cool. Anybody else? Cool. In that case, thank you, everybody. Thank you.