Color Space Question after reading the book

Posted: Sun Aug 06, 2023 7:18 pm
by Ryzen1988

So I just read the Exploring Deepfakes book. Very cool, so I left a good review.

But in there, of course, was the rule that doubling the resolution means 4x the VRAM and compute,
the color space conversion RGB -> BGR, and the fact that you do the whole convolutional shebang for each of the three colours.
It also said that the mask itself is at a lower resolution.

Since, at least for me, almost all faces come from video and go back to video, why isn't YUV 4:2:0 used more?
That would use only one layer (luma) at full resolution, with the two chroma layers at 25% of the size.
Think of all the VRAM and compute saved, or the possibility of increasing the output resolution without the heavy penalty.
I don't really see any reason why the same convolutions and dense layers would not work with that instead of BGR :ugeek:
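
To put rough numbers on that, here is a minimal sketch that just counts input samples per image. The function names are made up for illustration and are not part of Faceswap. It only counts the samples fed into the network, not the activations inside it, so the real VRAM saving would depend on the model.

```python
def bgr_samples(height, width):
    """Samples in a BGR image: three full-resolution channels."""
    return 3 * height * width


def yuv420_samples(height, width):
    """Samples in YUV 4:2:0: full-resolution Y plus U and V at half width and half height."""
    if height % 2 or width % 2:
        raise ValueError("4:2:0 needs even height and width")
    luma = height * width
    chroma = 2 * (height // 2) * (width // 2)  # two planes, each 25% of the luma size
    return luma + chroma


if __name__ == "__main__":
    for size in (64, 128, 256):
        bgr = bgr_samples(size, size)
        yuv = yuv420_samples(size, size)
        print(f"{size}x{size}: BGR={bgr} YUV420={yuv} ratio={yuv / bgr:.2f}")
```

So the input shrinks from 3 full planes to the equivalent of 1.5, and doubling the resolution still multiplies both counts by 4.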

But people have probably already tried this, so why isn't this an option?

From what I have read, YUV is not supported as such, but it seems fairly easy to map U & V to grayscale layers and transform them back at the end.
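
Something like the following is what I have in mind, as a rough NumPy sketch only. The function names are hypothetical, nothing here comes from Faceswap, and I am assuming full-range BT.601 (JPEG-style) coefficients, 2x2 averaging for the chroma subsampling and nearest-neighbour upsampling on the way back.

```python
import numpy as np


def bgr_to_yuv420(bgr):
    """Split an 8-bit BGR image (H, W, 3) into float planes Y (H, W) and U, V (H/2, W/2)."""
    height, width = bgr.shape[:2]
    if height % 2 or width % 2:
        raise ValueError("4:2:0 needs even height and width")
    b, g, r = (bgr[..., i].astype(np.float32) for i in range(3))
    # Full-range BT.601 coefficients (an assumption; video often uses limited range)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def pool(plane):
        # Average each 2x2 block to get a quarter-size chroma plane
        return plane.reshape(height // 2, 2, width // 2, 2).mean(axis=(1, 3))

    return y, pool(u), pool(v)


def yuv420_to_bgr(y, u, v):
    """Rebuild an 8-bit BGR image from Y plus half-resolution U and V planes."""
    # Nearest-neighbour upsampling of chroma back to full resolution
    u_full = np.repeat(np.repeat(u, 2, axis=0), 2, axis=1) - 128.0
    v_full = np.repeat(np.repeat(v, 2, axis=0), 2, axis=1) - 128.0
    r = y + 1.402 * v_full
    g = y - 0.344136 * u_full - 0.714136 * v_full
    b = y + 1.772 * u_full
    bgr = np.stack([b, g, r], axis=-1)
    return np.clip(np.rint(bgr), 0, 255).astype(np.uint8)
```

The Y plane and the two chroma planes end up with different shapes, so a model would presumably need separate inputs for them, or some upsampling step, rather than one 3-channel tensor.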


Re: Color Space Question after reading the book

Posted: Wed Aug 09, 2023 12:09 am
by torzdf

The short answer is, Faceswap has always been built to work with 8-bit BGR images. Video support was added much later in the life-cycle.

I have never experimented with other colour spaces for training but, either way, implementing this in Faceswap now would be a mammoth undertaking, as it would impact every part of the codebase. This kind of thing is unlikely to ever become a priority with the limited time I have available to develop.


Re: Color Space Question after reading the book

Posted: Fri Aug 11, 2023 5:34 pm
by Ryzen1988

Okay, that's a very good reason for not doing that.

Thanks for the response