Hi all! This is my first post.
I got curious and decided to play around with deepfakes and see what I can do.
My system is 6-7 years old, and my video card is (pasting from my system76 order confirmation): "2 GB nVidia GeForce GTX 750 Ti with 640 CUDA Cores"
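(For anyone wanting to check what their own card has, here's a quick way to query VRAM. This assumes the NVIDIA driver is installed and `nvidia-smi` is on your PATH; those query flags are standard nvidia-smi options.)

```python
# Query GPU name and total/used VRAM via nvidia-smi (must be on PATH).
import subprocess

result = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total,memory.used",
     "--format=csv"],
    capture_output=True, text=True, check=True)
print(result.stdout)
```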
From my basic experimentation, I can do the following:
Extract:
- Can do: Mtcnn, Fan aligner, Hist normalization, Re-Feed 8
- Crashes (OOM): S3Fd, additional maskers (e.g. Vgg-Obstructed)

Train:
- Can do: Lightweight, Original (with lowmem enabled), batch sizes up to 16
- Crashes (OOM): Original without lowmem, any other trainer I tried (though I haven't tried most of them, tbh)

(A rough sketch of the commands I'm running is below.)
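In case it helps anyone with similar hardware, here is roughly what the working invocations look like for me. Treat this as a sketch only: the flag names are from faceswap's command-line help as I remember them and may differ between versions (double-check with `python faceswap.py extract -h` and `python faceswap.py train -h`), and the paths are placeholders.

```python
# Sketch of the faceswap invocations that fit in 2 GB VRAM for me.
# Flag names may vary by faceswap version -- verify against -h output.
import subprocess

# Extract: mtcnn detector + fan aligner work; s3fd OOMs on this card.
subprocess.run([
    "python", "faceswap.py", "extract",
    "-i", "clip_a.mp4",   # input video (or a folder of frames), placeholder
    "-o", "faces_a",      # output folder for aligned faces
    "-D", "mtcnn",
    "-A", "fan",
], check=True)

# Train: Lightweight trainer, batch size 16 is my ceiling. ("lowmem" for the
# Original model is a model config setting rather than a CLI flag, if I
# remember right.)
subprocess.run([
    "python", "faceswap.py", "train",
    "-A", "faces_a",
    "-B", "faces_b",
    "-m", "model_dir",    # placeholder model output folder
    "-t", "lightweight",
    "-bs", "16",
], check=True)
```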
To test, I grabbed a couple of YouTube videos with good, stable faces and played around. I got some not-too-terrible results by editing the originals down to representative 30-second clips, extracting roughly 725 images from each, and training for 100K iterations. I'm not sure what the rules are around posting results from swapping YouTubers, so I'll err on the side of caution and not post them at this point.
I'm hoping someone can give me some advice about how to make the most of my limited resources. Specifically:
- Are there other models that work reasonably well with my low video memory?
- I don't really see a difference so far between Original (lowmem) and Lightweight. Are they basically the same? In what kinds of scenarios does one outperform the other?
- Out of curiosity: is "Original with lowmem" going to give the same results as Original without lowmem, just slower? Or will non-lowmem Original produce BETTER results for the same data and the same settings?
- What's generally better with these lighter-weight models: more iterations at a lower batch size, or fewer iterations at a higher batch size? I notice BS=1 churns through iterations much faster than BS=16, so if I have, say, 12 hours to train, which route would you take? Or does this really depend on the data? If so, in what way? I'm trying to avoid wasting time as much as possible. (Back-of-envelope sketch below.)
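To make that last question concrete, here's the back-of-envelope I've been using to think about it. The iterations-per-second figures are made up purely for illustration, since I haven't benchmarked carefully:

```python
# Rough throughput comparison: total faces seen = iterations * batch size.
# The it/s figures below are hypothetical -- plug in your own measurements.
hours = 12
for batch_size, its_per_sec in [(1, 8.0), (16, 1.0)]:
    iterations = its_per_sec * 3600 * hours
    faces_seen = iterations * batch_size
    print(f"BS={batch_size:>2}: ~{iterations:,.0f} iterations, "
          f"~{faces_seen:,.0f} faces seen")
```

So even though BS=1 racks up iterations much faster, the higher batch size can still push more total faces through the model per hour. What I don't know is how that trades off against gradient quality, hence the question.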
Thanks everyone! I'm really excited to play around with it more. The last time I trained a NN, I was in university taking an AI class, and the term "machine learning" wasn't mainstream yet. We built a NN from scratch to recognize a small set of handwritten characters (5, 6, 7, 8, 9, if I remember right). We've come a long way haha.