Hey guys! I am back with a new blog, this time on creating deepfakes using the First Order Motion Model. Creating deepfakes used to be a difficult task, but with recent advances it has become a matter of minutes. In this blog, we will explore how deepfakes are created and then apply the First Order Motion Model, which lets us generate a deepfake from a single image and a driving video.
What Are Deepfakes?
Deepfakes are synthetic media in which a person in an existing image or video is replaced with someone else’s likeness. While the act of faking content is not new, deepfakes leverage powerful techniques from machine learning and artificial intelligence to manipulate or generate visual and audio content with a high potential to deceive. The main machine learning methods used to create deepfakes are based on deep learning and involve training generative neural network architectures, such as autoencoders or generative adversarial networks (GANs).
Deepfakes are realistic-looking fake videos in which someone appears to be doing or saying something they never actually did.
How Are Deepfakes Created?
The basis of deepfakes, or image animation in general, is combining the appearance extracted from a source image with motion patterns derived from a driving video. For this, deepfakes use deep learning, which is where their name comes from (deep learning + fake). More precisely, they are created using a combination of autoencoders and GANs.
An autoencoder is a simple neural network that uses unsupervised learning (or self-supervised learning, to be more accurate). Autoencoders get their name because they automatically encode information, and they are most commonly used for dimensionality reduction.
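To make this concrete, here is a minimal sketch of a linear autoencoder in NumPy. The data, network sizes, and learning rate are all made up for illustration; real autoencoders are deeper, nonlinear, and trained with a framework like PyTorch or TensorFlow:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 8 dimensions that secretly live on a 2-D subspace.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = latent @ mixing

# Encoder W_e (8 -> 2) compresses, decoder W_d (2 -> 8) reconstructs.
# The training target is the input itself: that is the self-supervision.
W_e = rng.normal(scale=0.1, size=(8, 2))
W_d = rng.normal(scale=0.1, size=(2, 8))
lr = 0.01

def loss(X, W_e, W_d):
    return np.mean((X @ W_e @ W_d - X) ** 2)

initial = loss(X, W_e, W_d)
for _ in range(500):
    Z = X @ W_e            # encode: 8-D input -> 2-D code
    X_hat = Z @ W_d        # decode: 2-D code -> 8-D reconstruction
    err = X_hat - X
    W_d -= lr * Z.T @ err / len(X)              # gradient step on the decoder
    W_e -= lr * X.T @ (err @ W_d.T) / len(X)    # gradient step on the encoder
final = loss(X, W_e, W_d)
```

After training, the reconstruction error drops well below its starting value, even though the 8-D input was squeezed through a 2-D bottleneck; that bottleneck code is the "dimensionality reduction".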
Generative Adversarial Networks, or GANs, are composed of two networks competing against each other. The first network, called the generator, tries to generate images that look like they came from the training set. The second network, called the discriminator, tries to detect whether a given image comes from the training set or from the generator.
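The adversarial game can be sketched with a deliberately tiny example: the "images" are just numbers drawn from a Gaussian, the generator is an affine map of noise, and the discriminator is logistic regression. Every value here is illustrative; real GANs use deep networks on both sides:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# "Real data": 1-D samples from N(4, 0.5). The generator must learn to mimic them.
def real_batch(n=64):
    return rng.normal(4.0, 0.5, size=n)

a, b = 1.0, 0.0   # generator g(z) = a*z + b, fed noise z ~ N(0, 1)
w, c = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + c)
lr = 0.05

init_gap = abs(b - 4.0)
for _ in range(3000):
    x = real_batch()
    z = rng.normal(size=64)
    fake = a * z + b

    # Discriminator ascent on log D(real) + log(1 - D(fake)):
    # push D(real) toward 1 and D(fake) toward 0.
    d_real, d_fake = sigmoid(w * x + c), sigmoid(w * fake + c)
    w += lr * np.mean((1 - d_real) * x - d_fake * fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # Generator ascent on log D(fake): try to fool the discriminator.
    d_fake = sigmoid(w * (a * z + b) + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

final_gap = abs(b - 4.0)
```

As the two updates alternate, the generator's offset b drifts toward the real mean of 4: fooling the discriminator forces the fake distribution to resemble the real one.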
Learn more about Autoencoders and GANs here.
First Order Model for Image Animation
The First Order Model pipeline is separated into two parts: motion extraction and generation. The source image and the driving video are used as inputs. The motion extractor uses an autoencoder to detect keypoints and extracts a first-order motion representation consisting of sparse keypoints and local affine transformations. These, along with the driving video, are used by the dense motion network to generate a dense optical flow and an occlusion map. Finally, the outputs of the dense motion network and the source image are fed to the generator to render the result.
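The sparse part of that motion representation can be sketched in NumPy. Each keypoint k contributes a candidate warp T_k(z) = p_src,k + J_k (z - p_drv,k), a first-order (affine) expansion around the keypoint. The keypoint positions, Jacobians, and grid size below are made-up placeholders, not outputs of the real network:

```python
import numpy as np

# Hypothetical numbers: 10 keypoints on a 64x64 coordinate grid
# (the real model learns its keypoints and Jacobians on 256x256 frames).
rng = np.random.default_rng(0)
K, H, W = 10, 64, 64

kp_source = rng.uniform(-1, 1, size=(K, 2))   # keypoint locations in the source image
kp_driving = rng.uniform(-1, 1, size=(K, 2))  # same keypoints in the driving frame
jacobians = np.tile(np.eye(2), (K, 1, 1))     # local affine part (identity here)

# Coordinate grid over [-1, 1] x [-1, 1], shape (H, W, 2).
ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W), indexing="ij")
grid = np.stack([xs, ys], axis=-1)

def sparse_motion(grid, kp_src, kp_drv, jac):
    """First-order motion per keypoint: T_k(z) = p_src_k + J_k (z - p_drv_k)."""
    offsets = grid[None] - kp_drv[:, None, None, :]        # (K, H, W, 2)
    warped = np.einsum("kij,khwj->khwi", jac, offsets)     # apply each local affine map
    return warped + kp_src[:, None, None, :]               # (K, H, W, 2)

motion = sparse_motion(grid, kp_source, kp_driving, jacobians)
```

Each of the K slices of `motion` is one candidate warp field; the dense motion network's job is to blend them into a single dense optical flow and predict the occlusion map that tells the generator which source pixels cannot simply be warped.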
It also has features that other models simply don't have. The really cool thing is that it works on different categories of images, meaning you can apply it to faces, full bodies, cartoons, and so on, which opens up a lot of possibilities. Another revolutionary aspect of this approach is that you can now create a good-quality deepfake from a single image of the target object, much like using a single pre-trained YOLO model for object detection.
Building your own Deepfake
We will use a pre-trained model together with our own source image and driving video to generate a deepfake.
Importing necessary libraries
import imageio
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from skimage.transform import resize
from IPython.display import HTML
import warnings
warnings.filterwarnings("ignore")
Cloning the repository and Mounting Google Drive
First, we need to clone the repository and mount Google Drive. Once that is done, upload your source image and driving video to your Drive. For best results, make sure the image and video are cropped so that they contain only the face. Then all you need to do is run the piece of code below.
!git clone https://github.com/imvansh25/first-order-model.git
%cd first-order-model
from google.colab import drive
drive.mount('/content/gdrive')
Add the folder https://drive.google.com/drive/folders/1kZ1gCnpfU0BnpdU47pLM_TQ6RypDDqgw?usp=sharing to your Google Drive.
Load driving video and source image
# Crop the driving video so it contains only the face
!ffmpeg -i /content/gdrive/My\ Drive/first-order-motion-model/p1.mp4 -ss 00:08:57.50 -t 00:00:08 -filter:v "crop=600:600:760:50" -async 1 p1.mp4

source_image = imageio.imread('/content/gdrive/My Drive/first-order-motion-model/cartoon-04.jpg')
driving_video = imageio.mimread('/content/gdrive/My Drive/first-order-motion-model/p1.mp4', memtest=False)

# Resize image and video to 256x256
source_image = resize(source_image, (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]

def display(source, driving, generated=None):
    fig = plt.figure(figsize=(8 + 4 * (generated is not None), 6))
    ims = []
    for i in range(len(driving)):
        cols = [source]
        cols.append(driving[i])
        if generated is not None:
            cols.append(generated[i])
        im = plt.imshow(np.concatenate(cols, axis=1), animated=True)
        plt.axis('off')
        ims.append([im])
    ani = animation.ArtistAnimation(fig, ims, interval=50, repeat_delay=1000)
    plt.close()
    return ani

HTML(display(source_image, driving_video).to_html5_video())
Creating a model and loading checkpoints
Now, we’ll create a model and load the checkpoints.
from demo import load_checkpoints
generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='/content/gdrive/My Drive/first-order-motion-model/vox-cpk.pth.tar')
Performing image animation
from demo import make_animation
from skimage import img_as_ubyte

predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=True)

# Save the resulting video; it can be downloaded from the /content folder
imageio.mimsave('../generated.mp4', [img_as_ubyte(frame) for frame in predictions])

HTML(display(source_image, driving_video, predictions).to_html5_video())
Deepfakes have garnered widespread attention for their use in fake news, fraud, scams, and many other illegal activities. It is getting harder and harder to tell what is real and what is not; it seems that nowadays we cannot fully trust our own senses anymore. So, be responsible while using deepfakes.
Thank you for reading!