Welcome to this week’s Checkpoint, featuring image vectorisation, text-to-speech, text-to-human animation, realtime deepfakes, custom Latent Diffusion models and more.
Made with Pytti / VQGAN+CLIP
Latest research & releases
Generating long videos of dynamic scenes – Another approach to procedurally generating video. The demo videos are impressive, but they're still dreamlike, with elements morphing and merging into one another. Cool and unsettling at the same time. Paper.
15.ai – Good collection of text-to-speech models, covering a wide variety of voices from games, film and TV. Lots of Friendship is Magic. See also Uberduck.
Deepfake Offensive Toolkit – Suite of tools for creating realtime, controllable deepfakes for use with virtual cameras. Teams meetings will never be the same.
Latent Diffusion fine-tuning – Code for fine-tuning Latent Diffusion on your own data. See the models fine-tuned for generating aesthetic images and logos for examples.
CLIP-Actor – Animates a 3D human model based on a text prompt. Code to be released soon.
the ai art thing is fake. i’m the guy who has to draw all the requests like the chess player inside the mechanical turk. you’re torturing me. i spend every waking hour drawing shit like “joe biden asuka wedding” and “donkey kong nuremberg trials” please stop. i need to sleep
If you have anything you’d like to be featured or want to get in touch, give me a shout on Twitter or via email. Please also consider supporting me on Patreon so that I can spend more time creating content like this.