Video Generation
AI video generation utilizes artificial intelligence to automatically produce video content. By understanding user input, it generates visual and audio elements, offering an efficient way to create videos for various applications.
Advancements
- Make-A-Video by Meta AI is an AI system that generates high-quality video clips from text prompts. It builds on recent advances in text-to-image generation and uses publicly available datasets for transparency. It lets users create videos from text or images and remix existing ones. (paper)
- Emu Video and Emu Edit by Meta AI. Emu Video uses diffusion models to create high-quality videos from text prompts, outperforming prior methods in human evaluations. Emu Edit is a versatile image-editing tool that accepts detailed instructions for tasks such as local/global edits, background manipulation, color/geometry transformations, detection, and segmentation. (blog) (Emu Video paper) (Emu Edit paper)
- Stable Video Diffusion by Stability AI is a model that converts text and image inputs into dynamic scenes, bringing concepts to life in cinematic form. Released as two image-to-video models, it generates videos of 2-5 seconds at frame rates of up to 30 FPS, with processing times of two minutes or less. (paper)
- ProPainter, developed by NTU's S-Lab team, is an open-source algorithm for video editing and repair. It combines propagation and transformer modules to improve video inpainting quality, addressing tasks such as object and watermark removal, video mask completion, and video expansion (outpainting). (code) (paper) (demo)
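To make the image-to-video workflow above concrete, the sketch below animates a still image with Stable Video Diffusion through the Hugging Face diffusers library. It is a minimal sketch, not an official recipe: the checkpoint name (`stabilityai/stable-video-diffusion-img2vid-xt`), the 1024x576 resolution, and the 7 FPS export are assumptions based on the publicly released img2vid-xt model, and a CUDA GPU is assumed.

```python
def generate_clip(image_path: str, out_path: str = "clip.mp4", seed: int = 42) -> str:
    """Animate a still image into a short clip with Stable Video Diffusion.

    Minimal sketch assuming a CUDA GPU and downloadable model weights;
    imports are kept inside the function so the heavy dependencies are
    only needed when the function is actually called.
    """
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # Load the released image-to-video checkpoint in half precision.
    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    # The model expects a 1024x576 conditioning image.
    image = load_image(image_path).resize((1024, 576))
    generator = torch.manual_seed(seed)  # reproducible motion

    # decode_chunk_size trades VRAM for speed when decoding latent frames.
    frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
    export_to_video(frames, out_path, fps=7)  # a few seconds of video
    return out_path
```

Keeping the pipeline setup inside one function makes it easy to swap in the other released checkpoint or a different seed per call.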
Articles & Papers
- Imagen Video: High Definition Video Generation with Diffusion Models (2022) by Google is a paper introducing a high-definition video generation method using diffusion models. The technique, based on sequential noise addition to latent representations, achieves top results on video synthesis benchmarks. (website)
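The "sequential noise addition" these diffusion models rely on can be sketched in a few lines. The function below is a toy NumPy illustration of the standard DDPM forward process, not Imagen Video's actual implementation; the linear noise schedule and variable names are assumptions for illustration.

```python
import numpy as np

def forward_diffuse(x0: np.ndarray, t: int, betas: np.ndarray) -> np.ndarray:
    """Toy DDPM forward process: noise a clean sample x0 to step t.

    Uses the closed form x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,
    where abar_t is the cumulative product of (1 - beta) up to step t.
    """
    alphas = 1.0 - betas                  # per-step signal retention
    alpha_bar = np.cumprod(alphas)[t]     # cumulative signal kept at step t
    eps = np.random.randn(*x0.shape)      # Gaussian noise
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# A common linear schedule: alpha_bar decreases monotonically in t,
# so later steps drift from x0 toward pure Gaussian noise.
betas = np.linspace(1e-4, 0.02, 1000)
x0 = np.ones((8, 8))                      # stand-in for a latent "frame"
x_early = forward_diffuse(x0, 10, betas)  # still close to x0
x_late = forward_diffuse(x0, 900, betas)  # nearly pure noise
```

A video diffusion model is trained to invert this process, predicting the noise so that generation can run the chain backwards from pure noise to coherent frames.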
References
- Papers with Code: a collection of papers and benchmarks for video generation, including the Text-to-Video Generation and Image-to-Video Generation tasks.
- Camenduru's GitHub Video ML Papers: a collection of repositories for video machine-learning papers, including projects on text-to-video synthesis, diffusion models, and video retalking.
- Camenduru's 3D Motion Papers: a collection of repositories on 3D motion papers, including projects such as MotionDiffuse, NIKI, PHALP, DWPose, 4D-Humans, vid2avatar, and PARE.