Blog
Our latest product updates and thoughts on state-of-the-art AI capabilities.
How Scenery approaches human-centric AI video understanding with Sieve
by Mokshith Voodarla • 2 min read
We discuss a partnership between VEED and Sieve to launch VEED Clips, a new AI-powered video clipping tool.
SAM 2 can't natively take in text prompts. We discuss various ways to build pipelines around SAM 2 to accomplish text-prompted segmentation.
Learn about Meta's SAM 2 (Segment Anything Model 2) and how Sieve's optimized implementation runs 2x faster. Explore use cases, benchmarks, and how to use SAM 2.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
by Gaurang Bharti • 4 min read
We walk through using the Sieve API to download and dub an entire Khan Academy course in under 10 minutes.
We discuss the launch of Sieve’s Dubbing API, the first AI dubbing solution purpose-built for developers.
Introducing Autocrop 1.0: Format videos into different aspect ratios with AI editing
by Mokshith Voodarla • 3 min read
We discuss the importance of AI in video communication and why Zight chose Sieve to power their new AI features.
We do a deep dive into building an intricate algorithm on top of LLMs to accurately identify and extract highlights from long-form video content.
We discuss the first time computers drastically changed video creation and how it’s changing once again because of new AI models.
Introducing Describe: Incredibly descriptive audiovisual summaries for videos
by Gaurang Bharti • 5 min read
In this post, we build an app that adds sound effects to stock videos using vision language models and audio generation models.
In this post, we discuss support for GPU sharing on Sieve and how it enables faster, more cost-effective AI models.
In this post, we discuss active speaker detection as a deep learning task and how we built a solution that performs ~90% faster than other solutions.
In this post, we discuss the commoditization of audio transcription and a new Sieve offering around it that is 5x cheaper than other providers while still maintaining speed and accuracy.
We discuss modifying current lipsyncing solutions such as OpenTalker’s Video Retalking to get a performant, production-ready lipsyncing solution.
Learn how we developed a quality AI audio enhancement app using open-source tools, rivaling the best APIs on the market. Try it for yourself!
In this blog post, we go through the process of generating video chapter titles with OpenAI's Whisper + GPT-3 models and an open-source text segmentation technique!
Learn about our process building a Twitter AI bot that can generate avatar videos and responses in minutes using Sieve.
The explosion of rich data, the Sieve public beta, our ~$4M seed round, and how we enable developers to build amazing experiences with video + AI.