New Lightricks LTXV-13B Model Generates Videos With Record Speed


Israeli artificial intelligence company Lightricks is looking to win a bigger share of the generative AI video market with the launch of its largest open-source model yet. Last week, the developers lifted the lid on LTXV-13B, a 13-billion-parameter model that’s more powerful than its predecessor and is capable of generating high-quality, detailed videos up to 30 times faster than similarly sized rival models.

What’s more, with LTXV-13B, creative teams don’t need to rent a cluster of high-powered cloud GPUs to achieve this kind of performance, because the model is built with open-sourced scalability features to run at blazing speeds even on consumer-grade laptops and PCs.

Lightricks explained that the new release is the successor to its original LTXV model, which was one of the first open-source video generation models to launch on platforms such as Hugging Face and GitHub when it debuted in November. Until then, the vast majority of advanced AI video generation tools were proprietary, locked behind APIs that require payment to access, with no ability for developers to adapt or fine-tune them in any way.

With the LTXV model, Lightricks changed the game for video generators, enabling developers to make all kinds of adjustments to its training data and weights, in the hope that the open-source community would help to improve its capabilities.

The Lightricks team has also received accolades for its models’ lightweight architecture. Able to run without problems on consumer-grade graphics cards, the previous version of LTXV, dubbed 0.9.6 Distilled, was released just a few weeks ago and maintained impressive performance and speed while improving quality.

Now 13B raises the bar for quality and control even further. It has many times more parameters than its predecessor and was trained on content licensed through partnerships with Shutterstock and Getty.

Optimized for Low-Memory GPUs

Despite its 13 billion parameters, the 13B model can still run on the kind of affordable GPUs found in gaming laptops, because it has been optimized to fit within their memory limits.

In an interview with VentureBeat, Lightricks Chief Executive Zeev Farbman explained that video models from competitors such as Runway and Luma generally have to be deployed in the cloud, where they can utilize massive racks of clustered GPUs with at least 80GB of RAM.

The problem with this approach is that it makes those models extremely expensive, because the user has to pay to rent all of that computing power. As a result, they’re only really accessible to developers whose employers are willing to shell out thousands of dollars on their projects.

That’s why Farbman ensured that LTXV-13B was optimized to run with as little as 24GB of VRAM, which is the standard for consumer-grade GPUs like the Nvidia 3090 and 4090 chips. “The full model, without any quantization, without any approximation, you will be able to run on top consumer GPUs; 3090, 4090, 5090, including their laptop versions,” he told VentureBeat.

In a press release announcing 13B, Lightricks revealed that this optimization was made possible thanks to a contribution from the open-source community, namely the UEfficient Q8 kernel, which helps to scale its performance in lower-memory environments.
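As a rough illustration of why 8-bit weights matter in lower-memory environments, the back-of-the-envelope sketch below estimates the memory needed just to hold 13 billion parameters at different precisions. It counts weights only and ignores activations, the text encoder and any offloading, and the function name is made up for this example rather than taken from Lightricks’ code.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


PARAMS = 13e9  # parameter count quoted in the article

# Compare a few common numeric precisions (weights only).
for label, bits in [("32-bit", 32), ("16-bit", 16), ("8-bit", 8)]:
    print(f"{label}: {weight_footprint_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions, storing the weights at 8 bits takes half the space of 16-bit weights, which leaves more headroom for everything else the GPU has to hold while generating a video.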

In terms of new features, the biggest innovation here is a new multiscale rendering technique, which enables LTXV-13B to generate details gradually, similar to how an artist might bring his or her paintings to life by adding one layer of detail at a time.

“You’re starting on the coarse grid, getting a rough approximation of the scene, of the motion of the objects moving, etc,” Farbman explained. “And then the scene is kind of divided into tiles. And every tile is filled with progressively more details.”
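The coarse-to-fine idea Farbman describes can be pictured with a toy sketch. The code below is not Lightricks’ algorithm: it simply upsamples a small grid of numbers and then refines it tile by tile, with `add_detail` as a hypothetical stand-in for whatever the real model does inside each tile.

```python
from typing import Callable, List

Grid = List[List[float]]


def upsample2x(grid: Grid) -> Grid:
    """Nearest-neighbour upsampling: every cell becomes a 2x2 block."""
    out: Grid = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def refine_in_tiles(grid: Grid, tile: int, refine: Callable[[Grid], Grid]) -> Grid:
    """Cut the grid into tile x tile blocks and refine each block on its own."""
    height, width = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            block = [row[left:left + tile] for row in grid[top:top + tile]]
            for dy, new_row in enumerate(refine(block)):
                out[top + dy][left:left + len(new_row)] = new_row
    return out


def coarse_to_fine(coarse: Grid, levels: int, tile: int,
                   refine: Callable[[Grid], Grid]) -> Grid:
    """Start from a rough grid, then repeatedly upsample and refine per tile."""
    grid = coarse
    for _ in range(levels):
        grid = refine_in_tiles(upsample2x(grid), tile, refine)
    return grid


def add_detail(block: Grid) -> Grid:
    """Hypothetical refinement pass: adds a small checkerboard perturbation."""
    return [[value + 0.1 * ((x + y) % 2) for x, value in enumerate(row)]
            for y, row in enumerate(block)]


result = coarse_to_fine([[0.0, 1.0], [1.0, 0.0]], levels=2, tile=4, refine=add_detail)
print(len(result), "x", len(result[0]))
```

Because each tile is processed independently at the finer levels, only one tile’s worth of detail has to be worked on at a time, which is the intuition behind the approach in this simplified picture.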

The model also employs newer compression techniques that enable it to use less memory without affecting the quality of its outputs. “With videos, you have a higher compression ratio that allows you, while you’re in the latent space, to just take less VRAM,” Farbman said.
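To see how a higher compression ratio translates into less memory, the sketch below compares the number of values in a raw video clip with the number in its compressed latent representation. The downscaling factors, channel count and clip size are illustrative assumptions, not figures from the article.

```python
def pixel_elements(frames: int, height: int, width: int, channels: int = 3) -> int:
    """Number of values in the raw RGB video."""
    return frames * height * width * channels


def latent_elements(frames: int, height: int, width: int,
                    spatial: int, temporal: int, latent_channels: int) -> int:
    """Number of values after downscaling space and time into a latent tensor."""
    return (frames // temporal) * (height // spatial) * (width // spatial) * latent_channels


# Illustrative assumptions: 32x spatial and 8x temporal downscaling, 128 latent channels.
clip = dict(frames=128, height=512, width=768)
raw = pixel_elements(**clip)
latent = latent_elements(**clip, spatial=32, temporal=8, latent_channels=128)

print(f"raw values:    {raw:,}")
print(f"latent values: {latent:,}")
print(f"compression:   {raw // latent}x")
```

The fewer values the model has to keep in the latent space, the less VRAM each generation step needs, which is the trade-off Farbman is pointing to.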

No Copyright Concerns

Being able to run LTXV-13B on consumer laptops is a massive advantage that should ensure many more developers can start experimenting and looking to integrate its capabilities within their own applications. But there’s another key consideration that’s gone into the model that makes it even more accessible.

Unlike most other video generation tools, LTXV-13B is safe to use even for commercial applications thanks to the ethical nature of its training data. Companies such as OpenAI and Runway AI are extremely secretive about how they train their video generation models, and they’ve been accused of using questionable methods to obtain the data – namely, scraping proprietary digital libraries for their video content, without asking permission to do so.

This is a concern, because AI firms and creators are facing legal challenges as a result of these practices. Depending on the outcome of those cases, anyone using OpenAI’s and Runway’s tools could end up exposed to copyright claims. Users of 13B, by contrast, can rest assured that all of its training data was legally licensed from Getty Images and Shutterstock.

“We have big customers in our enterprise segment that care about this kind of stuff, so we need to make sure we can provide clean models for them,” Farbman explained.

LTXV-13B does have its limitations, however. Farbman admitted that the model’s outputs are still a long way from matching the best Hollywood movie productions, and it’s going to take time for the industry to get there. But he believes it still has applications in areas such as animation, helping to automate many of the time-consuming aspects of production.

Developers have plenty of options if they want to put LTXV-13B through its paces, with the model available on Hugging Face and GitHub, and free to use for enterprises with less than $10 million in annual revenue. It’s also being integrated with Lightricks’ subscription-based storytelling tool LTX Studio, which is aimed at marketing professionals and video production companies.

Enterprises that generate more than $10 million in sales per year will have to negotiate a license directly with Lightricks. Farbman said this is similar to how graphics engines like Unreal Engine and Unity operate.

This means that academics, researchers and hobbyists are free to do whatever they like with the LTXV-13B model, Farbman said. He’s hopeful that they’ll come up with some interesting contributions.