🧠 Uni-1 — The Future of Unified Intelligence

Uni-1: Less Artificial. More Intelligent.

Uni-1 is the world's first multimodal reasoning model that generates pixels. Built on Unified Intelligence by Luma Labs, it bridges the gap between language understanding and visual creation by reasoning, imagining, and generating in one unified architecture. Try the next generation of AI image creation.

Ranked #1 in human preference Elo for Overall, Style & Editing, and Reference-Based Generation

What Makes Uni-1 Revolutionary

Uni-1 represents a paradigm shift in artificial intelligence. Unlike traditional models that separate language and vision, it grows a mind's eye from a logical brain — jointly modeling time, space, and logic in a single decoder-only autoregressive Transformer. This unified approach enables forms of visual reasoning and image generation that fragmented pipelines simply cannot achieve.

Unified Architecture

At its core, Uni-1 is a decoder-only autoregressive Transformer where text and images are represented in a single interleaved sequence. This design enables seamless cross-modal reasoning — a fundamental advantage over models that treat language and vision as separate systems.
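To make the idea of a single interleaved sequence concrete, here is a minimal sketch of how text and image segments might be flattened into one token stream for an autoregressive decoder. It is an illustration only: the page does not describe how Uni-1 tokenizes images, so the discrete image tokens, the `BOI`/`EOI` boundary markers, and every name below are assumptions.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical boundary markers; Uni-1's real vocabulary is not documented here.
BOI = -1  # "begin of image"
EOI = -2  # "end of image"


@dataclass(frozen=True)
class TextSegment:
    token_ids: List[int]  # ids from an assumed text tokenizer


@dataclass(frozen=True)
class ImageSegment:
    token_ids: List[int]  # ids from an assumed discrete image tokenizer


Segment = Union[TextSegment, ImageSegment]


def interleave(segments: List[Segment]) -> List[int]:
    """Flatten mixed text and image segments into one sequence, in order."""
    sequence: List[int] = []
    for segment in segments:
        if isinstance(segment, ImageSegment):
            # Wrap image tokens so a decoder can tell the modalities apart.
            sequence.append(BOI)
            sequence.extend(segment.token_ids)
            sequence.append(EOI)
        else:
            sequence.extend(segment.token_ids)
    return sequence


prompt = [TextSegment([1, 2]), ImageSegment([10, 11]), TextSegment([3])]
print(interleave(prompt))
```

Because both modalities live in the same sequence, a decoder-only model can attend from any text position to earlier image positions and the reverse, which is the property the paragraph above attributes to the unified design.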

Reasoning Engine

Uni-1 performs structured internal reasoning before and during image synthesis. Given a complex prompt, it decomposes instructions, resolves spatial constraints, plans composition, and renders accordingly — achieving state-of-the-art results on the RISEBench benchmark for reasoning-informed visual editing.

Visual Understanding

Uni-1 demonstrates that learning to generate images materially improves visual understanding. It excels at fine-grained tasks like open-vocabulary object detection (ODinW-13), showing that generation and understanding reinforce each other within the unified framework.

Multimodal Pipeline

Uni-1 processes text and images in a single interleaved sequence, both as input and as output. It can accept text prompts, reference images, and editing instructions in one request and produce results that account for every input element.

Why Choose Uni-1 Over Other AI Models

Uni-1 outperforms competitors across multiple evaluation dimensions. In human preference Elo rankings, it takes first place for Overall quality, Style & Editing, and Reference-Based Generation — and second in Text-to-Image. Here's what makes it the leading choice for intelligent image generation.

Uni-1 brings common-sense scene completion, spatial reasoning, and plausibility-driven transformation to image generation. You get images that make physical sense — objects have weight, shadows fall correctly, and scenes evolve with temporal coherence. It doesn't just generate; it thinks before it creates.
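For readers unfamiliar with the human preference Elo rankings mentioned in this section, the sketch below shows the standard Elo update applied to one pairwise comparison between two models. The page does not say how its rankings were computed, so this is the textbook formula rather than the evaluation's actual method, and the starting ratings and `k` factor are illustrative assumptions.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_preferred: bool,
           k: float = 32.0) -> Tuple[float, float]:
    """Return new (A, B) ratings after one human preference vote.

    k is an assumed step size; real leaderboards choose their own value.
    """
    actual = 1.0 if a_preferred else 0.0
    delta = k * (actual - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two models start level; a rater prefers model A's image.
print(update(1500.0, 1500.0, a_preferred=True))
```

Repeating this update over many votes moves each model's rating toward a value that reflects how often raters prefer its outputs.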

Core Capabilities

Uni-1 delivers a comprehensive suite of AI image generation capabilities — all powered by a single unified Transformer model. Every feature benefits from its reasoning-first architecture.

Text to Image

Generate stunning images from text descriptions. The reasoning engine automatically plans scene composition, spatial layout, lighting, and perspective before rendering each pixel.

Image Editing

Edit images with natural language instructions. Uni-1 decomposes complex edits into logical steps — modifying exactly what's needed while preserving everything else.

Multi-Reference Generation

Provide up to 8 reference images to guide generation. Identity, style, and compositional constraints are preserved across all references, enabling powerful creative workflows.

Spatial Reasoning

Uni-1 understands 3D space, object relationships, and physical plausibility. Objects are placed with correct perspective, depth, and occlusion, producing spatially coherent scenes.

Visual Grounding

Learning to generate images also sharpens visual understanding. Uni-1 can identify, localize, and reason about objects, regions, and layouts with fine-grained precision across diverse visual domains.

Style Transfer

Transform images between artistic styles, from photorealism to watercolor, from manga to oil painting. Uni-1 preserves subject identity while adopting the target aesthetic with cultural awareness.

Frequently Asked Questions

Everything you need to know about Uni-1, the multimodal reasoning model from Luma Labs.

Get Started with Uni-1 Today

Join thousands of creators exploring the future of AI image generation. Experience what happens when reasoning meets visual creation.