
Qwen-Image-Edit — Advanced AI Image Editing Model
Qwen-Image-Edit extends the capabilities of Qwen-Image to precise image editing. Edit visuals and text effortlessly with state-of-the-art semantic and appearance control.
What is Qwen-Image-Edit?
Qwen-Image-Edit is the image editing extension of the 20B Qwen-Image model. By combining Qwen2.5-VL for semantic control and the VAE Encoder for appearance control, it enables highly precise edits for both the visual content and embedded text. Users can now modify images while preserving the original style, layout, and consistency.
Key Features
- Semantic and Appearance Editing
  - Low-level appearance edits: add, remove, or modify elements while keeping all other regions unchanged
  - High-level semantic edits: rotate objects, transfer styles, or create new IPs while maintaining semantic consistency
- Precise Text Editing
  - Supports bilingual (Chinese and English) text editing
  - Directly add, delete, or modify text in images
  - Preserves original font, size, and style for natural results
- Strong Benchmark Performance
  - Achieves state-of-the-art results on multiple public benchmarks
  - Reliable foundation model for professional-grade image editing
Showcase
One of the highlights of Qwen-Image-Edit lies in its powerful capabilities for semantic and appearance editing. Semantic editing refers to modifying image content while preserving the original visual semantics. To intuitively demonstrate this capability, let's take Qwen's mascot—Capybara—as an example:
Although most pixels in the edited images differ from those in the input image (the leftmost image), the character consistency of Capybara is preserved. This semantic editing capability makes it straightforward to create diverse original IP content. On Qwen Chat, we also designed a series of editing prompts centered around the 16 MBTI personality types and used them to create a set of MBTI-themed emoji packs based on our mascot Capybara, extending the IP's reach and expression.
Moreover, novel view synthesis is another key application scenario in semantic editing. As shown in the two example images below, Qwen-Image-Edit can not only rotate objects by 90 degrees, but also perform a full 180-degree rotation, allowing us to directly see the back side of the object:
Another typical application of semantic editing is style transfer. For instance, given an input portrait, Qwen-Image-Edit can easily transform it into various artistic styles such as Studio Ghibli. This capability holds significant value in applications like virtual avatar creation:
In addition to semantic editing, appearance editing is another common image editing requirement. Appearance editing emphasizes keeping certain regions of the image completely unchanged while adding, removing, or modifying specific elements. The image below illustrates a case where a signboard is added to the scene. As shown, Qwen-Image-Edit not only successfully inserts the signboard but also generates a corresponding reflection, demonstrating exceptional attention to detail.
Below is another interesting example, demonstrating how to remove fine hair strands and other small objects from an image.
Edits can also target a single element, for example changing the color of just the letter "n" in an image to blue.
Appearance editing also has wide-ranging applications in scenarios such as adjusting a person's background or changing clothing. The three images below demonstrate these practical use cases respectively.
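The defining property of an appearance edit, that regions outside the edit stay unchanged, can be checked mechanically. The following is a minimal sketch in plain Python, not part of Qwen-Image-Edit: the function name `changed_outside_region`, the nested-list image representation, and the rectangular edit box are all assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels (illustrative representation)


def changed_outside_region(
    before: Image,
    after: Image,
    box: Tuple[int, int, int, int],
) -> float:
    """Return the fraction of pixels outside `box` that differ between images.

    `box` is (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = box
    outside = 0
    changed = 0
    for y, (row_before, row_after) in enumerate(zip(before, after)):
        for x, (p_before, p_after) in enumerate(zip(row_before, row_after)):
            if left <= x < right and top <= y < bottom:
                continue  # inside the edited region, so changes are expected
            outside += 1
            if p_before != p_after:
                changed += 1
    return changed / outside if outside else 0.0


# Toy 4x4 example: only the 2x2 block at the top-left is edited.
white = (255, 255, 255)
red = (255, 0, 0)
original = [[white] * 4 for _ in range(4)]
edited = [row[:] for row in original]
for y in range(2):
    for x in range(2):
        edited[y][x] = red

print(changed_outside_region(original, edited, (0, 0, 2, 2)))  # 0.0
```

For real model output, exact pixel equality is likely too strict; comparing with a small per-channel tolerance would be a reasonable variation.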
Another standout feature of Qwen-Image-Edit is accurate text editing, which builds on Qwen-Image's strength in text rendering. The following two cases demonstrate its performance in editing English text:
Qwen-Image-Edit can also directly edit Chinese posters, enabling not only modifications to large headline text but also precise adjustments to even small and intricate text elements.
Finally, let's walk through a concrete image editing example to demonstrate how to use a chained editing approach to progressively correct errors in a calligraphy artwork generated by Qwen-Image:
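The chained editing approach amounts to a loop in which each edited image becomes the input to the next instruction. The sketch below shows that control flow only. `chained_edit`, `fake_edit`, the file name, and the instruction strings are hypothetical, and `edit_fn` stands in for whatever call actually performs a single edit.

```python
from typing import Callable, List, TypeVar

ImageT = TypeVar("ImageT")


def chained_edit(
    image: ImageT,
    instructions: List[str],
    edit_fn: Callable[[ImageT, str], ImageT],
) -> List[ImageT]:
    """Apply instructions one at a time, feeding each result into the next step.

    Returns every intermediate image so a bad step can be inspected or redone.
    """
    history = [image]
    for instruction in instructions:
        history.append(edit_fn(history[-1], instruction))
    return history


# Stand-in editor for demonstration: it only records the instructions applied.
def fake_edit(image: str, instruction: str) -> str:
    return f"{image} -> [{instruction}]"


steps = chained_edit(
    "calligraphy.png",
    ["fix the first wrong character", "fix the remaining stroke"],
    fake_edit,
)
print(steps[-1])
```

Keeping the full history rather than only the final image makes it possible to return to an earlier step if one correction introduces a new error.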
FAQs
Q: What is Qwen-Image?
Qwen-Image is the 20B foundation model that Qwen-Image-Edit builds on. It generates high-quality images from text prompts, with particular strength in text rendering, and suits designers, creators, and developers who need fast, realistic, or stylized visuals.
Q: What is Qwen-Image-Edit?
Qwen-Image-Edit is an AI-powered editor that lets you modify images using plain language instructions. From changing backgrounds to reconstructing faces or adding objects, it produces precise, photorealistic results.
Q: How does Qwen-Image-Edit work?
Upload your image and describe the edits in natural language. The AI interprets your instructions and generates updated images, while preserving context, character consistency, and visual quality.
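The same upload-and-describe flow can be scripted. The sketch below assumes the Hugging Face `diffusers` library with its `QwenImageEditPipeline` class and the `Qwen/Qwen-Image-Edit` checkpoint, none of which are named on this page. The helper names `edit_settings` and `edit_image` and the default values are illustrative, not recommended settings.

```python
from typing import Any, Dict


def edit_settings(prompt: str, steps: int = 50, cfg: float = 4.0) -> Dict[str, Any]:
    """Collect the call arguments for one edit (values are illustrative defaults)."""
    if not prompt.strip():
        raise ValueError("the edit instruction must not be empty")
    return {
        "prompt": prompt,
        "negative_prompt": " ",
        "true_cfg_scale": cfg,
        "num_inference_steps": steps,
    }


def edit_image(input_path: str, output_path: str, prompt: str) -> None:
    """Load the model, apply one natural-language edit, and save the result."""
    # Imported lazily so edit_settings() works without these packages installed.
    import torch
    from diffusers import QwenImageEditPipeline  # assumed: a recent diffusers release
    from PIL import Image

    pipeline = QwenImageEditPipeline.from_pretrained(
        "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
    )
    pipeline.to("cuda")  # assumes a CUDA GPU with enough memory for the model

    image = Image.open(input_path).convert("RGB")
    with torch.inference_mode():
        result = pipeline(
            image=image,
            generator=torch.manual_seed(0),  # fixed seed for repeatable output
            **edit_settings(prompt),
        )
    result.images[0].save(output_path)
```

A call such as `edit_image("input.png", "output.png", "Change the background to a beach")` would then run a single edit; the file names and instruction here are placeholders.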
Q: What kinds of edits are supported?
- Background replacement and scene editing
- Face and character modifications
- Object insertion or removal
- Style transfer and visual enhancements
- Multi-image context support for consistent results
