Apple has unveiled a groundbreaking AI image editing tool, MGIE (MLLM-Guided Image Editing), which allows users to make edits to their photos simply by describing them in plain language. Developed through a collaboration between Apple and the University of California, Santa Barbara, MGIE eliminates the need for conventional photo editing software, offering users a seamless and intuitive editing experience.
MGIE is capable of performing a wide range of editing tasks, including cropping, resizing, flipping, and applying filters, all through text prompts. This innovative model represents a significant leap forward in image editing technology, catering to both basic and complex editing needs with ease and precision.
Powered by multimodal large language models (MLLMs), MGIE interprets user prompts and generates the corresponding edits. For instance, a request for a “bluer sky” will result in adjustments to the brightness and color of the sky portion of an image, ensuring that the instruction is accurately interpreted and executed.
Consider editing a photo of a pepperoni pizza and instructing the model to “make it more healthy”; MGIE will intelligently add vegetable toppings to the image. Similarly, requesting to “add more contrast to simulate more light” will enhance the brightness and clarity of dark images, such as those featuring tigers in the Sahara.
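To make the idea concrete, an instruction like “add more contrast to simulate more light” ultimately resolves into low-level pixel operations such as brightness and contrast adjustments. The sketch below is purely illustrative and is not Apple's code or MGIE's actual pipeline; it shows, in NumPy, the kind of global adjustment such an instruction might map to (the function names and factor values are hypothetical):

```python
import numpy as np

def adjust_brightness(image, offset):
    """Shift every pixel value by a constant offset (a simple brightness lift)."""
    out = image.astype(np.float64) + offset
    return np.clip(out, 0, 255).astype(np.uint8)

def adjust_contrast(image, factor):
    """Scale pixel values away from the image mean; factor > 1 increases contrast."""
    mean = image.astype(np.float64).mean()
    out = mean + (image.astype(np.float64) - mean) * factor
    return np.clip(out, 0, 255).astype(np.uint8)

# A uniformly dark test image stands in for an underexposed photo.
# "Add more contrast to simulate more light" might translate to a
# brightness lift followed by a contrast boost, as sketched here.
dark = np.full((4, 4, 3), 40, dtype=np.uint8)
lit = adjust_contrast(adjust_brightness(dark, 60), 1.2)
```

The point of MGIE is that the user never writes such operations: the MLLM infers which adjustments (and which image regions) an ambiguous natural-language instruction implies.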
In a statement accompanying the release, Apple researchers emphasized MGIE’s ability to discern explicit visual-aware intentions, leading to meaningful image enhancements. Extensive studies have validated MGIE’s effectiveness across various editing scenarios, highlighting its superior performance and competitive efficiency. Additionally, researchers foresee the MLLM-guided framework contributing to future advancements in vision-and-language research.
Apple has made MGIE available for download via GitHub, with a web demo accessible on Hugging Face Spaces. However, the company has not disclosed its plans for the model beyond research purposes.
While platforms like OpenAI’s DALL-E 3 offer similar capabilities in image generation, and Adobe’s Firefly AI model powers generative fill in Photoshop, Apple’s entry into the generative AI space underscores its commitment to integrating advanced AI features into its products. CEO Tim Cook has previously expressed the company’s intention to expand AI functionality across its devices. Recent initiatives include the December release of MLX, an open-source machine learning framework aimed at making it easier to train AI models on Apple Silicon chips.