MGIE: A New Tool for Editing Photos Using Natural Language

Apple Research has unveiled a new model called MGIE (MLLM-Guided Image Editing) that allows users to describe what they want to change in a photo using simple language, without the need for a photo editor. Developed in collaboration with the University of California, Santa Barbara, MGIE enables cropping, resizing, flipping, and adding filters to images through text commands.

The MGIE model can be used for both simple and more complex image editing tasks, such as changing the shape of specific objects in a photo or adjusting the brightness. It combines two complementary uses of multimodal large language models (MLLMs): first it learns to interpret the user's command, and then it “imagines” what the edit should look like (for example, a request for a bluer sky translates into increasing the brightness in the sky region of the image).
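The paper's actual interfaces are not reproduced here, but the two-stage idea can be sketched roughly as follows. This is a minimal, hypothetical illustration: the function names `derive_expressive_instruction` and `edit_image` are placeholders, not the real MGIE API, and the returned instruction is hard-coded purely to show the kind of output the MLLM step is meant to produce.

```python
# Hypothetical sketch of MLLM-guided editing (not the actual MGIE code).
# Stage 1: an MLLM expands a terse user command into an explicit, visually
#          grounded editing instruction.
# Stage 2: an image-editing model applies that instruction to the photo.

from dataclasses import dataclass


@dataclass
class EditRequest:
    image_path: str  # photo to edit
    command: str     # terse user command, e.g. "make the sky more blue"


def derive_expressive_instruction(command: str, image_path: str) -> str:
    """Placeholder for the MLLM step: turn a brief command into an edit plan."""
    # In MGIE this mapping is learned; here it is hard-coded for illustration.
    return f"Increase brightness and saturation in the sky region of {image_path}."


def edit_image(image_path: str, instruction: str) -> str:
    """Placeholder for the editing step guided by the derived instruction."""
    output_path = image_path.replace(".jpg", "_edited.jpg")
    print(f"Applying '{instruction}' -> {output_path}")
    return output_path


if __name__ == "__main__":
    request = EditRequest("beach.jpg", "make the sky more blue")
    plan = derive_expressive_instruction(request.command, request.image_path)
    edit_image(request.image_path, plan)
```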

During photo editing with MGIE, users only need to write what they want to change in the image. For example, the article demonstrates editing a pizza photo: typing the command “make it more healthy” adds vegetable toppings. A photo of tigers in the Sahara looks dark, but after the command “add more contrast to simulate more light”, the image becomes brighter.

“Instead of brief but ambiguous instructions, MGIE derives explicit, visually aware intentions, which leads to more meaningful image editing. We have studied various aspects of editing and shown that our MGIE model effectively improves performance while maintaining competitive efficiency. We also believe that the MLLM-guided framework can contribute to future research on the relationship between vision and language,” the researchers wrote in the paper.

Apple has made MGIE available for download via GitHub and has also provided a demo on the Hugging Face Spaces website, according to VentureBeat. The company has not disclosed its specific plans for the model beyond research. Other image-generating platforms, such as OpenAI’s DALL-E 3, can perform simple photo editing tasks based on input text. Adobe, the creator of Photoshop, also has its own AI editing model. Their Firefly AI model supports generative filling, which adds generated backgrounds to photos.

Apple has not been a major player in the generative AI field until now, unlike companies like Microsoft, Meta, or Google. However, Apple’s CEO, Tim Cook, has stated that the company plans to add more AI features to its devices this year. In December, Apple researchers released an open-source machine learning framework named MLX to facilitate training AI models on Apple Silicon chips.
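MLX exposes a NumPy-like Python API with lazy evaluation on Apple Silicon. As a rough, minimal illustration (unrelated to MGIE itself, and assuming MLX is installed on an Apple Silicon Mac), a couple of array operations look like this:

```python
# Minimal MLX example (assumes `pip install mlx` on an Apple Silicon Mac).
import mlx.core as mx

a = mx.array([1.0, 2.0, 3.0])
b = mx.array([4.0, 5.0, 6.0])

c = a * b + 1.0  # operations are recorded lazily
mx.eval(c)       # force evaluation on the default device
print(c)         # array([5, 11, 19], dtype=float32)
```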

FAQ:

  1. What is the MGIE model developed by Apple?
     The MGIE (MLLM-Guided Image Editing) model, developed by Apple in collaboration with the University of California, Santa Barbara, allows users to crop, resize, flip, and add filters to images through text commands. Users can describe what they want to change in a photo using simple language, without the need for a photo editor.

  2. What image editing tasks can the MGIE model perform?
     The MGIE model can be used for both simple and more complex image editing tasks, such as changing the shape of specific objects in a photo or adjusting their brightness. Users can also add filters and make other adjustments.

  3. How is the operation of the MGIE model described in the article?
     During photo editing with the MGIE model, users only need to write what they want to change in the image. For example, typing the command “make it more healthy” while editing an image of pepperoni pizza adds vegetable toppings, and giving the command “add more contrast to simulate more light” while editing a photo of tigers in the Sahara brightens the image.

  4. What are Apple’s plans for the MGIE model?
     Apple has made MGIE available for download via GitHub and has provided a demo on the Hugging Face Spaces website. However, the company has not disclosed specific plans for the model beyond research.

  5. Which other companies have AI models for text-based image editing?
     Other companies, such as OpenAI with its DALL-E 3 model and Adobe with its Firefly AI model, also offer AI models for editing images based on text.

  6. What are Apple’s plans for implementing AI in its devices?
     Apple’s CEO, Tim Cook, has stated that the company plans to add more AI features to its devices this year. Apple is becoming increasingly involved in the field of generative AI.

  7. Has Apple made MLX open source?
     Yes, Apple has released the MLX machine learning framework as open source. The framework aims to facilitate training AI models on Apple Silicon chips.

The source of the article is the blog enp.gr.