ImEW: A Framework for Editing Image in the Wild

Tasnim Mohiuddin*, Tianyi Zhang, Maowen Nie, Jing Huang, Qianqian Chen, Wei Shi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The ability to edit images in a realistic and visually appealing manner is a fundamental requirement in various computer vision applications. In this paper, we present ImEW, a unified framework designed for solving image editing tasks. ImEW utilizes off-the-shelf foundation models to address four essential editing tasks: object removal, object translation, object replacement, and generative fill beyond the image frame. These tasks are accomplished by leveraging the capabilities of state-of-the-art foundation models, namely the Segment Anything Model, Grounding DINO, LaMa, and Stable Diffusion. These models have undergone extensive training on large-scale datasets and have exhibited exceptional performance in understanding image context, object manipulation, and texture synthesis. Through extensive experimentation, we demonstrate the effectiveness and versatility of ImEW in accomplishing image editing tasks across a wide range of real-world scenarios. The proposed framework opens up new possibilities for realistic and visually appealing image editing and enables diverse applications requiring sophisticated image modifications. Additionally, we discuss the limitations and outline potential directions for future research in the field of image editing using off-the-shelf foundation models, enabling continued advancements in this domain.
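Two of the four tasks named in the abstract, object translation and generative fill beyond the image frame, can be framed as mask-preparation problems: decide which pixels a fill model must synthesize, then hand the image and mask to that model. The sketch below illustrates only that preparation step with NumPy. It is not the authors' implementation; the function names `translate_object` and `outpaint_canvas` are hypothetical, the object mask is assumed to come from some segmentation model, and the model call that would fill the masked region is deliberately left out.

```python
import numpy as np


def translate_object(image, obj_mask, dy, dx):
    """Shift the masked object by (dy, dx) pixels.

    Returns the edited image and a boolean "hole" mask marking the pixels
    the object vacated, which a fill model would then need to synthesize.
    Hypothetical helper for illustration only.
    """
    h, w = obj_mask.shape
    ys, xs = np.nonzero(obj_mask)
    ny, nx = ys + dy, xs + dx
    # Drop object pixels that would land outside the frame.
    keep = (ny >= 0) & (ny < h) & (nx >= 0) & (nx < w)

    out = image.copy()
    # Read from the untouched original so overlapping moves stay correct.
    out[ny[keep], nx[keep]] = image[ys[keep], xs[keep]]

    new_mask = np.zeros_like(obj_mask, dtype=bool)
    new_mask[ny[keep], nx[keep]] = True
    # Vacated pixels: covered by the object before, not covered after.
    hole = obj_mask.astype(bool) & ~new_mask
    return out, hole


def outpaint_canvas(image, pad):
    """Pad an image on all sides for fill beyond the frame.

    Returns the enlarged canvas and a boolean mask that is True on the
    new border region to be generated and False on the original pixels.
    Hypothetical helper for illustration only.
    """
    h, w = image.shape[:2]
    canvas = np.zeros((h + 2 * pad, w + 2 * pad) + image.shape[2:],
                      dtype=image.dtype)
    canvas[pad:pad + h, pad:pad + w] = image
    mask = np.ones((h + 2 * pad, w + 2 * pad), dtype=bool)
    mask[pad:pad + h, pad:pad + w] = False
    return canvas, mask
```

In both cases the returned mask is what an inpainting or generative model would consume; which of the listed foundation models handles which step is described in the paper itself rather than in this abstract.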

Original language: English
Title of host publication: LGM3A 2023 - Proceedings of the 1st Workshop on Large Generative Models Meet Multimodal Applications, Co-located with
Subtitle of host publication: MM 2023
Publisher: Association for Computing Machinery, Inc
Pages: 34-44
Number of pages: 11
ISBN (Electronic): 9798400702839
DOIs
Publication status: Published - 2 Nov 2023
Externally published: Yes
Event: 1st Workshop on Large Generative Models Meet Multimodal Applications, LGM3A 2023 - Ottawa, Canada
Duration: 2 Nov 2023 → …

Publication series

Name: LGM3A 2023 - Proceedings of the 1st Workshop on Large Generative Models Meet Multimodal Applications, Co-located with: MM 2023

Conference

Conference: 1st Workshop on Large Generative Models Meet Multimodal Applications, LGM3A 2023
Country/Territory: Canada
City: Ottawa
Period: 2/11/23 → …

Keywords

  • diffusion models
  • generative models
  • image editing
  • segment anything model
