New Method RefAM Boosts Referring Image, Video Segmentation Accuracy
Researchers have developed a new method, RefAM, for referring image and video segmentation, the task of locating the object that a natural-language expression describes. The technique, presented by Anna Kukleva and colleagues, leverages pre-trained diffusion models and requires no additional training.
RefAM, introduced in the paper 'RefAM: Attention Magnets for Zero-Shot Referral Segmentation', uses cross-attention features from pre-trained diffusion models. It introduces 'attention magnets': stop words and an extended stop-word list are used to filter attention so that what remains concentrates on relevant regions. The method removes the attention maps corresponding to common stop words, which reduces noise and sharpens the focus on content words. It also introduces additional stop words that capture background details, further refining the attention maps. The authors, Anna Kukleva et al., present RefAM as a training-free framework and report improved accuracy in pinpointing referred objects using large generative diffusion models.
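The stop-word filtering step described above can be illustrated with a small sketch. This is a simplified illustration, not the authors' implementation: it assumes each prompt token already has a cross-attention map of identical size, and the stop-word list, function names, and averaging and normalization choices are all hypothetical.

```python
# Illustrative sketch only; not the RefAM authors' code.
# Assumption: every prompt token comes with one H x W cross-attention map,
# given here as a nested list of floats.

# Hypothetical stop-word list chosen for the example.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "with", "is", "to"}


def filter_stop_word_maps(tokens, attn_maps, stop_words=STOP_WORDS):
    """Keep only (token, map) pairs whose token is not a stop word."""
    return [
        (tok, m)
        for tok, m in zip(tokens, attn_maps)
        if tok.lower() not in stop_words
    ]


def aggregate(maps):
    """Average maps element-wise, then min-max normalize to [0, 1]."""
    if not maps:
        raise ValueError("no content-word maps left to aggregate")
    h, w = len(maps[0]), len(maps[0][0])
    mean = [
        [sum(m[i][j] for m in maps) / len(maps) for j in range(w)]
        for i in range(h)
    ]
    lo = min(min(row) for row in mean)
    hi = max(max(row) for row in mean)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    return [[(v - lo) / span for v in row] for row in mean]


def content_attention(tokens, attn_maps, stop_words=STOP_WORDS):
    """Return the surviving content tokens and their combined map."""
    kept = filter_stop_word_maps(tokens, attn_maps, stop_words)
    return [tok for tok, _ in kept], aggregate([m for _, m in kept])
```

For a prompt such as "the dog on the sofa", calling `content_attention` with one map per token would discard the maps for "the" and "on" and combine only those for "dog" and "sofa". A real pipeline would read these maps from the diffusion model's cross-attention layers, which this sketch leaves out.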
RefAM, a training-free method for referring segmentation from Kukleva et al., combines diffusion-model attention features with attention magnets. According to the paper, it shows promise for segmenting referred objects more accurately in both images and videos.