This paper proposes Uni-OVSeg, a framework for weakly-supervised open-vocabulary segmentation that reduces reliance on labor-intensive image-mask-text triplets.
In open-vocabulary settings, Uni-OVSeg outperforms previous state-of-the-art weakly-supervised methods and even surpasses fully-supervised methods on the PASCAL Context-459 dataset.
By leveraging independent image-text and image-mask pairs instead of triplets, Uni-OVSeg improves the quality of region embeddings and alleviates noise in mask-text correspondences, which leads to better segmentation performance.
The framework shows that segmentation models can be trained with less costly annotation, and it sets the stage for future research on open-vocabulary segmentation.
#weakly-supervised-learning #open-vocabulary-segmentation #image-processing #deep-learning #artificial-intelligence