Iterative Robust Visual Grounding with Masked Reference Based Centerpoint Supervision.
IEEE Transactions on Circuits and Systems for Video Technology (2024)
Keywords
Deep learning, visual grounding, robust learning, visual language, visual-linguistic alignment