Special Tokens: <image>
Visual Grounding: <|ref|>{query}<|/ref|>
Grounding Conversation: <|grounding|>{question}
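
The templates above can be filled in with plain string formatting. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, and placing the <image> token on its own line before the prompt is an assumption for illustration, since the text above does not say how the tokens are combined.

```python
# Token strings copied verbatim from the templates listed above.
IMAGE_TOKEN = "<image>"
REF_TEMPLATE = "<|ref|>{query}<|/ref|>"
GROUNDING_TEMPLATE = "<|grounding|>{question}"


def visual_grounding(query: str) -> str:
    """Wrap a query in the visual grounding tokens."""
    return REF_TEMPLATE.format(query=query)


def grounding_conversation(question: str) -> str:
    """Prefix a question with the grounding conversation token."""
    return GROUNDING_TEMPLATE.format(question=question)


def with_image(prompt: str) -> str:
    """Prepend the image token (ASSUMPTION: token on its own line, before the prompt)."""
    return f"{IMAGE_TOKEN}\n{prompt}"


if __name__ == "__main__":
    print(with_image(visual_grounding("the red car")))
    print(with_image(grounding_conversation("Where is the dog?")))
```

Keeping the templates as module-level constants means the token strings are written once and match the listing character for character.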