VLM
EVF-SAM-2 (Running on Zero, Agents) • 74
Segment objects in images and videos using text prompts
VQA
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Paper • 2405.09215 • Published May 15, 2024 • 22