Hello 🎊
I am Jinxi He (何锦熙), a Master's student at Carnegie Mellon University, supervised by Prof. Katia Sycara.
My research focuses on Multimodal Large Language Model (MLLM) hallucination and a variety of generation tasks.
I am also deeply interested in Robot Learning, particularly long-horizon visual task planning and execution.
News 🐝
[2025.04] - Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning
[arXiv]
[github]
[2025.04] - Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
[arXiv]
[github]
[2025.03] - VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity
[arXiv]
[website]