Experience

Publications

Vision-Language Learning

ACM MM 2023
Food-500 Cap: A fine-grained food caption benchmark for evaluating vision-language models
Zheng Ma*, Mianzhi Pan*, Wenhan Wu, et al.
ACM Multimedia, 2023

A fine-grained benchmark for measuring the cross-modal understanding of vision-language models (VLMs) in the food domain, intended to enable more faithful and diverse VLM applications.

EMNLP Findings 2022
Probing cross-modal semantics alignment capability from the textual perspective
Zheng Ma, Shi Zong, Mianzhi Pan, et al.
Findings of EMNLP, 2022

A probing study revealing potential vulnerabilities in the cross-modal alignment of large-scale vision-language models.

AI for Science

Digital Discovery 2025
Generative AI-powered inverse design for tailored narrowband molecular emitters
Mianzhi Pan*, Tianhao Tan*, Yawen Ouyang*, et al.
Digital Discovery, 2025

A generative pipeline for inverse design of OLED molecular emitters with application-driven spectral properties, enabling rapid discovery cycles.

ICLR Workshop
Towards Extrapolation in Deep Material Property Regression
Mianzhi Pan, Jianfei Li, Yawen Ouyang, et al.
AI4Mat Workshop @ ICLR 2025

Empirical and methodological insights on extrapolation for material property regressors.

Enhancing Spatial Reasoning in Large Language Models for Metal-Organic Frameworks Structure Prediction
Mianzhi Pan*, Jianfei Li*, et al.
arXiv, 2026

An LLM-based metal-organic framework (MOF) structure predictor.