Xiaoyu (Lesley) Zhu (xiaoyuz3@cs.cmu.edu)
Xiaoyu Zhu
Ph.D. Student in Artificial Intelligence
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
Greetings! I'm a Ph.D. student at the Language Technologies Institute under the supervision of Prof. Alexander Hauptmann.
My research focuses on robust feature representation learning for visual perception and generation. I have been developing novel representation learning methods, including: (1) Contrastive/Siamese Learning for video and 3D scene understanding; (2) Masked Visual Modeling for human action analysis from videos and 3D point clouds/meshes; and (3) Generatively Pretrained Vision-Language Models for open-vocabulary 3D scene understanding and generation. I have published first-author papers at top computer vision conferences such as CVPR and ICCV. My research has been featured by CMU News and Science Daily, and deployed by the U.S. Army Research Lab and the Federal Emergency Management Agency.
I'm actively seeking full-time industry positions starting in Fall 2024 or Spring 2025. Please reach out via email if you are interested!
  1. STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
    Zhu, X., Huang, P.*, Liang, J.*, De Melo, C., Hauptmann, A.
    CVPR 2023
  2. Leveraging Body Pose Estimation for Gesture Recognition Using Synthetic Data
    Zhu, X., De Melo, C., Hauptmann, A.
    SPIE DCS 2023
  3. Weakly Supervised 3D Semantic Segmentation Using Cross-Image Consensus and Inter-Voxel Affinity Relations
    Zhu, X., Chen, J., Zeng, X., Liang, J., Li, C., Liu, S., Xu, M.
    ICCV 2021
  4. MSNet: A Multilevel Instance Segmentation Network for Natural Disaster Damage Assessment in Aerial Videos
    Zhu, X., Liang, J., Hauptmann, A.
    WACV 2021
Selected Media