Zhida Huang

Senior Machine Learning Engineer / Tech Lead working on AI Search, LLM/MLLM, Multi-modal AI, and Large-scale Machine Learning Systems.

About

I am a Senior Machine Learning Engineer / Tech Lead at TikTok, working on AI Search, LLM/MLLM, and large-scale multi-modal systems. My work focuses on bringing foundation models into real-world products, spanning search, recommendation, and content understanding at scale.

Previously, I led the development of large-scale AI systems for TikTok E-commerce and visual search, driving substantial gains in efficiency and system scalability. I also worked on human-centric vision systems at Aibee, and conducted research on scene text detection at Microsoft Research Asia.

I hold a Master’s degree in Software Engineering from Peking University and a Bachelor’s degree from South China University of Technology. My research has been published in venues such as CVPR, ACL, and NAACL, with a focus on computer vision and multi-modal learning.

Experience

2020 – Present

ByteDance / TikTok — Senior ML Engineer / Tech Lead

  • Working on AI Search and LLM/MLLM-driven content summarization and understanding systems.
  • Led and built CV/NLP/Multi-modal/LLM models for E-commerce governance systems.
  • Developed a visual search system for E-commerce listings, covering search, clustering, deep matching, and reranking.

2019 – 2020

Aibee — Senior Algorithm Engineer

  • Worked on human detection and tracking systems for retail intelligence.

2017 – 2018

Microsoft Research Asia — Research Intern

  • Worked on scene text detection and recognition.

Publications

MagFace: A Universal Representation for Face Recognition and Quality Assessment. Qiang Meng, Shichao Zhao, Zhida Huang, Feng Zhou. CVPR 2021 (Oral).
[Paper]
Mask R-CNN with Pyramid Attention Network for Scene Text Detection. Zhida Huang, Zhuoyao Zhong, Lei Sun, Qiang Huo. WACV 2019.
[Paper]
Visible Feature Guidance for Crowd Pedestrian Detection. Zhida Huang, Kaiyu Yue, Jiangfan Deng, Feng Zhou. ECCVW 2020.
[Paper]
GroundingGPT: Language Enhanced Multi-modal Grounding Model. Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Vu Tu, Zhida Huang, Tao Wang. ACL 2024.
[Paper]
UnifiedMLLM: Multi-modal Multi-task Learning with Large Language Models. Zhaowei Li, Wei Wang, Yiqing Cai, Qi Xu, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang. NAACL 2025.
[Paper]
QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models. Wei Wang, Zhaowei Li, Qi Xu, Yiqing Cai, Hang Song, Qi Qi, Ran Zhou, Zhida Huang, Tao Wang. EMNLP 2025.
[Paper]

Full list on Google Scholar

Academic Service

Reviewer for top-tier venues including NeurIPS, ICLR, ICML, CVPR, ECCV, TKDE, FCS, and RA-L.