PhD Applicant in Computer Science & Multimodal AI
Institut Polytechnique de Paris | ENSTA Paris, France
I hold a Master’s degree in Engineering from École Nationale Supérieure de Techniques Avancées (ENSTA Paris) and a Bachelor’s degree in Automation from Chang'an University, Xi'an, China.
I currently work at CVTE, a publicly listed company, where I have spent approximately two years designing algorithms for machine learning, deep learning, and image processing.
My research focuses on advancing Multimodal AI systems, with an emphasis on integrating vision-language architectures, optimizing edge computing solutions, and leveraging self-supervised learning techniques. I am particularly passionate about bridging theoretical advances in AI with practical, real-world applications, especially in industrial automation and resource-constrained edge environments.
I am always open to inquiries and collaborations. Please feel free to connect with me.
Email / GitHub / LinkedIn / Google Scholar
My research interests lie at the intersection of Multimodal AI, Computer Vision, and Edge Computing, with a focus on:
Master of Engineering in Information and Communication Sciences & Technologies
Sep 2019 - Dec 2022
Bachelor of Automation (Information and Control)
Sep 2016 - Sep 2019
Developed a cross-modal framework integrating user prompts (e.g., clicks) with Vision Transformers via cross-attention and contrastive learning, enhancing adaptability for industrial inspection tasks.
Jan 2024 - Present
Designed a lightweight CNN by distilling knowledge from the Segment Anything Model (SAM), achieving a 91.3% F1 score on a high-resolution PCBA dataset with 50 ms inference latency on edge devices.
Jun 2022 - Dec 2023
Achieved state-of-the-art (SOTA) results on Cityscapes, COCO-Stuff, and ADE20K using a frequency-aware loss. Submitted to CVPR 2023.
May 2022 - Oct 2022
Lead Researcher
Developed a cross-modal framework integrating user prompts with Vision Transformers via cross-attention and contrastive learning. Leveraged shared features for target classification and rotation prediction, ensuring real-time performance. Enhanced adaptability for industrial inspection tasks.
Jan 2024 - Present
Lead Researcher
Designed a lightweight CNN by distilling knowledge from the Segment Anything Model (SAM) for high-resolution industrial anomaly detection. Introduced a threshold-adaptive network structure that eliminates manual parameter tuning, improving system stability and robustness.
Jun 2022 - Dec 2023
Research Intern
Achieved state-of-the-art (SOTA) results on Cityscapes, COCO-Stuff, and ADE20K using a frequency-aware loss. Submitted to CVPR 2023.
May 2022 - Oct 2022
Research Intern
Developed pseudo-morphological layers using AutoML to improve CNN performance in edge detection and segmentation on the BSDS500 dataset. Published in Pattern Recognition (SCI Q1, IF=8.518).
May 2020 - Jul 2022
Research Assistant
Developed a self-supervised pipeline using contrastive learning on COCO, achieving state-of-the-art (SOTA) segmentation performance on Cityscapes and BDD100K with minimal supervision. Submitted to Pattern Recognition (SCI Q1).
Apr 2021 - Jan 2022
Research Intern
Proposed Superpixel-mix, a data augmentation method that reduces model uncertainty by 18% under distribution shifts (e.g., adverse weather). Accepted at BMVC 2021 (CCF-B).
Jan 2021 - Aug 2021
Yufei Hu, N. Belkhir, J. Angulo, A. Yao, G. Franchi. Pattern Recognition, 131, 108893, 2022 (SCI Q1, IF=8.518).
G. Franchi, N. Belkhir, M. L. Ha, Yufei Hu, et al. BMVC 2021 (CCF-B).
Application Number: CN202410901829.2
Developed a vision foundation model-based framework for real-time detection of foreign objects in industrial environments, achieving high precision in edge deployment.
Application Number: CN202410443098.1
Developed a multi-sensor fusion method that reconstructs high-resolution height maps using texture guidance, improving accuracy and efficiency in 3D industrial inspection.
Application Number: CN202410442633.1
Designed an RGB-guided fusion framework that combines RGB images, depth-sensor data, and gradient information, using a Kalman filter strategy to achieve high-precision 3D height data fusion for industrial inspection.
Application Number: CN202410443406.0
Proposed a gradient-aware restoration algorithm for height data recovery in low-resolution industrial scans.