About
I am a Ph.D. candidate in Control Science and Engineering at the College of Information Engineering, Zhejiang University of Technology (ZJUT), and a CSC visiting Ph.D. student at the School of Computer Science and Engineering, UNSW Sydney.
Broadly, I am interested in computer vision, AIGC, parameter-efficient fine-tuning (PEFT), diffusion models, and embodied AI. Recently, I have been working on retrieval-augmented visual prompt learning, PEFT-based pruning for large language models, motion-controllable video diffusion models, and AI-generated image detection. I am fortunate to be advised by Prof. Linlin Ou and Prof. Xinyi Yu at ZJUT, and to work closely with Prof. Chunhua Shen, Prof. Hao Chen, and Prof. Dong Gong.
News
- 2026 – Our paper MIMIC: Mask-Injected Manipulation Video Generation with Interaction Control has been accepted to ICLR 2026.
- 2026 – Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification has been accepted by IEEE TCSVT.
- 2025 – RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images has been accepted to ACM MM 2025.
- 2025 – We propose a training-free motion customization framework for distilled video generators (video diffusion); the work is on arXiv and under review for CVPR 2026. Project page: motionecho.
Selected Publications
- Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2026.
- Across-task Neural Architecture Search via Meta Learning. International Journal of Machine Learning and Cybernetics (IJMLC), 2022.
- Soft Taylor Pruning for Accelerating Deep Convolutional Neural Networks. IECON 2020, Annual Conference of the IEEE Industrial Electronics Society.
- Training-Free Motion Customization for Distilled Video Generators with Adaptive Test-Time Distillation. arXiv preprint arXiv:2506.19348, 2025. (under review for CVPR 2026)
- MIMIC: Mask-Injected Manipulation Video Generation with Interaction Control. International Conference on Learning Representations (ICLR), 2026.
- Improving Neural Indoor Surface Reconstruction with Mask-guided Adaptive Consistency Constraints. IEEE International Conference on Robotics and Automation (ICRA), 2024.
- GSORB-SLAM: Gaussian Splatting SLAM Benefits from ORB Features and Transmittance Information. IEEE Robotics and Automation Letters (RA-L), 2025.
- Learning Layer-wise Composable Textural Inversion Concepts for Text-to-Image Generation. Visual Informatics, JCR Q1 (under review).
- Graph Pruning for Model Compression. Applied Intelligence, 2022.
- RepNAS: Searching for Efficient Re-parameterizing Blocks. IEEE International Conference on Multimedia and Expo (ICME), 2023.
- RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images. ACM International Conference on Multimedia (ACM MM), 2025.
Research Projects
My main lines of work, most recent first:
- PEFT & diffusion models (2022 – ) – parameter-efficient tuning and pruning for LLMs and vision models, motion-controllable video diffusion, and generative modelling for embodied scenarios.
- NAS & few-shot learning (2020 – 2021) – across-task NAS and efficient re-parameterization blocks for transferable architectures.
- Model compression & pruning (2019 – 2020) – Soft Taylor pruning, graph-based pruning, and fast CNN deployment.
Education
- Zhejiang University of Technology (ZJUT), 2019 – present
  Ph.D. in Control Science and Engineering, College of Information Engineering.
- UNSW Sydney, 2024 – 2025
  CSC visiting Ph.D. student, School of Computer Science and Engineering.
  Supervisor: Prof. Dong Gong.
- Zhejiang University of Technology (ZJUT), 2015 – 2019
  B.Eng. in Automation, College of Information Engineering.
Experience
- Visiting Student, State Key Laboratory of CAD&CG, Zhejiang University (ZJU), Jun 2022 – present
  Mentors: Prof. Chunhua Shen and Prof. Hao Chen.
Contact
The best way to reach me is by email: jintaorong283@gmail.com. I am happy to chat about research ideas or potential collaborations in CV, AIGC, PEFT, diffusion models, and embodied AI.