About

I am a Ph.D. candidate in Control Science and Engineering at the College of Information Engineering, Zhejiang University of Technology (ZJUT), and a CSC visiting Ph.D. student at the School of Computer Science and Engineering, UNSW Sydney.

Broadly, I am interested in computer vision, AIGC, parameter-efficient fine-tuning (PEFT), diffusion models, and embodied AI. Recently, I have been working on retrieval-augmented visual prompt learning, PEFT-based pruning for large language models, motion-controllable video diffusion models, and AI-generated image detection. I am fortunate to be advised by Prof. Linlin Ou and Prof. Xinyi Yu at ZJUT, and to work closely with Prof. Chunhua Shen, Prof. Hao Chen, and Prof. Dong Gong.

News

  • 2026 – Our paper MIMIC: Mask-Injected Manipulation Video Generation with Interaction Control is accepted to ICLR 2026.
  • 2026 – Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification is accepted by IEEE TCSVT.
  • 2025 – RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images is accepted by ACM MM 2025.
  • 2025 – We propose a training-free motion customization framework for distilled video generators (video diffusion); the work is on arXiv and under review for CVPR 2026. Project page: motionecho.

Selected Publications

A full list is available on Google Scholar.
  • Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification
    J. Rong, H. Chen, X. Yu, L. Ou, et al.
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2026.
    Few-shot learning · Visual prompts
  • Across-task Neural Architecture Search via Meta Learning
    J. Rong, X. Yu, M. Zhang, et al.
    International Journal of Machine Learning and Cybernetics (IJMLC), 2022.
    Neural architecture search · Meta learning
  • Soft Taylor Pruning for Accelerating Deep Convolutional Neural Networks
    J. Rong, X. Yu, M. Zhang, L. Ou
    IECON 2020, Annual Conference of the IEEE Industrial Electronics Society.
    Model compression · Pruning
  • Training-Free Motion Customization for Distilled Video Generators with Adaptive Test-Time Distillation
    J. Rong, X. Xie, X. Yu, L. Ou, X. Zhang, C. Shen, D. Gong
    arXiv preprint arXiv:2506.19348, 2025. (under review for CVPR 2026)
    Video diffusion · Motion control · project page
  • MIMIC: Mask-Injected Manipulation Video Generation with Interaction Control
    T. Chen, J. Rong, H. Chen, et al.
    International Conference on Learning Representations (ICLR), 2026.
    Embodied AI · Manipulation video
  • Improving Neural Indoor Surface Reconstruction with Mask-guided Adaptive Consistency Constraints
    X. Yu, L. Lu, J. Rong, G. Xu, L. Ou
    IEEE International Conference on Robotics and Automation (ICRA), 2024.
    Neural surface reconstruction · Indoor scenes
  • GSORB-SLAM: Gaussian Splatting SLAM Benefits from ORB Features and Transmittance Information
    W. Zheng, X. Yu, J. Rong, L. Ou, Y. Wei, L. Zhou
    IEEE Robotics and Automation Letters (RA-L), 2025.
    SLAM · Gaussian splatting · arXiv · code
  • Learning Layer-wise Composable Textual Inversion Concepts for Text-to-Image Generation
    J. Rong, H. Chen, L. Ou, et al.
    Visual Informatics, JCR Q1 (under review).
    Text-to-image · Diffusion models
  • Graph Pruning for Model Compression
    M. Zhang, X. Yu, J. Rong, et al.
    Applied Intelligence, 2022.
    Graph pruning · Compression
  • RepNAS: Searching for Efficient Re-parameterizing Blocks
    M. Zhang, X. Yu, J. Rong, et al.
    IEEE International Conference on Multimedia and Expo (ICME), 2023.
    NAS · Re-parameterization
  • RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images
    H. Yu, Y. Ye, J. Rong, et al.
    ACM International Conference on Multimedia (ACM MM), 2025.
    Dataset · AI-generated image detection · code

Research Projects

I currently work mainly on:

  • PEFT & diffusion models (2022 – ) – parameter-efficient tuning and pruning for LLMs and vision models, motion-controllable video diffusion, and generative modelling for embodied scenarios.
  • NAS & few-shot learning (2020 – 2021) – across-task NAS and efficient re-parameterization blocks for transferable architectures.
  • Model compression & pruning (2019 – 2020) – Soft Taylor pruning, graph-based pruning, and fast CNN deployment.

Education

  • Zhejiang University of Technology (ZJUT), 2019 – present
    Ph.D. in Control Science and Engineering, College of Information Engineering.
  • UNSW Sydney, 2024 – 2025
    CSC visiting Ph.D. student, School of Computer Science and Engineering.
    Supervisor: Prof. Dong Gong.
  • Zhejiang University of Technology (ZJUT), 2015 – 2019
    B.Eng. in Automation, College of Information Engineering.

Contact

The best way to reach me is by email: jintaorong283@gmail.com. I am happy to chat about research ideas or potential collaborations in CV, AIGC, PEFT, diffusion models, and embodied AI.