*equal contribution, #corresponding author
2023
- Jianfeng Dong, Minsong Zhang, Zheng Zhang, Xianke Chen, Daizong Liu, Xiaoye Qu, Baolong Liu, Xun Wang. Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval. International Conference on Computer Vision (ICCV), 2023. [paper] [code]
- Shengkai Sun, Daizong Liu, Jianfeng Dong#, Xiaoye Qu, Junyu Gao, Xun Yang, Xun Wang, Meng Wang. Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding. ACM international conference on Multimedia (ACM MM), 2023. [paper] [code]
- Baolong Liu, Tianyi Zheng, Peng Zheng, Daizong Liu, Xiaoye Qu, Junyu Gao, Jianfeng Dong#, Xun Wang. Lite-MKD: A Multi-modal Knowledge Distillation Framework for Lightweight Few-shot Action Recognition. ACM international conference on Multimedia (ACM MM), 2023. [paper] [code]
- Daizong Liu, Xiaoye Qu, Jianfeng Dong#, Guoshun Nan, Pan Zhou#, Zichuan Xu, Lixing Chen, He Yan, Yu Cheng. Filling the Information Gap between Video and Query for Language-Driven Moment Retrieval. ACM international conference on Multimedia (ACM MM), 2023. [paper]
- Zhenguang Liu, Xinyang Yu#, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong#, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang. Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization. ACM international conference on Multimedia (ACM MM), 2023. [paper] [code]
- Xingyu Shen, Xiang Zhang, Xun Yang, Yibing Zhan, Long Lan, Jianfeng Dong, Hongzhou Wu. Semantics-Enriched Cross-Modal Alignment for Complex-Query Video Moment Retrieval. ACM international conference on Multimedia (ACM MM), 2023. [paper]
- Jianfeng Dong, Xiaoman Peng, Zhe Ma, Daizong Liu, Xiaoye Qu, Xun Yang, Jixiang Zhu, Baolong Liu. From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval. International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023. [paper] [code]
- Jianfeng Dong, Shengkai Sun, Zhonglin Liu, Shujie Chen, Baolong Liu, Xun Wang. Hierarchical Contrast for Unsupervised Skeleton-based Action Representation Learning. Thirty-seventh AAAI Conference on Artificial Intelligence (AAAI), 2023. [paper] [code]
- Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Yabing Wang, Pan Zhou, Baolong Liu, Xun Wang. Progressive Localization Networks for Language-based Moment Localization. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2023. [paper]
- Zhixiao Fu, Xinyuan Chen, Daizong Liu, Xiaoye Qu, Jianfeng Dong#, Xuhong Zhang, Shouling Ji. Multi-level Feature Disentanglement Network for Cross-dataset Face Forgery Detection. Image and Vision Computing, 2023. [paper]
2022
- Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Jianfeng Dong, Xirong Li. Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval. European Conference on Computer Vision (ECCV), 2022. [paper] [code]
- Jianfeng Dong, Xianke Chen, Minsong Zhang, Xun Yang, Shujie Chen, Xirong Li, Xun Wang. Partially Relevant Video Retrieval. ACM international conference on Multimedia (ACM MM, oral), 2022. [homepage] [paper] [code]
- Yabing Wang*, Jianfeng Dong*, Tianxiang Liang, Minsong Zhang, Rui Cai, Xun Wang. Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning. ACM international conference on Multimedia (ACM MM), 2022. [paper] [code]
- Jianfeng Dong*, Yabing Wang*, Xianke Chen, Xiaoye Qu, Xirong Li, Yuan He, Xun Wang. Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022. [paper] [code]
- Baolong Liu, Qi Zheng, Yabing Wang, Minsong Zhang, Jianfeng Dong#, Xun Wang. FeatInter: Exploring Fine-Grained Object Features for Video-Text Retrieval. Neurocomputing, 2022. [paper]
- Xun Yang, Shanshan Wang, Jian Dong, Jianfeng Dong, Meng Wang, Tat-Seng Chua. Video Moment Retrieval with Cross-Modal Neural Architecture Search. IEEE Transactions on Image Processing (TIP), 2022. [paper]
- Jianfeng Dong, Xirong Li, Chaoxi Xu, Xun Yang, Gang Yang, Xun Wang, Meng Wang. Dual Encoding for Video Retrieval by Text. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [paper] [code]
2021
- Jianfeng Dong*, Zhe Ma*, Xiaofeng Mao, Xun Yang, Yuan He, Richang Hong, Shouling Ji. Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning. IEEE Transactions on Image Processing (TIP), 2021. [paper] [code]
- Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou. Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
- Leiming Zhang, Jianfeng Dong#, Cuizhu Bao, Shouling Ji, Xun Wang. Click-through Rate Prediction for Video Cold-start Problem (面向视频冷启动问题的点击率预估). Journal of Software (软件学报), 2021. [paper]
- Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Yu Cheng, Wei Wei, Zichuan Xu, Yulai Xie. Context-aware Biaffine Localizing Network for Temporal Sentence Grounding. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [paper] [code]
- Jianfeng Dong, Zhongzi Long, Xiaofeng Mao, Changting Lin, Yuan He, Shouling Ji. Multi-level Alignment Network for Domain Adaptive Cross-modal Retrieval. Neurocomputing, 2021. [paper] [code]
- Zhe Ma, Fenghao Liu, Jianfeng Dong#, Xiaoye Qu, Yuan He, Shouling Ji. Hierarchical Similarity Learning for Language-based Product Image Retrieval. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021. [paper] [code]
- Zhixiao Fu, Xinyuan Chen, Jianfeng Dong#, Shouling Ji. Multi-order Adversarial Representation Learning for Composed Query Image Retrieval. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.
2020
- Zhe Ma, Jianfeng Dong#, Yao Zhang, Zhongzi Long, Yuan He, Hui Xue, Shouling Ji. Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network. Thirty-fourth AAAI Conference on Artificial Intelligence (AAAI), 2020. [paper] [poster] [code] [news]
- Xun Yang, Jianfeng Dong#, Yixin Cao, Xun Wang, Meng Wang, Tat-Seng Chua. Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval. International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2020. [paper] [slides]
- Yining Lang, Yuan He, Fan Yang, Jianfeng Dong, Hui Xue. Which is Plagiarism: Fashion Image Retrieval based on Regional Representation for Design Protection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR, oral), 2020. [paper] [news 1] [news 2]
- Xiaoye Qu, Pengwei Tang, Zhikang Zhou, Yu Cheng, Jianfeng Dong, Pan Zhou. Fine-grained Iterative Attention Network for Temporal Language Localization in Videos. ACM international conference on Multimedia (ACM MM), 2020. [paper]
- Daizong Liu, Xiaoye Qu, Xiaoyang Liu, Jianfeng Dong, Pan Zhou, Zichuan Xu. Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization. ACM international conference on Multimedia (ACM MM), 2020. [paper]
- Yining Lang, Yuan He, Jianfeng Dong#, Fan Yang, Hui Xue. DESIGN-GAN: Cross-Category Fashion Translation Driven by Landmark Attention. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP, oral), 2020. [paper]
- Da Chen, Xiang Wu, Jianfeng Dong#, Yuan He, Hui Xue, Feng Ma. Hierarchical Sequence Representation with Graph Network. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020. [paper]
- Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou. Reasoning Step-by-Step: Temporal Sentence Localization in Videos via Deep Rectification-Modulation Network. The 28th International Conference on Computational Linguistics (COLING), 2020. [paper]
2019
- Jianfeng Dong, Xirong Li, Chaoxi Xu, Shouling Ji, Yuan He, Gang Yang, Xun Wang. Dual Encoding for Zero-Example Video Retrieval. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [paper] [slides] [poster] [code]
- Jianfeng Dong, Xun Wang, Leimin Zhang, Chaoxi Xu, Gang Yang, Xirong Li. Feature Re-Learning with Data Augmentation for Video Relevance Prediction. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019. [paper] [code]
- Xun Wang, Yali Du, Leimin Zhang, Xirong Li, Miao Zhang, Jianfeng Dong#. Exploring Content-based Video Relevance for Video Click-Through Rate Prediction. ACM international conference on Multimedia (ACM MM), 2019. [paper] [poster]
- Xirong Li, Chaoxi Xu, Gang Yang, Zhineng Chen, Jianfeng Dong#. W2VV++: Fully Deep Learning for Ad-hoc Video Search. ACM international conference on Multimedia (ACM MM), 2019. [paper] [code]
- Changjiang Li, Haiqin Weng, Shouling Ji, Jianfeng Dong, Qinming He. DeT: Defending Against Adversarial Examples via Decreasing Transferability. International Symposium on Cyberspace Safety and Security, 2019. (Best Paper Award) [paper]
- Xirong Li, Jinde Ye, Chaoxi Xu, Shanjinwen Yun, Leimin Zhang, Xun Wang, Rui Qian, Jianfeng Dong. Renmin University of China and Zhejiang Gongshang University at TRECVID 2019: Learn to Search and Describe Videos. TRECVID Workshop, 2019. [paper]
- Zhaoyu Zhang, Xiang Wu, Jianfeng Dong, Yuan He, Hui Xue, Feng Mao. Noise Learning for Weakly Supervised Segment Classification in Video. IEEE International Conference on Computer Vision Workshop (ICCV Workshop), 2019. [paper]
2018
- Jianfeng Dong, Xirong Li, Chaoxi Xu, Gang Yang, Xun Wang. Feature Re-Learning with Data Augmentation for Content-based Video Recommendation. ACM international conference on Multimedia (ACM MM), 2018. [paper] [slides] [code]
- Jianfeng Dong, Xirong Li, Cees G. M. Snoek. Predicting Visual Features from Text for Image and Video Caption Retrieval. IEEE Transactions on Multimedia (TMM), 2018. [paper] [code]
- Jianfeng Dong, Xirong Li, Duanqing Xu. Cross-Media Similarity Evaluation for Web Image Retrieval in the Wild. IEEE Transactions on Multimedia (TMM), 2018. [paper] [code]
- Tingting Qiao, Jianfeng Dong, Duanqing Xu. Exploring Human-like Attention Supervision in Visual Question Answering. The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018. [paper] [data]
- Xirong Li, Jianfeng Dong, Chaoxi Xu, Jing Cao, Xun Wang, Gang Yang. Renmin University of China and Zhejiang Gongshang University at TRECVID 2018: Deep Cross-Modal Embeddings for Video-Text Retrieval. TRECVID Workshop, 2018. [paper] [AVS slides] [VTT slides]
Before 2018
- Weiyu Lan, Xirong Li, Jianfeng Dong. Fluency-Guided Cross-Lingual Image Captioning. ACM international conference on Multimedia (ACM MM), 2017. [paper] [slides] [code]
- Jianfeng Dong. Cross-media Relevance Computation for Multimedia Retrieval. ACM international conference on Multimedia (ACM MM), 2017. [paper]
- Jianfeng Dong, Shaoli Huang, Duanqing Xu, Dacheng Tao. DL-61-86 at TRECVID 2017: Video-to-Text Description. TRECVID Workshop, 2017. [paper] [slides]
- Jianfeng Dong, Xirong Li, Weiyu Lan, Yujia Huo, Cees G. M. Snoek. Early Embedding and Late Reranking for Video Captioning. ACM international conference on Multimedia (ACM MM), 2016. (Grand Challenge Award) [paper] [slides]
- Xirong Li, Weiyu Lan, Jianfeng Dong, Hailong Liu. Adding Chinese Captions to Images. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (ICMR), 2016. [paper] [data]
- Cees G. M. Snoek, Jianfeng Dong, Xirong Li, Xiaoxu Wang, Qijie Wei, Weiyu Lan, Efstratios Gavves, Noureldien Hussein, Dennis C Koelma, Arnold WM Smeulders. University of Amsterdam and Renmin University at TRECVID 2016: Searching Video, Detecting Events and Describing Video. TRECVID Workshop, 2016. [paper] [slides]
- Jianfeng Dong, Xirong Li, Shuai Liao, Jieping Xu, Duanqing Xu, Xiaoyong Du. Image Retrieval by Cross-Media Relevance Fusion. ACM international conference on Multimedia (ACM MM), 2015. [paper] [slides] [code] [data]