- Combining Adversarial and Reinforcement Learning for Video Thumbnail Selection. Evlampios E. Apostolidis, Eleni Adamantidou, Vasileios Mezaris, Ioannis Patras. 1-9
- Efficient Indexing of 3D Human Motions. Petra Budíková, Jan Sedmidubský, Pavel Zezula. 10-18
- Global Relation-Aware Attention Network for Image-Text Retrieval. Jie Cao, Shengsheng Qian, Huaiwen Zhang, Quan Fang, Changsheng Xu. 19-28
- MS-SincResNet: Joint Learning of 1D and 2D Kernels Using Multi-scale SincNet and ResNet for Music Genre Classification. Pei-Chun Chang, Yong-Sheng Chen, Chang-Hsing Lee. 29-36
- MLFont: Few-Shot Chinese Font Generation via Deep Meta-Learning. Xu Chen, Lei Wu, Minggang He, Lei Meng, Xiangxu Meng. 37-45
- Facial Structure Guided GAN for Identity-preserved Face Image De-occlusion. Yiu-ming Cheung, Mengke Li, Rong Zou. 46-54
- Heterogeneous Side Information-based Iterative Guidance Model for Recommendation. Feifei Dai, Xiaoyan Gu, Zhuo Wang, Mingda Qian, Bo Li, Weiping Wang. 55-63
- Dense Scale Network for Crowd Counting. Feng Dai, Hao Liu, Yike Ma, Xi Zhang, Qiang Zhao. 64-72
- Leveraging Two Types of Global Graph for Sequential Fashion Recommendation. Yujuan Ding, Yunshan Ma, Wai-Keung Wong, Tat-Seng Chua. 73-81
- HSGMP: Heterogeneous Scene Graph Message Passing for Cross-modal Retrieval. Yu Duan, Yun Xiong, Yao Zhang, Yuwei Fu, Yangyong Zhu. 82-91
- GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph. Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara. 92-100
- Can Action be Imitated? Learn to Reconstruct and Transfer Human Dynamics from Videos. Yuqian Fu, Yanwei Fu, Yu-Gang Jiang. 101-109
- SAGN: Semantic Adaptive Graph Network for Skeleton-Based Human Action Recognition. Ziwang Fu, Feng Liu, Jiahao Zhang, Hanyang Wang, Chengyi Yang, Qing Xu, Jiayin Qi, Xiangling Fu, Aimin Zhou. 110-117
- Text-Guided Visual Feature Refinement for Text-Based Person Search. Liying Gao, Kai Niu 0005, Zehong Ma, Bingliang Jiao, Tonghao Tan, Peng Wang. 118-126
- RGB-D Scene Recognition based on Object-Scene Relation and Semantics-Preserving Attention. Yuhui Guo, Xun Liang. 127-134
- Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval. Xiaoshuai Hao, Yucan Zhou, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang. 135-143
- HPOF: 3D Human Pose Recovery from Monocular Video with Optical Flow. Bin Ji, Chen Yang, Shunyu Yao, Ye Pan. 144-154
- Leveraging EfficientNet and Contrastive Learning for Accurate Global-scale Location Estimation. Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Symeon Papadopoulos, Ioannis Kompatsiaris. 155-163
- Relation-aware Hierarchical Attention Framework for Video Question Answering. Fangtao Li, Ting Bai, Chenyu Cao, Zihe Liu, Chenghao Yan, Bin Wu. 164-172
- Cross-Modal Image-Recipe Retrieval via Intra- and Inter-Modality Hybrid Fusion. Jiao Li, Jialiang Sun, Xing Xu 0001, Wei Yu, Fumin Shen. 173-182
- Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval. Mingyong Li, Hongya Wang. 183-191
- A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter. Qinghua Li, Xue Zhang, Cuiping Li 0001, Hong Chen. 192-200
- Local-enhanced Interaction for Temporal Moment Localization. Guoqiang Liang, Shiyu Ji, Yanning Zhang. 201-209
- Reading Scene Text by Fusing Visual Attention with Semantic Representations. Zhiguang Liu, Liangwei Wang, Jian Qiao. 210-218
- Generative Adversarial Networks with Bi-directional Normalization for Semantic Image Synthesis. Jia Long, Hongtao Lu. 219-226
- A Smart Adversarial Attack on Deep Hashing Based Image Retrieval. Junda Lu, Mingyang Chen, Yifang Sun, Wei Wang, Yi Wang, Xiaochun Yang. 227-235
- Image-to-Image Transfer Makes Chaos to Order. Sanbi Luo, Tao Guo. 236-243
- Summary of the 2021 Embedded Deep Learning Object Detection Model Compression Competition for Traffic in Asian Countries. Yu-Shu Ni, Chia-Chi Tsai, Jiun-In Guo, Jenq-Neng Hwang, Bo-Xun Wu, Po-Chi Hu, Ted T. Kuo, Po-Yu Chen, Hsien-Kai Kuo. 244-249
- Nested Dense Attention Network for Single Image Super-Resolution. Cheng Qiu, Yirong Yao, YunTao Du. 250-258
- Multi-scale Dynamic Network for Temporal Action Detection. Yifan Ren, Xing Xu, Fumin Shen, Zheng Wang, Yang Yang 0002, Heng Tao Shen. 267-275
- Distractor-Aware Tracker with a Domain-Special Optimized Benchmark for Soccer Player Tracking. Zikai Song, Zhiwen Wan, Wei Yuan, Ying Tang, Junqing Yu, Yi-Ping Phoebe Chen. 276-284
- Efficient Nearest Neighbor Search by Removing Anti-hub. Kimihiro Tanaka, Yusuke Matsui, Shin'ichi Satoh 0001. 285-293
- A Denoising Convolutional Neural Network for Self-Supervised Rank Effectiveness Estimation on Image Retrieval. Lucas Pascotti Valem, Daniel Carlos Guimarães Pedronette. 294-302
- Know Yourself and Know Others: Efficient Common Representation Learning for Few-shot Cross-modal Retrieval. Shaoying Wang, Hanjiang Lai, Zhenyu Shi. 303-311
- Neural Symbolic Representation Learning for Image Captioning. Xiaomei Wang, Lin Ma 0002, Yanwei Fu, Xiangyang Xue. 312-321
- G-CAM: Graph Convolution Network Based Class Activation Mapping for Multi-label Image Recognition. Yangtao Wang, Yanzhao Xie, Yu Liu, Lisheng Fan. 322-330
- NASTER: Non-local Attentional Scene Text Recognizer. Lei Wu, Xueliang Liu, Yanbin Hao, Yunjie Ma, Richang Hong. 331-338
- Few-Shot Action Localization without Knowing Boundaries. Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras. 339-348
- Learning Hierarchical Visual-Semantic Representation with Phrase Alignment. Baoming Yan, Qingheng Zhang, Liyu Chen, Lin Wang, Leihao Pei, Jiang Yang, Enyun Yu, Xiaobo Li, Binqiang Zhao. 349-357
- Social Relation Analysis from Videos via Multi-entity Reasoning. Chenghao Yan, Zihe Liu, Fangtao Li, Chenyu Cao, Zheng Wang, Bin Wu. 358-366
- Aligning Visual Prototypes with BERT Embeddings for Few-Shot Learning. Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert. 367-375
- TEACH: Attention-Aware Deep Cross-Modal Hashing. Hong-Lei Yao, Yu-Wei Zhan, Zhen-Duo Chen, Xin Luo 0006, Xin-Shun Xu. 376-384
- Scene Text Recognition with Cascade Attention Network. Min Zhang, Meng Ma, Ping Wang. 385-393
- Multi-Attention Audio-Visual Fusion Network for Audio Spatialization. Wen Zhang, Jie Shao. 394-401
- Multi-Initialization Graph Meta-Learning for Node Classification. Feng Zhao, Donglin Wang, Xintao Xiang. 402-410
- Question-Guided Semantic Dual-Graph Visual Reasoning with Novel Answers. Xinzhe Zhou, Yadong Mu. 411-419
- Joint Hand-Object Pose Estimation with Differentiably-Learned Physical Contact Point Analysis. Nan Zhuang, Yadong Mu. 420-428
- HINFShot: A Challenge Dataset for Few-Shot Node Classification in Heterogeneous Information Network. Zifeng Zhuang, Xintao Xiang, Siteng Huang, Donglin Wang. 429-436
- Learning to Select: A Fully Attentive Approach for Novel Object Captioning. Marco Cagrandi, Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara. 437-441
- Semi-supervised Many-to-many Music Timbre Transfer. Yu-Chen Chang, Wen-Cheng Chen, Min-Chun Hu 0001. 442-446
- Text-Enhanced Attribute-Based Attention for Generalized Zero-Shot Fine-Grained Image Classification. Yan-He Chen, Mei-Chen Yeh. 447-450
- Spatio-Temporal Activity Detection and Recognition in Untrimmed Surveillance Videos. Konstantinos Gkountakos, Despoina Touska, Konstantinos Ioannidis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris. 451-455
- Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering. Haifan Gong, Guanqi Chen, Sishuo Liu, Yizhou Yu, Guanbin Li. 456-460
- Body Shape Calculator: Understanding the Type of Body Shapes from Anthropometric Measurements. Shintami Chusnul Hidayati, Yeni Anistyasari. 461-465
- Unsupervised Video Summarization via Multi-source Features. Hussain Kanafani, Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth. 466-470
- Evaluating Contrastive Models for Instance-based Image Retrieval. Tarun Krishna, Kevin McGuinness, Noel E. O'Connor. 471-475
- AWFA-LPD: Adaptive Weight Feature Aggregation for Multi-frame License Plate Detection. Xiaocheng Lu, Yuan Yuan, Qi Wang. 476-480
- NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection. Zekun Luo, Zheng Fang, Sixiao Zheng, Yabiao Wang, Yanwei Fu. 481-485
- Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning. Bowen Wang, LiangZhi Li, Yuta Nakashima, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando. 486-490
- Weakly Supervised Sketch Based Person Search. Lan Yan, Wenbo Zheng, Fei-Yue Wang 0001, Chao Gou. 491-495
- Personal Knowledge Base Construction from Multimodal Data. An-Zi Yen, Chia-Chung Chang, Hen-Hsen Huang, Hsin-Hsi Chen. 496-500
- 2.5D Pose Guided Human Image Generation. Kang Yuan, Sheng Li. 501-505
- Collaborative Representation for Deep Meta Metric Learning. Min Zhu, Weifeng Liu 0001, Kai Zhang, Ye Li, Peng Liu, Baodi Liu. 506-510
- Ten Questions in Lifelog Mining and Information Recall. An-Zi Yen, Hen-Hsen Huang, Hsin-Hsi Chen. 511-518
- Bag of Tricks for Building an Accurate and Slim Object Detector for Embedded Applications. Yongkun Du, Zhineng Chen, Caiyan Jia, Xuanya Li, Yu-Gang Jiang. 519-525
- Efficient-ROD: Efficient Radar Object Detection based on Densely Connected Residual Network. Chih-Chung Hsu, Chieh Lee, Lin Chen, Min-Kai Hung, Andy Yu-Lun Lin, Xian-Yu Wang. 526-532
- DANet: Dimension Apart Network for Radar Object Detection. Bo Ju, Wei Yang, Jinrang Jia, Xiaoqing Ye, Qu Chen, Xiao Tan 0001, Hao Sun, Yifeng Shi, Errui Ding. 533-539
- Object Detection on Embedded Systems for Traffic in Asian Countries. Bao-Hong Lai, Hsun-Ping Hsieh. 540-544
- Squeeze-and-Excitation network-Based Radar Object Detection With Weighted Location Fusion. Pengliang Sun, Xuetong Niu, Pengfei Sun, Kele Xu. 545-552
- ROD2021 Challenge: A Summary for Radar Object Detection Challenge for Autonomous Driving Applications. Yizhou Wang 0005, Jenq-Neng Hwang, Gaoang Wang, Hui Liu 0011, Kwang-Ju Kim, Hung-Min Hsu, Jiarui Cai, Haotian Zhang, Zhongyu Jiang, Renshu Gu. 553-559
- Embedded YOLO: Faster and Lighter Object Detection. Wen-Kai Wu, Chien-Yu Chen 0008, Jiann-Shu Lee. 560-565
- Radar Object Detection Using Data Merging, Enhancement and Fusion. Jun Yu, Xinlong Hao, Xinjian Gao, Qiang Sun, Yuyu Liu, Peng Chang, Zhong Zhang, Fang Gao, Feng Shuang. 566-572
- Scene-aware Learning Network for Radar Object Detection. Zangwei Zheng, Xiangyu Yue, Kurt Keutzer, Alberto L. Sangiovanni-Vincentelli. 573-579
- GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization. Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring. 580-589
- Impact of Interaction Strategies on User Relevance Feedback. Omar Shahbaz Khan, Björn Þór Jónsson 0001, Jan Zahálka, Stevan Rudinac, Marcel Worring. 590-598
- Automatic Baseball Pitch Overlay. Ting-Hsuan Chou, Wei-Ta Chu. 599-602
- Video Action Retrieval Using Action Recognition Model. Yuko Iinuma, Shin'ichi Satoh 0001. 603-606
- MeTILDA: Platform for Melodic Transcription in Language Documentation and Application. Mitchell Lee, Praveena Avula, Min Chen. 607-610
- IR Questioner: QA-based Interactive Retrieval System. Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama. 611-614
- Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting. Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai-Keung Wong, Tat-Seng Chua, Jinyoung Moon, Hong-Han Shuai. 615-618
- A Beneficial Dual Transformation Approach for Deep Learning Networks Used in Steel Surface Defect Detection. Fityanul Akhyar, Chih-Yang Lin, Gugan S. Kathiresan. 619-622
- Discrete Tchebichef Transform for Versatile Video Coding. Ka Hou Chan, Sio Kei Im. 623-626
- Fire Detection using Transformer Network. Mohammad Shahid, Kai-Lung Hua. 627-630
- Visible-infrared Person Re-identification with Human Body Parts Assistance. Huangpeng Dai, Qing Xie 0002, Jiachen Li, Yanchun Ma, Lin Li, Yongjian Liu. 631-637
- Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition. Zilong Fu, Hongtao Xie, Guoqing Jin, Junbo Guo. 638-644
- Contextualized Keyword Representations for Multi-modal Retinal Image Captioning. Jia-Hong Huang, Ting-Wei Wu, Marcel Worring. 645-652
- MSAV: An Unified Framework for Multi-view Subspace Analysis with View Consistence. Huibing Wang, Guangqi Jiang, Jinjia Peng, XianPing Fu. 653-659
- A Tensor Sparse Representation-Based CBMIR System for Computer-Aided Diagnosis of Focal Liver Lesions and its Pilot Trial. Jian Wang, Xian-Hua Han, Lanfen Lin, Hongjie Hu, Yen-Wei Chen 0001. 660-666
- M-DFNet: Multi-phase Discriminative Feature Network for Retrieval of Focal Liver Lesions. Yingying Xu, Jing Liu 0041, Lanfen Lin, Hongjie Hu, Ruofeng Tong 0001, Jingsong Li, Yen-Wei Chen 0001. 667-673
- M2GUDA: Multi-Metrics Graph-Based Unsupervised Domain Adaptation for Cross-Modal Hashing. Chengyuan Zhang, Zhi Zhong, Lei Zhu, Shichao Zhang, Da Cao, Jianfeng Zhang. 674-681
- Human Pose Estimation based on Attention Multi-resolution Network. Congcong Zhang, Ning He, Qixiang Sun, Xiaojie Yin, Ke Lu. 682-687
- ICDAR'21: Intelligent Cross-Data Analysis and Retrieval. Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, Minh-Triet Tran, Thanh Binh Nguyen. 688-689
- Introduction to the Fourth Annual Lifelog Search Challenge, LSC'21. Cathal Gurrin, Björn Þór Jónsson 0001, Klaus Schöffmann, Duc-Tien Dang-Nguyen, Jakub Lokoc, Minh-Triet Tran, Wolfgang Hürst, Luca Rossetto, Graham Healy. 690-691
- MMArt-ACM'21: International Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia 2021. Min-Chun Hu 0001, Ichiro Ide, Kensuke Tobitani. 692-693
- MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding. Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui. 694-695
- CEA'21: The 13th Workshop on Multimedia for Cooking and Eating Activities. Yoko Yamakata, Atsushi Hashimoto. 696-697