Latest articles in computer vision [1016]
CV: latest articles covering the cs.CV category

Learning Generalisable Omni-Scale Representations for Person Re-Identification
Link: http://arxiv.org/abs/1910.06827v1
Notes: Extension of conference version: arXiv:1905.00953. Source code: https://github.com/KaiyangZhou/deep-person-reid
Authors: Kaiyang Zhou; Xiatian Zhu; Yongxin Yang; Andrea Cavallaro; Tao Xiang
Abstract: An effective person re-identification (re-ID) model should learn feature representations that are both discriminative, for distinguishing similar-looking people, and generalisable, for deployment across datasets without any adaptation.

SegSort: Segmentation by Discriminative Sorting of Segments
Link: http://arxiv.org/abs/1910.06962v1
Notes: In ICCV 2019. Webpage & Code: https://jyhjinghwang.github.io/projects/segsort.html
Authors: Jyh-Jing Hwang; Stella X. Yu; Jianbo Shi; Maxwell D. Collins; Tien-Ju Yang; Xiao Zhang; Liang-Chieh Chen
Abstract: Almost all existing deep learning approaches for semantic segmentation tackle this task as a pixel-wise classification problem. Yet humans understand a scene not in terms of pixels, but by decomposing it into perceptual groups and structures that are the basic building blocks of recognition.

Tiny Video Networks
Link: http://arxiv.org/abs/1910.06961v1
Authors: AJ Piergiovanni; Anelia Angelova; Michael S. Ryoo
Abstract: Video understanding is a challenging problem with great impact on the abilities of autonomous agents working in the real world. Yet, solutions so far have been computationally intensive, with the fastest algorithms running for more than half a second per video snippet on powerful GPUs.

Human Action Recognition with Multi-Laplacian Graph Convolutional Networks
Link: http://arxiv.org/abs/1910.06934v1
Authors: Ahmed Mazari; Hichem Sahbi
Abstract: Convolutional neural networks are nowadays witnessing a major success in different pattern recognition problems.

DeepGCNs: Making GCNs Go as Deep as CNNs
Link: http://arxiv.org/abs/1910.06849v1
Notes: First two authors contributed equally. This work is a journal extension of our ICCV'19 paper arXiv:1904.03751
Authors: Guohao Li; Matthias Müller; Guocheng Qian; Itzel C. Delgadillo; Abdulellah Abualshour; Ali Thabet; Bernard Ghanem
Abstract: Convolutional Neural Networks (CNNs) have been very successful at solving a variety of computer vision tasks such as object classification and detection, semantic segmentation, and activity understanding, to name just a few.

A Compact Neural Architecture for Visual Place Recognition
Link: http://arxiv.org/abs/1910.06840v1
Notes: Submitted to RA-L with ICRA 2020 presentation option, 8 pages, 13 figures
Authors: Marvin Chancán; Luis Hernandez-Nunez; Ajay Narendra; Andrew B. Barron; Michael Milford
Abstract: None

Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
Link: http://arxiv.org/abs/1910.06809v1
Notes: Code will be available soon at https://github.com/xh-liu/CC-FPSE
Authors: Xihui Liu; Guojun Yin; Jing Shao; Xiaogang Wang; Hongsheng Li
Abstract: Semantic image synthesis aims at generating photorealistic images from semantic layouts.

Cortical-inspired Wilson-Cowan-type equations for orientation-dependent contrast perception modelling
Link: http://arxiv.org/abs/1910.06808v1
Notes: This is the extended invited journal version of the SSVM 2019 conference proceeding arXiv:1812.07425
Authors: Marcelo Bertalmío; Luca Calatroni; Valentina Franceschi; Benedetta Franceschiello; Dario Prandi
Abstract: We consider the evolution model proposed in to describe illusory contrast perception phenomena induced by surrounding orientations. Firstly, we highlight its analogies and differences with the widely used Wilson-Cowan equations, mainly in terms of efficient representation properties.

Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019
Link: http://arxiv.org/abs/1910.06737v1
Notes: ICCV 2019 VATEX challenge
Authors: Shizhe Chen; Yida Zhao; Yuqing Song; Qin Jin; Qi Wu
Abstract: This notebook paper presents our model in the VATEX video captioning challenge. In order to capture multi-level aspects in the video, we propose to integrate both temporal and spatial attentions for video captioning.

Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints
Link: http://arxiv.org/abs/1910.06727v1
Notes: Accepted to ICCV 2019
Authors: Yan Xu; Xinge Zhu; Jianping Shi; Guofeng Zhang; Hujun Bao; Hongsheng Li
Abstract: Depth completion aims to recover dense depth maps from sparse depth measurements. It is of increasing importance for autonomous driving and draws increasing attention from the vision community.

Seeing and Hearing Egocentric Actions: How Much Can We Learn?
Link: http://arxiv.org/abs/1910.06693v1
Notes: Accepted for the Fifth International Workshop on Egocentric Perception, Interaction and Computing (EPIC) at the International Conference on Computer Vision (ICCV) 2019
Authors: Alejandro Cartas; Jordi Luque; Petia Radeva; Carlos Segura; Mariella Dimiccoli
Abstract: Our interaction with the world is an inherently multimodal experience. However, the understanding of human-to-object interactions has historically been addressed focusing on a single modality.

Being the center of attention: A Person-Context CNN framework for Personality Recognition
Link: http://arxiv.org/abs/1910.06690v1
Authors: Dario Dotti; Mirela Popa; Stylianos Asteriadis
Abstract: This paper proposes a novel study on personality recognition using video data from different scenarios. Our goal is to jointly model nonverbal behavioral cues with contextual information for a robust, multi-scenario, personality recognition system.

A Method to Generate Synthetically Warped Document Image
Link: http://arxiv.org/abs/1910.06621v1
Authors: Arpan Garai; Samit Biswas; Sekhar Mandal; Bidyut B. Chaudhuri
Abstract: Document images captured by a digital camera may often be warped and distorted due to different camera angles or document surfaces. A robust technique is needed to solve this kind of distortion.

Background Segmentation for Vehicle Re-Identification
Link: http://arxiv.org/abs/1910.06613v1
Authors: Mingjie Wu; Yongfei Zhang; Tianyu Zhang; Wenqi Zhang
Abstract: Vehicle re-identification (Re-ID) is very important in intelligent transportation and video surveillance. Prior works focus on extracting discriminative features from the visual appearance of vehicles or using visual-spatio-temporal information.

Stereo-based Multi-motion Visual Odometry for Mobile Robots
Link: http://arxiv.org/abs/1910.06607v1
Notes: 5 pages, 5 figures
Authors: Qing Zhao; Bin Luo; Yun Zhang
Abstract: With the development of computer vision, visual odometry is adopted by more and more mobile robots. However, we found that not only the robot's own pose but also the poses of other moving objects are crucial for the robot's decisions.

Trajectorylet-Net: a novel framework for pose prediction based on trajectorylet descriptors
Link: http://arxiv.org/abs/1910.06583v1
Authors: Xiaoli Liu; Jianqin Yin; Jin Tang; Zhicheng Zhang
Abstract: Pose prediction is an increasingly interesting topic in computer vision and robotics. In this paper, we propose a new network, Trajectorylet-Net, to predict future poses.

IMMVP: An Efficient Daytime and Nighttime On-Road Object Detector
Link: http://arxiv.org/abs/1910.06573v1
Authors: Cheng-En Wu; Yi-Ming Chan; Chien-Hung Chen; Wen-Cheng Chen; Chu-Song Chen
Abstract: It is hard to detect on-road objects under various lighting conditions. To improve the quality of the classifier, three techniques are used. We define subclasses to separate daytime and nighttime samples. Then we skip similar samples in the training set to prevent overfitting.

Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks
Link: http://arxiv.org/abs/1910.06540v1
Notes: 13 pages, 2 figures, 'Online First' version. For associated mp4 files, see journal website
Authors: Jasper S. Wijnands; Jason Thompson; Kerry A. Nice; Gideon D. P. A. Aschwanden; Mark Stevenson
Abstract: Driver drowsiness increases crash risk, leading to substantial road trauma each year. Drowsiness detection methods have received considerable attention, but few studies have investigated the implementation of a detection approach on a mobile phone.

End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
Link: http://arxiv.org/abs/1910.06528v1
Notes: CoRL 2019
Authors: Yin Zhou; Pei Sun; Yu Zhang; Dragomir Anguelov; Jiyang Gao; Tom Ouyang; James Guo; Jiquan Ngiam; Vijay Vasudevan
Abstract: Recent work on 3D object detection advocates point cloud voxelization in bird's-eye view, where objects preserve their physical dimensions and are naturally separable.

Target-Oriented Deformation of Visual-Semantic Embedding Space
Link: http://arxiv.org/abs/1910.06514v1
Notes: 8 pages
Authors: Takashi Matsubara
Abstract: Multimodal embedding is a crucial research topic for cross-modal understanding, data mining, and translation. Many studies have attempted to extract representations from given entities and align them in a shared embedding space.

Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style
Link: http://arxiv.org/abs/1910.06475v1
Notes: ICCV 2019
Authors: Hongwei Ge; Zehang Yan; Kai Zhang; Mingde Zhao; Liang Sun
Abstract: Image captioning is a research hotspot where encoder-decoder models combining convolutional neural network (CNN) and long short-term memory (LSTM) achieve promising results. Despite significant progress, these models generate sentences differently from human cognitive styles.

End-to-End Adversarial Shape Learning for Abdomen Organ Deep Segmentation
Link: http://arxiv.org/abs/1910.06474v1
Notes: Accepted to International Workshop on Machine Learning in Medical Imaging (MLMI 2019)
Authors: Jinzheng Cai; Yingda Xia; Dong Yang; Daguang Xu; Lin Yang; Holger Roth
Abstract: Automatic segmentation of abdomen organs using medical imaging has many potential applications in clinical workflows. Recently, the state-of-the-art performance for organ segmentation has been achieved by deep learning models, i.e., convolutional neural networks (CNN).

Building Damage Detection in Satellite Imagery Using Convolutional Neural Networks
Link: http://arxiv.org/abs/1910.06444v1
Authors: Joseph Z. Xu; Wenhan Lu; Zebo Li; Pranav Khaitan; Valeriya Zaytseva
Abstract: In all types of disasters, from earthquakes to armed conflicts, aid workers need accurate and timely data such as damage to buildings and population displacement to mount an effective response.

Restoration of marker occluded hematoxylin and eosin stained whole slide histology images using generative adversarial networks
Link: http://arxiv.org/abs/1910.06428v1
Authors: Bairavi Venkatesh; Tosha Shah; Antong Chen; Soheil Ghafurian
Abstract: It is common for pathologists to annotate specific regions of the tissue, such as tumor, directly on the glass slide with markers.

Tell-the-difference: Fine-grained Visual Descriptor via a Discriminating Referee
Link: http://arxiv.org/abs/1910.06426v1
Authors: Shuangjie Xu; Feng Xu; Yu Cheng; Pan Zhou
Abstract: In this paper, we investigate a novel problem of telling the difference between image pairs in natural language. Compared to previous approaches for single image captioning, it is challenging to derive a linguistic representation from two independent sources of visual information.

FireNet: Real-time Segmentation of Fire Perimeter from Aerial Video
Link: http://arxiv.org/abs/1910.06407v1
Notes: Published at NeurIPS 2019; Workshop on Artificial Intelligence for Humanitarian Assistance and Disaster Response (AI+HADR 2019)
Authors: Jigar Doshi; Dominic Garcia; Cliff Massey; Pablo Llueca; Nicolas Borensztein; Michael Baird; Matthew Cook; Devaki Raj
Abstract: In this paper, we share our approach to real-time segmentation of fire perimeter from aerial full-motion infrared video. We start by describing the problem from a humanitarian aid and disaster response perspective.

Building Information Modeling and Classification by Visual Learning At A City Scale
Link: http://arxiv.org/abs/1910.06391v1
Notes: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
Authors: Qian Yu; Chaofeng Wang; Barbaros Cetiner; Stella X. Yu; Frank Mckenna; Ertugrul Taciroglu; Kincho H. Law
Abstract: In this paper, we provide two case studies to demonstrate how artificial intelligence can empower civil engineering. In the first case, a machine learning-assisted framework, BRAILS, is proposed for city-scale building information modeling.

The Local Elasticity of Neural Networks
Link: http://arxiv.org/abs/1910.06943v1
Notes: 11 pages
Authors: Hangfeng He; Weijie J. Su
Abstract: This paper presents a phenomenon in neural networks that we refer to as "local elasticity".

Quantifying Classification Uncertainty using Regularized Evidential Neural Networks
Link: http://arxiv.org/abs/1910.06864v1
Notes: Presented at AAAI FSS-19: Artificial Intelligence in Government and Public Sector, Arlington, Virginia, USA
Authors: Xujiang Zhao; Yuzhe Ou; Lance Kaplan; Feng Chen; Jin-Hee Cho
Abstract: Traditional deep neural nets (NNs) have shown state-of-the-art performance in the task of classification in various applications.

Neural Approximation of an Auto-Regressive Process through Confidence Guided Sampling
Link: http://arxiv.org/abs/1910.06705v1
Authors: YoungJoon Yoo; Sanghyuk Chun; Sangdoo Yun; Jung-Woo Ha; Jaejun Yoo
Abstract: We propose a generic confidence-based approximation that can be plugged in and simplify the auto-regressive generation process with a proved convergence. We first assume that the priors of future samples can be generated in an independently and identically distributed (i.i.d.

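The general shape of a "plug-in" confidence-guided generation scheme like the one this abstract describes can be pictured with a toy sketch. All names here are hypothetical illustrations, not the paper's algorithm: at each step, a cheap approximation replaces the expensive auto-regressive model call whenever a confidence estimate clears a threshold.

```python
def confidence_guided_generate(expensive_step, cheap_step, confidence,
                               seq, length, threshold=0.9):
    """Grow `seq` to `length` samples, invoking the costly model only
    when confidence in the cheap approximation is below `threshold`."""
    seq = list(seq)
    while len(seq) < length:
        if confidence(seq) >= threshold:
            seq.append(cheap_step(seq))      # fast approximate sample
        else:
            seq.append(expensive_step(seq))  # full auto-regressive step
    return seq
```

The payoff comes from how often the confidence check passes: if most steps are confident, most expensive model calls are skipped.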
SafeCritic: Collision-Aware Trajectory Prediction
Link: http://arxiv.org/abs/1910.06673v1
Notes: To appear as a workshop paper at the British Machine Vision Conference (BMVC) 2019
Authors: Tessa van der Heiden; Naveen Shankar Nagaraja; Christian Weiss; Efstratios Gavves
Abstract: Navigating complex urban environments safely is a key to realize fully autonomous systems. Predicting future locations of vulnerable road users, such as pedestrians and cyclists, has thus received a lot of attention in recent years.

Topological Navigation Graph
Link: http://arxiv.org/abs/1910.06658v1
Authors: Povilas Daniusis; Shubham Juneja; Lukas Valatka; Linas Petkevicius
Abstract: In this article, we focus on the utilisation of reactive trajectory imitation controllers for goal-directed mobile robot navigation. We propose a topological navigation graph (TNG) - an imitation-learning-based framework for navigating through environments with intersecting trajectories.

Liver segmentation and metastases detection in MR images using convolutional neural networks
Link: http://arxiv.org/abs/1910.06635v1
Authors: Mariëlle J. A. Jansen; Hugo J. Kuijf; Maarten Niekel; Wouter B. Veldhuis; Frank J. Wessels; Max A. Viergever; Josien P. W. Pluim
Abstract: Primary tumors have a high likelihood of developing metastases in the liver and early detection of these metastases is crucial for patient outcome. We propose a method based on convolutional neural networks (CNN) to detect liver metastases.

Multi-Frame GAN: Image Enhancement for Stereo Visual Odometry in Low Light
Link: http://arxiv.org/abs/1910.06632v1
Notes: Accepted by the 3rd Conference on Robot Learning, Osaka, Japan (CoRL 2019). The first two authors contributed equally to this paper
Authors: Eunah Jung; Nan Yang; Daniel Cremers
Abstract: We propose the concept of a multi-frame GAN (MFGAN) and demonstrate its potential as an image sequence enhancement for stereo visual odometry in low light conditions.

Training CNNs faster with Dynamic Input and Kernel Downsampling
Link: http://arxiv.org/abs/1910.06548v1
Notes: 12 pages, 4 figures
Authors: Zissis Poulos; Ali Nouri; Andreas Moshovos
Abstract: We reduce training time in convolutional networks (CNNs) with a method that, for some of the mini-batches: a) scales down the resolution of input images via downsampling, and b) reduces the forward pass operations via pooling on the convolution filters.

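The input-downsampling half of the idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the NumPy average-pooling, and the random per-batch selection are illustrative assumptions: for a chosen fraction of mini-batches, inputs are pooled to a lower resolution before the forward pass, cutting per-batch compute roughly quadratically in the factor.

```python
import numpy as np

def downsample_batch(batch, factor=2):
    """Average-pool a batch of square images (N, H, W) by `factor`."""
    n, h, w = batch.shape
    assert h % factor == 0 and w % factor == 0
    return batch.reshape(n, h // factor, factor,
                         w // factor, factor).mean(axis=(2, 4))

def make_training_batches(batches, downsample_fraction=0.5, factor=2, seed=0):
    """Yield mini-batches, downsampling a random fraction of them."""
    rng = np.random.default_rng(seed)
    for batch in batches:
        if rng.random() < downsample_fraction:
            yield downsample_batch(batch, factor)
        else:
            yield batch
```

A training loop would consume these mixed-resolution batches with a network that tolerates variable input size (e.g. one ending in global average pooling).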
State of Compact Architecture Search For Deep Neural Networks
Link: http://arxiv.org/abs/1910.06466v1
Notes: 6 pages
Authors: Mohammad Javad Shafiee; Andrew Hryniowski; Francis Li; Zhong Qiu Lin; Alexander Wong
Abstract: The design of compact deep neural networks is a crucial task to enable widespread adoption of deep neural networks in the real world, particularly for edge and mobile scenarios.

Real-time Data Driven Precision Estimator for RAVEN-II Surgical Robot End Effector Position
Link: http://arxiv.org/abs/1910.06425v1
Notes: 6 pages, 10 figures, ICRA 2020 (under review)
Authors: Haonan Peng; Xingjian Yang; Yun-Hsuan Su; Blake Hannaford
Abstract: Surgical robots have been introduced to operating rooms over the past few decades due to their high sensitivity, small size, and remote controllability. The cable-driven nature of many surgical robots allows the systems to be dexterous and lightweight, with diameters as low as 5 mm.