DICTA 2019 - Keynote Speakers

Prof Mubarak Shah
Director, Center for Research in Computer Vision, University of Central Florida, USA

Biography: Dr. Mubarak Shah, Trustee Chair Professor of Computer Science, is the founding director of the Center for Research in Computer Vision at UCF. His research interests include video surveillance, visual tracking, human activity recognition, visual analysis of crowded scenes, video registration, and UAV video analysis. Dr. Shah is a fellow of IEEE, AAAS, IAPR and SPIE. In 2006, he was awarded a Pegasus Professor award, the highest award at UCF. He is an ACM Distinguished Speaker. He was an IEEE Distinguished Visitor speaker for 1997-2000 and received the IEEE Outstanding Engineering Educator Award in 1997. He received the Harris Corporation’s Engineering Achievement Award in 1999; the TOKTEN awards from UNDP in 1995, 1997, and 2000; the Teaching Incentive Program award in 1995 and 2003; the Research Incentive Award in 2003 and 2009; Millionaires’ Club awards in 2005 and 2006; the University Distinguished Researcher award in 2007; and an honorable mention for the ICCV 2005 Where Am I? Challenge Problem. He was also nominated for the best paper award at the ACM Multimedia Conference in 2005. He is an editor of an international book series on Video Computing, editor-in-chief of the Machine Vision and Applications journal, and an associate editor of the ACM Computing Surveys journal. He was an associate editor of the IEEE Transactions on PAMI and a guest editor of the special issue of the International Journal of Computer Vision on Video Computing.

Title: Video Capsule Networks: Human Action Detection, Video Object Segmentation and Video Segmentation Conditioned on Text

Abstract: Employing deep learning, tremendous progress has been made in a very short time in solving difficult computer vision problems, and very impressive results have been obtained. This has been achieved mainly by employing convolutional neural networks (CNNs). Traditional CNNs can be used to extract a good set of features, but they do not explicitly model the entities present in the input data. A capsule is a group of neurons which can model different entities or parts of entities. A capsule network provides an effective way to model part-to-whole relationships between entities and allows viewpoint-invariant representations to be learned. Good results in classifying small images (MNIST) using capsules have been obtained. However, capsules have not been demonstrated on high-dimensional data like large images or videos. In this talk, I will present Video Capsule Networks for solving three important problems: Human Action Detection, Video Object Segmentation and Video Segmentation Conditioned on Text.

Prof Ian Reid
University of Adelaide, Australia

Biography: Ian Reid is a Professor of Computer Science. He joined the School in September 2012. He is part of the Australian Centre for Visual Technologies, a University research centre within the School. He was formerly a Professor of Engineering Science at the University of Oxford where, together with long-time colleague Prof. David Murray, he ran the Active Vision Group, which is part of the wider Robotics Research Group. His research interests span a wide range of topics in Computer Vision. In particular, he is concerned with algorithms for visual control of active head/eye robotic platforms (for surveillance and navigation), visual geometry and camera self-calibration (applications of these to measurement, AR and VR, including sporting events), visual SLAM, human motion capture, activity analysis, and novel view synthesis. Examples of these and other projects can be found on the Active Vision Group's web pages.

Dr. Ivan Laptev
Senior researcher, WILLOW project team, INRIA Paris

Biography: Ivan Laptev is a senior researcher at INRIA Paris and head of the scientific board at VisionLabs. He received a PhD degree in Computer Science from the Royal Institute of Technology in 2004 and a Habilitation degree from École Normale Supérieure in 2013. Ivan's main research interests include visual recognition of human actions, objects and interactions, and more recently robotics. He has published over 70 papers at international conferences and journals of computer vision and machine learning. He serves as an associate editor of the IJCV and TPAMI journals, has served as a program chair for CVPR’18, and is a regular area chair for CVPR, ICCV and ECCV. He has co-organized several tutorials, workshops and challenges at major computer vision conferences. He has also co-organized a series of INRIA summer schools on computer vision and machine learning (2010-2013) and Machines Can See summits (2017-2019). He received an ERC Starting Grant in 2012 and was awarded a Helmholtz prize in 2017.

Title: Towards Embodied Action Understanding

Abstract: Computer vision has come a long way towards automatic labeling of objects, scenes and human actions in visual data. While this recent progress already powers applications such as visual search and autonomous driving, visual scene understanding remains an open challenge beyond specific applications. In this talk I will outline limitations of human-defined labels and will argue for a task-driven approach to scene understanding. Towards this goal I will describe our recent efforts on learning visual models from narrated instructional videos. I will present methods for automatic discovery of actions and object states associated with specific tasks such as changing a car tire or making coffee. As part of these efforts, I will describe a state-of-the-art method for text-based video search using our recent dataset of 100M automatically collected narrated videos. Finally, I will present our ongoing work on visual scene understanding for real robots, where we train agents to discover sequences of actions to achieve particular tasks.

Prof Kyros Kutulakos
Professor, Department of Computer Science, University of Toronto

Biography: Kyros Kutulakos is a Professor of Computer Science at the University of Toronto. He received his PhD degree from the University of Wisconsin-Madison in 1994 and his BS degree from the University of Crete in 1988, both in Computer Science. Kyros has been a pioneer in the area of computational light transport, developing theoretical tools and computational cameras to analyze light propagation in real-world environments. He is the recipient of an Alfred P. Sloan Fellowship, an Ontario Premier's Research Excellence Award, a Marr Prize in 1999, a Marr Prize Honorable Mention in 2005, and five more paper prizes at CVPR 1994, ECCV 2006, CVPR 2014, CVPR 2017 and CVPR 2019. He was Program Co-Chair of CVPR 2003 and ICCV 2013, and also served as Program Co-Chair of the Second International Conference on Computational Photography in 2010.