Module 1: Perception
Reading tasks
Sensor and Sensor Fusion Technology in Autonomous Vehicles: A Review [ Link ]
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [ Link ]
You Only Look Once: Unified, Real-Time Object Detection [ Link ]
DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving [ Link ]
Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving [ Link ]
Simple online and realtime tracking with a deep association metric [ Link ]
Blog Post 4: DeepDriving
The paper "DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving" introduces a novel approach to autonomous driving by predicting key driving affordances rather than processing the entire scene or mapping images directly to commands. Traditional Mediated Perception methods require complex scene reconstruction, while Behavior Reflex methods lack interpretability. The proposed Direct Perception model extracts 13 key affordances such as lane distances, heading angles, and vehicle distances using a Convolutional Neural Network (CNN). This study highlights how Direct Perception improves efficiency, interpretability, and generalization to real-world driving scenarios by balancing perception-based and reflex-based approaches.
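To make the idea of a compact affordance representation concrete, here is a minimal sketch of how a small set of predicted indicators could feed a simple steering rule. The `Affordances` fields, the sign conventions, and the proportional `gain` are illustrative assumptions, not the paper's exact 13 indicators or its controller.

```python
from dataclasses import dataclass


@dataclass
class Affordances:
    """Hypothetical subset of affordance indicators predicted by the CNN."""
    heading_angle: float         # radians, car heading relative to the road direction
    dist_to_lane_center: float   # meters, lateral offset from the lane center
    lane_width: float            # meters
    dist_to_lead_vehicle: float  # meters


def steering_command(a: Affordances, gain: float = 1.0) -> float:
    """Proportional steering from heading error and normalized lateral offset.

    The sign convention and the gain are assumptions made for this sketch.
    """
    return gain * (a.heading_angle - a.dist_to_lane_center / a.lane_width)


# A car that is centered and aligned with the road needs no steering correction.
centered = Affordances(heading_angle=0.0, dist_to_lane_center=0.0,
                       lane_width=4.0, dist_to_lead_vehicle=30.0)
command = steering_command(centered)
```

The point of the sketch is that the controller only needs a handful of numbers, each with a clear physical meaning, which is what makes the approach easier to interpret than a direct image-to-command mapping.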
[Read more ...]
Blog Post 3: YOLO
This paper presents YOLO, a real-time object detection method that achieves higher accuracy than comparable real-time detectors. The authors also found that combining YOLO with Fast R-CNN yields higher accuracy than Fast R-CNN alone.
[Read more ...]
Blog Post 2: Sensor Fusion
The paper "Sensor and Sensor Fusion Technology in Autonomous Vehicles: A Review" provides a comprehensive overview of the role of sensors in autonomous vehicles (AVs), emphasizing their importance in perception, localization, and decision-making. It examines key sensor technologies such as cameras, LiDAR, and radar, discussing their strengths, limitations, and performance under various environmental conditions. The paper highlights the necessity of sensor calibration as a prerequisite for accurate data fusion and object detection, reviewing available open-source calibration tools. Additionally, it categorizes sensor fusion approaches into high-level, mid-level, and low-level fusion, evaluating state-of-the-art algorithms that enhance object detection and overall driving safety. The review concludes by addressing challenges in sensor fusion, such as data synchronization and environmental adaptability, while proposing future research directions for improving autonomous vehicle technology.
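As a rough illustration of high-level (object-level) fusion, the sketch below combines two independent range estimates of the same object, for example one from a camera pipeline and one from radar, using inverse-variance weighting. This is a generic textbook technique chosen for illustration, not an algorithm taken from the review; the function name and the example numbers are assumptions.

```python
from typing import Tuple


def fuse_estimates(x1: float, var1: float, x2: float, var2: float) -> Tuple[float, float]:
    """Fuse two independent estimates of the same quantity.

    Each estimate is weighted by the inverse of its variance, so the more
    certain sensor contributes more to the fused value.
    """
    w1 = 1.0 / var1
    w2 = 1.0 / var2
    fused = (w1 * x1 + w2 * x2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)
    return fused, fused_var


# Hypothetical readings: camera says 10 m, radar says 12 m, equal uncertainty.
distance, variance = fuse_estimates(10.0, 1.0, 12.0, 1.0)
```

Note that this only makes sense once both measurements refer to the same object in the same coordinate frame, which is why the review treats calibration and synchronization as prerequisites for fusion.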
[Read more ...]
Blog Post 1: Faster R-CNN
The paper introduces Faster R-CNN, a deep learning-based object detection framework that improves upon previous region-based detection models by integrating a Region Proposal Network (RPN). Unlike earlier methods that relied on computationally expensive region proposal algorithms, Faster R-CNN shares convolutional features between the region proposal and object detection networks, making region proposals nearly cost-free. The RPN generates region proposals efficiently, and the Fast R-CNN detector then classifies and refines them. The experimental results demonstrate that Faster R-CNN improves detection accuracy while running at near real-time speeds, making it a powerful tool for object detection tasks.
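The summary above does not say how the RPN enumerates candidate boxes. In the Faster R-CNN paper this is done with anchors: reference boxes at 3 scales and 3 aspect ratios centered on each sliding-window location. The sketch below generates the 9 anchors for one location; the function name and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def make_anchors(cx: float, cy: float,
                 scales=(128, 256, 512),
                 ratios=(0.5, 1.0, 2.0)) -> List[Box]:
    """Return one anchor box per (scale, ratio) pair, centered at (cx, cy).

    Each anchor has area scale**2 and height/width equal to ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(0.0, 0.0)
```

In the full network, the RPN predicts an objectness score and box offsets for every anchor at every feature-map location; that part is omitted here.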
[Read more ...]
Example: Machine Learning Applications
Reading tasks
Deep Residual Learning for Image Recognition [ Link ]
Attention Is All You Need [ Link ]
Blog Post Example: ResNet
As neural networks grow deeper, they become harder to train: vanishing and exploding gradients can occur, and accuracy can degrade as layers are added (a problem the paper notes is not caused by overfitting). To address this, the paper proposes deep residual networks (ResNets). By introducing "shortcut connections," it makes very deep networks much easier to optimize, and it has had an important impact on the field of deep learning. The method explicitly reformulates the layers as learning residual functions with reference to the layer inputs. By learning residuals, the network can be optimized more easily and deeper models can be trained more effectively, which helps address the performance degradation that may occur as the number of layers increases. The paper also reports experiments in which the model shows significant improvements on visual recognition benchmarks such as ImageNet and CIFAR-10. The success of deep residual networks in major visual recognition competitions, ILSVRC and COCO 2015, further demonstrates their power and wide applicability.
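The residual formulation can be written as y = F(x) + x, where F is the function the stacked layers learn and the "+ x" term is the shortcut connection. Below is a minimal pure-Python sketch of one such block using two small fully connected layers. The helper names and the fully connected layers are assumptions for illustration; the paper uses convolutional layers with batch normalization.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def relu(v: Vector) -> Vector:
    return [max(0.0, a) for a in v]


def linear(v: Vector, weights: Matrix) -> Vector:
    """Multiply a weight matrix (given as a list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute relu(F(x) + x) with F(x) = W2 * relu(W1 * x)."""
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + a for f, a in zip(fx, x)])


# With all-zero weights F(x) is zero, so the block passes x through unchanged.
zeros = [[0.0] * 3 for _ in range(3)]
output = residual_block([1.0, 2.0, 3.0], zeros, zeros)
```

This shows why adding such a block should not make a network worse in principle: if the extra layers are not useful, driving F toward zero recovers the identity mapping.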
[Read more ...]