In 2017, my team at NVIDIA released a set of AI models and code on GitHub for creating autonomous drones and rovers that can navigate complex, unmapped places without GPS. I presented this work at the IROS 2017 conference. Our drone, nicknamed Redtail, can fly along forest trails autonomously, achieving record-breaking long-range flights of more than 1 kilometer (about six-tenths of a mile) in the lower forest canopy.
Our technology can turn any drone or rover into an autonomous one, capable of navigating along roads, forest trails, tunnels, under bridges, and inside buildings using only visual sensors. All that's needed is a path the mobile robot can recognize visually. Here is our record-breaking 1-kilometer (about six-tenths of a mile) long-range flight:
Our code consists of a deep neural network (DNN) called TrailNet and a set of Robot Operating System (ROS) nodes for camera I/O, control, and path planning.
The TrailNet DNN estimates the drone's orientation and lateral offset with respect to the navigation path. The provided control system uses this estimated pose to fly the drone along the path. TrailNet is based on the ResNet-18 architecture with our modifications for navigation tasks. Our control code is based on waypoint navigation and uses the popular Pixhawk/PX4 stack for drone piloting (or the ArduRover stack on ground rovers).
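To make the pose-to-control step concrete, here is a simplified sketch of how a controller might turn TrailNet's two 3-way softmax heads (view orientation and lateral offset, as described in our paper) into a yaw command. The function name, gain parameters, and sign convention are illustrative, not the exact redtail controller:

```python
# Sketch: convert TrailNet's two softmax heads into a yaw-rate command.
# Each head is a (left, center, right) probability triple. Convention used
# here: a positive return value means "turn right".

def yaw_command(view_probs, offset_probs, k_view=1.0, k_offset=0.5):
    """view_probs: P(facing left/center/right of the trail direction).
    offset_probs: P(positioned left/center/right of the trail center).
    k_view, k_offset are illustrative steering gains."""
    vl, _, vr = view_probs
    ol, _, orr = offset_probs
    # If the drone faces or sits left of the trail, steer right, and vice versa.
    return k_view * (vl - vr) + k_offset * (ol - orr)
```

In practice such a command would be translated into waypoints for the PX4 autopilot rather than applied as a raw yaw rate.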
We also provide an object-detection DNN based on YOLO for detecting humans, vehicles, bikes, and so on. Here is an example of human detection:
You can also find hardware specs and instructions for building your own autonomous drone based on the TBS Discovery 450 frame. For example, you can build slick-looking drones like these (the second one has a stereo camera):
All code and models run on NVIDIA Jetson TX1/TX2 embedded computers. Our TrailNet navigation DNN runs at 60 fps on the Jetson TX2. Here is the credit-card-sized Jetson TX2 compute module next to the drone we used for our trail experiments; the Jetson was mounted on the drone's belly. We used one forward-facing monocular camera for navigation and one downward-facing PX4FLOW optical-flow unit for drone stabilization.
TrailNet's training process is based on imitation learning: a human traverses a target environment just as they would normally walk, fly, or ride, holding a special 3-camera rig that records different viewpoints along the path. We then use an auto-labeling technique to extract training data and train TrailNet to compute how a human would move, given the currently observed viewpoint, to stay on the navigation path or trail. Here is me holding the 3-camera rig:
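The auto-labeling idea can be sketched in a few lines: because the rig's three cameras point left, forward, and right, every frame's orientation label comes "for free" from which camera captured it. The camera names, path layout, and label strings below are illustrative, not the actual dataset format:

```python
# Sketch: auto-label 3-camera-rig frames by which camera recorded them.
# A left-facing camera sees what the robot would see if it had drifted to
# face left of the trail, so the correct recovery action is "turn right".

CAMERA_TO_LABEL = {
    "left":   "turn_right",   # left-facing view -> steer right to recover
    "center": "go_straight",  # forward view -> stay on course
    "right":  "turn_left",    # right-facing view -> steer left to recover
}

def auto_label(image_paths):
    """Assumes illustrative paths of the form '<camera>/<frame>.jpg'."""
    labeled = []
    for path in image_paths:
        camera = path.split("/")[0]
        labeled.append((path, CAMERA_TO_LABEL[camera]))
    return labeled
```

This is what makes the data collection cheap: no per-frame human annotation is needed, only a walk through the environment with the rig.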
Interestingly, our TrailNet DNN learns to distinguish environments that are safe to fly over from impassable ones, picking this up implicitly (in an unsupervised way) from the training data:
The DNN can easily be retrained for different environments and used on land-based or aerial robots. We have already used the framework to train a drone to follow train tracks, and we ported the system to a wheeled robot that traverses office hallways.
It is also possible to augment the TrailNet DNN with turn-by-turn instructions to make a full-fledged navigation system. Here is one experiment in which we biased our DNN to make only left turns where possible, ignoring hallways that would lead to right turns or straight moves (this model uses input from a single monocular RGB camera):
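One simple way to realize this kind of bias, sketched below under assumed names, is to down-weight the probability mass of all non-left options in the network's output before choosing an action, then renormalize. This mirrors the idea of the experiment, not the actual redtail implementation:

```python
# Sketch: bias a direction-probability distribution toward left turns by
# suppressing all other options, then renormalizing to a valid distribution.

def bias_left(probs, suppress=0.05):
    """probs: dict mapping direction name -> softmax probability.
    suppress: illustrative factor applied to every non-left option."""
    biased = {d: (p if d == "left" else p * suppress) for d, p in probs.items()}
    total = sum(biased.values())
    return {d: p / total for d, p in biased.items()}

def choose_turn(probs):
    """Pick the highest-probability direction."""
    return max(probs, key=probs.get)
```

With the suppression factor small enough, the robot takes a left whenever the network assigns it any meaningful probability, and otherwise continues with the remaining best option.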
For more information about this work, see our paper, "Toward Low-Flying Autonomous MAV Trail Navigation using Deep Neural Networks for Environmental Awareness," which we presented at the IROS 2017 conference.