The Regressor-based and End-to-End Landmark Detection sample demonstrates how to use the NVIDIA® proprietary deep neural network (DNN) MapNet to perform lane marking detection
and landmark detection on the road. It detects the lane you are in (the ego-lane) and, when present, the left and right adjacent lanes. Landmarks include
vertical poles, intersection markings (e.g., crosswalks), and road markings (e.g., arrows) on the road.
MapNet has been trained on RCB images, and its performance is invariant to RGB-encoded H.264 videos.
This sample can also stream an H.264 or RAW video and compute a multi-class likelihood map of lane markings for each frame. A user-assigned
threshold value binarizes the likelihood map into clusters of lane markings; image post-processing steps are then employed to fit polylines onto the lane clusters
and assign each one a lane position and appearance type. The sample can also operate on live camera input.
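The threshold-then-fit post-processing described above can be sketched as follows. This is an illustrative NumPy sketch only, not the sample's actual DriveWorks implementation; `binarize_and_fit` is a hypothetical helper:

```python
import numpy as np

def binarize_and_fit(likelihood, threshold):
    """Threshold a per-pixel lane-marking likelihood map and fit a
    polyline (here: a 2nd-order polynomial x = f(y)) to the surviving
    pixels. Illustrative only -- the sample's real post-processing also
    clusters pixels per lane and assigns position/appearance types."""
    ys, xs = np.nonzero(likelihood >= threshold)  # pixels above threshold
    if len(xs) < 3:
        return None
    # Fit x as a polynomial in y, a common choice for near-vertical lanes.
    coeffs = np.polyfit(ys, xs, deg=2)
    return np.poly1d(coeffs)

# Toy likelihood map with a straight vertical "lane" at column 5.
lik = np.zeros((10, 10))
lik[:, 5] = 0.9
poly = binarize_and_fit(lik, threshold=0.5)
print(round(float(poly(4.0)), 3))  # ≈ 5.0 for this synthetic lane
```

Real likelihood maps are noisy, so the clustering step between thresholding and fitting matters; here a single clean cluster makes the fit trivial.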
The image datasets used to train MapNet were captured by a Sekonix camera module (SF3324/5) with an AR0231 RCCB sensor.
The camera is mounted high, at the rear-view mirror position. Demo videos are captured at 2.3 MP and down-sampled to 960 x 604.
To achieve the best lane detection performance, adopt a similar camera setup and align the video center vertically with the horizon before recording new videos.
The Regressor-based and End-to-End Landmark Detection sample, sample_landmark_detection_by_regressor,
accepts the following optional parameters.
If none are specified, it performs detection on a supplied pre-recorded video.
./sample_landmark_detection_by_regressor --input-type=[video|camera] --video=[path/to/video] --model-type=[regressor|e2e] --camera-type=[camera] --camera-group=[a|b|c|d] --camera-index=[0|1|2|3] --roi=[x,y,w,h]
Where:
--input-type=[video|camera]
        Defines whether the input comes from a live camera or a recorded video. Live camera is only supported on the NVIDIA DRIVE platform.
        Default value: video

--video=[path/to/video]
        The absolute or relative path of a RAW or H.264 recording. Only applicable if --input-type=video.
        Default value: path/to/data/samples/laneDetection/video_lane.h264

--model-type=[regressor|e2e]
        Specifies which type of MapNet model to use for landmark detection.
        Default value: e2e

--camera-type=[camera]
        A supported AR0231 RCCB sensor. Only applicable if --input-type=camera.
        Default value: ar0231-rccb-bae-sf3324

--camera-group=[a|b|c|d]
        The group to which the camera is connected. Only applicable if --input-type=camera.
        Default value: a

--camera-index=[0|1|2|3]
        Indicates the camera index on the given port.
        Default value: 0

--roi=[x,y,w,h]
        Defines a Region of Interest (ROI) where detections occur: x-coordinate, y-coordinate, width, and height.
        Default value: no ROI
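The --roi=[x,y,w,h] parameter restricts detection to a sub-rectangle of the frame. Its semantics can be illustrated with plain array slicing (`apply_roi` is a hypothetical helper; the sample applies its ROI internally):

```python
import numpy as np

def apply_roi(frame, roi):
    """Crop a frame to an (x, y, w, h) region, mirroring the sample's
    --roi=[x,y,w,h] convention: (x, y) is the top-left corner, (w, h)
    the crop size. Hypothetical helper for illustration only."""
    x, y, w, h = roi
    return frame[y:y + h, x:x + w]

# Frame at the demo-video resolution of 960 x 604 (width x height).
frame = np.zeros((604, 960, 3), dtype=np.uint8)
cropped = apply_roi(frame, (100, 200, 640, 300))
print(cropped.shape)  # (300, 640, 3)
```

Note that image arrays index rows (y) first, so width/height swap order between the ROI tuple and the resulting array shape.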
./sample_landmark_detection_by_regressor --video=<video file.h264>
or
./sample_landmark_detection_by_regressor --video=<video file.raw>
./sample_landmark_detection_by_regressor --input-type=camera --camera-type=<camera_type> --camera-group=<camera_group>
where <camera_type> is a supported RCCB sensor. See List of cameras supported out of the box for the list of supported cameras for each platform.
./sample_landmark_detection_by_regressor --model-type e2e
./sample_landmark_detection_by_regressor --model-type regressor
MapNet creates a window and displays the final landmark polyline outputs overlaid on top of the video.
The polyline colors represent the detected landmark attribute type as follows:
Lane markings:
Vertical Poles:
Intersections:
Road markings:
Numbers displayed on top of landmark detections indicate the track ID for a specific detection.
A track ID corresponds to the same detection (such as a lane) across camera frames, so the same lane detected in multiple frames carries the same track ID.
The letter E is prepended to a track ID to indicate a landmark other than a lane (poles, intersections, road markings), differentiating lane tracks from other landmark tracks.
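The labeling convention above can be summarized in a few lines. This is a hypothetical helper mirroring the described on-screen convention, not code from the sample:

```python
def track_label(track_id, is_lane):
    """Format an on-screen track label: a plain number for lane tracks,
    an 'E'-prefixed number for other landmark tracks (poles,
    intersections, road markings). Illustrative helper only."""
    return str(track_id) if is_lane else "E" + str(track_id)

print(track_label(3, is_lane=True))   # 3
print(track_label(7, is_lane=False))  # E7
```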
For more details, see Landmark Perception.