NVIDIA RAPIDS Overview
NVIDIA RAPIDS helps you find insights in your data more quickly through GPU-accelerated visualization techniques. It is often useful to visualize your data before training a model so that you understand it better, particularly when you’re dealing with an unfamiliar dataset. During preprocessing, dimensionality reduction is critical both for visualizing multidimensional datasets by mapping them to a lower-dimensional manifold and for feature ranking and extraction. It helps expose structure, identify important feature correlations that can be exploited during model development, and reduce model complexity by removing spurious correlations. Visualization is also critical for understanding and trusting the results of an ML system, particularly for identifying the top features driving a prediction and the points of failure. RAPIDS and NVIDIA GPUs let us apply complex visualization techniques at speed.
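To make the idea of feature ranking by correlation concrete, here is a minimal CPU-only sketch in plain Python. It does not use RAPIDS; the helper names (`pearson`, `rank_features`), the column names, and the toy data are all hypothetical and chosen only for illustration. It ranks each feature by the absolute value of its Pearson correlation with a binary label.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = [(name, pearson(values, target)) for name, values in columns.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data: three features and a binary label.
columns = {
    "amount":   [10.0, 200.0, 15.0, 310.0, 12.0, 250.0],
    "hour":     [9.0, 14.0, 11.0, 10.0, 16.0, 13.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
is_fraud = [0, 1, 0, 1, 0, 1]

ranking = rank_features(columns, is_fraud)
for name, score in ranking:
    print(f"{name:10s} {score:+.3f}")
```

A ranking like this is only a first look: it captures linear relationships with the label one feature at a time, so it complements rather than replaces dimensionality reduction and visualization.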
NVIDIA RAPIDS’ cuML library is used alongside GPU-accelerated XGBoost, an implementation of gradient boosted decision trees (GBDTs). GBDTs have become popular in applied ML because their predictions are relatively easy to explain and they are useful across a wide range of problems, such as regression, classification, and ranking. Accuracy is important, but customers also value a clear, understandable explanation of why something was classified as fraudulent.
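As a rough illustration of why GBDT predictions can be traced back to features, the sketch below implements a tiny gradient boosting loop in plain Python. It is not XGBoost and not GPU-accelerated: it assumes squared-error loss, depth-1 trees (stumps), and hypothetical helper names and toy data. Each round fits a stump to the current residuals, and counting how often each feature is chosen for a split gives a simple importance score.

```python
def fit_stump(X, residuals):
    """Find the (feature, threshold) split that minimizes squared error on residuals.

    Assumes at least one feature takes two or more distinct values.
    """
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_gbdt(X, y, n_rounds=20, learning_rate=0.3):
    """Boosting for squared error: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [target - pred for target, pred in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [pred + learning_rate * predict_stump(stump, row)
                       for pred, row in zip(predictions, X)]
    return base, stumps, learning_rate

def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def split_counts(model, n_features):
    """Count how many stumps split on each feature (a simple importance score)."""
    counts = [0] * n_features
    for stump in model[1]:
        counts[stump[0]] += 1
    return counts

# Hypothetical toy data: columns are [amount, hour]; label 1 means fraudulent.
X = [[10.0, 9.0], [200.0, 14.0], [15.0, 11.0],
     [310.0, 10.0], [12.0, 16.0], [250.0, 13.0]]
y = [0, 1, 0, 1, 0, 1]

model = fit_gbdt(X, y)
print("split counts per feature:", split_counts(model, n_features=2))
print("score for a large transaction:", round(predict(model, [300.0, 12.0]), 3))
```

On this toy data every split lands on the first feature, which is the kind of signal that helps explain why a transaction was flagged. Production GBDT libraries add regularization, deeper trees, and other loss functions, so treat this only as a sketch of the core idea.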