Technical Brief - NVIDIA Docs

Implementations of a digital fingerprinting AI workflow for cybersecurity enable organizations to deeply analyze every account login across the network. AI performs massive data filtration and reduction for real-time threat detection by identifying user-specific anti-patterns in behavior, rather than relying on traditional methods that apply generalized organizational rules. With digital fingerprinting, a model is created for every individual user across an enterprise, and it flags when user and machine activity patterns shift. Critical behavior anomalies can be rapidly surfaced to security analysts, so that they can discover and react to threats more quickly.

To reduce the time to develop a cybersecurity solution addressing the previously described use case, NVIDIA has developed the Digital Fingerprinting AI Workflow.

The digital fingerprinting workflow features an opinionated, integrated example solution that illustrates how to leverage NVIDIA Morpheus, a cybersecurity AI framework, to build models and pipelines that pull large volumes of user login data from Azure Cloud Services logs, filter and process them, and classify each event with a z-score, all in real time. The z-score, which is the number of standard deviations an event lies from the mean, represents how much a given event differs from the established activity pattern. Anomalous events are flagged and presented to the user via example dashboards. A subset of the user login data is used to regularly retrain and fine-tune the models, so that the modeled behavior stays up to date and accurate.
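The z-score described above can be illustrated with a minimal Python sketch. It is not the workflow's own scoring code: the function names, the 3.0 threshold, and the idea of scoring a single numeric value against a list of past values are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def z_score(value, history):
    """Return how many standard deviations `value` lies from the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation of the baseline
    if sigma == 0:
        # A baseline with no variation gives no scale to measure against;
        # this sketch returns 0.0 rather than dividing by zero.
        return 0.0
    return (value - mu) / sigma


def is_anomalous(value, history, threshold=3.0):
    """Flag a value whose absolute z-score exceeds `threshold` (an assumed cutoff)."""
    return abs(z_score(value, history)) > threshold


# Hypothetical baseline of a per-user numeric measurement.
baseline = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, baseline))        # 2.0
print(is_anomalous(9, baseline))   # False
print(is_anomalous(13, baseline))  # True
```

A higher threshold flags fewer events, so the cutoff is a trade-off between analyst workload and sensitivity.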

This NVIDIA AI Workflow contains:

Training and inference pipelines for Digital Fingerprinting.
A reference for solution deployment in production, including components such as authentication, logging, and monitoring for the workflow.
A cloud-native deployable bundle packaged as Helm charts.
Guidance on performing training and customization of the AI solution to fit your specific use case.

Note

The components and instructions used in the workflow are intended as examples for integration and may not be sufficiently production-ready on their own. The workflow should be customized and integrated into one’s own infrastructure, using the workflow as a reference.

Using the above assets, this NVIDIA AI Workflow provides a reference for you to get started and build your own AI solution with minimal preparation. It includes enterprise-ready implementation best practices, ranging from authentication and monitoring to reporting and load balancing, helping you achieve the desired AI outcome more quickly while still leaving you room to deviate.

NVIDIA AI Workflows are designed as microservices, which means they can be deployed on Kubernetes alone or with other microservices to create a production-ready application for seamless scaling in your enterprise environment.

The following cloud-native Kubernetes services are used with this workflow:

NVIDIA Morpheus
MLflow
Kafka
Prometheus
Elasticsearch
Kibana
Grafana
S3 Compatible Object Storage

These components are packaged together into a deployable solution described in the diagram below:

More information about the components used can be found in the Digital Fingerprinting Workflow Guide and the NVIDIA Cloud Native Service Add-on Pack Deployment Guide.

These components are used to build and deploy training and inference pipelines, integrated with the additional components as indicated in the diagram below:

Training Pipeline

Let’s first look more closely at the training pipeline. Before training runs, the data passes through a pre-processing stage. The following graphic highlights the multiple stages within the pipeline:

This digital fingerprinting workflow provides sample Python code for preprocessing, which the developer can further customize, in the following stages:

Normalize
Extract Features
Model to MLflow
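To make the Normalize and Extract Features stages listed above more concrete, here is a minimal sketch. It assumes that normalizing means flattening nested log records into one level and that feature extraction derives running per-user counters. The field names (`user`, `location.country`) and the two features are hypothetical and are not taken from the workflow's sample code.

```python
from collections import defaultdict


def normalize(record, parent_key="", sep="."):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(normalize(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def extract_features(flat_records):
    """Add running per-user counters to each flattened login record."""
    login_counts = defaultdict(int)
    seen_locations = defaultdict(set)
    rows = []
    for rec in flat_records:
        user = rec["user"]
        login_counts[user] += 1
        seen_locations[user].add(rec.get("location.country"))
        rows.append({
            "user": user,
            "login_count": login_counts[user],                 # logins seen so far
            "distinct_locations": len(seen_locations[user]),   # locations seen so far
        })
    return rows


# Hypothetical raw login events with a nested location field.
raw = [
    {"user": "alice", "location": {"country": "US"}},
    {"user": "alice", "location": {"country": "US"}},
    {"user": "alice", "location": {"country": "DE"}},
]
features = extract_features([normalize(r) for r in raw])
print(features[-1])
```

Running counters like these turn a stream of categorical log fields into numeric inputs that a per-user model can learn a baseline from.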

Inference Pipeline

The Morpheus digital fingerprinting inferencing pipeline also has a pre-process stage, as well as a post-process stage. Both are made up of multiple stages, as illustrated in the graphic below.

Sample Python code is provided, which can be further customized by the developer, within the following stages:

Normalize
Extract Features
Get User Model
Get Generic Model
Add User Data

Additional Components

The following components are deployed and integrated as a part of the workflow solution:

MLflow

The MLflow open-source platform is a key element of both the training and inferencing pipelines. This MLOps platform enables organizations to easily manage their end-to-end machine learning lifecycle. MLflow uses a centralized model store and has its own set of APIs and user interface for manageability. In this workflow, MLflow’s tracking database is backed by a PostgreSQL database and the model repository is backed by S3-compatible MinIO object storage. The trained models, unique per user, are delivered to the open-source MLflow platform via Python. A generic model is also trained for inferencing events with an unknown user. The per-user and generic models are loaded using MLflow’s API.
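The per-user and generic model lookup described above might be sketched as follows. The registered-model names (`dfp-<user>`, `dfp-generic`) are hypothetical. The loader is passed in as a parameter so the sketch runs without an MLflow server. With a real deployment, `mlflow.pyfunc.load_model` accepts `models:/<name>/latest` URIs and could be supplied as the loader, in which case the error to catch would be `mlflow.exceptions.MlflowException` rather than `LookupError`.

```python
from typing import Any, Callable, Tuple

GENERIC_MODEL = "dfp-generic"  # hypothetical registered-model name


def user_model_name(user_id: str) -> str:
    """Hypothetical naming convention for per-user registered models."""
    return f"dfp-{user_id}"


def load_model_for_user(
    user_id: str, load_model: Callable[[str], Any]
) -> Tuple[Any, bool]:
    """Return (model, used_generic), falling back to the generic model."""
    try:
        return load_model(f"models:/{user_model_name(user_id)}/latest"), False
    except LookupError:
        # No model has been trained for this user yet: use the generic one.
        return load_model(f"models:/{GENERIC_MODEL}/latest"), True


# Stand-in for a model registry so the example is self-contained.
fake_registry = {
    "models:/dfp-alice/latest": "alice-model",
    "models:/dfp-generic/latest": "generic-model",
}


def fake_loader(uri: str) -> Any:
    return fake_registry[uri]  # raises KeyError (a LookupError) if missing


print(load_model_for_user("alice", fake_loader))
print(load_model_for_user("mallory", fake_loader))
```

Returning the `used_generic` flag lets downstream stages record which kind of model scored an event.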

Apache Kafka

Apache Kafka is the open-source streaming platform used to bring real-time data into the digital fingerprinting inferencing pipeline. The Morpheus digital fingerprinting reference workflow also includes a Kafka producer with a sample dataset and web server, as well as Python code and a custom debugger that allow the developer to debug anywhere within the pipeline. Options are available for developing and running the pipelines in the cloud or on-premises.

Monitoring

The digital fingerprinting AI workflow exposes Prometheus metrics covering pipeline throughput statistics; suspicious event breakdowns by region, browser, and other attributes; and the active health and status of the pipeline. The metrics are viewed via a Grafana dashboard.
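As an illustration of a suspicious-event breakdown by region and browser, the sketch below renders a counter in the Prometheus text exposition format using only the standard library. The metric name `dfp_suspicious_events_total` and the label names are assumptions, not the workflow's actual metrics. A real service would more likely use a Prometheus client library, which also handles label escaping.

```python
from collections import Counter


def render_counter(name, help_text, counts):
    """Render {(region, browser): count} in Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for (region, browser), value in sorted(counts.items()):
        # Label values are assumed to contain no quotes, backslashes, or
        # newlines; this sketch does not escape them.
        lines.append(f'{name}{{region="{region}",browser="{browser}"}} {value}')
    return "\n".join(lines) + "\n"


# Hypothetical flagged events as (region, browser) pairs.
flagged = [("US", "Chrome"), ("US", "Chrome"), ("DE", "Firefox")]
text = render_counter(
    "dfp_suspicious_events_total",
    "Suspicious login events by region and browser.",
    Counter(flagged),
)
print(text)
```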

As the digital fingerprinting AI workflow performs the analysis, the pipeline’s security event data is sent to Elasticsearch, where security analysts can use Kibana to drill down into the results. Using Kibana, administrators can generate dashboards and reports and export data as necessary. Kibana supports a rich search language to help security analysts locate suspicious events and correlate information from disparate systems.
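A drill-down of the kind described above could be expressed as an Elasticsearch query body. The sketch below only builds that body as a Python dictionary and does not contact a cluster. The field name `z_score` is an assumption about how events are indexed, and `@timestamp` is a common convention rather than something confirmed by this brief.

```python
import json


def high_z_score_query(threshold, since="now-1h", size=50):
    """Build a query body for recent events at or above a z-score threshold."""
    return {
        "size": size,
        "sort": [{"z_score": {"order": "desc"}}],  # most anomalous first
        "query": {
            "bool": {
                "filter": [
                    {"range": {"z_score": {"gte": threshold}}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
    }


query = high_z_score_query(4.0)
print(json.dumps(query, indent=2))
```

Using `filter` clauses rather than scored `must` clauses suits this case, since the events only need to be selected and then ordered by the z-score field.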

Events can be viewed as a time series:

Or individual events can be investigated more thoroughly:

All of these third-party, open-source platform components can be easily swapped out for the preferred components within an enterprise’s commercial platform. Steps and guidance on how to do so are provided in the Digital Fingerprinting Workflow Guide and the NVIDIA Cloud Native Service Add-on Pack Deployment Guide.