Scaling & Dynamic Routing#

Microservice Classification#

Tokkio microservices are classified as either stateful or stateless based on their operational characteristics:

Stateful Microservices

  • Perform continuous real-time processing on video/audio streams

  • Maintain state in memory to minimize latency and ensure consistency

  • Implemented as Kubernetes StatefulSet resources

Stateless Microservices

  • Handle requests that do not require state maintenance

  • Store state externally in caches or databases

  • Implemented as Kubernetes Deployment resources

Scaling Approaches#

Stateless Scaling#

Scaling stateless microservices involves:

  1. Increasing deployment replicas

  2. Maintaining replicas in the same message bus consumer group

  3. Distributing messages across available replicas
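The effect of adding a replica to a consumer group can be illustrated with a small simulation. This is a minimal sketch, not the message bus's actual assignment logic: it assumes the bus splits a topic into numbered partitions and spreads them round-robin across the group members. The function name `assign_partitions` and the replica names are hypothetical.

```python
from collections import defaultdict


def assign_partitions(partitions, replicas):
    """Spread partitions round-robin over the replicas of one consumer group.

    Assumption: the message bus exposes numbered partitions and gives each
    partition to exactly one member of the group. Real brokers use their
    own assignment strategies.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    assignment = defaultdict(list)
    for i, partition in enumerate(sorted(partitions)):
        assignment[replicas[i % len(replicas)]].append(partition)
    return dict(assignment)


partitions = list(range(6))

# Two replicas share the six partitions.
before = assign_partitions(partitions, ["replica-0", "replica-1"])

# After scaling the deployment to three replicas, the same partitions
# are redistributed and each replica handles fewer of them.
after = assign_partitions(partitions, ["replica-0", "replica-1", "replica-2"])

print(before)
print(after)
```

Because every replica belongs to the same group, each message is still processed once; adding a replica only changes how the work is divided.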

Stateful Scaling#

Scaling stateful microservices requires additional considerations:

  • Session state must be maintained within a specific replica

  • The lifecycle of a user session is tied to the stateful microservice replica that handles it

  • Traffic routing must direct requests to the correct replica handling that session

Dynamic Routing Implementation#

To enable proper routing with multiple stateful replicas:

  1. Maintain a mapping between stream IDs and StatefulSet replica indices

  2. Track which replica is processing each user session

  3. Route incoming requests to the appropriate replica based on session state
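The three steps above can be sketched as a small in-memory routing table. This is an illustration under stated assumptions, not how SDR is implemented: the class `SessionRouter`, its `capacity` limit, and the names `vision-ms` and `vision-svc` are hypothetical. The only Kubernetes behavior relied on is that StatefulSet pods are named `<statefulset>-<ordinal>` and, with a headless Service, are reachable at `<pod>.<service>.<namespace>.svc.cluster.local`.

```python
from dataclasses import dataclass, field


@dataclass
class SessionRouter:
    """Hypothetical stream-ID to replica-index table for one StatefulSet."""

    statefulset: str
    service: str      # headless Service governing the StatefulSet
    namespace: str
    replicas: int
    capacity: int     # assumed maximum number of streams per replica
    _stream_to_replica: dict = field(default_factory=dict, init=False)

    def _load(self, index):
        # Number of streams currently mapped to the given replica.
        return sum(1 for r in self._stream_to_replica.values() if r == index)

    def assign(self, stream_id):
        """Pin a stream to the least-loaded replica that has free capacity."""
        if stream_id in self._stream_to_replica:
            return self._stream_to_replica[stream_id]
        candidates = [
            i for i in range(self.replicas) if self._load(i) < self.capacity
        ]
        if not candidates:
            raise RuntimeError("no replica has free capacity")
        index = min(candidates, key=self._load)
        self._stream_to_replica[stream_id] = index
        return index

    def route(self, stream_id):
        """Return the stable DNS name of the replica holding the session."""
        index = self._stream_to_replica[stream_id]
        return (
            f"{self.statefulset}-{index}.{self.service}."
            f"{self.namespace}.svc.cluster.local"
        )

    def release(self, stream_id):
        """Forget a stream once its user session has ended."""
        self._stream_to_replica.pop(stream_id, None)


router = SessionRouter("vision-ms", "vision-svc", "default", replicas=2, capacity=1)
first = router.assign("stream-a")
second = router.assign("stream-b")
target = router.route("stream-b")
print(first, second, target)
```

Every request for `stream-b` resolves to the same pod until `release` is called, which is what keeps in-memory session state and traffic on one replica.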

NVIDIA SDR Integration#

NVIDIA Stream Distribution and Routing (SDR) handles workload distribution and traffic routing, abstracting scaling complexity from developers. To scale a stateful microservice, developers need only integrate with SDR.

For detailed SDR implementation information, see Stream Distribution & Routing.