
Worker nodes

Worker Nodes are the backbone of Claigrid's compute infrastructure, acting as the bridge between Claigrid's orchestration system and the underlying cloud resources. Each node connects directly to a compute instance, enabling containerized applications to run, scale, and interact seamlessly.

In short, worker nodes are the operational hubs of Claigrid's infrastructure: they handle user workloads and provide a scalable, reliable, and efficient compute environment.

Key Characteristics of Worker Nodes:

  • Compute Resource Integration: Worker nodes connect to the raw compute resources provided by the user’s cloud provider, facilitating direct access to the virtual machines and GPUs required for resource-intensive tasks like AI inference and cloud gaming.

  • Container Support: These nodes are designed to support containerized applications, providing isolated environments for each user session. This isolation ensures that each workload is securely separated, optimizing performance and preventing interference between applications.

  • Dynamic Scaling: Managed by Claigrid’s CIMS, worker nodes scale dynamically in groups based on real-time demand. They allocate resources to accommodate new requests and spin down when workloads are complete, allowing efficient resource utilization and reducing costs (see the sketch after this list).

  • High Availability and Reliability: Claigrid's worker nodes are deployed across multiple cloud regions, ensuring that resources are consistently available even in the event of localized outages. By balancing workloads across available nodes, Claigrid minimizes latency and optimizes response times for users worldwide.

  • Monitoring and Health Checks: Each worker node undergoes constant monitoring for health, connectivity, and performance. This proactive management enables quick troubleshooting and resource reallocation to ensure uninterrupted service.

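To make the scaling and health-check behavior described above more concrete, here is a minimal, hypothetical sketch in Python. It is illustrative only and does not use the actual Claigrid or CIMS API: the `WorkerNode` and `WorkerGroup` classes, the `sessions_per_node` capacity, and the simulated demand values are all invented for this example.

```python
# Hypothetical sketch (not the real Claigrid/CIMS API): how an orchestrator
# might scale a group of worker nodes to demand and reap unhealthy nodes.
from dataclasses import dataclass, field
import random


@dataclass
class WorkerNode:
    node_id: str
    healthy: bool = True
    sessions: int = 0  # containerized user sessions currently hosted

    def health_check(self) -> bool:
        # Placeholder probe; a real node would report connectivity,
        # GPU utilization, and container runtime status.
        self.healthy = random.random() > 0.05
        return self.healthy


@dataclass
class WorkerGroup:
    sessions_per_node: int = 4
    nodes: list = field(default_factory=list)
    _next_id: int = 0

    def scale_to_demand(self, pending_sessions: int) -> None:
        """Spin nodes up or down so capacity matches real-time demand."""
        needed = -(-pending_sessions // self.sessions_per_node)  # ceiling division
        while len(self.nodes) < needed:
            self._next_id += 1
            self.nodes.append(WorkerNode(node_id=f"worker-{self._next_id}"))
        while len(self.nodes) > needed and self.nodes[-1].sessions == 0:
            self.nodes.pop()  # spin down idle nodes to reduce cost

    def reap_unhealthy(self) -> None:
        """Drop nodes that fail their health check; capacity is backfilled later."""
        self.nodes = [n for n in self.nodes if n.health_check()]


if __name__ == "__main__":
    group = WorkerGroup()
    for demand in (10, 25, 3):          # simulated bursts of user sessions
        group.scale_to_demand(demand)
        group.reap_unhealthy()
        group.scale_to_demand(demand)   # replace any reaped nodes
        print(f"demand={demand} sessions -> {len(group.nodes)} worker nodes")
```

In this simplified model, the scaling loop and the health-check loop run together; in practice, CIMS manages these concerns across node groups and cloud regions as described above.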
Want to jump in and set up your first cluster? Head to the Step by Step Setup section of this manual to learn more.
