
Kube AI Hub Compute Platform

Unified Heterogeneous Compute
3–10x Better GPU Utilization

Kube AI Hub is a heterogeneous compute management platform built on Kubernetes. It pools GPU/CPU resources and virtualizes GPUs (vGPU) so that hardware clusters can be managed at the platform level, and it supports domestically produced GPU/CPU/NPU hardware for a secure, self-controlled AI compute infrastructure.

Full-stack Compute Management, Simplified

Kube AI Hub provides end-to-end compute management from hardware resources to business applications. A unified console manages heterogeneous GPU/CPU clusters with built-in multi-tenancy, elastic scheduling, and fine-grained metering, helping enterprises quickly build a self-controlled AI compute infrastructure.

  • Easy to Deploy

    Deploy on any existing Kubernetes cluster or on bare metal. Supports online and air-gapped installation, with one-click scaling and upgrades.

  • Feature Complete

    Manage GPU nodes, job queues, compute scheduling, multi-tenancy, monitoring, metering, and log management in a single unified platform.

  • Modular & Pluggable

    All modules are loosely coupled and optional. Flexibly integrate third-party schedulers, storage systems, and monitoring stacks.

Value for Every Team

The built-in multi-tenant design lets infrastructure teams, AI engineers, and operations staff collaborate on the same platform. Infra teams control hardware resources centrally, engineers focus on model development, and ops teams gain complete observability and automation.

Key Platform Features

Kube AI Hub covers the full compute management lifecycle from hardware onboarding to workload delivery. All features are modular and can be enabled on demand.

  • Heterogeneous GPU Cluster Mgmt

    Unified onboarding of NVIDIA, Huawei Ascend, Cambricon, Tianshu, and other accelerators. Supports online node expansion and cross-cluster resource allocation.

  • vGPU Virtualization & Scheduling

    Fine-grained GPU slicing and sharing across concurrent workloads. Significantly improves hardware utilization with sub-card granularity.

  • Multi-tenant Access Control

    Three-tier permission system across platform, workspace, and project. Supports AD/LDAP integration for secure multi-team resource isolation.

  • Storage & Networking

    Supports S3, NFS, Ceph, LocalPV and other storage backends. Built-in network policy management with Calico, Flannel, and other CNI plugins.
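To make the idea of sub-card GPU sharing concrete, here is a minimal sketch of first-fit placement of fractional GPU requests. It is an illustration only, not Kube AI Hub's scheduler: the `Gpu` class, the `place` function, and the job names are hypothetical, and real vGPU scheduling also has to account for memory, isolation, and device topology.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """A physical card whose capacity is tracked as a fraction (hypothetical model)."""
    name: str
    free: float = 1.0                      # fraction of the card still unallocated
    jobs: list = field(default_factory=list)


def place(gpus, job, share):
    """First-fit: assign the job to the first card with enough free share."""
    for gpu in gpus:
        if gpu.free + 1e-9 >= share:       # small tolerance for float rounding
            gpu.free -= share
            gpu.jobs.append(job)
            return gpu.name
    return None                            # no card has room for this request


gpus = [Gpu("gpu-0"), Gpu("gpu-1")]
requests = [
    ("notebook-a", 0.25),
    ("infer-b", 0.5),
    ("notebook-c", 0.25),
    ("train-d", 1.0),
]
placements = {job: place(gpus, job, share) for job, share in requests}
print(placements)
```

In this sketch the three small jobs share `gpu-0` and the full-card training job lands on `gpu-1`, whereas whole-card allocation would have needed four cards for the same four jobs.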

  • Heterogeneous Compute Management

    Pooling and virtualizing heterogeneous GPU/CPU compute improves utilization by 3–10x, and support for domestically produced GPU/CPU/NPU hardware provides a secure local compute foundation.

  • Intelligent Job Scheduling

    Thousand-GPU distributed scheduling with built-in priority job queues and resource reservation policies for large-scale parallel AI training workloads.

  • Full-stack Observability

    Multi-dimensional GPU/CPU monitoring, alerting, and log management with multi-tenant isolation and support for multiple notification channels.

  • Metering and Billing

    Compute usage monitoring and cost accounting by tenant, department, and project, so enterprises can attribute and manage IT costs precisely.

  • Multi-cluster Management

    Unified management of multiple GPU/CPU clusters across data centers and hybrid cloud, with high availability and disaster recovery best practices.

  • Edge Node Support

    Extend compute scheduling to edge nodes via KubeEdge, enabling cloud-edge collaborative AI inference job distribution and management.

  • App Marketplace

    Built-in Helm-based app marketplace and image registry (Harbor) for one-click deployment and lifecycle management of AI frameworks and tools.

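As an illustration of metering by tenant, department, and project, the sketch below rolls up GPU-hours at each level and applies a flat rate. Everything in it is assumed for the example: the record layout, the names, and the `RATE_PER_GPU_HOUR` value are hypothetical and do not reflect Kube AI Hub's actual data model or pricing.

```python
from collections import defaultdict

# Hypothetical usage records: (tenant, department, project, gpu_hours)
records = [
    ("acme", "research", "llm-pretrain", 120.0),
    ("acme", "research", "vision", 30.0),
    ("acme", "platform", "ci", 10.0),
    ("globex", "ml", "recsys", 40.0),
]

RATE_PER_GPU_HOUR = 2.0  # assumed flat price, for illustration only


def rollup(records, level):
    """Sum GPU-hours over the first `level` fields (1=tenant, 2=department, 3=project)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[:level]] += rec[3]
    return dict(totals)


by_tenant = rollup(records, 1)
by_department = rollup(records, 2)
cost_by_tenant = {key: hours * RATE_PER_GPU_HOUR for key, hours in by_tenant.items()}

for key, cost in cost_by_tenant.items():
    print(key[0], cost)
```

Keying the totals on a tuple prefix means the same function serves all three accounting levels, which keeps tenant, department, and project reports consistent with one another.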

Cloud-native Architecture with Decoupled Frontend and Backend

Frontend

Kube AI Hub Console

Backend (REST API)

Kube AI Hub System

  • API Server
  • API Gateway
  • Controller Manager
  • GPU Scheduler