@llm-d-incubation

llm-d incubation

Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework

Popular repositories

  1. llm-d-infra Public

    llm-d Helm charts and deployment examples

    Shell 48 53

  2. workload-variant-autoscaler Public

    Variant optimization autoscaler for distributed inference workloads

    Go 26 27

  3. llm-d-modelservice Public

    Helm charts for deploying models with llm-d

    Go Template 26 47

  4. llm-d-fast-model-actuation Public

    Go 9 10

  5. batch-gateway Public

    The batch gateway is an llm-d implementation of the OpenAI batch inference API

    Go 4 3

  6. llm-d-ci Public

    Shell 2 2

Repositories

Showing 9 of 9 repositories
  • llm-d-modelservice Public

    Helm charts for deploying models with llm-d

    Go Template 26 47 3 3 Updated Jan 31, 2026
  • batch-gateway Public

    The batch gateway is an llm-d implementation of the OpenAI batch inference API

    Go 4 Apache-2.0 3 0 3 Updated Jan 30, 2026
  • workload-variant-autoscaler Public

    Variant optimization autoscaler for distributed inference workloads

    Go 26 Apache-2.0 27 86 (1 issue needs help) 18 Updated Jan 29, 2026
  • llm-d-fast-model-actuation Public
    Go 9 Apache-2.0 10 48 3 Updated Jan 28, 2026
  • hermes Public

    Hermes is a cluster configuration scanning and self-test generation tool for llm-d inference workloads

    Rust 0 2 0 0 Updated Jan 26, 2026
  • llm-d-async Public
    Shell 0 Apache-2.0 0 1 0 Updated Jan 22, 2026
  • llm-d-infra Public

    llm-d Helm charts and deployment examples

    Shell 48 Apache-2.0 53 13 18 Updated Dec 13, 2025
  • llm-d-ci Public
    Shell 2 2 0 0 Updated Aug 6, 2025
  • ig-wva Public

    Workload Variant Autoscaler is a service to compute the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook 1 1 0 1 Updated Jul 11, 2025
