@csrwng
Contributor

@csrwng csrwng commented Jan 21, 2026

This enhancement proposes isolating HyperShift's Cluster API (CAPI)
CRDs from those installed by the OpenShift platform on management
clusters. As OpenShift evolves toward using CAPI for standalone
cluster machine management, a conflict emerges: both the platform
and HyperShift need to install CAPI CRDs on the same management
cluster, potentially with incompatible versions.

The proposal introduces two major components:

  1. Private CAPI Types and API Proxy: HyperShift-specific CAPI CRDs
    using the cluster.hypershift.openshift.io group, with an API proxy
    sidecar to transparently translate between standard and private
    CAPI types.

  2. Automatic Migration: A migration controller that automatically
    converts existing hosted clusters from standard to private CAPI
    types without disrupting operations.

This enables independent version management for both HyperShift and
platform CAPI dependencies while maintaining transparent operation
for hosted cluster administrators and workloads.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label Jan 21, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 21, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Contributor

openshift-ci bot commented Jan 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

@bryan-cox
Member

We should call out all the CAPI platforms we support: CAPA, CAPZ, CAPV, CAP-Agent, CAPG, etc.

@csrwng csrwng changed the title WIP: proposal for isolating CAPI types in hypershift OCPSTRAT-2789: Add HyperShift private CAPI types enhancement Jan 26, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label Jan 26, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 26, 2026

@csrwng: This pull request references OCPSTRAT-2789 which is a valid jira issue.

@csrwng csrwng changed the title OCPSTRAT-2789: Add HyperShift private CAPI types enhancement CNTRLPLANE-2640: Add HyperShift private CAPI types enhancement Jan 26, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 26, 2026

@csrwng: This pull request references CNTRLPLANE-2640 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

@csrwng csrwng marked this pull request as ready for review January 26, 2026 20:54
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label Jan 26, 2026
@openshift-ci openshift-ci bot requested review from enxebre and sjenning January 26, 2026 20:55
@openshift-ci
Contributor

openshift-ci bot commented Jan 26, 2026

@csrwng: all tests passed!


d. **Workload Update and Resume**:
- Update CAPI-dependent workload deployments to include the API proxy sidecar
- For deployments managed by the Control Plane Operator, update the CPO deployment to signal it should add proxy sidecars

Member

Does this imply a backport to the minimum supported hosted cluster version, so that the autoscaler and machine-approver get their specs updated with the proxy?

Member

I saw that's clarified below
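
For illustration only, the sidecar step quoted in this thread could be made idempotent along the following lines. This is a sketch under stated assumptions, not the proposal's implementation: `Container` and `PodSpec` are simplified stand-ins for the Kubernetes types, and the container name `capi-api-proxy` and the image references are hypothetical, since the quoted text does not name them.

```go
package main

import "fmt"

// Container and PodSpec are simplified stand-ins for the Kubernetes types
// (assumption: only the fields needed for this sketch are modeled).
type Container struct {
	Name  string
	Image string
}

type PodSpec struct {
	Containers []Container
}

// ensureSidecar adds sidecar to spec when no container with the same name
// exists, or updates the image in place when it does. It reports whether
// spec was changed, so repeated reconciles do not stack up duplicates.
func ensureSidecar(spec *PodSpec, sidecar Container) bool {
	for i := range spec.Containers {
		if spec.Containers[i].Name == sidecar.Name {
			if spec.Containers[i].Image == sidecar.Image {
				return false
			}
			spec.Containers[i].Image = sidecar.Image
			return true
		}
	}
	spec.Containers = append(spec.Containers, sidecar)
	return true
}

func main() {
	// Hypothetical workload and proxy names; image references are placeholders.
	spec := PodSpec{Containers: []Container{
		{Name: "cluster-autoscaler", Image: "example.invalid/autoscaler:tag"},
	}}
	proxy := Container{Name: "capi-api-proxy", Image: "example.invalid/proxy:tag"}

	fmt.Println(ensureSidecar(&spec, proxy), len(spec.Containers))
	fmt.Println(ensureSidecar(&spec, proxy), len(spec.Containers))
}
```

Returning a "changed" flag lets a caller skip the update call when the deployment already carries the sidecar, which matters if the same step runs on every reconcile.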

Historically, HyperShift management clusters did not use Cluster API (CAPI) for their own machine management, relying instead on the OpenShift Machine API. This allowed HyperShift to install and manage its own version of CAPI CRDs, effectively owning the CAPI types on the management cluster.

With OpenShift's evolution toward using CAPI for standalone cluster machine management, a critical conflict emerges: both the platform and HyperShift will need to install CAPI CRDs on the same management cluster. If these CRD versions are incompatible, neither the platform nor HyperShift can function correctly.

Member

It might be worth mentioning that MCE, which is the delivery mechanism for self-hosted HCP, also wants to handle its own cluster.x-k8s.io CRDs.

Member

Also, standalone has a toggle to not clobber these CRDs.


This enhancement introduces new CRDs that mirror the standard CAPI CRDs but use the `cluster.hypershift.openshift.io` API group:

- `Cluster.cluster.hypershift.openshift.io`

Member

Also MHC and any other CRDs the controllers require in order to work.
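
As a rough sketch of what "mirroring under a different API group" means for a translating proxy, the helpers below rewrite the group in an `apiVersion` string and in a request path. The function names are hypothetical, and the sketch assumes the proxy only maps the core `cluster.x-k8s.io` group (the group mentioned earlier in this conversation); whether other CAPI-related groups are mapped, and how, is not stated in the quoted text.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	standardGroup = "cluster.x-k8s.io"                // standard CAPI group
	privateGroup  = "cluster.hypershift.openshift.io" // group proposed in this PR
)

// swapGroup rewrites the group of an apiVersion of the form "group/version"
// when the group equals from. Anything else, including core "v1", is
// returned unchanged.
func swapGroup(apiVersion, from, to string) string {
	group, version, ok := strings.Cut(apiVersion, "/")
	if !ok || group != from {
		return apiVersion
	}
	return to + "/" + version
}

// toPrivate and toStandard are hypothetical helper names for the two
// translation directions.
func toPrivate(apiVersion string) string {
	return swapGroup(apiVersion, standardGroup, privateGroup)
}

func toStandard(apiVersion string) string {
	return swapGroup(apiVersion, privateGroup, standardGroup)
}

// toPrivatePath rewrites a Kubernetes API path of the form
// /apis/<group>/<version>/... from the standard group to the private one.
func toPrivatePath(path string) string {
	prefix := "/apis/" + standardGroup + "/"
	if strings.HasPrefix(path, prefix) {
		return "/apis/" + privateGroup + "/" + strings.TrimPrefix(path, prefix)
	}
	return path
}

func main() {
	fmt.Println(toPrivate("cluster.x-k8s.io/v1beta1"))
	fmt.Println(toStandard("cluster.hypershift.openshift.io/v1beta1"))
	fmt.Println(toPrivatePath("/apis/cluster.x-k8s.io/v1beta1/namespaces/ns/machines"))
}
```

A real proxy would also have to rewrite `apiVersion` fields inside request and response bodies, owner references, and watch events; the string mapping above is only the smallest piece of that.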


**dual operator architecture** consists of two HyperShift operator instances running simultaneously: one supporting private CAPI types (new) and one using standard CAPI types (legacy).

1. The platform administrator upgrades their HyperShift operator to a version that supports the `--private-capi-types` flag and runs `hypershift install --private-capi-types`.

Member

Having this as a flag would require updating the delivery mechanisms for managed and self-hosted. Does it really need to be a flag? When would you opt out?

| `hypershift.openshift.io/private-capi-types: "true"` | New Operator | Successfully migrated clusters |
| `hypershift.openshift.io/scope: "legacy"` | Legacy Operator | Existing clusters awaiting migration |
| `hypershift.openshift.io/migration-in-progress: "true"` | Migration Controller | Clusters actively being migrated (neither operator reconciles) |
| `hypershift.openshift.io/migration-failed` | Legacy Operator | Previous migration failed; requires SRE remediation before retry |

Member

I think this whole process is more sensitive for self-hosted, where all of this would need to be documented and would directly place an additional burden on users.

Member

Also, even though it is not recommended, there are users who might be consuming Machine CRs directly, especially in bare-metal scenarios, e.g. to annotate the next Machine for deletion on scale-down. It would be good to collect some feedback.
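
To make the ownership rows quoted in this thread concrete, here is a minimal sketch of how a reconciler might decide who owns a hosted cluster from those markers. The keys and values come from the quoted table; everything else is an assumption: the quoted rows do not say whether the markers are labels or annotations (a plain map is used here), the precedence order when several markers are present is a guess, and `Owner` and `ownerFor` are hypothetical names.

```go
package main

import "fmt"

// Marker keys as quoted in the table under review.
const (
	keyPrivate    = "hypershift.openshift.io/private-capi-types"
	keyScope      = "hypershift.openshift.io/scope"
	keyInProgress = "hypershift.openshift.io/migration-in-progress"
	keyFailed     = "hypershift.openshift.io/migration-failed"
)

// Owner identifies which component is responsible for a hosted cluster.
type Owner string

const (
	NewOperator         Owner = "new-operator"
	LegacyOperator      Owner = "legacy-operator"
	MigrationController Owner = "migration-controller"
)

// ownerFor returns the owning component for the given markers.
// Precedence (in-progress, then failed, then private, then legacy) is an
// assumption of this sketch; the quoted table does not define it.
// ok is false when none of the known markers is set.
func ownerFor(markers map[string]string) (owner Owner, ok bool) {
	if markers[keyInProgress] == "true" {
		return MigrationController, true // neither operator reconciles
	}
	if _, failed := markers[keyFailed]; failed {
		return LegacyOperator, true // awaiting remediation before retry
	}
	if markers[keyPrivate] == "true" {
		return NewOperator, true
	}
	if markers[keyScope] == "legacy" {
		return LegacyOperator, true
	}
	return "", false
}

func main() {
	owner, ok := ownerFor(map[string]string{keyScope: "legacy"})
	fmt.Println(owner, ok)

	owner, ok = ownerFor(map[string]string{keyPrivate: "true", keyInProgress: "true"})
	fmt.Println(owner, ok)
}
```

Writing the precedence down explicitly, as above, is one way to surface the edge cases raised in this thread, such as a cluster that carries both a success marker and a failure marker after a partial migration.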

* Isolate HyperShift's CAPI CRD dependencies from the platform's CAPI CRDs by using a distinct API group (`cluster.hypershift.openshift.io`).
* Enable HyperShift components to continue using standard CAPI client libraries without modification through a transparent API proxy.
* Automatically migrate existing HyperShift installations to use the private CAPI types without user intervention or hosted cluster downtime.
* Ensure zero user-facing impact - hosted cluster administrators and workloads should experience no behavioral changes.

Member

We should probably call out that there is a period during which the ability to operate is degraded, e.g. the data plane cannot be scaled while the controllers are scaled down.


### Alternative 1: Coordinate CAPI Versions Between Platform and HyperShift

Instead of isolating CAPI types, ensure that the platform and HyperShift always use compatible CAPI versions through tight coordination.

Member

Coupling with standalone management clusters would be solved by the flag they provide to not clobber the CRDs, leaving MCE as the only possible conflict. Each MCE version bundles a pinned version of HyperShift. What would prevent MCE and the HyperShift Operator (HO) from running with the same latest CAPI API release in each downstream cycle?
