
Conversation

@jaypoulz
Contributor

This enhancement introduces Adaptable topology, a new cluster-topology mode that enables clusters to dynamically adjust their behavior based on node count. This allows SingleReplica clusters to scale to multi-node configurations without redeployment.

Key features:

  • Automatic behavior adjustment as control-plane and worker nodes scale
  • One-way transition from SingleReplica to Adaptable topology
  • Operator compatibility declarations via OLM annotations
  • CLI command for safe topology transitions with compatibility checks
  • Shared utilities in library-go to ease operator implementation

The proposal includes complete workflow descriptions, API extensions, test plans, and version skew strategy. Future stages will add AutomaticQuorumRecovery (AQR) to enable DualReplica-based resiliency for two-node configurations.
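
As a rough illustration of the API extension mentioned above, the sketch below mirrors the shape of the existing TopologyMode enum in github.com/openshift/api/config/v1; the new constant's name and value are placeholders, not the final API.

```go
// A minimal sketch, assuming the new value slots into the existing
// TopologyMode enum; names here mirror openshift/api for illustration only.
package v1

// TopologyMode mirrors the openshift/api type for illustration.
type TopologyMode string

const (
	HighlyAvailableTopologyMode TopologyMode = "HighlyAvailable"
	SingleReplicaTopologyMode   TopologyMode = "SingleReplica"
	ExternalTopologyMode        TopologyMode = "External"

	// AdaptableTopologyMode is the value this enhancement proposes: cluster
	// behavior derives from node count rather than being fixed at install.
	AdaptableTopologyMode TopologyMode = "Adaptable"
)
```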

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 10, 2025
@openshift-ci-robot

@jaypoulz: This pull request explicitly references no jira issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Dec 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign moadz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jaypoulz
Contributor Author

jaypoulz commented Dec 11, 2025

/retitle OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements

@openshift-ci openshift-ci bot changed the title NO-JIRA: Add Adaptable Topology enhancement proposal OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements Dec 11, 2025
@openshift-ci-robot

openshift-ci-robot commented Dec 11, 2025

@jaypoulz: This pull request references OCPEDGE-2280 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.22" instead.



@jaypoulz
Contributor Author

/retest

Introduce Adaptable topology, a cluster-topology mode that adjusts
control-plane and infrastructure behavior based on node count.
Clusters can install with Adaptable topology or transition from
SingleReplica as a Day 2 operation.

Key components:
- Infrastructure API adds Adaptable enum value
- Operator compatibility annotations for in-payload and OLM operators
- Library-go utilities for node count awareness
- cluster-etcd-operator enhanced scaling logic for 2↔3 nodes
- oc adm topology transition CLI command with compatibility checks
- Console marketplace filtering and compatibility display
- Initial Topology Audit covering 39 operators

The initial implementation supports SingleReplica-to-Adaptable
transitions. Future stages will add AutomaticQuorumRecovery (AQR)
for DualReplica behavior on two-node configurations.

Reviewers assigned across 18 teams including control plane, OLM,
API, installer, console, monitoring, and core operator teams.

This also reorganizes edge topology enhancements under a new
edge-topologies/ directory:
- edge-topologies/adaptable-topology.md (new)
- edge-topologies/single-node/ (moved)
- edge-topologies/two-node/ (moved and reorganized)
  - two-node-fencing.md (renamed from tnf.md)
  - two-node-arbiter.md (renamed from arbiter-clusters.md)
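
To illustrate the compatibility-annotation idea in the component list above, here is a minimal sketch of how such a declaration might be consumed; the annotation key and value format are hypothetical placeholders, not the names this proposal defines.

```go
// Hypothetical consumer of a topology-compatibility annotation on an OLM
// ClusterServiceVersion. The annotation key and comma-separated value format
// are placeholders for illustration.
package compat

import "strings"

// topologyAnnotation is a made-up key; the enhancement defines the real one.
const topologyAnnotation = "operators.example.io/supported-topologies"

// SupportsTopology reports whether the CSV's annotations declare support for
// the given topology, e.g. "SingleReplica,HighlyAvailable,Adaptable".
func SupportsTopology(annotations map[string]string, topology string) bool {
	for _, t := range strings.Split(annotations[topologyAnnotation], ",") {
		if strings.TrimSpace(t) == topology {
			return true
		}
	}
	return false
}
```
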
@openshift-ci
Contributor

openshift-ci bot commented Dec 12, 2025

@jaypoulz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/markdownlint
Commit: 6e81004
Required: true
Rerun command: /test markdownlint

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Adaptable topology is enabled
- Provide a subscription function for topology changes that operators invoke to enable Adaptable topology behavior
- Subscribe to node count changes and provide notifications when
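
To make the excerpt above concrete: a rough sketch of what such a library-go subscription helper could look like, built on a client-go node informer. The names are illustrative, not the proposed API, and the AddEventHandler return signature assumes a recent client-go.

```go
// Illustrative sketch of a node-count subscription helper; not the actual
// library-go API proposed in the enhancement.
package nodecount

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
)

const controlPlaneLabel = "node-role.kubernetes.io/control-plane"

// SubscribeToControlPlaneNodeCount calls onChange with the new count whenever
// the number of control-plane nodes changes.
func SubscribeToControlPlaneNodeCount(factory informers.SharedInformerFactory, onChange func(count int)) error {
	informer := factory.Core().V1().Nodes().Informer()

	var mu sync.Mutex
	last := -1
	recount := func() {
		mu.Lock()
		defer mu.Unlock()
		n := 0
		for _, obj := range informer.GetStore().List() {
			if node, ok := obj.(*corev1.Node); ok {
				if _, labeled := node.Labels[controlPlaneLabel]; labeled {
					n++
				}
			}
		}
		if n != last {
			last = n
			onChange(n)
		}
	}

	_, err := informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(interface{}) { recount() },
		UpdateFunc: func(interface{}, interface{}) { recount() },
		DeleteFunc: func(interface{}) { recount() },
	})
	return err
}
```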


It seems like a more natural way of triggering operators to react to cluster changes might be to use the controller-runtime events library for notifications (and, as a result, let operators reconcile the changes).

We might be able to use GenericEvent, but ideally I think it would make sense to create a TopologyTransitionEvent type specific to these transitions.
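
For illustration, one shape the suggestion above could take: pushing controller-runtime GenericEvents through a channel so a reconciler is requeued on topology transitions. The wrapper and wiring are a sketch only; the exact Watch/source plumbing varies across controller-runtime versions.

```go
// Sketch only: emitting a GenericEvent when the control-plane node count
// crosses a threshold, so a controller-runtime reconciler can react. A
// dedicated TopologyTransitionEvent type could carry old/new counts instead
// of the stand-in ConfigMap used here.
package topologyevents

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/event"
)

// TopologyTransitions is the channel a controller would watch (via a channel
// source) to get requeued when a transition happens.
var TopologyTransitions = make(chan event.GenericEvent, 1)

// NotifyTransition queues an event keyed to an arbitrary object; the
// reconciler re-reads the node count itself when it runs.
func NotifyTransition(namespace, name string) {
	TopologyTransitions <- event.GenericEvent{
		Object: &corev1.ConfigMap{
			ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: name},
		},
	}
}
```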

Contributor Author


I think having events to log changes in the cluster is reasonable, but I think that operators responding to node count changes is more direct and more in line with the vision of Adaptable topology. We want to avoid giving the impression that AdaptableTopology is transitioning between topologies. Its behavior is defined by the number of CP nodes in the cluster, and it expects each operator to know what to do no matter how many CP nodes there are.

1. The cluster administrator adds a new control-plane node to the cluster
2. The node joins the cluster and receives the control-plane labels
3. Operators watching the Infrastructure config detect the node count change
4. When crossing the 2→3 control-plane node threshold:


We also support >3 control plane nodes, e.g. 4 or 5 etcd members. I think this etcd scaling process needs to be done then, too.

Contributor Author

@jaypoulz jaypoulz Jan 23, 2026


We allow 1, 2, 3, and 5. Etcd scaling beyond 3, i.e. 3->4 and 4->5, is technically equivalent to 1->2 because it's just adding a node. This is an implementation detail that does need to be decided upon, because this would either be the first time we explicitly allow non-transient 4-node control planes, or we'd use the double-scaleup behavior of 2->3 to go from 3->5 etcd members.

4. When crossing the 2→3 control-plane node threshold:
- CEO initiates the etcd scaling process, starting learner instances on the two control-plane nodes not running etcd
- CEO coordinates an atomic transition to promote both learner instances together
- Other operators adjust their behavior to match HighlyAvailable control-plane topology
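
As a rough illustration of the learner-based sequence quoted above (the real logic belongs to cluster-etcd-operator), here is a sketch using the etcd v3 client. Note that the etcd API promotes members one at a time, so the coordinated "promote together" step is the operator's responsibility; a real controller would also wait for learners to catch up before promoting.

```go
// Sketch of scaling etcd membership up via learners; illustration only.
package etcdscale

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// scaleUpWithLearners adds each new member as a learner first, then promotes
// them to voting members, so quorum is never put at risk by an unsynced peer.
func scaleUpWithLearners(ctx context.Context, cli *clientv3.Client, newPeerURLs [][]string) error {
	var learnerIDs []uint64
	for _, urls := range newPeerURLs {
		resp, err := cli.MemberAddAsLearner(ctx, urls)
		if err != nil {
			return fmt.Errorf("adding learner %v: %w", urls, err)
		}
		learnerIDs = append(learnerIDs, resp.Member.ID)
	}
	// A real controller would check raft progress here and only promote
	// learners that have caught up with the leader.
	for _, id := range learnerIDs {
		if _, err := cli.MemberPromote(ctx, id); err != nil {
			return fmt.Errorf("promoting member %x: %w", id, err)
		}
	}
	return nil
}
```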


Would that happen concurrently with CEO, or after? As CEO/etcd scaling is a high-risk operation, I would feel better doing that first, and then somehow triggering the other operators.

Contributor Author

@jaypoulz jaypoulz Jan 23, 2026


Concurrently with CEO. We already do this with TNF every time we do a graceful or non-graceful disruption event, and it's up to the operators to define a dependency on etcd if it is present. The only operator with such a dependency is apiserver, which responds to CEO changes as well as CP node changes.

- CEO initiates the etcd scaling process, starting learner instances on the two control-plane nodes not running etcd
- CEO coordinates an atomic transition to promote both learner instances together
- Other operators adjust their behavior to match HighlyAvailable control-plane topology
5. When scaling down and crossing the 3→2 control-plane node threshold:


What is triggering this scaling down? Is it a simple "oc delete node/cp-1"? How do we then ensure atomicity?
I think we need an "oc adm topology" command to do that.

We also need to cross-check with the procedure we have in place to replace a failed control-plane node. If that involves first removing the node, we probably do not want to trigger the etcd scale-down.

Contributor Author


You can delete nodes now, and this modifies behavior for the cluster.
For UPI clusters, you won't get a replacement node - regardless of the topology set.

I agree that we will need additional atomicity coordination for scale-up and scale-down across threshold levels (i.e. 2->3 or 3->2) to ensure coordination at the etcd level, but my proposal is that CEO handles that, since it already has safe-scaling protection built in.

I don't think a separate oc adm topology command should exist for scale-up or scale-down. It can provide insight into your effective topology, but I'd like to avoid introducing new commands that perform behavior that is already part of the product.

Contributor Author


I think what you're getting at is that a user who is accustomed to our current topologies might say: but how do I tell the cluster how I want it to behave with regard to topology? That paradigm is strictly incompatible with AdaptableTopology. You don't get to tell the cluster how to behave - you just control the number of CP nodes, and the cluster operators are responsible for reconciling that into appropriate behaviors.

This prevents invalid configurations where control-plane and infrastructure
topologies are out of sync.

#### Shared Utilities in library-go


This still feels like an awkward way of getting and watching the status. Why not have a new attribute "effectiveTopology" which can change and shows the current topology?
Also, for cluster admins, it's not easy to see what the current effective topology is, e.g. during the control-plane scaling process. We need a status field that shows e.g. "etcd cluster extension in progress, adding learners, ...".

Contributor Author


This is an implementation detail. We can have library-go functions for effective topology, but that's more for monitoring / user-facing behavior. For operator authors, it's better to write these in reference to what behavior is appropriate for each node-count, since topologies are just an abstraction layer on top of that.
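
A minimal sketch of that framing, assuming a hypothetical library-go helper that maps control-plane node count to the behavior an operator should adopt; the function name and the two-node case are illustrative only.

```go
// Hypothetical helper: derive the behavior an operator should adopt from the
// control-plane node count, rather than from a declared topology.
package nodecount

import configv1 "github.com/openshift/api/config/v1"

// EffectiveControlPlaneTopology is an illustration; how the proposal treats
// the two-node case (pre-AutomaticQuorumRecovery) is its own design point.
func EffectiveControlPlaneTopology(controlPlaneNodes int) configv1.TopologyMode {
	switch {
	case controlPlaneNodes <= 2:
		// Without AQR, one or two nodes get single-replica style behavior;
		// a future stage could return a DualReplica-style mode for two.
		return configv1.SingleReplicaTopologyMode
	default:
		return configv1.HighlyAvailableTopologyMode
	}
}
```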

Contributor Author


Would you say that a two-node topology is not a two-node topology because etcd is in the middle of scaling up from 1 to 2 nodes?

I would say that it's two-node topology with cluster-etcd-operator in an "In Progress" state.

AdaptableTopology enacts that paradigm. The number of control-plane nodes (plus a potential future extension for fencing) determines your effective topology. If a cluster admin wants to see the state of the cluster adapting to a node addition, they just look at the cluster operators the same way they've always done.


**Topology Mismatch**: The cluster configuration would claim HighlyAvailable
while still operating as SingleReplica until nodes are provisioned.
The declared topology differs from actual behavior.


Adaptable will have this intermediate state, too, e.g. during the etcd learner phase when transitioning from SNO to MNO.

Contributor Author


The point is that it's not a mismatch. It's expected behavior for Adaptable topology, whereas it's an invalid state for HighlyAvailable outside of the installation phase.

The declared topology differs from actual behavior.

**Complex Support Matrix**: Supporting arbitrary topology transitions creates
a complex matrix.


But doesn't Adaptable also allow arbitrary transitions, by adding/removing nodes? This will create a complex matrix, too.

Contributor Author


I don't think this is as complex as you'd imagine.
There are exactly 6 scenarios, which could be combined into a single lane for verification:

  • 1->2
  • 2->3
  • 3->4
  • 4->3
  • 3->2
  • 2->1

Every other transition or combination is technically equivalent to some combination of these atomic transitions.
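
Purely as an illustration of how small that matrix is, the six atomic transitions could be driven from a single table in one verification lane:

```go
// Illustrative only: the atomic control-plane node-count transitions listed
// above; larger jumps decompose into sequences of these.
package topologytest

type transition struct {
	from, to int
}

var atomicTransitions = []transition{
	{1, 2}, {2, 3}, {3, 4}, // scale-up steps
	{4, 3}, {3, 2}, {2, 1}, // scale-down steps
}
```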


The one-way transition model simplifies testing.
We only test transitions into Adaptable topology and can stagger support
for different source topologies.


But we also need to test scale-out/scale-down with Adaptable, which creates tons of possible paths we need to validate. Or not?

Contributor Author


This is somewhat true. This bullet refers to testing the topology transition, not testing the functionality of the topology definition.

Adaptable topology will need scale-out/scale-in testing, but those could be dedicated lanes, and I think there is less to test than you might think.
1 to 2 and 2 to 3 have nuances, but 3 to 4 and beyond behave the same for any number of nodes.
Scale-down is the same in reverse.

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2026
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 23, 2026
@openshift-merge-robot
Contributor

PR needs rebase.



@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2026
@jaypoulz
Contributor Author

/remove lifecycle-rotten

@jaypoulz
Contributor Author

/remove-lifecycle rotten

@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 23, 2026