Conversation

@dantengsky (Member) commented Jan 21, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

The previous implementation of PartitionsShuffleKind::Mod broke cache affinity even when cluster membership remained unchanged. It first sorted partitions by hash % num_executors, then redistributed them evenly by partition count. This meant that when table data changed (segments added or removed), the redistribution cut points would shift, causing the same segment to be assigned to different executors.
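
As a rough illustration, the old scheme can be sketched as follows. This is a minimal standalone sketch, not the actual databend code: the names shuffle_mod_old and partition_hashes are hypothetical, partitions are represented only by their hashes, and the exact even-split rule in the real implementation may differ in detail.

```rust
// Sketch of the previous behavior: sort by hash % num_executors, then
// re-split evenly by count. The cut points depend on the total number of
// partitions, so adding or removing a segment can move other partitions
// between executors.
fn shuffle_mod_old(partition_hashes: &[u64], num_executors: usize) -> Vec<Vec<u64>> {
    let mut sorted = partition_hashes.to_vec();
    sorted.sort_by_key(|h| *h % num_executors as u64);

    // Even re-split by count: earlier executors take the smaller share and
    // later ones absorb the remainder (7 partitions, 2 executors -> 3 and 4).
    let base = sorted.len() / num_executors;
    let remainder = sorted.len() % num_executors;
    let mut out = Vec::with_capacity(num_executors);
    let mut cursor = 0;
    for i in 0..num_executors {
        let take = if i < num_executors - remainder { base } else { base + 1 };
        out.push(sorted[cursor..cursor + take].to_vec());
        cursor += take;
    }
    out
}
```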

For example, with 7 segments (indices 0-6) and 2 executors, the sorted partitions would be split as indices 0,1,2 to executor0 and indices 3,4,5,6 to executor1. When a new segment is added (now 8 segments), the split becomes indices 0,1,2,3 to executor0 and indices 4,5,6,7 to executor1, moving the segment originally at index 3 from executor1 to executor0 and invalidating its cache.

This fix removes the redundant redistribution step and directly assigns partitions to executors based on hash % num_executors. Now the same partition always goes to the same executor as long as cluster membership is unchanged, regardless of other partitions being added or removed.
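
The fixed scheme, in the same hypothetical terms (shuffle_mod_new is an illustrative name, not the actual databend function):

```rust
// Sketch of the fixed behavior: each partition goes directly to the bucket
// hash % num_executors, so its destination depends only on its own hash and
// the executor count, never on how many other partitions exist.
fn shuffle_mod_new(partition_hashes: &[u64], num_executors: usize) -> Vec<Vec<u64>> {
    let mut buckets = vec![Vec::new(); num_executors];
    for &hash in partition_hashes {
        buckets[(hash % num_executors as u64) as usize].push(hash);
    }
    buckets
}
```

With this bucketing, adding or removing other partitions cannot move an existing partition; only a change in num_executors (i.e. cluster membership) can.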

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-bugfix this PR patches a bug in codebase label Jan 21, 2026
@dantengsky (Member, Author) commented:

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e0bf4c74fa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

