Contributor

@Sh-Zh-7 Sh-Zh-7 commented Dec 25, 2025

Description

This PR introduces the following optimization rules:

  • PruneWindowColumns
  • RemoveRedundantWindow
  • PruneOrderByInWindowAggregation
  • GatherAndMergeWindows
  • ReplaceWindowWithRowNumber
  • PushdownLimitIntoWindow
  • PushdownFilterIntoWindow

It also adds the corresponding plan nodes and operators.

@Sh-Zh-7 Sh-Zh-7 marked this pull request as draft December 25, 2025 19:21

codecov bot commented Dec 25, 2025

Codecov Report

❌ Patch coverage is 5.79614% with 1414 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.12%. Comparing base (9abac5c) to head (e7a4b17).
⚠️ Report is 59 commits behind head on master.

Files with missing lines | Patch % | Lines
...tion/operator/GroupedTopNRowNumberAccumulator.java | 0.00% | 182 Missing ⚠️
...execution/operator/RowReferenceTsBlockManager.java | 0.00% | 153 Missing ⚠️
.../planner/iterative/rule/GatherAndMergeWindows.java | 16.08% | 120 Missing ⚠️
...n/operator/process/window/TopKRankingOperator.java | 0.00% | 90 Missing ⚠️
...l/aggregation/grouped/array/IntArrayFIFOQueue.java | 0.00% | 90 Missing ⚠️
...ngine/plan/relational/planner/node/ValuesNode.java | 0.00% | 89 Missing ⚠️
...gregation/grouped/array/LongBigArrayFIFOQueue.java | 0.00% | 81 Missing ⚠️
...eryengine/plan/planner/TableOperatorGenerator.java | 0.00% | 80 Missing ⚠️
...xecution/operator/GroupedTopNRowNumberBuilder.java | 0.00% | 78 Missing ⚠️
...ion/operator/process/window/RowNumberOperator.java | 0.00% | 76 Missing ⚠️
... and 19 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16953      +/-   ##
============================================
+ Coverage     39.02%   39.12%   +0.10%     
- Complexity      207      212       +5     
============================================
  Files          5021     5093      +72     
  Lines        333377   341357    +7980     
  Branches      42431    43620    +1189     
============================================
+ Hits         130110   133568    +3458     
- Misses       203267   207789    +4522     


@Sh-Zh-7 Sh-Zh-7 marked this pull request as ready for review January 5, 2026 04:56

sonarqubecloud bot commented Jan 6, 2026

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)


Contributor

Copilot AI left a comment

Pull request overview

This PR integrates window function optimization rules into IoTDB. It adds specialized plan nodes (TopKRankingNode, RowNumberNode, ValuesNode) and their corresponding operators, along with optimization rules that transform and optimize window operations.

Changes:

  • Added new plan nodes: TopKRankingNode, RowNumberNode, and ValuesNode for specialized window operations
  • Implemented optimization rules: PruneWindowColumns, RemoveRedundantWindow, GatherAndMergeWindows, ReplaceWindowWithRowNumber, PushDownLimitIntoWindow, PushDownFilterIntoWindow
  • Added operators: TopKRankingOperator, RowNumberOperator, ValuesOperator with supporting data structures

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 9 comments.

File | Description
HeapTraversal.java | Utility for navigating binary heap structures
TopKRankingNode.java | Plan node for top-k ranking operations
RowNumberNode.java | Plan node for row numbering operations
ValuesNode.java | Plan node for constant value operations
RemoveRedundantWindow.java | Rule to remove empty window operations
ReplaceWindowWithRowNumber.java | Rule to replace window with row number (incomplete)
PushDownLimitIntoWindow.java | Rule to push limit into window operations
PushDownFilterIntoWindow.java | Rule to push filter into window operations
GatherAndMergeWindows.java | Rule to merge adjacent window operations
TopKRankingOperator.java | Operator for executing top-k ranking
RowNumberOperator.java | Operator for computing row numbers
ValuesOperator.java | Operator for constant values
Supporting data structures | NoChannelGroupByHash, FIFO queues, grouped TopN builders


Comment on lines +160 to +162
public List<Symbol> getOutputSymbols() {
  return Collections.singletonList(rankingSymbol);
}
Copilot AI Jan 13, 2026

The method getOutputSymbols returns only the ranking symbol, but it should return all output symbols including those from the child node. This inconsistency with other node implementations (like RowNumberNode which properly handles output symbols) will cause incorrect query planning.

Copilot uses AI. Check for mistakes.

@Override
public List<Symbol> getOutputSymbols() {
  return Collections.singletonList(rowNumberSymbol);
Copilot AI Jan 13, 2026

The method getOutputSymbols returns only the row number symbol, but it should return all output symbols including those from the child node. This is inconsistent with how other operators handle output symbols and will cause query planning errors.

Suggested change
- return Collections.singletonList(rowNumberSymbol);
+ return ImmutableList.<Symbol>builder()
+     .addAll(getChild().getOutputSymbols())
+     .add(rowNumberSymbol)
+     .build();
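
A minimal sketch of the pass-through pattern this comment describes, assuming the node is meant to expose its child's columns plus one generated column. The types here (`Node`, `SourceNode`, `RowNumberLikeNode`, with plain strings standing in for `Symbol`) are hypothetical stand-ins, not the IoTDB planner classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration only; not the IoTDB planner types.
final class OutputSymbolsSketch {
  interface Node {
    List<String> getOutputSymbols();
  }

  // A leaf node that exposes a fixed list of column symbols.
  static final class SourceNode implements Node {
    private final List<String> symbols;

    SourceNode(List<String> symbols) {
      this.symbols = Collections.unmodifiableList(new ArrayList<>(symbols));
    }

    @Override
    public List<String> getOutputSymbols() {
      return symbols;
    }
  }

  // Passes the child's symbols through and appends the generated row-number symbol.
  static final class RowNumberLikeNode implements Node {
    private final Node child;
    private final String rowNumberSymbol;

    RowNumberLikeNode(Node child, String rowNumberSymbol) {
      this.child = child;
      this.rowNumberSymbol = rowNumberSymbol;
    }

    @Override
    public List<String> getOutputSymbols() {
      List<String> out = new ArrayList<>(child.getOutputSymbols());
      out.add(rowNumberSymbol);
      return Collections.unmodifiableList(out);
    }
  }

  public static void main(String[] args) {
    Node source = new SourceNode(Arrays.asList("device", "time", "value"));
    Node rowNumber = new RowNumberLikeNode(source, "rn");
    System.out.println(rowNumber.getOutputSymbols()); // prints [device, time, value, rn]
  }
}
```

If a downstream node resolves `value` against the output of the row-number node, the lookup only succeeds when the child's symbols are included, which is the planning failure the comment warns about.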

Map<Symbol, Symbol> mapping = new HashMap<>(rewrittenSource.getMappings());
SymbolMapper mapper = symbolMapper(mapping);

TopKRankingNode rewrittenTopNRanking = mapper.map(node, rewrittenSource.getRoot());
Copilot AI Jan 13, 2026

The variable name 'rewrittenTopNRanking' (line 641) is inconsistent with the node type TopKRankingNode. The name should be 'rewrittenTopKRanking' to match the actual class name and maintain naming consistency.

case 1036:
  return ExceptNode.deserialize(buffer);
case 1037:
  return TopKNode.deserialize(buffer);
Copilot AI Jan 13, 2026

The deserialization case for TABLE_TOPK_RANKING_NODE (1037) is calling TopKNode.deserialize(buffer) instead of TopKRankingNode.deserialize(buffer). This will cause runtime errors when deserializing TopKRankingNode instances.
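
A mismatch like this can be caught by a round-trip check that serializes a node, deserializes it, and compares classes. The sketch below is only illustrative: the node classes are empty stand-ins, the type ids are made up, and `roundTripsToSameClass` is a hypothetical helper, not part of IoTDB.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-ins for illustration only; ids and classes are not the IoTDB ones.
final class DeserializeDispatchSketch {
  interface Node {}

  static final class TopKNode implements Node {}

  static final class TopKRankingNode implements Node {}

  // Made-up type ids used only by this sketch.
  static final short TOPK = 1;
  static final short TOPK_RANKING = 2;

  // Writes only the type id; a real node would also write its fields.
  static ByteBuffer serialize(Node node) {
    ByteBuffer buffer = ByteBuffer.allocate(Short.BYTES);
    buffer.putShort(node instanceof TopKRankingNode ? TOPK_RANKING : TOPK);
    buffer.flip();
    return buffer;
  }

  // Dispatches on the type id; each case must construct the matching class.
  static Node deserialize(ByteBuffer buffer) {
    short typeId = buffer.getShort();
    switch (typeId) {
      case TOPK:
        return new TopKNode();
      case TOPK_RANKING:
        return new TopKRankingNode();
      default:
        throw new IllegalArgumentException("Unknown node type id: " + typeId);
    }
  }

  // True when a serialized node comes back as an instance of the same class.
  static boolean roundTripsToSameClass(Node node) {
    return deserialize(serialize(node)).getClass() == node.getClass();
  }

  public static void main(String[] args) {
    System.out.println(roundTripsToSameClass(new TopKNode()));
    System.out.println(roundTripsToSameClass(new TopKRankingNode()));
  }
}
```

If the `TOPK_RANKING` case returned a `TopKNode`, the second check would report `false`, so a test of this shape over every registered node type flags a wrong dispatch entry early.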


@Override
public Result apply(WindowNode node, Captures captures, Context context) {
  return null;
Copilot AI Jan 13, 2026

The apply method returns null unconditionally. This rule will never perform any transformation, making it ineffective. The method should implement the actual transformation logic to replace the WindowNode with a RowNumberNode.

Suggested change
- return null;
+ return Result.empty();
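
To illustrate why an explicit "no change" result is safer than `null`, here is a small sketch of a rule driver. `Result`, `Rule`, and `applyOnce` are hypothetical stand-ins written for this example, with plans represented as plain strings; they are not the IoTDB iterative-optimizer API.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration only; not the IoTDB iterative-optimizer API.
final class RuleResultSketch {
  // A result that either carries a rewritten plan or signals "no change".
  static final class Result {
    private static final Result EMPTY = new Result(Optional.empty());
    private final Optional<String> transformedPlan;

    private Result(Optional<String> transformedPlan) {
      this.transformedPlan = transformedPlan;
    }

    static Result empty() {
      return EMPTY;
    }

    static Result ofPlan(String plan) {
      return new Result(Optional.of(plan));
    }

    boolean isEmpty() {
      return !transformedPlan.isPresent();
    }

    String plan() {
      return transformedPlan.get();
    }
  }

  interface Rule {
    Result apply(String plan);
  }

  // Applies one rule; an empty result leaves the plan untouched.
  static String applyOnce(Rule rule, String plan) {
    Result result = rule.apply(plan);
    return result.isEmpty() ? plan : result.plan();
  }

  public static void main(String[] args) {
    Rule noOp = plan -> Result.empty();
    Rule rewrite = plan -> Result.ofPlan("RowNumber(" + plan + ")");
    System.out.println(applyOnce(noOp, "Scan")); // prints Scan
    System.out.println(applyOnce(rewrite, "Scan")); // prints RowNumber(Scan)
  }
}
```

In this sketch a rule that returned `null` would make `applyOnce` throw a `NullPointerException` at `result.isEmpty()`, whereas an empty result is simply skipped. Returning an empty result still performs no rewrite, so the transformation itself remains to be implemented.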

  return result;
}

private void processRow(TsBlock tsBlock, int position, long rowNumber) {
Copilot AI Jan 13, 2026

The method processRow accepts three parameters (TsBlock tsBlock, int position, long rowNumber) but is being called with (tsBlock, partitionId, rowCount + 1) at line 121. The second argument should be 'position', not 'partitionId'. This will cause incorrect column access and likely runtime errors.
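
The distinction between the two indices can be sketched as follows, assuming each row in a block has a position and a separately computed partition id. The `assignRowNumbers` helper and its array-based input are hypothetical simplifications, not the operator's actual data structures.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification for illustration only; not the RowNumberOperator code.
final class RowNumberPositionSketch {
  // partitionIds[position] is the partition of the row at that position in the block.
  static long[] assignRowNumbers(int[] partitionIds) {
    Map<Integer, Long> rowCounts = new HashMap<>();
    long[] rowNumbers = new long[partitionIds.length];
    for (int position = 0; position < partitionIds.length; position++) {
      // The partition id selects the counter to increment.
      int partitionId = partitionIds[position];
      long rowNumber = rowCounts.merge(partitionId, 1L, Long::sum);
      // The position, not the partition id, selects the row that receives the number.
      rowNumbers[position] = rowNumber;
    }
    return rowNumbers;
  }

  public static void main(String[] args) {
    long[] rowNumbers = assignRowNumbers(new int[] {0, 1, 0, 0, 1});
    System.out.println(Arrays.toString(rowNumbers)); // prints [1, 1, 2, 3, 2]
  }
}
```

Writing to `rowNumbers[partitionId]` instead would repeatedly overwrite slots 0 and 1 here and leave the remaining rows unnumbered, which mirrors the incorrect column access the comment describes.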


private void processRow(TsBlock tsBlock, int position, long rowNumber) {
  // Check max rows per partition limit
  if (maxRowsPerPartition.isPresent() && rowNumber >= maxRowsPerPartition.get()) {
Copilot AI Jan 13, 2026

The condition checks if rowNumber >= maxRowsPerPartition, but it should check rowNumber > maxRowsPerPartition. With the current logic, when rowNumber equals maxRowsPerPartition (which is the maximum allowed), the row is incorrectly skipped. For example, if maxRowsPerPartition is 5, row 5 will be skipped even though rows 1-5 should be included.

Suggested change
- if (maxRowsPerPartition.isPresent() && rowNumber >= maxRowsPerPartition.get()) {
+ if (maxRowsPerPartition.isPresent() && rowNumber > maxRowsPerPartition.get()) {
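
The boundary case can be checked in isolation. This sketch assumes row numbers are 1-based, as the example in the comment implies; `keptRowNumbers` is a hypothetical helper written for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper for illustration only; assumes 1-based row numbers.
final class MaxRowsPerPartitionSketch {
  // Returns the row numbers kept for one partition that contains rowCount rows.
  static List<Long> keptRowNumbers(long rowCount, OptionalLong maxRowsPerPartition) {
    List<Long> kept = new ArrayList<>();
    for (long rowNumber = 1; rowNumber <= rowCount; rowNumber++) {
      // Strictly greater than: the row whose number equals the limit is still kept.
      if (maxRowsPerPartition.isPresent() && rowNumber > maxRowsPerPartition.getAsLong()) {
        break;
      }
      kept.add(rowNumber);
    }
    return kept;
  }

  public static void main(String[] args) {
    System.out.println(keptRowNumbers(8, OptionalLong.of(5))); // prints [1, 2, 3, 4, 5]
    System.out.println(keptRowNumbers(3, OptionalLong.empty())); // prints [1, 2, 3]
  }
}
```

With `>=` in place of `>`, the first call would keep only rows 1 through 4, which is the off-by-one the suggestion fixes.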

Comment on lines +92 to +106
    boolean generateRanking,
    Optional<Integer> hashChannel,
    int expectedPositions,
    Optional<Long> maxPartialMemory) {
  this.operatorContext = operatorContext;
  this.inputOperator = inputOperator;
  this.rankingType = rankingType;
  this.inputTypes = inputTypes;
  this.partitionChannels = partitionChannels;
  this.partitionTSDataTypes = partitionTSDataTypes;
  this.sortChannels = sortChannels;
  this.sortOrders = sortOrders;
  this.maxRowCountPerPartition = maxRowCountPerPartition;
  this.partial = !generateRanking;
  this.generateRanking = generateRanking;
Copilot AI Jan 13, 2026

The TopKRankingOperator constructor parameter 'generateRanking' (line 92) is stored in the field of the same name (line 66) and is also used to derive 'partial' with inverted logic (partial = !generateRanking at line 105). The two fields always hold opposite values, which creates confusing dual state. Consider using a single boolean field with clear semantics.
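
One way to read this suggestion is to store a single flag and derive its negation through an accessor. The class below is a hypothetical reduction of the operator to just that flag, not the actual TopKRankingOperator.

```java
// Hypothetical reduction for illustration only; not the actual TopKRankingOperator.
final class RankingModeSketch {
  // Single source of truth for the operator's mode.
  private final boolean generateRanking;

  RankingModeSketch(boolean generateRanking) {
    this.generateRanking = generateRanking;
  }

  boolean isGenerateRanking() {
    return generateRanking;
  }

  // Derived on demand, so it can never drift out of sync with generateRanking.
  boolean isPartial() {
    return !generateRanking;
  }

  public static void main(String[] args) {
    RankingModeSketch finalStep = new RankingModeSketch(true);
    RankingModeSketch partialStep = new RankingModeSketch(false);
    System.out.println(finalStep.isPartial()); // prints false
    System.out.println(partialStep.isPartial()); // prints true
  }
}
```

Deriving `isPartial()` keeps the existing call sites readable while removing the second field that has to be kept consistent by hand.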

  return new RowNumberNode(
      getPlanNodeId(), partitionBy, orderSensitive, rowNumberSymbol, maxRowCountPerPartition);
}

Copilot AI Jan 13, 2026

This method overrides PlanNode.accept; it is advisable to add an Override annotation.

Suggested change
+ @Override
