Skip to content

Conversation

@LLDay
Copy link

@LLDay LLDay commented Jan 26, 2026

Rationale for this change

This PR covers a part of #14342.

This PR introduces methods for replacing physical expressions in execution plan nodes. This is an intermediate step in the development of physical-level placeholders. Later, these methods will be used to resolve placeholders in physical plans.

What changes are included in this PR?

Expression Rewriting Interface: Added physical_expressions and with_physical_expressions to the ExecutionPlan trait to allow inspection and replacement of expressions within plan nodes.

Are these changes tested?

Are there any user-facing changes?

Yes, a user can rewrite physical expressions in some execution plans.

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) execution Related to the execution crate proto Related to proto crate datasource Changes to the datasource crate physical-plan Changes to the physical-plan crate labels Jan 26, 2026
@LLDay LLDay marked this pull request as ready for review January 26, 2026 13:18
@LLDay
Copy link
Author

LLDay commented Jan 26, 2026

@askalt, could you review the PR as well?

@askalt
Copy link
Contributor

askalt commented Jan 26, 2026

cc @Omega359 (I remember you were also interested in placeholders within physical plans).

@Omega359
Copy link
Contributor

cc @Omega359 (I remember you were also interested in placeholders within physical plans).

Yes, I would like to be able to reuse plans - I'll try and find time in the next day or so to review this though I'm not particularly familiar with this area of code.

@LLDay LLDay force-pushed the feat/physical-placeholders branch from 7d86e73 to eb09e71 Compare January 27, 2026 07:05
@LLDay LLDay force-pushed the feat/physical-placeholders branch from eb09e71 to 3046ed0 Compare January 27, 2026 14:03
@github-actions github-actions bot removed optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) execution Related to the execution crate proto Related to proto crate labels Jan 27, 2026
@LLDay LLDay changed the title Physical-level placeholders Rewrite physical expressions in execution plans Jan 27, 2026
@LLDay LLDay force-pushed the feat/physical-placeholders branch from 3046ed0 to 4420b7f Compare January 27, 2026 14:45
@LLDay
Copy link
Author

LLDay commented Jan 27, 2026

We decided to leave only one commit in this PR regarding the rewriting of physical expressions. The second commit with ResolvePlaceholdersExec will be in a separate PR. You can find what the physical expressions resolution looked like here, but this code will be changed.

Comment on lines 521 to 525
Ok(Some(Arc::new(Self {
projector,
input: Arc::clone(&self.input),
metrics: ExecutionPlanMetricsSet::new(),
cache: self.cache.clone(),
})))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use a bit shorter form that highlights what were really changed (we could apply it for most plans):

        Ok(Some(Arc::new(Self {
            projector,
            metrics: ExecutionPlanMetricsSet::new(),
            ..self.clone()
        })))

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied these changes where possible (some execution plans do not implement clone).

filter,
join_type: self.join_type,
join_schema: Arc::clone(&self.join_schema),
left_fut: Arc::clone(&self.left_fut),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two different semantics for runtime state: on the one hand metrics are recreated, when left_fut is cloned. Let's explicitly re-set a runtime state in with_physical_expressions: recreate left_fut and add a note to the trait method comment, that it is ok to not preserve the state.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the behavior of the with_new_children method. It keeps dynamic filters and left_fut, but recreates metrics. I added a note to the with_physical_expressions method.

self.name()
);

let lex_ordering = LexOrdering::new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a method for LexOrdering to avoid code duplication in different sort nodes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the try_with_new_expressions method. I'm still wondering if this method should be part of the LexOrdering, because it is of little use.

fn physical_expressions<'a>(
&'a self,
) -> Option<Box<dyn Iterator<Item = Arc<dyn PhysicalExpr>> + 'a>> {
let left_sort_iter = self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we also need to return self.on (to ensure that query like select * from t1 join t2 on t1.a + $1 = t2.a works).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, on stores Column expressions. I've tested current implementation with the old ResolvePlaceholdersExec and it worked.

/// Returns a new [`DataSource`] with its physical expressions replaced by the provided
/// `exprs`.
///
/// # Constraints
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add notes about properties (for example, DataSource::EquivalenceProperties) and state here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@askalt
Copy link
Contributor

askalt commented Jan 28, 2026

By some reasons, I cannot resolve threads in this PR -- if you can, let's resolve all threads from the previous review iteration.
image

@LLDay LLDay force-pushed the feat/physical-placeholders branch from 4420b7f to e4dd613 Compare January 28, 2026 20:41
Introduces `physical_expressions` and `with_physical_expressions`
methods to the `ExecutionPlan` trait. These methods provide an interface
to inspect and replace physical expressions within execution plan nodes.

The commit also implements these methods for various ExecutionPlans and
introduces a new physical expression invariant check to ensure
consistency between expression retrieval and replacement.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate datasource Changes to the datasource crate physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants