Factor out Substrait consumers into separate files #15794
+4,321
−3,452
Which issue does this PR close?
Rationale for this change
The consumer.rs file grew a bit too big (~3400 LOC). The good thing is that it's easily splittable into separate files, each one responsible for converting one Substrait node into one DataFusion logical plan node. With this change, people can go straight to the file they care about, greatly reducing the amount of information they need to deal with.
What changes are included in this PR?
A refactor of the Substrait consumer.rs file into multiple files following these rules:
- Each from_* prefixed function responsible for converting one Substrait node into one DataFusion logical plan node is now factored out into its own file named after the original Substrait node (e.g. cast.rs, literal.rs, aggregate_rel.rs)
- utils.rs file
- pub(super) for functions that now need to be shared across different files.
There's one subtle public API change that might be nice to keep:
That one should probably have been public from the beginning, and not exposing it now gets a bit messy, as it's used outside of the consumer module in the producer.rs tests, and hiding it from the outside would require introducing some #[cfg(test)] to the module definitions. I can hide it, though, if people want a perfect no-API-change refactor.
Are these changes tested?
Yes, by the current pipelines.
Are there any user-facing changes?
The from_substrait_type function is now exposed.
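For readers unfamiliar with this kind of split, here is a minimal sketch of the layout the rules above describe: one module per converted node, a utils module for shared helpers, and pub(super) for items that only sibling modules need. Everything in it is hypothetical and simplified for illustration (the String-based "plan", normalize_name, and the function signatures are assumptions, not the actual DataFusion or Substrait APIs), and the modules are written inline so the example fits in one file.

```rust
// Hypothetical stand-in for a `consumer` module split into one file per node.
// In a real crate each inner module would live in its own file
// (cast.rs, literal.rs, utils.rs) and be declared with `mod cast;` etc.
pub mod consumer {
    mod utils {
        // pub(super): visible to sibling modules inside `consumer`,
        // but not to code outside of it.
        pub(super) fn normalize_name(name: &str) -> String {
            name.trim().to_lowercase()
        }
    }

    mod literal {
        use super::utils::normalize_name;

        // Converts one (fake) literal node into one (fake) plan node.
        pub fn from_literal(raw: &str) -> String {
            format!("Literal({})", normalize_name(raw))
        }
    }

    mod cast {
        use super::utils::normalize_name;

        // Converts one (fake) cast node into one (fake) plan node.
        pub fn from_cast(input: &str, target_type: &str) -> String {
            format!("Cast({} AS {})", input, normalize_name(target_type))
        }
    }

    // Re-exports keep the `consumer::from_*` paths stable for callers,
    // even though each function now lives in its own module.
    pub use cast::from_cast;
    pub use literal::from_literal;
}
```

With this layout, callers outside the module can still write consumer::from_literal(...) and consumer::from_cast(...), while consumer::utils::normalize_name stays private to the module tree. Making a helper part of the public API, as this PR does for from_substrait_type, would correspond to marking it pub and re-exporting it in the same way.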