[SPARK-51830] Exception handling for partition datatype conversion call #50610

Open. Wants to merge 1 commit into base: master

Madhukar525722

What changes were proposed in this pull request?

Add exception handling around the call to castPartValueToDesiredType, similar to the handling already used at its other call site.
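As a rough illustration of the kind of guard described here, the sketch below wraps a numeric parse in `scala.util.Try` and falls back to the raw partition value when parsing fails. It does not reproduce the PR's diff or Spark's code: `PartitionValueSketch`, `castToInt`, and `normalizeIntPartition` are hypothetical names, and `castToInt` is only a stand-in for the integer branch of a partition cast.

```scala
import scala.util.Try

object PartitionValueSketch {
  // Hypothetical stand-in for the integer branch of a partition cast.
  // Integer.parseInt throws NumberFormatException on non-numeric input.
  def castToInt(raw: String): Int = Integer.parseInt(raw)

  // Normalizes a numeric partition value (for example "007" becomes "7").
  // If the value cannot be parsed, the raw string is returned unchanged
  // instead of letting the exception propagate to the caller.
  def normalizeIntPartition(raw: String): String =
    Try(castToInt(raw)).map(_.toString).getOrElse(raw)
}
```

With this guard, a legacy value such as `partition_value` passes through untouched, while well-formed numeric values are still normalized. Whether falling back to the raw string is the right behavior for Spark is a design decision for the PR itself; the sketch only shows the shape of the pattern.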

Why are the changes needed?

Let's say a partition column is of int datatype, but a partition in a legacy table was created with a non-numeric value, for example:
partition_name=partition_value

Then, when we perform the following operations,

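import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.lit

// Read the legacy table, overwrite the partition column with an int literal, and append it back.
val df = spark.sql("select * from db.table")
val modifiedDf = df.withColumn("partition_name", lit(2))
modifiedDf.show(false)
modifiedDf.write.mode(SaveMode.Append).insertInto("db.table")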

it throws the following error, even if spark.sql.sources.validatePartitionColumns=false:

java.lang.NumberFormatException: For input string: "partition_value"
  at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
  at java.base/java.lang.Integer.parseInt(Integer.java:668)
  at java.base/java.lang.Integer.parseInt(Integer.java:786)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.castPartValueToDesiredType(PartitioningUtils.scala:535)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.removeLeadingZerosFromNumberTypePartition(PartitioningUtils.scala:362)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$.$anonfun$getPathFragment$1(PartitioningUtils.scala:355)
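
The top frames of this trace can be reproduced without Spark. The minimal sketch below only shows that `Integer.parseInt` rejects the legacy value with the same message; `ParseFailureSketch` and `parseFailureMessage` are hypothetical names used for illustration.

```scala
import scala.util.{Failure, Success, Try}

object ParseFailureSketch {
  // Returns the exception message if parsing the raw value as an int fails,
  // or None if the value is a valid integer.
  def parseFailureMessage(raw: String): Option[String] =
    Try(Integer.parseInt(raw)) match {
      case Failure(e: NumberFormatException) => Some(e.getMessage)
      case Failure(other)                    => throw other
      case Success(_)                        => None
    }
}
```

Calling `ParseFailureSketch.parseFailureMessage("partition_value")` yields the `For input string: "partition_value"` message seen above, while a numeric string such as `"2"` yields `None`.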

Does this PR introduce any user-facing change?

No

How was this patch tested?

Re-ran the above failing commands

Was this patch authored or co-authored using generative AI tooling?

@github-actions github-actions bot added the SQL label Apr 16, 2025
@HyukjinKwon
Member

Could we have a JIRA and a unit test? See also https://spark.apache.org/contributing.html

@Madhukar525722 Madhukar525722 changed the title from "Error handling for partition datatype conversion call" to "[SPARK-51830] Exception handling for partition datatype conversion call" Apr 17, 2025
@Madhukar525722 Madhukar525722 force-pushed the master branch 3 times, most recently from 65fb110 to 7babcec Compare April 17, 2025 06:51
@Madhukar525722
Author

Madhukar525722 commented Apr 17, 2025

Hi @HyukjinKwon, I have addressed your suggestions. Please review.
