Skip to content

[fix](json-functions)fix json-replace/insert/set/array behavior with complex type #50308

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

amorynan
Copy link
Contributor

What problem does this PR solve?

here are some functions behavior is not same with mysql as JSON_REPLACE | JSON_INSERT | JSON_SET | JSON_ARRAY
which json and complex types are not supported before this PR
Follow mysql : https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-set
before we will meet result:

mysql> insert into t01 values (1, '{"a":1, "b":2}');
Query OK, 1 row affected (0.25 sec)
{'label':'label_1056456f8d8c46e5_9be38bb344bdd72c', 'status':'VISIBLE', 'txnId':'8019'}

mysql> select b, json_set(b, "$.a", b) from t01;
+---------------+-----------------------+
| b             | json_set(b, "$.a", b) |
+---------------+-----------------------+
| {"a":1,"b":2} | {"a":1,"b":2}         |
+---------------+-----------------------+

now

mysql>  select b, json_set(b, "$.a", b) from t01;
+---------------+---------------------------+
| b             | json_set(b, "$.a", b)     |
+---------------+---------------------------+
| {"a":1,"b":2} | {"a":{"a":1,"b":2},"b":2} |
+---------------+---------------------------+
1 row in set (0.12 sec)

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@amorynan
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 33763 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2161e35bd69d53f5b0f5d924df3c9b68dec1a09a, data reload: false

------ Round 1 ----------------------------------
q1	26172	5055	5004	5004
q2	2071	283	182	182
q3	10384	1242	689	689
q4	10218	1003	563	563
q5	7557	2396	2380	2380
q6	182	162	136	136
q7	918	753	614	614
q8	9321	1300	1078	1078
q9	6828	5069	5048	5048
q10	6795	2289	1877	1877
q11	474	281	275	275
q12	352	358	230	230
q13	17774	3668	3081	3081
q14	224	236	226	226
q15	541	485	481	481
q16	435	447	390	390
q17	574	858	361	361
q18	7810	7077	7015	7015
q19	1197	940	552	552
q20	328	330	213	213
q21	4251	3333	2460	2460
q22	1007	1015	908	908
Total cold run time: 115413 ms
Total hot run time: 33763 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5081	5055	5063	5055
q2	241	330	226	226
q3	2126	2670	2294	2294
q4	1449	1827	1392	1392
q5	4559	4445	4466	4445
q6	211	162	125	125
q7	1940	1939	1701	1701
q8	2578	2519	2485	2485
q9	7167	7160	7038	7038
q10	3023	3194	2747	2747
q11	559	530	498	498
q12	683	788	615	615
q13	3585	3928	3280	3280
q14	274	295	265	265
q15	524	478	458	458
q16	457	494	468	468
q17	1169	1583	1406	1406
q18	7727	7545	7344	7344
q19	806	861	974	861
q20	1988	1970	1910	1910
q21	5163	4732	4553	4553
q22	1080	999	988	988
Total cold run time: 52390 ms
Total hot run time: 50154 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 184885 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2161e35bd69d53f5b0f5d924df3c9b68dec1a09a, data reload: false

query1	1004	473	483	473
query2	6555	1884	1833	1833
query3	6742	221	213	213
query4	26140	23757	23358	23358
query5	4338	607	471	471
query6	293	207	222	207
query7	4625	508	282	282
query8	305	248	253	248
query9	8582	2564	2560	2560
query10	461	321	261	261
query11	15114	15000	14767	14767
query12	153	108	99	99
query13	1636	504	393	393
query14	9178	6071	6245	6071
query15	201	188	162	162
query16	7255	641	465	465
query17	1196	730	588	588
query18	1989	401	304	304
query19	194	194	159	159
query20	123	124	111	111
query21	213	127	103	103
query22	4101	4085	3901	3901
query23	33965	32759	32943	32759
query24	8397	2375	2378	2375
query25	558	495	412	412
query26	1236	272	154	154
query27	2749	486	337	337
query28	4344	2106	2087	2087
query29	792	534	429	429
query30	278	211	187	187
query31	925	871	788	788
query32	71	64	62	62
query33	554	361	315	315
query34	780	847	503	503
query35	797	815	745	745
query36	979	981	892	892
query37	116	100	77	77
query38	4136	4139	4111	4111
query39	1510	1393	1376	1376
query40	206	115	104	104
query41	63	62	60	60
query42	131	105	107	105
query43	492	500	490	490
query44	1299	790	786	786
query45	184	173	165	165
query46	828	1023	611	611
query47	1744	1794	1712	1712
query48	373	408	295	295
query49	808	518	416	416
query50	633	667	415	415
query51	4151	4072	4139	4072
query52	103	125	100	100
query53	228	251	193	193
query54	587	568	526	526
query55	82	77	82	77
query56	306	290	284	284
query57	1118	1128	1074	1074
query58	264	243	248	243
query59	2763	2722	2636	2636
query60	330	309	289	289
query61	129	125	126	125
query62	799	721	660	660
query63	233	185	187	185
query64	4379	1004	677	677
query65	4356	4214	4217	4214
query66	1136	405	307	307
query67	15707	15401	15160	15160
query68	8584	880	513	513
query69	467	305	270	270
query70	1182	1069	1069	1069
query71	463	323	290	290
query72	5449	4765	4769	4765
query73	723	609	341	341
query74	8866	9255	8653	8653
query75	4075	3164	2706	2706
query76	3732	1182	736	736
query77	775	386	290	290
query78	10020	10196	9227	9227
query79	1927	811	566	566
query80	575	512	484	484
query81	478	264	215	215
query82	432	123	99	99
query83	258	246	221	221
query84	237	106	80	80
query85	788	359	310	310
query86	338	311	298	298
query87	4347	4362	4236	4236
query88	3382	2194	2186	2186
query89	383	314	287	287
query90	1977	210	206	206
query91	146	153	117	117
query92	73	62	56	56
query93	1200	927	582	582
query94	666	408	309	309
query95	382	291	283	283
query96	493	547	270	270
query97	3152	3243	3103	3103
query98	229	218	206	206
query99	1536	1412	1292	1292
Total cold run time: 273352 ms
Total hot run time: 184885 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 29.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2161e35bd69d53f5b0f5d924df3c9b68dec1a09a, data reload: false

query1	0.03	0.04	0.03
query2	0.12	0.10	0.12
query3	0.24	0.20	0.19
query4	1.59	0.19	0.18
query5	0.60	0.59	0.61
query6	1.19	0.72	0.72
query7	0.03	0.02	0.02
query8	0.04	0.04	0.04
query9	0.57	0.55	0.51
query10	0.55	0.57	0.56
query11	0.16	0.10	0.11
query12	0.14	0.11	0.12
query13	0.61	0.59	0.59
query14	1.16	1.17	1.17
query15	0.87	0.84	0.83
query16	0.39	0.38	0.39
query17	1.00	1.04	1.02
query18	0.21	0.19	0.19
query19	1.85	1.76	1.80
query20	0.01	0.01	0.01
query21	15.41	0.91	0.56
query22	0.75	1.26	0.79
query23	14.76	1.36	0.61
query24	7.40	1.99	1.19
query25	0.48	0.22	0.06
query26	0.68	0.16	0.15
query27	0.05	0.05	0.05
query28	9.50	0.79	0.44
query29	12.52	3.97	3.31
query30	0.26	0.10	0.07
query31	2.80	0.58	0.38
query32	3.22	0.53	0.47
query33	3.05	3.08	3.09
query34	15.81	5.03	4.45
query35	4.49	4.49	4.47
query36	0.67	0.49	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.14	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.02	0.02
Total cold run time: 103.71 s
Total hot run time: 29.97 s

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.55% (14538/27148)
Line Coverage 42.38% (126010/297353)
Region Coverage 41.19% (64404/156355)
Branch Coverage 35.77% (32373/90512)

List<DataType> arguments = new ArrayList<>();
arguments.add(VarcharType.SYSTEM_DEFAULT); // json_str
for (int i = 1; i < arity(); i++) {
if ((i & 1) == 0 && (getArgumentType(i).isComplexType() || getArgumentType(i).isJsonType())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does (i & 1) == 0 mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i & 1) == 0 determines the even position, which exactly corresponds to the position of the value parameter. pass the origin value type to BE and should not cast to varchar

@hello-stephen
Copy link
Contributor

BE Regression P0 && UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage
Line Coverage
Region Coverage
Branch Coverage

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 23, 2025
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

Copy link
Member

@eldenmoon eldenmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amorynan
Copy link
Contributor Author

run feut

@yiguolei yiguolei merged commit 85b7bee into apache:master Apr 24, 2025
30 of 34 checks passed
github-actions bot pushed a commit that referenced this pull request Apr 24, 2025
…complex type (#50308)

### What problem does this PR solve?
here are some functions behavior is not same with mysql as JSON_REPLACE
| JSON_INSERT | JSON_SET | JSON_ARRAY
which json and complex types are not supported before this PR
Follow mysql :
https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-set
before  we will meet result:
```
mysql> insert into t01 values (1, '{"a":1, "b":2}');
Query OK, 1 row affected (0.25 sec)
{'label':'label_1056456f8d8c46e5_9be38bb344bdd72c', 'status':'VISIBLE', 'txnId':'8019'}

mysql> select b, json_set(b, "$.a", b) from t01;
+---------------+-----------------------+
| b             | json_set(b, "$.a", b) |
+---------------+-----------------------+
| {"a":1,"b":2} | {"a":1,"b":2}         |
+---------------+-----------------------+
```
now 
```
mysql>  select b, json_set(b, "$.a", b) from t01;
+---------------+---------------------------+
| b             | json_set(b, "$.a", b)     |
+---------------+---------------------------+
| {"a":1,"b":2} | {"a":{"a":1,"b":2},"b":2} |
+---------------+---------------------------+
1 row in set (0.12 sec)
```
yiguolei pushed a commit that referenced this pull request Apr 24, 2025
…complex type #50308 (#50309)

### What problem does this PR solve?
backport: #50308

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants