-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
 -->

 # Compatibility Guide
@@ -34,13 +34,126 @@ There is an [epic](https://github.com/apache/datafusion-comet/issues/313) where

 ## Cast

-Comet currently delegates to Apache DataFusion for most cast operations, and this means that the behavior is not
-guaranteed to be consistent with Spark.
+Cast operations in Comet fall into three levels of support:

-There is an [epic](https://github.com/apache/datafusion-comet/issues/286) where we are tracking the work to implement Spark-compatible cast expressions.
+- **Compatible**: The results match Apache Spark
+- **Incompatible**: The results may match Apache Spark for some inputs, but there are known issues where some inputs
+will result in incorrect results or exceptions. The query stage will fall back to Spark by default. Setting
+`spark.comet.cast.allowIncompatible=true` will allow all incompatible casts to run natively in Comet, but this is not
+recommended for production use.
+- **Unsupported**: Comet does not provide a native version of this cast expression and the query stage will fall back to
+Spark.

-### Cast from String to Timestamp
+The following table shows the current cast operations supported by Comet. Any cast that does not appear in this
+table (such as those involving complex types and timestamp_ntz, for example) are not supported by Comet.

-Casting from String to Timestamp is disabled by default due to incompatibilities with Spark, including timezone
-issues, and can be enabled by setting `spark.comet.castStringToTimestamp=true`. See the
-[tracking issue](https://github.com/apache/datafusion-comet/issues/328) for more information.
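
The three support levels added in this hunk amount to a small decision rule: compatible casts run natively, incompatible casts run natively only when `spark.comet.cast.allowIncompatible` is true, and unsupported casts always fall back to Spark. The sketch below models that rule as described in the added text. `CastSupport` and `runs_natively` are hypothetical names chosen for illustration and are not part of Comet's code base.

```python
from enum import Enum


class CastSupport(Enum):
    """Support levels as named in the compatibility guide text."""
    COMPATIBLE = "compatible"
    INCOMPATIBLE = "incompatible"
    UNSUPPORTED = "unsupported"


def runs_natively(level: CastSupport, allow_incompatible: bool = False) -> bool:
    """Return True if the cast would run in Comet, False if the stage falls back to Spark.

    Hypothetical helper: it mirrors the documented behaviour only,
    not Comet's actual planner logic.
    """
    if level is CastSupport.COMPATIBLE:
        return True
    if level is CastSupport.INCOMPATIBLE:
        # Stands in for spark.comet.cast.allowIncompatible (default: false).
        return allow_incompatible
    # UNSUPPORTED: no native implementation, so the flag has no effect.
    return False
```

Note that, under this reading, the flag only affects the incompatible level; it cannot make an unsupported cast run natively.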
docs/source/user-guide/configs.md (1 addition & 1 deletion)
@@ -25,7 +25,7 @@ Comet provides the following configuration settings.
 |--------|-------------|---------------|
 | spark.comet.ansi.enabled | Comet does not respect ANSI mode in most cases and by default will not accelerate queries when ansi mode is enabled. Enable this setting to test Comet's experimental support for ANSI mode. This should not be used in production. | false |
 | spark.comet.batchSize | The columnar batch size, i.e., the maximum number of rows that a batch can contain. | 8192 |
-| spark.comet.cast.stringToTimestamp| Comet is not currently fully compatible with Spark when casting from String to Timestamp. | false |
+| spark.comet.cast.allowIncompatible| Comet is not currently fully compatible with Spark for all cast operations. Set this config to true to allow them anyway. See compatibility guide for more information. | false |
 | spark.comet.columnar.shuffle.async.enabled | Whether to enable asynchronous shuffle for Arrow-based shuffle. By default, this config is false. | false |
 | spark.comet.columnar.shuffle.async.max.thread.num | Maximum number of threads on an executor used for Comet async columnar shuffle. By default, this config is 100. This is the upper bound of total number of shuffle threads per executor. In other words, if the number of cores * the number of shuffle threads per task `spark.comet.columnar.shuffle.async.thread.num` is larger than this config. Comet will use this config as the number of shuffle threads per executor instead. | 100 |
 | spark.comet.columnar.shuffle.async.thread.num | Number of threads used for Comet async columnar shuffle per shuffle task. By default, this config is 3. Note that more threads means more memory requirement to buffer shuffle data before flushing to disk. Also, more threads may not always improve performance, and should be set based on the number of cores available. | 3 |
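
The context rows for the async shuffle settings describe a cap: the per-executor thread count is the number of cores times the threads per task, limited by `spark.comet.columnar.shuffle.async.max.thread.num`. A minimal sketch of that arithmetic follows, assuming "number of cores" means cores per executor. `effective_shuffle_threads` is a hypothetical name, and the defaults are the values shown in the table (3 and 100).

```python
def effective_shuffle_threads(cores: int, threads_per_task: int = 3, max_threads: int = 100) -> int:
    """Approximate the per-executor async shuffle thread count described in the table.

    Hypothetical helper based only on the config descriptions, not on Comet's source.
    """
    # Threads requested if every core runs one shuffle task concurrently.
    requested = cores * threads_per_task
    # The max.thread.num setting is the upper bound per executor.
    return min(requested, max_threads)
```

With the defaults, 8 cores request 24 threads and stay under the cap, while 64 cores request 192 and are limited to 100.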