Skip to content

Generate well-indented SQL from LogicalPlan #11308

@edmondop

Description

@edmondop

Is your feature request related to a problem or challenge?

DataFusion provides the capability of "unparsing" a logical plan into SQL via the unparser module in the sql crate (see https://github.com/apache/datafusion/blob/main/datafusion/sql/src/unparser/plan.rs). Examples also showcase the Dialect features to provide customizable escaping.

As a part of the work on SpiceAI and datafusion-federation, we have some rewrites on the tpch_q13 and we expect the final rewritten SQL to be the following:

SELECT c_orders.c_count,
       Count(1) AS custdist
FROM   (SELECT c_custkey                       AS c_custkey,
               "count(tpch.orders.o_orderkey)" AS c_count
        FROM   (SELECT TPCH.customer.c_custkey,
                       Count(TPCH.orders.o_orderkey) AS
                       "COUNT(tpch.orders.o_orderkey)"
                FROM   TPCH.customer
                       LEFT JOIN TPCH.orders
                              ON ( ( TPCH.customer.c_custkey =
                                     TPCH.orders.o_custkey )
                                   AND TPCH.orders.o_comment NOT LIKE
                                       '%special%requests%' )
                GROUP  BY TPCH.customer.c_custkey)) AS c_orders
GROUP  BY c_orders.c_count
ORDER  BY custdist DESC NULLS FIRST,
          c_orders.c_count DESC NULLS FIRST 

however the plan_to_sql generates a one-line sql which is much harder to read

SELECT c_orders.c_count, COUNT(1) AS custdist FROM (SELECT c_custkey AS c_custkey, "COUNT(tpch.orders.o_orderkey)" AS c_count FROM (SELECT tpch.customer.c_custkey, COUNT(tpch.orders.o_orderkey) AS "COUNT(tpch.orders.o_orderkey)" FROM tpch.customer LEFT JOIN tpch.orders ON ((tpch.customer.c_custkey = tpch.orders.o_custkey) AND tpch.orders.o_comment NOT LIKE '%special%requests%') GROUP BY tpch.customer.c_custkey)) AS c_orders GROUP BY c_orders.c_count ORDER BY custdist DESC NULLS FIRST, c_orders.c_count DESC NULLS FIRST"

Describe the solution you'd like

I would like to be able to provide an extra parameter to the Unparser, such as pretty_print or indent, which needs to be respected in the plan_to_sql

Describe alternatives you've considered

Use https://github.com/dprint/dprint on the SQL, but unfortunately SQL is not supported

Additional context

I used this https://www.dpriver.com/pp/sqlformat.htm to generate the formatted SQL from the SQL generated from dataufusion

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions