Skip to content

Support transactions (and pandas.to_sql / read_sql_table) #72

@C0DK

Description

@C0DK

It's a great project, and truly helps me in a variety of ways. However, I am having a few issues when utilizing it with Pandas.

When reading a table I would like to do:

    def read(
        self, catalog_name: str, schema_name: str, table_name: str
    ) -> DataFrame:
        with self._get_connection(catalog_name) as connection:
            iterator = pd.read_sql_table(
                schema=schema_name,
                table_name=table_name,
                con=connection,
            )

This should return a pandas dataframe.

However, I am required to do the following lacking features from your otherwise great connection.

def read(
        self, catalog_name: str, schema_name: str, table_name: str
    ) -> pd.DataFrame:
        with self._get_connection(catalog_name) as connection:
            query = f"SELECT * FROM {catalog_name}.{schema_name}.{table_name}"
            self.logger.debug("Running query '%s'", query)
            df = pd.read_sql(query, connection)
            self.logger.debug(df)
            return df

Similarly I cannot use the to_sql on a dataframe where i'd like to do something like the following:

    def write(
        self,
        dataframe: pd.DataFrame,
        catalog_name: str,
        schema_name: str,
        table_name: str,
    ) -> pd.DataFrame:
        with self._get_connection(catalog_name) as connection:
            dataframe.to_sql(name=table_name, con=connection, schema=schema_name)

Both fail

databricks.sql.exc.NotSupportedError: Transactions are not supported on Databricks

Versions:

python 3.10.6
databricks-sql-connector==2.2.1
pandas==1.5.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions