Skip to content

Add support for RETURNING in SQLAlchemy dialect #363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions CHANGES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,12 @@ Changes for crate
Unreleased
==========

- Added support for the ``RETURNING`` clause to the SQLAlchemy dialect. This
requires CrateDB 4.2 or greater. In case you use any server side generated
columns in your primary key constraint with earlier CrateDB versions, you can
turn this feature off by passing ``implicit_returning=False`` in the
``create_engine()`` call.

- Added support for ``geo_point`` and ``geo_json`` types to the SQLAlchemy
dialect.

Expand Down
44 changes: 41 additions & 3 deletions docs/sqlalchemy.rst
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,45 @@ In this example, we:
The SQLAlchemy documentation has more information about `working with
tables`_.

``_id`` as primary key
......................

As with version 4.2 CrateDB supports the ``RETURNING`` clause, which makes it
possible to use the ``_id`` column as fetched value for the ``PRIMARY KEY``
constraint, since the SQLAlchemy ORM always **requires** a primary key.

A table schema like this

.. code-block:: sql

CREATE TABLE "doc"."logs" (
"ts" TIMESTAMP WITH TIME ZONE,
"level" TEXT,
"message" TEXT
)

would translate into the following declarative model::

>>> from sqlalchemy.schema import FetchedValue

>>> class Log(Base):
...
... __tablename__ = 'logs'
... __mapper_args__ = {
... 'exclude_properties': ['id']
... }
...
... id = sa.Column("_id", sa.String, server_default=FetchedValue(), primary_key=True)
... ts = sa.Column(sa.DateTime, server_default=sa.func.current_timestamp())
... level = sa.Column(sa.String)
... message = sa.Column(sa.String)

>>> log = Log(level="info", message="Hello World")
>>> session.add(log)
>>> session.commit()
>>> log.id
...

.. _using-extension-types:

Extension types
Expand Down Expand Up @@ -336,11 +375,9 @@ You can then set the values of the ``Geopoint`` and ``Geoshape`` columns::
>>> session.add(tokyo)
>>> session.commit()


Querying
========


When the ``commit`` method is called, two ``INSERT`` statements are sent to
CrateDB. However, the newly inserted rows aren't immediately available for
querying because the table index is only updated periodically (one second, by
Expand Down Expand Up @@ -544,7 +581,7 @@ The score is made available via the ``_score`` column, which is a virtual
column, meaning that it doesn't exist on the source table, and in most cases,
should not be included in your :ref:`table definition <table-definition>`.

You can select ``_score`` as part of a query, like this:
You can select ``_score`` as part of a query, like this::

>>> session.query(Character.name, '_score') \
... .filter(match(Character.quote_ft, 'space')) \
Expand All @@ -557,6 +594,7 @@ table definition But notice that we select the associated score by passing in
the virtual column name as a string (``_score``) instead of using a defined
column on the ``Character`` class.


.. _SQLAlchemy: http://www.sqlalchemy.org/
.. _Object-Relational Mapping: https://en.wikipedia.org/wiki/Object-relational_mapping
.. _dialect: http://docs.sqlalchemy.org/en/latest/dialects/
Expand Down
7 changes: 7 additions & 0 deletions src/crate/client/sqlalchemy/compiler.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,13 @@ def visit_any(self, element, **kw):
self.process(element.right, **kw)
)

def returning_clause(self, stmt, returning_cols):
columns = [
self._label_select_column(None, c, True, False, {})
for c in sa.sql.expression._select_iterables(returning_cols)
]
return "RETURNING " + ", ".join(columns)

def visit_insert(self, insert_stmt, asfrom=False, **kw):
"""
used to compile <sql.expression.Insert> expressions.
Expand Down
1 change: 1 addition & 0 deletions src/crate/client/sqlalchemy/dialect.py
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,7 @@ class CrateDialect(default.DefaultDialect):
type_compiler = CrateTypeCompiler
supports_native_boolean = True
colspecs = colspecs
implicit_returning = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this still work with older CrateDB versions which don't have RETURNING ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as long as you don't add a column with server_default=FetchedValue() which requires RETURNING`, e.g:

class LogOld(Base):
    __tablename__ = "logs_old"
    __table_args__ = {"schema": "doc"}

    ts = Column(DateTime, primary_key=True)
    level = Column(String, primary_key=True)
    message = Column(String, primary_key=True)

With the schema:

create table logs_old (ts timestamp with time zone, level text, message text)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If implicit_returning is False, it would never add the RETURNING clause to the INSERT statement.

I guess what you mean is that we should not change behaviour for old versions. ( it will be a different exception).

Copy link
Member

@mfussenegger mfussenegger Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to make sure that it doesn't add a RETURNING to other statements that previously worked

Could you add a changes entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to make sure that it doesn't add a RETURNING to other statements that previously worked

I will test a bit more.

Could you add a changes entry?

Sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an issue when running one of our test suites, where an INSERT statement would add a RETURNING clause. However cannot reproduce it when the model is used outside of the test suite.

Copy link
Contributor Author

@chaudum chaudum Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so as soon as one column of the PK has a server side generated value, the SQLAlchemy ORM will append a RETURNING (in order to get the full PK for the model instance).

Example that works at the moment, but would fail with this change:

from sqlalchemy import create_engine, Column, String, DateTime, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker


engine = create_engine(
    "crate://",
    connect_args={
        "username": "crate",
        "servers": "localhost:4200",
    },
)

Session = sessionmaker(bind=engine)
Base = declarative_base(bind=Session)


class Event(Base):
    __tablename__ = "event"
    __table_args__ = {"schema": "doc"}

    ts = Column("ts", DateTime, primary_key=True, server_default=text("current_timestamp"))
    event = Column("event", String, primary_key=True)


if __name__ == "__main__":
    session = Session()
    o = Event(event="status")
    session.add(o)
    session.commit()
    print(o)
CREATE TABLE doc.event (
  ts TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  event TEXT
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is a show stopper, because having the primary key in the model but not in the database is an awful workaround for not being able to use the FetchedValue() for _id. What do you think?

Copy link
Member

@mfussenegger mfussenegger Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can users override the implicit_returning value? If so, I think a (breaking?) changes entry that explains the impact of the change and how to work around it if necessary should be good enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can override it when creating the engine: create_engine('crate://',..., implicit_returning=False).


def __init__(self, *args, **kwargs):
super(CrateDialect, self).__init__(*args, **kwargs)
Expand Down