Skip to content

bpo-42064: Pass module state to trace, progress, and authorizer callbacks #27940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Sep 7, 2021

Conversation

erlend-aasland
Copy link
Contributor

@erlend-aasland erlend-aasland commented Aug 25, 2021

  • add print-or-clear traceback helper
  • add helpers to clear and visit saved contexts
  • modify callbacks to use the new callback_context struct

https://bugs.python.org/issue42064

@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Aug 25, 2021

On hold until #27934 is merged. There should be no conflicts with #27931.

This PR will need a rebase after #27934 is merged.

@erlend-aasland erlend-aasland marked this pull request as ready for review August 31, 2021 12:41
@erlend-aasland erlend-aasland requested a review from encukou August 31, 2021 12:41
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick:
SET_CALLBACK_CONTEXT and CLEAR_CALLBACK_CONTEXT can both be regular functions rather than macros, making them shorter and more readable.
That leaves VISIT_CALLBACK_CONTEXT, which I think is too robust for the 3 uses. But that's very much a personal opinion.

Comment on lines 182 to 184
SET_CALLBACK_CONTEXT(self->trace_ctx, NULL);
SET_CALLBACK_CONTEXT(self->progress_ctx, NULL);
SET_CALLBACK_CONTEXT(self->authorizer_ctx, NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three statements are named free_callback_contexts() below. But here they are problematic.

  • If this is a brand new connection, then PyType_GenericAlloc has set all these to NULL and they don't need to be NULLed again.
  • If someone is calling __init__ for the second time, this might free the Python callable, but SQLite will still have the pointer. Also the old db will leak, and who knows what'll happen to old cursors… Whoa, I don't see how reinitialization could work; should it rather raise an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had reinitialisation in mind, but as you say, it's hard to imagine how to do that properly anyway. Maybe implicit connection close at the start of __init__ will do the trick. I'll have to think that through. Perhaps we should fix the reinit case before proceeding with this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe implicit connection close at the start of __init__ will do the trick.

I would be surprised. Cursors seem to have hard (and reasonable!) assumptions about their connection's db.
I also don't see the use case – when would a new connection object not be enough?

Copy link
Contributor Author

@erlend-aasland erlend-aasland Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, true. Let's raise an exception instead.

I did some tests, and reinit actually seems to work pretty well using an implicit close in __init__, even with dangling cursors that fetch and execute after reinit. There's no refleaks on our side, and SQLite seems to be able to clean up as well. The only improvement would be to clear the callbacks from SQLite before the implicit close.

I also don't see the use case

Me neither.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no refleaks on our side, and SQLite seems to be able to clean up as well.

I'm surprised. Wouldn't this result in a sqlite3_open_v2 without a corresponding close?

Copy link
Contributor Author

@erlend-aasland erlend-aasland Sep 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I was a little bit vague. I did the tests with an added implicit close in __init__.

Here's the diff I used:

diff --git a/Modules/_sqlite/connection.c b/Modules/_sqlite/connection.c
index 63fd916662..961afe8de0 100644
--- a/Modules/_sqlite/connection.c
+++ b/Modules/_sqlite/connection.c
@@ -56,6 +56,7 @@ static const char * const begin_statements[] = {
 static int pysqlite_connection_set_isolation_level(pysqlite_Connection* self, PyObject* isolation_level, void *Py_UNUSED(ignored));
 static void _pysqlite_drop_unused_cursor_references(pysqlite_Connection* self);
 static void free_callback_context(callback_context *ctx);
+static void connection_close(pysqlite_Connection *self);
 
 static PyObject *
 new_statement_cache(pysqlite_Connection *self, int maxsize)
@@ -114,6 +115,18 @@ pysqlite_connection_init_impl(pysqlite_Connection *self,
         return -1;
     }
 
+    // Gracefully handle reinitialization
+    if (self->initialized) {
+        (void)sqlite3_set_authorizer(self->db, NULL, NULL);
+        sqlite3_progress_handler(self->db, 0, NULL, NULL);
+#ifdef HAVE_TRACE_V2
+        sqlite3_trace_v2(self->db, SQLITE_TRACE_STMT, NULL, NULL);
+#else
+        sqlite3_trace(self->db, NULL, NULL);
+#endif
+        connection_close(self);
+    }
+
     pysqlite_state *state = pysqlite_get_state_by_type(Py_TYPE(self));
     self->state = state;
 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Well, this is more complicated than I initially thought; I guess it should be a separate bpo issue.

I keep finding issues:

  • db is not set to NULL if init fails.
  • This segfaults for me:
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE foo (bar)')

try:
    conn.__init__('/bad-file/')
except sqlite3.OperationalError:
    pass

conn.execute('INSERT INTO foo (bar) VALUES (1), (2), (3), (4)')

... and even if that's fixed, I'm concerned that there will be other edge cases we can't reasonably think about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there's some issues in connection init. I'd gladly clean it up and harden it before continuing with this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a blocker for this PR; just please open a bpo so the issues won't be lost in a closed PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened issue 45126 for this.

@erlend-aasland
Copy link
Contributor Author

SET_CALLBACK_CONTEXT and CLEAR_CALLBACK_CONTEXT can both be regular functions rather than macros, making them shorter and more readable.
That leaves VISIT_CALLBACK_CONTEXT, which I think is too robust for the 3 uses. But that's very much a personal opinion.

  • VISIT_CALLBACK_CONTEXT cannot be implemented as a regular function since it uses Py_VISIT; I prefer to keep the macro
  • I'll change CLEAR_CALLBACK_CONTEXT to a regular function; I have no strong opinion about this :)
  • SET_CALLBACK_CONTEXT will IMO be more complex as a function, since it would take a pointer to a pointer as an argument; if you have a strong opinion about this, I'll of course change it

@erlend-aasland
Copy link
Contributor Author

NB, this needs to be rebased onto main after #28088 is merged, assuming of course that it'll be easier to land that PR.

@encukou
Copy link
Member

encukou commented Sep 7, 2021

will IMO be more complex as a function, since it would take a pointer to a pointer as an argument

Personally I'd prefer a type-safe function with a more complex argument, but as I said, this is nitpicking. Feel free to make them all macros for consistency, if you find it easier to read :)

I plan to review #28088 first.

@erlend-aasland
Copy link
Contributor Author

will IMO be more complex as a function, since it would take a pointer to a pointer as an argument

Personally I'd prefer a type-safe function with a more complex argument, but as I said, this is nitpicking. Feel free to make them all macros for consistency, if you find it easier to read :)

I'll keep VISIT_CALLBACK_CONTEXT function a macro in any case, but I'll think thrice about SET_CALLBACK_CONTEXT before deciding :)

I plan to review #28088 first.

Great, thanks!

BTW, I'm creating a bpo for cleaning up connection __init__; I've got a WIP branch lying around already, so you can expect a PR anytime soon :)

@erlend-aasland
Copy link
Contributor Author

All right, no more changes comin' up, unless you've got further remarks :)

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

@erlend-aasland
Copy link
Contributor Author

Looks good, thanks!

Likewise!

@encukou encukou merged commit 979336d into python:main Sep 7, 2021
@erlend-aasland erlend-aasland deleted the sqlite-callback-state/part2 branch September 7, 2021 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants