Don't return empty routing key when partition key is unbound #1620

Merged: 1 commit into apache:4.x on Sep 3, 2024

Conversation

akhaku
Contributor

@akhaku akhaku commented Nov 21, 2022

DefaultBoundStatement#getRoutingKey has logic to infer the routing key when no one has explicitly called setRoutingKey or otherwise set the routing key on the statement. However, it doesn't check whether the partition key has actually been bound, so on a statement where nothing has been bound yet it returns an empty ByteBuffer instead of null.
This gets worse if the user obtains a BoundStatementBuilder from the PreparedStatement, sets some fields on it, and then copies it by constructing new BoundStatementBuilder objects with the BoundStatement as a parameter: the empty ByteBuffer is copied to every bound statement, so under a token-aware load balancing policy all requests are targeted at the same Cassandra node.
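
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the driver's code: `MiniBoundStatement`, its `UNSET` sentinel and the two `routingKey...` methods are hypothetical stand-ins that only mimic the single-partition-key case described above, under the assumption that an unbound variable is represented by an empty placeholder buffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a bound statement; NOT the driver's DefaultBoundStatement.
final class MiniBoundStatement {
  // Assumption: an unbound variable is represented by a shared empty placeholder buffer.
  static final ByteBuffer UNSET = ByteBuffer.allocate(0);

  private final ByteBuffer[] values;
  private final int partitionKeyIndex;

  MiniBoundStatement(int variableCount, int partitionKeyIndex) {
    this.values = new ByteBuffer[variableCount];
    Arrays.fill(values, UNSET); // nothing is bound yet
    this.partitionKeyIndex = partitionKeyIndex;
  }

  void set(int index, String value) {
    values[index] = ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
  }

  boolean isSet(int index) {
    return values[index] != UNSET; // identity check against the sentinel
  }

  // Behaviour described as the bug: the placeholder leaks out as a routing key.
  ByteBuffer routingKeyBeforeFix() {
    return values[partitionKeyIndex];
  }

  // Behaviour after the fix: no bound partition key means no routing key.
  ByteBuffer routingKeyAfterFix() {
    return isSet(partitionKeyIndex) ? values[partitionKeyIndex] : null;
  }

  public static void main(String[] args) {
    MiniBoundStatement unbound = new MiniBoundStatement(1, 0);
    System.out.println(unbound.routingKeyBeforeFix().hasRemaining()); // false: an empty key
    System.out.println(unbound.routingKeyAfterFix());                 // null
  }
}
```

In this model every unbound statement yields the same empty key before the fix, which is what would make a token-aware policy pick the same replicas for all of them; returning null instead presumably lets the policy fall back to its non-token-aware behaviour.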

assertThat(copy.getRoutingKey().hasRemaining()).isTrue();
}

// copied from RequestLogFormatterTest, we should move somewhere to share b/w tests
Contributor Author

hacked the test together by copying this code from RequestLogFormatterTest, I'm looking to you folks for advice on where this code should actually live

Contributor

This functionality is probably something that's more easily validated in an integration test. PreparedStatementIT already does some validation of the ability to compute routing information when the partition key is bound; a test for this functionality should slot in next to it pretty nicely.

I was doing some testing with something like the following:

  @Test
  public void should_return_null_routing_information_when_single_partition_key_is_unbound() {
    should_return_null_routing_information_when_single_partition_key_is_unbound(
            "SELECT a FROM prepared_statement_test WHERE a = ?");
    should_return_null_routing_information_when_single_partition_key_is_unbound(
            "INSERT INTO prepared_statement_test (a) VALUES (?)");
    should_return_null_routing_information_when_single_partition_key_is_unbound(
            "UPDATE prepared_statement_test SET b = 1 WHERE a = ?");
    should_return_null_routing_information_when_single_partition_key_is_unbound(
            "DELETE FROM prepared_statement_test WHERE a = ?");
  }

  private void should_return_null_routing_information_when_single_partition_key_is_unbound(String queryString) {

    CqlSession session = sessionRule.session();
    BoundStatement boundStatement = session.prepare(queryString).bind();
    assertThat(boundStatement.getRoutingKey()).isNull();
  }

This test fails with the existing 4.x code but passes with the changes referenced above.

Contributor Author

thank you, much cleaner than my hacked-together test!

Contributor Author

And for simplicity I'm going to go with exactly what you have here and not worry about the extra copying tests

@@ -358,7 +358,7 @@ public ByteBuffer getRoutingKey() {
       if (indices.isEmpty()) {
         return null;
       } else if (indices.size() == 1) {
-        return getBytesUnsafe(indices.get(0));
+        return isSet(0) ? getBytesUnsafe(indices.get(0)) : null;
Contributor Author

probably worth noting that the multi-column partition key index code branch has already fixed this bug at line 367 below

Contributor

@tolbertam tolbertam Aug 27, 2024

this exactly ^. On initial look, I came to the same conclusion and felt that you could even just delete this branch and fall into the else and the behavior matches this change exactly, but I think this branch sort of exists as an optimization (avoids creating an array), so 👍 to accept your change as it is.

@akhaku
Contributor Author

akhaku commented Nov 21, 2022

More related arguments we could make since we can re-bind the partition key:

  1. we should never copy over getRoutingKey while copying the template at StatementBuilder:71
  2. OR we should copy over the raw instance variable, not the computed value
  3. OR we should only generate the routing key when evaluating the statement in the load balancing policy, if Request#getRoutingKey returns null at BasicLoadBalancingPolicy:290

Between all of those, the simplest solution is probably 1. The second would be nice too but there's no easy way to get the instance variable (save casting to a DefaultBoundStatement and perhaps exposing another getter), and I can't think of a non-brittle way to do the third one.

Worth noting that as far as I can tell, DefaultBoundStatement#getRoutingKeyspace doesn't have this problem because the field that's used to calculate its value (the prepared statement) can't change.

In any case, probably a different PR for that follow-up change?

@absurdfarce absurdfarce self-requested a review December 2, 2022 21:13
@@ -358,7 +358,7 @@ public ByteBuffer getRoutingKey() {
       if (indices.isEmpty()) {
         return null;
       } else if (indices.size() == 1) {
-        return getBytesUnsafe(indices.get(0));
+        return isSet(0) ? getBytesUnsafe(indices.get(0)) : null;
Contributor

I don't think you can use "isSet(0)" here. The first index could be somewhere else depending on the query... pretty sure this needs to be:

      } else if (indices.size() == 1) {
        int index = indices.get(0);
        return isSet(index) ? getBytesUnsafe(index) : null;
      } else {

Contributor Author

ah right, sorry I missed that, thanks
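
The difference between checking position 0 and checking the partition key's own position can be shown with a small standalone sketch. Everything here is hypothetical (the class, the `UNSET` sentinel, the example statement in the comment); it only models a statement whose single partition key is not the first bind variable.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the single-partition-key branch; not the driver's code.
final class PartitionKeyIndexSketch {
  // Assumption: an unbound variable is represented by a shared empty placeholder buffer.
  static final ByteBuffer UNSET = ByteBuffer.allocate(0);

  // First revision of the patch: checks variable 0 wherever the partition key lives.
  static ByteBuffer routingKeyCheckingPositionZero(ByteBuffer[] values, List<Integer> indices) {
    return values[0] != UNSET ? values[indices.get(0)] : null;
  }

  // Reviewer's suggestion: check the partition key's own position.
  static ByteBuffer routingKeyCheckingKeyPosition(ByteBuffer[] values, List<Integer> indices) {
    int index = indices.get(0);
    return values[index] != UNSET ? values[index] : null;
  }

  public static void main(String[] args) {
    // Hypothetical statement: UPDATE t SET b = ? WHERE a = ?
    // Variable 0 is b (bound), variable 1 is the partition key a (still unbound).
    ByteBuffer[] values = {ByteBuffer.wrap(new byte[] {1}), UNSET};
    List<Integer> indices = Collections.singletonList(1);

    System.out.println(routingKeyCheckingPositionZero(values, indices) == UNSET); // true
    System.out.println(routingKeyCheckingKeyPosition(values, indices));           // null
  }
}
```

With `b` bound and `a` unbound, the position-0 check still hands back the empty placeholder, while the index-based check returns null as intended.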

@absurdfarce
Contributor

Thanks for this @akhaku, and my apologies for taking so long to get back to you... holidays and all that! This change looks pretty good; I had a few comments on your current implementation but I don't think any of them are major. I haven't yet given a great deal of thought to your points about handling the cases around the builder... I want to think about that a bit more.

@akhaku
Contributor Author

akhaku commented Dec 3, 2022

No worries, thanks for looking at this! The other comments are much less urgent, I think this is a somewhat serious bug (if you happen to hit it) and so good to get this fixed before worrying about my other comments about cleanup.

@akhaku
Contributor Author

akhaku commented Dec 16, 2022

Gentle nudge @absurdfarce :)

@absurdfarce
Contributor

Hey @akhaku, apologies for the delay in getting back to you!

I'm inclined to agree with your analysis above; I think we're probably okay to merge this change as it stands and address the questions about copying values in another PR. With that in mind I think this PR is ready to go, so I'll try to get it merged later today and create a follow-up ticket for the other questions.

Thanks for all your work on this!

@absurdfarce
Contributor

Hey @akhaku, regarding the follow-up ticket... I think I need to understand the case in question a bit better.

You mentioned above that we can "re-bind the partition key" for the DefaultPreparedStatement referenced by a given DefaultBoundStatement. For my own understanding... is the concern here that the List returned by getPartitionKeyIndices() is mutable? Or are you referring to another operation that I'm just not thinking of at the moment?

I'm also a bit confused by your (correct) point that DefaultBoundStatement.getRoutingKeyspace() uses a field (preparedStatement) in DefaultBoundStatement which cannot change. What confuses me is that getRoutingKey() references the same field if a fixed routing key wasn't supplied when the instance was constructed. So the only difference there is a difference in return value (ColumnDefinitions via getVariableDefinitions() in the routing keyspace case, a List of Integers via getPartitionKeyIndices in the routing key case).... which returns me to my earlier question about mutability.

All of that said, the second option on your list ("we should copy over the raw instance variable, not the computed value") seems most desirable (at least given my current understanding). If the user has explicitly specified a routing key preserve it, otherwise allow it to be computed as it normally would.
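
A rough sketch of what that second option could look like, compared with copying the computed value. This is a deliberately simplified, hypothetical model (one partition key, two fields, made-up `copyComputed`/`copyRaw` helpers); it does not reflect how StatementBuilder or DefaultBoundStatement are actually structured.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical model: one explicit routing key field plus one bound partition key value.
final class RoutingKeyCopySketch {
  ByteBuffer explicitRoutingKey; // null unless the user supplied a routing key
  ByteBuffer partitionKeyValue;  // null while the partition key is unbound

  // Explicit key wins; otherwise infer the key from the bound partition key.
  ByteBuffer getRoutingKey() {
    return explicitRoutingKey != null ? explicitRoutingKey : partitionKeyValue;
  }

  // Copies the computed value: the inferred key is frozen into the copy.
  static RoutingKeyCopySketch copyComputed(RoutingKeyCopySketch template) {
    RoutingKeyCopySketch copy = new RoutingKeyCopySketch();
    copy.partitionKeyValue = template.partitionKeyValue;
    copy.explicitRoutingKey = template.getRoutingKey();
    return copy;
  }

  // Option 2: copies the raw field, so inference keeps working in the copy.
  static RoutingKeyCopySketch copyRaw(RoutingKeyCopySketch template) {
    RoutingKeyCopySketch copy = new RoutingKeyCopySketch();
    copy.partitionKeyValue = template.partitionKeyValue;
    copy.explicitRoutingKey = template.explicitRoutingKey;
    return copy;
  }

  static ByteBuffer bytes(String s) {
    return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    RoutingKeyCopySketch template = new RoutingKeyCopySketch();
    template.partitionKeyValue = bytes("k1");

    RoutingKeyCopySketch frozen = copyComputed(template);
    RoutingKeyCopySketch raw = copyRaw(template);
    frozen.partitionKeyValue = bytes("k2"); // re-bind the partition key in each copy
    raw.partitionKeyValue = bytes("k2");

    System.out.println(frozen.getRoutingKey().equals(bytes("k1"))); // true: stale key
    System.out.println(raw.getRoutingKey().equals(bytes("k2")));    // true: follows the re-bind
  }
}
```

In this model an explicitly supplied routing key survives either kind of copy; only the inferred key behaves differently, which is the case the discussion above is about.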

@akhaku
Contributor Author

akhaku commented Apr 24, 2023

Just a rebase.
It's been a while since I looked at this, so based on my fuzzy recollection of what I was talking about:

Regarding rebinding the partition key: you can use SettableByName or SettableById to reset the partition key after it's already set. If we had the old routing key cached, then after changing the bound parameter the cached routing key would no longer be accurate, and we would need to recompute the routing key based on the new bound partition key.

That also sort of answers your question about DefaultBoundStatement#getRoutingKeyspace - while we can change the underlying information that's used to calculate the routing key (the bound parameter for the partition key), we cannot change the parameter that is used to figure out the routing keyspace.

@tolbertam tolbertam self-requested a review August 27, 2024 00:08
Contributor

@tolbertam tolbertam left a comment

👍 looks good to me; The behavior appears to make this consistent with > 1 partition key. If any partition key value is unset, this returns null, so seems correct to also return null in this case.
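
The consistency described here can be sketched as follows. This is a hypothetical, standalone model rather than the driver's implementation: the `UNSET` sentinel is an assumption, and `compose` assumes the usual composite layout of a 2-byte length, the component bytes, and a trailing zero byte per component.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical model of routing key inference for one or several partition key columns.
final class CompositeRoutingKeySketch {
  // Assumption: an unbound variable is represented by a shared empty placeholder buffer.
  static final ByteBuffer UNSET = ByteBuffer.allocate(0);

  // Assumed composite layout: [2-byte length][bytes][0x00] for each component.
  static ByteBuffer compose(ByteBuffer[] components) {
    int size = 0;
    for (ByteBuffer c : components) {
      size += 2 + c.remaining() + 1;
    }
    ByteBuffer out = ByteBuffer.allocate(size);
    for (ByteBuffer c : components) {
      out.putShort((short) c.remaining());
      out.put(c.duplicate()); // duplicate so the caller's buffer position is untouched
      out.put((byte) 0);
    }
    out.flip();
    return out;
  }

  static ByteBuffer routingKey(ByteBuffer[] values, List<Integer> indices) {
    if (indices.isEmpty()) {
      return null;
    } else if (indices.size() == 1) {
      // Single-column shortcut: no array needed, but the same "unset means null" rule.
      int index = indices.get(0);
      return values[index] != UNSET ? values[index] : null;
    } else {
      ByteBuffer[] components = new ByteBuffer[indices.size()];
      for (int i = 0; i < components.length; i++) {
        ByteBuffer value = values[indices.get(i)];
        if (value == UNSET) {
          return null; // any unset component means no routing key
        }
        components[i] = value;
      }
      return compose(components);
    }
  }
}
```

For example, with components of 1 and 2 bytes the composed key is 9 bytes long, and leaving either component unset yields null, just as the single-column branch does after the change.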

@absurdfarce
Contributor

This has been sitting for far too long and the core change here is a good one. We can and should continue any discussion around the handling of getRoutingKey() in DefaultBoundStatement but I want to make sure the core change of this PR gets into 4.19.0.

@absurdfarce absurdfarce merged commit 103bb96 into apache:4.x Sep 3, 2024
absurdfarce pushed a commit that referenced this pull request Sep 3, 2024
DefaultBoundStatement#getRoutingKey has logic to infer the
routing key when no one has explicitly called setRoutingKey
or otherwise set the routing key on the statement.
It however doesn't check for cases where nothing has been
bound yet on the statement.
This causes more problems if the user decides to get a
BoundStatementBuilder from the PreparedStatement, set some
fields on it, and then copy it by constructing new
BoundStatementBuilder objects with the BoundStatement as a
parameter, since the empty ByteBuffer gets copied to all
bound statements, resulting in all requests being targeted
to the same Cassandra node in a token-aware load balancing
policy.

patch by Ammar Khaku; reviewed by Andy Tolbert, and Bret McGuire
reference: #1620
@akhaku akhaku deleted the routingKey branch September 17, 2024 04:48
dkropachev added a commit to scylladb/java-driver that referenced this pull request Mar 19, 2025
* CASSANDRA-19635: Run integration tests with C* 5.x

patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for CASSANDRA-19635

* CASSANDRA-19635: Configure Jenkins to run integration tests with C* 5.x

patch by Lukasz Antoniak; reviewed by Bret McGuire for CASSANDRA-19635

* update badge URL to org.apache.cassandra/java-driver-core

* Limit calls to Conversions.resolveExecutionProfile

Those repeated calls account for a non-negligible portion of my application
CPU (0.6%) and can definitely be a final field so that it gets resolved only
once per CqlRequestHandler.

patch by Benoit Tellier; reviewed by Andy Tolbert, and Bret McGuire
reference: apache#1623

* autolink JIRA tickets in commit messages

patch by Stefan Miklosovic; reviewed by Michael Semb Wever for CASSANDRA-19854

* Don't return empty routing key when partition key is unbound

DefaultBoundStatement#getRoutingKey has logic to infer the
routing key when no one has explicitly called setRoutingKey
or otherwise set the routing key on the statement.
It however doesn't check for cases where nothing has been
bound yet on the statement.
This causes more problems if the user decides to get a
BoundStatementBuilder from the PreparedStatement, set some
fields on it, and then copy it by constructing new
BoundStatementBuilder objects with the BoundStatement as a
parameter, since the empty ByteBuffer gets copied to all
bound statements, resulting in all requests being targeted
to the same Cassandra node in a token-aware load balancing
policy.

patch by Ammar Khaku; reviewed by Andy Tolbert, and Bret McGuire
reference: apache#1620

* JAVA-3167: CompletableFutures.allSuccessful() may return never completed future

patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for JAVA-3167

* ninja-fix Various test fixes

* Run integration tests with DSE 6.9.0

patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1955

* JAVA-3117: Call CcmCustomRule#after if CcmCustomRule#before fails to allow subsequent tests to run

patch by Henry Hughes; reviewed by Alexandre Dutra and Andy Tolbert for JAVA-3117

* JAVA-3149: Support request cancellation in request throttler
patch by Lukasz Antoniak; reviewed by Andy Tolbert and Chris Lohfink for JAVA-3149

* Fix C* 3.0 tests failing on Jenkins
patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1939

* Reduce lock held duration in ConcurrencyLimitingRequestThrottler

It might take some (small) time for callback handling when the
throttler request proceeds to submission.

Before this change, the throttler proceed request will happen while
holding the lock, preventing other tasks from proceeding when there is
spare capacity and even preventing tasks from enqueuing until the
callback completes.

By tracking the expected outcome, we can perform the callback outside
of the lock. This means that request registration and submission can
proceed even when a long callback is being processed.

patch by Jason Koch; Reviewed by Andy Tolbert and Chris Lohfink for CASSANDRA-19922

* Annotate BatchStatement, Statement, SimpleStatement methods with CheckReturnValue

Since the driver's default implementation is for
BatchStatement and SimpleStatement methods to be immutable,
we should annotate those methods with @CheckReturnValue.
Statement#setNowInSeconds implementations are immutable so
annotate that too.

patch by Ammar Khaku; reviewed by Andy Tolbert and Bret McGuire
reference: apache#1607

* Remove "beta" support for Java17 from docs

patch by Bret McGuire; reviewed by Andy Tolbert and Alexandre Dutra
reference: apache#1962

* Fix uncaught exception during graceful channel shutdown

after exceeding max orphan ids

patch by Christian Aistleitner; reviewed by Andy Tolbert, and Bret McGuire for apache#1938

* Build a public CI for Apache Cassandra Java Driver

 patch by Siyao (Jane) He; reviewed by Mick Semb Wever for CASSANDRA-19832

* CASSANDRA-19932: Allow to define extensions while creating table
patch by Lukasz Antoniak; reviewed by Bret McGuire and Chris Lohfink

* Fix DefaultSslEngineFactory missing null check on close

patch by Abe Ratnofsky; reviewed by Andy Tolbert and Chris Lohfink for CASSANDRA-20001

* Query builder support for NOT CQL syntax

patch by Bret McGuire; reviewed by Bret McGuire and Andy Tolbert for CASSANDRA-19930

* Fix CustomCcmRule to drop `CURRENT` flag no matter what

If super.after() throws an exception, the `CURRENT` flag is never dropped,
which causes subsequent tests to fail with IllegalStateException("Attempting to use a Ccm rule while another is in use.  This is disallowed")

Patch by Dmitry Kropachev; reviewed by Andy Tolbert and Bret McGuire for JAVA-3117

* JAVA-3051: Memory leak

patch by Jane He; reviewed by Alexandre Dutra and Bret McGuire for JAVA-3051

* Automate latest Cassandra versions when running CI

 patch by Siyao (Jane) He; reviewed by Mick Semb Wever for CASSJAVA-25

* Refactor integration tests to support multiple C* distributions. Test with DataStax HCD 1.0.0

patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1958

* Fix TableMetadata.describe() when containing a vector column

patch by Stefan Miklosovic; reviewed by Bret McGuire for CASSJAVA-2

* Move Apache Cassandra 5.x off of beta1 and remove some older Apache Cassandra versions.

patch by Bret McGuire; reviewed by Bret McGuire for CASSJAVA-54

* Update link to Jira to be CASSJAVA

Updating the link to Jira.

Previously we had a component in the CASSANDRA Jira project but now we have a project for each driver - in the case of Java, it's CASSJAVA.

Added CASSJAVA to .asf.yaml

patch by Jeremy Hanna; reviewed by Bret McGuire for CASSJAVA-61

* Move DataStax shaded Guava module into Java driver

patch by Lukasz Antoniak; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-52

* JAVA-3057 Allow decoding a UDT that has more fields than expected

patch by Ammar Khaku; reviewed by Andy Tolbert and Bret McGuire
reference: apache#1635

* CASSJAVA-55 Remove setting "Host" header for metadata requests.

With some sysprops enabled this will actually be respected which completely borks Astra routing.

patch by Bret McGuire; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-55

* JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder
patch by Jane He; reviewed by Mick Semb Wever and Bret McGuire for JAVA-3118
reference: apache#1931

* Upgrade Guava to 33.3.1-jre

patch by Lukasz Antoniak; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-53

* Do not always cleanup Guava shaded module before packaging

* Revert "Do not always cleanup Guava shaded module before packaging"

This reverts commit 5be52ec.

* Conditionally compile shaded Guava module

* JAVA-3143: Extend driver vector support to arbitrary subtypes and fix handling of variable length types (OSS C* 5.0)

patch by Jane He; reviewed by Bret McGuire and João Reis
reference: apache#1952

* JAVA-3168 Copy node info for contact points on initial node refresh only from first match by endpoint

patch by Alex Sasnouskikh; reviewed by Andy Tolbert and Alexandre Dutra for JAVA-3168

* JAVA-3055: Prevent PreparedStatement cache to be polluted if a request is cancelled.

There was a critical issue when external code cancels a request: the cached CompletableFuture will then always throw a CancellationException.
This may happen, for example, when the driver is used with reactive operators like Mono.zip or Mono.firstWithValue.

patch by Luc Boutier; reviewed by Alexandre Dutra and Bret McGuire
reference: apache#1757

* Expose a decorator for CqlPrepareAsyncProcessor cache rather than the ability to specify an
arbitrary cache from scratch.

Also bringing tests from apache#2003 forward
with a few minor changes due to this implementation

patch by Bret McGuire; reviewed by Bret McGuire and Andy Tolbert
reference: apache#2008

* ninja-fix Using shaded Guava classes for import in order to make OSGi class paths happy.

Major hat tip to Dmitry Konstantinov for the find here!

* Changelog updates for 4.19.0

* [maven-release-plugin] prepare release 4.19.0

* [maven-release-plugin] prepare for next development iteration

* Specify maven-clean-plugin version

Sets the version to 3.4.1 in parent pom.
Having it unspecified causes the following warning:
```
[WARNING] Some problems were encountered while building the effective model for com.scylladb:java-driver-guava-shaded:jar:4.19.0.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-clean-plugin is missing. @ line 97, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
```

* Install guava-shaded before running core's compile in CI

Without this module available "Build" and "Unit tests" job fail with
"package <x> does not exist" or "cannot find symbol" pointing to
`[...].shaded.guava.[...]` packages.

* Remove exception catch in `prepared_stmt_metadata_update_loopholes_test`

Since incorporating "JAVA-3057 Allow decoding a UDT that has more fields than
expected" the assumptions of the removed check are no longer valid.

* Switch shaded guava's groupId in osgi-tests

Switches from `org.apache.cassandra` to `com.scylladb`
in `BundleOptions#commonBundles()`.

* Typo: increase line's loglevel in CcmBridge

---------

Co-authored-by: Lukasz Antoniak
Co-authored-by: Brad Schoening
Co-authored-by: Benoit Tellier
Co-authored-by: Stefan Miklosovic
Co-authored-by: Ammar Khaku
Co-authored-by: absurdfarce
Co-authored-by: Henry Hughes
Co-authored-by: Jason Koch
Co-authored-by: Christian Aistleitner
Co-authored-by: janehe
Co-authored-by: Abe Ratnofsky
Co-authored-by: absurdfarce
Co-authored-by: Dmitry Kropachev
Co-authored-by: janehe
Co-authored-by: Jeremy Hanna
Co-authored-by: SiyaoIsHiding
Co-authored-by: Alex Sasnouskikh
Co-authored-by: Luc Boutier
3 participants