@kitalkuyo-gita
Contributor

What changes were proposed in this pull request?

Related to issue-368

How was this PR tested?

  • Tests have been added for the changes
  • Production environment verified

Contributor

@Leomrlin left a comment

The SAME predicate is primarily used to assert whether entities in a path are the same. Is the EXISTS syntax implemented in this PR?

Here is a syntax definition for reference.
<same predicate> ::= SAME <left paren> <element variable reference> <comma> <element variable reference> [ { <comma> <element variable reference> }... ] <right paren>

* SQL node representing a same predicate pattern in GQL.
* This node represents a pattern where two path patterns share a common predicate condition.
*
* <p>Example: MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25)
Contributor

The SAME predicate is primarily used to assert whether entities in a path are the same. Is the EXISTS syntax implemented here?

Contributor Author

There may be a misunderstanding here: this feature does not check whether entities are the same. It shares one condition across multiple path patterns.

*
* @return true if distinct, false if union all
*/
public boolean isDistinct() {
Contributor

The EXISTS SQL node doesn't need distinct or union, so copying them here is redundant.


// Unparse union operator
if (isDistinct) {
writer.print(" | ");
Contributor

Please make the corresponding changes here as well.

* MatchSamePredicate(left, right, condition, distinct) ->
* MatchFilter(MatchUnion(left, right, distinct), condition)
*/
public class SamePredicateOptimizationRule extends RelOptRule {
Contributor

What does this optimization rule mean?

Contributor Author

@kitalkuyo-gita commented Sep 5, 2025

Optimization Rule Detailed Explanation

1. Core Function of the Optimization Rule

This optimization rule is used to convert the SAME predicate pattern into a more efficient execution plan.

Conversion Process:

MatchSamePredicate(left, right, condition, distinct)
↓ (Optimized)
MatchFilter(MatchUnion(left, right, distinct), condition)

2. Why Is This Optimization Necessary?

Issues before optimization:

  • MatchSamePredicate is a special operator, requiring special execution logic.
  • The executor needs to understand the semantics of the SAME predicate, resulting in high implementation complexity.
  • Difficulty leveraging existing optimization rules (such as predicate pushdown and index optimization).

Benefits after optimization:

  • Converted to a standard Union + Filter combination, simplifying execution logic.
  • Can leverage existing optimization rules and indexes.
  • More standardized execution plans, facilitating further optimization.

3. Specific Conversion Example

Original SQL:

MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25)
RETURN a.id, b.id, c.id

Execution Plan before optimization:

MatchSamePredicate(
left: (a:person) -> (b),
right: (a:person) -> (c),
condition: a.age > 25,
distinct: true
)

Optimized Execution Plan:

MatchFilter(
condition: a.age > 25,
input: MatchUnion(
inputs: [(a:person) -> (b), (a:person) -> (c)],
all: false // distinct = true
)
)
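
The rewrite above can be sketched on a toy plan tree. This is only an illustration: the classes below are hypothetical stand-ins, not the project's MatchSamePredicate, MatchUnion, or MatchFilter nodes, and conditions are kept as plain strings.

```java
import java.util.List;

// Toy plan nodes for illustration only; NOT the project's RelNode classes.
final class SamePredicateRewriteSketch {

    interface Plan {}

    static final class Path implements Plan {
        final String pattern;
        Path(String pattern) { this.pattern = pattern; }
        @Override public String toString() { return pattern; }
    }

    static final class SamePredicate implements Plan {
        final Plan left;
        final Plan right;
        final String condition;
        final boolean distinct;
        SamePredicate(Plan left, Plan right, String condition, boolean distinct) {
            this.left = left;
            this.right = right;
            this.condition = condition;
            this.distinct = distinct;
        }
    }

    static final class Union implements Plan {
        final List<Plan> inputs;
        final boolean all;
        Union(List<Plan> inputs, boolean all) { this.inputs = inputs; this.all = all; }
        @Override public String toString() {
            return "MatchUnion(inputs: " + inputs + ", all: " + all + ")";
        }
    }

    static final class Filter implements Plan {
        final Plan input;
        final String condition;
        Filter(Plan input, String condition) { this.input = input; this.condition = condition; }
        @Override public String toString() {
            return "MatchFilter(condition: " + condition + ", input: " + input + ")";
        }
    }

    // SamePredicate(left, right, condition, distinct)
    //   -> Filter(Union(left, right, all = !distinct), condition)
    // Any other node is returned unchanged.
    static Plan rewrite(Plan plan) {
        if (!(plan instanceof SamePredicate)) {
            return plan;
        }
        SamePredicate same = (SamePredicate) plan;
        Plan union = new Union(List.of(same.left, same.right), !same.distinct);
        return new Filter(union, same.condition);
    }

    public static void main(String[] args) {
        Plan before = new SamePredicate(
            new Path("(a:person) -> (b)"), new Path("(a:person) -> (c)"), "a.age > 25", true);
        System.out.println(rewrite(before));
    }
}
```

Note that `distinct: true` on the SAME node maps to `all: false` on the union, which is the only place the flag is inverted.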

4. Practical Application Examples

Scenario 1: Basic Condition Sharing

-- Find people older than 25 who are both friends and colleagues
MATCH (a:person) -> (b:friend) | (a:person) -> (c:colleague) WHERE SAME(a.age > 25)

Before Optimization: Requires special handling of the SAME predicate
After Optimization:

  1. First, union the two paths: (a:person) -> (b:friend) | (a:person) -> (c:colleague)
  2. Then filter: a.age > 25

Scenario 2: Sharing Complex Conditions

-- Finding People Meeting Complex Conditions
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25 AND a.name = 'marko')

After optimization:

  1. Union: (a:person) -> (b) | (a:person) -> (c)
  2. Filter: a.age > 25 AND a.name = 'marko'

Scenario 3: DISTINCT Semantics

-- Using DISTINCT Semantics
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25) DISTINCT

After optimization:

  1. Union with DISTINCT: (a:person) -> (b) | (a:person) -> (c) (de-duplication)
  2. Filter: a.age > 25
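
To make the two steps concrete, here is a minimal in-memory sketch of "union, optionally de-duplicate, then filter". It is a simplification under stated assumptions: `Row` is a hypothetical flattened match (start vertex id, start vertex age, end vertex id), and the shared condition is hard-wired to an age threshold. The real executor works on paths, not on this class.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class UnionThenFilterSketch {

    // Hypothetical flattened match result: (a.id, a.age, id of the end vertex).
    static final class Row {
        final int aId;
        final int aAge;
        final int endId;

        Row(int aId, int aAge, int endId) {
            this.aId = aId;
            this.aAge = aAge;
            this.endId = endId;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Row)) {
                return false;
            }
            Row r = (Row) o;
            return aId == r.aId && aAge == r.aAge && endId == r.endId;
        }

        @Override public int hashCode() {
            return Objects.hash(aId, aAge, endId);
        }
    }

    // Step 1: union both inputs (dropping duplicates when distinct is true).
    // Step 2: keep only rows whose start vertex satisfies a.age > minAge.
    static List<Row> unionThenFilter(List<Row> left, List<Row> right, boolean distinct, int minAge) {
        List<Row> concatenated = new ArrayList<>(left);
        concatenated.addAll(right);
        Collection<Row> unioned = distinct ? new LinkedHashSet<Row>(concatenated) : concatenated;
        return unioned.stream()
            .filter(row -> row.aAge > minAge)
            .collect(Collectors.toList());
    }
}
```

With DISTINCT, a row produced by both paths survives once; without it, the row appears once per path that produced it.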

5. Specific Benefits of the Optimization

Performance Optimization:

  • Predicate Pushdown: Conditions can be pushed down before the union, reducing the amount of data.
  • Index Utilization: Existing indexes can be leveraged to accelerate condition filtering.
  • Parallel Execution: Union operations can execute two paths in parallel.

Code Simplification:

  • Executor Simplification: No special SAME predicate execution logic is required.
  • Maintainability Improvement: Standard union and filter operators are used.
  • Test Simplification: Existing union and filter test cases can be reused.

6. Optimization Rule Triggering Conditions

// This rule only applies to nodes of the MatchSamePredicate type.
super(operand(MatchSamePredicate.class, any()));

Triggering Time:

  • Query optimization phase
  • When the execution plan contains a MatchSamePredicate node
  • Automatically applied, no manual intervention required

7. Cooperation with Other Optimization Rules

MatchFilterMergeRule:

// Can further merge multiple filters.
MatchFilter(MatchFilter(input, condition1), condition2)
↓
MatchFilter(input, condition1 AND condition2)
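
A minimal sketch of such a merge on a toy tree, assuming conditions are plain strings. `FilterMergeSketch` and its nested classes are hypothetical and do not reproduce the project's MatchFilterMergeRule; parentheses are added so that operator precedence inside each condition is preserved.

```java
// Toy plan nodes for illustration only; NOT the project's MatchFilterMergeRule.
final class FilterMergeSketch {

    interface Plan {}

    static final class Scan implements Plan {
        final String name;
        Scan(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class Filter implements Plan {
        final Plan input;
        final String condition;
        Filter(Plan input, String condition) { this.input = input; this.condition = condition; }
        @Override public String toString() {
            return "MatchFilter(" + input + ", " + condition + ")";
        }
    }

    // Collapses a chain of directly nested filters into one filter whose
    // condition is the conjunction of all conditions, innermost first.
    static Plan mergeFilters(Plan plan) {
        if (!(plan instanceof Filter)) {
            return plan;
        }
        Filter outer = (Filter) plan;
        Plan input = mergeFilters(outer.input);
        if (input instanceof Filter) {
            Filter inner = (Filter) input;
            String merged = "(" + inner.condition + ") AND (" + outer.condition + ")";
            return new Filter(inner.input, merged);
        }
        return new Filter(input, outer.condition);
    }
}
```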

Predicate Pushdown Rule:

  • Can push filter conditions further down to the data source.
  • Reduces the amount of intermediate results.
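
As a rough illustration of pushdown below a union, the sketch below copies a filter into every union branch, but only when each branch binds all variables the condition uses. This guard is an assumption of the sketch, not something this PR shows: a condition such as `a.age > 25 AND b.id != c.id` mentions variables that no single branch binds, so the sketch leaves that plan unchanged. All class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy plan nodes for illustration only; NOT the project's pushdown rule.
final class FilterPushdownSketch {

    interface Plan {}

    static final class Path implements Plan {
        final String pattern;
        final Set<String> boundVars; // variables this path pattern binds
        Path(String pattern, Set<String> boundVars) {
            this.pattern = pattern;
            this.boundVars = boundVars;
        }
    }

    static final class Union implements Plan {
        final List<Plan> inputs;
        final boolean all;
        Union(List<Plan> inputs, boolean all) { this.inputs = inputs; this.all = all; }
    }

    static final class Filter implements Plan {
        final Plan input;
        final String condition;
        final Set<String> usedVars; // variables the condition refers to
        Filter(Plan input, String condition, Set<String> usedVars) {
            this.input = input;
            this.condition = condition;
            this.usedVars = usedVars;
        }
    }

    // Filter(Union(p1, p2, ...), c) -> Union(Filter(p1, c), Filter(p2, c), ...)
    // Returns the original filter when any branch does not bind every used variable.
    static Plan pushDown(Filter filter) {
        if (!(filter.input instanceof Union)) {
            return filter;
        }
        Union union = (Union) filter.input;
        List<Plan> pushed = new ArrayList<>();
        for (Plan branch : union.inputs) {
            boolean safe = branch instanceof Path
                && ((Path) branch).boundVars.containsAll(filter.usedVars);
            if (!safe) {
                return filter; // leave the plan unchanged
            }
            pushed.add(new Filter(branch, filter.condition, filter.usedVars));
        }
        return new Union(pushed, union.all);
    }
}
```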

8. Actual Execution Performance Comparison

Before optimization:

Execution plan: MatchSamePredicate
- Requires special execution logic
- Difficult to leverage indexes
- Complex execution path

After optimization:

Execution plan: MatchFilter -> MatchUnion
- Uses standard operators
- Can leverage index optimization
- Clear execution path
- Supports further optimization

This optimization rule embodies the core concept of the Query Optimizer: transforming complex semantics into simple, optimizable combinations of standard operators.

@kitalkuyo-gita
Contributor Author

The SAME predicate is primarily used to assert whether entities in a path are same. Is the EXISTS syntax implemented in this pr?

Here is a syntax definition for reference. <same predicate> ::= SAME <left paren> <element variable reference> <comma> <element variable reference> [ { <comma> <element variable reference> }... ] <right paren>

  • EXISTS syntax - already fully implemented

Usage: Check if a match exists for a certain path

Grammar Format:
EXISTS PathPattern

Actual SQL Example:

-- Check whether node b has an outgoing edge to node c
MATCH (a:person WHERE id = 1)-[e]->(b)
WHERE EXISTS (b) -> (c)
RETURN a, e, b

-- Check whether node b has an incoming edge
MATCH (a:person WHERE id = 1)-[e]->(b)
WHERE EXISTS (b) <- (c:person WHERE id != 1)
RETURN a, e, b

-- Negated existence check
MATCH (a:person WHERE id = 1)-[e]->(b)
WHERE NOT EXISTS (b) -> (c)
RETURN a, e, b

Test case evidence: the project already has complete EXISTS test cases:

  • gql_subquery_005.sql
  • gql_subquery_006.sql
  • gql_subquery_007.sql
  • gql_subquery_009.sql
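
For readers unfamiliar with the semantics, the existence check can be pictured as a filter over candidate vertices. The sketch below is an assumption-laden toy: it uses a plain adjacency map keyed by vertex id and is not how the project evaluates EXISTS subqueries.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ExistsSketch {

    // WHERE EXISTS (b) -> (c): keep candidates with at least one outgoing edge.
    static List<Integer> withOutgoingEdge(List<Integer> candidates,
                                          Map<Integer, List<Integer>> outEdges) {
        return candidates.stream()
            .filter(b -> !outEdges.getOrDefault(b, List.of()).isEmpty())
            .collect(Collectors.toList());
    }

    // WHERE NOT EXISTS (b) -> (c): keep candidates with no outgoing edge.
    static List<Integer> withoutOutgoingEdge(List<Integer> candidates,
                                             Map<Integer, List<Integer>> outEdges) {
        return candidates.stream()
            .filter(b -> outEdges.getOrDefault(b, List.of()).isEmpty())
            .collect(Collectors.toList());
    }
}
```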
  • SAME predicate - new feature implemented in this PR

Purpose: share the same condition among multiple path patterns

Grammar Format:
MATCH path1 | path2 WHERE SAME(condition)

Actual SQL Example:

-- Basic usage: both paths require a.age > 25
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25)
RETURN a.id as a_id, a.age as a_age, b.id as b_id, c.id as c_id

-- DISTINCT semantics are supported
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25) DISTINCT
RETURN a.id as a_id, a.age as a_age, b.id as b_id, c.id as c_id

-- Multiple path patterns are supported
MATCH (a:person) -> (b) | (a:person) -> (c) | (a:person) -> (d) WHERE SAME(a.age > 25)
RETURN a.id as a_id, a.age as a_age, b.id as b_id, c.id as c_id, d.id as d_id

-- Complex condition: a condition involving multiple variables
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25 AND b.id != c.id)
RETURN a.id as a_id, a.age as a_age, b.id as b_id, c.id as c_id

@kitalkuyo-gita
Contributor Author

kitalkuyo-gita commented Sep 5, 2025

The SAME predicate is primarily used to assert whether entities in a path are same. Is the EXISTS syntax implemented in this pr?

Here is a syntax definition for reference. <same predicate> ::= SAME <left paren> <element variable reference> <comma> <element variable reference> [ { <comma> <element variable reference> }... ] <right paren>

I think your point has some merit. However, entity comparison is rarely needed in practice, because the entities in a path are usually different. Compare the following examples.

SAME as you expected it (entity comparison):

-- Check whether a, b, and c are the same entity
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a, b, c)
-- Rarely needed in practice because the entities in a path are usually different

Actual implementation of SAME (Conditional Sharing):

-- Find people over 25 years old who are connected to both b and c at the same time
MATCH (a:person) -> (b) | (a:person) -> (c) WHERE SAME(a.age > 25)
-- This is very useful in practice, ensuring that multiple paths meet the same conditions

EXISTS (path existence):

-- Find people with friends
MATCH (a:person) WHERE EXISTS (a) -> (b:person)
-- Find people without friends
MATCH (a:person) WHERE NOT EXISTS (a) -> (b:person)
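
The two readings of SAME discussed in this thread can be contrasted in a few lines. This is a sketch under assumptions: element identity is reduced to a numeric id, rows are reduced to plain values, and neither method reflects the PR's actual classes. The entity-comparison reading follows the grammar the reviewer quoted, which takes two or more element variable references.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class SameSemanticsSketch {

    // Reading 1 (entity comparison): SAME(v1, v2, ...) holds when every
    // reference is bound to the same element. Ids stand in for elements here.
    // The quoted grammar requires at least two references; this sketch does
    // not enforce that.
    static boolean sameElement(List<Long> boundElementIds) {
        return boundElementIds.stream().distinct().count() <= 1;
    }

    // Reading 2 (condition sharing, as described by the PR author): one
    // condition is applied to the rows of every path pattern.
    static <T> List<List<T>> applyToEveryPath(List<List<T>> rowsPerPath, Predicate<T> condition) {
        return rowsPerPath.stream()
            .map(rows -> rows.stream().filter(condition).collect(Collectors.toList()))
            .collect(Collectors.toList());
    }
}
```

The first method answers a yes/no question about variable bindings, while the second narrows the result set of each path, which is why the two cannot share one syntax without ambiguity.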

@kitalkuyo-gita changed the title from "feat: support same predicate" to "feat: support shard predicate" on Nov 12, 2025