SQL Functions for Fields #87

MRichards99 · 2021-07-30T13:11:41Z

As per #86, I've made some changes to allow SQL functions to be added to fields. I came across an issue where my first commit broke queries with conditions using related fields, but I've since fixed that. I've run these changes against DataGateway API's tests (which cover related fields, 'ordinary' queries, included entities etc.) and these all pass.

You might feel these changes require a refactor, could be put in a more appropriate place within Python ICAT, or need unit tests and documentation from Python ICAT's side. Hopefully this is a good starting point, so you don't need to implement our request completely from scratch. I've created a PR so it's easier for you (and STFC colleagues) to see the status of this.

For awareness, I'm on leave next week (week commencing 2nd August) so I won't be able to respond/reply to comments.

- An example query for this use case is: `SELECT o FROM Investigation o WHERE UPPER(o.title) like '%toMAto%'`

- This is done by checking if a function is present before removing it

RKrahl · 2021-08-05T09:50:01Z

Sorry, @MRichards99, I haven't found the time yet to look into that and, to be honest, I can't promise when I will get around to it.

MRichards99 · 2021-08-09T10:25:52Z

Not a problem, I was on annual leave last week so I couldn't have made any progress anyway. Let me know when you do find some time :)

RKrahl

I have a few issues with your implementation:

First of all, I'd like to see one or more tests that demonstrate that what you want to achieve actually works. tests/test_06_query.py would be the appropriate place to add that.

Your implementation modifies the checks that are in place for attributes so that an SQL function is essentially allowed everywhere an attribute or a related object appears, even where this doesn't make sense. SQL functions are properly implemented only in the case of conditions; in all other cases, the result is an invalid query:

```python
# This is what you intended:
>>> query = Query(client, "Investigation", conditions={ "UPPER(title)": "like '%PolICE%'" })
>>> str(query)
"SELECT o FROM Investigation o WHERE UPPER(o.title) like '%PolICE%'"
# That was probably not intended:
>>> query = Query(client, "Investigation", includes=["UPPER(type)"])
>>> str(query)
'SELECT o FROM Investigation o INCLUDE o.UPPER(type)'
>>> query = Query(client, "Investigation", order=["UPPER(name)"])
>>> str(query)
'SELECT o FROM Investigation o ORDER BY o.UPPER(name)'
>>> query = Query(client, "Investigation", attributes=["UPPER(name)"])
>>> str(query)
'SELECT o.UPPER(name) FROM Investigation o'
>>> query = Query(client, "Investigation", join_specs={"UPPER(datasets)": "LEFT JOIN"}, order=["datasets.name"])
/net/home/jsi/bin/icatsh:1: QueryOneToManyOrderWarning: ordering on a one to many relation datasets may surprisingly affect the search result.
  #! /usr/bin/python3 -i
>>> str(query)
'SELECT o FROM Investigation o JOIN o.datasets AS s1 ORDER BY s1.name'
```

The current python-icat release throws an error like ValueError: Unknown attribute name 'UPPER(name)' in all of these cases. I'd suggest it's better to raise a clear error than to silently accept an SQL function in cases where it is not supported.

I'd therefore suggest that the implementation should deal with SQL functions in the higher level methods only, where it makes sense, rather than in the low level helpers. That would first require a decision on the cases in which SQL functions should be allowed:
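A minimal sketch of such a fail-fast check, for illustration only. The helper `require_plain_attribute` and the `KNOWN_ATTRIBUTES` set are hypothetical and not part of python-icat; a real check would consult the entity's schema information instead of a hard-coded set.

```python
import re

# Hypothetical stand-in for the attribute names of an entity;
# not read from a real ICAT schema.
KNOWN_ATTRIBUTES = {"name", "title", "type", "datasets"}

# Matches anything shaped like FUNC(...), e.g. "UPPER(name)".
_FUNCTION_CALL = re.compile(r"^\s*\w+\s*\(.*\)\s*$")


def require_plain_attribute(name, context):
    """Return name if it is a known plain attribute, else raise ValueError.

    context is only used for the error message, e.g. "includes" or "order".
    """
    if _FUNCTION_CALL.match(name):
        raise ValueError(
            "SQL function in '%s' is not supported in %s" % (name, context)
        )
    if name not in KNOWN_ATTRIBUTES:
        raise ValueError("Unknown attribute name '%s'" % name)
    return name


try:
    require_plain_attribute("UPPER(type)", "includes")
except ValueError as err:
    print(err)
```

With a check like this, `includes=["UPPER(type)"]` would be rejected up front rather than producing `INCLUDE o.UPPER(type)`.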

- in includes: this would certainly not make sense.
- in join_specs: certainly not.
- in attributes: maybe, but that would interfere with aggregate in a strange manner. Not sure if it is needed here anyway.
- in order: maybe, could make sense.
- in conditions: this is where you need it.

It would then require a review of the internal data structures to properly represent this.
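One possible shape for such a representation, as a rough sketch rather than a proposal for the actual internals: the function is split off once, at the level of the conditions, and stored next to the plain attribute name, so that low level helpers only ever see the attribute. The names `AttrRef`, `parse_condition_key` and `where_clause` are hypothetical, and the sketch ignores join aliases for related objects.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Exactly one function name wrapped around one (possibly dotted) attribute.
_FUNC_RE = re.compile(r"^(?P<func>[A-Za-z_]+)\((?P<attr>[A-Za-z_][\w.]*)\)$")


@dataclass(frozen=True)
class AttrRef:
    """An attribute path, optionally wrapped in one SQL function."""
    attr: str
    func: Optional[str] = None

    def render(self, alias="o"):
        path = "%s.%s" % (alias, self.attr)
        return "%s(%s)" % (self.func, path) if self.func else path


def parse_condition_key(key):
    """Split a condition key such as 'UPPER(title)' into function and attribute.

    Keys that do not match the pattern are passed on unchanged as a plain
    attribute, so the usual attribute check can still reject them later.
    """
    m = _FUNC_RE.match(key)
    if m is None:
        return AttrRef(attr=key)
    return AttrRef(attr=m.group("attr"), func=m.group("func"))


def where_clause(conditions):
    """Build a WHERE clause from a dict of condition key -> right hand side."""
    parts = ["%s %s" % (parse_condition_key(k).render(), v)
             for k, v in conditions.items()]
    return "WHERE " + " AND ".join(parts)


print(where_clause({"UPPER(title)": "like '%PolICE%'"}))
```

Because only the condition handling calls `parse_condition_key`, includes, join_specs and the other arguments would keep rejecting keys like `UPPER(name)` through the existing attribute check.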

Finally, we would also need documentation.

RKrahl · 2021-08-16T09:48:09Z

icat/query.py

        pattr = ""
+        if attrname.endswith(')'):
+            attrname = (attrname.split("("))[1].split(")")[0]
        for attr in attrname.split('.'):


I don't think this is the proper place to deal with that. _attrpath() is a low level helper with a well defined scope: to check an attribute name, iterating over the components of the dotted path to related objects as needed. It should not deal with higher level issues such as whether that attribute is used inside an SQL function. It is called from many places, and an SQL function would not make sense in all of them.

Also, I'm not sure about the way the attribute name and the SQL function are parsed here.
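To make the parsing concern concrete, here is a small sketch that isolates the split-based extraction from the diff above and compares it with a stricter, regex-based alternative. `split_function_strict` is a hypothetical helper, and rejecting nested functions is a simplifying assumption of the sketch, not a statement about what the query language allows.

```python
import re


def strip_function_by_split(attrname):
    """The split-based extraction from the diff, isolated for experiments."""
    if attrname.endswith(")"):
        attrname = (attrname.split("("))[1].split(")")[0]
    return attrname


# Hypothetical stricter alternative: exactly one function around
# one dotted attribute path, nothing else.
_FUNC_RE = re.compile(r"^([A-Za-z_]+)\(([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\)$")


def split_function_strict(attrname):
    """Return (function, attribute); function is None for a plain attribute."""
    m = _FUNC_RE.match(attrname)
    if m:
        return m.group(1), m.group(2)
    if "(" in attrname or ")" in attrname:
        raise ValueError("Cannot parse attribute expression '%s'" % attrname)
    return None, attrname


print(strip_function_by_split("UPPER(title)"))        # title
print(strip_function_by_split("LOWER(TRIM(title))"))  # TRIM
print(split_function_strict("UPPER(title)"))          # ('UPPER', 'title')
```

The split-based version discards the function name, returns the inner function name instead of the attribute for a nested call, and raises an IndexError for a stray closing parenthesis such as `title)`. The strict version either returns both parts or raises a ValueError that names the offending input.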

RKrahl · 2021-08-16T14:29:42Z

icat/query.py

            if i < 0:
                continue
+            if obj.endswith(')'):
+                obj = (obj.split("("))[1].split(")")[0]


Same comment as above: _makesubst() should not need to care about SQL functions.

RKrahl · 2021-08-16T15:00:16Z

@MRichards99, one additional comment: the review comment above doesn't mean that you have to do all that alone. I may also have a look at it. However, I will be on annual leave starting on Wednesday and will not be able to do that before September.

Another question: you made the PR against the develop branch, which would mean targeting it for version 1.0. If you need it more urgently, we could also backport the implementation once it's done and make a 0.20.0 release for it. On the other hand, the only thing that holds back 1.0 at the moment is icat.server: I wanted to include support for upcoming schema changes in 1.0, ref. #73. Since the discussion on the schema is currently making progress, there is a chance that python-icat 1.0 can also be released soon.

MRichards99 · 2021-08-20T11:15:40Z

I'll look into this further over the next couple of weeks. Having read your comments (which I agree with; the current state of the changes was more of a 'rough demonstration' than merge-able code), I'll try to add this functionality at a higher level, perhaps in its own function that can be called in addConditions. I will also think about DataGateway use cases and see if it makes sense elsewhere.

I branched off the develop branch because I think it's the default branch. I'm happy for it to be in 0.20.0 rather than 1.0, which branch do I need to use for that?

MRichards99 · 2021-10-04T08:16:52Z

@RKrahl which branch (master or develop) would you like me to branch off to work against? I have no reason for this to only be present in version 1.0; having this feature in 0.20.0 would suffice.

RKrahl · 2021-10-04T13:49:49Z

@MRichards99, you don't need to change anything in the branches. My comment above was meant to be a question on how urgent that change is.

The develop branch already targets version 1.0. That may even be fine for you, because the only thing I'm still waiting for before that release is icat.server 5.0. And since we have now made good progress there with the schema decisions, that release may come soon anyway.

If you can't wait for 1.0 and need the SQL functions more urgently than that, we can still backport it to master and make a quick intermediate release 0.20.

agbeltran · 2021-10-04T14:22:49Z

@RKrahl @MRichards99 Ideally we would like a quick release in 0.20 if possible, so that we can incorporate this fix in DataGateway soon.

MRichards99 · 2021-10-05T10:47:40Z

> @RKrahl @MRichards99 Ideally we would like a quick release in 0.20 if possible, so that we can incorporate this fix in DataGateway soon.

In this case, I might create a different branch off master just to make the process of releasing 0.20.0 a bit easier (rather than merging this branch into master, as I noticed there were a few differences). It might help me to start from scratch too.

MRichards99 · 2021-10-07T17:00:40Z

I'm converting this to a draft PR (and unlinking #86) as I don't intend for this to be merged, but it might be useful for all of us to look back at this PR for the discussion.

RKrahl · 2021-10-12T15:13:24Z

Btw., I suggest we should keep the link with #86 for the sake of documentation but close this one after the merge of #89.

MRichards99 added 2 commits July 26, 2021 16:14

Allow SQL functions to not cause errors from Python ICAT

cbc9e32

- An example query for this use case is: `SELECT o FROM Investigation o WHERE UPPER(o.title) like '%toMAto%'`

Fix queries without SQL functions on fields

4699ee2

- This is done by checking if a function is present before removing it

MRichards99 requested a review from RKrahl July 30, 2021 13:11

MRichards99 mentioned this pull request Jul 30, 2021

Instrument search is case-sensitive ral-facilities/datagateway#731

Closed

1 task

RKrahl linked an issue Aug 16, 2021 that may be closed by this pull request

Allow SQL Functions to be used on Fields #86

Closed

RKrahl requested changes Aug 16, 2021

RKrahl added the enhancement New feature or request label Aug 16, 2021

MRichards99 removed a link to an issue Oct 7, 2021

Allow SQL Functions to be used on Fields #86

Closed

MRichards99 marked this pull request as draft October 7, 2021 17:00

RKrahl mentioned this pull request Oct 12, 2021

Support for JPQL Functions on Query Conditions #89

Merged

RKrahl linked an issue Oct 12, 2021 that may be closed by this pull request

Allow SQL Functions to be used on Fields #86

Closed

RKrahl closed this Oct 29, 2021

RKrahl deleted the sql-functions-for-fields branch October 29, 2021 19:36
