Skip to content

Feature request: improved hook support #623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bart-degreed opened this issue Nov 15, 2019 · 12 comments
Closed

Feature request: improved hook support #623

bart-degreed opened this issue Nov 15, 2019 · 12 comments

Comments

@bart-degreed
Copy link
Contributor

Description

The past weeks I have been trying to implement several features of our existing API using JsonApiDotNetCore. Starting out with custom services and repositories, I am now trying to make more use of hooks. One thing that I'm stuck at is the hook support for reading lists.

Consider the next method in DefaultResourceService:

public virtual async Task<IEnumerable<TResource>> GetAsync()
{
    _hookExecutor?.BeforeRead<TResource>(ResourcePipeline.Get);

    var entityQuery = _repository.Get();
    entityQuery = ApplyFilter(entityQuery);
    entityQuery = ApplySort(entityQuery);
    entityQuery = ApplyInclude(entityQuery);
    entityQuery = ApplySelect(entityQuery);

    if (!IsNull(_hookExecutor, entityQuery))
    {
        var entities = await _repository.ToListAsync(entityQuery);
        _hookExecutor.AfterRead(entities, ResourcePipeline.Get);
        entityQuery = _hookExecutor.OnReturn(entities, ResourcePipeline.Get).AsQueryable();
    }

    if (_options.IncludeTotalRecordCount)
        _pageManager.TotalRecords = await _repository.CountAsync(entityQuery);

    // pagination should be done last since it will execute the query
    var pagedEntities = await ApplyPageQueryAsync(entityQuery);
    return pagedEntities;
}

The first thing to notice is that whenever you start to use a hook (OnReturn in my case), paging no longer works server-side. This is unacceptable for large tables. I think that if the hook subscriber really needs the query executed, it should do so itself and live with the consequences.

Something else I found is that hooks cannot be generic. It's unclear to me if that is a bug or by design. To illustrate, the following works:

services.AddScoped<ResourceDefinition<Video>, VideoResourceHookContainer>();

public class VideoResourceHookContainer : TypeEmbeddedIdResourceHookContainer<Video> { }

public class TypeEmbeddedIdResourceHookContainer<TResource> 
  : ResourceDefinition<TResource> 
  where TResource : class, IIdentifiable<long> 
{
  // ...
}

but the following does not:

services.AddScoped<ResourceDefinition<Video>, TypeEmbeddedIdResourceHookContainer<Video>>();

public class TypeEmbeddedIdResourceHookContainer<TResource> 
  : ResourceDefinition<TResource> 
  where TResource : class, IIdentifiable<long> 
{
  // ...
}

Finally, the documentation on the interfaces in IResourceHookExecutor.cs is wrong. For example, IReadHookExecutor contains: "Wrapper interface for all Before execution methods." Also the file contains the misspelling "appropiate" multiple times.

Environment

Latest commit in develop branch.

@maurei
Copy link
Member

maurei commented Nov 18, 2019

Hi @bart-degreed, thanks again for your input

Your first request is an interesting point of discussion. Your concern about server-side paging being skipped completely is valid, but I don't think allowing the query to be executed in the hook itself is the way to address this issue.

The main concern here is that IHookExecutor not only fires the hooks associated to the main resource of the request (eg TResource=Article for GET /articles), but also any "nested hook" that is involved when dealing with a data set. Note that in general this data set can be nested arbitrarily deep (through requests like articles?include=a.b.c ... x.y.z or as a result of custom repository/service implementations). See this docs item or this test for simple examples of that.

To achieve this, IHookExecutor discovers all involved resources within a dataset by traversing through it and firing any associated hooks along the way. Such traversal will always require an evaluated query and this wouldn't be possible if the query would be executed optionally inside one of the involved hooks. Even if the main resource of the request (Article in the above example) does not have an AfterRead hook implemented, any nested resource that might be involved might have it implemented, so the query still needs to be evaluated to traverse and check for this and fire those hooks.

An important use-case for such thorough traversal is security: when using hooks to implement Access Control, it ensures that the authorization logic related to every resource will be executed.

Nevertheless, server-side paging probably shouldn't be skipped totally. But I don't think there is a trivial answer as to how to approach this best: we could apply the paging before firing the hooks here. But then having authorization as described above will lead to strange behaviour: eg the request page size is P but authorization logic hides two members and the resulting set is P — 2: how to deal with that?

I'd like to know more about your particular use-case of the hooks here to figure out what is the best way to approach it: whether to work around the problem (eg paging before executing the queries?) or possibly change the behaviour of ResourceHooks for your use-case.

[edit]
Thoughts on the side

  • I think semantically it makes sense to feed the AfterRead and OnReturn hooks with an evaluated dataset rather than a IQueryable: this is kinda suggested by the names of the hooks
  • Sounds like you could possibly need a OnQueryExecute hook, which would be a hook on the data-level of the framework rather than service-level. But to be sure I would need to know more about your use-case.
  • Possibly implementing a query filter could solve your problem?

@maurei
Copy link
Member

maurei commented Nov 18, 2019

As to your last two points,

Something else I found is that hooks cannot be generic.

and

Finally, the documentation on the interfaces in IResourceHookExecutor.cs is wrong

Thanks for pointing those out. We're aware about the docs: this is still work in progress (#567). We aim to fully update the docs before releasing v4. The issue with that generic implementation of ResourceDefinition<T> looks like a bug: I'll check that out in detail and report back shortly.

@maurei
Copy link
Member

maurei commented Nov 18, 2019

I looked into the generic resource definition and realised this is by design: the auto-discovery searches for either

  • bound implementations of ResourceDefinition<>,
  • or bound implementations of unbound subclasses of ResourceDefinition<>

That is why public class VideoResourceHookContainer : TypeEmbeddedIdResourceHookContainer<Video> { } is recognised, but not public class TypeEmbeddedIdResourceHookContainer<TResource> { } . If you want to use this class as the default resource definition implementation for your resources, you can do so by doing registering the unbound type as such

services.AddScoped( typeof(ResourceDefinition<>), typeof(TypeEmbeddedIdResourceHookContainer<>))

We should include this more clearly in the docs, added it to the docs issue.

On a side note, if we choose to include that type of registration in the auto-discovery by default, it wouldn't be possible register a resource definition for just a subset of resources, or to use two of such generic resource definitions because one would override the other. So I think it'd be better to not auto-discover unbound resources definitions like that.

@bart-degreed
Copy link
Contributor Author

I looked into the generic resource definition and realised this is by design

I'd like to be able to register different bindings per resource type, like:

User => CustomHookContainer<User>
Group => CustomHookContainer<Group>
Video => TypeEmbeddedIdResourceHookContainer<Video>
Book => TypeEmbeddedIdResourceHookContainer<Book>

Maybe the auto-discovery should search for IResourceHookContainer<TResource>, similar to reposities and services, instead of ResourceDefinition?

Your first request is an interesting point of discussion.

In my use case, I wanted to update resource IDs based on other property values during GET. We encode IDs to prevent caller guessing values. That works (via OnReturn), unless sparse fieldsets are used. The properties I needed were not always available anymore. For that to work, the invocation needs to move to before Select(). But because I already need a custom service anyway, I decided to put the logic there instead. The reason I need a custom service is that we use Elasticsearch for searching, which returns a list of IDs for the current page. We then enrich the data from SQL Server. So the ordering of get/include/select/sort/page is totally different there.

For hooks, I think the following would help:

  • Hook invocation/registration should be more granular. For example, subscribing to BeforeUpdate causes the paging issue (which should be unrelated).
  • OnReturn should execute after paging. Subscriber may change entity properties there. And even add/remove entries, but that results in fewer/more items on the current page. I don't think this hook is the right place to apply custom list filters (should do that on IQueryable<> instead).
  • A hook to dynamically include nested entities is something I need. Now it requires to inherit from DefaultResourceRepository. For example, I expose [Attr]ProviderCode which calls this.Provider.Code. It throws nullref because Provider was not loaded by EF.
  • A hook to add extra filter criteria (on IQueryable or Expression<Func<...>>) is something I need for tenant filtering. Now it requires to inherit from DefaultResourceRepository. I was unaware that QueryFilter could be used for that. I would like to write a execute-always hook that uses an is-pattern to dynamically check for properties to filter on.

@maurei
Copy link
Member

maurei commented Nov 21, 2019

I'd like to be able to register different bindings per resource type, like:

This is still possible by just registering resource definitions explicitly:

services.AddScoped<ResourceDefinition<User>, CustomHookContainer<User>>();
services.AddScoped<ResourceDefinition<Group>, CustomHookContainer<Group>>();
services.AddScoped<ResourceDefinition<Video>, TypeEmbeddedIdResourceHookContainer<Video>>();
services.AddScoped<ResourceDefinition<Book>, TypeEmbeddedIdResourceHookContainer<Book>>();

This type of setup is just not auto-discoverable currently, because when multiple unbound types like CustomHookContainer<> and TypeEmbeddedIdResourceHookContainer<> are declared, it is impossible for the discovery to know that User and Group should be used with the former, and Video and Book should be used with the latter. (Whereas for bound types like class Video : CustomHookContainer<Video> { } this is certain)

[edit]

Will get back at the other comments shortly

@bart-degreed
Copy link
Contributor Author

This is still possible by just registering resource definitions explicitly

I tried that (see first post). It does not work.

@maurei
Copy link
Member

maurei commented Nov 25, 2019

That is weird, this should work. I'm using something similar in my projects. Are you calling these explicit registrations after your invocation of AddJsonApi( .. )? Because if you do it before, I think auto-discovery will override your custom registration

(on a side note: autodiscovery should probably check if there already exists a service registration before adding/override an existing one)

If that isn't the problem, please provide me with a repro setup so I can look into it.

@maurei
Copy link
Member

maurei commented Nov 25, 2019

As to your other comments regarding IQueryable hooks I agree that having such hooks would be greatly beneficial. I would have to think and write out specs of such feature before being able to have a good discussion on the conceptual level.

Some things that pop up in my mind are:

  • Usage of such hooks should be agnostic to the internals of the repository layer. With this we might run into problems with order of execution (eg calling a .Where( ... ) filter on a property after a .Select( ... ) which excluded that property from the projection).
  • Such hooks should respect separation of concerns. If a query involves two resources (a main resource and am included resource), the contents of the hook of the main resource should be completely independent of any included resource, and vice-versa. However both hooks should be involved in the execution cycle.
    • possibly a naive first implementation for IQueryable hooks could ignore nested hooks, if it turns out that this covers a large subset of the use-cases

I will make an effort to draft a proposal soon, but until then any related thoughts are welcome

@mjartz
Copy link

mjartz commented Nov 27, 2019

Hi All,
Yes i totally agree with Bart above. I'm working on a POC using JADNC 3.1.0 and have just started migrating it to the latest 4.0.0 alpha. Basic gets were working fine with paging out of the box as I had set the options.EnableResourceHooks to false. Without doing anything other than changing that option to true, all of a sudden a number of gets on large tables were trying to retrieve every record because of the queryable being enumerated to a list in the GetAsync method Bart highlighted above. I can see this potentially causing issues for a lot of people, especially as it seems hooks will need to be enabled and used in a lot of scenarios. As it stands right now, I cannot use this package with hooks enabled, without implementing a custom resource service overriding this method and using this for all of my entities.
FYI, I totally understand the rationale for doing it this way in order to satisfy paging functionality after a hook may have further filtered or sorted the data, but I think this responsibility must be taken on by the developer themselves within the hook, though I acknowledge your concerns around this as well. I also like the idea of having hooks more granular so that the above would only occur if hooks were enabled, AND the OnReturn hook was actually implemented.
Regard,

@bart-degreed
Copy link
Contributor Author

That is weird, this should work.

I tried to create a small repro solution based on the latest bits and was unable to reproduce the issue I ran into. I no longer have the original sources, because it turned out that hooks are completely useless to me in their current state. If I run into this again in the future I'll create a separate issue.

@maurei
Copy link
Member

maurei commented Feb 13, 2020

I've been having some thoughts on how to improve the performance of hooks. Elaborates on the last bit of the comment of @mjartz

  • I think it makes sense semantically that After hooks are fired after query execution (leaving paging out of the equation for simplicity although I do acknowledge better integration is required). So if eg. a user only ever has AfterRead hooks implemented in the application, we can be pretty sure there is no need to ever prematurely execute any query or side load any extra data in search of a nested Before hook to execute.
  • We could expand on that further by taking into account for every resource which nested hooks can be reached. Eg if there are two traversable but disconnected relationshipchains A.B.C and X.Y.Z, then if the only Before hooks implemented are for resources in subset {X, Y, Z}, one can be certain that for any request involving only resources of subset {A, B, C} premature query execution is not required.
    • Even if the relationship chains are connected, eg A.B.C.X.Y.Z, for create/patch/delete requests we could make better use of the fact that the nesting level can be only be one level deep, i.e. if the request involves A.B, (eg POST Article and set an author too) and A nor B have any Before hooks implemented, no need to execute the query prematurely.
    • We could also choose to stick more closely to the relationship chain in general (i.e. only those resources as requested from the front-end) and ignore any additional manual Include() queries to determine which hooks to execute, but we should be careful with that as it comes with a security risk
  • Alternatively we could come up with an API to give a developer full control when and when not to search for nested hooks to execute, but I feel such API might be pretty complicated to work with

@bart-degreed
Copy link
Contributor Author

Closing this in favor of #934, which proposes a simpler, non-recursive design.
Combined with recently added IResourceDefinition callbacks such as OnApplyFilter, OnApplyIncludes, OnApplySparseFieldSet etc. I suspect it covers the major use cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants