Description
When we receive a burst of calls, we find that the calls "wait for each other" (i.e. the first call of the burst waits for the last one).
This results in degraded performance, both in execution time and in memory consumption on the server, because we have to keep many calls in flight.
This is particularly evident in big graphql queries where users request many fields and the query has several levels of depth, each with many async fields.
TL;DR: looking at the implementations of graphql and asyncio, we understood that this is due to the following:
- graphql's breadth-first way of scheduling and resolving fields
- asyncio's internal FIFO queue of tasks to be executed
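The FIFO effect in the second point can be reproduced with asyncio alone, without graphql. The following is a minimal sketch, not the real execution path: `simulated_query`, `LEVELS` and `QUERIES` are hypothetical names, and each loop iteration stands in for one depth level of resolvers that yields to the event loop.

```python
import asyncio

LEVELS = 3   # hypothetical depth of each simulated query
QUERIES = 3  # hypothetical number of queries arriving in a burst


async def simulated_query(query_number, log):
    # Each iteration stands in for one level of resolvers.
    for level in range(LEVELS):
        log.append((query_number, level))
        # Yield to the event loop; every other ready task runs before
        # this one is resumed, because ready callbacks are run in FIFO order.
        await asyncio.sleep(0)


async def run_burst():
    log = []
    await asyncio.gather(*(simulated_query(q, log) for q in range(QUERIES)))
    return log


execution_log = asyncio.run(run_burst())
print(execution_log)
```

The log is grouped by level rather than by query: query 0 does not reach its second level until every other query has run its first one.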
As an example, consider queries like the following, where data may be something like beer vendors and, for each vendor, we want many fields that describe it (a1...a100, b1...b100, ...):
query {
  data {
    a1 {
      b1 {
        c1
        ...
        c100
      }
      ...
      b100 {
        c1
        ...
        c100
      }
    }
    ...
    a100 { ... }
  }
}
If n of these calls arrive in a burst, by the time we reach the depth of the c fields there are a great many tasks scheduled in the asyncio queue.
If we check the order of execution, we see that the first query, at each level, "waits" for the other queries, because every query schedules a lot of tasks.
In the proof of concept at the end of this post, you can verify the order of execution of the resolvers.
It would be very nice to have some sort of priority in the ordering, so that the first query does not have to wait for all the other queries to be scheduled and resolved before it can finish.
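One workaround we can imagine, short of real priorities, is to bound how many queries are in flight at once, so that earlier queries finish before later ones start scheduling tasks. The sketch below uses only asyncio; `guarded`, `simulated_query` and `MAX_IN_FLIGHT` are hypothetical names, and the simulated query stands in for a real `schema.execute_async` call. It trades concurrency for latency of the first query, so the limit would need tuning.

```python
import asyncio

MAX_IN_FLIGHT = 1  # hypothetical limit on queries executing at once
QUERIES = 3        # hypothetical burst size


async def simulated_query(query_number, log):
    log.append(f"start {query_number}")
    for _ in range(3):          # three levels of "resolvers"
        await asyncio.sleep(0)  # yield to the event loop, as a resolver would
    log.append(f"end {query_number}")


async def guarded(semaphore, query_number, log):
    # Later queries wait here instead of flooding the loop with tasks.
    async with semaphore:
        await simulated_query(query_number, log)


async def run_burst():
    log = []
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    await asyncio.gather(*(guarded(semaphore, q, log) for q in range(QUERIES)))
    return log


guarded_log = asyncio.run(run_burst())
print(guarded_log)
```

With the limit set to 1, each query starts only after the previous one has ended, so the first query of the burst no longer waits for the last.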
I understand that this is something between graphql and asyncio, but I think it could affect the use of graphql in environments that receive many calls.
Fixes, help, and hints on how to improve this would be much appreciated.
import asyncio

from graphene import ObjectType, Schema, String, Field

FIELD_NUMBER = 2
CONCURRENT_QUERIES = 10


def make_resolver(i, j=None):
    # Build an async resolver that logs when it starts and when it ends.
    async def resolver(self, info):
        print(f"START query {info.context['query_number']} | a{i} | b{j}")
        await asyncio.sleep(0.001)
        print(f"END query {info.context['query_number']} | a{i} | b{j}")
        return i

    return resolver


def create_fields():
    # Root fields a0..aN, each an object type with async fields b0..bN.
    fields = {}
    for i in range(FIELD_NUMBER):
        inner_fields = {}
        for j in range(FIELD_NUMBER):
            inner_fields[f"b{j}"] = String()
            inner_fields[f"resolve_b{j}"] = make_resolver(i, j)
        # Give each generated type a unique name so the schema keeps
        # a separate type (and separate resolvers) per a-field.
        MyType = type(
            f"MyType{i}",
            (ObjectType,),
            inner_fields,
        )
        fields[f"a{i}"] = Field(MyType)
        fields[f"resolve_a{i}"] = make_resolver(i)
    return fields


async def make_query(schema, query_number):
    # Request every b-field under every a-field.
    inner_query_values = [f"b{i}" for i in range(FIELD_NUMBER)]
    query_values = [
        "a%s {%s}" % (i, " ".join(inner_query_values)) for i in range(FIELD_NUMBER)
    ]
    query_string = "{ %s }" % (" ".join(query_values),)
    await schema.execute_async(
        query_string, context_value=dict(query_number=query_number)
    )


async def main():
    Query = type("Query", (ObjectType,), create_fields())
    schema = Schema(query=Query)
    # Fire all queries at once to simulate a burst of calls.
    await asyncio.gather(*[make_query(schema, i) for i in range(CONCURRENT_QUERIES)])


if __name__ == "__main__":
    asyncio.run(main())