LW-11609 Web socket ChainHistoryProvider #1489
517a6a7 to
8b0e7ac
Compare
Preliminary review with some questions
packages/core/src/WebSocket.ts
Outdated
}

interface EmitHealthOptions {
  notRecoverable?: boolean;
Nitpick/suggestion: in general non-negated boolean variables result in more readable boolean logic
- notRecoverable?: boolean;
+ recoverable?: boolean;
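To illustrate the readability point, here is a minimal sketch comparing the two spellings. Only the field names come from the diff; the `NegatedOptions` / `PositiveOptions` interfaces and the `canContinue*` helpers are hypothetical and exist only for this comparison.

```typescript
// Hypothetical sketch: only the option names come from the diff under review.
interface NegatedOptions {
  notRecoverable?: boolean;
}

interface PositiveOptions {
  recoverable?: boolean;
}

// With the negated flag, "can we keep going?" reads as a double negative.
const canContinueNegated = (options: NegatedOptions = {}): boolean => !options.notRecoverable;

// With the positive flag the question reads directly, but the default has to
// be spelled out: an omitted flag must still mean "recoverable".
const canContinuePositive = (options: PositiveOptions = {}): boolean => options.recoverable ?? true;
```

One trade-off worth noting: an omitted optional boolean is `undefined`, so the negated form gets the "recoverable by default" behavior for free, while the positive form needs an explicit default.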
Yes... in this case I used the negative form as it also has the meaning "this error needs to end up to a server restart".
Sure there are exceptions to the rule 🙏
On a different topic:
has the meaning "this error needs to end up to a server restart".
In that case, the server should probably die and all ws connections get disconnected? 🤔
Yes... in case of an unrecoverable error there is no alternative...
The server tries to shut itself down, and often it succeeds. Due to the nature of an unrecoverable error, though, sometimes something is not closed correctly and the self shutdown fails; in any case, the health check is set to inform the orchestrator that it has to perform a forced shutdown.
@gytis-ivaskevicius what is the preferred behavior for unrecoverable errors: die or report unhealthy?
We discussed this in depth: we have been working on health checks for some weeks.
The ws-server implements the health check logic as specified by SRE (two distinct endpoints: one for readiness, one for the health status).
We didn't specifically discuss the self shutdown option; anyway, if the required action is a restart, making the server able to do it by itself (rather than waiting for the orchestrator to do it) sounds like an improvement...
For sure, any insights from @gytis-ivaskevicius will be appreciated.
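For readers unfamiliar with the two-endpoint setup mentioned above, a minimal sketch of the idea follows. The `/ready` and `/health` paths, the `ServerState` shape and the `handleProbe` function are assumptions made for illustration; they are not taken from the ws-server code.

```typescript
// Hypothetical sketch: paths and state shape are assumptions, not the ws-server API.
interface ServerState {
  // True once start-up has finished and connections can be accepted.
  ready: boolean;
  // True until an unrecoverable error is reported.
  healthy: boolean;
}

interface ProbeResponse {
  status: 200 | 404 | 503;
  body: string;
}

// Maps a probe path to an HTTP status; an orchestrator typically acts on the status code.
const handleProbe = (path: string, state: ServerState): ProbeResponse => {
  switch (path) {
    case "/ready":
      return state.ready ? { status: 200, body: "ready" } : { status: 503, body: "not ready" };
    case "/health":
      return state.healthy ? { status: 200, body: "healthy" } : { status: 503, body: "unhealthy" };
    default:
      return { status: 404, body: "not found" };
  }
};
```

Keeping the two flags separate lets a server that is still starting report "not ready" without being treated as broken.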
If a restart may resolve the error: either the liveness probe should fail or the service should exit with a non-zero exit code. The service shutting down is the better option in this case, since it gives a faster feedback loop.
If the error cannot be resolved by restarting: it would be better for the liveness probe to fail, to reduce restarts and possibly keep other parts of the application working.
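The two cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `FatalError` shape, the `restartMayResolve` flag and the `Effects` callbacks are hypothetical names, not part of the PR.

```typescript
// Hypothetical sketch of the two strategies described above; all names are illustrative.
type FatalErrorAction = "exit-non-zero" | "fail-liveness-probe";

interface FatalError {
  message: string;
  // Assumption: the caller knows whether a restart is likely to clear the error.
  restartMayResolve: boolean;
}

// Restartable errors exit right away; the others only flip the liveness state.
const chooseAction = (error: FatalError): FatalErrorAction =>
  error.restartMayResolve ? "exit-non-zero" : "fail-liveness-probe";

// Side effects are injected so the decision can be tested without killing the process.
interface Effects {
  exit: (code: number) => void;
  markUnhealthy: (reason: string) => void;
}

const handleFatalError = (error: FatalError, effects: Effects): FatalErrorAction => {
  const action = chooseAction(error);
  if (action === "exit-non-zero") effects.exit(1);
  else effects.markUnhealthy(error.message);
  return action;
};
```

In a Node.js service, `exit` could be wired to `process.exit` and `markUnhealthy` to whatever state the liveness endpoint reads.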
8b0e7ac to
84ef892
Compare
Looking good @iccicci. Minor CR (change request)
packages/cardano-services/src/ChainHistory/DbSyncChainHistory/mappers.ts
packages/cardano-services/src/ChainHistory/DbSyncChainHistory/queries.ts
Outdated
8d0f6ac to
63798e8
Compare
Context
Lace performs a huge number of HTTP calls.
Proposed Solution
Implemented event-based ChainHistoryProvider.transactionsByAddresses and UtxoProvider.
The load tests demonstrate that this back-end implementation is able to sustain even an unexpected workload: 100 users simultaneously performing wallet restoration.
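The wire protocol is not shown in this description, so the following is only a rough sketch of the event-based idea: each transaction is pushed to the listeners subscribed to its addresses, instead of every client polling over HTTP. `AddressSubscriptions`, `TxEvent` and the method names are hypothetical and do not come from the PR.

```typescript
// Hypothetical sketch: none of these names come from the PR.
type Address = string;

interface TxEvent {
  txId: string;
  addresses: Address[];
}

type Listener = (tx: TxEvent) => void;

class AddressSubscriptions {
  private readonly listeners: Record<Address, Listener[]> = {};

  // Registers a listener for a set of addresses and returns an unsubscribe function.
  subscribe(addresses: Address[], listener: Listener): () => void {
    addresses.forEach((address) => {
      this.listeners[address] = (this.listeners[address] || []).concat(listener);
    });
    return () => {
      addresses.forEach((address) => {
        this.listeners[address] = (this.listeners[address] || []).filter((l) => l !== listener);
      });
    };
  }

  // Pushes a transaction to each interested listener exactly once.
  publish(tx: TxEvent): void {
    const notified: Listener[] = [];
    tx.addresses.forEach((address) => {
      (this.listeners[address] || []).forEach((listener) => {
        if (notified.indexOf(listener) === -1) {
          notified.push(listener);
          listener(tx);
        }
      });
    });
  }
}
```

In this shape, a wallet subscribes once for all of its addresses and then only receives the transactions that concern it.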
Some load test results follow.
1 user performing wallet restoration:
2 users simultaneously performing wallet restoration:
3 users simultaneously performing wallet restoration:
4 users simultaneously performing wallet restoration:
5 users simultaneously performing wallet restoration:
10 users simultaneously performing wallet restoration:
20 users simultaneously performing wallet restoration:
50 users simultaneously performing wallet restoration:
Here the time is not what matters; what matters is that even under an insane workload the server is able to complete its job without timeout errors for the customers.
100 users simultaneously performing wallet restoration:
100 users with lots of transactions simultaneously performing wallet restoration:
Important Changes Introduced
The DbSyncChainHistoryProvider and the DbSyncUtxoProvider perform almost the same actions on UTXOs, with some slight differences.
They were aligned to produce exactly the same data, for two purposes:
CardanoWsClient uses the same source for both pseudo-provider implementations; this greatly simplified the comparison tests.