@conorschofield
Contributor

While testing the new ChannelFinder version for LBNL, ChannelFinder crashed whenever it received an error from the Archiver. As a result, if the first archiver returned any error, ChannelFinder stopped and never tried to populate the remaining archivers.

For example, a request timing out or exceeding the maximum buffer size:

Apr 03 17:02:55 [server] java[244057]: 2025-04-03 17:02:55.667  WARN 244057 --- [ taskExecutor-2] o.p.c.p.ChannelProcessorService          : ChannelProcessor org.phoebus.channelfinder.processors.aa.AAChannelProcessor$$EnhancerBySpringCGLIB$$8bcefb2e throws exception
Apr 03 17:02:55 [server] java[244057]: reactor.core.Exceptions$ReactiveException: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 15000ms in 'flatMap' (and no fallback has been configured)
Apr 03 17:02:55 [server] java[244057]:         at reactor.core.Exceptions.propagate(Exceptions.java:396) ~[reactor-core-3.4.34.jar!/:3.4.34]
Apr 03 17:02:55 [server] java[244057]:         at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:98) ~[reactor-core-3.4.34.jar!/:3.4.34]
Apr 03 17:02:55 [server] java[244057]:         at reactor.core.publisher.Mono.block(Mono.java:1742) ~[reactor-core-3.4.34.jar!/:3.4.34]
Apr 03 17:02:55 [server] java[244057]:         at org.phoebus.channelfinder.processors.aa.ArchiverClient.getStatusesFromPvListBody(ArchiverClient.java:108) ~[classes!/:na]
Apr 03 17:02:55 [server] java[244057]:         at org.phoebus.channelfinder.processors.aa.ArchiverClient.getStatuses(ArchiverClient.java:73) ~[classes!/:na]
Apr 03 17:02:55 [server] java[244057]:         at org.phoebus.channelfinder.processors.aa.AAChannelProcessor.getArchiveActions(AAChannelProcessor.java:202) ~[classes!/:na]...

This PR adds error handling and returns an empty response on failure, so ChannelFinder can continue with the remaining archivers.
Log output with the new error handling:

Apr 04 14:25:19 [server] java[248015]: 2025-04-04 14:25:19.191  INFO 248015 --- [ taskExecutor-2] o.p.c.processors.aa.AAChannelProcessor   : Get archiver status in archiver ArchiverInfo[alias=arch-ml, url=archiver-url, version=d3dd9cc-2022-08-08-als_SNAPSHOT_14-May-2024T13-57-58, policies=[VerySlowControlled, MediumContr>
Apr 04 14:25:34 [server] java[248015]: 2025-04-04 14:25:34.192  WARN 248015 --- [     parallel-1] o.p.c.processors.aa.ArchiverClient       : There was an error getting status from pv list body response with URI: archiver-url/mgmt/bpl/getPVStatus. Error: Did not observe any item or terminal signal within 15000ms in 'flatM>
Apr 04 14:25:34 [server] java[248015]: 2025-04-04 14:25:34.193  WARN 248015 --- [ taskExecutor-2] o.p.c.processors.aa.ArchiverClient       : Error when trying to get status from pv list query: argument "content" is null
Apr 04 14:25:34 [server] java[248015]: 2025-04-04 14:25:34.193  INFO 248015 --- [ taskExecutor-2] o.p.c.processors.aa.ArchiverClient       : Configure PVs {ARCHIVE=[], PAUSE=[], RESUME=[], NONE=[]} in archiver-url
Apr 04 14:25:34 [server] java[248015]: 2025-04-04 14:25:34.193  INFO 248015 --- [ taskExecutor-2] o.p.c.processors.aa.AAChannelProcessor   : Get archiver status in archiver ArchiverInfo[alias=arch03, url=archiver-url, version=d3dd9cc-2022-08-08-als_SNAPSHOT_22-July-2024T11-05-05, policies=[TMA]]
Apr 04 14:25:49 [server] java[248015]: 2025-04-04 14:25:49.194  WARN 248015 --- [     parallel-2] o.p.c.processors.aa.ArchiverClient       : There was an error getting status from pv list body response with URI: archiver-url/mgmt/bpl/getPVStatus. Error: Did not observe any item or terminal signal within 15000ms in 'flatMa>
Apr 04 14:25:49 [server] java[248015]: 2025-04-04 14:25:49.194  WARN 248015 --- [ taskExecutor-2] o.p.c.processors.aa.ArchiverClient       : Error when trying to get status from pv list query: argument "content" is null
Apr 04 14:25:49 [server] java[248015]: 2025-04-04 14:25:49.194  INFO 248015 --- [ taskExecutor-2] o.p.c.processors.aa.ArchiverClient       : Configure PVs {ARCHIVE=[], PAUSE=[], RESUME=[], NONE=[]} in archiver-url

Apr 04 14:25:49 [server] java[248015]: 2025-04-04 14:25:49.194  INFO 248015 --- [ taskExecutor-2] o.p.c.processors.aa.AAChannelProcessor   : Configured 0 channels.
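
The behaviour described in this PR (log the failure, fall back to an empty response, move on to the next archiver) can be illustrated with a minimal sketch. It is not the code from this PR: the real client uses Reactor, while this sketch swaps in the JDK's `CompletableFuture` so that it runs without dependencies. The class name, `fetchStatuses`, `processAll`, and the archiver names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; stands in for the Reactor-based archiver client.
class ArchiverFallbackSketch {
    private static final Logger logger =
            Logger.getLogger(ArchiverFallbackSketch.class.getName());

    /** Runs one status request; on timeout or failure, logs a warning and returns an empty list. */
    static List<String> fetchStatuses(String archiverUrl,
                                      Supplier<List<String>> request,
                                      long timeoutMs) {
        return CompletableFuture.supplyAsync(request)
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(e -> {
                    logger.log(Level.WARNING, "Error getting status from {0}: {1}",
                            new Object[] {archiverUrl, e});
                    return List.of(); // empty response instead of a propagated exception
                })
                .join();
    }

    /** Queries every archiver in turn; one failing archiver does not stop the loop. */
    static Map<String, Integer> processAll(Map<String, Supplier<List<String>>> archivers,
                                           long timeoutMs) {
        Map<String, Integer> statusCounts = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<List<String>>> entry : archivers.entrySet()) {
            List<String> statuses = fetchStatuses(entry.getKey(), entry.getValue(), timeoutMs);
            statusCounts.put(entry.getKey(), statuses.size());
        }
        return statusCounts;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<String, Supplier<List<String>>> archivers = new LinkedHashMap<>();
        // The first archiver answers too slowly; the second answers immediately.
        archivers.put("arch-slow", () -> { sleep(500); return List.of("PV:1"); });
        archivers.put("arch-ok", () -> List.of("PV:2", "PV:3"));
        System.out.println(processAll(archivers, 50));
    }
}
```

In this sketch the slow archiver yields an empty list and the loop still reaches the second archiver, which mirrors the "Configure PVs {ARCHIVE=[], ...}" lines followed by the next "Get archiver status" line in the log above.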

@jacomago
Contributor

jacomago commented Apr 7, 2025

To fix the SonarCloud lints, you could write a simple error-logging function for requests to the archiver and then call it in the onErrorResume method.

@jacomago
Contributor

jacomago commented Apr 7, 2025

You are also now repeating all the uriStrings in two places.
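
A rough sketch of what these two suggestions could look like together, assuming a shared helper and a URI string built once per request. It is only an illustration: `logAndFallback`, `formatError`, and `getStatus` are hypothetical names, and `CompletableFuture.exceptionally` stands in for Reactor's `onErrorResume` so the example runs on the JDK alone. The `/mgmt/bpl/getPVStatus` path is taken from the log output in the PR description.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical sketch; not the helper that was actually added in this PR.
class ArchiverErrorLoggingSketch {
    private static final Logger logger =
            Logger.getLogger(ArchiverErrorLoggingSketch.class.getName());

    /** Builds the warning text in one place so every request reports errors the same way. */
    static String formatError(String uriString, Throwable error) {
        return String.format("There was an error getting a response with URI: %s. Error: %s",
                uriString, error.getMessage());
    }

    /** Logs the failure and hands back the fallback value for the error handler to return. */
    static <T> T logAndFallback(String uriString, Throwable error, T fallback) {
        logger.warning(formatError(uriString, error));
        return fallback;
    }

    /**
     * Builds the URI string once and reuses it for both the request and the error message.
     * The request function is an assumed stand-in for the real HTTP call.
     */
    static String getStatus(String baseUrl,
                            Function<String, CompletableFuture<String>> request) {
        String uriString = baseUrl + "/mgmt/bpl/getPVStatus";
        return request.apply(uriString)
                .exceptionally(e -> logAndFallback(uriString, e, ""))
                .join();
    }
}
```

Each request method then needs only a one-line error handler, and the URI appears in a single local variable instead of being repeated in the request and in the log call.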

@sonarqubecloud

sonarqubecloud bot commented Apr 7, 2025

@conorschofield
Contributor Author

I've created a showError method.

@jacomago jacomago merged commit 053418f into ChannelFinder:master Apr 8, 2025
6 checks passed