1.x: Upgrading AsyncOnSubscribe from Experimental to Beta #3817
Conversation
See #3780 for context.
The goals for the AsyncOnSubscribe were covered in #3780. However severe or trivial this impact, I do not see this as a problem with the AsyncOnSubscribe API.

@akarnokd, you are the committer who has been most vocally against this upgrade. Can you please post your concerns with the API?
You want to rely on the request pattern, which now has to be changed almost everywhere and become part of the public API to satisfy this operator. Any third-party lifted operator in the chain may break the whole thing again. You need tight control of the request amount, which only happens reliably with a same-thread consumer (so no observeOn, etc.).

In addition, the operator can't ensure the specified amount is honored by the source it generates (i.e., the last page may be shorter than the page size); overproduction can shift the whole output paging, and underproduction triggers a new round to generate the missing amount. The only alternative for handling bad sources is to signal an error in either case. (Btw, @kurzweil's example doesn't use the requested amount at all.)

You quoted RPC and paging as the main targets for AsyncOnSubscribe. RPC requires bi-directional streaming, and paging requires streaming plus backpressure over a network boundary, both beyond RxJava.
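A runnable illustration of this point (illustrative code, using standard RxJava 1.x operators): with observeOn in the chain, the source sees the operator's batch request of 128, not the end consumer's take(5).

import rx.Observable;
import rx.schedulers.Schedulers;

public class RequestRewritingDemo {
    public static void main(String[] args) {
        Observable.range(1, 1000)
                // log what the source actually gets asked for
                .doOnRequest(n -> System.out.println("source sees request(" + n + ")"))
                .observeOn(Schedulers.computation())
                .take(5) // the end consumer only wants 5 elements
                .toBlocking()
                .forEach(v -> System.out.println("got " + v));
        // Typically prints "source sees request(128)": the intermediate
        // operator replaced the consumer's amount with its own buffer size.
    }
}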
None of your points has anything to do with the API of the AsyncOnSubscribe.
The purpose of the AsyncOnSubscribe is to solve consumer-driven RPC with minimal buffering while maintaining ordering. So I'm still at a loss for why you argue against an API that is clearly not accounted for in RxJava. Can you give me any alternative to solving the consumer-driven RPC with minimal buffering that maintains ordering?
Yes, swing the request parameter around into a PublishSubject:
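A minimal sketch of what that can look like (the names start and service(int start, int size) are illustrative, echoing the fuller ServicePaging example later in the thread): the subject replaces request() as the channel that asks for the next round of values.

import rx.Observable;
import rx.subjects.PublishSubject;

public class PagingSketch {
    static Observable<Integer> service(int start, int size) {
        return Observable.range(start, size);
    }

    public static void main(String[] args) {
        // "start" carries the parameter of the next page instead of a request(n) call
        PublishSubject<Integer> start = PublishSubject.create();
        start.concatMap(s -> service(s, 20))
             .subscribe(System.out::println);
        start.onNext(0); // kick off the first page
    }
}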
Okay, thanks. This looks interesting, but could you please explain your code example a little bit? What is the start subject for?
To start the next round of values with different parameters for service(). |
I'm sorry, but I don't understand it. Can you show some usage? Is this supposed to be implemented in an operator?

I'm still not seeing any necessary change to the API. There is still a need to have direct control over providing data for the precise amount requested by the consumer. This seems like an implementation to buffer/collect requests to be fulfilled at some time in the future. If so, this could be used as a stopgap solution with the AsyncOnSubscribe.
You try to reinterpret the request() call, which only works if the producer and the consumer are next to each other without any intermediate operators. It's the same problem as when people try to interpret the request amount as the number of bytes to return when the sequence type is byte[]. It won't work properly. By using the PublishSubject above, you have the option to control the parameters to the service:

import rx.Observable;
import rx.Observer;
import rx.Subscriber;
import rx.subjects.PublishSubject;
import rx.subjects.Subject;

public class ServicePaging {
    // Pair is assumed to be a simple tuple with public first/second fields
    // and a static of() factory (e.g., android.util.Pair or a hand-rolled one).

    static Observable<Integer> service(int start, int size) {
        return Observable.range(start, size);
    }

    static Pair<Observer<Integer>, Observable<Integer>> repeatingService() {
        Subject<Integer, Integer> ps = PublishSubject.<Integer>create().toSerialized();
        return Pair.of(ps, ps.concatMap(v -> service(v, 20)));
    }

    public static void main(String[] args) {
        Pair<Observer<Integer>, Observable<Integer>> pair = repeatingService();

        pair.second.subscribe(new Subscriber<Integer>() {
            int totalReceived;
            int received;
            int index;

            @Override
            public void onNext(Integer t) {
                System.out.println(index + " - " + t);
                // after a full page of 20, ask for the next page
                if (++received == 20) {
                    index += 20;
                    received = 0;
                    pair.first.onNext(index);
                }
                // stop after 6 pages (120 values)
                if (++totalReceived == 120) {
                    pair.first.onCompleted();
                }
            }

            @Override
            public void onError(Throwable e) {
                e.printStackTrace();
            }

            @Override
            public void onCompleted() {
                System.out.println("Done");
            }
        });

        // kick off the first page
        pair.first.onNext(0);
    }
}
Your code example shows a way that you can protect a producer from the consumer's request amounts.

I still don't see what the hold-up is. Why not use request values for what they are intended?
I just want to clarify. I don't think it's absolutely necessary to fix all cases of this request behavior right now. I'm not talking about promoting into the public (i.e., non-annotated) API, just to @Beta.
Backpressure amounts are established between subsequent producer and consumer pairs in the operator chain, depending on the efficiency of moving data across a potential (i.e., flatMap) or actual (i.e., observeOn) async boundary. It is not there to propagate a specific request amount from the end consumer up to the main source; that requires a different protocol (i.e., bi-directional streams). But let's assume for a second it were so. The consequence would be that unless the end consumer's request amount is constant, operators such as observeOn and flatMap couldn't work with a fixed-capacity queue anymore. (Even if they went unbounded, a request for Long.MAX_VALUE would no longer convey any specific amount.)
In the current setup, this only works if the source and consumer are right next to each other since they reinterpret the request amount, making it a synchronous operator.
Generally, you should be able to tell the service how many elements you'd like to receive before getting an Observable back; it is a constant 20 in my example.
In my example, I didn't add any intermediate operators between the service and the consumer. In addition, the prefetch you want contradicts your "request N from the service" where N comes from the end consumer. If the requesting is left as it is now in RxJava, you get prefetch and on-demand delivery. The bottom line is that sources such as these can't see the end consumer's exact request amounts.
I don't think I have ever said that this is how backpressure works. Many times the request amounts are forwarded (i.e., trivial backpressure for operators like scan, etc.). Your definition sounds precisely like what I am trying to say. However, if your consumer chain has a buffer in place (i.e., a take, observeOn, flatMap, etc.) then the consuming operator will have a constraint on its request amount. So in general the consumer observable chain will have some finite but dynamic request amount.
I am not sure what any of this has to do with our discussion since I am not talking about constant request amounts.
In your example there was no ordering of multiple concurrently fulfilled requests. For example, the consuming subscriber requests a batch of 5000 (and has a buffer size of 5000). While the provider is busy producing the 5000, the buffer in the consumer drains 2000 entries and can then request 2000 more (total 7000). However, the original 5000 have not finished draining from the producer. That requires concurrent batches with order preservation. There is nothing in your example that can handle this.
What do you mean by a synchronous operator, and why is this significant? Operators are synchronous in that there cannot be interleaving onNexts, so I am not sure if you understand what the goal is. The goal of AsyncOnSubscribe is for a consumer to request arbitrary amounts of data and for the producer to provide exactly that amount. It does not matter how big or small these amounts are, nor should it matter when those requests are made.

This is useful when integrating a system that cannot provide an Rx Observable interface, for instance a message-passing system like ActiveMQ or SQS, which receives data and must buffer or risk violating backpressure rules. In a naive implementation one might wrap the message stream via a listener and observe the data source directly. This will clearly not satisfy backpressure when there is a slow consumer. In the next iteration one could use an unbounded buffering operator, but that only hides the problem.

The purpose of this is to integrate producers which have no concept of backpressure (i.e., sampling from mouse events or paginating over data services) and request ONLY the requisite amount at a time. This seems like exactly the purpose behind backpressure.
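A sketch of that queue integration (QueueClient and its receive(int) call are hypothetical stand-ins for an SQS-style pull API; AsyncOnSubscribe.createStateless is the real 1.x factory): each downstream request(n) becomes a fetch of exactly n messages, so nothing is buffered speculatively.

import java.util.List;

import rx.Observable;
import rx.observables.AsyncOnSubscribe;

public class QueueSource {
    // Hypothetical pull-style client, standing in for SQS/ActiveMQ.
    interface QueueClient {
        List<String> receive(int maxMessages);
    }

    static Observable<String> messages(QueueClient client) {
        return Observable.create(AsyncOnSubscribe.<String>createStateless(
            (requested, observer) -> observer.onNext(
                // fetch exactly the amount the consumer asked for, lazily
                Observable.defer(() -> Observable.from(
                    client.receive((int) Math.min(requested, Integer.MAX_VALUE)))))));
    }
}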
Can you tell me how the AsyncOnSubscribe changes the request amounts? They should be exactly the amounts requested by the consumer. Also, this is not a "repurposing" of backpressure. The purpose of backpressure was to minimize the unbounded buffers between a producer and consumer. The AsyncOnSubscribe does exactly this while preserving the ordering of data.
I simply can't understand what you are trying to accomplish. All I see is mixed concepts and interpretations of request, backpressure, RPC-like behavior, and bi-directional value passing. If you want to Rx-ify a pull-like data source:

Observable<GroupedObservable<Request, T>> service(Observable<Request> parameters);

where each Request value yields one group, keyed by that request and containing the values that answer it.
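A sketch of that shape (the Request type and the range-based body are illustrative; GroupedObservable.from is real RxJava 1.x API):

import rx.Observable;
import rx.observables.GroupedObservable;

public class RpcShape {
    // Illustrative request carrying paging parameters.
    static final class Request {
        final int offset;
        final int limit;
        Request(int offset, int limit) { this.offset = offset; this.limit = limit; }
    }

    // Each incoming Request becomes one group, keyed by the request,
    // containing exactly the values answering it.
    static Observable<GroupedObservable<Request, Integer>> service(Observable<Request> parameters) {
        return parameters.map(req ->
                GroupedObservable.from(req, Observable.range(req.offset, req.limit)));
    }
}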
I have no other way to try to tell you that we need a way to convert request-driven, non-reactive sources into Observables that honor the requested amounts.

@benjchristensen, this functionality was your vision. Perhaps you can communicate it best.
@akarnokd I was just talking with @kurzweil about this, and it got me thinking about your last comment. Could we invert the API so that the user transforms an Observable of requested amounts into the values that satisfy them?
It would be a lot like retryWhen/repeatWhen.
If you mean the retryWhen/repeatWhen shape, I can't think of a version based on that signature. It seems you want the end subscriber to be in charge of the amount to be generated in one batch, so it has to know about the front to signal the generator. There is an alternative signature to my method:

Subject<Long, T> service(Func1<Long, Observable<T>> generator, JoinMode mode);

But this may also require a special boundary indicator T for the Subscriber to know when to call onNext on the subject.
Eclipse doesn't seem to mind that I don't have an instance of Observable. Maybe because Transformer is an inner interface?
Not literally. The retryWhen operator takes a Func1<Observable<? extends Throwable>, Observable<?>>; the user turns the stream of error notifications into a stream whose emissions trigger resubscription.
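For reference, a small self-contained demo of that retryWhen shape (illustrative; the operator and signature are standard RxJava 1.x):

import java.util.concurrent.TimeUnit;

import rx.Observable;

public class RetryWhenDemo {
    public static void main(String[] args) {
        Observable.<Integer>error(new RuntimeException("boom"))
            // the errors Observable is transformed into a control stream:
            // each emission triggers a resubscription, and its completion
            // (after three errors here) completes the result.
            .retryWhen(errors -> errors.take(3).delay(1, TimeUnit.SECONDS))
            .toBlocking()
            .forEach(System.out::println); // completes after 3 delayed retries
    }
}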
Not really, no. The values emitted to the requests Observable are just the amounts requested downstream.
I guess, but it seems strange to have a new type to describe something that can be done by applying an existing operator. It also limits the user to only the things that we've thought of. Like in a bizarre world where someone wants to use an operator on the request stream that we never anticipated.

The place where my idea gets awkward is how the returned observable terminates; the requests observable by itself never completes:

public static <T> Observable<T> create(final Transformer<Long, T> requestsToValues) {
    return create((subscriber) -> {
        PublishSubject<Long> requestsSubject = PublishSubject.create();
        // forward downstream request(n) calls into the subject
        subscriber.setProducer(requestsSubject::onNext);
        requestsToValues.call(requestsSubject).subscribe(subscriber);
    });
}

Here is a proof of concept of what it may look like to use it:

import rx.Observable;

public class Main {
    public static void main(String[] args) {
        // createAsync is the factory proposed above: the user transforms
        // the stream of requested amounts into the stream of values.
        Observable.createAsync(requests ->
                getCount().flatMap(max ->
                        requests
                                .scan(new State(), State::next)
                                .takeWhile(state -> state.start < max)
                                .concatMap(Main::getPage)));
    }

    private static class State {
        long start;
        long amount;

        public State next(long n) {
            State next = new State();
            next.start = this.start + amount;
            next.amount = n;
            return next;
        }
    }

    // stubs; a real implementation would call the remote service
    private static Observable<? extends Long> getCount() {
        return null;
    }

    private static Observable<? extends String> getPage(State request) {
        return null;
    }
}

Anyway, if you think this API has any merit, we should probably move this off to another issue or PR.
@benjchristensen Can you comment?
I find this API valuable, and it allows the batch size and the request(n) size to differ, which is exactly what is needed and wanted. Here is example code showing this behavior:

package scratch;
import rx.Observable;
import rx.Observer;
import rx.observables.AsyncOnSubscribe;
import rx.schedulers.Schedulers;

public class TestAsyncOnSubscribe {
    public static void main(String[] args) {
        Observable<String> obs = Observable.create(new AsyncOnSubscribe<Integer, String>() {
            final int batchSize = 50;

            @Override
            protected Integer generateState() {
                return 1;
            }

            @Override
            protected Integer next(Integer state, long requested,
                    Observer<Observable<? extends String>> observer) {
                System.out.println("new async batch starting at: " + state + " requested: " + requested);
                observer.onNext(Observable
                        .range(state, batchSize)
                        .doOnSubscribe(() -> System.out.println("Subscribing to batch starting at: " + state))
                        .map(i -> "value_" + i)
                        .subscribeOn(Schedulers.computation()));
                return state + batchSize;
            }
        });

        //obs.take(15).toBlocking().forEach(v -> System.out.println(v));
        obs.observeOn(Schedulers.computation()).take(250).toBlocking().forEach(
                v -> System.out.println(v));
    }
}
I'm using 250, even though it makes the output longer, so that it goes beyond the 128 that observeOn requests by default. Note how the batch size of 50 stays independent of the varying request amounts, and the values still arrive in order.
One possible change would be to not invoke next again until the previous batch has been fully consumed.

And if Long.MAX_VALUE is requested? It correctly behaves by executing batches of 50 repeatedly, one after the other.
This API is useful and very powerful when composing asynchronous, batch-based sources.
What if the responded page has a different size than requested? Say request(50) is called, and you decided to load data in chunks of 20 items, but your RPC call responds with 10. This is one of the reasons I created the PortionObservable in https://github.com/Rabajaba/rx-utils (maybe it will be helpful here).
Why does it return 10 when 20 is requested? If it's because it's the end, that's fine, as it should then cause termination. If the server can't actually respond deterministically, then it seems that using offset/limit is the wrong approach.

Your PortionObservable solution looks interesting. I browsed the code briefly, but not enough to grok the tradeoffs. It would be interesting for you and @stealthcode to compare solutions and see if AsyncOnSubscribe should change in any way before promoting it.
@benjchristensen Thanks for the feedback. @stealthcode One feature I haven't implemented yet: parallel loading of pages while results are still emitted in the correct order. Use case: you have an external slow API which responds no faster than 10 seconds, no matter what your request parameters are. Loading pages in parallel will significantly improve performance. Let me create another implementation of PortionObservable using AsyncOnSubscribe and see whether my unit tests still pass.
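For what it's worth, RxJava 1.x's experimental concatMapEager already provides that "load in parallel, emit in order" behavior. A small illustrative sketch, with the slow API simulated by a delay:

import java.util.concurrent.TimeUnit;

import rx.Observable;
import rx.schedulers.Schedulers;

public class ParallelPages {
    // Simulated slow page load: each page takes one second.
    static Observable<String> page(int index) {
        return Observable.just("page-" + index)
                .delay(1, TimeUnit.SECONDS)
                .subscribeOn(Schedulers.io());
    }

    public static void main(String[] args) {
        // Pages are subscribed eagerly (in parallel), but their results
        // are emitted strictly in page order; total time is ~1s, not ~5s.
        Observable.range(0, 5)
                .concatMapEager(ParallelPages::page)
                .toBlocking()
                .forEach(System.out::println);
    }
}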
Ben's use case, which disregards the request amount in the generator callback, can be written with SyncOnSubscribe:

Observable.create(
    SyncOnSubscribe.<Integer, Observable<Integer>>createSingleState(
        () -> 1,
        (s, o) -> o.onNext(Observable.range(1, 50))
    )
)
.flatMap(o -> o, 1)
.subscribe(...)

flatMap will request 1 and request another only after the previously generated inner Observable completes. If you want to pre-generate these sources, use a larger maxConcurrent value in flatMap.
If I remember correctly, API promotion happened after voting, before the minor version changed (0.x -> 1.0, 1.0 -> 1.1). If you wish, we can vote now @ReactiveX/rxjava-committers, or just before 1.2. If voted for now, then after a rebase this can be merged and won't cause you problems with the ongoing cleanup work. If this can wait till 1.2, you can close the PR.
Closing as not mergeable. If you have further input on the issue, don't hesitate to reopen this issue or post a new one.