@eregon
Member

@eregon eregon commented Mar 23, 2023

  • These singleton classes are expensive to create and also cause a lot of uncached method lookups.
  • Both Post.create and Post.where(id: i).first create a new singleton class per call (through #clone of a dataset object which has a singleton class).
  • With this commit, no singleton classes are created during insertion or in run_benchmark, and therefore no uncached method lookups happen there either.
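
To illustrate the mechanism described above, here is a minimal sketch in plain Ruby. `FakeDataset` and `ModelDatasetMethods` are hypothetical names made up for this example and are not Sequel's actual classes; the only assumption is that a chainable query method returns a `clone` of an object that already has a singleton class.

```ruby
# Hypothetical stand-in for a dataset class; this is NOT Sequel's code.
class FakeDataset
  attr_reader :opts

  def initialize(opts = {})
    @opts = opts
  end

  # Returns a modified copy, the way chainable query methods usually do.
  def where(cond)
    copy = clone # Object#clone also copies the receiver's singleton class
    copy.instance_variable_set(:@opts, @opts.merge(where: cond))
    copy
  end
end

# Hypothetical module standing in for per-dataset methods.
module ModelDatasetMethods
  def first
    :row
  end
end

base = FakeDataset.new
base.extend(ModelDatasetMethods) # gives `base` its own singleton class

a = base.where(id: 1)
b = base.where(id: 2)

# Each clone gets its own copy of the singleton class, so a call site such as
# `ds.first` sees a different receiver class for every clone.
puts a.singleton_class.equal?(b.singleton_class) # false
puts a.first.inspect                              # :row

# Object#dup does not copy the singleton class, so the extension is lost.
puts base.dup.respond_to?(:first)                 # false
```

Since inline method caches are keyed on the receiver's class, a fresh singleton class per call is what would make the lookups at such call sites uncached.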

We discussed this in detail on the CRuby Slack. @jeremyevans suggested using Post[i] or Post.first(id: i) instead of Post.where(id: i).first.
I think what's important is to benchmark something representative of what users would use.
@jeremyevans Do you know if Post[i] or Post.first(id: i) is frequently used by Sequel users? What's recommended in the documentation?

Numbers on 3.2.1 YJIT:

  • Post.where(id: i).first (original): 130ms
  • Post.first(id: i): 79ms
  • Post[i]: 68ms
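
For reference, the relative speedups implied by these numbers can be worked out with a few lines of Ruby. The hash keys are just labels for the three call forms, and the values are the timings listed above; nothing here is measured by the snippet itself.

```ruby
# Timings reported above (3.2.1 YJIT), in milliseconds.
timings_ms = {
  "Post.where(id: i).first" => 130,
  "Post.first(id: i)"       => 79,
  "Post[i]"                 => 68,
}

baseline = timings_ms.fetch("Post.where(id: i).first")

# Speedup of each form relative to the original Post.where(id: i).first.
speedups = timings_ms.transform_values { |ms| (baseline.to_f / ms).round(2) }

speedups.each { |form, ratio| puts format("%-26s %.2fx", form, ratio) }
```

So Post.first(id: i) is roughly 1.65x and Post[i] roughly 1.91x faster than the original form on these numbers.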

For DB[:posts].insert() vs Post.create(), I think Post.create() would be more representative, but OTOH this is only part of the setup, not of the measured time, and it seems good to avoid too many side effects during the setup (Post.create() does cause one uncached method lookup, at a literal_integer call site, for Post[i]).
I think this is something to improve in Sequel: Post.create() should be able to cache the dataset or avoid creating a new one, similar to what DB[:posts].insert() does (jeremyevans/sequel#2007). Once that's fixed in Sequel we could use Post.create() again.

Related: #205 #159

@eregon
Member Author

eregon commented Mar 23, 2023

The ActiveRecord benchmark does post = Post.where(id: i).first. Is that representative code for ActiveRecord, @tenderlove (#5)? Another possibility seems to be post = Post.find(i).

@eregon
Member Author

eregon commented Mar 23, 2023

For the ActiveRecord benchmark, Post.find(i) seems ~2x faster, 46ms vs 110ms on 3.2.1 YJIT, 84ms vs 172ms without YJIT. => #208

@maximecb
Contributor

It seems to make sense to have more idiomatic/efficient use of Sequel in the benchmark.

@jeremyevans any input?

@jeremyevans
Contributor

Post[i] is definitely more idiomatic and efficient for Sequel. However, if you are using models, Post.create is more idiomatic than DB[:posts].insert, though it is less efficient (create handles validations, callbacks, etc.). I should be able to commit a change to Sequel today to avoid the use of singleton classes for datasets.

@jeremyevans
Contributor

FWIW, all extensions and plugins other than timestamps should be removed from the benchmark. The only one that has an effect is prepared_statements, and that's no longer recommended for general use.

@maximecb
Contributor

Thank you Jeremy.

* And it will likely be fixed in Sequel to avoid creating a singleton class per call.
@hmistry
Contributor

hmistry commented Mar 23, 2023

@jeremyevans @eregon My thinking in adding this benchmark was not to find the most performant way to do a query in Sequel, but to see whether and how much YJIT can optimize code that uses the plugin strategy, because it is a chain of method overrides using modules.

Sequel has multiple ways to do a query, so maybe the strategy should be to see how the fastest and slowest query methods perform under YJIT. If I were developing a JIT, I would want different code strategies to see how the JIT compiles and optimizes them.

@eregon
Member Author

eregon commented Mar 23, 2023

@hmistry I understand. I think what matters, though, is that the benchmark is representative and measures something meaningful. If it benchmarks something nobody or very few people use, then it's not a very useful benchmark, because it might lead to optimizing things that are irrelevant for actual performance in practice (and optimizing things takes time).

In this case, creating many singleton classes and doing many uncached method lookups is pretty much pointless to benchmark on YJIT: this is handled by the interpreter, and YJIT or any other JIT can do little about it. Basically we found a performance issue in Sequel and that's getting fixed (jeremyevans/sequel#2007).

Regarding the plugin strategy, AFAIK that is still being benchmarked with this PR (at least the various plugins are in Post's ancestors).
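
As a rough illustration of what "a chain of method overrides using modules" means, here is a simplified sketch in plain Ruby. `BaseModel`, its `plugin` class method, and the `Validation` and `Timestamps` modules are hypothetical names for this example only; Sequel's real plugin mechanism is more involved, and this only assumes that each plugin contributes a module that overrides a hook and calls `super`.

```ruby
# Hypothetical base class; a simplified model, NOT Sequel::Model itself.
class BaseModel
  # Assumed plugin mechanism: just include the plugin's module.
  def self.plugin(mod)
    include(mod)
  end

  def save
    steps = []
    before_save(steps) # dispatches through every plugin override via super
    steps << :insert
    steps
  end

  def before_save(steps)
    steps << :base
  end
end

module Validation
  def before_save(steps)
    steps << :validation
    super
  end
end

module Timestamps
  def before_save(steps)
    steps << :timestamps
    super
  end
end

class Post < BaseModel
  plugin Validation
  plugin Timestamps
end

# Modules included later sit closer to the class in the ancestor chain.
p Post.ancestors.take(4) # [Post, Timestamps, Validation, BaseModel]
p Post.new.save          # [:timestamps, :validation, :base, :insert]
```

As long as the plugin modules stay in the model's ancestors, calls like `save` still go through this chain of `super` calls, whichever query form the benchmark uses.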

@eregon eregon deleted the improve-sequel-benchmark branch May 2, 2023 11:20
eregon added a commit to eregon/ruby-bench that referenced this pull request May 5, 2023