Skip to content

Add a method memoizer to dartdoc #1571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 14, 2017
Merged

Add a method memoizer to dartdoc #1571

merged 12 commits into from
Dec 14, 2017

Conversation

jcollins-g
Copy link
Contributor

@jcollins-g jcollins-g commented Dec 12, 2017

First version of a memoizer to eliminate code of the form:

class GinormousClass extends GinormousInterface {
  Thing _foo;
  Thing foo {
    if (_foo == null) {
      _foo = doExpensiveThing();
    }
    return _foo;
  }
}

and replace it with

class GinormousClass extends GinormousInterface {
  Memoizer _memoizer = new Memoizer();

  Thing foo => _memoizer.memoize(doExpensiveThing);
}

The cache gets declared once, and from that point on you don't have to worry about it. The class can also handle a variety of other "cache the result of this" tasks that dartdoc currently implements separately, over and over again.

Alternatively, it could conceivably be used as a mixin:

class GinormousClass extends GinormousInterface with Memoizer {
  Thing foo => memoize(doExpensiveThing);
}

This has some disadvantages, including polluting the namespace. Probably the former is what I will go with to update dartdoc.

I am building this differently from package memoize because Dartdoc needs a per-object cache, and in the future dartdoc will need cache invalidation and possibly serialization of cached entities. Some elements of this work might eventually make it to that package, though.

@googlebot googlebot added the cla: yes Google CLA check succeeded. label Dec 12, 2017
HashableList(Iterable<dynamic> iterable) : super(iterable);

@override
bool operator ==(other) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, this is not symmetric, since HashableList will compare equal to an ordinary list, but not vice versa. Do you need this behavior?

If not, I'd suggest adding an if test, like so:

@override
bool operator ==(other) {
  if (other is HashableList) {
    if (this.length == other.length) {
      ...
    }
  }
  return false;
}

As a side benefit, the operations on other will no longer require dynamic dispatches, so they will be faster on some implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, assymetric behavior was not needed and if I ever ran into it I'd need this. Done.


// Never write this if statement again.
R _getOrUpdateCache<R>(R Function() f, _MemoKey key) {
if (!_memoizationTable.containsKey(key)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be more performant:

R _getOrUpdateCache<R>(R Function() f, _MemoKey key) {
  return _memoizationTable.putIfAbsent(key, f);
}

...because putIfAbsent only needs to hash the key once.

(Note that in ordinary circumstances this benefit is offset by the expense of creating the closure to pass to putIfAbsent, but in this particular case we're paying that expense anyhow).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but I don't think this works because the code would have to read:

R _getOrUpdateCache<R>(R Function() f, _MemoKey key) {
  return _memoizationTable.putIfAbsent(key, f());
}

I think that means that the function f will be called before the decision in putIfAbsent, which sadly makes this whole exercise less than useful. I suppose I could create my own Map wrapper that takes a closure and doesn't execute it until the ifAbsent check is made, but that seems like a negative marginal return on investment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually if you look at the definition of putIfAbsent you'll see that it already has the behavior we want:

  /**                                                                                                                                                                                                                                                                                     
   * Look up the value of [key], or add a new value if it isn't there.                                                                                                                                                                                                                    
   *                                                                                                                                                                                                                                                                                      
   * Returns the value associated to [key], if there is one.                                                                                                                                                                                                                              
   * Otherwise calls [ifAbsent] to get a new value, associates [key] to                                                                                                                                                                                                                   
   * that value, and then returns the new value.                                                                                                                                                                                                                                          
   *                                                                                                                                                                                                                                                                                      
   *     Map<String, int> scores = {'Bob': 36};                                                                                                                                                                                                                                           
   *     for (var key in ['Bob', 'Rohan', 'Sophena']) {                                                                                                                                                                                                                                   
   *       scores.putIfAbsent(key, () => key.length);                                                                                                                                                                                                                                     
   *     }                                                                                                                                                                                                                                                                                
   *     scores['Bob'];      // 36                                                                                                                                                                                                                                                        
   *     scores['Rohan'];    //  5                                                                                                                                                                                                                                                        
   *     scores['Sophena'];  //  7                                                                                                                                                                                                                                                        
   *                                                                                                                                                                                                                                                                                      
   * Calling [ifAbsent] must not add or remove keys from the map.                                                                                                                                                                                                                         
   */
  V putIfAbsent(K key, V ifAbsent());

So you should be able to do this directly:

R _getOrUpdateCache<R>(R Function() f, _MemoKey key) {
  return _memoizationTable.putIfAbsent(key, f); // Note lack of `()` after `f`
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

facepalm I don't know why I thought putAbsent worked the other way. Maybe because of Python's dict.setdefault. Fixed.

}

/// A type alias for [Tuple2]`<Function, HashableList>`.
class _MemoKey extends Tuple2<Function, HashableList> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this class? It looks like every case of new _MemoKey(f, new HashableList([param1, param2, ...]) could be replaced with new HashableList([f, param1, param2, ...]), and we would save a memory allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. This is a refactoring leftover from before I added support for method parameters. Punting Tuple makes it more likely this can make it in to the memoize package someday. Done.

}

/// Calls and caches the return value of [f]() if not in the cache, then
/// returns the cached value of [f]().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to add a comment explaining that f should be a tear-off of a method, not an inline closure, in order to guarantee that on subsequent calls the value of f will compare equal. For example, clients should do this:

_doExpensiveThing() { ... }
memoizedExpensiveThing() => memoized(_doExpensiveThing);

rather than this:

memoizedExpensiveThing() => memoized(() { ... });

And similar comments in memoized1, memoized2, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave an outline of this idea in the class docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Google CLA check succeeded.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants