Object.hashCode is slow? #24206
Regarding the current design: Since most objects will never have their hash code computed, dedicating a word to it in all of them was deemed wasteful. The default implementation of Object.hashCode uses an RNG rather than object addresses so that the GC can relocate objects, although currently we only relocate objects in the young generation. |
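To illustrate the constraint this describes, here is a small check of my own (not from the thread): the identity hash must stay stable for an object's lifetime even though the GC may move it, which is why a raw address cannot be used directly.

```dart
// Illustrative only: the identity hash observed before heavy allocation
// (which likely triggers young-generation GCs and may move `probe`)
// must equal the one observed afterwards.
void main() {
  final probe = Object();
  final before = identityHashCode(probe);

  var keepAlive;
  for (var i = 0; i < 1000000; i++) {
    keepAlive = List.filled(8, i); // churn the young generation
  }

  final after = identityHashCode(probe);
  print('hash stable: ${before == after} (kept ${keepAlive.length} ints)');
}
```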
For what it's worth, this was a noticeable performance issue in dartfmt, which I had to work around by adding my own hash code base class. |
To give you an idea of the overhead Daniel mentioned in his comment, I ran dart2js (the default version) compiling itself and collected the allocation information. From that I selected the top-ten object allocations. Based on the average object size I can give you the percentage overhead of adding a field to each of these objects. None of the objects in this list would benefit from an extra hash code field, but since they are all instances, your suggestion would give each of them one. The size is the cumulative allocated size, to give an idea of how much more memory we would have to allocate and touch.

LinkEntry: +25% on top of 3.0GB

Given these overhead numbers it should be pretty clear that adding an extra field for a hash in all objects adds significant memory overhead, reduces cache efficiency, adds runtime cost for allocation, adds extra GC pressure, and more. For the comparatively few objects that need a hash value, you can either cache the value in an extra field as you suggest or rely on the address-based lookup of the hash value. Since the objects you need to put into identity maps are usually under your control, I suggest that you add the field to those. Then you get the best of both worlds: low overhead and fast hash access.

Regarding the comment about the use of a random number: this gives you a much better spread of hash values than using object addresses, which tend to have alignment issues.

P.S. As Bob suggests, you should cap the counter value so as not to grow into arbitrarily large integers, or use a random number with sufficient bits. |
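For illustration, a minimal sketch (not the actual dartfmt code; names are hypothetical) of the kind of base class described above, with the counter capped as suggested in the P.S. so hash codes stay in small-integer range:

```dart
/// Hypothetical base class: caches a counter-derived hash code in a field
/// so identity-hashed instances never hit the VM's hash lookup.
class FastHash {
  static int _counter = 0;

  @override
  final int hashCode = _nextHash();

  static int _nextHash() {
    // The mask keeps the value a small integer instead of growing without bound.
    _counter = (_counter + 1) & 0x3FFFFFFF;
    return _counter;
  }
}

// Usage: classes that end up in identity maps/sets extend the base class.
class AstNode extends FastHash {}
```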
I see, yeah, a field is not going to work. Could it be that the VM's hash table has a scalability issue, though? It seems that Object.hashCode is pretty fast until a critical point is reached, at which point it becomes REALLY slow. Please look at this:

I'm fine with using the manual fix myself. Right now I'm just concerned about the unpredictable performance. It seems like one of those lurking problems that most people have to learn the hard way (e.g. because their web server suddenly becomes unresponsive). |
Thank you for your concern. I just reran your example benchmark, with fixes which factor out process startup, allocation and compilation artifacts:

```dart
class Foo {
}
int test(int count) {
var list = new List(count);
var sum = 0;
var c1000 = count / 1000.0;
var foo;
var sw = new Stopwatch();
sw.start();
for (var i = 0; i < count; i++) {
list[i] = new Foo();
}
sw.stop();
var lap0 = sw.elapsedMicroseconds / c1000;
sw..reset()..start();
for (var i = 0; i < count; i++) {
sum ^= foo.hashCode; // Ensure to not ever overflow into Mint or BigInt.
}
sw.stop();
var lap1 = sw.elapsedMicroseconds / c1000;
print("$count, ${lap0.toStringAsFixed(3)}, ${lap1.toStringAsFixed(3)}");
return sum;
}
warmup() {
print("** Warmump **");
test(1000);
test(10000);
}
main(List<String> args) {
var arg1 = (args.length > 0) ? int.parse(args[0]) : 10000;
var arg2 = (args.length > 1) ? int.parse(args[1]) : 300000;
var arg3 = (args.length > 2) ? int.parse(args[2]) : 10000;
warmup();
print("** Test **");
for (var c = arg1; c <= arg2; c += arg3) {
test(c);
}
}
```

Once these measurement errors have been factored out we can see that the time in microseconds taken per 1000 calls to foo.hashCode is remarkably flat, while the time taken to allocate 1000 Foo objects obviously fluctuates depending on whether we hit a GC in one of the allocation loops:
I hope this also addresses your concern about unpredictable performance. |
"foo" is never assigned, so it's just computing null.hashCode all the time. After changing it to list[i].hashCode, the numbers in the second column get a lot bigger and also fluctuate. Outside of the GC noise there does not seem to be an upward trend, but the spikes themselves are likely due to the hash table, since they go away when hashCode is overridden. I noticed that 32-bit Dart runs the benchmark a lot more smoothly. Don't know what to make of that; maybe it's just faster.
|
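For clarity, a corrected form of the second timing loop from the benchmark above (a sketch of the fix described in the previous comment, not the exact code that was rerun), hashing the allocated objects instead of the never-assigned `foo`:

```dart
// Replaces the second loop in the benchmark: hash each Foo stored in `list`
// rather than `foo`, which was never assigned (so it measured null.hashCode).
int hashAll(List list, Stopwatch sw) {
  var sum = 0;
  sw..reset()..start();
  for (var i = 0; i < list.length; i++) {
    sum ^= list[i].hashCode;
  }
  sw.stop();
  return sum;
}
```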
Thanks! Another reason not to be writing code at midnight. :-( I'll investigate where the occasional spike comes from. |
We store identity hash codes on 64-bit platforms in the object header (by using free bits in the header). On 32-bit platforms we don't have the space in the header word and therefore call out to the runtime. Possible solutions for making identity hash codes faster on 32-bit platforms include:
If it's really important to solve it, I would suggest doing 2. from above. |
The canonical const objects table currently uses the object header, and this has been a consistent problem for us.
I think we should re-evaluate the memory overhead of adding a hash value per object on 32-bit architectures and see if it makes things easier for us. We already have hash as a field in RawString, RawClosure, RawTypeArguments, RawType, RawTypeParameter, RawBoundedType etc. |
@asgerf, @mkustermann What is the update on this issue? I learned this the hard way, looking everywhere in my code for some kind of memory leak that I couldn't find. I lost a whole day only to reach the conclusion that it was my design "improvement" (reducing the number of sets by joining values in one class) that was hurting performance so badly.

Even if on average it is better to have an inefficient hashCode rather than to increase the memory allocation of objects, you have to consider expectations and variability. A 10x variability is unacceptable. Ensuring consistency has a cost; it has always been like that. The numbers shown by @iposva-google are not a strong enough reason. If this extra memory allocation could be an issue, I suggest creating an alternative Object, perhaps PrimitiveObject, with Object extending PrimitiveObject and adding a fast hashCode implementation.

Finally, why does it have to be so slow? Why isn't the hashCode simply the memory allocation pointer, which must be unique for an Object? |
@icatalud Unless you were benchmarking on ARM32 or X86, I suspect that you were most likely bitten by a regression that was introduced somewhere in June (shipped with 2.5) and was recently fixed on the master branch by 60328ca. When I run your benchmark on an X64 Dart VM built at HEAD I get the following numbers:
which seem fine to me. On 64-bit architectures (ARM64 and X64) we in fact store the identity hash code in the object header (because there is some free space there).
Objects can move around in memory (Dart has a moving garbage collector, which moves objects when promoting them from the young to the old generation or when compacting the heap), so the pointer can change during execution. There is a way to make this work (if an object has had its identity hash code used, you can enlarge it by one word when relocating), but it is rather complex, and given that 32-bit architectures might be going out of focus it is not a high priority for us to implement. |
@mraleph Switching to master and upgrading solved the issue, as you pointed out. I'm running Dart on a 64-bit machine, so I'm assuming Dart is executing as a 64-bit binary. When running on stable, however, I get the following results:
I can assure you (from my own experience) that this could be producing serious confusion in other developers who are clueless about the big performance drops they might be having. This is currently happening on stable. My understanding of hashCode and equality is the following:
I do not think it is considerate to Dart users to take these issues so lightly. Consistency is an important aspect of every programming language. If you cannot include a specialized solution, this should at least be clearly documented. I already explained the cost that this had in my case. You cannot leave 32-bit behind with that heavy a cost without a warning; it implies some programs running 10 times faster on 64-bit machines. If so, 32-bit machines should simply not be supported, or the drawbacks should be clearly pointed out with warnings discouraging the use of Dart on 32-bit machines. This is important for developers to know when choosing a programming language. Imagine you are someone who chooses Dart as the language for a solution targeting 32-bit machines, when by choosing a different language you could have a 10x improvement. Even Python or JS perform better in this case; as an example, in Python this runs in 100 us, faster than Dart. It can be easily solved, but why would you know about this?
Even if the pointer moves around, the hashCode could be stored the first time using the current pointer value, and I would expect collisions to be rare. Otherwise the hashCode could be created from a global id, like an incrementing counter. |
No, this is not a correct understanding. On X64 we freed some space (32 bits, to be precise) in the object header so that we could store the hash code in an object without increasing object size. This happened several years ago. However, in Dart 2.5 we had a regression - we accidentally disabled the fast way of accessing this hash code storage - which is what you observed in your benchmarks.

On 32-bit we don't have the luxury of free space in an object to put the hash code into, so for now we opted to leave this choice in the developer's hands. A developer can always override hashCode. As you can see, we are keeping this bug open in the hope that one day we will also improve the default hashCode on 32-bit platforms.
To store it you need to have some space in an object - which is what we want to avoid (on 32-bit platforms). |
This is still an issue on stable 64-bit, which is what most people probably use. Shouldn't it get hotfixed or something?
The problem is not whether there is an easy workaround; the problem is that this is unknown - there is no warning about it. Put yourself in the shoes of a developer who decided to port their Python machine-learning code (or anything else you can think of that requires lots of hash operations) to Dart. After a lot of work, to their disappointment the Dart code is slower than the Python code, and they have no clue why. Sure, there is an easy fix, but it is unknown and unintuitive; what would lead them to think that the hash data structures they are using are what is making their program underperform?
The cost of variability is not being weighed correctly. Making things work more efficiently in most cases might be slightly better on average, but the cost for the people who lose out is too high. If you had suffered this yourself, you would think differently; the problem is that you already know about it, so it's natural for you to take it into account. Like I said, increasing memory utilization by 20% (I suspect that in most programs it would be much less than that) is acceptable, but dropping performance by an order of magnitude is not. The default use of objects in hash data structures should be efficient, because it is common in any programming language. As precious as memory might be, consistency is even more precious. There is always a sacrifice of average outcomes for the sake of consistency; this rule applies to all human systems (from computer programs to the free market and the law). Caring about consistency is having respect for humans and not treating them like numbers; it means understanding that it's not fair that some people randomly get the bad surprises from the box, because we wouldn't want to be those ones.
It's completely acceptable to state that Dart is designed to be more efficient on 64-bit machines, and for now to pay the cost of extra memory on 32-bit machines by adopting the simple solution of an extra field in 32-bit objects. It is even possible to still offer the option of creating lighter objects by having two classes, Object and PrimitiveObject, but the default Object that most people are using should be the one that is consistent. If someone were optimizing for memory, they could switch to the lighter PrimitiveObject. I believe that people for whom the extra 20% memory is relevant are a rarer case than those accessing the Object hashCode. Even some basic data structures like List or Map could extend this PrimitiveObject.

Update: I deleted comments that didn't add content to the issue. The possible solution I posted was already specified here, and the question before was off topic, as pointed out below. |
Then you need to update references from other objects to this object. You know all of these references anyway when doing precise garbage collection. This is classical garbage collection material (see Wikipedia, for example, or more specialized literature if you have access to it). In any case, this is off topic for this bug. |
I think most Dart programs are not at all bound by Object.hashCode performance, so they are unlikely to be significantly affected by the regression. It will be fixed in the next stable release, and in the next dev release. We generally don't put out stable hotfix releases unless there are significant issues like security problems, crashes, etc.
Making every object larger affects the performance for all Dart users. Increasing object size reduces the number of objects which fit in cache, which leads to more cache misses, which in turn harms performance. Making objects smaller at the expense of a slower Object.hashCode only affects the smaller number of users who use that method. |
That comment completely dismisses my previous comment about taking everyone into account and not just averages (or "most of"), especially with regard to high variability. You are right about 99% of the cases not being affected, because even if a piece of software runs 10 times slower it still does what it has to do fast enough (slow languages perform well in most use cases), but you have no idea how much time this is costing the 0.1% of cases. I actually stumbled upon this problem one month ago, when just by chance I was unsatisfied with the performance of a library I have been working on. I changed the way I was registering objects by giving them an int id and manipulating everything internally with this id. The benchmarks improved by 4x; I thought, wow, my new design is brilliant, I'm taking advantage of int sets that have some special implementation and are super fast. I could believe this was the case, thinking maybe it had to do with using contiguous numbers and the

You are not measuring the cost of inconsistency (variability) fairly; you cannot be content because most people (do you have numbers to support that claim anyway?) are fine with leaving a small percentage in the trash. I told you my story, but there must be other developers who suffered this.
Almost any developer would like to know that their programs consistently run according to the complexity bounds of the data structures they are using, and that they don't have to wonder whether they did something that leaves their program in the 1% case where it works slower than Python. This applies to 32-bit Dart developers too. Once again, consistency is key. If 32-bit programs have to run 5% to 10% slower on average, but you can ensure that my program is always going to perform according to what is expected and I don't have to worry about details, then I accept it.
If we are talking about the number of users or the average use case, that statement lacks the numbers to support the amount of impact this has. The 20% to 50% memory increase is for the extreme cases of benchmarks using minimal native data structures. I would argue that most users create classes and those classes have plenty of fields inside, but I do not have numbers to support that claim. I would also argue that many users (if not most) use the Map (json) or Set data structures and put objects inside them. In those cases the performance gain might outweigh the performance cost caused by the slight increase in memory usage. In conclusion:
Given what happened to me, and what might be happening to others, it is inconsiderate to them not to find this a significant issue. |
Having a second thought about this issue today, I realized that no one has explained with a clear rationale what is going on. Theoretically, the hash-table solution that is implemented shouldn't be as bad as the benchmarks show. Whenever the hashCode is accessed, it is probably to use it within a hash data structure, so getting the hashCode from a hash table is like duplicating the hash data structure's effort. By that logic the additional time it takes shouldn't go beyond 2x or 3x (3x because of the extra overhead). So why does it reach 10x? I thought the benchmark was not representative because it was adding a single value, but I attempted the same with the set filled with random values and the result is similar. I also thought that maybe the hashCode getter was being used more than once, but that is not the case. In summary, there is something odd with the implementation; the performance drop shouldn't be as bad.

Aside from that comment, I think I found a way to convince you @munificent and @mraleph that the current implementation is wrong (at least until it is justified to be beneficial). Imagine it was the other way around and the hashCode was stored within the object. If someone came and told you:

Certainly not; there are quite a few reasons that go against that proposition, starting from the expectation developers have that using a getter is almost like directly retrieving a value, through to the actual impact this could have, since retrieving the value from a hash table implies at least duplicating the cost of hash data structure retrieval operations - which makes one wonder whether it is actually a good idea to have a hashCode getter at all. Perhaps it would be less misleading to remove the getter if it is hardly ever used, and force users to create the getter if they need it in hashable objects, or to extend a HashableObject class instead of Object.

The trouble with the hashCode implementation is that it was implemented without thoroughly justifying it - exactly the opposite of what I explained above. Currently you are in the position where all the checks were skipped and things are upside down. Everyone expects and thinks that hashCode is just a value stored in the object, but in reality it works differently. Somehow this has been accepted as normal and no one has a clue about how much negative impact this could be having. There is only one reason in favor of that implementation, a possible performance gain, and that is only a maybe, because no one knows it for sure. But there is a list of reasons that go against it (I listed them in the previous comment).
Thinking about that comment, here is another way to put it. Imagine exactly the same problem happened after a C++ update: some programs would run 10 times slower. Can you imagine how significant that would be, or the impact it would have on Google servers? It could be a disaster, and in fact it would be better if the latest update caused the programs to crash, because then people would immediately switch back to the previous version of the compiler. But if the programs are unexpectedly running 10 times slower, then unless the users had a performance benchmark they would naively think that their software is in a better spot with the new stable version of the compiler. In the case of Dart, because it is not so widely used, these bugs go more unnoticed, but you should have the same carefulness you would have when updating the C++ compiler. Remember that when one person posts about an issue it usually means it happened to 10 other people, because only a small percentage of people post issues. About the performance story I told you: here is the link to the code I "optimized" by luck when using an int id instead of objects.

One of the general problems I see whenever I post something is that I read a lot of reasons that are based on arbitrary statements. Arbitrary statements are dangerous, because they make us feel that we are right about the choices we make, when there is an open possibility that the statement itself is completely wrong. Decisions should be taken based on rational, objective facts (within the possibilities of acquiring them).
Yes, because that is what happened, at least as I understand it. Many of the original Dart team members came from V8. My understanding is that V8 does pay the memory and performance cost for storing the hash code in every object. They didn't feel that was a good optimization, but couldn't remove it because it would regress some benchmarks, even if it made most JS code faster on average. When they created Dart, they deliberately avoided repeating what they considered to be a mistake.
That expectation doesn't exist in Dart. Iterable.length can take infinite time:

```dart
Iterable<String> get forever sync* { while (true) yield "forever"; }
main() {
print(forever.length);
}
```
That was also part of the initial design of Dart. Originally, Object had no hashCode. Basically, you can have three things:
But you only get to pick two. Dart picks 1 and 3. When 2 is a problem, it's easy to implement hashCode yourself. |
It surprises me that it happened like that. Even then, you shouldn't accept it as OK just because that's how they did it; you should think about whether that is rationally a good way to make decisions and whether you think things should be done like that.
Decisions shouldn't be based on feelings; they should be based on facts. There is a "pressure" for Dart to be more efficient than nodejs, to prove that it is worth switching to; therefore feelings are likely to be aligned with stretching the reasoning in favor of conceding reliability for performance (it's not necessarily like that, but these kinds of things hide behind feelings). I think you shouldn't compare to anything but Dart itself: just make a solid, reliable language in all aspects, and once you have that you can optimize the VM as long as the behavior stays consistent. I believe granular performance is hardly ever taken into account; as evidence of that claim, there are tons of Python servers which probably run more than 5 times slower than nodejs. What that means is that most people are not choosing a language based on performance; that's only one of the factors taken into consideration.
You are right that the expectation doesn't exist in Dart, as specified in the Effective Dart design section (I just read it). It should be mentioned in the tour. In the design section they do mention that most languages expect getters to perform minimal, fast operations. New developers are probably going to have that expectation, as I intuitively did, at least with regard to native getters.
But it would need to be clearly documented; is that what the casual Dart developer is going to do? The general problem I'm seeing is that you are innovating in a language's behavior, which is fair as long as it is clearly pointed out. But it's not: I had no clue about this, it is not specified in the Object.hashCode docs either (I discovered it through benchmarking), and it is not consistent - depending on the platform it has a variability of 10x. Sometimes it is a good idea to override, sometimes it's not.
Exactly because the use cases are unpredictable is why I think it's better to aim for consistency. If it's truly so important to save the space, perhaps a better solution would be to automatically wrap the Object with another proxy Object that has a hashCode whenever the getter is used. I'm not aware of the internals, though; I will ask on Stack Overflow about that.
That's a good segmentation of the trade-offs; however, when deciding which options to take, it's not just about picking randomly. Every one of those points should be evaluated in depth, thinking through their implications. About number 2: what are the implications for developers? Will that make them freer? Are people coming from JS going to take that into account? What will happen if people forget to override the hashCode - what will the implications be for their program? Is it intuitive? And so forth. When making decisions about a product, they shouldn't be based on feelings; they should be based on rational thought and facts. You should put yourself in the shoes of the consumers: what do they want and how do they behave? I was very happy with some things in Dart, but these kinds of things make me feel completely insecure about it. I don't believe the language is reliable anymore, because under the hood there are unexpected design choices that can have a huge impact on my program (this is not the first one, but it's the first where I found that my Python program was faster). What else am I doing wrong that I haven't found out yet? |
I think this thread is not going anywhere, so I respectfully request that we take this discussion somewhere else. @icatalud we have heard your arguments; there is no need to repeat them. No amount of polemics on this issue is going to change the outcome or its prioritisation. We recognise that

If you have an example of an application where (1) and (2) are both violated - then we would be happy to take a look - and that (unlike the discussion we are having here) would in fact change how we look at this issue. We would also be happy to take a patch addressing this issue on 32-bit platforms. If you feel like contributing, we can explain what needs to happen. |
@mraleph Thank you for the update; I didn't know what the status of this was - no one had said anything about leaving it open or addressing it. That's why I kept explaining my point, and also because I found the given answers unsatisfying. Keep in mind that this has been open for 4 years; I thought you would just leave it open. I was trying to make the point that this is important even if (1) and (2) were rare, because if they are happening in servers, for example, it's most likely that people are not noticing. Usually performance is not so important, as Python servers are good enough for most applications. I happened to measure this because in the library I write it is important to have good performance, but I believe it's more common than you think. Remember this problem is currently also happening on stable 64-bit. Obviously this is important to me now and that's why I complain. I hope someone else raises the issue too. Dart doesn't have the developer mass of popular languages, making issues less likely to be discovered or pointed out.

I won't commit beforehand, but please explain what needs to happen to contribute.
One possible way to fix this on 32-bit is to use 2 bits in the object header to encode whether an object has had its identity hash code used.

The GC needs to be modified so that, for each object it moves, it checks whether the object is in the hashCode-used state.
That's probably a better solution than storing the hashCode in all objects.

Why are both "default" and "hashCode-used" states necessary - couldn't they be the same default state? If it is in the default state, the hashCode is computed from the address; otherwise it has been relocated. Would that be enough? Could you point out the key code pieces, so that I can have a head start? |
Because you want to distinguish objects that have had their hash code used for something from those that have not - you need this information during relocation to decide whether you need to add a hash code field to the object.

VM source is in runtime/vm. |
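To make the proposed header-bit scheme more concrete, here is a rough Dart model of the states and the transition the GC would perform on relocation (hypothetical names, an illustration only; the real change would be C++ inside the VM runtime):

```dart
// Hypothetical model of the 2-bit header scheme discussed above.
enum HashState { unhashed, hashedByAddress, hashStored }

class ModelObject {
  HashState state = HashState.unhashed;
  int address;      // stands in for the object's current location
  int? storedHash;  // extra trailing word, present only once stored

  ModelObject(this.address);

  int get identityHash {
    if (state == HashState.hashStored) return storedHash!;
    // First use: mark the object so the GC knows this value must be preserved.
    state = HashState.hashedByAddress;
    return address;
  }

  // What a relocating GC would do when moving this object.
  void relocate(int newAddress) {
    if (state == HashState.hashedByAddress) {
      // The hash was derived from the old address; persist it in an extra
      // word so it survives the move, and flip to the "stored" state.
      storedHash = address;
      state = HashState.hashStored;
    }
    address = newAddress;
  }
}
```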
@asgerf are you a Dart developer? If you are not, would you have any interest in collaborating on this issue? I have made important progress, and basically what is missing is figuring out why using a default hash_ in the GetHash() method causes a failure. The hash_ field is available because I'm testing on a 64-bit architecture, but the logic is the same, so it should work on 32-bit. This is the PR, in case you wish to participate. |
@icatalud Sorry, I'm not using Dart these days and I'm afraid I wouldn't be of much use here anyway |
@asgerf wrote:
I cannot see how you would not be of much use here, when it was you who originally detected the bug and all of your statements came true; I arrived here 4 years later under exactly the circumstances you mentioned. Overall, what insight would you give about this issue - what would be your preferred choice? Consider that the Dart developers claim there is a significant increase in memory usage when adding a hash_ field on 32-bit architectures. I would say that a 20% impact applies to specific native data-structure benchmarks, but it should be far less in real applications. I'm implementing a rather complex solution, but it seems to be the only way to currently provide consistency on 32-bit platforms. It is my opinion that it should be the other way around: 32-bit Dart should be consistent by default and the work I'm doing should be an optimization. Consider this a Dart VM C++ challenge, not a Dart language one. Your help would be much appreciated by dozens (hundreds?) of people regardless of whether you use Dart, and your work will be documented (you become part of the language's historical contributors). It's also an interesting learning experience to see how the internals of the VM work. At least I have learned a lot; thanks to it I have now designed my own ideal VM (in my head, not in reality). If you wished to invest some time in this, it would be truly helpful. |
@kodandersson, @iposva-google, @lrhn, @mkustermann, @a-siva, @munificent, I ping you because I embarked with Vyacheslav Egorov (@mraleph) on solving this issue for 32-bit (the only current problem) using the third option described in this comment. The solution has been virtually ready for almost a month, but @mraleph has been away lately and updates have been addressed only sporadically during the last month. He pointed out in his last comment that he is currently away. I would really appreciate it if any of you could help finish this, or tell me straight away that you are not interested in this implementation, so that I can drop this issue for good (and be relieved of the responsibility). If the solution is considered an improvement to the VM and you are interested: I understand that replying takes time and committing to the issue even more. But even if any of you could not interact with me and simply triggered the builds from Gerrit (currently it builds only on my machine), that would be truly helpful. The way it would work is that whenever I update the code I'll ping you with "@your_name update please" at most once a day, and that's it. If you are willing to comment, that's even better. According to the comments there is only a change in the snapshots missing, and if someone can give me a script or command to test that, that would be very helpful. There is currently a build that has been waiting to be triggered in Gerrit for almost two weeks. If you are OK with triggering the builds but do not wish to get involved in any explanations, simply comment "I can trigger the builds" and I'll start pinging you only with the message described. This is the PR. Thank you. |
Overriding Object.hashCode with a simple integer counter can provide shocking performance improvements. I.e.
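(The snippet originally referenced here is not preserved in this extract; a hypothetical counter-based override of the kind described might look like the following.)

```dart
// Hypothetical stand-in for the omitted example: give each instance a
// sequential hash code instead of going through the VM's hash-table lookup.
class Node {
  static int _nextHash = 0;
  @override
  final int hashCode = _nextHash++;
}
```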
It seems the VM uses a random-number generator to create fresh hash codes, and then stores them in a weak hash table to associate them with the object. Profiling the dart2js compiler seemed to indicate that Object_getHash (which performs the lookup, not the computation) is a bottleneck. Maybe the hash code could be stored as a field? There is something ironic about storing hash codes in a hash table.
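Conceptually, the arrangement described is similar to keeping hash codes in an external, weakly keyed side table; in Dart terms it is roughly analogous to an Expando (an analogy for illustration, not the VM's actual data structure):

```dart
import 'dart:math';

// Analogy only: lazily generate a random hash for an object and associate
// it through a side table, so every access pays for a table lookup.
final Expando<int> _identityHashes = Expando<int>('identityHash');
final Random _rng = Random();

int tableBackedHash(Object o) =>
    _identityHashes[o] ??= _rng.nextInt(1 << 30);
```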
I don't have a minimal reproduction case, but you can reproduce by running revision 90039f6 of the dart2js compiler with the --cps-ir flag on the ImagingGaussianBlurOnce benchmark. Then apply https://codereview.chromium.org/1311673009 to see the difference.