BigTable: Cell.from_pb() performance improvement #4745
A couple of questions:

@tseaver The code that a client uses to create a cell calls the following method in row.py: SetCell is a Mutation type, and all changes to BigTable go through Mutations. At the Mutation level the timestamp is expressed in microseconds. So the Cell class is really a read-only class; there is no ORM magic going on for persistence. ;-) I hope this answers your first question above. As for memoization, I considered it, but I suspect the timestamp will be read as a datetime only once.

Can this be merged now?

LGTM

@zakons Thanks for the patch!

@tseaver Looks like something is wrong in the system tests. I'm looking into it, but feel free to point out the mistake(s) if you spot them first.

@tseaver Looks like I found the cause, and it's not a big deal. I'll check again to make sure, and then I'll submit a PR for it.
Change Cell to store the microseconds from the Cell protobuf and to expose the timestamp as a datetime through a property, computed only when requested. This moves the conversion cost to the code that actually accesses the timestamp, which may be only a small amount of code. Reading rows with 10 cells is more than 5% faster. See Issue #4714.