Avatars need thumbnail versions #12350

Open
mrsdizzie opened this issue Jul 28, 2020 · 17 comments
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/enhancement An improvement of existing functionality

Comments

@mrsdizzie
Member

When showing user avatars on PRs/issues/etc. we limit them to 40x40 but still serve a full 240kb PNG image. GitHub has a 3kb JPG version for this.

This is noticeable on slower connections like gitea.com, and on pages that load many avatars, like an issue with lots of comments. We should have a thumbnail version of user avatars to avoid this.

@silverwind
Member

silverwind commented Jul 28, 2020

Yes, this would be a nice performance win. GitHub renders 40px avatars from an 88px source image. I think we should aim for a multiple of our biggest rendered avatar size, possibly 2x or 3x.

The question is whether we want to keep the thumbnails in memory (and recompute them on every startup) or store them on the filesystem for faster startup after the initial generation.

@silverwind
Member

silverwind commented Jul 28, 2020

Actually, a dynamic size would be even better, e.g. /avatars/id/80 to dynamically render at 80px. GitHub also does this and it seems the most flexible. It will probably need some form of cache to be fast.

@lunny lunny added the type/enhancement An improvement of existing functionality label Jul 28, 2020
@mrsdizzie
Member Author

I think they should be kept on disk because the goal is to be able to serve them from a different server at some point (#11387).

I'm also not sure about our current dynamic avatar lookup system of https://gitea.com/user/avatar/6543/-1, because in a test of a very slow site (gitea.com):

https://www.webpagetest.org/result/200729_K1_d89b116c0c49e418eea1af147116c08b/

That HTTP request can take anywhere from 200ms to almost 2 seconds in some tests just to return the real location of the image, which is then downloaded separately. It seems random whether that request is faster or slower, but it is consistently slow, and even at its fastest it takes more time than it would to just download a smaller version of the image directly.

I know the initial point was to not slow down overall page rendering while waiting for avatars, but use of gitea.com shows avatar rendering is kind of painful as-is with the current method (for me it looks like using a dialup connection). I think having direct links to much smaller images (3kb) could be better there.

@silverwind
Member

silverwind commented Jul 29, 2020

gitea.com is hosted in Asia, so that distance will affect latency significantly for non-Asian users. TLS 1.3 would improve latency by eliminating one RTT per TLS session.

I see that avatars are served over HTTP/1.1, which means browsers have to open a separate connection for each avatar and are generally limited to 6 parallel connections per domain name. This can be improved by upgrading to HTTP/2, where it should not matter much whether the avatars are on a separate domain or not.

@mrsdizzie
Member Author

These are already HTTP/2 -- anything from gitea.com is HTTP/2:

https://www.webpagetest.org/result/200729_K1_d89b116c0c49e418eea1af147116c08b/2/details/#step1_request3

https://www.webpagetest.org/result/200729_K1_d89b116c0c49e418eea1af147116c08b/2/details/#step1_request22

[Screenshot (2020-07-29): webpagetest request details showing the gitea.com requests over HTTP/2]

There's one thing from goreportcard.com that is HTTP/1.1.

It doesn't matter too much for this example since there is only one user avatar anyway; there are never more than a few open connections to gitea.com at once because most things are hosted elsewhere, and the connections to gitea.com often need to wait on each other anyway (the avatar can't be downloaded until the request for its location has finished).

I think the point is that many gitea.com users are not connecting from Asia and experience this latency, so it is helpful to remove or re-evaluate it wherever possible. TLS 1.3 wouldn't fix the fact that a request for the location of an avatar can sometimes take almost 2 seconds. There are also other situations where connections will be slow.

In general I think it's just worth reflecting on the current design of how avatars are loaded, given the experience of using them on a slower site like gitea.com. The situation before wasn't good either, where page rendering could be held up for a while, but what we have currently doesn't seem perfect.

You can see in the webpagetest example above that on each test run the repo avatar, which is loaded directly from a known URL, downloads and is displayed in a fraction of the time it takes just to look up the location of the user avatar URL.

So my original thought was that if the thumbnail images were small enough (3kb), they wouldn't need the extra lookup workaround of the /-1 URL that was implemented to avoid the original performance issues (there are probably other reasons for it too that I'm not familiar with).

@silverwind
Member

silverwind commented Jul 29, 2020

Not sure why, but for me everything from gitea.com is HTTP/1.1 in multiple browsers:

[Screenshot: devtools waterfall showing gitea.com requests over HTTP/1.1]

You can also clearly observe the 6-connection limit in that screenshot which slows down the avatar downloads.

+1 to removing that -1; it's also ugly to look at in the devtools, and you can see in my waterfall that those extra requests block the actual avatar downloads because of the connection limit.

I think we want dynamic-size avatars based on URL parameter and a memory- and/or disk-based cache of those avatars, invalidated only when users upload new ones, e.g. avatar-$userid-$size.png.

@silverwind
Member

Also, when looking at those avatar requests, I see an ETag but no accompanying Cache-Control, which I'm pretty sure means browsers can never cache the avatars.

https://developers.google.com/speed/docs/insights/LeverageBrowserCaching

When the server returns a response it must provide the Cache-Control and ETag headers:

  • Cache-Control defines how, and for how long the individual response can be cached by the browser and other intermediate caches. To learn more, see caching with Cache-Control.
  • ETag provides a revalidation token that is automatically sent by the browser to check if the resource has changed since the last time it was requested. To learn more, see validating cached responses with ETags.

ETag alone will not let the browser cache it; it needs a Cache-Control policy as well.

@mrsdizzie
Member Author

Weird! I don't know anything about the setup/configuration of gitea.com or why it might be different for some people.

But I think a disk-based cache of them would be good so they could be stored on a separate host with other 'static' assets (which is what @lunny wants for gitea.com).

Ideally many of them could get cached locally anyway, but that savings is largely lost currently because it still takes a lot of time for the /-1 request just to respond that it is an image you might already have in local memory.

You can see in the tests above that it doesn't need to request the actual avatar image on a second view, but it still spends time on the /-1 request.

@silverwind
Member

silverwind commented Jul 29, 2020

stored on a separate host

Such multi-domain workarounds are a relic of the past since HTTP/2, which can multiplex a theoretically unlimited number of streams through a single TCP session. You should not expect much, if any, gain from moving assets to another host.

Ideally many of them could get cached locally

They cannot be cached currently because we don't send Cache-Control. Every avatar is downloaded fully on every page load. I imagine just adding that header will pretty much resolve the issue on its own, but the 302s will of course introduce unnecessary roundtrips.

@mrsdizzie
Member Author

Maybe we're talking about different things -- I'm talking about hosting them on a CDN for the purpose of faster downloads from a geographically closer source, not about using a second hostname to get around the limit on simultaneous connections. gitea.com (for me) is still slow now with HTTP/2. The fastest parts are downloaded from a CDN; it was pretty much unusable before the CDN. The avatars on this GitHub page are hosted on a separate CDN as well.

The images now have an Expires header about 6 hours in the future, so some browsers like Chrome (and the test above) store them in a disk cache and don't always re-download for each page view, but they need proper caching headers like the other files (many of which are set by the CDN they are hosted on).

Not sure why the repo avatar gets Cache-Control headers but the user one doesn't, though :(

@silverwind
Member

silverwind commented Jul 29, 2020

Not sure how CDN integration works, but I could see timely updates being an issue. When a user changes their avatar, they will expect it to show immediately, so you'd need a mechanism to invalidate the CDN content in pretty much real time, which would probably add a lot of complexity.

I think we're generally overthinking this issue. Just add some cache headers and maybe implement thumbnails served directly or optionally via CDN, but I doubt it will help much.

@lunny
Member

lunny commented Jul 31, 2020

Now almost all assets have been served by the CDN except avatars, because Gitea didn't support that.
@silverwind I think it's possible to move avatars to the CDN by sending a PR to Gitea.
A CDN can cache the images via URL path or file extension; the /-1 is not friendly for this.

About HTTP/1.1 and HTTP/2: we have some haproxy hosts as reverse proxies which may not support HTTP/2. We will enable it ASAP.
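For reference, enabling HTTP/2 on a TLS-terminating haproxy frontend is usually a matter of advertising `h2` via ALPN on the `bind` line (haproxy 1.8+; the certificate path and backend name below are illustrative, not gitea.com's actual config):

```
frontend https-in
    mode http
    bind :443 ssl crt /etc/haproxy/certs/gitea.pem alpn h2,http/1.1
    default_backend gitea
```

Browsers negotiate HTTP/2 during the TLS handshake via ALPN, so listing `h2` first makes capable clients upgrade while `http/1.1` remains as a fallback.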

@silverwind
Member

silverwind commented Jul 31, 2020

change avatars to CDN

It's possible to move them to STATIC_URL_PREFIX, but don't expect much, if any, gain from that. I think the critical part in avatar caching is the browser-based cache, which is currently not working because of the missing Cache-Control header. Second is the redirect removal and the -1 URLs.

@lunny
Member

lunny commented Aug 1, 2020

@silverwind I have changed the haproxy configuration, could you help confirm that all requests are HTTP/2 now?

@stale

stale bot commented Oct 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.

@stale stale bot added the issue/stale label Oct 4, 2020
@lunny lunny added issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented and removed issue/stale labels Oct 4, 2020
@silverwind
Member

#13569 should take care of the caching.

I'm not sure of the purpose of the 302 redirect that happens on every avatar fetch. Can someone explain maybe?

@silverwind
Member

silverwind commented Nov 29, 2020

And #13649 makes avatars actually cacheable by removing the redirects. This should pretty much solve this performance issue for subsequent page loads, but actual image resizing would still be nice to have.
