# Caching layer for Magento services

## Overview

A caching layer is one of the most important parts of a modern web application architecture.
The Magento caching layer should effectively improve:
 * Performance
 * Availability
 * Scalability

Caching should be implemented on several levels:
 * Caching of HTTP responses returned by the [BFF](https://github.com/magento/architecture/blob/master/design-documents/service-isolation.md#backends-for-frontends) (exposed web server)
 * Application data caching (results of DB queries, merged configurations, etc.)
### Caching results of GET requests to the BFF (exposed web server)

[BFF](https://github.com/magento/architecture/blob/master/design-documents/service-isolation.md#backends-for-frontends) HTTP responses contain static assets plus **public** and **private** dynamic content.
Static assets should be cached using a CDN, and a reverse proxy should cache public content.
This can be Varnish (acting as a reverse proxy) or Fastly (combining reverse-proxy and CDN functions);
both are already integrated with Magento. The reverse proxy uses a lazy caching approach: content is fetched from the origin and cached only when first requested.
Besides performance, this type of cache has the following benefits:
 * Protection against outages: it can optionally serve stale content when there is a problem with the origin server
 * Scalability: the number of caching nodes can be increased
 * Flexibility: Varnish Configuration Language (VCL) allows building customized solutions, rules, and modules

 

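The lazy caching and serve-stale-on-error behaviors described above can be modeled with a short sketch. This is a toy illustration, not Varnish itself; the class name, TTL, and grace values are assumptions, loosely modeled on Varnish's grace mode.

```python
import time

class ReverseProxyCache:
    """Toy model of a lazy (pull-through) reverse-proxy cache that can
    serve stale content when the origin fails, similar to Varnish grace mode."""

    def __init__(self, ttl, grace):
        self.ttl = ttl      # seconds a cached response is considered fresh
        self.grace = grace  # extra seconds stale content may be served on origin failure
        self.store = {}     # url -> (body, stored_at)

    def fetch(self, url, origin, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(url)
        if entry:
            body, stored_at = entry
            age = now - stored_at
            if age <= self.ttl:
                return body, "HIT"
            # Stale: try the origin, fall back to the stale copy within grace.
            try:
                fresh = origin(url)
            except Exception:
                if age <= self.ttl + self.grace:
                    return body, "STALE"
                raise
            self.store[url] = (fresh, now)
            return fresh, "MISS"
        # Lazy fill: the origin is contacted only on the first request.
        fresh = origin(url)
        self.store[url] = (fresh, now)
        return fresh, "MISS"
```

The `STALE` branch is what provides the "protection against outages" benefit listed above.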
Private or visitor-specific content should be stored on the client side (browser). Every time an HTTP POST request is made,
the cache in the web browser is flushed and an AJAX call is made to fetch an updated copy of the private content.
The client-side private content cache should also expire according to a TTL.

 

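The client-side behavior above (TTL expiry plus a full flush on any POST) can be sketched as follows. The class and section names are hypothetical; in the browser this role is played by local storage and JavaScript rather than Python.

```python
import time

class PrivateContentCache:
    """Toy model of browser-side private content storage: entries expire
    by TTL, and the whole cache is flushed after any state-changing
    (POST) request, forcing an AJAX re-fetch."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # section name -> (data, stored_at)

    def get(self, section, now=None):
        now = time.time() if now is None else now
        entry = self.entries.get(section)
        if entry and now - entry[1] <= self.ttl:
            return entry[0]
        return None  # expired or missing: caller must re-fetch via AJAX

    def put(self, section, data, now=None):
        self.entries[section] = (data, time.time() if now is None else now)

    def on_post(self):
        # Any POST may change server state, so drop all private sections.
        self.entries.clear()
```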
 In general, the same caching approach works in the current monolithic Magento architecture.
 For a service-oriented architecture, the following issues should be solved:
 * The HTTP response header should include a set of tags for effective cache purging on the reverse proxy.
 The set should be built up while the request travels through the chain of services. An appropriate design should be prepared.
 * Caching on the reverse proxy should be aligned with the request authorization approach. The presence of an authorization token should be taken into account.

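The tag-collection and purge-by-tag idea above can be sketched as follows. The class and function names are hypothetical; the mechanism is loosely modeled on Varnish's xkey module and Fastly surrogate keys.

```python
class TaggedCache:
    """Toy reverse-proxy store indexed by surrogate-key tags, so that all
    responses touching a given entity can be purged in one operation."""

    def __init__(self):
        self.responses = {}  # url -> body
        self.tag_index = {}  # tag -> set of urls cached under that tag

    def store(self, url, body, tags):
        self.responses[url] = body
        for tag in tags:
            self.tag_index.setdefault(tag, set()).add(url)

    def purge_tag(self, tag):
        # Purging an unknown tag is a no-op.
        for url in self.tag_index.pop(tag, set()):
            self.responses.pop(url, None)

def merge_tags(*service_tag_sets):
    """As a request flows through the chain of services, each service adds
    the tags of entities it touched; the BFF emits the union in one header."""
    merged = set()
    for tags in service_tag_sets:
        merged.update(tags)
    return merged
```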
### Caching results of GET requests in service-to-service communication (private web server)

Services can produce many GET requests to each other on the private network.
First of all, a service application should be as fast as possible, so that a reverse proxy is not needed as a cache layer.
But if an application cannot hold the load, a reverse proxy can be used to reduce the overall load on the service origin.
Varnish is offered as a proven solution for such cases. This caching should be aligned with the request authorization approach.

 

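One way to align proxy caching with authorization, sketched below, is to include the token in the cache key so that authorized responses are never shared across callers. This is a hypothetical key function, not a prescribed design.

```python
def cache_key(method, url, headers):
    """Toy cache-key function for a reverse proxy in front of a private
    service: anonymous GETs share one key; authorized GETs are keyed per
    token; everything else is uncacheable."""
    if method != "GET":
        return None  # only GETs are cacheable
    token = headers.get("Authorization")
    if token is None:
        return ("public", url)
    return ("private", url, token)
```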
### Application data caching

A service application needs to cache data such as DB query results and merged configurations in order to respond to requests efficiently and rapidly.

The three possible types of caches are the following:
 * Local caches
 * Remote caches dedicated to each service
 * Remote caches shared across the services

#### Local caches
 - Pros: fast, no network traffic associated with retrieving data
 - Cons:
   - shares resources with the application
   - can lead to data inconsistency when multiple instances of the application server are created
   - the information stored within an individual cache node cannot be shared with other application servers
   - does not follow the 12-factor principles (any data that needs to persist must be stored in a stateful backing service)

 

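The inconsistency con is easy to demonstrate with an in-process cache. The sketch below uses Python's `functools.lru_cache` as a stand-in for any local cache; `DB` is a hypothetical stand-in for a database table.

```python
from functools import lru_cache

DB = {"currency": "USD"}  # stand-in for a database table

@lru_cache(maxsize=128)
def get_setting(name):
    # First call reads the "database"; later calls are served from
    # this process's memory, invisible to other application instances.
    return DB[name]
```

If another instance updates the underlying data, this instance keeps serving its stale local copy until its own cache is cleared, which is exactly why local caches break down with multiple application servers.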
#### Remote caches dedicated to each service
 - Pros:
   - scalable, adheres to the 12-factor principles
   - externalizes application state
   - the number of service instances can be changed dynamically
   - contains only the dedicated service's data
   - multiple instances of the same service can use the same cache instance
 - Cons:
   - a slight delay due to network calls
   - difficulties with centralized cache invalidation

 

#### Remote caches shared across the services
 - Pros:
   - centralized cache invalidation is easy
   - scalable
   - adheres to the 12-factor principles
 - Cons:
   - cache nodes contain mixed data from different services

 

Local caches are isolated on individual nodes and cannot be leveraged within a cluster of application servers.
Remote distributed caches, on the other hand, provide low latency and higher levels of availability when employed with read replicas, and provide a shared environment for all application servers to utilize the cached data.
A remote cache dedicated to each service stores only the data of that particular service,
but there is no built-in way (it needs implementation) to flush all cache instances in case of an upgrade or other sweeping change.
Remote caches shared across the services are easy to flush centrally, but mix data from different services.


#### Cache invalidation
 - A service should store in its cache only its own data (DB, configuration, etc.) and be fully responsible for cache invalidation
 - A service must not cache data retrieved from other Magento services
 - In the "Remote caches dedicated to each service" model, each service should provide an entry point for full cache invalidation
 - In the "Remote caches shared across the services" model, full cache invalidation can be performed by any service

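The rules above can be sketched as a per-service cache client. The class name and key scheme are assumptions; prefixing keys with the service name plus a generation number makes "full invalidation" a single counter bump that leaves other services' data untouched, even on a shared backend.

```python
class ServiceCache:
    """Toy per-service cache client: every key is namespaced with the
    service name and a generation number, so full invalidation is just
    bumping the generation (older keys become unreachable)."""

    def __init__(self, service, backend=None):
        self.service = service
        self.generation = 0
        self.backend = backend if backend is not None else {}

    def _key(self, key):
        return f"{self.service}:g{self.generation}:{key}"

    def set(self, key, value):
        self.backend[self._key(key)] = value

    def get(self, key):
        return self.backend.get(self._key(key))

    def flush_all(self):
        # Entry point for full cache invalidation of this service only.
        self.generation += 1
```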
#### Technologies
Today most key/value stores, such as Memcached and Redis, can store terabytes of data. Redis additionally provides high availability and persistence features.
Both Redis and Memcached offer high performance. Memcached is designed for simplicity, while Redis offers a rich set of features that makes it effective for a wide range of use cases.
Unlike Memcached, Redis offers snapshots, replication, transactions, and advanced data structures. So I propose to use Redis for application data caching.

### Resolutions
 - The "Remote caches dedicated to each service" model is the most suitable for the application data cache.
 - Do not use a reverse proxy for caching requests in service-to-service communication until it is needed.
 - Cache invalidation will be event-driven; it should be described in a separate proposal.
 - Application data cache versioning should be described in a separate document.
 - Create an HLD for caching REST API requests (GET).
 - The HTTP cache will not handle caching of GraphQL requests/responses, since they are all made as POST requests. GraphQL caching should be described in a separate document.