Commit 23d24f1 (2 parents: c4cebc5 + ac4442c)
Author: Igor Melnikov
Merge pull request #52 from magento-mpi/caching-layer: Caching layer for Magento services

# Caching layer for Magento services

## Overview

Caching is one of the most important parts of a modern web application architecture.
The Magento caching layer should effectively improve:
* Performance
* Availability
* Scalability

Caching should be implemented at several levels:
* Caching of HTTP responses returned by the [BFF](https://github.com/magento/architecture/blob/master/design-documents/service-isolation.md#backends-for-frontends) (exposed web server)
* Application data caching (results of DB queries, merged configurations, etc.)

### Caching results of GET-requests to BFF (exposed web-server)

[BFF](https://github.com/magento/architecture/blob/master/design-documents/service-isolation.md#backends-for-frontends) HTTP responses contain static assets, **public** and **private** dynamic content.
Static assets should be cached using a CDN. A reverse proxy should cache public content.
This could be Varnish (acting as a reverse proxy) or Fastly (combining reverse proxy and CDN functions).
Varnish and Fastly are already integrated with Magento. The reverse proxy uses a lazy caching approach (content is cached on first request).
Besides performance, this type of cache has the following benefits:
* Protection against outages: it can optionally serve stale content when there is a problem with the origin server
* Scalability: the number of caching nodes can be increased
* Flexibility: the Varnish Configuration Language (VCL) allows building customized solutions, rules, and modules

![Public content caching](img/public-cache.jpg)

Private or visitor-specific content should be stored on the client side (browser). Every time an HTTP POST request is made,
the cache in the web browser is flushed and an AJAX call is made to fetch an updated copy of the private content.
The private content cache on the client side should also expire according to a TTL.

![Private content caching](img/private-cache.jpg)
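
The client-side flow above (flush on every POST, re-fetch via AJAX, expire by TTL) can be modeled with a small in-memory store. This is an illustrative sketch only; the class and method names are assumptions, not part of Magento's actual client-side implementation:

```python
import time


class PrivateContentCache:
    """Illustrative model of the browser-side private-content cache:
    entries expire after a TTL, and any HTTP POST flushes the whole store."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable clock, handy for testing
        self._store = {}     # key -> (value, stored_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at >= self.ttl:
            del self._store[key]  # expired per TTL
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock())

    def on_post_request(self):
        # Every HTTP POST invalidates all private content; the client
        # then re-fetches fresh copies via AJAX.
        self._store.clear()
```

After a POST, `get` returns `None` for every key, which is what triggers the AJAX re-fetch of an updated copy.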

In general, the same caching approach works in the current monolithic Magento architecture.
For a service-oriented architecture, the following issues should be solved:
* The HTTP response header should include a set of tags for effective cache purging on the reverse proxy.
The set should be built while the request travels through the chain of services. An appropriate design should be prepared.
* Caching on the reverse proxy should be aligned with the request authorization approach. The presence of an authorization token should be taken into account.
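
Building the tag set along the service chain amounts to taking the union of the tags each downstream service reports. A minimal sketch; the header name mirrors the `X-Magento-Tags` header Magento already emits for Varnish purging, while the response dicts are simplified stand-ins for real HTTP responses:

```python
def merge_cache_tags(*responses):
    """Union the cache tags reported by each downstream service so the
    BFF response carries every tag needed for purging on the reverse proxy.
    Each response is a simplified dict with a "headers" mapping."""
    tags = set()
    for response in responses:
        header = response.get("headers", {}).get("X-Magento-Tags", "")
        tags.update(t for t in header.split(",") if t)
    # Sorted for a deterministic header value.
    return ",".join(sorted(tags))


# Hypothetical downstream responses from two services in the chain:
catalog = {"headers": {"X-Magento-Tags": "cat_p_1,cat_c_5"}}
pricing = {"headers": {"X-Magento-Tags": "cat_p_1,price_1"}}
merged = merge_cache_tags(catalog, pricing)  # "cat_c_5,cat_p_1,price_1"
```

Purging any one of the merged tags on the reverse proxy then invalidates the combined BFF response.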

### Caching results of GET-requests in service-to-service communication (private web-server)

Services can produce many GET requests to each other in the private network.
First of all, the service application itself should be as fast as possible, to avoid needing a reverse proxy as a cache layer.
But if the application is not able to handle the load, a reverse proxy can be used to reduce the overall load on the service origin.
Varnish is offered as a proven solution for such cases. This should be aligned with the request authorization approach.

![Reverse proxy in service-to-service communication](img/reverse-proxy-service1.png)
49+
### Application data caching
50+
51+
Service application needs to cache such data as DB query results, merged configurations for efficiently and rapidly respond to requests.
52+
53+
The three possible types of caches are the following:
54+
* Local caches
55+
* Remote caches dedicated to each service
56+
* Remote caches shared across the services
57+
58+
#### Local caches
59+
- Pros: fast, no network traffic associated with retrieving data
60+
- Cons:
61+
- shares sources with application
62+
- can lead to data inconsistency when multiple instances of application servers are created
63+
- the information stored within an individual cache node cannot be shared with other application servers
64+
- not follow the 12-factor principles (any data that needs to persist must be stored in a stateful backing service)
65+
66+
![Local cache](img/local-service-cache.png)
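
The inconsistency con can be made concrete with a short sketch: two application instances each hold their own in-process cache, so an update seen by one stays invisible to the other. All names here are illustrative:

```python
class LocalCache:
    """Per-instance in-process cache (illustrative)."""

    def __init__(self):
        self._store = {}

    def get_or_load(self, key, loader):
        if key not in self._store:
            self._store[key] = loader(key)
        return self._store[key]

    def invalidate(self, key):
        self._store.pop(key, None)


# Shared backing store (stands in for the database).
db = {"config:currency": "USD"}
instance_a, instance_b = LocalCache(), LocalCache()

# Both instances warm their own cache from the DB.
instance_a.get_or_load("config:currency", db.get)
instance_b.get_or_load("config:currency", db.get)

# Instance A updates the DB and invalidates only its own cache...
db["config:currency"] = "EUR"
instance_a.invalidate("config:currency")

# ...so instance B keeps serving the stale value.
fresh = instance_a.get_or_load("config:currency", db.get)  # "EUR"
stale = instance_b.get_or_load("config:currency", db.get)  # "USD"
```

This is exactly the state that the 12-factor principles warn against keeping inside an application process.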

#### Remote caches dedicated to each service
- Pros:
  - scalable, adheres to the 12-factor principles
  - externalizes application state
  - allows dynamically changing the number of service instances
  - contains only the dedicated service's data
  - multiple instances of the same service can use the same cache instance
- Cons:
  - a slight delay due to network calls
  - difficulties with centralized cache invalidation

![Remote cache dedicated to each service](img/remote-dedicated-cache1.png)
80+
81+
#### Remote caches shared across the services
82+
- Pros:
83+
- centralized cache invalidation is easy
84+
- scalable
85+
- adheres to 12-factor principles
86+
- Cons:
87+
- cache nodes contain mixed data from different services
88+
89+
![Remote cache shared across the services](img/remote-cache-cluster1.png)

Local caches are isolated on individual nodes and cannot be leveraged within a cluster of application servers.
Remote distributed caches, on the other hand, provide low latency and higher levels of availability when employed with read replicas, and give all application servers a shared environment in which to utilize the cached data.
A remote cache dedicated to each service stores only the data of that particular service, but there is no out-of-the-box way (it would need implementation) to flush all cache instances in case of an upgrade or other major change.
Remote caches shared across the services are easy to flush centrally, but they mix data from different services.

#### Cache invalidation
- A service should cache only its own data (DB, configuration, etc.) and be fully responsible for its cache invalidation
- A service must not cache data retrieved from another Magento service
- In the "Remote caches dedicated to each service" model, each service should provide an entry point for full cache invalidation
- In the "Remote caches shared across the services" model, full cache invalidation can be performed by any service
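
To make the shared-cache rules concrete, a minimal sketch. The per-service key prefix is an assumed convention, not an existing Magento API: it lets any service trigger a full flush while keeping routine invalidation scoped to the owner's own data:

```python
class SharedCache:
    """Minimal sketch of a shared cache key space. Keys are namespaced
    as "<service>:<key>" (an assumed convention)."""

    def __init__(self):
        self._store = {}

    def set(self, service, key, value):
        self._store[f"{service}:{key}"] = value

    def get(self, service, key):
        return self._store.get(f"{service}:{key}")

    def flush_service(self, service):
        # A service invalidating its own data: only its prefix is touched.
        prefix = f"{service}:"
        for k in [k for k in self._store if k.startswith(prefix)]:
            del self._store[k]

    def flush_all(self):
        # Full cache invalidation: callable by any service in this model.
        self._store.clear()
```

`flush_service` keeps each service responsible for its own data, while `flush_all` is the "any service can perform full invalidation" property of the shared model.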

#### Technologies
Today most key/value stores, such as Memcached and Redis, can store terabytes' worth of data. Redis also provides high availability and persistence features.
Both Redis and Memcached offer high performance. Memcached is designed for simplicity, while Redis offers a rich set of features that make it effective for a wide range of use cases.
Unlike Memcached, Redis offers snapshots, replication, transactions, and advanced data structures. So I propose to use Redis for application data caching.
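
Whichever store is chosen, application data caching typically follows the cache-aside pattern. A minimal sketch, assuming a redis-py-style client (only `get` and `setex` are used, so any compatible client works); `loader` and the key format are hypothetical stand-ins for the real DB query and key scheme:

```python
import json


def get_product(client, product_id, loader, ttl=300):
    """Cache-aside read: try the cache first, fall back to the loader
    (e.g. a DB query), then populate the cache with a TTL.
    `client` is any object with redis-py-style get/setex methods."""
    key = f"catalog:product:{product_id}"
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)
    product = loader(product_id)             # authoritative source (DB)
    client.setex(key, ttl, json.dumps(product))
    return product
```

With the real redis-py client this would be called as `get_product(redis.Redis(), 42, load_from_db)`; the TTL bounds staleness even if explicit invalidation is missed.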

### Resolutions
- The "Remote cache dedicated to each service" model is the most suitable for the application data cache.
- Don't use a reverse proxy for caching requests in service-to-service communication until it's needed.
- Cache invalidation will be event driven; it should be described in a separate proposal.
- Application data cache versioning should be described in a separate document.
- Create an HLD for caching API REST requests (GET).
- The HTTP cache will not handle caching of GraphQL requests/responses, since they are all done as POST. GraphQL caching should be described in a separate document.