Scaling Memcache at Facebook
Enhancements to Memcached:
Lease Mechanism: To mitigate the "thundering herd" problem, a lease mechanism is used. When a cache miss occurs, a lease token is assigned to one server, preventing multiple servers from simultaneously querying the backend database.
Adaptive Slab Allocator: This helps in better memory management by reducing fragmentation and optimizing memory allocation for different sizes of dataβ.
Handling Consistency and Stale Data:
McSqueal and Mcrouter: These tools are used to ensure data consistency across different regions by broadcasting invalidations whenever the data is updated in the databaseβ.
Replication and Invalidation: SQL statements modifying data include the required cache keys, ensuring that updates are propagated to all relevant cachesβ.
Scaling Strategies:
Regional Clusters: Splitting servers into regional clusters reduces latency and improves reliability by keeping data closer to users and mitigating the effects of large-scale outagesβ.
Different Pools for Different Data Types: Assigning different pools to different types of data based on access patterns and memory requirements prevents negative interactions and optimizes performanceβ.
Load Balancing and Failure Handling:
Gutter Servers: These are used as fallback when primary memcached servers fail, preventing cascading failures that can overload the systemββ.
Distributed Read Requests: Read requests are distributed across multiple replicas to balance the load and improve performanceβ.
Performance Optimization:
Eviction Policies: Utilizing Least Recently Used (LRU) eviction policies within each slab class ensures efficient memory use by discarding the least used items firstβ.
High Availability: The design incorporates mechanisms to handle server failures gracefully, ensuring high availability and reliability of the caching systemββ.
Ref: https://www.usenix.org/conference/nsdi13/technical-sessions/presentation/nishtala
Last updated