🌐CAP Theorem
Last updated
Last updated
In very simple words,
Consistency: Every read receives the most recent write or an error.
Availability: All reads contain data, but it might not be the most recent.
Partition tolerance: The system continues to operate despite network failures.
In practice, distributed systems must choose which two out of the three properties they will guarantee:
CP (Consistency and Partition Tolerance): The system remains consistent and tolerates network partitions, but might not be available during a partition.
Example: HBase, MongoDB in certain configurations, Banking Systems
AP (Availability and Partition Tolerance): The system remains available and tolerates network partitions, but might return outdated data during a partition.
Example: Cassandra, Couchbase, Social media platforms
CA (Consistency and Availability): The system remains consistent and available as long as there are no network partitions.
Example: Relational databases in a single-node setup or systems within a single data center without network partitions.
MongoDB can be configured to satisfy all the above three different implications (CP, AP, CA)
Consistency and Partition Tolerance (CP) Configuration:
Replica Sets: MongoDB uses replica sets to ensure data is replicated across multiple nodes. By default, MongoDB is designed to provide strong consistency.
Write Concerns: You can configure write concerns to ensure that writes are acknowledged by a majority of the nodes, ensuring strong consistency even in the presence of network partitions.
Read Preferences: Setting read preferences to "primary" ensures that all reads are directed to the primary node, which holds the most recent data.
Trade-off: This configuration might compromise availability because if the primary node is unavailable or there's a network partition, writes cannot be acknowledged until a new primary is elected.
Availability and Partition Tolerance (AP) Configuration:
Replica Sets: MongoDB's replica sets can also be configured to prioritize availability.
Write Concerns: Using lower write concerns (e.g., w: 1
) ensures that writes are acknowledged as soon as they are written to the primary node, which increases availability but might compromise consistency during network partitions.
Read Preferences: Setting read preferences to "secondary" or "nearest" allows reads from secondary nodes, which might return slightly stale data but ensures high availability.
Trade-off: This configuration might compromise consistency because reads might not always reflect the most recent writes, especially during network partitions.
Consistency and Availability (CA) Configuration (Less Typical for Distributed Systems):
This configuration is less typical for MongoDB in a distributed setup because achieving CA requires no network partitions, which is unrealistic in most distributed systems.
In a single-node MongoDB setup or within a single data center with highly reliable networking, MongoDB can effectively provide CA guarantees.
Trade-off: This setup sacrifices partition tolerance, meaning it won't handle network partitions gracefully.
References: