- Asynchronous event-driven design: As far as possible, avoid synchronous interaction with the data or business logic tier. Instead, use an event-driven approach and workflow.
- Partitioning/Shards: You need to design your data model so that it fits the partitioning model.
- Parallel execution: Parallel execution should be used to get the most out of the available resources. A good place to use it is in processing user requests: multiple instances of each service can take requests from the messaging system and execute them in parallel. Another place for parallel processing is using MapReduce to perform aggregated requests on partitioned data.
- Replication (read-mostly): In read-mostly scenarios (LinkedIn seems to fall into this category well), database replication can help load-balance the read load by splitting the read requests among the replicated database nodes.
- Consistency without distributed transactions: This was one of the hot topics of the conference, and it also sparked some discussion during one of the panels I participated in. An argument was made that to reach scalability you have to sacrifice consistency and handle it in your applications using techniques such as optimistic locking and asynchronous error handling. That approach also assumes that you will handle idempotency in your code. My argument was that while this pattern addresses scalability, it creates complexity and is therefore error-prone. During another panel, Dan Pritchett argued that there are ways to avoid this level of complexity and still achieve the same goal, as I outlined in this blog post.
- Move the database to the background: There was violent agreement that the database bottleneck can only be solved if database interactions happen in the background.
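
To make the partitioning and parallel-execution tips a little more concrete, here is a minimal Python sketch of a MapReduce-style aggregation over partitioned data. It is only an illustration under stated assumptions: the names `NUM_PARTITIONS`, `partition_for`, `build_partitions`, `map_partition` and `aggregate` are hypothetical, the records are made-up `(user_id, amount)` pairs, and a thread pool stands in for what would normally be separate service instances or nodes.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Route a key to a partition with a stable hash (same key, same partition)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_partitions(records, num_partitions=NUM_PARTITIONS):
    """Split (user_id, amount) records so each user lives in exactly one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for user_id, amount in records:
        partitions[partition_for(user_id, num_partitions)].append((user_id, amount))
    return partitions


def map_partition(partition):
    """Map step: compute per-user totals using only this partition's data."""
    totals = Counter()
    for user_id, amount in partition:
        totals[user_id] += amount
    return totals


def aggregate(partitions):
    """Run the map step on every partition in parallel, then merge (reduce)."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(map_partition, partitions))
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds the partial totals
    return result


if __name__ == "__main__":
    sample = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 1)]
    print(aggregate(build_partitions(sample)))
```

Because the data model is keyed by user, each map step touches only its own partition and the reduce step is a simple merge. In a real deployment the partitions would sit on different nodes rather than in one process, so the threads here only show the shape of the computation.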
Thursday, 15 November 2007
Scalability tips