Distributed *
After the success of the revolutions in Distributed Data and Distributed Computing, it is time to think about Distributed Architecture. I believe that you can find many similarities between the way data processing has evolved over the years and the way system architecture is evolving and can continue to evolve.
60 years of Database Evolution
I would argue that the same concept is true for the architecture of most large systems; the days of "We are a Java shop, therefore every solution is based on OSGi and J2EE" are gone as well.
You may argue that OSGi is built to provide exactly the kind of modularity that we need. But I think it is better to look at problems (once we have established that it is time to re-architect them for scale) not as nails just because we have a hammer, or OSGi bundles in this case. Each scalability issue should be solved with the best tool for the problem.
The main incentive for the evolution of Data Processing, Large Scale Computing and new Architecture designs is the changing environment, mainly the explosion of Data, Users, Services and Markets. The existing technologies were unable to scale fast enough. It is impossible to build a computer big enough to solve planet-scale problems fast enough. It is impossible to build a single relational database big enough to store and retrieve Big Data fast enough. Along the same lines, it is impossible to design a single-technology architecture that scales flexibly enough for your evolving business.
Distributed Scale
In the same way that NoSQL databases require different thinking from the DBA and MapReduce requires different thinking from developers, software architects should think differently about scalability issues.
Distributed Architecture for Scale
The goal of the distributed architecture is to allow linear scalability of the system along any of the axes of growth.
The diagram above was adapted from Christian Timmerer's post to illustrate the required modularity.
- Users - If you have double the users, you can simply double the boxes.
- Services - If you want to add services, you can simply add their boxes independently of the other services.
- Markets - When you enter a new market, geographic or along another dimension, you would like to do it independently of the existing ones.
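The three axes can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Deployment` class, the market and service names, and the hash-based routing are all hypothetical and not taken from any particular system. Each (market, service) pair gets its own pool of boxes, and users are spread over a pool by a stable hash, so doubling a pool roughly halves the users per box without touching any other pool.

```python
import hashlib
from collections import Counter


def stable_hash(key: str) -> int:
    """Hash that is identical across processes (unlike the built-in hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Deployment:
    """Hypothetical registry: one independent pool of boxes per (market, service)."""

    def __init__(self):
        self._pools = {}  # (market, service) -> list of box names

    def add_boxes(self, market: str, service: str, count: int) -> None:
        # Growing one pool never touches the pools of other services or markets.
        pool = self._pools.setdefault((market, service), [])
        start = len(pool)
        for i in range(start, start + count):
            pool.append(f"{market}-{service}-{i}")

    def route(self, market: str, service: str, user_id: str) -> str:
        # Users axis: spread users evenly over the boxes in the pool.
        pool = self._pools[(market, service)]
        return pool[stable_hash(user_id) % len(pool)]

    def load(self, market: str, service: str, user_ids) -> Counter:
        # Number of users that land on each box.
        return Counter(self.route(market, service, u) for u in user_ids)


deployment = Deployment()
deployment.add_boxes("eu", "search", 4)
users = [f"user-{i}" for i in range(1000)]
load_before = deployment.load("eu", "search", users)

deployment.add_boxes("eu", "search", 4)    # users doubled: double the boxes
load_after = deployment.load("eu", "search", users)

deployment.add_boxes("eu", "billing", 2)   # new service, independent pool
deployment.add_boxes("us", "search", 4)    # new market, independent pool
```

Note that plain modulo routing moves many users to a different box whenever a pool grows, which is acceptable only if the boxes hold no per-user state; a stateful service would need a different placement scheme.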
Once you have decided that it is time to scale, start breaking your system down into smaller independent services, each with its own API and its own data store (the "Share Nothing" concept). The data store can be based on an RDBMS, but you are more likely to find a more suitable data solution now that you only need to solve the data issues of a more focused service. Maybe a simple in-memory hash table can do everything that is needed, or any of the many ready-made data solutions that were developed for similar services.
When the service is narrower and more focused, it is easier to identify the right scalability pattern and to choose the right set of tools to implement it.
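As a minimal sketch of the "Share Nothing" idea, assuming two hypothetical services (`PageViewService` and `ProfileService`, names invented for illustration), each one below owns a private in-memory hash table and exposes only a narrow API. Neither service reads the other's store; a caller combines them through their APIs.

```python
class PageViewService:
    """Hypothetical service: counts page views in its own private hash table."""

    def __init__(self):
        self._views = {}  # page -> view count, private to this service

    def record_view(self, page: str) -> int:
        self._views[page] = self._views.get(page, 0) + 1
        return self._views[page]

    def views(self, page: str) -> int:
        return self._views.get(page, 0)


class ProfileService:
    """Hypothetical service: stores user profiles in its own private hash table."""

    def __init__(self):
        self._profiles = {}  # user_id -> profile dict, private to this service

    def put_profile(self, user_id: str, name: str) -> None:
        self._profiles[user_id] = {"name": name}

    def get_profile(self, user_id: str):
        profile = self._profiles.get(user_id)
        # Return a copy so callers cannot mutate the service's own data.
        return dict(profile) if profile is not None else None


def render_greeting(profiles: ProfileService, views: PageViewService,
                    user_id: str, page: str) -> str:
    """A caller that combines both services through their APIs only."""
    profile = profiles.get_profile(user_id)
    name = profile["name"] if profile else "guest"
    count = views.record_view(page)
    return f"Hello {name}, {page} has been viewed {count} time(s)"


profiles = ProfileService()
views = PageViewService()
profiles.put_profile("u1", "Dana")
greeting = render_greeting(profiles, views, "u1", "/home")
```

Because each store sits behind its service's API, either dictionary could later be replaced by an RDBMS or another data solution without the other service or the caller noticing.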
This is certainly not a new idea, as it is discussed at length as SOA (Service Oriented Architecture), but SOA is usually used for external services and less for the internal breakdown of a single system. It seems that the overhead of SOA, compared to a simple function call or an update of the same RDBMS, is too high in many cases. But it is the easiest option if you wish to scale different aspects of your system differently (in terms of functions or data).
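To make the overhead concrete, here is a rough sketch that runs the same logic once as a direct function call and once across a simulated service boundary. The function names and the JSON message shape are assumptions made for illustration, and the network hop is replaced by a plain call, so only the extra encoding and decoding steps are visible.

```python
import json


def add_to_cart(cart: dict, item: str, quantity: int) -> dict:
    """The business logic itself: return a new cart with the item added."""
    updated = dict(cart)
    updated[item] = updated.get(item, 0) + quantity
    return updated


def handle_request(raw_request: str) -> str:
    """Service side: decode the message, run the same logic, encode the reply."""
    request = json.loads(raw_request)
    result = add_to_cart(request["cart"], request["item"], request["quantity"])
    return json.dumps({"cart": result})


def call_service(cart: dict, item: str, quantity: int) -> dict:
    """Client side: encode the request, 'send' it, and decode the response."""
    raw_request = json.dumps({"cart": cart, "item": item, "quantity": quantity})
    raw_response = handle_request(raw_request)  # stands in for a network hop
    return json.loads(raw_response)["cart"]


direct_result = add_to_cart({"apple": 1}, "apple", 2)    # one function call
service_result = call_service({"apple": 1}, "apple", 2)  # encode, hop, decode
```

Both paths produce the same cart; the service path pays for serialization on both sides and, in a real deployment, for the network round trip. In exchange, `handle_request` could run on its own boxes and scale independently of its callers.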