By CAP theorem, we can have any two of C, A and P.
But why do we even need Partition tolerance, or let's say multiple network partition. If we have only one partition we can make a CA system.
I assume the system can still be scaled well in the same network partition.
One reason of network partitioning could be geographical distribution of nodes.
Can anyone suggest any other reason?
In a nutshell, it's all about terminology being loosely interpreted.
If we have only one partition we can make a CA system.
What you call a network partition is not the same as Partition (connection loss between nodes) in terms of CAP theoreme.
If you were to guarantee C and A at the same time, you'd have to eliminate the possibility of Partitioning (connection loss) between nodes, as CA is only possible if nodes can communicate.
Now, if you eliminate Partitioning you obviously no longer can guarantee Partition tolerance as there's nothing to guarantee.
I assume the system can still be scaled well in the same network partition.
A simple network partition comprises at least two nodes and a switch in between. If the switch goes down, the partition (connection loss) occurres. So you can't have a CA system since nodes can't sync data among them if there's no connection. Thus you are left with having to guarantee Partition tolerance and that makes C and A mutually exclusive.
That is you may only choose to guarantee A and forfeit C - allow nodes to keep serving requests (stay available) and thus allow their data to differ (inconsistent); or guarantee C and forfeit A - keep the data the same among all nodes (consistent), but in order to achieve that, nodes should not allow updates (be unavailable).