I understand the problem with linear probing: because colliding keys are placed in consecutive slots, the elements end up in clusters. But I don't understand this statement: "The bigger the cluster gets, the more it reduces performance."
How does it reduce the performance of hashing?
For the first insertion into an empty hash table, we are guaranteed not to encounter any collisions. Suppose for the sake of argument that we are very unlucky - our second insertion hashes to the same slot as our first, and we have to perform a (very small) linear search to find the next free slot. The probability of this collision was 1/n for a table of n slots. Now we've got two elements next to each other in an otherwise empty table. What are the odds of our next insertion colliding with this cluster? Not 1/n as with the second insertion, but now 2/n - the chances have increased. In general, the odds of a key hashing into a k-slot cluster are k/n, and when that happens, the insertion has to search linearly to the end of the cluster, not only wasting time but also increasing the length of the cluster! The problem is that the pattern is self-reinforcing: bigger clusters are hit more often, and every hit makes them bigger. As your table gets full, your insertion time can approach O(n).
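To see the effect in numbers, here is a minimal simulation sketch of the argument above. It assumes a hash function that spreads keys uniformly, modeled by picking a random home slot for each insertion; the names `insert` and `average_probes` are made up for this illustration and do not come from any library.

```python
import random


def insert(table, home):
    """Linear probing: start at the home slot and walk forward (wrapping
    around) until a free slot is found. Returns how many slots were examined."""
    n = len(table)
    for probes in range(1, n + 1):
        slot = (home + probes - 1) % n
        if not table[slot]:
            table[slot] = True  # claim the slot; the cluster may grow by one
            return probes
    raise RuntimeError("table is full")


def average_probes(n, fill_fraction, seed=0):
    """Fill a table of n slots up to fill_fraction and return the average
    number of probes for the first and for the last tenth of the insertions."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    table = [False] * n
    count = int(n * fill_fraction)
    # Assumption: a uniform hash, modeled as a uniformly random home slot.
    costs = [insert(table, rng.randrange(n)) for _ in range(count)]
    tenth = max(1, count // 10)
    early = sum(costs[:tenth]) / tenth
    late = sum(costs[-tenth:]) / tenth
    return early, late


if __name__ == "__main__":
    early, late = average_probes(1000, 0.9)
    print(f"avg probes, first 10% of insertions: {early:.2f}")
    print(f"avg probes, last 10% of insertions:  {late:.2f}")
```

The early insertions should cost close to one probe each, while the late ones have to walk through long clusters first. Lookups suffer the same way, since a search follows the same probe sequence as the insertion did.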