From Chapter 11 (Performance and Scalability) of the JCIP book, in the section on context switching:
When a new thread is switched in, the data it needs is unlikely to be in the local processor cache, so a context-switch causes a flurry of cache misses, and thus threads run a little more slowly when they are first scheduled.
Can someone explain, in an easy-to-understand way, the concept of a cache miss and its probable opposite (a cache hit)?
A cache miss, generally, is when something is looked up in the cache and is not found: the cache did not contain the item being looked up. A cache hit is when you look something up in the cache, the item is already stored there, and the cache can satisfy the query itself.
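To make the two terms concrete, here is a minimal sketch of a software cache that counts hits and misses. It is only an analogy for what the hardware does; `CountingCache` and `loadFromSlowStore` are hypothetical names invented for this illustration, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheHitMissDemo {
    // Hypothetical toy cache sitting in front of a slow backing store.
    static final class CountingCache {
        private final Map<Integer, String> entries = new HashMap<>();
        int hits = 0;
        int misses = 0;

        String get(int key) {
            String value = entries.get(key);
            if (value != null) {
                hits++;                      // cache hit: the item was already stored
                return value;
            }
            misses++;                        // cache miss: fall back to the slow source
            value = loadFromSlowStore(key);
            entries.put(key, value);         // keep it so the next lookup is a hit
            return value;
        }

        // Stand-in for an expensive access (main memory, in the CPU analogy).
        private String loadFromSlowStore(int key) {
            return "value-" + key;
        }
    }

    public static void main(String[] args) {
        CountingCache cache = new CountingCache();
        cache.get(1); // miss
        cache.get(2); // miss
        cache.get(1); // hit
        System.out.println("hits=" + cache.hits + ", misses=" + cache.misses);
        // prints "hits=1, misses=2"
    }
}
```

The first lookup of any key is always a miss; only repeated lookups benefit from the cache.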
Why would context switching cause a lot of cache misses?
In terms of memory, each processor has a memory cache, a small, high-speed copy of portions of main memory. When a new thread is context-switched onto a processor, the local cache usually doesn't hold the data that thread needs (except for some shared data structures and code if it's another thread of the same process). This means that most, if not all, of the new thread's initial memory lookups result in cache misses, because the data it needs is not in the local cache. The hardware then has to make a number of requests to main memory to fill the cache, which causes the thread to run slower at first.
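The effect can be imitated with a toy simulation. The sketch below assumes a tiny LRU cache of 64 "addresses" and two threads whose working sets don't overlap; real hardware caches are organized differently (cache lines, sets, several levels), and `LruCache` and `touchWorkingSet` are names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContextSwitchCacheDemo {
    // Toy LRU "processor cache" holding at most `capacity` addresses (an assumption
    // for illustration; real caches are not simple LRU maps).
    static final class LruCache extends LinkedHashMap<Integer, Boolean> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder=true gives least-recently-used ordering
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
            return size() > capacity; // evict the least recently used address when full
        }
    }

    // Touches addresses [base, base + count) once each and returns the number of misses.
    static int touchWorkingSet(LruCache cache, int base, int count) {
        int misses = 0;
        for (int addr = base; addr < base + count; addr++) {
            if (cache.get(addr) == null) {
                misses++;                    // not cached: simulate a fetch from main memory
                cache.put(addr, Boolean.TRUE);
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        LruCache cache = new LruCache(64);
        int cold = touchWorkingSet(cache, 0, 64);           // thread A, first run
        int warm = touchWorkingSet(cache, 0, 64);           // thread A keeps running
        int afterSwitch = touchWorkingSet(cache, 1000, 64); // thread B is switched in
        int switchedBack = touchWorkingSet(cache, 0, 64);   // thread A is switched back in
        System.out.println("cold=" + cold + " warm=" + warm
                + " afterSwitch=" + afterSwitch + " switchedBack=" + switchedBack);
        // prints "cold=64 warm=0 afterSwitch=64 switchedBack=64"
    }
}
```

While thread A keeps running, its data stays cached and it sees no misses. Once thread B runs, it misses on everything and evicts A's data in the process, so A pays the same cost again when it is rescheduled.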
Some early CPUs used virtually-addressed L1 caches which had to be invalidated on context-switches to maintain correctness, but these days caches are physically addressed (they cache a region of physical memory, regardless of what virtual address maps it). They can stay valid across context switches, so it's a question of whether the new thread will touch any of the same memory that the previous thread was touching. Threads will have their own stack memory, and often work on different data.
A security-paranoid system might perhaps invalidate caches on context switch to try to guard against Spectre attacks that use a cache-timing side channel between user-space processes, but that's not normally done.