Okay, my question is probably dumb, but I can't find any answers that correct me.
I learned that in DDR4 (let's say the stick has 8 chips) each chip contributes 8 bits in parallel to the 64-bit bus width.
My question is: why? What if I want to get 64 bits that are stored in a single chip? With the existing layout (which I have explained very poorly and possibly wrongly), that would take 8 cycles of data transfer.
why
Because it means you get to handle more bits on each cycle. If each chip can process X bits per cycle in your hypothetical 8-chip stick, the controller can now process 8X bits per cycle.
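To put numbers on that, here is a rough sketch of the arithmetic in Python. It assumes the setup from the question (8 chips, 8 data bits each); `transfers_needed` is just a made-up helper for illustration, not anything a real memory controller exposes.

```python
# Assumed setup from the question: 8 chips, each driving 8 data bits.
CHIPS = 8
BITS_PER_CHIP = 8

def transfers_needed(total_bits, chips_in_parallel):
    """Data transfers needed to move total_bits when this many chips work at once."""
    bits_per_transfer = chips_in_parallel * BITS_PER_CHIP
    return -(-total_bits // bits_per_transfer)  # ceiling division

striped = transfers_needed(64, CHIPS)  # the 64 bits are spread over all 8 chips
single = transfers_needed(64, 1)       # the 64 bits all sit in one chip

print(striped, single)
```

With the data spread across all the chips, the 64 bits move in one transfer. If they all lived in one chip, the same 64 bits would need eight.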
what if I want to get 64 bits that are stored in a single chip
You likely won't. Interleaving means every operation is spread evenly across all the chips, and since it's supposed to be transparent (i.e., the OS and apps generally don't care whether a stick has 2, 4, or 8 chips), you'd have to go out of your way to write and read something that ends up on a single chip, and in the process you'd be writing and reading the rest of the chips anyway. A normal store or read ends up using all the chips, very quickly, without you having to care about the details.
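If it helps to picture it, here is a minimal sketch of how one 64-bit word could be sliced across 8 chips and put back together. The mapping of byte `i` to chip `i` is an assumption for illustration only, since the real lane assignment depends on how the module is wired, and `stripe_word` and `gather_word` are hypothetical names.

```python
CHIPS = 8  # assumed: 8 chips, one byte lane each

def stripe_word(word):
    """Split a 64-bit word into 8 bytes, one per chip (assumed: byte i -> chip i)."""
    return [(word >> (8 * i)) & 0xFF for i in range(CHIPS)]

def gather_word(lanes):
    """Reassemble the 64-bit word from the byte each chip returns."""
    word = 0
    for i, byte in enumerate(lanes):
        word |= byte << (8 * i)
    return word

word = 0x1122334455667788
lanes = stripe_word(word)             # what each chip would store
print([hex(b) for b in lanes])
print(hex(gather_word(lanes)))        # same word, read back from all chips at once
```

Every chip holds exactly one byte of the word, so there is never a case where a whole 64-bit value sits in one chip while the others stay idle.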