I have read that it is common practice to store mipmaps in the same file as the texture and to load the base image and the mipmaps at the same time. On the other hand, reading from storage (HDD or SSD) is drastically slower than reading from memory closer to the CPU, and graphics APIs such as OpenGL and Vulkan allow you to generate mipmaps on the graphics card.
So I was wondering whether it would actually be faster to generate the mipmaps on the graphics card at run time rather than waste cycles waiting for the extra reads from storage. Also, would the answer change if textures had to be loaded and unloaded dynamically at run time, with other operations continuing while they were loaded from storage?
Perhaps this is not quite the answer you're after, but it is common (and, IMO, recommended from a rendering performance point of view) for textures to be compressed, using whatever format the GPU supports. In that case you typically generate the MIP map chain and compress it 'offline'.
Given that you can achieve compression rates of, say, 2~8bpp (i.e. between 1/16th and 1/4 of the original size of a 32bpp source), the compression will easily counteract the 1/3rd overhead that the smaller MIP map levels add to a texture.
Further, in practice, you might be limited on the GPU to just using a 2x2 box filter for downsampling, and that may be sub-optimal from a quality perspective.
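For reference, this is roughly what a 2x2 box filter does, written as a CPU-side sketch in Python rather than as anything a particular driver is guaranteed to do. The function names are hypothetical, and the code assumes a square, power-of-two, single-channel image stored as a list of rows.

```python
def box_downsample(image):
    """Halve a 2D grayscale image by averaging each 2x2 block of pixels."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def build_mip_chain(image):
    """Return [level0, level1, ...] down to 1x1 (assumes power-of-two size)."""
    chain = [image]
    while len(chain[-1]) > 1 and len(chain[-1][0]) > 1:
        chain.append(box_downsample(chain[-1]))
    return chain


# A 4x4 checkerboard of 0 and 255: fine detail that a box filter flattens.
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
chain = build_mip_chain(checker)
```

Each output pixel only ever sees its own four source pixels, which is why an offline tool with a wider filter kernel can give better-looking lower levels.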
I'm assuming you're not considering content that is generated 'on the fly', such as environment maps for reflections, in which case texture compression is not really an option.