Are Lisp lists always implemented as linked lists under the hood?
Is this a problem as far as processor caching goes? If so, are there solutions that use more contiguous structures which help caching?
Lisp implementations often can store some values directly in cons cells: fixnums, characters, ... For everything else a pointer will be stored in the car
or cdr
.
Nowadays almost all implementations which are using cons cells don't use optimizations like cdr-coding
.
Memory locality usually is improved by using a copying / compacting / generational garbage collector.
copying -> when a space is full, the GC copies the list and will allocate the new cells next to each other in a new storage area
compacting -> some scheme to get rid of memory gaps or similar
generational -> longer living objects will be promoted to different storage areas. Thus a list which has survived some GCs will get copyied to another generation and the cells will be allocated next to each other.
Sometimes above GC stretegies are combined in fancy ways.
Also note that in many Lisp programs many of those cons cells may be short-lived:
(mapcar #'1+
(mapcar #'isqrt '(10 20 30 40 50)) ; <- result is 'garbage'
)
The list of integer square roots is immediately garbage. The function will just walk through the fresh cons cells and allocate new fresh cons cells and there won't be much cache non-locality.
Allocation of cons cells can be reduced by using destructive operations. Above can be written as:
CL-USER 24 > (let ((s (mapcar #'isqrt '(10 20 30 40 50))))
(map-into s #'1+ s))
(4 5 6 7 8)
This will get rid of one allocated list and further improves locality.