In the context of skip nets, I learned that a skip list is a good alternative to a balanced tree, and easier to implement.
However, I see a problem in implementing the insert action of a skip list, especially for the case where the inserted value becomes the new head of the skip list. If we set the inserted value as the new head, we have to add it to each level, because the skip list has 2 properties (If I understand correctly):
So, If we add the values 5, 4, 3, 2, 1 in that decreasing order into an empty skip list, we will get a really "ugly" skip list, where each newly inserted node must be present in at least the same linked lists as its successor node. The resulting skip list doesn't seem to give any efficiency gain for searches compared to a linear search.
What am I missing here?
I thought of using a doubly-linked list in each level to have more flexibility when searching. Is a singly-linked list a required property of skip lists?
It is common practice to maintain one or two sentinel nodes: the head and tail nodes. These nodes do not contain data that belongs to your collection. If for instance your data is of type float
, these sentinel nodes could have -Infinity
and +Infinity
as values respectively.
This then also solves your problem: whatever happens to be the least data value that you store in the list, it will not become the head node (as there is a sentinel node for that), nor does it need to be linked into all linked lists.
One other remark on what you wrote:
The down value of each node points to itself in the lower level.
This is an implementation detail, but it is not uncommon to design your nodes like this (using C syntax here):
struct SkipNode {
int value;
SkipNode* next[MAX_LEVELS];
};
The next
array will have a pointer for each level in which this node is part of the linked list. In fact, this array acts like a stack -- at index 0 we have the bottom layer, and this stack may grow to add the node's inclusion in upper layers. A "node" is thus a value and a stack to represent where this value is included in a linked list.
Here is a little runnable implementation in JavaScript:
class SkipNode {
constructor(value) {
this.value = value;
this.next = []; // Stack of node references: one per level
}
}
class SkipList {
constructor() {
// Two sentinel nodes
this.head = new SkipNode(-Infinity);
this.tail = new SkipNode(+Infinity);
this.head.next.push(this.tail); // We start with one level: [-inf] -> [+inf]
}
find(value) {
let node = this.head; // Sentinel node (-inf)
const path = []; // to collect one node from each level
for (let level = node.next.length - 1; level >= 0; level--) { // The head node has all levels
// Find interval in current level:
let next = node.next[level];
while (next.value <= value) {
node = next;
next = node.next[level];
}
path.push(node); // Keep track of left-side node at this level
if (node.value === value) break;
}
return path.reverse(); // Bottom to top
}
has(value) {
return this.find(value)[0].value === value;
}
insert(value) {
const path = this.find(value);
if (path[0].value === value) return; // Value is already in the collection
const newNode = new SkipNode(value);
for (let level = 0; level < path.length; level++) {
newNode.next.push(path[level].next[level]); // Grow the stack
path[level].next[level] = newNode;
if (Math.random() < 0.5) return; // Random choice to grow the stack more or not
}
// Add new level
this.head.next.push(newNode);
newNode.next.push(this.tail);
}
print() { // 90° rotated view on the sizes of the stacks: traverse bottom layer
let node = this.head.next[0];
while (node !== this.tail) {
console.log(node.value + " " + "*".repeat(node.next.length));
node = node.next[0];
}
}
}
// Demo: insert the values 20, 19, 18, ..., 1, 0 into a skip list
const lst = new SkipList();
for (let value = 20; value >= 0; value--) lst.insert(value);
lst.print();
You also wrote:
is the single-linked list a required property of the Skip List
It is not an absolute requirement, but as you can see in the above implementation, it is not needed to implement it as a doubly linked list. That would just add more work to perform an insert operation. By ensuring you always know what the closest preceding node is at each level for a given value, you have enough information to perform the insertion in singly linked lists.