I am reading the basics of splay trees. The amortized cost of an operation is O(log n) over a sequence of n operations. The rough idea is that when you access a node, you splay it, i.e. you rotate it up to the root, so the next access to it is fast; also, if the node was deep, splaying improves the balance of the tree.
I don't understand how the tree can achieve amortized O(log n) on this sample input:
Say a tree of n nodes is already built, and my next n operations are n reads. I access a deep node, say at depth n. This takes O(n). True, after this access the tree will become more balanced. But suppose that each time I access whichever node is currently deepest. Such an access will never cost less than O(log n), so how can we ever compensate for the first costly O(n) operation and bring the amortized cost of each read down to O(log n)?
Thanks.
Assuming your analysis is correct and the operations are O(log n) per access and O(n) the first time...
If you always access the bottommost element (using some kind of worst-case oracle), a sequence of a accesses will take O(a*log n + n). Thus the amortized cost per operation is O((a*log n + n)/a) = O(log n + n/a), or just O(log n) as the number of accesses a grows large.
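To see that formula numerically, here is a quick sketch (the tree size n and the access counts are arbitrary choices of mine, and constant factors are ignored):

```python
import math

n = 1_000_000                        # hypothetical tree size
for a in (1, 10, 1_000, 1_000_000):  # growing number of accesses
    # amortized cost per access: (a*log n + n) / a = log n + n/a
    amortized = (a * math.log2(n) + n) / a
    print(f"a = {a:>9}: amortized ~ {amortized:12,.1f} (log2 n ~ {math.log2(n):.1f})")
```

The n/a term dominates for small a and vanishes as a grows, which is exactly why the one expensive access stops mattering.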
This is the definition of asymptotic average-case performance/time/space, also called "amortized performance/time/space". You are accidentally thinking that a single O(n) step means all steps are at least O(n); one such step is only a constant amount of work in the long run. The O(...) is hiding what's really going on, which is taking the limit of [total amount of work]/[queries] = [average ("amortized") work per query].
"This will never be less than O(log n)."
It has to be, in order to get O(log n) average performance -- the point is not that later accesses become cheaper than log n, but that the single O(n) access disappears into the average. To get intuition, the following website may be good: http://users.informatik.uni-halle.de/~jopsi/dinf504/chap4.shtml specifically the image http://users.informatik.uni-halle.de/~jopsi/dinf504/splay_example.gif -- it seems that while performing the O(n) operations, you move the path you searched, scrunching it towards the top of the tree. You probably only have a finite number of such O(n) operations to perform until the entire tree is balanced.
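To check that intuition empirically, here is a minimal, self-contained Python sketch of a bottom-up splay tree (my own illustration, not code from the question or any library): it builds the questioner's worst case, a path of n nodes, always accesses the deepest node, and counts rotations as the measure of work:

```python
import math

class Node:
    __slots__ = ("key", "left", "right", "parent")
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

rotations = 0  # total work done, measured in single rotations

def rotate(x):
    """Rotate x above its parent, fixing all parent pointers."""
    global rotations
    rotations += 1
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left: p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right: p.right.parent = p
    p.parent, x.parent = x, g
    if g:
        if g.left is p: g.left = x
        else:           g.right = x

def splay(x):
    """Standard bottom-up splay: zig / zig-zig / zig-zag until x is the root."""
    while x.parent:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                  # zig
        elif (g.left is p) == (p.left is x):
            rotate(p); rotate(x)       # zig-zig
        else:
            rotate(x); rotate(x)       # zig-zag
    return x

def deepest(root):
    """The questioner's worst-case oracle: find a deepest node (O(n) scan)."""
    best = (0, root)
    stack = [(0, root)]
    while stack:
        d, node = stack.pop()
        if d > best[0]:
            best = (d, node)
        for child in (node.left, node.right):
            if child:
                stack.append((d + 1, child))
    return best[1]

# Worst case: a left path of n nodes, so the first access is at depth n - 1.
n = 1024
root = Node(n)
cur = root
for k in range(n - 1, 0, -1):
    cur.left = Node(k)
    cur.left.parent = cur
    cur = cur.left

for _ in range(n):                 # n reads, always of the deepest node
    root = splay(deepest(root))

print(f"average rotations per access: {rotations / n:.1f}")
print(f"log2(n)                     : {math.log2(n):.1f}")
```

On my machine the average stays within a small constant factor of log2 n, even though the first access alone costs about n rotations; the early expensive accesses "scrunch" the searched path, just as described above.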
Here's another way to think about it:
Consider an unbalanced binary search tree. You can spend O(n) time balancing it. Assuming you don't add elements to it*, it takes O(log n) amortized time per query to fetch an element. The balancing setup cost is included in the amortized cost because it is effectively a constant which, as demonstrated in the equations above, disappears (is dwarfed) by the infinite amount of work you are doing. (*If you do add elements to it, you need a self-balancing binary search tree, one of which is a splay tree.)
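And a sketch of that balance-once-then-query idea (the helper names are mine, invented for illustration): flatten the bad tree in sorted order, rebuild by always picking the middle key, and the depth, i.e. the worst-case query cost, drops from n to about log2 n:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def inorder(node, out):
    """Flatten the BST into its sorted key list: O(n)."""
    if node:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)

def build_balanced(keys, lo=0, hi=None):
    """Rebuild by picking the middle key as root: the one-time O(n) balancing cost."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(keys[mid])
    node.left = build_balanced(keys, lo, mid)
    node.right = build_balanced(keys, mid, hi)
    return node

def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

# Worst case: 500 keys inserted in sorted order, i.e. a right path of depth 500.
n = 500
root = Node(1)
cur = root
for k in range(2, n + 1):
    cur.right = Node(k)
    cur = cur.right

keys = []
inorder(root, keys)             # O(n)
root = build_balanced(keys)     # O(n), paid once
print(height(root))             # ~log2(n): prints 9 instead of 500
```

After that one O(n) investment, every query walks at most 9 nodes, so the setup cost is dwarfed as the number of queries grows.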