I have found several pieces of code where refcount_inc
is called after the mod_timer
. E.g. this one:
static int ip_frag_reinit(struct ipq *qp)
{
unsigned int sum_truesize = 0;
if (!mod_timer(&qp->q.timer, jiffies + qp->q.fqdir->timeout)) {
refcount_inc(&qp->q.refcnt);
return -ETIMEDOUT;
}
}
Is it safe to call functions in that order?
The logic of the timer is like this
void timer_callback()
{
...
refcount_dec(obj);
}
The timer may have already been executed on another CPU when mod_timer
returns.
Even if the object is already referenced by the caller of ip_frag_reinit
function, if the timer callback executes first, the object at the caller's side will face a UAF (Use After Free) issue.
At the same time there are many places in the kernel where same functions are called in the opposite, seemingly correct order: refcount_inc
before mod_timer
.
The timer function is probably the ip_expire, which looks like that:
static void ip_expire(struct timer_list *t)
{
struct ipq *qp;
int err;
qp = container_of(...);
rcu_read_lock();
/* Paired with WRITE_ONCE() in fqdir_pre_exit(). */
if (READ_ONCE(qp->q.fqdir->dead))
goto out_rcu_unlock;
spin_lock(&qp->q.lock);
...
spin_unlock(&qp->q.lock);
icmp_send(head, ICMP_TIME_EXCEEDED, ICMP_EXC_FRAGTIME, 0);
goto out_rcu_unlock;
out:
spin_unlock(&qp->q.lock);
out_rcu_unlock:
rcu_read_unlock();
kfree_skb_reason(head, SKB_DROP_REASON_FRAG_REASM_TIMEOUT);
ipq_put(qp);
}
The last line calls ipq_put
, which is effectively refcount_dec_and_test
.
However, before that line the function ip_expire
acquires (and releases) the spinlock.
The function ip_frag_reinit
is called with the same spinlock being taken.
That is, before this function calls refcount_inc
, the timer function ip_expire
is unable to execute all its instructions from the beginning to the ipq_put()
: this execution will stop at line
spin_lock(&qp->q.lock);
(Note, that if
branch with refcount_inc
call is executed only when mod_timer
returns 0, which means that given timer is not active, so its function is not started yet.)
So, there is no risk for the timer function to drop reference count for the object to 0, before ip_frag_reinit
function get its own reference with refcount_inc
.
Well, in the timer function ip_expire
there is branch
if (READ_ONCE(qp->q.fqdir->dead))
which bypass spinlock acquiring. But I guess that condition is not fulfilled for the object processed with ip_frag_reinit
.