Let's say we have two threads: th1 and th2.
Let's imagine this line of events:
Th1 locks the mutex and does some work in its critical region.
Th2 calls lock on the mutex but is blocked.
Th1 finishes its work on the critical region and unlocks the mutex.
My question now is: even if th1 still had some time left to work, will the call to unlock the mutex necessarily cause an immediate context switch to a thread that is blocked on the lock (in this case, to th2), if there is one?
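For concreteness, here is a minimal sketch of the scenario (the thread names th1 and th2 match the description above; std::mutex is just a convenient stand-in):

```cpp
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;

void th1_work() {
    {
        std::lock_guard<std::mutex> guard(m);   // th1 locks the mutex
        std::this_thread::sleep_for(
            std::chrono::milliseconds(10));     // work in the critical region
    }                                           // th1 unlocks here
    // th1 still has work left to do; does the unlock above force an
    // immediate context switch to th2, which is blocked in lock()?
    std::cout << "th1: work after the critical region\n";
}

void th2_work() {
    std::lock_guard<std::mutex> guard(m);       // blocks while th1 holds it
    std::cout << "th2: got the mutex\n";
}

int main() {
    std::thread th1(th1_work);
    std::this_thread::sleep_for(std::chrono::milliseconds(1)); // let th1 lock first
    std::thread th2(th2_work);
    th1.join();
    th2.join();
}
```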
No; or, at the very most, it is implementation defined.
Typically, uncontested mutex operations never need to enter the kernel at all; they are just atomic operations on a memory location. When a conflict is detected, such as one thread wanting to lock a mutex that another thread owns, the wanting thread has to enter the kernel to wait for it, and the kernel needs to adjust the mutex so that the owning thread signals the kernel when it is done with it.
The distinction is that entering the kernel to wait may block indefinitely, until the mutex is released, while signaling merely indicates that such an event has occurred.
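For illustration, here is a minimal sketch of that split in the style of a Linux futex-based mutex (simplified along the lines of Ulrich Drepper's "Futexes Are Tricky"; real implementations such as glibc's pthread mutexes are considerably more elaborate, and the class name here is my own). The uncontended paths are single atomic operations; the kernel is entered, via the futex syscall, only when a conflict is detected:

```cpp
#include <atomic>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

class FutexMutex {
    // 0 = unlocked, 1 = locked with no waiters, 2 = locked and contended.
    std::atomic<uint32_t> state_{0};

    long futex(int op, uint32_t val) {
        // Relies on std::atomic<uint32_t> having the layout of a plain
        // uint32_t, which holds on Linux for lock-free atomics.
        return syscall(SYS_futex, reinterpret_cast<uint32_t*>(&state_),
                       op, val, nullptr, nullptr, 0);
    }

public:
    void lock() {
        uint32_t c = 0;
        // Fast path: 0 -> 1, an uncontended acquire with no kernel call.
        if (state_.compare_exchange_strong(c, 1, std::memory_order_acquire))
            return;
        // Slow path: mark the mutex contended; if it was actually free
        // (old value 0) we now own it, otherwise sleep in the kernel and
        // retry on every wakeup.
        if (c != 2)
            c = state_.exchange(2, std::memory_order_acquire);
        while (c != 0) {
            futex(FUTEX_WAIT_PRIVATE, 2);   // wait only while state is 2
            c = state_.exchange(2, std::memory_order_acquire);
        }
    }

    void unlock() {
        // Fast path: old value 1 means nobody is waiting; a single atomic
        // release is the whole unlock.
        if (state_.exchange(0, std::memory_order_release) == 2)
            // Contended: ask the kernel to wake one waiting thread. Note
            // that this only makes the waiter runnable; it does not force
            // an immediate context switch to it.
            futex(FUTEX_WAKE_PRIVATE, 1);
    }
};
```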
When the kernel is informed that a contested mutex has become available, it must, at the very least, enable the waiting thread to run. When it runs, it may discover the mutex is still not available, and re-enter its waiting mode.
Whether the waiting thread runs before the releasing thread may depend on deterministic factors such as priority or scheduling class. On a multiprocessor, both threads may be unleashed simultaneously on separate CPUs, so the next acquisition of the mutex may be entirely non-deterministic.
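A small experiment (a demonstration, not a proof) that shows this on a typical system: the releasing thread unlocks and immediately relocks, and it will usually re-acquire the mutex many times before the blocked thread gets a turn, so unlock clearly does not imply an immediate hand-off. The main thread plays the role of th1 here; in principle it could even keep winning indefinitely, which is exactly the starvation the next paragraph refers to:

```cpp
#include <atomic>
#include <iostream>
#include <mutex>
#include <thread>

int main() {
    std::mutex m;
    std::atomic<bool> th2_started{false};
    std::atomic<bool> th2_got_it{false};

    m.lock();                                   // th1 (main) owns the mutex
    std::thread th2([&] {
        th2_started = true;
        std::lock_guard<std::mutex> g(m);       // blocks until th1 releases
        th2_got_it = true;
    });
    while (!th2_started) { /* spin until th2 is about to block */ }

    int reacquisitions = 0;
    while (!th2_got_it && reacquisitions < 1000000) {
        m.unlock();                             // release: th2 could run now...
        m.lock();                               // ...but th1 often wins the race
        ++reacquisitions;
    }
    m.unlock();                                 // let th2 in for real
    th2.join();
    std::cout << "th1 re-acquired the mutex " << reacquisitions
              << " times before th2 got it\n";
}
```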
On the other hand, some systems, such as Google's fair scheduling mutexes (done entirely in user mode), ensure that the starvation described above cannot happen.
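This is not Google's code, but a ticket lock is probably the simplest illustration of the idea: a user-mode lock that hands itself to waiters in strict FIFO order, so the longest-waiting thread is guaranteed to acquire next and the barging shown above cannot starve it. (This sketch spins instead of sleeping, which real implementations avoid for long waits; the class name is my own.)

```cpp
#include <atomic>
#include <thread>

class TicketLock {
    std::atomic<unsigned> next_ticket_{0};   // ticket dispenser
    std::atomic<unsigned> now_serving_{0};   // ticket currently allowed in

public:
    void lock() {
        // Take the next ticket; waiters are ordered by ticket number.
        unsigned my_ticket = next_ticket_.fetch_add(1, std::memory_order_relaxed);
        // Wait until it is our turn; FIFO hand-off is enforced by the counter.
        while (now_serving_.load(std::memory_order_acquire) != my_ticket)
            std::this_thread::yield();
    }

    void unlock() {
        // Pass the lock directly to the next ticket holder, if any.
        now_serving_.fetch_add(1, std::memory_order_release);
    }
};
```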
So implementation defined; and the definition provided by your implementation says a lot about your implementation.