My understanding is that the short-term scheduler is a module in the kernel (a process in itself, I guess?). It runs frequently to check whether it should preempt the running process (maybe because SJF is in use and a shorter job has arrived).
If that is correct, my intuition suggests that for the short-term scheduler to run, a context switch has to happen.
However, I don't think this is correct, judging from what my teacher has taught us.
Let's start by assuming a task has a state that is one of:
"currently running". If there are 8 CPUs then a maximum of 8 tasks can be currently running on a CPU at the same time.
"ready to run". If there are 20 tasks and 8 CPUs, then there may be 12 tasks that are ready to run on a CPU.
"blocked". This is waiting for IO (disk, network, keyboard, ...), waiting to acquire a mutex, waiting for time to pass (e.g. sleep()
), etc. Note that this includes things the task isn't aware of (e.g. fetching data from swap space because the task tried to access data that isn't actually in memory).
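As a rough sketch in C (names invented for illustration; real kernels like Linux track more states and far more per-task data), those three states might look like:

```c
#include <stdint.h>

/* Hypothetical task states mirroring the three states above. */
enum task_state {
    TASK_RUNNING,   /* "currently running" on some CPU        */
    TASK_READY,     /* "ready to run", waiting for a free CPU */
    TASK_BLOCKED,   /* waiting for IO, a mutex, a timer, ...  */
};

struct cpu_state {
    unsigned long registers[32];           /* placeholder for saved registers */
};

struct task {
    enum task_state  state;
    int              priority;
    uint64_t         cpu_time_used;        /* for time accounting   */
    int              time_slice_remaining; /* for time multiplexing */
    struct cpu_state regs;                 /* saved "CPU state"     */
};
```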
Sometimes a task will do something (call a kernel function like read(), sleep(), pthread_mutex_lock(), etc.; or access data that isn't in memory) that causes the task to switch from the "currently running" state to the "blocked" state. When this happens some other part of the kernel (e.g. the virtual file system layer, virtual memory management, ...) will tell the scheduler that the currently running task has blocked (and needs to be put into the "blocked" state); and the scheduler will have to find something else for the CPU to do, which will be either finding another task for the CPU to run (and switching that task from "ready to run" to "currently running") or putting the CPU into a power saving state (because there are no tasks for the CPU to run).
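A sketch of that blocking path, reusing the struct task above (block_current_task matches a name used further down in this answer; the other helpers are my assumptions):

```c
struct task *current_task(void);     /* assumed: this CPU's running task     */
struct task *pick_ready_task(void);  /* assumed: best "ready to run" or NULL */
void         cpu_go_idle(void);      /* assumed: enter a power saving state  */
void         switch_to_task(struct task *next, enum task_state prev_new_state);

/* Called by another kernel subsystem (VFS, virtual memory, ...) when the
   currently running task can no longer make progress. */
void block_current_task(int reason)
{
    /* `reason` would be recorded so the task can be found and unblocked
       when the thing it's waiting for happens. */
    struct task *next = pick_ready_task();

    if (next != NULL) {
        /* Low-level switch (sketched later in this answer); the second
           argument tells it the current task becomes "blocked". */
        switch_to_task(next, TASK_BLOCKED);
    } else {
        current_task()->state = TASK_BLOCKED;
        cpu_go_idle();               /* nothing runnable: idle this CPU */
    }
}
```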
Sometimes something that a task was waiting for occurs (e.g. the user presses a key, a mutex is released, data arrives from swap space, etc). When this happens some other part of the kernel (e.g. the virtual file system layer, virtual memory management, ...) will tell the scheduler that the task needs to leave the "blocked" state. When this happens the scheduler has to decide if the task will go from "blocked" to "ready to run" (and tasks that were using CPUs will continue using CPUs), or if the task will go from "blocked" to "currently running" (which will either cause a currently running task to be preempted and go from "currently running" to "ready to run", or will cause a previously idle CPU to be taken out of a power saving state). Note that in a well designed OS this decision will depend on things like task priorities (e.g. if a high priority task unblocks it preempts a low priority task, but if a low priority task unblocks then it doesn't preempt a high priority task).
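A sketch of that unblocking decision (again, the helper names are my assumptions, and a real implementation would also worry about locking and which CPU runs this code):

```c
int          cpu_is_idle_somewhere(void);            /* assumed helpers */
void         run_on_idle_cpu(struct task *t);
struct task *lowest_priority_running_task(void);
void         preempt_with(struct task *victim, struct task *t);

void unblock_task(struct task *t)
{
    if (cpu_is_idle_somewhere()) {
        /* "blocked" -> "currently running": wake a CPU from power saving */
        run_on_idle_cpu(t);
    } else {
        struct task *victim = lowest_priority_running_task();

        if (t->priority > victim->priority) {
            /* High priority preempts low priority: victim goes from
               "currently running" back to "ready to run" (e.g. via an
               inter-processor interrupt to victim's CPU). */
            preempt_with(victim, t);
        } else {
            /* Low priority doesn't preempt: just becomes "ready to run". */
            t->state = TASK_READY;
        }
    }
}
```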
On modern systems these 2 things (tasks entering and leaving the "blocked" state) are responsible for most task switches.
Other things that can cause task switches are:
a task terminates itself or crashes. This is mostly the same as a task blocking (some other part of the kernel informs the scheduler and the scheduler has to find something else for the CPU to do).
a new task is created. This is mostly the same as a task unblocking (some other part of the kernel informs the scheduler and the scheduler decides if the new task will preempt a currently running task or cause a CPU to be taken out of a power saving state).
the scheduler is frequently switching between 2 or more tasks to create the illusion that they're all running at the same time (time multiplexing). On a well designed modern system this only ever happens when there are more tasks at the same priority than there are available CPUs and those tasks block often enough; which is extremely rare. In some cases (e.g. "earliest deadline first" scheduling algorithm in a real-time system) this might be impossible.
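For completeness, here's what that time multiplexing might look like when it does happen: a hypothetical timer tick handler (the time slice length and helper names are assumptions):

```c
#define TIME_SLICE_TICKS 10          /* assumed time slice length */

struct task *ready_task_at_priority(int priority);  /* assumed helper */

void timer_tick(void)
{
    struct task *cur = current_task();

    if (--cur->time_slice_remaining > 0)
        return;                      /* time slice not used up yet */

    cur->time_slice_remaining = TIME_SLICE_TICKS;

    /* Only preempt if another task at the SAME priority is ready. */
    struct task *next = ready_task_at_priority(cur->priority);
    if (next != NULL)
        switch_to_task(next, TASK_READY);
    /* Otherwise there's no equal-priority competitor and cur keeps running. */
}
```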
My understanding is that the short-term scheduler is a module in the kernel (a process in itself, I guess?)
The scheduler is typically implemented as a set of functions that other parts of the kernel call - e.g. maybe a block_current_task(reason) function (where the scheduler might have to decide which other task to switch to), and an unblock_task(taskID) function (where, if the scheduler decides the unblocked task should preempt a currently running task, it already knows which task it wants to switch to). These functions may call an even lower level function to do an actual context switch (e.g. a switch_to_task(taskID), sketched in code after the list below), where that lower level function may:
do time accounting (work out how much time has passed since last time, and use that to update statistics so that people can know things like how much CPU time each task has consumed, how much time a CPU has been idle, etc).
if there was a previously running task (if the CPU wasn't previously idle), change the previously running task's state from "currently running" to something else ("ready to run" or "blocked").
if there was a previously running task, save the previously running task's "CPU state" (register contents, etc) somewhere (e.g. in some kind of structure).
change the state of the next task to "currently running" (regardless of what the next task's state was previously).
load the next task's "CPU state" (register contents, etc) from somewhere.
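Put together as code, that lower level function might look like this (a sketch only; I've given it a second parameter for the previous task's new state, a detail the taskID-only signature leaves implicit, and save_cpu_state()/load_cpu_state() would really be assembly):

```c
static uint64_t last_switch_time;             /* per-CPU in a real kernel */

uint64_t read_clock(void);                    /* assumed helpers */
void     save_cpu_state(struct cpu_state *s);
void     load_cpu_state(struct cpu_state *s);

void switch_to_task(struct task *next, enum task_state prev_new_state)
{
    struct task *prev = current_task();       /* NULL if the CPU was idle */
    uint64_t     now  = read_clock();

    /* 1. time accounting */
    if (prev != NULL)
        prev->cpu_time_used += now - last_switch_time;
    last_switch_time = now;

    if (prev != NULL) {
        /* 2. previously running task leaves "currently running" */
        prev->state = prev_new_state;         /* "ready to run" or "blocked" */

        /* 3. save the old task's register contents etc. */
        save_cpu_state(&prev->regs);
    }

    /* 4. next task becomes "currently running", whatever it was before */
    next->state = TASK_RUNNING;

    /* 5. load the next task's CPU state; execution resumes inside `next` */
    load_cpu_state(&next->regs);
}
```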
How can the short-term scheduler process run without a context switch happening for it?
The scheduler is just a group of functions in the kernel (and not a process).
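To make that concrete: a kernel function like sleep() reaches the scheduler with an ordinary function call, in the context of whatever task is already running, so there's nothing to context switch to first. A hypothetical sketch:

```c
#define BLOCKED_ON_SLEEP 1   /* assumed reason code */

void arm_wakeup_timer(struct task *t, uint64_t nanoseconds);  /* assumed */

long sys_sleep(uint64_t nanoseconds)
{
    arm_wakeup_timer(current_task(), nanoseconds);

    /* A plain function call into the scheduler; the context switch happens
       inside it as a consequence, not as a prerequisite for running it. */
    block_current_task(BLOCKED_ON_SLEEP);

    return 0;   /* execution resumes here after unblock_task() wakes us */
}
```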