Linux introduced the concept of scheduling domains to make the scheduler aware of the processor topology. The topology-aware scheduler is more flexible than the earlier O(1) scheduler and fulfills all the requirements discussed earlier. A scheduling domain is a group of CPUs whose load can be balanced against each other.

Scheduling domains are hierarchical, and load balancing starts from the base domain, since CPUs at the bottom of the hierarchy are closely related (for example, the two logical CPUs of an HT-enabled processor, which can share cache). Load balancing is performed more frequently at lower domains than at higher ones. To make this concept concrete, consider an SMT NUMA machine with four HT-enabled CPUs. The four CPUs are divided into two NUMA nodes, each having two CPUs. The scheduling domain hierarchy for this system is shown in the accompanying figure.

As mentioned earlier, a scheduling domain is a set of CPUs that can be balanced against each other. For example, at the HT Level domain, CPU 0:0 and CPU 0:1 can be balanced against each other. Note that CPU 0:0 and CPU 0:1 are the two logical CPUs of a single HT-enabled processor, CPU 0. As mentioned earlier, Linux treats an HT-enabled CPU as two logical CPUs.

[Figure: Scheduling domains]

At the next higher level, the Physical Level domain, CPU 0 and CPU 1 can be balanced against each other. Similarly, CPU 2 and CPU 3 can be balanced against each other. At the level above that, the NUMA Level domain, all CPUs (CPU 0, CPU 1, CPU 2 and CPU 3) can be balanced against each other.
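The three levels described above can be sketched as a small Python model. This is only an illustration of the example machine, not kernel code: the names `LogicalCPU`, `domain_hierarchy` and `lowest_common_level` are hypothetical, and the sketch assumes that physical CPUs 0 and 1 sit on one NUMA node and CPUs 2 and 3 on the other.

```python
from typing import NamedTuple


class LogicalCPU(NamedTuple):
    """One logical CPU, written 'package:thread' as in the text (hypothetical type)."""
    package: int  # physical HT-enabled processor (CPU 0..3)
    thread: int   # logical CPU within that processor (0 or 1)

    def __str__(self) -> str:
        return f"{self.package}:{self.thread}"


# Assumed layout of the example machine: 4 HT-enabled CPUs, 2 per NUMA node.
PACKAGES = 4
THREADS_PER_PACKAGE = 2
PACKAGES_PER_NODE = 2

ALL_CPUS = [LogicalCPU(p, t)
            for p in range(PACKAGES)
            for t in range(THREADS_PER_PACKAGE)]


def node_of(cpu: LogicalCPU) -> int:
    """NUMA node that holds the given logical CPU."""
    return cpu.package // PACKAGES_PER_NODE


def domain_hierarchy(cpu: LogicalCPU):
    """Return (level, span) pairs for one CPU, from the base domain upward."""
    ht = frozenset(c for c in ALL_CPUS if c.package == cpu.package)
    physical = frozenset(c for c in ALL_CPUS if node_of(c) == node_of(cpu))
    numa = frozenset(ALL_CPUS)
    return [("HT", ht), ("Physical", physical), ("NUMA", numa)]


def lowest_common_level(a: LogicalCPU, b: LogicalCPU) -> str:
    """Lowest domain level at which CPUs a and b can be balanced against each other."""
    for level, span in domain_hierarchy(a):
        if b in span:
            return level
    raise ValueError("CPUs do not share any domain")


if __name__ == "__main__":
    for level, span in domain_hierarchy(LogicalCPU(0, 0)):
        members = " ".join(str(c) for c in sorted(span))
        print(f"{level:<8} {members}")
```

For CPU 0:0 the script lists 0:0 and 0:1 at the HT level, the four logical CPUs of CPU 0 and CPU 1 at the Physical level, and all eight logical CPUs at the NUMA level. `lowest_common_level` makes the "closely related" idea explicit: the lower the level two CPUs share, the cheaper it is to move a task between them.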

Since balancing at the lowest level is the least costly (the CPUs can share cache), it is performed most frequently and even for small imbalances. At the next higher level, balancing is somewhat more costly (the CPUs can share memory but not cache), so it is performed at larger intervals and only for larger imbalances. At the highest level, balancing is very costly (the CPUs cannot access memory at the same speed, and accessing another node's memory is slow), so it is performed at long intervals and only for large imbalances. It's time now to look at the Linux kernel source to see how the kernel actually performs scheduling.
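Before turning to the kernel source, the cost trade-off described above can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the `LevelPolicy` type, the function names, and in particular the interval and imbalance numbers are invented and are not the kernel's actual values or data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelPolicy:
    """Balancing policy for one domain level (hypothetical, illustrative values)."""
    name: str
    interval_ticks: int  # balance this level once every N ticks
    min_imbalance: int   # load difference required before tasks are moved


# Lower levels: short interval, small threshold. Higher levels: the opposite.
POLICIES = [
    LevelPolicy("HT", interval_ticks=1, min_imbalance=2),
    LevelPolicy("Physical", interval_ticks=8, min_imbalance=4),
    LevelPolicy("NUMA", interval_ticks=64, min_imbalance=8),
]


def levels_due(tick: int) -> list:
    """Names of the levels whose balancing interval has elapsed at this tick."""
    return [p.name for p in POLICIES if tick % p.interval_ticks == 0]


def tasks_to_move(policy: LevelPolicy, busiest_load: int, local_load: int) -> int:
    """Number of tasks to pull from the busiest CPU set, or 0 if below the threshold."""
    imbalance = busiest_load - local_load
    if imbalance < policy.min_imbalance:
        return 0
    # Moving half of the difference evens out the two loads.
    return imbalance // 2


if __name__ == "__main__":
    for tick in (3, 8, 64):
        print(tick, levels_due(tick))
    for policy in POLICIES:
        print(policy.name, tasks_to_move(policy, busiest_load=6, local_load=2))
```

With these made-up numbers, the HT level is examined on every tick, the Physical level on every eighth tick, and the NUMA level on every sixty-fourth. A load difference of 4 triggers a migration at the HT and Physical levels but is ignored at the NUMA level, which mirrors the "larger interval, higher imbalance" rule in the text.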