
xv6 boot loader: Reading sectors off disk using CHS


I've been trying to wrap my head around the C part of the xv6 boot loader (the question is below the code)

void
bootmain(void)
{
  struct elfhdr *elf;
  struct proghdr *ph, *eph;
  void (*entry)(void);
  uchar* pa;

  elf = (struct elfhdr*)0x10000;  // scratch space

  // Read 1st page off disk
  readseg((uchar*)elf, 4096, 0);

  // Is this an ELF executable?
  if(elf->magic != ELF_MAGIC)
    return;  // let bootasm.S handle error

  // Load each program segment (ignores ph flags).
  ph = (struct proghdr*)((uchar*)elf + elf->phoff);
  eph = ph + elf->phnum;
  for(; ph < eph; ph++){
    pa = (uchar*)ph->paddr;
    readseg(pa, ph->filesz, ph->off);
    if(ph->memsz > ph->filesz)
      stosb(pa + ph->filesz, 0, ph->memsz - ph->filesz);
  }

  // Call the entry point from the ELF header.
  // Does not return!
  entry = (void(*)(void))(elf->entry);
  entry();
}

void
waitdisk(void)
{
  // Wait for disk ready.
  while((inb(0x1F7) & 0xC0) != 0x40)
    ;
}

// Read a single sector at offset into dst.
void
readsect(void *dst, uint offset)
{
  // Issue command.
  waitdisk();
  outb(0x1F2, 1);   // count = 1
  outb(0x1F3, offset);
  outb(0x1F4, offset >> 8);
  outb(0x1F5, offset >> 16);
  outb(0x1F6, (offset >> 24) | 0xE0);
  outb(0x1F7, 0x20);  // cmd 0x20 - read sectors

  // Read data.
  waitdisk();
  insl(0x1F0, dst, SECTSIZE/4);
}

// Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
// Might copy more than asked.
void
readseg(uchar* pa, uint count, uint offset)
{
  uchar* epa;

  epa = pa + count;

  // Round down to sector boundary.
  pa -= offset % SECTSIZE;

  // Translate from bytes to sectors; kernel starts at sector 1.
  offset = (offset / SECTSIZE) + 1;

  // If this is too slow, we could read lots of sectors at a time.
  // We'd write more to memory than asked, but it doesn't matter --
  // we load in increasing order.
  for(; pa < epa; pa += SECTSIZE, offset++)
    readsect(pa, offset);
}

So it's using the CHS addressing scheme, that is, it outputs the sector index, cylinder number and head number onto the disk ports and issues a read command (see readsect(dst, offset)).

The offset parameter taken by that function is supposed to contain the offset from the start of the disk in sectors. E.g. if you pass 0x01000203 as offset (16777731 in decimal) it will split it into 0x03 as the sector index, 0x0002 as the cylinder number, and 0x01 as the head number. The problem is that the sector index can't go from 0x00 to 0xFF, it can only go from 0x01 to 0x3F (1 to 63 in decimal), so this addressing scheme is not contiguous. For example, an offset 0x100002EE would be invalid as there's no sector index 0xEE.

I don't really understand how the kernel is still successfully loaded into memory. In the readseg() function it's clear that you can pass any memory offset to it, it will convert it to sector offset and pass to readsect(), potentially passing an invalid sector index. It would be fine if the kernel size never exceeded 63 * 512 = 32256 bytes, thus never reaching invalid sector indices, but it is actually around 170k.

What is going on?


Solution

  • What is going on?

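    The disk is not being addressed in CHS mode; the code is actually using (28-bit) LBA addressing. The giveaway is the 0xE0 that gets ORed into the value written to port 0x1F6: bit 6 of that register selects LBA mode. In that mode, the IDE/ATA controller's registers that would've been for "sector, cylinder low, cylinder high, head" in CHS mode become "LBA bits 0 to 7, LBA bits 8 to 15, LBA bits 16 to 23, LBA bits 24 to 27" instead, so every value from 0x00 to 0xFF is valid in the lowest byte and the sector numbers are contiguous.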
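
To make the register mapping concrete, here is a minimal sketch of the split that readsect() performs with its outb() calls. The names lba28_regs, lba28_split and lba28_join are made up for illustration and are not part of xv6; the code only computes the byte values and does no port I/O.

```c
#include <assert.h>
#include <stdint.h>

/* Byte values destined for I/O ports 0x1F3..0x1F6 for one LBA28 read.
   Struct and field names are hypothetical, chosen for this example. */
struct lba28_regs {
  uint8_t lba_low;   /* port 0x1F3: LBA bits 0 to 7   */
  uint8_t lba_mid;   /* port 0x1F4: LBA bits 8 to 15  */
  uint8_t lba_high;  /* port 0x1F5: LBA bits 16 to 23 */
  uint8_t device;    /* port 0x1F6: 0xE0 | LBA bits 24 to 27 */
};

/* Split a sector number the same way readsect() does. */
struct lba28_regs
lba28_split(uint32_t lba)
{
  struct lba28_regs r;

  assert(lba < (1u << 28));  /* only 28 address bits are available */
  r.lba_low  = (uint8_t)lba;
  r.lba_mid  = (uint8_t)(lba >> 8);
  r.lba_high = (uint8_t)(lba >> 16);
  r.device   = (uint8_t)((lba >> 24) | 0xE0);  /* bit 6 set: LBA mode */
  return r;
}

/* Inverse of lba28_split(): reassemble the sector number from the bytes. */
uint32_t
lba28_join(struct lba28_regs r)
{
  return (uint32_t)r.lba_low
       | ((uint32_t)r.lba_mid << 8)
       | ((uint32_t)r.lba_high << 16)
       | ((uint32_t)(r.device & 0x0F) << 24);
}
```

With this split, sector 0xEE from the question is perfectly valid (low byte 0xEE, device byte 0xE0), and a kernel of roughly 170k, which spans about 340 sectors, simply carries into the next register: sector 340 is 0x154, so the low byte is 0x54 and the middle byte is 0x01.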

    Note that this code is still extremely bad (e.g. it doesn't check whether the read command returned an error, and it probably makes insane assumptions about the existence and configuration of multiple pieces of hardware). A real boot loader cannot make these assumptions and mostly needs to use the firmware to load data from disk, because it's impractical for a boot loader to support all RAID controllers, AHCI/SATA, SCSI controllers, USB controllers and devices, and so on. However, xv6 is a "teaching OS" (which mostly means that it teaches you a mixture of bad ideas and nonsense for the sake of cramming the illusion of knowledge into a ~3 month time span).
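
As an illustration of the missing error check, here is a rough sketch of how the status byte read from port 0x1F7 could be classified. The bit positions are the standard ATA status bits; the ATA_SR_* names, the ata_state enum and ata_classify() are assumptions made for this example, not xv6 code, and the function takes the status value as an argument rather than doing port I/O.

```c
#include <assert.h>
#include <stdint.h>

/* ATA status register bits (names are illustrative). */
enum {
  ATA_SR_BSY  = 0x80,  /* controller is busy           */
  ATA_SR_DRDY = 0x40,  /* drive is ready               */
  ATA_SR_DF   = 0x20,  /* device fault                 */
  ATA_SR_ERR  = 0x01   /* previous command had an error */
};

/* waitdisk() only looks at these two bits (its mask is 0xC0). */
static_assert((ATA_SR_BSY | ATA_SR_DRDY) == 0xC0, "mask used by waitdisk()");

enum ata_state { ATA_BUSY, ATA_READY, ATA_FAILED };

/* Classify a status byte, treating ERR and DF as failures. */
enum ata_state
ata_classify(uint8_t status)
{
  if (status & ATA_SR_BSY)
    return ATA_BUSY;    /* other bits are not meaningful while BSY is set */
  if (status & (ATA_SR_DF | ATA_SR_ERR))
    return ATA_FAILED;  /* waitdisk() would not notice this case */
  if (status & ATA_SR_DRDY)
    return ATA_READY;
  return ATA_BUSY;      /* neither busy nor ready yet: keep polling */
}
```

For example, a status of 0x41 (DRDY plus ERR) satisfies the (status & 0xC0) == 0x40 test in waitdisk(), so the boot loader would go on to read the data port anyway, whereas ata_classify(0x41) reports ATA_FAILED.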