25 Dec 2016

PCI interrupt routing III - IOAPIC

文章欢迎转载,但转载时请保留本段文字,并置于文章的顶部
Author: derek0883@gmail.com
本文原文地址:http://www.gorecursion.com/virtualization/2016/12/25/pci-int3.html

In my previous post PCI interrupt routing II I studied how PCI interrupt route works. But the story is not end yet. as the connection is between PCI device and PCI bridge. It is only half way to CPU.

APIC

APIC stands for Advanced Programmable Interrupt Controller. It is a interrupt controller.

If you start QEMU with following command.

qemu-2.8.0-rc0 $ ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm \
        -kernel images/bzImage -hda images/xenial-server-cloudimg-amd64-disk1.img -nographic \
        -append "console=ttyS0 root=/dev/sda1 cloud-init=disabled"

You will see QEMU use APIC for interrupt conrtol.

$ cat /proc/interrupts 
           CPU0       
  0:  48      1632   IO-APIC   2-edge      timer  handle_edge_irq
  1:  49         9   IO-APIC   1-edge      i8042  handle_edge_irq
  4:  52       430   IO-APIC   4-edge      serial  handle_edge_irq
  8:  56         1   IO-APIC   8-edge      rtc0  handle_edge_irq
  9:  57         0   IO-APIC   9-fasteoi   acpi  handle_fasteoi_irq
 11:  59        47   IO-APIC  11-fasteoi   eth0  handle_fasteoi_irq
 12:  60       125   IO-APIC  12-edge      i8042  handle_edge_irq
 14:  62         0   IO-APIC  14-edge      ata_piix  handle_edge_irq
 15:  63       126   IO-APIC  15-edge      ata_piix  handle_edge_irq

At the system level, APIC consists of two parts (Following Figure)—one residing in the I/O subsystem (called the IOAPIC) and the other in the CPU (called the Local APIC). The local APIC and the IOAPIC communicate over a dedicated APIC bus. The IOAPIC bus interface consists of two bi-directional data signals (APICD[1:0]) and a clock input (APICCLK). Check IOAPIC datasheet here.

APIC

Each IOAPIC has 24 input interrupt pin, PIIX3 connect to IOAPIC, IRQ[1,2:7,8#,9:12,14,15] connect to IOAPIC interrupt input INTIN[1,2:7,8#,9:12,14,15]. There are 24 I/O Redirection Table entry registers. Each register is a dedicated entry for each interrupt input signal. Each entry is 64bit, the definition of each entry is:

struct IO_APIC_route_entry {
    __u32   vector      :  8,
        delivery_mode   :  3,   /* 000: FIXED
                     * 001: lowest prio
                     * 111: ExtINT
                     */
        dest_mode   :  1,   /* 0: physical, 1: logical */
        delivery_status :  1,
        polarity    :  1,
        irr     :  1,
        trigger     :  1,   /* 0: edge, 1: level */
        mask        :  1,   /* 0: enabled, 1: disabled */
        __reserved_2    : 15;

    __u32   __reserved_3    : 24,
        dest        :  8;
} __attribute__ ((packed));

8 bit interrupt vector is index of IDT. PIIX3 IRQ11 connect to IOAPIC INTIN11, In PCI interrupt routing I, I already figured out the interrupt vector is 59. So I/O redirection table entry 11, vector field will be set to 59. When I/O APIC send interrupt request to CPU, CPU will read interrupt vector number through APIC bus. For the definition of other field Check IOAPIC datasheet here.

APIC tracing

Now boot kernel with APIC debug enable.

./x86_64-softmmu/qemu-system-x86_64 -kernel images/bzImage \
    -initrd images/cirros-x86_64-initramfs \
    -append "console=ttyS0 apic=debug loglevel=8" -monitor stdio

From demsg command, you will find the APIC redirection table:

IO APIC #0......
.... IRQ redirection table:
IOAPIC 0:
 pin00, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin01, enabled , edge , high, V(31), IRR(0), S(0), logical , D(01), M(1)
 pin02, enabled , edge , high, V(30), IRR(0), S(0), logical , D(01), M(1)
 pin03, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin04, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin05, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin06, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin08, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin09, enabled , level, high, V(39), IRR(0), S(0), logical , D(01), M(1)
 pin0a, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
 pin0b, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
...
IRQ to pin mappings:
...
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15

Pin0b is disabled, as during boot up, eth0 is down, it will be enabled after interface was bring up. You can verify this in QEMU console.

(qemu) info ioapic
ioapic id=0x00 sel=0x26 (redir[11])
pin 11 0x010000000000893b dest=1 vec=59  active-hi level        lowest logical

Now turn down the interface. you will find pin11 change to masked(meaning disabled).

$ sudo ifconfig eth0 down
(qemu) info ioapic
ioapic id=0x00 sel=0x26 (redir[11])
pin 11 0x0000000000010000 dest=0 vec=0   active-hi edge  masked fixed  physical