[driver-discuss] Am I understanding this correctly? -- potential e1000g bug
Kerry Shu
Kerry.Shu at Sun.COM
Thu Sep 17 09:45:04 PDT 2009
Jason King wrote:
> On Thu, Sep 17, 2009 at 10:16 AM, Garrett D'Amore <gdamore at sun.com> wrote:
>> Look closely at the stack. You'll notice that a PIL9 interrupt
>> *interrupted* e1000g while it was servicing an interrupt. I don't think
>> e1000g is at fault here. Something else is doing it.
>
> This is probably my lack of knowledge about how Solaris handles
> interrupts, but after a little digging:
>
>> 0xffffff0007c49c60::findstack -v
> stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
> ffffff0007c49bb0 rm_isr+0xaa()
> ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
> ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
> ffffff0007c4f450 switch_sp_and_call+0x13()
> ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
> ffffff0007c4f4b0 _interrupt+0xba()
>
> I'm assuming this portion of the stack dump is what you're talking
> about... looking at the function signature for dispatch_hardint -- the
> new vector is 10, and the old ipl is 6.
>
>> ::interrupts -d
> IRQ  Vect  IPL  Bus  Trg  Type   CPU  Share  APIC/INT#  Driver Name(s)
> 3    0xb1  12   ISA  Edg  Fixed  0    1      0x0/0x3    asy#1
> 4    0xb0  12   ISA  Edg  Fixed  0    1      0x0/0x4    asy#0
> 6    0x41  5    ISA  Edg  Fixed  0    1      0x0/0x6    fdc#0
> 7    0x42  5    ISA  Edg  Fixed  1    1      0x0/0x7    ecpp#0
> 9    0x81  9    PCI  Lvl  Fixed  1    1      0x0/0x9    acpi_wrapper_isr
> 15   0x43  5    ISA  Edg  Fixed  0    1      0x0/0xf    ata#1
> 16   0x83  9    PCI  Lvl  Fixed  1    4      0x0/0x10   hci1394#0, uhci#3, uhci#0, nvidia#0
> 17   0x87  8    PCI  Lvl  Fixed  0    1      0x0/0x11   audio810#0
> 18   0x86  9    PCI  Lvl  Fixed  1    1      0x0/0x12   pci-ide#1
> 19   0x85  9    PCI  Lvl  Fixed  0    1      0x0/0x13   uhci#1
> 23   0x84  9    PCI  Lvl  Fixed  1    1      0x0/0x17   ehci#0
> 26   0x40  5    PCI  Lvl  Fixed  1    1      0x1/0x2    aac#0
> 48   0x60  6    PCI  Lvl  Fixed  1    1      0x2/0x0    e1000g#0
> 72   0x82  7    PCI  Edg  MSI    0    1      -          pcie_pci#0
> 73   0x30  4    PCI  Edg  MSI    0    1      -          pcie_pci#2
> 74   0x44  5    PCI  Edg  MSI    0    1      -          adpu320#0
> 160  0xa0  0         Edg  IPI    all  0      -          poke_cpu
> 192  0xc0  13        Edg  IPI    all  1      -          xc_serv
> 208  0xd0  14        Edg  IPI    all  1      -          kcpc_hw_overflow_intr
> 209  0xd1  14        Edg  IPI    all  1      -          cbe_fire
> 210  0xd3  14        Edg  IPI    all  1      -          cbe_fire
> 240  0xe0  15        Edg  IPI    all  1      -          xc_serv
> 241  0xe1  15        Edg  IPI    all  1      -          apic_error_intr
>
> That makes sense -- e1000g#0 is IPL 6; however, shouldn't there then be
> an entry somewhere in there with a VECT value of 0x0a and an IPL of 9?
> Or do I still have more learning to do?
>
What you are looking for is 0x10, not 0x0a: mdb prints stack arguments
in hex, so the 10 in dispatch_hardint(10, 6) is decimal 16. It looks to
me like an IRQ# 16 interrupt (from any of hci1394#0, uhci#3, uhci#0, or
nvidia#0) is preempting the e1000g#0 interrupt. I guess this happened
frequently, since you noticed the system freezing. So are you running
something that keeps both e1000g0 and the other 4 driver instances on
IRQ# 16 busy? For example, are you putting heavy load on both network
and graphics?
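
To make the lookup concrete, here is a minimal sketch in Python (not an
mdb feature) of the reasoning above: read the stack argument as hex, find
the matching IRQ row, and compare its IPL with the interrupted IPL. The
dictionary holds just two rows copied from the ::interrupts -d output, and
the helper names (decode_mdb_arg, find_irq, can_preempt) are hypothetical.

```python
# Two rows copied from the ::interrupts -d output quoted above:
# IRQ number -> (IPL, driver instances sharing that IRQ).
INTERRUPTS = {
    16: (9, ["hci1394#0", "uhci#3", "uhci#0", "nvidia#0"]),
    48: (6, ["e1000g#0"]),
}


def decode_mdb_arg(text):
    """Interpret a stack argument as hex (mdb prints them without '0x')."""
    return int(text, 16)


def find_irq(arg_text):
    """Return (irq, ipl, drivers) for a hex argument, or None if not listed."""
    irq = decode_mdb_arg(arg_text)
    entry = INTERRUPTS.get(irq)
    if entry is None:
        return None
    ipl, drivers = entry
    return irq, ipl, drivers


def can_preempt(new_ipl, old_ipl):
    """A higher-priority interrupt can interrupt a lower-priority handler."""
    return new_ipl > old_ipl


if __name__ == "__main__":
    # dispatch_hardint(10, 6): first argument 10 (hex), old IPL 6.
    irq, ipl, drivers = find_irq("10")
    print("IRQ", irq, "IPL", ipl, "drivers:", ", ".join(drivers))
    print("can preempt IPL 6:", can_preempt(ipl, 6))
```

Reading the argument as decimal 10 instead finds no row at all, which is
why searching the table for 0x0a turns up nothing.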
Regards,
Kerry