From boris.ostrovsky at amd.com Tue Jun 3 08:15:55 2008
From: boris.ostrovsky at amd.com (Boris Ostrovsky)
Date: Tue, 3 Jun 2008 11:15:55 -0400
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <20080529175145.GA8817@escobedo.amd.com>
References: <20080529175145.GA8817@escobedo.amd.com>
Message-ID: <4845602B.7090403@amd.com>

(Question for FMA folks below)

Hans Rosenfeld wrote:
> Hi,
>
> while trying to understand the IBS provider in detail, I stumbled upon
> this code in the attach routine:
>
>     uint64_t val;
>     val = rdmsr(0xc001001f);
>     val = val | 0x400000000000;
>     wrmsr(0xc001001f, val);
>     drv_usecwait(10);
>     pci_putl(0, 0x18, 3, 0x1cc, 0x100);
>     drv_usecwait(10);
>
> This code enables PCI extended config space accesses through I/O
> registers 0xcf8/0xcfc and then sets up the IBS control register on the
> northbridge of node 0 to use LVT offset 0. The pci_putl() function is
> private to the IBS module.
>
> According to the BKDG, this setup of the IBS control register should
> already have been done by the BIOS, but if the driver assumes that it
> probably is not, it should do it on _all_ nodes, not just node 0. To fix
> this, the code would need to iterate over all nodes. Is there some public
> interface to get the node count, or would I have to read it either from
> the Node ID register, as in i86pc/os/mp_startup.c, or from
> lgrp_lpat_node_cnt from sys/memnode.h?

Yes, I also believe this should be done on all nodes. Have you tried
collecting data on non-zero nodes? I think it wouldn't work.

I don't know how you can get node count, but you can loop over all
CPUs/cores and read MSRC001_103A (which is a read-only copy of IBS control
register) and skip if IBS is enabled (i.e. LVTOffsetVal is 1).

>
> Another problem I see here is the use of LVT offset 0, which is usually
> used by the machine check architecture. I couldn't find an interface to
> coordinate usage of the 4 possible LVT offsets.
> Maybe a nexus device claiming function 3 of the northbridges, as
> suggested by a comment in i86pc/io/mc/mcamd_drv.c, could be used for
> this?

As far as I know, Linux hardcodes offset 0 for MCA and 1 for IBS. There is
no manager for LVT slots. You could presumably scan the 4 entries and
search for an unused one. It's kind of dangerous though, since this
assumes that (1) the scan is atomic (i.e. no one is doing this in parallel)
and (2) the vector field of every slot in use is non-zero. Perhaps FMA
folks can answer this --- does Solaris use offset 0 for MCA?

>
> Also I don't think it is good if every other driver has his own functions
> to access the extended config space. The existing functions behind
> pci_putl_func etc. could probably be easily extended to allow ECS
> accesses. If I would make a patch to extend them, where should I send it
> for review and inclusion into OpenSolaris?

The idea is that people should not use IO space accesses to extended
config space and should use MMIO instead. I don't know whether Solaris maps
ECS though (sometimes BIOSes do that, but I don't think one can rely on
this). I don't believe this was working when Gaurav wrote this code.

You can try using pci_config_setup/pci_config_get/pci_config_teardown and
see if this works.
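
To make the I/O-space variant discussed above concrete, here is a minimal userland sketch of how the 0xcf8 address for an extended config space access would be assembled once bit 46 of MSR 0xc001001f (the 0x400000000000 in the quoted attach code) is set. It assumes the Family 10h layout in which bits 27:24 of the 0xcf8 value carry register bits 11:8; that layout should be checked against the BKDG. cf8_ext_addr() is a hypothetical helper, not a Solaris interface, and the code only computes the value without touching any hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 46 of MSR 0xc001001f, i.e. the 0x400000000000 from the quoted code. */
#define NB_CFG_ENABLE_CF8_EXT_CFG (1ULL << 46)

/*
 * Hypothetical helper: build the value written to I/O port 0xcf8.
 * Assumed layout: bit 31 enable, bits 27:24 register bits 11:8,
 * bits 23:16 bus, bits 15:11 device, bits 10:8 function,
 * bits 7:2 register bits 7:2.
 */
static uint32_t
cf8_ext_addr(uint32_t bus, uint32_t dev, uint32_t func, uint32_t reg)
{
    return (0x80000000u |
        ((reg & 0xf00u) << 16) |   /* extended register bits 11:8 */
        ((bus & 0xffu) << 16) |
        ((dev & 0x1fu) << 11) |
        ((func & 0x7u) << 8) |
        (reg & 0xfcu));            /* dword-aligned low register bits */
}
```

Under these assumptions, the pci_putl(0, 0x18, 3, 0x1cc, 0x100) call in the quoted code would correspond to writing cf8_ext_addr(0, 0x18, 3, 0x1cc) to port 0xcf8 and then 0x100 to port 0xcfc.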
-boris

From boris.ostrovsky at amd.com Tue Jun 3 08:26:35 2008
From: boris.ostrovsky at amd.com (Boris Ostrovsky)
Date: Tue, 3 Jun 2008 11:26:35 -0400
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <4845602B.7090403@amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com>
Message-ID: <484562AB.50003@amd.com>

Boris Ostrovsky wrote:
> (Question for FMA folks below)
>
> Hans Rosenfeld wrote:
>> Hi,
>>
>> while trying to understand the IBS provider in detail, I stumbled upon
>> this code in the attach routine:
>>
>>     uint64_t val;
>>     val = rdmsr(0xc001001f);
>>     val = val | 0x400000000000;
>>     wrmsr(0xc001001f, val);
>>     drv_usecwait(10);
>>     pci_putl(0, 0x18, 3, 0x1cc, 0x100);
>>     drv_usecwait(10);
>>
>> This code enables PCI extended config space accesses through I/O
>> registers 0xcf8/0xcfc and then sets up the IBS control register on the
>> northbridge of node 0 to use LVT offset 0. The pci_putl() function is
>> private to the IBS module.
>>
>> According to the BKDG, this setup of the IBS control register should
>> already have been done by the BIOS, but if the driver assumes that it
>> probably is not, it should do it on _all_ nodes, not just node 0. To fix
>> this, the code would need to iterate over all nodes. Is there some public
>> interface to get the node count, or would I have to read it either from
>> the Node ID register, as in i86pc/os/mp_startup.c, or from
>> lgrp_lpat_node_cnt from sys/memnode.h?
>
> Yes, I also believe this should be done on all nodes. Have you tried collecting
> data on non-zero nodes? I think it wouldn't work.
>
> I don't know how you can get node count, but you can loop over all CPUs/cores
> and read MSRC001_103A (which is a read-only copy of IBS control register) and
> skip if IBS is enabled (i.e. LVTOffsetVal is 1).
>

Actually, you would be better off going over PCI space (F3x1CC) since you
won't need to switch CPUs, just deviceIDs.
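
A rough sketch of that per-node walk, written as userland C against a fake config space so it can be compiled and tried without hardware. Several things here are assumptions rather than anything confirmed in the thread: the node count is simply passed in (how to obtain it is the open question), the northbridge of node n is taken to be bus 0, device 0x18 + n, and LvtOffsetVal is taken to be bit 8 of F3x1CC with the offset in bits 3:0, which is consistent with the 0x100 written by the quoted code. cfg_getl(), cfg_putl() and ibs_setup_all_nodes() are hypothetical names standing in for whatever config space accessors the driver ends up using.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES        8       /* assumption: upper bound on nodes */
#define NB_DEV_BASE      0x18    /* assumption: node n is device 0x18 + n */
#define IBS_CTL_REG      0x1cc   /* F3x1CC, IBS control register */
#define IBS_LVT_OFF_VAL  0x100u  /* assumption: LvtOffsetVal is bit 8 */
#define IBS_LVT_OFF_MASK 0xfu    /* assumption: LvtOffset is bits 3:0 */

/* Fake config space: one F3x1CC value per node, for illustration only. */
static uint32_t fake_ibs_ctl[MAX_NODES];

/* Hypothetical accessors; a driver would use real config space access. */
static uint32_t
cfg_getl(int dev, int func, int reg)
{
    (void)func;
    (void)reg;
    return (fake_ibs_ctl[dev - NB_DEV_BASE]);
}

static void
cfg_putl(int dev, int func, int reg, uint32_t val)
{
    (void)func;
    (void)reg;
    fake_ibs_ctl[dev - NB_DEV_BASE] = val;
}

/*
 * Visit function 3 of every node's northbridge and program the IBS
 * control register only where LvtOffsetVal is not already set.
 * Returns the number of nodes that had to be written.
 */
static int
ibs_setup_all_nodes(int node_cnt, uint32_t lvt_offset)
{
    int node, written = 0;

    for (node = 0; node < node_cnt && node < MAX_NODES; node++) {
        int dev = NB_DEV_BASE + node;
        uint32_t val = cfg_getl(dev, 3, IBS_CTL_REG);

        if (val & IBS_LVT_OFF_VAL)
            continue;   /* already set up, e.g. by the BIOS */

        cfg_putl(dev, 3, IBS_CTL_REG,
            IBS_LVT_OFF_VAL | (lvt_offset & IBS_LVT_OFF_MASK));
        written++;
    }
    return (written);
}
```

Skipping nodes whose LvtOffsetVal is already set keeps the driver from overriding a BIOS choice; it does not solve the problem of learning which offset the BIOS picked.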
-boris

From hans.rosenfeld at amd.com Tue Jun 3 08:44:37 2008
From: hans.rosenfeld at amd.com (Hans Rosenfeld)
Date: Tue, 3 Jun 2008 17:44:37 +0200
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <4845602B.7090403@amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com>
Message-ID: <20080603154437.GA24708@escobedo.amd.com>

On Tue, Jun 03, 2008 at 11:15:55AM -0400, Boris Ostrovsky wrote:
> Yes, I also believe this should be done on all nodes. Have you tried
> collecting data on non-zero nodes? I think it wouldn't work.

The BKDG says that the BIOS is supposed to set up the control register,
so it should still work as long as the BIOS uses LVT offset 0.

> I don't know how you can get node count, but you can loop over all
> CPUs/cores and read MSRC001_103A (which is a read-only copy of IBS control
> register) and skip if IBS is enabled (i.e. LVTOffsetVal is 1).

I would need to get the node number of the CPU the code is running on,
which seems to be even harder to get at than just the node count.

> The idea is that people should not use IO space accesses to extended config
> space and use MMIO instead. I don't know whether Solaris maps ECS though
> (sometimes BIOSes do that, but I don't think one can rely on this). I don't
> believe this was working when Gaurav wrote this code.

I tried pci_putl_func, which uses either IO space or MMIO to get at the
registers. Extending this interface to support ECS should not be too
hard, but I would have to know where to send the patches :)

> You can try using pci_config_setup/pci_config_get/pci_config_teardown and
> see if this works.

This is what I did first, but I had no success. How does pci_config_setup
find out which function of which device I want to use?
Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown

From boris.ostrovsky at amd.com Tue Jun 3 10:56:57 2008
From: boris.ostrovsky at amd.com (Boris Ostrovsky)
Date: Tue, 3 Jun 2008 13:56:57 -0400
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <20080603154437.GA24708@escobedo.amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com> <20080603154437.GA24708@escobedo.amd.com>
Message-ID: <484585E9.6050102@amd.com>

Hans Rosenfeld wrote:
> On Tue, Jun 03, 2008 at 11:15:55AM -0400, Boris Ostrovsky wrote:
>> Yes, I also believe this should be done on all nodes. Have you tried
>> collecting data on non-zero nodes? I think it wouldn't work.
>
> The BKDG says that the BIOS is supposed to set up the control register,
> so it should still work as long as the BIOS uses LVT offset 0.

That's if BIOS does what it's supposed to. Different IBVs may decide not
to do this.

But in any case, I think we should either do this on all nodes or not do
this at all. I doubt that a particular BIOS would initialize one node and
not the other.

>
>> I don't know how you can get node count, but you can loop over all
>> CPUs/cores and read MSRC001_103A (which is a read-only copy of IBS control
>> register) and skip if IBS is enabled (i.e. LVTOffsetVal is 1).
>
> I would need to get the node number of the CPU the code is running on,
> which seems to be even harder to get at than just the node count.

You can get this from F0x60, but there may be a better way. Perhaps
someone else can comment on this.

>
>> The idea is that people should not use IO space accesses to extended config
>> space and use MMIO instead. I don't know whether Solaris maps ECS though
>> (sometimes BIOSes do that, but I don't think one can rely on this). I don't
>> believe this was working when Gaurav wrote this code.
>
> I tried pci_putl_func, which uses either IO space or MMIO to get at the
> registers.
> Extending this interface to support ECS should not be too
> hard, but I would have to know where to send the patches :)

I suspect it's more than just patches. This would add new Solaris
interfaces, which may require going through a whole bunch of Solaris
processes (ARC and such).

>
>> You can try using pci_config_setup/pci_config_get/pci_config_teardown and
>> see if this works.
>
> This is what I did first, but I had no success. How does pci_config_setup
> find out which function of which device I want to use?

I believe dip is per function.

-boris

From hans.rosenfeld at amd.com Wed Jun 4 01:43:41 2008
From: hans.rosenfeld at amd.com (Hans Rosenfeld)
Date: Wed, 4 Jun 2008 10:43:41 +0200
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <484585E9.6050102@amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com> <20080603154437.GA24708@escobedo.amd.com> <484585E9.6050102@amd.com>
Message-ID: <20080604084341.GA28969@escobedo.amd.com>

On Tue, Jun 03, 2008 at 01:56:57PM -0400, Boris Ostrovsky wrote:
> But in any case, I think we should either do this on all nodes or not do
> this at all. I doubt that a particular BIOS would initialize one node and
> not the other.

We have to do this on all nodes, even if the BIOS sets it up correctly.
In that case we would have to get at the LVT offset we're supposed to
use.

> >This is what I did first, but I had no success. How does pci_config_setup
> >find out which function of which device I want to use?
>
> I believe dip is per function.

I think I have to associate the dip with the pci device somehow. I tried
putting it in /etc/driver_aliases, but it didn't work.
Hans

--
%SYSTEM-F-ANARCHISM, The operating system has been overthrown

From boris.ostrovsky at amd.com Wed Jun 4 07:54:21 2008
From: boris.ostrovsky at amd.com (Boris Ostrovsky)
Date: Wed, 4 Jun 2008 10:54:21 -0400
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <20080604084341.GA28969@escobedo.amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com> <20080603154437.GA24708@escobedo.amd.com> <484585E9.6050102@amd.com> <20080604084341.GA28969@escobedo.amd.com>
Message-ID: <4846AC9D.5080706@amd.com>

Hans Rosenfeld wrote:
>
>>> This is what I did first, but I had no success. How does pci_config_setup
>>> find out which function of which device I want to use?
>> I believe dip is per function.
>
> I think I have to associate the dip with the pci device somehow. I tried
> putting it in /etc/driver_aliases, but it didn't work.

You mean how to get bus/device/function from dip?

This should work (no error checking):

    ddi_acc_handle_t acc_handle;
    ddi_acc_hdl_t *hp;
    pci_acc_cfblk_t *cfp;
    int devid;

    pci_config_setup(dip, &acc_handle);
    hp = impl_acc_hdl_get(acc_handle);
    cfp = (pci_acc_cfblk_t *)&hp->ah_bus_private;
    devid = (cfp->c_busnum << 8) | (cfp->c_devnum << 3) | cfp->c_funcnum;

-boris

From Eric.Saxe at Sun.COM Wed Jun 4 16:47:26 2008
From: Eric.Saxe at Sun.COM (Eric Saxe)
Date: Wed, 04 Jun 2008 16:47:26 -0700
Subject: [amd-platform-discuss] Questions regarding IBS attach function
In-Reply-To: <4846AC9D.5080706@amd.com>
References: <20080529175145.GA8817@escobedo.amd.com> <4845602B.7090403@amd.com> <20080603154437.GA24708@escobedo.amd.com> <484585E9.6050102@amd.com> <20080604084341.GA28969@escobedo.amd.com> <4846AC9D.5080706@amd.com>
Message-ID: <4847298E.4090701@sun.com>

Hi Boris, Hans,

Sorry, I've been distracted at an off site the last few days...although I
might be a bit of a fish out of water with respect to some of your
questions...
If you still run into issues, you might consider copying Ed Pilatowicz,
or fma-discuss....

Thanks,
-Eric

Boris Ostrovsky wrote:
> Hans Rosenfeld wrote:
>
>>>> This is what I did first, but I had no success. How does pci_config_setup
>>>> find out which function of which device I want to use?
>>>>
>>> I believe dip is per function.
>>>
>> I think I have to associate the dip with the pci device somehow. I tried
>> putting it in /etc/driver_aliases, but it didn't work.
>>
>
>
> You mean how to get bus/device/function from dip?
>
> This should work (no error checking):
>
>     ddi_acc_handle_t acc_handle;
>     ddi_acc_hdl_t *hp;
>     pci_acc_cfblk_t *cfp;
>     int devid;
>
>     pci_config_setup(dip, &acc_handle);
>     hp = impl_acc_hdl_get(acc_handle);
>     cfp = (pci_acc_cfblk_t *)&hp->ah_bus_private;
>     devid = (cfp->c_busnum << 8) | (cfp->c_devnum << 3) | cfp->c_funcnum;
>
> -boris
>
> _______________________________________________
> amd-platform-discuss mailing list
> amd-platform-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/amd-platform-discuss
>
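
The devid packing in the quoted snippet (bus in bits 15:8, device in bits 7:3, function in bits 2:0) can be illustrated in plain userland C. This is only a sketch of the arithmetic; bdf_pack() and bdf_unpack() are hypothetical helper names and not part of the DDI, and the field widths are the usual PCI ones (8-bit bus, 5-bit device, 3-bit function).

```c
#include <assert.h>

/* Hypothetical helper: pack bus/device/function as in the quoted devid. */
static int
bdf_pack(int bus, int dev, int func)
{
    return (((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (func & 0x7));
}

/* Hypothetical helper: split a packed devid back into its fields. */
static void
bdf_unpack(int devid, int *bus, int *dev, int *func)
{
    *bus = (devid >> 8) & 0xff;
    *dev = (devid >> 3) & 0x1f;
    *func = devid & 0x7;
}
```

With this packing, function 3 of device 0x18 on bus 0 (the node 0 northbridge function addressed by the attach code at the start of the thread) comes out as 0xc3, so a driver could compare the devid it derives from its dip against that value to check which function it was attached to.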