summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2013-06-21powerpc: Optimize hugepage invalidateAneesh Kumar K.V4-9/+201
Hugepage invalidate involves invalidating multiple hpte entries. Optimize the operation using H_BULK_REMOVE on lpar platforms. On native, reduce the number of tlb flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Enable THP on PPC64Aneesh Kumar K.V2-2/+30
We enable only if the we support 16MB page size. Reviewed-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: split hugepage when using subpage protectionAneesh Kumar K.V1-0/+48
We find all the overlapping vma and mark them such that we don't allocate hugepage in that range. Also we split existing huge page so that the normal page hash can be invalidated and new page faulted in with new protection bits. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: disable assert_pte_locked for collapse_huge_pageAneesh Kumar K.V1-0/+8
With THP we set pmd to none, before we do pte_clear. Hence we can't walk page table to get the pte lock ptr and verify whether it is locked. THP do take pte lock before calling pte_clear. So we don't change the locking rules here. It is that we can't use page table walking to check whether pte locks are held with THP. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Prevent gcc to re-read the pagetablesAneesh Kumar K.V2-5/+5
GCC is very likely to read the pagetables just once and cache them in the local stack or in a register, but it is can also decide to re-read the pagetables. The problem is that the pagetable in those places can change from under gcc. With THP/hugetlbfs the pmd (and pgd for hugetlbfs giga pages) can change under gup_fast. The pages won't be freed untill we finish gup fast because we have irq disabled and we free these pages via rcu callback. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Make linux pagetable walk safe with THP enabledAneesh Kumar K.V4-38/+68
We need to have irqs disabled to handle all the possible parallel update for linux page table without holding locks. Events that we are intersted in while walking page tables are 1) Page fault 2) umap 3) THP split 4) THP collapse A) local_irq_disabled: ------------------------ 1) page fault: A none to valid transition via page fault is not an issue because we would either see a none or valid. If it is none, we would error out the page table walk. We may need to use on stack values when checking for type of page table elements, because if we do if (!is_hugepd()) { if (!pmd_none() { if (pmd_bad() { We could take that bad condition because the pmd got converted to a hugepd after the !is_hugepd check via a hugetlb fault. The right way would be to check for pmd_none higher up or use on stack value. 2) A valid to none conversion via unmap: We can safely walk the upper level table, because we don't remove the the page table entries until rcu grace period. So even if we followed a wrong pointer we still have the pointer valid till the grace period. A PTE pointer returned need to be atomically checked for _PAGE_PRESENT and _PAGE_BUSY. A valid pointer returned could becoming none later. To prevent pte_clear we take _PAGE_BUSY. 3) THP split: A valid transparent hugepage is converted to nomal page. Before we split we do pmd_splitting_flush, which sets the hugepage PTE to _PAGE_SPLITTING So when walking page table we need to check for pmd_trans_splitting and handle that. The pte returned should also need to be checked for _PAGE_SPLITTING before setting _PAGE_BUSY similar to _PAGE_PRESENT. We save the value of PTE on stack and check for the flag in the local pte value. If we don't have the value set we can safely operate on the local pte value and we atomicaly set _PAGE_BUSY. 4) THP collapse: A normal page gets converted to hugepage. In the collapse path, we mark the pmd none early (pmdp_clear_flush). With irq disabled, if we are aleady walking page table we would see the pmd_none and won't continue. If we see a valid PMD, we should still check for _PAGE_PRESENT before setting _PAGE_BUSY, to make sure we didn't collapse the PTE to a Huge PTE. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Add code to handle HPTE faults for hugepagesAneesh Kumar K.V4-4/+203
The deposted PTE page in the second half of the PMD table is used to track the state on hash PTEs. After updating the HPTE, we mark the coresponding slot in the deposted PTE page valid. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Update gup_pmd_range to handle transparent hugepagesAneesh Kumar K.V1-2/+8
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/kvm: Handle transparent hugepage in KVMAneesh Kumar K.V3-34/+44
We can find pte that are splitting while walking page tables. Return None pte in that case. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Replace find_linux_pte with find_linux_pte_or_hugepteAneesh Kumar K.V7-33/+36
Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly document why we don't need to handle transparent hugepages at callsites. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: Update find_linux_pte_or_hugepte to handle transparent hugepagesAneesh Kumar K.V1-6/+26
Reviewed-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc: move find_linux_pte_or_hugepte and gup_hugepte to common codeAneesh Kumar K.V5-138/+138
We will use this in the later patch for handling THP pages Reviewed-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Implement transparent hugepages for ppc64Aneesh Kumar K.V6-2/+625
We now have pmd entries covering 16MB range and the PMD table double its original size. We use the second half of the PMD table to deposit the pgtable (PTE page). The depoisted PTE page is further used to track the HPTE information. The information include [ secondary group | 3 bit hidx | valid ]. We use one byte per each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and with 4K HPTE we need 4096 entries. Both will fit in a 4K PTE page. On hugepage invalidate we need to walk the PTE page and invalidate all valid HPTEs. This patch implements necessary arch specific functions for THP support and also hugepage invalidate logic. These PMD related functions are intentionally kept similar to their PTE counter-part. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/THP: Double the PMD table size for THPAneesh Kumar K.V4-8/+16
THP code does PTE page allocation along with large page request and deposit them for later use. This is to ensure that we won't have any failures when we split hugepages to regular pages. On powerpc we want to use the deposited PTE page for storing hash pte slot and secondary bit information for the HPTEs. We use the second half of the pmd table to save the deposted PTE page. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/mm: handle hugepage size correctly when invalidating hpte entriesAneesh Kumar K.V9-107/+95
If a hash bucket gets full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". This implies that when we are invalidating or updating pte, even if HPTE entry is not valid we should do a tlb invalidate. With hugepages, we need to pass the correct actual page size value for tlb invalidation. This change update the patch 0608d692463598c1d6e826d9dd7283381b4f246c "powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle transparent hugepages correctly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/eeh: Debugfs for error injectionGavin Shan1-0/+31
The patch creates debugfs entries (powerpc/PCIxxxx/err_injct) for injecting EEH errors for testing purpose. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/powernv: Debugfs directory for PHBGavin Shan2-0/+27
The patch creates one debugfs directory ("powerpc/PCIxxxx") for each PHB so that we can hook EEH error injection debugfs entry there in proceeding patch. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powerpc/eeh: Register OPAL notifier for PCI errorGavin Shan1-1/+40
The patch registers OPAL event notifier and process the PCI errors from firmware. If we have pending PCI errors, special EEH event (without binding PE) will be sent to EEH core for processing. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powernv/opal: Disable OPAL notifier upon poweroffGavin Shan1-0/+4
While we're restarting or powering off the system, we needn't the OPAL notifier any more. So just to disable that. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21powernv/opal: Notifier for OPAL eventsGavin Shan2-1/+73
This patch implements a notifier to receive a notification on OPAL event mask changes. The notifier is only called as a result of an OPAL interrupt, which will happen upon reception of FSP messages or PCI errors. Any event mask change detected as a result of opal_poll_events() will not result in a notifier call. [benh: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Allow to check fenced PHB proactivelyGavin Shan1-0/+60
It's meaningless to handle frozen PE if we already had fenced PHB. The patch intends to check the PHB state before checking PE. If the PHB has been put into fenced state, we need take care of that firstly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Enable EEH check for config accessGavin Shan1-1/+39
The patch enables EEH check and let EEH core to process the EEH errors for PowerNV platform while accessing config space. Originally, the implementation already had mechanism to check EEH errors and tried to recover from them. However, we never let EEH core to handle the EEH errors. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Initialization for PowerNVGavin Shan2-5/+17
The patch initializes EEH for PowerNV platform. Because the OPAL APIs requires HUB ID, we need trace that through struct pnv_phb. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: PowerNV EEH backendsGavin Shan2-1/+420
The patch adds EEH backends for PowerNV platform. It's notable that part of those EEH backends call to the I/O chip dependent backends. [Removed pointless change to eeh_pseries.c -- BenH] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip next errorGavin Shan2-2/+333
The patch implements the backend for EEH core to retrieve next EEH error to handle. For the informational errors, we won't bother the EEH core. Otherwise, the EEH should take appropriate actions depending on the return value: 0 - No further errors detected 1 - Frozen PE 2 - Fenced PHB 3 - Dead PHB 4 - Dead IOC Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip PE log and bridge setupGavin Shan1-2/+55
The patch adds backends to retrieve error log and configure p2p bridges for the indicated PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip PE resetGavin Shan1-1/+233
The patch adds the I/O chip backend to do PE reset. For now, we focus on PCI bus dependent PE. If PHB PE has been put into error state, the PHB will take complete reset. Besides, the root bridge will take fundamental or hot reset accordingly if the indicated PE locates at the toppest of PCI hierarchy tree. Otherwise, the upstream p2p bridge will take hot reset. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip EEH state retrievalGavin Shan1-1/+98
The patch adds I/O chip backend to retrieve the state for the indicated PE. While the PE state is temperarily unavailable, the upper layer (powernv platform) should return default delay (1 second). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip EEH enable optionGavin Shan1-1/+64
The patch adds the backend to enable or disable EEH functionality for the specified PE. The backend is also used to enable MMIO or DMA path for the problematic PE. It's notable that all PEs on PowerNV platform support EEH functionality by default, and we disallow to disable EEH for the specific PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: I/O chip post initializationGavin Shan1-1/+20
The post initialization (struct eeh_ops::post_init) is called after the EEH probe is done. On the other hand, the EEH core post initialization is designed to call platform and then I/O chip backend on PowerNV platform. The patch adds the backend for I/O chip to notify the platform that the specific PHB is ready to supply EEH service. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: EEH backend for P7IOCGavin Shan3-0/+69
For EEH on PowerNV platform, the overall architecture is different from that on pSeries platform. In order to support multiple I/O chips in future, we split EEH to 3 layers for PowerNV platform: EEH core, platform layer, I/O layer. It would give EEH implementation on PowerNV platform much more flexibility in future. The patch adds the EEH backend for P7IOC. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Sync OPAL API with firmwareGavin Shan3-22/+119
The patch synchronizes OPAL APIs between kernel and firmware. Also, we starts to replace opal_pci_get_phb_diag_data() with the similar opal_pci_get_phb_diag_data2() and the former OPAL API would return OPAL_UNSUPPORTED from now on. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: EEH core to handle special eventGavin Shan2-18/+112
On PowerNV platform, the EEH event caused by interrupt won't have binding PE. The patch enables EEH core to handle the special event. To avoid the current logic we have, The eeh_handle_event() is renamed to eeh_handle_normal_event(), and the eeh_handle_special_event() is introduced. The function eeh_handle_event() dispatches to above two functions according to the input parameter. Besides, new backend "next_error" added to eeh_ops and it's expected to have following return values: 4 - Dead IOC 3 - Dead PHB 2 - Fenced PHB 1 - Frozen PE 0 - No error found Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Export confirm_error_lockGavin Shan2-6/+15
An EEH event is created and queued to the event queue for each ingress EEH error. When there're mutiple EEH errors, we need serialize the process to keep consistent PE state (flags). The spinlock "confirm_error_lock" was introduced for the purpose. We'll inject EEH event upon error reporting interrupts on PowerNV platform. So we export the spinlock for that to use for consistent PE state. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Allow to purge EEH eventsGavin Shan2-0/+38
On PowerNV platform, we might run into the situation where subsequent events are duplicated events of former one, which is being processed. For the case, we need the function implemented by the patch to purge EEH events accordingly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Trace time on first error for PEGavin Shan3-0/+35
We're not expecting that one specific PE got frozen for over 5 times in last hour. Otherwise, the PE will be removed from the system upon newly coming EEH errors. The patch introduces time stamp to trace the first error on specific PE in last hour and function to update that accordingly. Besides, the time stamp is recovered during PE hotplug path as we did for frozen count. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Single kthread to handle eventsGavin Shan3-47/+55
We possiblly have multiple kthreads running for multiple EEH errors (events) and use one spinlock to make the process of handling those EEH events serialized. That's unnecessary and the patch creates only one kthread, which is started during EEH core initialization time in eeh_init(). A new semaphore introduced to count the number of existing EEH events in the queue and the kthread waiting on the semaphore. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Delay EEH probe during hotplugGavin Shan1-1/+15
While doing EEH recovery, the PCI devices of the problematic PE should be removed and then added to the system again. During the so-called hotplug event, the PCI devices of the problematic PE will be probed through early/late phase. We would delay EEH probe on late point for PowerNV platform since the PCI device isn't available in early phase. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Refactor eeh_reset_pe_once()Gavin Shan1-1/+2
We shouldn't check that the returned PE status is exactly equal to (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE) but instead only check that they are both set. [benh: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: EEH post initialization operationGavin Shan2-0/+12
The patch adds new EEH operation post_init. It's used to notify the platform that EEH core has completed the EEH probe. By that, PowerNV platform starts to use the services supplied by EEH functionality. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Make eeh_init() publicGavin Shan2-3/+27
For EEH on PowerNV platform, we will do EEH probe based on the real PCI devices. The PCI devices are available after PCI probe. So we have to call eeh_init() explicitly on PowerNV platform after PCI probe. The patch also does EEH probe for PowerNV platform in eeh_init(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Trace PCI bus from PEGavin Shan2-0/+18
There're several types of PEs can be supported for now: PHB, Bus and Device dependent PE. For PCI bus dependent PE, tracing the corresponding PCI bus from PE (struct eeh_pe) would make the code more efficient. The patch also enables the retrieval of PCI bus based on the PCI bus dependent PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Make eeh_pe_get() publicGavin Shan2-2/+3
While processing EEH event interrupt from P7IOC, we need function to retrieve the PE according to the indicated EEH device. The patch makes function eeh_pe_get() public so that other source files can call it for that purpose. Also, the patch fixes referring to wrong BDF (Bus/Device/Function) address while searching PE in function __eeh_pe_get(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Make eeh_phb_pe_get() publicGavin Shan2-1/+2
One of the possible cases indicated by P7IOC interrupt is fenced PHB. For that case, we need fetch the PE corresponding to the PHB and disable the PHB and all subordinate PCI buses/devices, recover from the fenced state and eventually enable the whole PHB. We need one function to fetch the PHB PE outside eeh_pe.c and the patch is going to make eeh_phb_pe_get() public for that purpose. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Move common part to kernel directoryGavin Shan13-97/+120
The patch moves the common part of EEH core into arch/powerpc/kernel directory so that we needn't PPC_PSERIES while compiling POWERNV platform: * Move the EEH common part into arch/powerpc/kernel * Move the functions for PCI hotplug from pSeries platform to arch/powerpc/kernel/pci-hotplug.c * Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to arch/powerpc/platforms/Kconfig * Adjust makefile accordingly Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/eeh: Cleanup for EEH coreGavin Shan2-18/+18
Cleanup on EEH core to remove unnecessary whitespaces. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/tm: Fix return of active 64bit signalsMichael Neuling1-3/+5
Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 64 bit signal return. It also ensures that the MSR TM bits are properly restored from the signal context which they are not currently. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/tm: Fix return of 32bit rt signals to active transactionsMichael Neuling1-1/+1
Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 32 bit rt signal return. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/tm: Fix restoration of MSR on 32bit signal returnMichael Neuling1-3/+6
Currently we clear out the MSR TM bits on signal return assuming that the signal should never return to an active transaction. This is bogus as the user may do this. It's most likely the transaction will be doomed due to a treclaim but that's a problem for the HW not the kernel. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes the assumption that it must be returning to a suspended transaction. This pulls out both MSR TM bits from the user supplied context rather than just setting TM suspend. We pull out only the bits needed to ensure the user can't do anything dangerous to the MSR. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20powerpc/tm: Fix 32 bit non-rt signalsMichael Neuling1-5/+25
Currently sys_sigreturn() is TM unaware. Therefore, if we take a 32 bit signal without SIGINFO (non RT) inside a transaction, on signal return we don't restore the signal frame correctly. This checks if the signal frame being restoring is an active transaction, and if so, it copies the additional state to ptregs so it can be restored. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>