summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2015-05-11net: Add a struct net parameter to sock_create_kernEric W. Biederman1-1/+1
This is long overdue, and is part of cleaning up how we allocate kernel sockets that don't reference count struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11tun: Utilize the normal socket network namespace refcounting.Eric W. Biederman1-1/+0
There is no need for tun to do the weird network namespace refcounting. The existing network namespace refcounting in tfile has almost exactly the same lifetime. So rewrite the code to use the struct sock network namespace refcounting and remove the unnecessary hand rolled network namespace refcounting and the unncesary tfile->net. This change allows the tun code to directly call sock_put bypassing sock_release and making SOCK_EXTERNALLY_ALLOCATED unnecessary. Remove the now unncessary tun_release so that if anything tries to use the sock_release code path the kernel will oops, and let us know about the bug. The macvtap code already uses it's internal socket this way. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-10netlink: allow to listen "all" netnsNicolas Dichtel1-0/+2
More accurately, listen all netns that have a nsid assigned into the netns where the netlink socket is opened. For this purpose, a netlink socket option is added: NETLINK_LISTEN_ALL_NSID. When this option is set on a netlink socket, this socket will receive netlink notifications from all netns that have a nsid assigned into the netns where the socket has been opened. The nsid is sent to userland via an anscillary data. With this patch, a daemon needs only one socket to listen many netns. This is useful when the number of netns is high. Because 0 is a valid value for a nsid, the field nsid_is_set indicates if the field nsid is valid or not. skb->cb is initialized to 0 on skb allocation, thus we are sure that we will never send a nsid 0 by error to the userland. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-10seccomp, filter: add and use bpf_prog_create_from_user from seccompDaniel Borkmann1-7/+5
Seccomp has always been a special candidate when it comes to preparation of its filters in seccomp_prepare_filter(). Due to the extra checks and filter rewrite it partially duplicates code and has BPF internals exposed. This patch adds a generic API inside the BPF code code that seccomp can use and thus keep it's filter preparation code minimal and better maintainable. The other side-effect is that now classic JITs can add seccomp support as well by only providing a BPF_LDX | BPF_W | BPF_ABS translation. Tested with seccomp and BPF test suites. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nicolas Schichan <nschichan@freebox.fr> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-10seccomp: simplify seccomp_prepare_filter and reuse bpf_prepare_filterNicolas Schichan1-4/+0
Remove the calls to bpf_check_classic(), bpf_convert_filter() and bpf_migrate_runtime() and let bpf_prepare_filter() take care of that instead. seccomp_check_filter() is passed to bpf_prepare_filter() so that it gets called from there, after bpf_check_classic(). We can now remove exposure of two internal classic BPF functions previously used by seccomp. The export of bpf_check_classic() symbol, previously known as sk_chk_filter(), was there since pre git times, and no in-tree module was using it, therefore remove it. Joint work with Daniel Borkmann. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-10net: filter: add a callback to allow classic post-verifier transformationsNicolas Schichan1-0/+6
This is in preparation for use by the seccomp code, the rationale is not to duplicate additional code within the seccomp layer, but instead, have it abstracted and hidden within the classic BPF API. As an interim step, this now also makes bpf_prepare_filter() visible (not as exported symbol though), so that seccomp can reuse that code path instead of reimplementing it. Joint work with Daniel Borkmann. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-06vlan: Use eth_proto_is_802_3Alexander Duyck1-1/+1
Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto). Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-06etherdev: Fix sparse error, make test usable by other functionsAlexander Duyck1-0/+18
This change does two things. First it fixes a sparse error for the fact that the __be16 degrades to an integer. Since that is actually what I am kind of doing I am simply working around that by forcing both sides of the comparison to u16. Also I realized on some compilers I was generating another instruction for big endian systems such as PowerPC since it was masking the value before doing the comparison. So to resolve that I have simply pulled the mask out and wrapped it in an #ifndef __BIG_ENDIAN. Lastly I pulled this all out into its own function. I notices there are similar checks in a number of other places so this function can be reused there to help reduce overhead in these paths as well. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-05tcp: provide SYN headers for passive connectionsEric Dumazet1-0/+8
This patch allows a server application to get the TCP SYN headers for its passive connections. This is useful if the server is doing fingerprinting of clients based on SYN packet contents. Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN. The first is used on a socket to enable saving the SYN headers for child connections. This can be set before or after the listen() call. The latter is used to retrieve the SYN headers for passive connections, if the parent listener has enabled TCP_SAVE_SYN. TCP_SAVED_SYN is read once, it frees the saved SYN headers. The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP headers. Original patch was written by Tom Herbert, I changed it to not hold a full skb (and associated dst and conntracking reference). We have used such patch for about 3 years at Google. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04net: Export IGMP/MLD message validation codeLinus Lüssing2-0/+4
With this patch, the IGMP and MLD message validation functions are moved from the bridge code to IPv4/IPv6 multicast files. Some small refactoring was done to enhance readibility and to iron out some differences in behaviour between the IGMP and MLD parsing code (e.g. the skb-cloning of MLD messages is now only done if necessary, just like the IGMP part always did). Finally, these IGMP and MLD message validation functions are exported so that not only the bridge can use it but batman-adv later, too. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04net: Add skb_get_hash_perturbTom Herbert1-0/+2
This calls flow_disect and __skb_get_hash to procure a hash for a packet. Input includes a key to initialize jhash. This function does not set skb->hash. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04etherdev: Process is_multicast_ether_addr at same size as other operationsAlexander Duyck1-1/+23
This change makes it so that we process the address in is_multicast_ether_addr at the same size as the other calls. This allows us to avoid duplicate reads when used with other calls such as is_zero_ether_addr or eth_addr_copy. In addition I have added a 64 bit version of the function so in eth_type_trans we can process the destination address as a 64 bit value throughout. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller7-13/+26
Merge net into net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-5/+19
Pull networking fixes from David Miller: 1) Receive packet length needs to be adjust by 2 on RX to accomodate the two padding bytes in altera_tse driver. From Vlastimil Setka. 2) If rx frame is dropped due to out of memory in macb driver, we leave the receive ring descriptors in an undefined state. From Punnaiah Choudary Kalluri 3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is only for dumps. Fix from Nicolas Dichtel. 4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via the ipv4_mtu() helper. From Herbert Xu. 5) Fix null deref in bridge netfilter, and miscalculated lengths in jump/goto nf_tables verdicts. From Florian Westphal. 6) Unhash ping sockets properly. 7) Software implementation of BPF divide did 64/32 rather than 64/64 bit divide. The JITs got it right. Fix from Alexei Starovoitov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits) ipv4: Missing sk_nulls_node_init() in ping_unhash(). net: fec: Fix RGMII-ID mode net/mlx4_en: Schedule napi when RX buffers allocation fails netxen_nic: use spin_[un]lock_bh around tx_clean_lock net/mlx4_core: Fix unaligned accesses mlx4_en: Use correct loop cursor in error path. cxgb4: Fix MC1 memory offset calculation bnx2x: Delay during kdump load net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table net: dsa: Fix scope of eeprom-length property net: macb: Fix race condition in driver when Rx frame is dropped hv_netvsc: Fix a bug in netvsc_start_xmit() altera_tse: Correct rx packet length mlx4: Fix tx ring affinity_mask creation tipc: fix problem with parallel link synchronization mechanism tipc: remove wrong use of NLM_F_MULTI bridge/nl: remove wrong use of NLM_F_MULTI bridge/mdb: remove wrong use of NLM_F_MULTI net: sched: act_connmark: don't zap skb->nfct trivial: net: systemport: bcmsysport.h: fix 0x0x prefix ...
2015-04-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-8/+0
Pull kvm changes from Paolo Bonzini: "Remove from guest code the handling of task migration during a pvclock read; instead use the correct protocol in KVM. This removes the need for task migration notifiers in core scheduler code" [ The scheduler people really hated the migration notifiers, so this was kind of required - Linus ] * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: x86: pvclock: Really remove the sched notifier for cross-cpu migrations kvm: x86: fix kvmclock update protocol
2015-04-30Merge tag 'tty-4.1-rc2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty/serial driver fixes for 4.1-rc2. They include some minor fixes that resolve reported issues, and a new device quirk. All have been in linux-next succesfully" * tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250_pci: Add support for 16 port Exar boards serial: samsung: fix serial console break tty/serial: at91: maxburst was missing for dma transfers serial: of-serial: Remove device_type = "serial" registration serial: xilinx: Use platform_get_irq to get irq description structure serial: core: Fix kernel-doc build warnings tty: Re-add external interface for tty_set_termios()
2015-04-30Merge tag 'usb-4.1-rc2' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB fixes for 4.2-rc2. They revert one problem patch, fix some minor things, and add some new quirks for "broken" devices. All have been in linux-next successfully" * tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: cdc-acm: prevent infinite loop when parsing CDC headers. Revert "usb: host: ehci-msm: Use devm_ioremap_resource instead of devm_ioremap" usb: chipidea: otg: remove mutex unlock and lock while stop and start role uas: Set max_sectors_240 quirk for ASM1053 devices uas: Add US_FL_MAX_SECTORS_240 flag uas: Allow uas_use_uas_driver to return usb-storage flags
2015-04-30tcp: add tcpi_bytes_received to tcp_infoEric Dumazet1-0/+4
This patch tracks total number of payload bytes received on a TCP socket. This is the sum of all changes done to tp->rcv_nxt RFC4898 named this : tcpEStatsAppHCThruOctetsReceived This is a 64bit field, and can be fetched both from TCP_INFO getsockopt() if one has a handle on a TCP socket, or from inet_diag netlink facility (iproute2/ss patch will follow) Note that tp->bytes_received was placed near tp->rcv_nxt for best data locality and minimal performance impact. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Eric Salo <salo@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-30tcp: add tcpi_bytes_acked to tcp_infoEric Dumazet1-0/+4
This patch tracks total number of bytes acked for a TCP socket. This is the sum of all changes done to tp->snd_una, and allows for precise tracking of delivered data. RFC4898 named this : tcpEStatsAppHCThruOctetsAcked This is a 64bit field, and can be fetched both from TCP_INFO getsockopt() if one has a handle on a TCP socket, or from inet_diag netlink facility (iproute2/ss patch will follow) Note that tp->bytes_acked was placed near tp->snd_una for best data locality and minimal performance impact. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Eric Salo <salo@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-29bridge/nl: remove wrong use of NLM_F_MULTINicolas Dichtel2-3/+5
NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: e5a55a898720 ("net: create generic bridge ops") Fixes: 815cccbf10b2 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf") CC: John Fastabend <john.r.fastabend@intel.com> CC: Sathya Perla <sathya.perla@emulex.com> CC: Subbu Seetharaman <subbu.seetharaman@emulex.com> CC: Ajit Khaparde <ajit.khaparde@emulex.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: intel-wired-lan@lists.osuosl.org CC: Jiri Pirko <jiri@resnulli.us> CC: Scott Feldman <sfeldma@gmail.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-28Merge branch 'for-linus' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "One additional new feature for 4.1, a new PRNG based on SHA-512 for the zcrypt driver. Two memory management related changes, the page table reallocation for KVM is removed, and with file ptes gone the encoding of page table entries is improved. And three bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator. s390/mm: change swap pte encoding and pgtable cleanup s390/mm: correct transfer of dirty & young bits in __pmd_to_pte s390/bpf: add dependency to z196 features s390/3215: free memory in error path s390/kvm: remove delayed reallocation of page tables for KVM kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP
2015-04-28tty: Re-add external interface for tty_set_termios()Frederic Danis1-0/+1
This is needed by Bluetooth hci_uart module to be able to change speed of Bluetooth controller and local UART. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-28uas: Add US_FL_MAX_SECTORS_240 flagHans de Goede1-0/+2
The usb-storage driver sets max_sectors = 240 in its scsi-host template, for uas we do not want to do that for all devices, but testing has shown that some devices need it. This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and implements support for it in uas.c, while at it it also adds support for US_FL_MAX_SECTORS_64 to uas.c. Cc: stable@vger.kernel.org # 3.16 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-2/+14
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix a crash in nf_tables when dictionaries are used from the ruleset, due to memory corruption, from Florian Westphal. 2) Fix another crash in nf_queue when used with br_netfilter. Also from Florian. Both fixes are related to new stuff that got in 4.0-rc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-4/+10
Pull networking fixes from David Miller: 1) mlx4 doesn't check fully for supported valid RSS hash function, fix from Amir Vadai 2) Off by one in ibmveth_change_mtu(), from David Gibson 3) Prevent altera chip from reporting false error interrupts in some circumstances, from Chee Nouk Phoon 4) Get rid of that stupid endless loop trying to allocate a FIN packet in TCP, and in the process kill deadlocks. From Eric Dumazet 5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from Eric Dumazet 6) Fix two bugs in async rhashtable resizing, from Thomas Graf 7) Fix topology server listener socket namespace bug in TIPC, from Ying Xue 8) Add some missing HAS_DMA kconfig dependencies, from Geert Uytterhoeven 9) bgmac driver intends to force re-polling but does so by returning the wrong value from it's ->poll() handler. Fix from Rafał Miłecki 10) When the creater of an rhashtable configures a max size for it, don't bark in the logs and drop insertions when that is exceeded. Fix from Johannes Berg 11) Recover from out of order packets in ppp mppe properly, from Sylvain Rochet * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) bnx2x: really disable TPA if 'disable_tpa' option is set net:treewide: Fix typo in drivers/net net/mlx4_en: Prevent setting invalid RSS hash function mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions netfilter; Add some missing default cases to switch statements in nft_reject. ppp: mppe: discard late packet in stateless mode ppp: mppe: sanity error path rework net/bonding: Make DRV macros private net: rfs: fix crash in get_rps_cpus() altera tse: add support for fixed-links. pxa168: fix double deallocation of managed resources net: fix crash in build_skb() net: eth: altera: Resolve false errors from MSGDMA to TSE ehea: Fix memory hook reference counting crashes net/tg3: Release IRQs on permanent error net: mdio-gpio: support access that may sleep inet: fix possible panic in reqsk_queue_unlink() rhashtable: don't attempt to grow when at max_size bgmac: fix requests for extra polling calls from NAPI tcp: avoid looping in tcp_send_fin() ...
2015-04-27x86: pvclock: Really remove the sched notifier for cross-cpu migrationsPaolo Bonzini1-8/+0
This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27 and 80f7fdb1c7f0f9266421f823964fd1962681f6ce. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27Merge git://git.infradead.org/intel-iommuLinus Torvalds1-3/+15
Pull intel iommu updates from David Woodhouse: "This lays a little of the groundwork for upcoming Shared Virtual Memory support — fixing some bogus #defines for capability bits and adding the new ones, and starting to use the new wider page tables where we can, in anticipation of actually filling in the new fields therein. It also allows graphics devices to be assigned to VM guests again. This got broken in 3.17 by disallowing assignment of RMRR-afflicted devices. Like USB, we do understand why there's an RMRR for graphics devices — and unlike USB, it's actually sane. So we can make an exception for graphics devices, just as we do USB controllers. Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty hack to allow broken BIOSes to forbid us from using X2APIC when they do stupid and invasive things and would break if we did. Someone noticed that since Windows doesn't have full IOMMU support for DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid initialising the IOMMU on the graphics unit altogether. This means that it would be available for use in "driver mode", where the IOMMU registers are made available through a BAR of the graphics device and the graphics driver can do SVM all for itself. So they started setting the X2APIC_OPT_OUT bit on *all* platforms with SVM capabilities. And even the platforms which *might*, if the planets had been aligned correctly, possibly have had SVM capability but which in practice actually don't" * git://git.infradead.org/intel-iommu: iommu/vt-d: support extended root and context entries iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification iommu/vt-d: Allow RMRR on graphics devices too iommu/vt-d: Print x2apic opt out info instead of printing a warning iommu/vt-d: kill bogus ecap_niotlb_iunits()
2015-04-27Merge tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds5-85/+14
Pull NFS client updates from Trond Myklebust: "Another set of mainly bugfixes and a couple of cleanups. No new functionality in this round. Highlights include: Stable patches: - Fix a regression in /proc/self/mountstats - Fix the pNFS flexfiles O_DIRECT support - Fix high load average due to callback thread sleeping Bugfixes: - Various patches to fix the pNFS layoutcommit support - Do not cache pNFS deviceids unless server notifications are enabled - Fix a SUNRPC transport reconnection regression - make debugfs file creation failure non-fatal in SUNRPC - Another fix for circular directory warnings on NFSv4 "junctioned" mountpoints - Fix locking around NFSv4.2 fallocate() support - Truncating NFSv4 file opens should also sync O_DIRECT writes - Prevent infinite loop in rpcrdma_ep_create() Features: - Various improvements to the RDMA transport code's handling of memory registration - Various code cleanups" * tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits) fs/nfs: fix new compiler warning about boolean in switch nfs: Remove unneeded casts in nfs NFS: Don't attempt to decode missing directory entries Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one" NFS: Rename idmap.c to nfs4idmap.c NFS: Move nfs_idmap.h into fs/nfs/ NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h NFS: Add a stub for GETDEVICELIST nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes nfs: fix DIO good bytes calculation nfs: Fetch MOUNTED_ON_FILEID when updating an inode sunrpc: make debugfs file creation failure non-fatal nfs: fix high load average due to callback thread sleeping NFS: Reduce time spent holding the i_mutex during fallocate() NFS: Don't zap caches on fallocate() xprtrdma: Make rpcrdma_{un}map_one() into inline functions xprtrdma: Handle non-SEND completions via a callout xprtrdma: Add "open" memreg op xprtrdma: Add "destroy MRs" memreg op xprtrdma: Add "reset MRs" memreg op ...
2015-04-27Merge branch 'for-linus' of ↵Linus Torvalds1-2/+29
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26Merge tag 'chrome-for-linus' of ↵Linus Torvalds1-5/+18
git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform Pull chrome platform updates from Olof Johansson: "Here's a set of updates to the Chrome OS platform drivers for this merge window. Main new things this cycle is: - Driver changes to expose the lightbar to users. With this, you can make your own blinkenlights on Chromebook Pixels. - Changes in the way that the atmel_mxt trackpads are probed. The laptop driver is trying to be smart and not instantiate the devices that don't answer to probe. For the trackpad that can come up in two modes (bootloader or regular), this gets complicated since the driver already knows how to handle the two modes including the actual addresses used. So now the laptop driver needs to know more too, instantiating the regular address even if the bootloader one is the probe that passed. - mfd driver improvements by Javier Martines Canillas, and a few bugfixes from him, kbuild and myself" * tag 'chrome-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform: platform/chrome: chromeos_laptop - instantiate Atmel at primary address platform/chrome: cros_ec_lpc - Depend on X86 || COMPILE_TEST platform/chrome: cros_ec_lpc - Include linux/io.h header file platform/chrome: fix platform_no_drv_owner.cocci warnings platform/chrome: cros_ec_lightbar - fix duplicate const warning platform/chrome: cros_ec_dev - fix Unknown escape '%' warning platform/chrome: Expose Chrome OS Lightbar to users platform/chrome: Create sysfs attributes for the ChromeOS EC mfd: cros_ec: Instantiate ChromeOS EC character device platform/chrome: Add Chrome OS EC userspace device interface platform/chrome: Add cros_ec_lpc driver for x86 devices mfd: cros_ec: Add char dev and virtual dev pointers mfd: cros_ec: Use fixed size arrays to transfer data with the EC
2015-04-25net: fix crash in build_skb()Eric Dumazet1-0/+1
When I added pfmemalloc support in build_skb(), I forgot netlink was using build_skb() with a vmalloc() area. In this patch I introduce __build_skb() for netlink use, and build_skb() is a wrapper handling both skb->head_frag and skb->pfmemalloc This means netlink no longer has to hack skb->head_frag [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26! [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [ 1567.700067] Dumping ftrace buffer: [ 1567.700067] (ftrace buffer empty) [ 1567.700067] Modules linked in: [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167 [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000 [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3)) [ 1567.700067] RSP: 0018:ffff8802467779d8 EFLAGS: 00010202 [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049 [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000 [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000 [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000 [ 1567.700067] FS: 00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000 [ 1567.700067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0 [ 1567.700067] Stack: [ 1567.700067] ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000 [ 1567.700067] ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08 [ 1567.700067] ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821 [ 1567.700067] Call Trace: [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316) [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329) [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623) [ 1567.774369] sock_write_iter (net/socket.c:823) [ 1567.774369] ? sock_sendmsg (net/socket.c:806) [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491) [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249) [ 1567.774369] ? default_llseek (fs/read_write.c:487) [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701) [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4)) [ 1567.774369] vfs_write (fs/read_write.c:539) [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577) [ 1567.774369] ? SyS_read (fs/read_write.c:577) [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636) [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42) [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261) Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-24fix I_DIO_WAKEUP definitionEric Sandeen1-1/+1
I_DIO_WAKEUP is never directly used, but fix it up anyway. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24direct-io: only inc/dec inode->i_dio_count for file systemsJens Axboe1-1/+28
do_blockdev_direct_IO() increments and decrements the inode ->i_dio_count for each IO operation. It does this to protect against truncate of a file. Block devices don't need this sort of protection. For a capable multiqueue setup, this atomic int is the only shared state between applications accessing the device for O_DIRECT, and it presents a scaling wall for that. In my testing, as much as 30% of system time is spent incrementing and decrementing this value. A mixed read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with better latencies too. Before: clat percentiles (usec): | 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34], | 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35], | 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80], | 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155], | 99.99th=[ 165] After: clat percentiles (usec): | 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149], | 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171], | 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270], | 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422], | 99.99th=[ 438] In other setups, Robert Elliott reported seeing good performance improvements: https://lkml.org/lkml/2015/4/3/557 The more applications accessing the device, the worse it gets. Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells do_blockdev_direct_IO() that it need not worry about incrementing or decrementing the inode i_dio_count for this caller. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Elliott, Robert (Server Storage) <elliott@hp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-24netfilter: bridge: fix NULL deref in physin/out ifindex helpersFlorian Westphal1-2/+14
Might not have an outdev yet. We'll oops when iface goes down while skbs are still nfqueue'd: RIP: 0010:[<ffffffff81422a2f>] [<ffffffff81422a2f>] dev_cmp+0x4f/0x80 nfqnl_rcv_dev_event+0xe2/0x150 notifier_call_chain+0x53/0xa0 Fixes: c737b7c4510026 ("netfilter: bridge: add helpers for fetching physin/outdev") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-24Merge tag 'dma-buf-for-4.1' of ↵Linus Torvalds1-6/+28
git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf Pull dma-buf updates from Sumit Semwal: "Minor cleanup only; this could've gone in for the 4.0 merge window, but for a copy-paste stupidity from me. It has been in the for-next since then, and no issues reported. - cleanup of dma_buf_export() - correction of copy-paste stupidity while doing the cleanup" * tag 'dma-buf-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf: staging: android: ion: fix wrong init of dma_buf_export_info dma-buf: cleanup dma_buf_export() to make it easily extensible
2015-04-24Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds6-46/+6
Pull slave-dmaengine updates from Vinod Koul: - new drivers for: - Ingenic JZ4780 controller - APM X-Gene controller - Freescale RaidEngine device - Renesas USB Controller - remove device_alloc_chan_resources dummy handlers - sh driver cleanups for peri peri and related emmc and asoc patches as well - fixes and enhancements spread over the drivers * 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (59 commits) dmaengine: dw: don't prompt for DW_DMAC_CORE dmaengine: shdmac: avoid unused variable warnings dmaengine: fix platform_no_drv_owner.cocci warnings dmaengine: pch_dma: fix memory leak on failure path in pch_dma_probe() dmaengine: at_xdmac: unlock spin lock before return dmaengine: xgene: devm_ioremap() returns NULL on error dmaengine: xgene: buffer overflow in xgene_dma_init_channels() dmaengine: usb-dmac: Fix dereferencing freed memory 'desc' dmaengine: sa11x0: report slave capabilities to upper layers dmaengine: vdma: Fix compilation warnings dmaengine: fsl_raid: statify fsl_re_chan_probe dmaengine: Driver support for FSL RaidEngine device. dmaengine: xgene_dma_init_ring_mngr() can be static Documentation: dma: Add documentation for the APM X-Gene SoC DMA device DTS binding arm64: dts: Add APM X-Gene SoC DMA device and DMA clock DTS nodes dmaengine: Add support for APM X-Gene SoC DMA engine driver dmaengine: usb-dmac: Add Renesas USB DMA Controller (USB-DMAC) driver dmaengine: renesas,usb-dmac: Add device tree bindings documentation dmaengine: edma: fixed wrongly initialized data parameter to the edma callback dmaengine: ste_dma40: fix implicit conversion ...
2015-04-24Merge tag 'md/4.1' of git://neil.brown.name/mdLinus Torvalds2-0/+4
Pull md updates from Neil Brown: "More updates that usual this time. A few have performance impacts which hould mostly be positive, but RAID5 (in particular) can be very work-load ensitive... We'll have to wait and see. Highlights: - "experimental" code for managing md/raid1 across a cluster using DLM. Code is not ready for general use and triggers a WARNING if used. However it is looking good and mostly done and having in mainline will help co-ordinate development. - RAID5/6 can now batch multiple (4K wide) stripe_heads so as to handle a full (chunk wide) stripe as a single unit. - RAID6 can now perform read-modify-write cycles which should help performance on larger arrays: 6 or more devices. - RAID5/6 stripe cache now grows and shrinks dynamically. The value set is used as a minimum. - Resync is now allowed to go a little faster than the 'mininum' when there is competing IO. How much faster depends on the speed of the devices, so the effective minimum should scale with device speed to some extent" * tag 'md/4.1' of git://neil.brown.name/md: (58 commits) md/raid5: don't do chunk aligned read on degraded array. md/raid5: allow the stripe_cache to grow and shrink. md/raid5: change ->inactive_blocked to a bit-flag. md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe md/raid5: pass gfp_t arg to grow_one_stripe() md/raid5: introduce configuration option rmw_level md/raid5: activate raid6 rmw feature md/raid6 algorithms: xor_syndrome() for SSE2 md/raid6 algorithms: xor_syndrome() for generic int md/raid6 algorithms: improve test program md/raid6 algorithms: delta syndrome functions raid5: handle expansion/resync case with stripe batching raid5: handle io error of batch list RAID5: batch adjacent full stripe write raid5: track overwrite disk count raid5: add a new flag to track if a stripe can be batched raid5: use flex_array for scribble data md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid md: allow resync to go faster when there is competing IO. md: remove 'go_faster' option from ->sync_request() ...
2015-04-24Merge tag 'devicetree-for-4.1' of ↵Linus Torvalds3-2/+15
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull second batch of devicetree updates from Rob Herring: "As Grant mentioned in the first devicetree pull request, here is the 2nd batch of DT changes for 4.1. The main remaining item here is the endianness bindings and related 8250 driver support. - DT endianness specification bindings - big-endian 8250 serial support - DT overlay unittest updates - various DT doc updates - compile fixes for OF_IRQ=n" * tag 'devicetree-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: frv: add io{read,write}{16,32}be functions mn10300: add io{read,write}{16,32}be functions Documentation: DT bindings: add doc for Altera's SoCFPGA platform of: base: improve of_get_next_child() kernel-doc Doc: dt: arch_timer: discourage clock-frequency use of: unittest: overlay: Keep track of created overlays of/fdt: fix allocation size for device node path serial: of_serial: Support big-endian register accesses serial: 8250: Add support for big-endian MMIO accesses of: Document {little,big,native}-endian bindings of/fdt: Add endianness helper function for early init code of: Add helper function to check MMIO register endianness of/fdt: Remove "reg" data prints from early_init_dt_scan_memory of: add vendor prefix for Artesyn of: Add dummy of_irq_to_resource_table() for IRQ_OF=n of: OF_IRQ should depend on IRQ_DOMAIN
2015-04-24rhashtable: don't attempt to grow when at max_sizeJohannes Berg1-1/+2
The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the "lost insertion" issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-24Merge tag 'arm64-upstream' of ↵Linus Torvalds4-1/+54
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull initial ACPI support for arm64 from Will Deacon: "This series introduces preliminary ACPI 5.1 support to the arm64 kernel using the "hardware reduced" profile. We don't support any peripherals yet, so it's fairly limited in scope: - MEMORY init (UEFI) - ACPI discovery (RSDP via UEFI) - CPU init (FADT) - GIC init (MADT) - SMP boot (MADT + PSCI) - ACPI Kconfig options (dependent on EXPERT) ACPI for arm64 has been in development for a while now and hardware has been available that can boot with either FDT or ACPI tables. This has been made possible by both changes to the ACPI spec to cater for ARM-based machines (known as "hardware-reduced" in ACPI parlance) but also a Linaro-driven effort to get this supported on top of the Linux kernel. This pull request is the result of that work. These changes allow us to initialise the CPUs, interrupt controller, and timers via ACPI tables, with memory information and cmdline coming from EFI. We don't support a hybrid ACPI/FDT scheme. Of course, there is still plenty of work to do (a serial console would be nice!) but I expect that to happen on a per-driver basis after this core series has been merged. Anyway, the diff stat here is fairly horrible, but splitting this up and merging it via all the different subsystems would have been extremely painful. Instead, we've got all the relevant Acks in place and I've not seen anything other than trivial (Kconfig) conflicts in -next (for completeness, I've included my resolution below). Nearly half of the insertions fall under Documentation/. So, we'll see how this goes. Right now, it all depends on EXPERT and I fully expect people to use FDT by default for the immediate future" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (31 commits) ARM64 / ACPI: make acpi_map_gic_cpu_interface() as void function ARM64 / ACPI: Ignore the return error value of acpi_map_gic_cpu_interface() ARM64 / ACPI: fix usage of acpi_map_gic_cpu_interface ARM64: kernel: acpi: honour acpi=force command line parameter ARM64: kernel: acpi: refactor ACPI tables init and checks ARM64: kernel: psci: let ACPI probe PSCI version ARM64: kernel: psci: factor out probe function ACPI: move arm64 GSI IRQ model to generic GSI IRQ layer ARM64 / ACPI: Don't unflatten device tree if acpi=force is passed ARM64 / ACPI: additions of ACPI documentation for arm64 Documentation: ACPI for ARM64 ARM64 / ACPI: Enable ARM64 in Kconfig XEN / ACPI: Make XEN ACPI depend on X86 ARM64 / ACPI: Select ACPI_REDUCED_HARDWARE_ONLY if ACPI is enabled on ARM64 clocksource / arch_timer: Parse GTDT to initialize arch timer irqchip: Add GICv2 specific ACPI boot support ARM64 / ACPI: Introduce ACPI_IRQ_MODEL_GIC and register device's gsi ACPI / processor: Make it possible to get CPU hardware ID via GICC ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID ARM64 / ACPI: Parse MADT for SMP initialization ...
2015-04-24Merge branch 'for-4.1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-0/+7
Pull nfsd updates from Bruce Fields: "A quiet cycle this time; this is basically entirely bugfixes. The few that aren't cc'd to stable are cleanup or seemed unlikely to affect anyone much" * 'for-4.1' of git://linux-nfs.org/~bfields/linux: uapi: Remove kernel internal declaration nfsd: fix nsfd startup race triggering BUG_ON nfsd: eliminate NFSD_DEBUG nfsd4: fix READ permission checking nfsd4: disallow SEEK with special stateids nfsd4: disallow ALLOCATE with special stateids nfsd: add NFSEXP_PNFS to the exflags array nfsd: Remove duplicate macro define for max sec label length nfsd: allow setting acls with unenforceable DENYs nfsd: NFSD_FAULT_INJECTION depends on DEBUG_FS nfsd: remove unused status arg to nfsd4_cleanup_open_state nfsd: remove bogus setting of status in nfsd4_process_open2 NFSD: Use correct reply size calculating function NFSD: Using path_equal() for checking two paths
2015-04-24Merge tag 'xfs-for-linus-4.1-rc1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs update from Dave Chinner: "This update contains: - RENAME_WHITEOUT support - conversion of per-cpu superblock accounting to use generic counters - new inode mmap lock so that we can lock page faults out of truncate, hole punch and other direct extent manipulation functions to avoid racing mmap writes from causing data corruption - rework of direct IO submission and completion to solve data corruption issue when running concurrent extending DIO writes. Also solves problem of running IO completion transactions in interrupt context during size extending AIO writes. - FALLOC_FL_INSERT_RANGE support for inserting holes into a file via direct extent manipulation to avoid needing to copy data within the file - attribute block header field overflow fix for 64k block size filesystems - Lots of changes to log messaging to be more informative and concise when errors occur. Also prevent a lot of unnecessary log spamming due to cascading failures in error conditions. - lots of cleanups and bug fixes One thing of note is the direct IO fixes that we merged last week after the window opened. Even though a little late, they fix a user reported data corruption and have been pretty well tested. I figured there was not much point waiting another 2 weeks for -rc1 to be released just so I could send them to you..." * tag 'xfs-for-linus-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (49 commits) xfs: using generic_file_direct_write() is unnecessary xfs: direct IO EOF zeroing needs to drain AIO xfs: DIO write completion size updates race xfs: DIO writes within EOF don't need an ioend xfs: handle DIO overwrite EOF update completion correctly xfs: DIO needs an ioend for writes xfs: move DIO mapping size calculation xfs: factor DIO write mapping from get_blocks xfs: unlock i_mutex in xfs_break_layouts xfs: kill unnecessary firstused overflow check on attr3 leaf removal xfs: use larger in-core attr firstused field and detect overflow xfs: pass attr geometry to attr leaf header conversion functions xfs: disallow ro->rw remount on norecovery mount xfs: xfs_shift_file_space can be static xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate fs: Add support FALLOC_FL_INSERT_RANGE for fallocate xfs: Fix incorrect positive ENOMEM return xfs: xfs_mru_cache_insert() should use GFP_NOFS xfs: %pF is only for function pointers xfs: fix shadow warning in xfs_da3_root_split() ...
2015-04-23Merge tag 'nfs-rdma-for-4.1-1' of git://git.linux-nfs.org/projects/anna/nfs-rdmaTrond Myklebust20-24/+77
NFS: NFSoRDMA Client Changes This patch series creates an operation vector for each of the different memory registration modes. This should make it easier to one day increase credit limit, rsize, and wsize. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2015-04-23NFS: Move nfs_idmap.h into fs/nfs/Anna Schumaker1-68/+0
This file is only used internally to the NFS v4 module, so it doesn't need to be in the global include path. I also renamed it from nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.hAnna Schumaker1-11/+0
The idmapper is completely internal to the NFS v4 module, so this macro will always evaluate to true. This patch also removes unnecessary includes of this file from the generic NFS client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23sunrpc: make debugfs file creation failure non-fatalJeff Layton1-9/+9
v2: gracefully handle the case where some dentry pointers end up NULL and be more dilligent about zeroing out dentry pointers We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23NFS: Don't zap caches on fallocate()Anna Schumaker1-0/+4
This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE operations so we can set the updated inode size and change attribute directly. DEALLOCATE will still need to release pagecache pages, so nfs42_proc_deallocate() now calls truncate_pagecache_range() before contacting the server. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-23netdev_alloc_pcpu_stats: use less common iterator variableJohannes Berg1-3/+3
With the CPU iteration variable called 'i', it's relatively easy to have variable shadowing which sparse will warn about. Avoid that by renaming the variable to __cpu which is less likely to be used in the surrounding context. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-23kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFPMartin Schwidefsky1-0/+4
Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code to override the gfp flags of the allocation for the kexec control page. The loop in kimage_alloc_normal_control_pages allocates pages with GFP_KERNEL until a page is found that happens to have an address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT the loop will keep allocating memory until the oom killer steps in. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-04-22Merge tag 'for-linus-20150422' of git://git.infradead.org/linux-mtdLinus Torvalds2-19/+40
Pull MTD updates from Brian Norris: "Common MTD: - Add Kconfig option for keeping both the 'master' and 'partition' MTDs registered as devices. This would really make a better default if we could do it over, as it allows a lot more flexibility in (1) determining the flash topology of the system from user-space and (2) adding temporary partitions at runtime (ioctl(BLKPG)). Unfortunately, this would possibly cause user-space breakage, as it will cause renumbering of the /dev/mtdX devices. We'll see if we can change this in the future, as there have already been a few people looking for this feature, and I know others have just been working around our current limitations instead of fixing them this way. - Along with the previous change, add some additional information to sysfs, so user-space can read the offset of each partition within its master device SPI NOR: - add new device tree compatible binding to represent the mostly-compatible class of SPI NOR flash which can be detected by their extended JEDEC ID bytes, cutting down the duplication of our ID tables - misc. new IDs Various other miscellaneous fixes and changes" * tag 'for-linus-20150422' of git://git.infradead.org/linux-mtd: (53 commits) mtd: spi-nor: Add support for Macronix mx25u6435f serial flash mtd: spi-nor: Add support for Winbond w25q64dw serial flash mtd: spi-nor: add support for the Winbond W25X05 flash mtd: spi-nor: support en25s64 device mtd: m25p80: bind to "nor-jedec" ID, for auto-detection Documentation: devicetree: m25p80: add "nor-jedec" binding mtd: Make MTD tests cancelable mtd: mtd_oobtest: Fix bitflip_limit usage in test case 3 mtd: docg3: remove invalid __exit annotations mtd: fsl_ifc_nand: use msecs_to_jiffies for time conversion mtd: atmel_nand: don't map the ROM table if no pmecc table offset in DT mtd: atmel_nand: add a definition for the oob reserved bytes mtd: part: Remove partition overlap checks mtd: part: Add sysfs variable for offset of partition mtd: part: Create the master device node when partitioned mtd: ts5500_flash: Fix typo in MODULE_DESCRIPTION in ts5500_flash.c mtd: denali: Disable sub-page writes in Denali NAND driver mtd: pxa3xx_nand: cleanup wait_for_completion handling mtd: nand: gpmi: Check for scan_bbt() error mtd: nand: gpmi: fixup return type of wait_for_completion_timeout ...