summaryrefslogtreecommitdiff
path: root/fs/dlm/lowcomms.c
AgeCommit message (Collapse)AuthorFilesLines
2022-04-06dlm: add __CHECKER__ for false positivesAlexander Aring1-0/+10
This patch will adds #ifndef __CHECKER__ for false positives warnings about an imbalance lock/unlock srcu handling. Which are shown by running sparse checks: fs/dlm/midcomms.c:1065:20: warning: context imbalance in 'dlm_midcomms_get_mhandle' - wrong count at exit Using __CHECKER__ will tell sparse to ignore these sections. Those imbalances are false positive because from upper layer it is always required to call a function in sequence, e.g. if dlm_midcomms_get_mhandle() is successful there must be a dlm_midcomms_commit_mhandle() call afterwards. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2022-04-06dlm: uninitialized variable on error in dlm_listen_for_all()Dan Carpenter1-1/+1
The "sock" variable is not initialized on this error path. Cc: stable@vger.kernel.org Fixes: 2dc6b1158c28 ("fs: dlm: introduce generic listen") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2022-01-04fs: dlm: print cluster addr if non-cluster node connectsAlexander Aring1-4/+22
This patch prints the cluster node address if a non-cluster node (according to the dlm config setting) tries to connect. The current hexdump call will print in a different loglevel and only available if dynamic debug is enabled. Additional we using the ip address format strings to print an IETF ip4/6 string represenation. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-12-07fs: dlm: memory cache for lowcomms hotpathAlexander Aring1-3/+10
This patch introduces a kmem cache for dlm_msg handles which are used always if dlm sends a message out. Even if their are covered by midcomms layer or not. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-12-07fs: dlm: memory cache for writequeue_entryAlexander Aring1-5/+21
This patch introduces a kmem cache for writequeue entry. A writequeue entry get quite a lot allocated if dlm transmit messages. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-12-07fs: dlm: remove wq_alloc mutexAlexander Aring1-37/+11
This patch cleanups the code for allocating a new buffer in the dlm writequeue mechanism. There was a possible tuneup to allow scheduling while a new writequeue entry needs to be allocated because either no sending page is available or are full. To avoid multiple concurrent users checking at the same time if an entry is available or full alloc_wq was introduce that those are waiting if there is currently a new writequeue entry in process to be queued so possible further users will check on the new allocated writequeue entry if it's full. To simplify the code we just remove this mutex and switch that the already introduced spin lock will be held during writequeue check, allocation and queueing. So other users can never check on available writequeues while there is a new one in process but not queued yet. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-12-07fs: dlm: check for pending users filling buffersAlexander Aring1-1/+4
Currently we don't care if the DLM application stack is filling buffers (not committed yet) while we transmit some already committed buffers. By checking on active writequeue users before dequeue a writequeue entry we know there is coming more data and do nothing. We wait until the send worker will be triggered again if the writequeue entry users hit zero. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-17fs: dlm: fix build with CONFIG_IPV6 disabledAlexander Aring1-0/+2
This patch will surround the AF_INET6 case in sk_error_report() of dlm with a #if IS_ENABLED(CONFIG_IPV6). The field sk->sk_v6_daddr is not defined when CONFIG_IPV6 is disabled. If CONFIG_IPV6 is disabled, the socket creation with AF_INET6 should already fail because a runtime check if AF_INET6 is registered. However if there is the possibility that AF_INET6 is set as sk_family the sk_error_report() callback will print then an invalid family type error. Reported-by: kernel test robot <lkp@intel.com> Fixes: 4c3d90570bcc ("fs: dlm: don't call kernel_getpeername() in error_report()") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-15fs: dlm: replace use of socket sk_callback_lock with sock_lockAlexander Aring1-17/+10
This patch will replace the use of socket sk_callback_lock lock and uses socket lock instead. Some users like sunrpc, see commit ea9afca88bbe ("SUNRPC: Replace use of socket sk_callback_lock with sock_lock") moving from sk_callback_lock to sock_lock which seems to be held when the socket callbacks are called. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-15fs: dlm: don't call kernel_getpeername() in error_report()Alexander Aring1-22/+20
In some cases kernel_getpeername() will held the socket lock which is already held when the socket layer calls error_report() callback. Since commit 9dfc685e0262 ("inet: remove races in inet{6}_getname()") this problem becomes more likely because the socket lock will be held always. You will see something like: bob9-u5 login: [ 562.316860] BUG: spinlock recursion on CPU#7, swapper/7/0 [ 562.318562] lock: 0xffff8f2284720088, .magic: dead4ead, .owner: swapper/7/0, .owner_cpu: 7 [ 562.319522] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.15.0+ #135 [ 562.320346] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 [ 562.321277] Call Trace: [ 562.321529] <IRQ> [ 562.321734] dump_stack_lvl+0x33/0x42 [ 562.322282] do_raw_spin_lock+0x8b/0xc0 [ 562.322674] lock_sock_nested+0x1e/0x50 [ 562.323057] inet_getname+0x39/0x110 [ 562.323425] ? sock_def_readable+0x80/0x80 [ 562.323838] lowcomms_error_report+0x63/0x260 [dlm] [ 562.324338] ? wait_for_completion_interruptible_timeout+0xd2/0x120 [ 562.324949] ? lock_timer_base+0x67/0x80 [ 562.325330] ? do_raw_spin_unlock+0x49/0xc0 [ 562.325735] ? _raw_spin_unlock_irqrestore+0x1e/0x40 [ 562.326218] ? del_timer+0x54/0x80 [ 562.326549] sk_error_report+0x12/0x70 [ 562.326919] tcp_validate_incoming+0x3c8/0x530 [ 562.327347] ? kvm_clock_read+0x14/0x30 [ 562.327718] ? ktime_get+0x3b/0xa0 [ 562.328055] tcp_rcv_established+0x121/0x660 [ 562.328466] tcp_v4_do_rcv+0x132/0x260 [ 562.328835] tcp_v4_rcv+0xcea/0xe20 [ 562.329173] ip_protocol_deliver_rcu+0x35/0x1f0 [ 562.329615] ip_local_deliver_finish+0x54/0x60 [ 562.330050] ip_local_deliver+0xf7/0x110 [ 562.330431] ? inet_rtm_getroute+0x211/0x840 [ 562.330848] ? ip_protocol_deliver_rcu+0x1f0/0x1f0 [ 562.331310] ip_rcv+0xe1/0xf0 [ 562.331603] ? ip_local_deliver+0x110/0x110 [ 562.332011] __netif_receive_skb_core+0x46a/0x1040 [ 562.332476] ? inet_gro_receive+0x263/0x2e0 [ 562.332885] __netif_receive_skb_list_core+0x13b/0x2c0 [ 562.333383] netif_receive_skb_list_internal+0x1c8/0x2f0 [ 562.333896] ? update_load_avg+0x7e/0x5e0 [ 562.334285] gro_normal_list.part.149+0x19/0x40 [ 562.334722] napi_complete_done+0x67/0x160 [ 562.335134] virtnet_poll+0x2ad/0x408 [virtio_net] [ 562.335644] __napi_poll+0x28/0x140 [ 562.336012] net_rx_action+0x23d/0x300 [ 562.336414] __do_softirq+0xf2/0x2ea [ 562.336803] irq_exit_rcu+0xc1/0xf0 [ 562.337173] common_interrupt+0xb9/0xd0 It is and was always forbidden to call kernel_getpeername() in context of error_report(). To get rid of the problem we access the destination address for the peer over the socket structure. While on it we fix to print out the destination port of the inet socket. Fixes: 1a31833d085a ("DLM: Replace nodeid_to_addr with kernel_getpeername") Reported-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-04fs: dlm: remove double list_first_entry callAlexander Aring1-1/+0
This patch removes a list_first_entry() call which is already done by the previous con_next_wq() call. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-02fs: dlm: let handle callback data as voidAlexander Aring1-10/+9
This patch changes the dlm_lowcomms_new_msg() function pointer private data from "struct mhandle *" to "void *" to provide different structures than just "struct mhandle". Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-02fs: dlm: trace socket handlingAlexander Aring1-0/+4
This patch adds tracepoints for dlm socket receive and send functionality. We can use it to track how much data was send or received to or from a specific nodeid. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-11-02fs: dlm: remove check SCTP is loaded messageAlexander Aring1-1/+1
Since commit 764ff4011424 ("fs: dlm: auto load sctp module") we try load the sctp module before we try to create a sctp kernel socket. That a socket creation fails now has more likely other reasons. This patch removes the part of error to load the sctp module and instead printout the error code. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-08-19fs: dlm: implement delayed ack handlingAlexander Aring1-0/+1
This patch changes that we don't ack each message. Lowcomms will take care about to send an ack back after a bulk of messages was processed. Currently it's only when the whole receive buffer was processed, there might better positions to send an ack back but only the lowcomms implementation know when there are more data to receive. This patch has also disadvantages that we might retransmit more on errors, however this is a very rare case. Tested with make_panic on gfs2 with three nodes by running: trace-cmd record -p function -l 'dlm_send_ack' sleep 100 and trace-cmd report | wc -l Before patch: - 20548 - 21376 - 21398 After patch: - 18338 - 20679 - 19949 Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: move receive loop into receive handlerAlexander Aring1-37/+31
This patch moves the kernel_recvmsg() loop call into the receive_from_sock() function instead of doing the loop outside the function and abort the loop over it's return value. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: fix multiple empty writequeue allocAlexander Aring1-0/+21
This patch will add a mutex that a connection can allocate a writequeue entry buffer only at a sleepable context at one time. If multiple caller waits at the writequeue spinlock and the spinlock gets release it could be that multiple new writequeue page buffers were allocated instead of allocate one writequeue page buffer and other waiters will use remaining buffer of it. It will only be the case for sleepable context which is the common case. In non-sleepable contexts like retransmission we just don't care about such behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: generic connect funcAlexander Aring1-193/+150
This patch adds a generic connect function for TCP and SCTP. If the connect functionality differs from each other additional callbacks in dlm_proto_ops were added. The sockopts callback handling will guarantee that sockets created by connect() will use the same options as sockets created by accept(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: auto load sctp moduleAlexander Aring1-5/+15
This patch adds a "for now" better handling of missing SCTP support in the kernel and try to load the sctp module if SCTP is set. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: introduce generic listenAlexander Aring1-115/+113
This patch combines each transport layer listen functionality into one listen function. Per transport layer differences are provided by additional callbacks in dlm_proto_ops. This patch drops silently sock_set_keepalive() for listen tcp sockets only. This socket option is not set at connecting sockets, I also don't see the sense of set keepalive for sockets which are created by accept() only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: move to static proto opsAlexander Aring1-22/+30
This patch moves the per transport socket callbacks to a static const array. We can support only one transport socket for the init namespace which will be determinted by reading the dlm config at lowcomms_start(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: introduce con_next_wq helperAlexander Aring1-22/+35
This patch introduce a function to determine if something is ready to being send in the writequeue. It's not just that the writequeue is not empty additional the first entry need to have a valid length field. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: clear CF_APP_LIMITED on closeAlexander Aring1-0/+1
If send_to_sock() sets CF_APP_LIMITED limited bit and it has not been cleared by a waiting lowcomms_write_space() yet and a close_connection() apprears we should clear the CF_APP_LIMITED bit again because the connection starts from a new state again at reconnect. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: use sk->sk_socket instead of con->sockAlexander Aring1-2/+1
Instead of dereference "con->sock" we can get the socket structure over "sk->sk_socket" as well. This patch will switch to this behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-06-02fs: dlm: rename socket and app buffer definesAlexander Aring1-2/+2
This patch renames DEFAULT_BUFFER_SIZE to DLM_MAX_SOCKET_BUFSIZE and LOWCOMMS_MAX_TX_BUFFER_LEN to DLM_MAX_APP_BUFSIZE as they are proper names to define what's behind those values. The DLM_MAX_SOCKET_BUFSIZE defines the maximum size of buffer which can be handled on socket layer, the DLM_MAX_APP_BUFSIZE defines the maximum size of buffer which can be handled by the DLM application layer. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-06-02fs: dlm: introduce proto valuesAlexander Aring1-4/+19
Currently the dlm protocol values are that TCP is 0 and everything else is SCTP. This makes it difficult to introduce possible other transport layers. The only one user space tool dlm_controld, which I am aware of, handles the protocol value 1 for SCTP. We change it now to handle SCTP as 1, this will break user space API but it will fix it so we can add possible other transport layers. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-06-02fs: dlm: move dlm allow connAlexander Aring1-4/+3
This patch checks if possible allowing new connections is allowed before queueing the listen socket to accept new connections. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-06-02fs: dlm: use alloc_ordered_workqueueAlexander Aring1-4/+2
The proper way to allocate ordered workqueues is to use alloc_ordered_workqueue() function. The current way implies an ordered workqueue which is also required by dlm. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-06-02fs: dlm: fix lowcomms_start error caseAlexander Aring1-3/+12
This patch fixes the error path handling in lowcomms_start(). We need to cleanup some static allocated data structure and cleanup possible workqueue if these have started. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: don't allow half transmitted messagesAlexander Aring1-35/+60
This patch will clean a dirty page buffer if a reconnect occurs. If a page buffer was half transmitted we cannot start inside the middle of a dlm message if a node connects again. I observed invalid length receptions errors and was guessing that this behaviour occurs, after this patch I never saw an invalid message length again. This patch might drops more messages for dlm version 3.1 but 3.1 can't deal with half messages as well, for 3.2 it might trigger more re-transmissions but will not leave dlm in a broken state. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: add reliable connection if reconnectAlexander Aring1-1/+3
This patch introduce to make a tcp lowcomms connection reliable even if reconnects occurs. This is done by an application layer re-transmission handling and sequence numbers in dlm protocols. There are three new dlm commands: DLM_OPTS: This will encapsulate an existing dlm message (and rcom message if they don't have an own application side re-transmission handling). As optional handling additional tlv's (type length fields) can be appended. This can be for example a sequence number field. However because in DLM_OPTS the lockspace field is unused and a sequence number is a mandatory field it isn't made as a tlv and we put the sequence number inside the lockspace id. The possibility to add optional options are still there for future purposes. DLM_ACK: Just a dlm header to acknowledge the receive of a DLM_OPTS message to it's sender. DLM_FIN: This provides a 4 way handshake for connection termination inclusive support for half-closed connections. It's provided on application layer because SCTP doesn't support half-closed sockets, the shutdown() call can interrupted by e.g. TCP resets itself and a hard logic to implement it because the othercon paradigm in lowcomms. The 4-way termination handshake also solve problems to synchronize peer EOF arrival and that the cluster manager removes the peer in the node membership handling of DLM. In some cases messages can be still transmitted in this time and we need to wait for the node membership event. To provide a reliable connection the node will retransmit all unacknowledges message to it's peer on reconnect. The receiver will then filtering out the next received message and drop all messages which are duplicates. As RCOM_STATUS and RCOM_NAMES messages are the first messages which are exchanged and they have they own re-transmission handling, there exists logic that these messages must be first. If these messages arrives we store the dlm version field. This handling is on DLM 3.1 and after this patch 3.2 the same. A backwards compatibility handling has been added which seems to work on tests without tcpkill, however it's not recommended to use DLM 3.1 and 3.2 at the same time, because DLM 3.2 tries to fix long term bugs in the DLM protocol. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: move out some hash functionalityAlexander Aring1-9/+0
This patch moves out some lowcomms hash functionality into lowcomms header to provide them to other layers like midcomms as well. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: add functionality to re-transmit a messageAlexander Aring1-19/+66
This patch introduces a retransmit functionality for a lowcomms message handle. It's just allocates a new buffer and transmit it again, no special handling about prioritize it because keeping bytestream in order. To avoid another connection look some refactor was done to make a new buffer allocation with a preexisting connection pointer. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: make buffer handling per msgAlexander Aring1-16/+85
This patch makes the void pointer handle for lowcomms functionality per message and not per page allocation entry. A refcount handling for the handle was added to keep the message alive until the user doesn't need it anymore. There exists now a per message callback which will be called when allocating a new buffer. This callback will be guaranteed to be called according the order of the sending buffer, which can be used that the caller increments a sequence number for the dlm message handle. For transition process we cast the dlm_mhandle to dlm_msg and vice versa until the midcomms layer will implement a specific dlm_mhandle structure. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: fix connection tcp EOF handlingAlexander Aring1-5/+43
This patch fixes the EOF handling for TCP that if and EOF is received we will close the socket next time the writequeue runs empty. This is a half-closed socket functionality which doesn't exists in SCTP. The midcomms layer will do a half closed socket functionality on DLM side to solve this problem for the SCTP case. However there is still the last ack flying around but other reset functionality will take care of it if it got lost. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: cancel work sync otherconAlexander Aring1-1/+1
These rx tx flags arguments are for signaling close_connection() from which worker they are called. Obviously the receive worker cannot cancel itself and vice versa for swork. For the othercon the receive worker should only be used, however to avoid deadlocks we should pass the same flags as the original close_connection() was called. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: reconnect if socket error report occursAlexander Aring1-21/+39
This patch will change the reconnect handling that if an error occurs if a socket error callback is occurred. This will also handle reconnects in a non blocking connecting case which is currently missing. If error ECONNREFUSED is reported we delay the reconnect by one second. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: set is othercon flagAlexander Aring1-0/+3
There is a is othercon flag which is never used, this patch will set it and printout a warning if the othercon ever sends a dlm message which should never be the case. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: fix srcu read lock usageAlexander Aring1-23/+52
This patch holds the srcu connection read lock in cases where we lookup the connections and accessing it. We don't hold the srcu lock in workers function where the scheduled worker is part of the connection itself. The connection should not be freed if any worker is scheduled or pending. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-05-25fs: dlm: add dlm macros for ratelimit logAlexander Aring1-2/+2
This patch add ratelimit macro to dlm subsystem and will set the connecting log message to ratelimit. In non blocking connecting cases it will print out this message a lot. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-29fs: dlm: fix missing unlock on error in accept_from_sock()Yang Yingliang1-0/+1
Add the missing unlock before return from accept_from_sock() in the error handling case. Fixes: 6cde210a9758 ("fs: dlm: add helper for init connection") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: add shutdown hookAlexander Aring1-19/+23
This patch fixes issues which occurs when dlm lowcomms synchronize their workqueues but dlm application layer already released the lockspace. In such cases messages like: dlm: gfs2: release_lockspace final free dlm: invalid lockspace 3841231384 from 1 cmd 1 type 11 are printed on the kernel log. This patch is solving this issue by introducing a new "shutdown" hook before calling "stop" hook when the lockspace is going to be released finally. This should pretend any dlm messages sitting in the workqueues during or after lockspace removal. It's necessary to call dlm_scand_stop() as I instrumented dlm_lowcomms_get_buffer() code to report a warning after it's called after dlm_midcomms_shutdown() functionality, see below: WARNING: CPU: 1 PID: 3794 at fs/dlm/midcomms.c:1003 dlm_midcomms_get_buffer+0x167/0x180 Modules linked in: joydev iTCO_wdt intel_pmc_bxt iTCO_vendor_support drm_ttm_helper ttm pcspkr serio_raw i2c_i801 i2c_smbus drm_kms_helper virtio_scsi lpc_ich virtio_balloon virtio_console xhci_pci xhci_pci_renesas cec qemu_fw_cfg drm [last unloaded: qxl] CPU: 1 PID: 3794 Comm: dlm_scand Tainted: G W 5.11.0+ #26 Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 RIP: 0010:dlm_midcomms_get_buffer+0x167/0x180 Code: 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b 45 31 e4 5b 5d 4c 89 e0 41 5c 41 5d 41 5e 41 5f c3 4c 89 e7 45 31 e4 e8 3b f1 ec ff eb 86 <0f> 0b 4c 89 e7 45 31 e4 e8 2c f1 ec ff e9 74 ff ff ff 0f 1f 80 00 RSP: 0018:ffffa81503f8fe60 EFLAGS: 00010202 RAX: 0000000000000008 RBX: ffff8f969827f200 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffffad1e89a0 RDI: ffff8f96a5294160 RBP: 0000000000000001 R08: 0000000000000000 R09: ffff8f96a250bc60 R10: 00000000000045d3 R11: 0000000000000000 R12: ffff8f96a250bc60 R13: ffffa81503f8fec8 R14: 0000000000000070 R15: 0000000000000c40 FS: 0000000000000000(0000) GS:ffff8f96fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055aa3351c000 CR3: 000000010bf22000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: dlm_scan_rsbs+0x420/0x670 ? dlm_uevent+0x20/0x20 dlm_scand+0xbf/0xe0 kthread+0x13a/0x150 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 To synchronize all dlm scand messages we stop it right before shutdown hook. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: flush swork on shutdownAlexander Aring1-4/+1
This patch fixes the flushing of send work before shutdown. The function cancel_work_sync() is not the right workqueue functionality to use here as it would cancel the work if the work queues itself. In cases of EAGAIN in send() for dlm message we need to be sure that everything is send out before. The function flush_work() will ensure that every send work is be done inclusive in EAGAIN cases. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: simplify writequeue handlingAlexander Aring1-40/+43
This patch cleans up the current dlm sending allocator handling by using some named macros, list functionality and removes some goto statements. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: use GFP_ZERO for page bufferAlexander Aring1-1/+1
This patch uses GFP_ZERO for allocate a page for the internal dlm sending buffer allocator instead of calling memset zero after every allocation. An already allocated space will never be reused again. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: change allocation limitsAlexander Aring1-2/+4
While running tcpkill I experienced invalid header length values while receiving to check that a node doesn't try to send a invalid dlm message we also check on applications minimum allocation limit. Also use DEFAULT_BUFFER_SIZE as maximum allocation limit. The define LOWCOMMS_MAX_TX_BUFFER_LEN is to calculate maximum buffer limits on application layer, future midcomms layer will subtract their needs from this define. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: add check if dlm is currently runningAlexander Aring1-1/+1
This patch adds checks for dlm config attributes regarding to protocol parameters as it makes only sense to change them when dlm is not running. It also adds a check for valid protocol specifiers and return invalid argument if they are not supported. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: set subclass for othercon sock_mutexAlexander Aring1-1/+2
This patch sets the lockdep subclass for the othercon socket mutex. In various places the connection socket mutex is held while locking the othercon socket mutex. This patch will remove lockdep warnings when such case occurs. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: set connected bit after acceptAlexander Aring1-0/+1
This patch sets the CF_CONNECTED bit when dlm accepts a connection from another node. If we don't set this bit, next time if the connection socket gets writable it will assume an event that the connection is successfully connected. However that is only the case when the connection did a connect. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-03-09fs: dlm: fix mark setting deadlockAlexander Aring1-15/+34
This patch fixes an deadlock issue when dlm_lowcomms_close() is called. When dlm_lowcomms_close() is called the clusters_root.subsys.su_mutex is held to remove configfs items. At this time we flushing (e.g. cancel_work_sync()) the workers of send and recv workqueue. Due the fact that we accessing configfs items (mark values), these workers will lock clusters_root.subsys.su_mutex as well which are already hold by dlm_lowcomms_close() and ends in a deadlock situation. [67170.703046] ====================================================== [67170.703965] WARNING: possible circular locking dependency detected [67170.704758] 5.11.0-rc4+ #22 Tainted: G W [67170.705433] ------------------------------------------------------ [67170.706228] dlm_controld/280 is trying to acquire lock: [67170.706915] ffff9f2f475a6948 ((wq_completion)dlm_recv){+.+.}-{0:0}, at: __flush_work+0x203/0x4c0 [67170.708026] but task is already holding lock: [67170.708758] ffffffffa132f878 (&clusters_root.subsys.su_mutex){+.+.}-{3:3}, at: configfs_rmdir+0x29b/0x310 [67170.710016] which lock already depends on the new lock. The new behaviour adds the mark value to the node address configuration which doesn't require to held the clusters_root.subsys.su_mutex by accessing mark values in a separate datastructure. However the mark values can be set now only after a node address was set which is the case when the user is using dlm_controld. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>