This commit introduces logic for using a new X11 window for each
incoming transfer, rather than having a global window for each selection
source.
This eliminates a whole class of bugs involving multiple concurrent
incoming transfers.
For now, we retain the outgoing transfer queue, and the selection
source-specific windows to support it. Source-specific windows are no
longer used in the incoming path, and will be removed in a future PR.
Refs #1497.
Previously, Xwayland could restart, and we'd get events for transfers
pointing to the previous (now freed) xwm instance. This led to
use-after-free segfaults.
Closes#2565.
Previously, wlr_xwm_selection_transfer.source_fd meant:
- the source of data in a Wayland -> X11 copy (good)
- the destination of data in a X11 -> Wayland copy (confusing)
This made reading through xwayland/selection/incoming.c difficult: in
many places, "source" actually means "destination".
We already mostly did this, but there were a couple of branches
(`calloc` failures) where we'd bail without letting the other side know.
Refs swaywm/sway#4007. Likely not going to be a real improvement there
(if `calloc` fails you're already pretty screwed), but it does address a
theoretical possibility.
It seems that if we ever try to reply to a selection request after
another has been sent by the same requestor (we reply in FIFO order),
the requestor never reads from it, and we end up stalling forever on a
transfer that will never complete.
It appears that `XCB_SELECTION_REQUEST` has some sort of singleton
semantics, and new requests for the same selection are meant to replace
outstanding older ones. I couldn't find a reference for this, but
empirically this does seem to be the case.
Real (contrived) case where we don't currently do this, and things break:
* run fcitx
* run Slack
* wl-copy < <(base64 /opt/firefox/libxul.so) # or some other large file
* focus Slack (no need to paste)
fcitx will send in an `XCB_SELECTION_REQUEST`, and we'll start
processing it. Immediately after, Slack sends its own. fcitx hangs for a
long, long time. In the meantime, Slack retries and sends another
selection request. We now have two pending requests from Slack.
Eventually fcitx gives up (or it can be `pkill`'d), and we start
processing the first request Slack gave us (FIFO). Slack (Electron?)
isn't listening on the other end anymore, and this transfer never
completes. The X11 clipboard becomes unusable until Slack is killed.
After this patch, the clipboard is immediately usable again after fcitx
bails. Also added a bunch of debug-level logging that makes diagnosing
this sort of issue easier.
Refs swaywm/sway#4007.
Fixes#2425.
wlroots can only handle one outgoing transfer at a time, so it keeps a
list of pending selections. The head of the list is the currently-active
selection, and when that transfer completes and is destroyed, the next
one is started.
The trouble is when you have a transfer to some app that is misbehaving.
fcitx is one such application. With really large transfers, fcitx will
hang and never wake up again. So, you can end up with a transfer list
that looks like this:
| T1: started | T2: pending | T3: pending | T4: pending |
The file descriptor for transfer T1 is registered in libwayland's epoll
loop. The rest are waiting in wlroots' list.
As a user, you want your clipboard back, so you `pkill fcitx`. Now
Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean
up T4 first.
Due to a bug in wlroots code, we register the (fd, transfer data
pointer) pair for T1 with libwayland *again*, despite it already being
registered. We do this 2 more times as we remove T3 and T2.
Finally, we remove T1 and `free` all the memory associated with it,
before `close`-ing its transfer file descriptor.
However, we still have 3 copies of T1's file descriptor left in the
epoll loop, since we erroneously added them as part of removing T2/3/4.
When we `close` the file descriptor as part of T1's teardown, we
actually cause the epoll loop to wake up the next time around, saying
"this file descriptor has activity!" (it was closed, so `read`-ing would
normally return 0 to let us know of EOF).
But instead of returning 0, it returns -1 with `EBADF`, because the file
descriptor has already been closed. And finally, as part of error-handling
this, we access the transfer pointer, which was `free`'d. And we crash.
This one was awful to track down, but calls to `wlr_log` with %m have
the errno masked by the `isatty` call in `log_stderr`. Switch them to
`wlr_log_errno` instead.
Cue quality "how can read(2) POSSIBLY be returning ENOTTY?" moments.
This is a common interface that can be used for all primary selection
protocols, as discussed in [1]. A new function wlr_seat_set_primary_selection
is added to set the primary selection for all protocols.
The seat now owns again the source, and resets the selection to NULL when
destroyed.
[1]: https://github.com/swaywm/wlroots/issues/1367#issuecomment-442403454
This unbreaks the build on armhf that otherwise fails like
../xwayland/selection/incoming.c: In function 'xwm_data_source_write':
../include/wlr/util/log.h:34:17: error: format '%ld' expects argument of type 'long int', but argument 6 has type 'ssize_t {aka int}' [-Werror=format=]
_wlr_log(verb, "[%s:%d] " fmt, wlr_strip_path(__FILE__), __LINE__, ##__VA_ARGS__)
^
../xwayland/selection/incoming.c:34:2: note: in expansion of macro 'wlr_log'
wlr_log(L_DEBUG, "wrote %zd (chunk size %ld) of %d bytes",
^~~~~~~
../xwayland/selection/incoming.c:34:44: note: format string is defined here
wlr_log(L_DEBUG, "wrote %zd (chunk size %ld) of %d bytes",
~~^
%d