Created attachment 7024
Illustration of terminal window when locked up, and how it got there

When using xfce4-terminal, I normally work in an environment with the ISO 8859-1 (Latin-1) character encoding and therefore have Preferences -> Advanced -> Encoding set to "Default (ISO-8859-1)". Until recently I was using Fedora 23 Linux with xfce4-terminal 0.6.3, but have recently changed to Fedora 25 with xfce4-terminal 0.8.4, or to be exact, Fedora RPM "xfce4-terminal-0.8.4-1.fc25.x86_64".

Although I prefer to work in Latin-1, I sometimes come across text that's in UTF-8 (for example, in incoming email). Naturally, I don't expect this to display correctly, but I expect the terminal to keep working. But since I changed to Fedora 25, I find that sometimes it doesn't. (And if I change xfce4-terminal's settings so it expects UTF-8, then it works. So on those grounds I'm filing this as an xfce4-terminal bug.)

The really weird part is that the effect not only depends on what characters are involved, but also on how they are being written to the terminal.

Please refer to the attached screenshot "shot.png". The file "ouch" is one line from an incoming email message, which contains UTF-8 directional single quotes. If I "cat" the file in my usual environment, it displays those characters unhelpfully but does not lock anything up. If I use "Terminal -> Set Encoding" to change the encoding to UTF-8 and then "cat" the file, the directional quotes are there. Now, with the encoding on UTF-8, I go into "script" and repeat the "cat" command. It works in the same way. If I switch the encoding back to 8859-1 and "cat" the file again, it displays as before. All as expected.

But now look: I go into "script" and repeat the "cat" command one more time, and this time my terminal window locks up. The green blob you see is my terminal cursor. "Terminal -> Reset" does not unlock it. "Terminal -> Clear Scrollback and Reset" does not unlock it.

What DOES unlock it is if I open another window, use "ps" to find the PID of the relevant process -- in this case, the "sh -i" command invoked by "script" -- and kill that process. If I do that, the output resumes as if nothing happened.

Besides "script", the other command I've found that leads to the same failure mode is my usual mail reader -- a version of mailx running on a BSD UNIX machine. Which, of course, is how I found the problem.

One more piece of information: apparently it's not the full UTF-8 byte sequence that causes this effect, but specifically the octal 230 (hex 98) byte. In 8859-1 this byte is considered a control character but I don't know how it's supposed to be used. So maybe this behavior is a feature, not a bug... but if so, I can't imagine how it could be useful. As I say, in my normal environment that character never used to do anything special.
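For readers who want to see where the hex 98 byte comes from, here is a minimal Python sketch. It assumes the directional single quotes in "ouch" are U+2018 and U+2019 (the report does not name the code points), and the helper names `is_c1` and `describe` are invented for this illustration. In ECMA-48 the byte hex 98 is the C1 control SOS (Start of String), which opens a string that runs until a String Terminator arrives.

```python
# Assumption: the "directional single quotes" are U+2018 and U+2019.
LEFT_QUOTE = "\u2018"
RIGHT_QUOTE = "\u2019"


def is_c1(byte):
    """True for bytes in the C1 control range 0x80-0x9F."""
    return 0x80 <= byte <= 0x9F


def describe(text):
    """Return (hex, octal, is_c1) for each byte of the UTF-8 encoding of text."""
    return [(f"{b:02x}", f"{b:03o}", is_c1(b)) for b in text.encode("utf-8")]


if __name__ == "__main__":
    for name, ch in (("left quote", LEFT_QUOTE), ("right quote", RIGHT_QUOTE)):
        # Each tuple is one byte as a Latin-1 terminal would receive it.
        print(name, describe(ch))
```

Under that assumption the last byte of the left quote is hex 98 (octal 230), which a terminal decoding the stream as ISO 8859-1 sees as a control character rather than as part of a printable glyph.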
Is this by any chance the same as https://bugzilla.gnome.org/show_bug.cgi?id=777733 https://bugzilla.gnome.org/show_bug.cgi?id=737792 ?
Egmont, thanks for stepping in! To me this one really seems related to vte. Mark, do you think the bugs that Egmont mentioned are similar to yours?
Looking at https://bugzilla.gnome.org/show_bug.cgi?id=777733, it's clearly related. Two similar features stand out: [1] it was apparently triggered by a single C1 control character (though in his example this was hex 90; in mine it was 98) and [2] terminating the process that was outputting the characters ended the problem. There was mention of trying different terminal emulators. I didn't have any others on my machine, but just now I tried installing xterm. I first tried just starting it with "LC_ALL=en_US.iso88591 xterm" and tried the same example as in my illustration, and it did not lock up, but it also omitted the words "Or So It Seemed" that were between the hex 98 and hex 99 characters! I then used the Shift-Right-Click menu to set "UTF-8 encoding" and the file displayed properly with directional quotes. But I then used Shift-Right-Click again and turned *off* "UTF-8 encoding", and now the file displayed the way it did in xfce4-terminal! Anyway, I could not make xterm lock up by catting that file inside "script", but I *could* by making an ssh connection to the UNIX machine and running my mail reader. So in short, xterm behaves differently from xfce4-terminal, but neither one seems to handle this situation correctly.
As for https://bugzilla.gnome.org/show_bug.cgi?id=737792, I downloaded the file "localtime" and tried "cat localtime" inside my xfce4-terminal -- and it did not lock up. I then started "script" and did "cat localtime" again -- and this time it *did* lock up. As before, killing the "sh -i" that script had started was sufficient to unlock things. But this time, changing the encoding in xfce4-terminal between 8859-1 and UTF-8 had no effect on the behavior. I find this all mystifying; I know nothing about what goes on behind the scenes in terminal emulators.
I've also taken a closer look at your issue since I made my previous comment, and I'm sure these are essentially the same.

The exact details are indeed confusing, to the extent that I myself don't know exactly how xterm and vte behave in all these possible circumstances and configurations; I have to look things up or experiment.

The core of the problem, in your case, is that you mix two encodings, and something that is a printable character in one is a control character in the other, waiting for its terminating sequence. As such, technically ...

> but I expect the terminal to keep working

... the terminal emulator keeps working, it has just entered a special mode. What I believe you actually expect is for the emulator not to switch to this special mode, and this is a false expectation from you if the emulator actually encounters these bytes.

VTE could add an API for disallowing C1 control characters, and then frontends (incl. xfce4-terminal) could make a checkbox for it. However VTE tends to be an emulator that does not implement and expose such kinds of various possible behaviors, unless really-really required. Probably xterm has such an option. You might have success with layers like screen or tmux as well, not sure.

> Although I prefer to work in Latin-1

The Linux world switched to UTF-8 as the default about 10 years ago, for a good reason. Some new terminal emulators don't even support other encodings. Of course you're free to go against the wind, but it doesn't sound like a wise decision to me (without knowing your environment of course) and I would recommend that you switch to UTF-8 as soon as possible.

> I sometimes come across text that's in UTF-8 (for example, in incoming email)
> [...] a version of mailx running on a BSD UNIX machine

Email clients are responsible for decoding the mail according to its MIME type (character set, transfer encoding etc.) and converting to the terminal emulator's charset, so you end up seeing the proper symbols as intended by the sender of the email. Such apps are responsible for not sending out bytes that screw with the terminal emulator. I'm not aware of mailx, maybe that's an ancient piece of crap^H^H^H^Hsoftware not knowing how to handle various character sets. Of course low-level debugging like grepping is a different story, but that shouldn't be the normal usage.

> I sometimes come across text that's in UTF-8 (for example, in incoming email).
> Naturally, I don't expect this to display correctly

This is soooo wrong! You _should_ expect all your emails to display correctly, no matter what charset they are encoded in. If your environment cannot do this, you should upgrade or switch to a modern one that can do it for you. (And having a UTF-8 terminal emulator is a prerequisite for this if you actually care about seeing out-of-latin1 glyphs.)

> So in short, xterm behaves differently from xfce4-terminal, but neither one
> seems to handle this situation correctly.

Not sure what you expect as the correct behavior. Indeed xterm differs from VTE and konsole; apparently even the authors of these disagree on the desired behavior, especially with UTF-8 and C1.
First, I've suddenly realized why the behavior of "cat ouch" in the shell is different from "cat ouch" inside "script" -- this is one of the things that was really bothering me. It's because, when this bad state occurs, it goes away when the process that wrote the C1 control character terminates. In the one case, that process is "cat", so it terminates almost instantly; in the other case, it's the shell started by "script".

> The core of the problem, in your case, is that you mix two encodings,
> and something that is a printable character in one is a control character
> in the other, waiting for its terminating sequence.

As I see it, mixing two encodings is the way that this accident happened *this* time, and is the reason why it depended on the encoding that I set in xfce4-terminal. But the same accident could still happen at any time that binary output was accidentally directed to the terminal -- as in the "cat localtime" case. If I had been working in UTF-8, the binary output might still contain a C1 control character in UTF-8.

> ... the terminal emulator keeps working, it has just entered a special mode.
> What I believe you actually expect is for the emulator not to switch
> to this special mode, and this is a false expectation from you if the
> emulator actually encounters these bytes.

I would be satisfied if resetting my xfce4-terminal would switch the emulator *out* of the special mode, so I can unstick it without having to kill processes. Is this also an unrealistic expectation? (If so, I think we're done here.)

> VTE could add an API for disallowing C1 control characters, and
> then frontends (incl. xfce4-terminal) could make a checkbox for
> it. However VTE tends to be an emulator that does not implement
> and expose such kinds of various possible behaviors, unless
> really-really required.

Is VTE part of xfce or is it someone else's project that xfce4-terminal depends on?

> Probably xterm has such an option.

In fact the Control-Left-Click menu in xterm has an option "8-bit Controls". I have not explored what it does.

> I'm not aware of mailx, maybe that's an ancient piece of crap...

It is indeed, but I've been using it for enough decades that I stay and put up with the nuisances.

Thanks for your other comments.
(In reply to Mark Brader from comment #6)
> Is VTE part of xfce or is it someone else's project that xfce4-terminal
> depends on?

VTE is part of GNOME. It's a terminal widget being used by multiple terminal apps, such as gnome-terminal, xfce4-terminal, terminix, and others.
(In reply to Mark Brader from comment #6)
> But the same accident could still happen at any time
> that binary output was accidentally directed to the terminal -- as in the
> "cat localtime" case. If I had been working in UTF-8, the binary output
> might still contain a C1 control character in UTF-8.

Yup, and it can also happen with the standard 7-bit C0 control characters. That's how terminal emulators have always worked from the very beginning. They are heavily stateful.

> I would be satisfied if resetting my xfce4-terminal would switch the
> emulator *out* of the special mode, so I can unstick it without having
> to kill processes. Is this also an unrealistic expectation? (If so,
> I think we're done here.)

I think it's a pretty reasonable feature request to discuss. Not sure how easily implementable, and might have downsides as well, but definitely worth investigating. Filed as https://bugzilla.gnome.org/show_bug.cgi?id=779518.

In the meantime, I recommend as a workaround to have a shell prompt that contains some (harmless) escape sequence at its beginning, since the escape character seems to terminate these sequences and hence gets the terminal "unstuck".
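The prompt workaround can be pictured with a toy state machine. This is only a sketch in Python, not how VTE or xterm is implemented: it assumes that after the C1 byte SOS (hex 98) everything is swallowed until either the String Terminator (hex 9C) or an ESC arrives, and it recognises only ESC [ ... style sequences. The function name `visible_bytes` and the sample prompts are invented for the illustration.

```python
ESC = 0x1B  # C0 escape character
SOS = 0x98  # C1 "Start of String" (ECMA-48)
ST = 0x9C   # C1 "String Terminator" (ECMA-48)


def visible_bytes(data):
    """Return the bytes a toy terminal would actually put on screen."""
    state = "ground"
    shown = bytearray()
    for byte in data:
        if state == "ground":
            if byte == SOS:
                state = "string"      # start swallowing input
            elif byte == ESC:
                state = "escape"
            else:
                shown.append(byte)
        elif state == "string":
            if byte == ST:
                state = "ground"
            elif byte == ESC:
                state = "escape"      # assumption: ESC ends the string
            # any other byte is silently consumed
        elif state == "escape":
            state = "csi" if byte == ord("[") else "ground"
        else:  # state == "csi": skip parameters until a final byte
            if 0x40 <= byte <= 0x7E:
                state = "ground"
    return bytes(shown)


if __name__ == "__main__":
    garbage = b"text \x98Or So It Seemed\x99\n"
    # Plain prompt: the prompt and the next command are swallowed too.
    print(visible_bytes(garbage + b"$ " + b"ls\n"))
    # Prompt that starts with a color escape sequence: output resumes.
    print(visible_bytes(garbage + b"\x1b[32m$ \x1b[0m" + b"ls\n"))
```

In this model the plain prompt never reaches the screen because the parser is still inside the string, while the prompt that begins with an escape sequence brings the parser back to its normal state before the "$" is printed.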
> Yup, and it can also happen with the standard 7-bit C0 control
> characters. That's how terminal emulators have always worked from the
> very beginning. They are heavily stateful.

C0 control characters sent unexpectedly *to* the terminal may send the cursor to unexpected places, and escape sequences may do all sorts of things, but locking it up isn't one of them as far as I know, and whether it is or not, in my experience resetting the terminal emulator clears any wrong state.

> Filed as https://bugzilla.gnome.org/show_bug.cgi?id=779518.

Thanks.

> In the mean time, I recommend as a workaround to have a shell prompt
> that contains some (harmless) escape sequence at its beginning, since
> the escape character seems to terminate these sequences and hence get
> the terminal "unstuck".

Huh, so it does. In fact my usual shell prompt *does* contain an escape sequence, for a color change, but I normally use a different prompt when inside a subshell, as in "script", and I turned off the colors when generating the example above. Using the colored prompt inside "script" means that "cat ouch" no longer gets hung. Thanks again.

(And I find that mailx has an option to set the prompt, but, sadly, the version on the UNIX machine apparently strips out escape characters for my own protection. So I can't use the same trick there. Oh well.)
(In reply to Mark Brader from comment #9)
> C0 control characters sent unexpectedly *to* the terminal may send the
> cursor to unexpected places, and escape sequences may do all sorts of
> things, but locking it up isn't one of them as far as I know, and whether
> it is or not, in my experience resetting the terminal emulator clears any
> wrong state.

Every C1 has a C0 equivalent (it's not true the other way around). 0x80 (or, more precisely, in VTE's and Konsole's UTF-8 mode U+0080 encoded in UTF-8 as two bytes) is the same as ESC @, 0x81 is the same as ESC A etc.

You keep talking about "locking" and "wrong state"; nope, technically nothing is locked or is in a wrong state. Things are in a perfectly valid state, waiting for further input as the parameter to an escape sequence. I understand this sucks from the user's point of view.

Resetting shouldn't make any differentiation between C0 and C1 escape sequences. If one "locks" as you call it, so should the other, and vice versa.

> and I turned off the colors when generating the example above

This might have changed the behavior ;)
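The C1-to-ESC correspondence described in this comment is simple arithmetic and can be sketched in a few lines of Python. The function names `c1_to_esc` and `c1_in_utf8` are made up for this illustration; the mapping itself (subtract hex 40 and prefix ESC) follows from the examples given above, 0x80 = ESC @ and 0x81 = ESC A.

```python
ESC = 0x1B


def c1_to_esc(byte):
    """Return the 7-bit two-byte equivalent (ESC + final byte) of a C1 control."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError(f"0x{byte:02x} is not a C1 control byte")
    return bytes([ESC, byte - 0x40])


def c1_in_utf8(byte):
    """Return the two-byte UTF-8 encoding of the C1 control U+0080..U+009F."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError(f"0x{byte:02x} is not a C1 control byte")
    return chr(byte).encode("utf-8")


if __name__ == "__main__":
    for c1 in (0x80, 0x81, 0x90, 0x98):
        print(f"0x{c1:02x}", c1_to_esc(c1), c1_in_utf8(c1))
```

So the hex 98 byte from the original report corresponds to ESC X, and the hex 90 byte from the GNOME bug to ESC P; in a UTF-8 terminal the same controls arrive as two-byte sequences starting with hex C2.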
VTE bug (https://bugzilla.gnome.org/show_bug.cgi?id=779518) has been resolved upstream.