Charset mismatch with non-UTF8
Status:
RESOLVED: FIXED
Product:
Xfce4-terminal
Component:
General

Comments

Description Egmont Koblinger 2016-11-26 22:29:03 CET
Found at https://bugzilla.redhat.com/show_bug.cgi?id=1398749

If started up with a non-UTF-8 locale, the locale's charset (e.g. iso8859-1) is shown as selected in the menus, yet the actual behavior is UTF-8.

VTE defaults to UTF-8 no matter what the locale is, so xfce4-terminal should explicitly set the charset to a different one if that's desired.
Comment 1 Igor editbugs 2016-11-28 09:25:39 CET
Egmont, thanks for the report.

Would it be a mistake to say that gnome-terminal also has this issue?
I.e., when started with a non-UTF8 locale (I'm testing it with Cyrillic locales, such as KOI8-R), it does not pass the setting to vte, which results in Cyrillic characters being displayed incorrectly.
Comment 2 Egmont Koblinger 2016-11-28 09:39:06 CET
Short: You're right, gnome-terminal isn't bug-free here either.

Long: We (in g-t) had some discussions about whether the default charset should match the locale or the profile prefs (in the former case this profile pref wouldn't make sense). I can't find this conversation right now, and I'm not sure whether it happened in Bugzilla or in private e-mail.

There's also https://bugzilla.gnome.org/show_bug.cgi?id=757457.

Even worse, g-t nowadays refuses to start up with a non-UTF8 charset, for reasons I failed to understand, and I disagree with the maintainer about whether this restriction is okay (IMO it is not).

Indeed, a mismatch between the locale settings and the actual charset definitely leads to apps working incorrectly.

However, there might be a chicken-egg problem in case the user wants to fix his charset from the shell startup files (.profile) executed within the terminal after the terminal starts up. Hence others might want to specify their charset in the profile prefs rather than via locales.

However, this bugreport (just as the g-t one linked above) is about the terminal emulator itself, regardless of locale settings: the actual charset is not the same as reported in the menus. These two should definitely match.

I think this boils down to the false assumption from xfce4-terminal that vte's default charset matches the locale; no, it always defaults to UTF-8 and needs to be set to the desired value via the corresponding API call.
Comment 3 Igor editbugs 2016-11-28 10:06:20 CET
(In reply to Egmont Koblinger from comment #2)
> Indeed a mismatching locale settings and actual charset definitely leads to
> apps working incorrectly.
> 
> However, there might be a chicken-egg problem in case the user wants to fix
> his charset from the shell startup files (.profile) executed within the
> terminal after the terminal starts up. Hence others might want to specify
> their charset in the profile prefs rather than via locales.

I'm afraid that setting the charset from the profile file has to go together with setting it in the terminal prefs.

> However, this bugreport (just as the g-t one linked above) is about the
> terminal emulator itself, regardless of locale settings: the actual charset
> is not the same as reported in the menus. These two should definitely match.
> 
> I think this boils down to the false assumption from xfce4-terminal that
> vte's default charset matches the locale; no, it always defaults to UTF-8
> and needs to be set to the desired value via the corresponding API call.

Yeah, this assumption dates back to the vte that honored the locale setting; I didn't realize this had changed in vte3.
Comment 4 Egmont Koblinger 2016-11-28 10:08:21 CET
Yes, this could have changed; there were changes around encodings, but I can't remember them exactly.
Comment 5 Igor editbugs 2016-11-28 10:16:00 CET
My best guess at a fix is to read the terminal prefs first and, if they contain no encoding setting, fall back to the locale's charset as obtained by g_get_charset(). That charset is then passed to vte.
This makes it possible to run `LANG=ru_RU.koi8r xfce4-terminal` on a UTF-8 system and get correct behavior, both in how characters are displayed and in which encoding the UI shows.
I hope this also resolves the original bug. The commit is https://git.xfce.org/apps/xfce4-terminal/commit/?id=628c27ccc39c1835b4bc8b4736d189865cd14543
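
The fallback order described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the code from the commit: `choose_encoding` and its parameters are hypothetical names, and the locale charset is simply passed in by the caller (in the C code it would be the value returned by g_get_charset()).

```python
from typing import Optional


def choose_encoding(pref_encoding: Optional[str], locale_charset: str) -> str:
    """Pick the encoding to hand to the terminal widget.

    Hypothetical helper: an explicit preference wins; an unset or empty
    preference falls back to the charset of the current locale.
    """
    if pref_encoding:  # covers both None and the empty string
        return pref_encoding
    return locale_charset


if __name__ == "__main__":
    # No encoding in the prefs, KOI8-R locale: the locale charset is used.
    print(choose_encoding(None, "KOI8-R"))
    # An explicit preference overrides the locale.
    print(choose_encoding("ISO-8859-1", "UTF-8"))
```

Whatever this returns would be both passed to vte and shown as selected in the menu, so the two cannot drift apart.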
Comment 6 Igor editbugs 2017-01-24 09:08:35 CET
Fix included in 0.8.2.
Comment 7 rtc 2017-01-25 13:30:25 CET
This bug is not completely fixed yet. When I run xfce4-terminal with LANG=en_US it seems to work fine at first. But when I tell the shell to run "screen", it switches to UTF8. Under Terminal => Set Encoding, "Default (ISO-8859-1)" is still selected, despite actually being UTF8. I have to manually select Terminal => Set Encoding => Western Europe => ISO-8859-1 to fix it.
Comment 8 Egmont Koblinger 2017-01-25 13:50:16 CET
Does the actual behavior switch to UTF-8?? And does re-selecting the current encoding make a difference??

That sounds weird. Beginning with vte-0.40, it no longer supports any escape sequence switching the encoding (https://bugzilla.gnome.org/show_bug.cgi?id=731208). The only way to switch is via API, that is, via a graphical menu entry.

What's your vte version? (echo $VTE_VERSION)
Comment 9 rtc 2017-01-25 14:03:33 CET
(In reply to Egmont Koblinger from comment #8)
> Does the actual behavior switch to UTF-8?? And does re-selecting the current
> encoding make a difference??

When I run "echo äöü > testfile" inside screen, testfile contains UTF-8 umlaut characters.

Then running "echo äöü > testfile" again inside screen after Terminal => Set Encoding => Western Europe => ISO-8859-1, I get ISO-8859-1 umlauts in testfile.
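
The testfile check above relies on the two encodings producing different bytes for the umlauts. A small Python sketch of that distinction follows; `guess_encoding` is a hypothetical helper, and it assumes the data can only be UTF-8 or ISO-8859-1 (pure ASCII input decodes as valid UTF-8, so it is reported as UTF-8).

```python
def guess_encoding(data: bytes) -> str:
    """Tell UTF-8 from ISO-8859-1 for a short test string.

    Assumption: only these two candidates exist. Any byte string is
    valid ISO-8859-1, so that is used as the fallback.
    """
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "ISO-8859-1"


if __name__ == "__main__":
    text = "äöü\n"
    for codec in ("utf-8", "iso-8859-1"):
        raw = text.encode(codec)
        # Print the raw bytes next to the guess made from them.
        print(codec, raw, guess_encoding(raw))
```

In UTF-8 each umlaut takes two bytes (0xC3 followed by 0xA4, 0xB6 or 0xBC), while in ISO-8859-1 each is a single byte (0xE4, 0xF6, 0xFC), which is not a valid UTF-8 sequence on its own.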

> That sounds weird. Beginning with vte-0.40, it no longer supports any escape
> sequence switching the encoding
> (https://bugzilla.gnome.org/show_bug.cgi?id=731208). The only way to switch
> is via API, that is, via a graphical menu entry.
> 
> What's your vte version? (echo $VTE_VERSION)

% echo $VTE_VERSION
4601
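
For reference, a sketch of how that number can be read. It assumes that VTE_VERSION is encoded as major * 10000 + minor * 100 + micro; `decode_vte_version` is a hypothetical helper name.

```python
from typing import Tuple


def decode_vte_version(value: str) -> Tuple[int, int, int]:
    """Split a VTE_VERSION string into (major, minor, micro).

    Assumption: the value is major * 10000 + minor * 100 + micro.
    """
    n = int(value)
    return n // 10000, (n // 100) % 100, n % 100


if __name__ == "__main__":
    version = decode_vte_version("4601")
    print(version)
    # Tuples compare element by element, so this checks "at least 0.40".
    print(version >= (0, 40, 0))
```

Under that assumption, 4601 corresponds to vte 0.46.1, which is newer than the 0.40 mentioned in comment #8.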
Comment 10 Egmont Koblinger 2017-01-25 14:25:33 CET
And what does it do right after you start up xfce4-terminal (and no "screen" yet)?
Comment 11 rtc 2017-01-25 14:55:48 CET
(In reply to Egmont Koblinger from comment #10)
> And what does it do right after you start up xfce4-terminal (and no "screen"
> yet)?

- Under Terminal => Set Encoding, "Default (ISO-8859-1)" is selected by default, as it should,

- When I run "echo äöü > testfile", testfile contains ISO-8859-1 umlaut characters, as it should.

Before the (incomplete) fix,

- Terminal => Set encoding => Unicode => UTF-8 was selected by default,

- testfile contained UTF-8 umlaut characters after running echo.

However, even before the (incomplete) fix, if I selected Terminal => Set Encoding => Western Europe => ISO-8859-1 to fix the behaviour manually, and then started screen, it switched to UTF-8 and I had to manually select Terminal => Set Encoding => Western Europe => ISO-8859-1 a second time to get things working properly again.
Comment 12 Egmont Koblinger 2017-01-25 15:09:19 CET
Hmm, interesting. I'll take a closer look.
Comment 13 Egmont Koblinger 2017-01-25 15:17:12 CET
Launching screen switches the actual charset from ISO-8859-1 to UTF-8 (also in gnome-terminal). (I've confirmed it with debugging the calls to VteTerminalPrivate::set_encoding.)

Given the aforementioned VTE change, I have no clue how it's done. In caps.cc and vteseq-n.gperf, it's clear that the \e%@ and \e%G escape sequences are no-ops, and indeed they are. So it's something else.

I'll continue debugging VTE later on today.
Comment 14 Egmont Koblinger 2017-01-25 15:37:01 CET
Okay, I got it. It's indeed a VTE bug:
https://bugzilla.gnome.org/show_bug.cgi?id=777747

(screen emits a \e[!p as part of its initialization.)
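
To check which of the sequences discussed in this thread an application emits, its output can be captured and scanned. The sketch below is a hypothetical Python helper (`find_sequences` is not part of any tool mentioned here); the descriptions follow the usual meanings of these sequences: \e%@ and \e%G are the ISO 2022 "select default charset" and "select UTF-8" sequences, and \e[!p is DECSTR, the soft terminal reset.

```python
import re
from typing import List, Tuple

# Escape sequences mentioned in this thread, with their usual meanings.
SEQUENCES = {
    b"\x1b%@": "select default charset (ISO 2022)",
    b"\x1b%G": "select UTF-8 (ISO 2022)",
    b"\x1b[!p": "DECSTR soft terminal reset",
}

# One alternation over all known sequences, with regex metacharacters escaped.
PATTERN = re.compile(b"|".join(re.escape(seq) for seq in SEQUENCES))


def find_sequences(stream: bytes) -> List[Tuple[int, str]]:
    """Return (offset, description) for each known sequence in the stream."""
    return [(m.start(), SEQUENCES[m.group()]) for m in PATTERN.finditer(stream)]


if __name__ == "__main__":
    # Made-up sample bytes, not a real capture of screen's output.
    sample = b"init\x1b[!pmore\x1b%G"
    for offset, description in find_sequences(sample):
        print(offset, description)
```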
Comment 15 Igor editbugs 2017-01-25 17:03:36 CET
(In reply to Egmont Koblinger from comment #14)
> Okay, I got it. It's indeed a VTE bug:
> https://bugzilla.gnome.org/show_bug.cgi?id=777747
> 
> (screen emits a \e[!p as part of its initialization.)

Egmont, thanks for the analysis!

Bug #13054

Reported by:
Egmont Koblinger
Reported on: 2016-11-26
Last modified on: 2017-01-25

People

CC List:
3 users
