! Please note that this is a snapshot of our old Bugzilla server, which is read only since May 29, 2020. Please go to gitlab.xfce.org for our new server !
taskbar crashes when window title contains incorrect UTF-8
Status: RESOLVED WORKSFORME
Product: Xfce4-panel
Component: Window Buttons

Comments

Description flammie 2004-05-22 13:15:40 CEST
Some applications that handle files (most often jEdit) do not check whether
the name of an opened file is valid UTF-8 when a UTF-8 locale is in use. When
they display such a filename in the application title bar, xftaskbar crashes.
Other applications also copy invalid filenames into their titles without
crashing the taskbar; I believe this is because those applications check
validity beforehand.

Distribution is Gentoo. Xfce version is 4.0.5. Locale is fi_FI.UTF-8.
Comment 1 flammie 2004-05-22 13:15:42 CEST
Additional information:

I suggest adding simple checks for character-encoding validity wherever such
data is handled, much as GNOME calls its UTF-8 validation functions
everywhere.
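
As a rough illustration of the kind of check suggested here, the sketch below
tests whether a raw filename is strict UTF-8 before it is used as a title. It
is written in Python for brevity; in C code built on GLib the corresponding
call would be g_utf8_validate(). The function name is_valid_utf8 and the
sample bytes are hypothetical and are not taken from this report.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True only if raw decodes as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical filenames: the first is UTF-8, the second is Latin-1,
# where 0xE9 looks like the start of a three-byte UTF-8 sequence.
good_name = "caf\u00e9.txt".encode("utf-8")
bad_name = b"caf\xe9.txt"

print(is_valid_utf8(good_name))
print(is_valid_utf8(bad_name))
```

A caller would run this check on the raw bytes first and only pass the string
on to text-rendering code when it returns True.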
Comment 2 Olivier Fourdan 2004-05-23 19:39:51 CEST
do you have a backtrace?
Comment 3 flammie 2004-05-24 00:15:46 CEST
Is there a simple way to get a proper backtrace? The thing has been compiled
using Gentoo's Portage, apparently without debug information, so gdb won't
help much. Strace would indicate that the problem lies in pango-hangul-fc.so,
which I think has been an open bug for quite some time and most projects have
somehow patched around it.

Digging further into the problem, it would seem that the bug is the same as
one reported in GNOME's Bugzilla: http://bugs.gnome.org/show_bug.cgi?id=138446,
and it might actually relate to some specific broken sequences: those which
fall in the Hangul Jamo range.
Comment 4 Brian J. Tarricone (not reading bugmail) 2004-05-24 03:29:50 CEST
well, you could turn off binary/library stripping in portage and recompile.

either way, a stripped binary should give a semi-useful stacktrace, at least it
should have function names even if the line numbers won't be there.
Comment 5 flammie 2004-05-24 18:30:31 CEST
So, this would be sufficient:
(gdb) backtrace
#0 0x40c9f443 in ?? () from /usr/lib/pango/1.4.0/modules/pango-hangul-fc.so
#1 0x080ca378 in ?? ()
#2 0x0000ffc3 in ?? ()
#3 0xbfffbee8 in ?? ()
#4 0x4067d258 in g_utf8_strlen () from /usr/lib/libglib-2.0.so.0
#5 0x40c9f9c4 in ?? () from /usr/lib/pango/1.4.0/modules/pango-hangul-fc.so
#6 0x080ca378 in ?? ()
#7 0xbfffbf30 in ?? ()
#8 0x00000003 in ?? ()
#9 0x080f64b8 in ?? ()
#10 0xbfffbf68 in ?? ()

Right?
Comment 6 Brian J. Tarricone (not reading bugmail) 2004-05-24 19:02:21 CEST
no need for sarcasm - sometimes they're useful, sometimes not. this is
obviously the latter case.
Comment 7 flammie 2004-05-24 19:25:25 CEST
No, sorry if it came across as a bit harsh; that wasn't intended. It does give
an impression of how g_utf8_strlen() might also handle the string in an unsafe
way, which suggests that the string must be validated before any other GLib
function is called.
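
To make the concern concrete, here is a toy model (in Python, not GLib's
actual code) of a length counter that trusts each lead byte to say how long
its sequence is. The names seq_len and naive_walk and the sample bytes are
hypothetical. Whether g_utf8_strlen() in this GLib version behaves this way is
an assumption, not something established in this report.

```python
def seq_len(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte (no validation)."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1  # stray continuation byte or invalid lead byte


def naive_walk(buf: bytes) -> tuple:
    """Return (characters counted, bytes the cursor stepped past the end)."""
    pos = 0
    count = 0
    while pos < len(buf):
        pos += seq_len(buf[pos])  # trusts the lead byte blindly
        count += 1
    return count, pos - len(buf)


# Valid UTF-8: the cursor stops exactly at the end of the buffer.
print(naive_walk("caf\u00e9".encode("utf-8")))
# Latin-1 bytes: 0xE9 claims a three-byte sequence, so the cursor overshoots.
print(naive_walk(b"caf\xe9"))
```

In this model a nonzero second value means the cursor ended up beyond the
buffer; in C the equivalent step would read memory after the string.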
Comment 8 Olivier Fourdan 2004-05-24 19:34:31 CEST
Frankly, it looks like a bug in pango or glib to me.
Comment 9 Olivier Fourdan 2004-05-25 19:14:25 CEST
can you tar an offending file (so I can get the exact sequence of characters
that cause the crash) and attach it to this report?

TIA
Olivier.
Comment 10 flammie 2004-05-25 19:30:48 CEST
Attached file contains one file from my java project which causes the crash,
the supposed name of file is EiJ
Comment 11 Olivier Fourdan 2004-05-25 20:17:08 CEST
The crash doesn't occur here, all I get is "?" in place of the accented
characters.

I did try with LANG set to fi_FI.UTF-8 and "jedit 4.1final". I've also tried
with LANG set to C and also fi_FI but I could not reproduce the crash in any
case.
Comment 12 flammie 2004-05-25 22:11:14 CEST
That's odd. Any further ideas for investigating the issue?

For what it's worth, the broken characters are actually displayed in the
program title bar as a sequence of two zero characters. Because character
zero has no glyph (it is non-printable, of course), each one is replaced with
a rectangle containing four 0 digits. I would assume that with a different
font system and settings the replacement character is a question mark or an
empty rectangle.

The fact that it somehow gets parsed as zeros might also mean that tar won't
capture it correctly, right?
Comment 13 Olivier Fourdan 2004-06-19 13:12:22 CEST
Dunno. I really did my best to reproduce that problem w/o success.
Comment 14 Jasper Huijsmans 2004-07-15 07:40:18 CEST
I don't know, do we leave this open? Can it be assigned to someone?
Comment 15 Brian J. Tarricone (not reading bugmail) 2004-08-19 18:15:32 CEST
it occurs to me that gtk should be doing its own validation before it tries to
set the text on a label (for instance). for example, if i try to set a label to
a string that contains invalid utf8, i get a message printed to stderr from
pango that it can't validate the utf8 in the string. no crash occurs. it might
not be a bad idea for xftaskbar4 to validate the utf8 itself, and then perhaps
handle non-validating strings in some special way, but i don't really think
this is our bug.
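
One possible "special way" to handle non-validating strings, sketched in
Python under the assumption that showing a replacement character is
acceptable: decode leniently so that each invalid byte becomes U+FFFD. The
function name sanitize_title and the sample bytes are hypothetical; this is
not a description of what xftaskbar4 does.

```python
def sanitize_title(raw: bytes) -> str:
    """Decode a window title, replacing invalid UTF-8 with U+FFFD."""
    # errors="replace" never raises; malformed bytes become U+FFFD.
    return raw.decode("utf-8", errors="replace")


# Hypothetical title bytes: a Latin-1 filename seen under a UTF-8 locale.
title = sanitize_title(b"caf\xe9.txt")
print(title)
print("\ufffd" in title)
```

The result is always a valid string, so it can be handed to label-setting
code without a separate validation step, at the cost of losing the original
bytes.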
Comment 16 Jasper Huijsmans 2004-09-20 19:23:10 CEST
Setting this to WORKSFORME, since we seem to be unable to reproduce. Please
reopen if you have more information, or if other people can reproduce it.
Comment 17 Brian J. Tarricone (not reading bugmail) 2004-10-12 22:25:32 CEST
mass reassign from zz-do-not-use to general, so i can remove the zz-do-not-use
component.  sorry for the spam, search for this string to filter these:
fis7cldoq35p3kjdu74emc

Bug #202

Reported by: flammie
Reported on: 2004-05-22
Last modified on: 2009-07-14

People

Assignee: Nick Schermer
CC List: 0 users

Attachments

0000202-BrokenUTFSample.tar (10.00 KB, application/x-tar)
2004-05-25 19:28 CEST , flammie
no flags
