Some applications that handle files (most often jEdit) do not check whether the name of an opened file is valid UTF-8 when a UTF-8 locale is in use. When they try to display the filename in the application's title bar, xftaskbar crashes. Some applications do not trigger the crash even though they copy the invalid filename into their title; I believe this is because these applications check validity beforehand. Distribution is Gentoo. XFCE version is 4.0.5. Locale is fi_FI.UTF-8.
Additional information: I suggest adding some trivial checks for character encoding validity everywhere such data is handled, much as GNOME has its UTF-8 validation functions everywhere.
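A minimal sketch of the kind of check suggested above, written as standalone C so it does not depend on any library. The function name is_valid_utf8 is hypothetical and this is not code from xftaskbar; in a GLib-based program the existing g_utf8_validate() function would normally be used instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the len bytes at s are well-formed
 * UTF-8. Rejects stray continuation bytes, truncated sequences, overlong
 * encodings, UTF-16 surrogates and code points above U+10FFFF. */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t n;           /* continuation bytes expected after b */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal for this length */
        size_t k;

        if (b < 0x80) {     /* plain ASCII */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            n = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            n = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            n = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= n)
            return false;   /* sequence runs past the end of the buffer */

        for (k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;   /* overlong, out of range, or surrogate */

        i += n + 1;
    }
    return true;
}
```

A filename written under a Latin-1 locale typically contains lone high bytes such as 0xE4, which this check rejects, while the two-byte UTF-8 form 0xC3 0xA4 passes.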
do you have a backtrace?
Is there a simple way to get a proper backtrace? The thing has been compiled using Gentoo's portage, apparently without debug information, so gdb won't help much. strace would indicate that the problem lies in pango-hangul-fc.so, which I think has been an open bug for quite some time, and most projects have somehow patched around it. Digging further into the problem, it would seem that the bug is the same as one reported in GNOME's Bugzilla: http://bugs.gnome.org/show_bug.cgi?id=138446, and it might actually relate to some specific broken sequences: those which fall in the Hangul Jamo block.
well, you could turn off binary/library stripping in portage and recompile. either way, a stripped binary should give a semi-useful stacktrace, at least it should have function names even if the line numbers won't be there.
So, this would be sufficient:
(gdb) backtrace
#0 0x40c9f443 in ?? () from /usr/lib/pango/1.4.0/modules/pango-hangul-fc.so
#1 0x080ca378 in ?? ()
#2 0x0000ffc3 in ?? ()
#3 0xbfffbee8 in ?? ()
#4 0x4067d258 in g_utf8_strlen () from /usr/lib/libglib-2.0.so.0
#5 0x40c9f9c4 in ?? () from /usr/lib/pango/1.4.0/modules/pango-hangul-fc.so
#6 0x080ca378 in ?? ()
#7 0xbfffbf30 in ?? ()
#8 0x00000003 in ?? ()
#9 0x080f64b8 in ?? ()
#10 0xbfffbf68 in ?? ()
Right?
no need for sarcasm - sometimes they're useful, sometimes not. this is obviously the latter case.
No, sorry if it came out a bit harsh, it wasn't intended. It does give you an idea of how g_utf8_strlen() might also use the string in an offending way, which suggests that you must validate the string before any other glib function gets called.
Frankly, it looks like a bug in pango or glib to me.
can you tar an offending file (so I can get the exact sequence of characters that cause the crash) and attach it to this report? TIA Olivier.
Attached file contains one file from my java project which causes the crash, the supposed name of file is EiJ
The crash doesn't occur here, all I get is "?" in place of the accented characters. I did try with LANG set to fi_FI.UTF-8 and "jedit 4.1final". I've also tried with LANG set to C and also fi_FI but I could not reproduce the crash in any case.
That's odd. Any further ideas for investigating the issue? For what it's worth, the broken characters are actually displayed in the program's title bar as a sequence of two zero characters, except that because character zero has no glyph (it's non-printable, of course) it gets replaced with a rectangle containing four 0 digits. I would assume that with a different font system and settings the replacement character is a question mark or an empty rectangle instead. The fact that it somehow gets parsed as 0's might also mean that tar won't capture it correctly, right?
Dunno. I really did my best to reproduce that problem w/o success.
I don't know, do we leave this open? Can it be assigned to someone?
it occurs to me that gtk should be doing its own validation before it tries to set the text on a label (for instance). for example, if i try to set a label to a string that contains invalid utf8, i get a message printed to stderr from pango that it can't validate the utf8 in the string. no crash occurs. it might not be a bad idea for xftaskbar4 to validate the utf8 itself, and then perhaps handle non-validating strings in some special way, but i don't really think this is our bug.
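One way the "handle non-validating strings in some special way" idea could look, as a rough standalone sketch rather than anything taken from xftaskbar4: copy the window title and replace every byte that is not part of a well-formed UTF-8 sequence with '?'. The names utf8_seq_len and sanitize_title are hypothetical, and a real GLib program would more likely build this on g_utf8_validate().

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the well-formed UTF-8 sequence starting at s, where
 * at most avail bytes may be read. Returns 0 if no valid sequence starts
 * there. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    unsigned long cp, min;
    size_t n, k;

    if (avail == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                       /* ASCII */
    if ((s[0] & 0xE0) == 0xC0)      { n = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; cp = s[0] & 0x07; min = 0x10000; }
    else
        return 0;                       /* invalid lead byte */
    if (avail < n)
        return 0;                       /* truncated sequence */
    for (k = 1; k < n; k++) {
        if ((s[k] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (s[k] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* overlong, out of range, surrogate */
    return n;
}

/* Hypothetical helper: copies len bytes of src into dst, replacing each
 * byte that is not part of a valid UTF-8 sequence with '?'. dst must have
 * room for len + 1 bytes. Returns the number of bytes replaced. */
size_t sanitize_title(const char *src, size_t len, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0, replaced = 0;

    while (i < len) {
        size_t n = utf8_seq_len(s + i, len - i);
        if (n == 0) {
            dst[i++] = '?';             /* one bad byte becomes one '?' */
            replaced++;
        } else {
            memcpy(dst + i, src + i, n);
            i += n;
        }
    }
    dst[len] = '\0';
    return replaced;
}
```

Because each bad byte is replaced by exactly one '?', the output always has the same length as the input, so the caller can size the buffer up front. A non-zero return value tells the caller the title was not valid and could be logged.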
Setting this to WORKSFORME, since we seem to be unable to reproduce. Please reopen if you have more information, or if other people can reproduce it.
mass reassign from zz-do-not-use to general, so i can remove the zz-do-not-use component. sorry for the spam, search for this string to filter these: fis7cldoq35p3kjdu74emc