After the latest changes (i.e. today), xfce4-session now cores at startup, preventing Xfce from actually starting. The backtrace is not very helpful:

#0 0x000000342d426323 in g_type_check_is_value_type () from /lib64/libgobject-2.0.so.0
#1 0x000000342d42cb9a in g_value_init () from /lib64/libgobject-2.0.so.0
#2 0x00000035ff20f480 in dbus_g_proxy_call () from /usr/lib64/libdbus-glib-1.so.2
#3 0x0000000000a7f8d9 in xfconf_channel_get_internal (channel=0x12996a0, property=0x417319 "/splash/Engine", value=0x7fff13d64b50) at xfconf-dbus-bindings.h:57
#4 0x0000000000a82bc3 in IA__xfconf_channel_get_string (channel=0x12996a0, property=0x417319 "/splash/Engine", default_value=0x417314 "mice") at xfconf-channel.c:808
#5 0x000000000040a72c in main (argc=1, argv=0x7fff13d64ce8) at main.c:142
> What happens when you run
>
> xfconf-query -c xfce4-session -p /splash/Engine

Actually that might be a bug with xfconf, because running xfconfd causes the process to consume an insane amount of memory:

20494 ofourdan 20 0 4117m 2.6g 1416 D 5.9 79.8 0:09.48 xfconfd

And it also now writes:

process 20683: Attempt to remove filter function 0x406930 user data 0x22c4a10, but no such filter has been added
D-Bus not built with -rdynamic so unable to print a backtrace
Aborted
(in reply to comment #1)
> Actually that might be a bug with xfconf because running xfconfd cause the
> process to consume an insane amount of memory:
>
> 20494 ofourdan 20 0 4117m 2.6g 1416 D 5.9 79.8 0:09.48 xfconfd
>
> And it also now writes:
>
> process 20683: Attempt to remove filter function 0x406930 user data 0x22c4a10,
> but no such filter has been added
> D-Bus not built with -rdynamic so unable to print a backtrace
> Aborted

That looks like it's coming from the filter function that watches for dbus disconnect in xfconfd. Though I just added something similar to xfce4-session -- which svn rev of xfce4-session are you running?

Can you attach your ~/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-session.xml file? Maybe I can reproduce it with that...
Created attachment 1942
xfce4-session.xml

The problem occurs with current svn, i.e. rev. 28580.

It happens on my main workstation, x86_64, dbus version 1.2.4, dbus-glib-0.74
But it's actually an xfconf problem and not specific to xfce4-session; xfwm4 now dies in just the same way. xfce4-session happens to be the first xfconf app to start...
xfconfd backtrace

(gdb) run
Starting program: /usr/local/bin/xfconfd
process 25986: Attempt to remove filter function 0x406930 user data 0x19faa10, but no such filter has been added
D-Bus not built with -rdynamic so unable to print a backtrace

Program received signal SIGABRT, Aborted.
0x0000003801832215 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install dbus-glib.x86_64 dbus.x86_64 glib2.x86_64 glibc.x86_64 libcap.x86_64
(gdb) bt
#0 0x0000003801832215 in raise () from /lib64/libc.so.6
#1 0x0000003801833d83 in abort () from /lib64/libc.so.6
#2 0x00000035fd629d65 in ?? () from /lib64/libdbus-1.so.3
#3 0x00000035fd625bad in ?? () from /lib64/libdbus-1.so.3
#4 0x0000000000406e61 in xfconf_daemon_finalize (obj=0x19faa10) at xfconf-daemon.c:162
#5 0x000000342d40d698 in g_object_unref () from /lib64/libgobject-2.0.so.0
#6 0x0000000000406b45 in xfconf_daemon_new_unique (backend_ids=0x19f54e0, error=0x7fff1fc75d70) at xfconf-daemon.c:545
#7 0x0000000000404bf3 in main (argc=1, argv=0x7fff1fc75e88) at main.c:202
xfconf-query backtrace

(gdb) run
Starting program: /usr/local/bin/xfconf-query
Channels:

Program received signal SIGSEGV, Segmentation fault.
0x000000380187a815 in free () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install dbus-glib.x86_64 dbus.x86_64 glib2.x86_64 glibc.x86_64 libcap.x86_64
(gdb) bt
#0 0x000000380187a815 in free () from /lib64/libc.so.6
#1 0x00000035ff20ee3d in ?? () from /usr/lib64/libdbus-glib-1.so.2
#2 0x00000035ff20f6dc in dbus_g_proxy_call () from /usr/lib64/libdbus-glib-1.so.2
#3 0x0000000000115511 in IA__xfconf_list_channels () at xfconf-dbus-bindings.h:208
#4 0x00000000004024bb in main (argc=1, argv=0x7fff567ab9d8) at main.c:274
Ok, your xfconfd crash isn't actually a big deal. Well, sorta. What's happening is that xfconfd is failing to start for some reason, before it's added the dbus filter function to watch for dbus disconnects. In finalize, it removes the dbus filter function (which hasn't been added yet), and dbus unhelpfully abort()s because of that, so xfconfd never gets to the point where it can actually print out the error message that's causing it to not start.

So, I fixed that problem -- now you should actually get a useful error message as to why xfconfd doesn't want to start.

However, there also appears to be a crash in libxfconf that appears to get triggered if xfconfd dies while libxfconf is in the middle of making a request. Looking at the dbus-glib source, dbus-glib appears to be the only library in existence that exposes API that takes GError** params, but doesn't allow that param to be NULL, because it just g_set_error()s on them without checking for NULL first. I assume that's the crash you're getting, so I need to fix libxfconf to always pass errors to dbus-glib, even if we don't care about them.

Ok. So can you update libxfconf and xfconfd and try again? If xfconfd still quits, also please do let me know what error message it prints to console.
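To make the two fixes described above concrete, here is a minimal sketch in plain C. It deliberately does not use the real dbus or dbus-glib APIs: Daemon, mock_bus_add_filter, mock_bus_remove_filter, MockError, mock_call and safe_call are all hypothetical stand-ins, and MockError is a flattened simplification of the GError** convention. The first half shows a finalizer that only undoes what startup actually did; the second half shows a wrapper that always hands the callee a valid error location, even when the caller passed NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from dbus or xfconf. */
typedef struct {
    bool filter_added;   /* set only after the filter was really installed */
    int  filters_active; /* mock of the bus library's internal filter list */
} Daemon;

/* Mock of a bus library that aborts when asked to remove an unknown filter. */
static void mock_bus_remove_filter(Daemon *d)
{
    assert(d->filters_active > 0 && "no such filter has been added");
    d->filters_active--;
}

static void mock_bus_add_filter(Daemon *d)
{
    d->filters_active++;
}

/* Startup can fail before the filter is installed (e.g. name already owned). */
static bool daemon_start(Daemon *d, bool name_available)
{
    if (!name_available)
        return false;            /* early exit: the filter was never added */
    mock_bus_add_filter(d);
    d->filter_added = true;
    return true;
}

/* Finalizer that only removes the filter if startup got far enough to add it. */
static void daemon_finalize(Daemon *d)
{
    if (d->filter_added) {
        mock_bus_remove_filter(d);
        d->filter_added = false;
    }
}

/* Simplified, hypothetical error record (the real convention is GError**). */
typedef struct {
    int         code;
    const char *message;
} MockError;

/* Mock callee that writes through 'error' without checking it for NULL. */
static bool mock_call(bool fail, MockError *error)
{
    if (fail) {
        error->code = 1;
        error->message = "remote end went away";
        return false;
    }
    return true;
}

/* Wrapper: callers may pass NULL, but the callee always gets a valid pointer. */
static bool safe_call(bool fail, MockError *error_out)
{
    MockError local = { 0, NULL };
    bool ok = mock_call(fail, &local);

    if (!ok && error_out != NULL)
        *error_out = local;      /* hand the error back only if it is wanted */
    return ok;
}
```

With this shape, a daemon whose startup fails early can be finalized without tripping the library's "no such filter" abort, and a caller that does not care about the error can still pass NULL to safe_call.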
Well, not much difference even with rev. 28588. xfconfd still aborts if another instance is running (yes, it was the error ^_~, but abort() is definitely too harsh IMHO) and apps that use xfconf still segfault.

xfconfd

** ERROR **: Xfconfd failed to start: Another Xfconf daemon is already running
aborting...
Aborted

xfconf-query
Channels:
Segmentation fault

The backtrace is identical.
(In reply to comment #8)
> Well, not much difference even with rev. 28588. xfconfd still aborts if another
> instance is running

I kinda expected that, but that's weird -- why is it already running, and why does it get started twice? You don't have it in autostart or something, do you?

> (yes, it was the error ^_~, but abort() is definitely too harsh IMHO)

Well, it g_error()s, so that's g_error()'s doing, not mine.

> and apps that use xfconf still segfault.
>
> xfconfd
>
> ** ERROR **: Xfconfd failed to start: Another Xfconf daemon is already running
>
> aborting...
> Aborted
>
> xfconf-query
> Channels:
> Segmentation fault
>
> The backtrace is identical.

Can you install the debuginfo packages that your backtrace specifies and get a better bt? Current one is pretty useless as it's not crashing in libxfconf...
Ok, I downgraded the g_error() to a g_critical() so it exits cleanly. ^_~
The debug symbols don't help much, and even with "-O0 -g3" the backtrace looks as follows:

Program received signal SIGSEGV, Segmentation fault.
__libc_free (mem=<value optimized out>) at malloc.c:3599
3599 if (chunk_is_mmapped(p)) /* release mmapped memory. */
(gdb) bt
#0 __libc_free (mem=<value optimized out>) at malloc.c:3599
#1 0x00000035ff20ee3d in dbus_g_proxy_end_call_internal (proxy=<value optimized out>, call_id=<value optimized out>, error=<value optimized out>, first_arg_type=<value optimized out>, args=<value optimized out>) at dbus-gproxy.c:2344
#2 0x00000035ff20f6dc in dbus_g_proxy_call (proxy=<value optimized out>, method=<value optimized out>, error=<value optimized out>, first_arg_type=<value optimized out>) at dbus-gproxy.c:2545
#3 0x000000000011b2cc in xfconf_client_list_channels (proxy=0x1022180, OUT_channels=0x7ffff00b9090, error=0x7ffff00b9088) at xfconf-dbus-bindings.h:208
#4 0x000000000011b263 in IA__xfconf_list_channels () at xfconf-channel.c:2290
#5 0x00000000004026e8 in main (argc=1, argv=0x7ffff00b92c8) at main.c:274
(gdb)
Actually that is somewhat useful... Can you tell me which version of dbus-glib you have installed? And do you know if there are any distro patches applied to it?
Oh wait, that's the xfconf-query error? If you can get a bt of the xfce4-session crash, that might be more useful.
Also, what's the prototype for xfconf_client_list_channels() in xfconf/xfconf-dbus-bindings.h?
> Actually that is somewhat useful... Can you tell me which version of dbus-glib
> you have installed? And do you know if there are any distro patches applied to
> it?

That's dbus-glib-0.74, but there are quite a few patches applied to the upstream sources. Still, this version has been installed since June 11th, so there is nothing new here.

> Oh wait, that's the xfconf-query error? If you can get a bt of the
> xfce4-session crash, that might be more useful.

Sure, here it is. But keep in mind that all apps using xfconf crash, not just xfce4-session.

Program received signal SIGSEGV, Segmentation fault.
0x000000342d426323 in IA__g_type_check_is_value_type (type=<value optimized out>) at gtype.c:3261
3261 if (node && node->mutatable_check_cache)
(gdb) bt
#0 0x000000342d426323 in IA__g_type_check_is_value_type (type=<value optimized out>) at gtype.c:3261
#1 0x000000342d41c47d in IA__g_signal_newv (signal_name=<value optimized out>, itype=<value optimized out>, signal_flags=<value optimized out>, class_closure=<value optimized out>, accumulator=<value optimized out>, accu_data=<value optimized out>, c_marshaller=Could not find the frame base for "IA__g_signal_newv".
) at gsignal.c:1275
#2 0x000000342d41c751 in IA__g_signal_new_valist (signal_name=<value optimized out>, itype=<value optimized out>, signal_flags=<value optimized out>, class_closure=<value optimized out>, accumulator=<value optimized out>, accu_data=<value optimized out>, c_marshaller=Could not find the frame base for "IA__g_signal_new_valist".
) at gsignal.c:1373
#3 0x000000342d41c8f7 in IA__g_signal_new (signal_name=<value optimized out>, itype=<value optimized out>, signal_flags=<value optimized out>, class_offset=<value optimized out>, accumulator=<value optimized out>, accu_data=<value optimized out>, c_marshaller=Could not find the frame base for "IA__g_signal_new".
) at gsignal.c:1130
#4 0x00007fb3f8bf0e1b in xfconf_channel_class_init (klass=0xb5a690) at xfconf-channel.c:154
#5 0x00007fb3f8bf0cf2 in xfconf_channel_class_intern_init (klass=0xb5a690) at xfconf-channel.c:129
#6 0x000000342d42acfd in IA__g_type_class_ref (type=<value optimized out>) at gtype.c:1880
#7 0x000000342d4118f1 in IA__g_object_new_valist (object_type=<value optimized out>, first_property_name=<value optimized out>, var_args=<value optimized out>) at gobject.c:988
#8 0x000000342d411cec in IA__g_object_new (object_type=<value optimized out>, first_property_name=<value optimized out>) at gobject.c:795
#9 0x00007fb3f8bf1c43 in IA__xfconf_channel_new (channel_name=0x112202 "xfce4-session") at xfconf-channel.c:560
#10 0x00007fb3f8bf1b7f in IA__xfconf_channel_get (channel_name=0x112202 "xfce4-session") at xfconf-channel.c:529
#11 0x000000000040a70f in main (argc=1, argv=0x7fff00e2f188) at main.c:200

> Also, what's the prototype for xfconf_client_list_channels() in
> xfconf/xfconf-dbus-bindings.h?

static
#ifdef G_HAVE_INLINE
inline
#endif
gboolean
xfconf_client_list_channels (DBusGProxy *proxy, char *** OUT_channels, GError **error)
{
  return dbus_g_proxy_call (proxy, "ListChannels", error, G_TYPE_INVALID, G_TYPE_STRV, OUT_channels, G_TYPE_INVALID);
}

But this file is *not* installed. "find /usr/local/include -name xfconf-dbus-bindings.h" returns nothing.
Rebuilding everything again with --enable-maintainer-mode gives another backtrace:

(gdb) run
Starting program: /usr/local/bin/xfce4-session
[Thread debugging using libthread_db enabled]
[New Thread 0x7f171f48c740 (LWP 27341)]
Detaching after fork from child process 27348.

Program received signal SIGSEGV, Segmentation fault.
0x000000342d426323 in IA__g_type_check_is_value_type (type=<value optimized out>) at gtype.c:3261
3261 if (node && node->mutatable_check_cache)
(gdb) bt
#0 0x000000342d426323 in IA__g_type_check_is_value_type (type=<value optimized out>) at gtype.c:3261
#1 0x000000342d42cb9a in IA__g_value_init (value=<value optimized out>, g_type=<value optimized out>) at gvalue.c:77
#2 0x00000035ff20f480 in dbus_g_proxy_call (proxy=<value optimized out>, method=<value optimized out>, error=<value optimized out>, first_arg_type=<value optimized out>) at dbus-gproxy.c:2538
#3 0x0000000000a7f998 in xfconf_channel_get_internal (channel=<value optimized out>, property=0x417319 "/splash/Engine", value=0x7fff274d2610) at xfconf-dbus-bindings.h:57
#4 0x0000000000a82cd3 in IA__xfconf_channel_get_string (channel=0xf10930, property=0x417319 "/splash/Engine", default_value=0x417314 "mice") at xfconf-channel.c:810
#5 0x000000000040a72c in main (argc=1, argv=0x7fff274d27a8) at main.c:142
(gdb)

This one is very similar to the backtrace obtained from xfconf-query, xfce4-panel or xfwm4.
Ok, the last bt seems to show it crashing in the DBUS_G_VALUE_ARRAY_COLLECT_ALL() macro inside dbus_g_proxy_call(), which is just a varargs collector. Weird. So I don't think the corruption is there. Do you get a different bt if you set the env var MALLOC_CHECK_=3 (yes, trailing _ is correct)? And does valgrind say anything? I wish I could reproduce this... I wonder if it's a 64bit thing. Do you know when the last time you updated xfconf was before it started crashing?
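For readers unfamiliar with the "varargs collector" idea mentioned above, here is a minimal sketch in plain C of a sentinel-terminated collector. It is not the dbus-glib macro: collect_args and the TAG_* constants are hypothetical names invented for illustration. The caller passes (tag, value) pairs and ends the list with a sentinel tag, much as the dbus_g_proxy_call() call quoted earlier in this thread ends its argument lists with G_TYPE_INVALID.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical type tags; TAG_INVALID plays the role of the terminator. */
enum { TAG_INVALID = 0, TAG_INT = 1, TAG_STRING = 2 };

/*
 * Walks (tag, value) pairs until TAG_INVALID is seen.
 * Returns the number of values collected, or -1 on an unknown tag.
 */
static int collect_args(int first_tag, ...)
{
    va_list ap;
    int count = 0;

    va_start(ap, first_tag);
    for (int tag = first_tag; tag != TAG_INVALID; tag = va_arg(ap, int)) {
        switch (tag) {
        case TAG_INT:
            (void)va_arg(ap, int);           /* consume the int value */
            break;
        case TAG_STRING:
            (void)va_arg(ap, const char *);  /* consume the string value */
            break;
        default:
            va_end(ap);
            return -1;                       /* tag the collector does not know */
        }
        count++;
    }
    va_end(ap);
    return count;
}
```

A call such as collect_args(TAG_INT, 42, TAG_STRING, "mice", TAG_INVALID) collects two values. A collector like this trusts the tags completely, so if caller and callee disagree about what a tag value means, it can read garbage without there being any bug in the collector itself. That is offered only as general background, not as a diagnosis of this crash.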
Created attachment 1951
File that causes the crash

Brian, I found the culprit: it is this file. Copy it to ~/.config/xfce4/xfconf/xfce-perchannel-xml/ and all xfconf apps will core at startup.
Hmm, no, sorry, false alarm: xfce4-session still crashes.
I tried rewinding until r27779 from 2008-09-09 and it still crashes, so I am really confused now...
Ok focus on xfce4-session...

svn up -r "{20081011}" => works
svn up -r "{20081012}" => cores

But I still do not understand why all components core now, I was updating every day and did not detect the issue previously.
(In reply to comment #21)
> Ok focus on xfce4-session...
>
> svn up -r "{20081011}" => works
> svn up -r "{20081012}" => cores
>
> But I still do not understand why all components core now, I was updating every
> day and did not detect the issue previously.

Meh, there are 9 commits on 12 October. Let's see... fortunately a bunch of them only touched xfconf-query or the tests. So, our crasher is either rev 28167 or rev 28170. Can you check these 2 out and see which one it is? If 28167 crashes, can you also check one rev before that to make sure the previous rev doesn't crash?
Oh wait, sorry, I thought you were talking about xfconf revs. Just a sec...
Does the 20081011 version that works even use xfconf? That's the day that I switched xfce4-session over. If it doesn't, then this little exercise has only proved that xfconf is causing your crash, which we already knew.
Ok, found it. It turned out that the problem was caused by a change in the build script...
Hmm? Can you explain in more detail? I'm curious...