! Please note that this is a snapshot of our old Bugzilla server, which is read only since May 29, 2020. Please go to gitlab.xfce.org for our new server !
xfce4-session cores at startup
Status: RESOLVED INVALID
Severity: blocker
Product: Xfce4-session
Component: General

Comments

Description Olivier Fourdan editbugs 2008-11-01 22:36:22 CET
After the latest changes (i.e. today's), xfce4-session now dumps core at startup, preventing Xfce from starting at all.

The backtrace is not very helpful:

#0  0x000000342d426323 in g_type_check_is_value_type ()
   from /lib64/libgobject-2.0.so.0
#1  0x000000342d42cb9a in g_value_init () from /lib64/libgobject-2.0.so.0
#2  0x00000035ff20f480 in dbus_g_proxy_call ()
   from /usr/lib64/libdbus-glib-1.so.2
#3  0x0000000000a7f8d9 in xfconf_channel_get_internal (channel=0x12996a0, 
    property=0x417319 "/splash/Engine", value=0x7fff13d64b50)
    at xfconf-dbus-bindings.h:57
#4  0x0000000000a82bc3 in IA__xfconf_channel_get_string (channel=0x12996a0, 
    property=0x417319 "/splash/Engine", default_value=0x417314 "mice")
    at xfconf-channel.c:808
#5  0x000000000040a72c in main (argc=1, argv=0x7fff13d64ce8) at main.c:142
Comment 1 Olivier Fourdan editbugs 2008-11-01 22:42:04 CET
> What happens when you run
> 
> xfconf-query -c xfce4-session -p /splash/Engine

Actually that might be a bug with xfconf, because running xfconfd causes the process to consume an insane amount of memory:

20494 ofourdan  20   0 4117m 2.6g 1416 D  5.9 79.8   0:09.48 xfconfd            

And it also now writes:

process 20683: Attempt to remove filter function 0x406930 user data 0x22c4a10, but no such filter has been added
  D-Bus not built with -rdynamic so unable to print a backtrace
Aborted
Comment 2 Brian J. Tarricone (not reading bugmail) 2008-11-02 00:36:38 CET
(in reply to comment #1)
> Actually that might be a bug with xfconf because running xfconfd causes the
> process to consume an insane amount of memory:
> 
> 20494 ofourdan  20   0 4117m 2.6g 1416 D  5.9 79.8   0:09.48 xfconfd            
> 
> And it also now writes:
> 
> process 20683: Attempt to remove filter function 0x406930 user data 0x22c4a10,
> but no such filter has been added
>   D-Bus not built with -rdynamic so unable to print a backtrace
> Aborted

That looks like it's coming from the filter function that watches for dbus disconnect in xfconfd.  Though I just added something similar to xfce4-session -- which svn rev of xfce4-session are you running?

Can you attach your ~/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-session.xml file?  Maybe I can reproduce it with that...
Comment 3 Olivier Fourdan editbugs 2008-11-02 14:40:38 CET
Created attachment 1942 
xfce4-session.xml

The problem occurs with current svn, ie rev. 28580

It happens on my main workstation, x86_64, dbus version 1.2.4, dbus-glib-0.74
Comment 4 Olivier Fourdan editbugs 2008-11-02 14:46:12 CET
But it's actually an xfconf problem and not specific to xfce4-session; xfwm4 now dies the same way.

xfce4-session happens to be the first xfconf app to start...
Comment 5 Olivier Fourdan editbugs 2008-11-02 15:12:46 CET
xfconfd backtrace

(gdb) run
Starting program: /usr/local/bin/xfconfd 
process 25986: Attempt to remove filter function 0x406930 user data 0x19faa10, but no such filter has been added
  D-Bus not built with -rdynamic so unable to print a backtrace

Program received signal SIGABRT, Aborted.
0x0000003801832215 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install dbus-glib.x86_64 dbus.x86_64 glib2.x86_64 glibc.x86_64 libcap.x86_64
(gdb) bt
#0  0x0000003801832215 in raise () from /lib64/libc.so.6
#1  0x0000003801833d83 in abort () from /lib64/libc.so.6
#2  0x00000035fd629d65 in ?? () from /lib64/libdbus-1.so.3
#3  0x00000035fd625bad in ?? () from /lib64/libdbus-1.so.3
#4  0x0000000000406e61 in xfconf_daemon_finalize (obj=0x19faa10)
    at xfconf-daemon.c:162
#5  0x000000342d40d698 in g_object_unref () from /lib64/libgobject-2.0.so.0
#6  0x0000000000406b45 in xfconf_daemon_new_unique (backend_ids=0x19f54e0, 
    error=0x7fff1fc75d70) at xfconf-daemon.c:545
#7  0x0000000000404bf3 in main (argc=1, argv=0x7fff1fc75e88) at main.c:202
Comment 6 Olivier Fourdan editbugs 2008-11-02 15:25:29 CET
xfconf-query backtrace

(gdb) run
Starting program: /usr/local/bin/xfconf-query 
Channels:

Program received signal SIGSEGV, Segmentation fault.
0x000000380187a815 in free () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install dbus-glib.x86_64 dbus.x86_64 glib2.x86_64 glibc.x86_64 libcap.x86_64
(gdb) bt
#0  0x000000380187a815 in free () from /lib64/libc.so.6
#1  0x00000035ff20ee3d in ?? () from /usr/lib64/libdbus-glib-1.so.2
#2  0x00000035ff20f6dc in dbus_g_proxy_call ()
   from /usr/lib64/libdbus-glib-1.so.2
#3  0x0000000000115511 in IA__xfconf_list_channels ()
    at xfconf-dbus-bindings.h:208
#4  0x00000000004024bb in main (argc=1, argv=0x7fff567ab9d8) at main.c:274
Comment 7 Brian J. Tarricone (not reading bugmail) 2008-11-02 18:19:11 CET
Ok, your xfconfd crash isn't actually a big deal.  Well, sorta.

What's happening is that xfconfd is failing to start for some reason, before it's added the dbus filter function to watch for dbus disconnects.  In finalize, it removes the dbus filter function (which hasn't been added yet), and dbus unhelpfully abort()s because of that, so xfconfd never gets to the point where it can actually print out the error message that's causing it to not start.
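
A minimal sketch of the failure mode described above, and of one common way to guard against it. This is not the change that was committed to xfconfd. `MockConnection`, `MockDaemon` and every function below are hypothetical stand-ins written for illustration, not libdbus or xfconf API. The mock library asserts when asked to remove a filter that was never added, which plays the role of the abort() seen in the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bus connection holding at most one filter. */
typedef struct {
    void (*filter)(void *);
    void *user_data;
} MockConnection;

/* Hypothetical daemon object; filter_added is the guard flag. */
typedef struct {
    MockConnection *conn;
    bool filter_added;
} MockDaemon;

/* Placeholder for the callback that would watch for a bus disconnect. */
static void disconnect_filter(void *user_data)
{
    (void) user_data;
}

static void mock_add_filter(MockConnection *c, void (*fn)(void *), void *data)
{
    c->filter = fn;
    c->user_data = data;
}

/* Treats removal of a filter that was never added as a fatal error. */
static void mock_remove_filter(MockConnection *c, void (*fn)(void *), void *data)
{
    assert(c->filter == fn && c->user_data == data);
    c->filter = NULL;
    c->user_data = NULL;
}

/* fail_early simulates a startup error that occurs before the filter
 * has been registered. Returns true on success. */
static bool daemon_start(MockDaemon *d, MockConnection *c, bool fail_early)
{
    d->conn = c;
    d->filter_added = false;
    if (fail_early)
        return false;
    mock_add_filter(c, disconnect_filter, d);
    d->filter_added = true;
    return true;
}

/* Only remove what was actually added, so a failed start can be
 * cleaned up and its real error reported. */
static void daemon_finalize(MockDaemon *d)
{
    if (d->filter_added) {
        mock_remove_filter(d->conn, disconnect_filter, d);
        d->filter_added = false;
    }
}
```

With the guard in place, calling `daemon_finalize()` after `daemon_start(&d, &c, true)` returns normally instead of tripping the assertion, so the caller still gets a chance to print the error that made startup fail.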

So, I fixed that problem -- now you should actually get a useful error message as to why xfconfd doesn't want to start.

However, there also appears to be a crash in libxfconf that appears to get triggered if xfconfd dies while libxfconf is in the middle of making a request.

Looking at the dbus-glib source, dbus-glib appears to be the only library in existence that exposes API that takes GError** params, but doesn't allow that param to be NULL, because it just g_set_error()s on them without checking for NULL first.  I assume that's the crash you're getting, so I need to fix libxfconf to always pass errors to dbus-glib, even if we don't care about them.
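
A rough sketch of the "always pass an error" workaround, assuming (as the paragraph above describes) a callee that writes through its error argument without checking it for NULL. `MockError`, `mock_call_strict()` and `safe_call()` are hypothetical names used only for illustration; they are not GLib, dbus-glib or libxfconf API, and the assertion stands in for the crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an error record. */
typedef struct {
    int code;
    const char *message;
} MockError;

/* Stand-in for a library call that requires a non-NULL error location. */
static bool mock_call_strict(bool should_fail, MockError **error)
{
    assert(error != NULL);  /* stands in for the NULL dereference */
    if (should_fail) {
        static MockError failure = { 1, "call failed" };
        *error = &failure;
        return false;
    }
    return true;
}

/* Wrapper: the caller may pass NULL when it does not care about the
 * error, but the callee is always handed a valid location. */
static bool safe_call(bool should_fail, MockError **error)
{
    MockError *local = NULL;
    bool ok = mock_call_strict(should_fail, &local);

    /* Hand the error back only if the caller asked for it. With a real,
     * heap-allocated GError the unwanted error would have to be freed
     * here instead of being dropped. */
    if (!ok && error != NULL)
        *error = local;
    return ok;
}
```

Callers that ignore errors can then write `safe_call(..., NULL)` without reaching the strict callee with a NULL pointer.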

Ok.

So can you update libxfconf and xfconfd and try again?  If xfconfd still quits, also please do let me know what error message it prints to console.
Comment 8 Olivier Fourdan editbugs 2008-11-02 18:41:07 CET
Well, not much difference even with rev. 28588. xfconfd still aborts if another instance is running (yes, it was the error ^_~, but abort() is definitely too harsh IMHO) and apps that use xfconf still segfault.

xfconfd 

** ERROR **: Xfconfd failed to start: Another Xfconf daemon is already running

aborting...
Aborted

 xfconf-query 
Channels:
Segmentation fault

The backtrace is identical.
Comment 9 Brian J. Tarricone (not reading bugmail) 2008-11-02 18:54:11 CET
(In reply to comment #8)
> Well, not much difference even with rev. 28588. xfconfd still aborts if another
> instance is running

I kinda expected that, but that's weird -- why is it already running, and why does it get started twice?  You don't have it in autostart or something, do you?

> (yes, it was the error ^_~, but abort() is definitely too harsh IMHO)

Well, it g_error()s, so that's g_error()'s doing, not mine.


> and apps that use xfconf still segfault.
> 
> xfconfd 
> 
> ** ERROR **: Xfconfd failed to start: Another Xfconf daemon is already running
> 
> aborting...
> Aborted
> 
>  xfconf-query 
> Channels:
> Segmentation fault
> 
> The backtrace is identical.

Can you install the debuginfo packages that your backtrace specifies and get a better bt?  Current one is pretty useless as it's not crashing in libxfconf...
Comment 10 Brian J. Tarricone (not reading bugmail) 2008-11-02 19:05:45 CET
Ok, I downgraded the g_error() to a g_critical() so it exits cleanly. ^_~
Comment 11 Olivier Fourdan editbugs 2008-11-02 20:36:27 CET
The debug symbols don't help much, and even with "-O0 -g3" the backtrace looks as follows:

Program received signal SIGSEGV, Segmentation fault.
__libc_free (mem=<value optimized out>) at malloc.c:3599
3599	  if (chunk_is_mmapped(p))                       /* release mmapped memory. */
(gdb) bt
#0  __libc_free (mem=<value optimized out>) at malloc.c:3599
#1  0x00000035ff20ee3d in dbus_g_proxy_end_call_internal (
    proxy=<value optimized out>, call_id=<value optimized out>, 
    error=<value optimized out>, first_arg_type=<value optimized out>, 
    args=<value optimized out>) at dbus-gproxy.c:2344
#2  0x00000035ff20f6dc in dbus_g_proxy_call (proxy=<value optimized out>, 
    method=<value optimized out>, error=<value optimized out>, 
    first_arg_type=<value optimized out>) at dbus-gproxy.c:2545
#3  0x000000000011b2cc in xfconf_client_list_channels (proxy=0x1022180, 
    OUT_channels=0x7ffff00b9090, error=0x7ffff00b9088)
    at xfconf-dbus-bindings.h:208
#4  0x000000000011b263 in IA__xfconf_list_channels () at xfconf-channel.c:2290
#5  0x00000000004026e8 in main (argc=1, argv=0x7ffff00b92c8) at main.c:274
(gdb)
Comment 12 Brian J. Tarricone (not reading bugmail) 2008-11-02 21:11:28 CET
Actually that is somewhat useful... Can you tell me which version of dbus-glib you have installed?  And do you know if there are any distro patches applied to it?
Comment 13 Brian J. Tarricone (not reading bugmail) 2008-11-02 21:13:22 CET
Oh wait, that's the xfconf-query error?  If you can get a bt of the xfce4-session crash, that might be more useful.
Comment 14 Brian J. Tarricone (not reading bugmail) 2008-11-02 21:15:36 CET
Also, what's the prototype for xfconf_client_list_channels() in xfconf/xfconf-dbus-bindings.h?
Comment 15 Olivier Fourdan editbugs 2008-11-02 22:57:25 CET
> Actually that is somewhat useful... Can you tell me which version of dbus-glib
> you have installed?  And do you know if there are any distro patches applied to
> it?

That's dbus-glib-0.74 but there are quite a few patches applied to the upstream sources. Still this version has been installed since June 11th so there is nothing new here. 

> Oh wait, that's the xfconf-query error?  If you can get a bt of the
> xfce4-session crash, that might be more useful.

Sure, here it is. But keep in mind that all apps using xfconf crash, not just xfce4-session.

Program received signal SIGSEGV, Segmentation fault.
0x000000342d426323 in IA__g_type_check_is_value_type (
    type=<value optimized out>) at gtype.c:3261
3261	  if (node && node->mutatable_check_cache)
(gdb) bt
#0  0x000000342d426323 in IA__g_type_check_is_value_type (
    type=<value optimized out>) at gtype.c:3261
#1  0x000000342d41c47d in IA__g_signal_newv (
    signal_name=<value optimized out>, itype=<value optimized out>, 
    signal_flags=<value optimized out>, class_closure=<value optimized out>, 
    accumulator=<value optimized out>, accu_data=<value optimized out>, 
    c_marshaller=Could not find the frame base for "IA__g_signal_newv".
) at gsignal.c:1275
#2  0x000000342d41c751 in IA__g_signal_new_valist (
    signal_name=<value optimized out>, itype=<value optimized out>, 
    signal_flags=<value optimized out>, class_closure=<value optimized out>, 
    accumulator=<value optimized out>, accu_data=<value optimized out>, 
    c_marshaller=Could not find the frame base for "IA__g_signal_new_valist".
) at gsignal.c:1373
#3  0x000000342d41c8f7 in IA__g_signal_new (signal_name=<value optimized out>, 
    itype=<value optimized out>, signal_flags=<value optimized out>, 
    class_offset=<value optimized out>, accumulator=<value optimized out>, 
    accu_data=<value optimized out>, c_marshaller=Could not find the frame base for "IA__g_signal_new".
) at gsignal.c:1130
#4  0x00007fb3f8bf0e1b in xfconf_channel_class_init (klass=0xb5a690)
    at xfconf-channel.c:154
#5  0x00007fb3f8bf0cf2 in xfconf_channel_class_intern_init (klass=0xb5a690)
    at xfconf-channel.c:129
#6  0x000000342d42acfd in IA__g_type_class_ref (type=<value optimized out>)
    at gtype.c:1880
#7  0x000000342d4118f1 in IA__g_object_new_valist (
    object_type=<value optimized out>, 
    first_property_name=<value optimized out>, var_args=<value optimized out>)
    at gobject.c:988
#8  0x000000342d411cec in IA__g_object_new (object_type=<value optimized out>, 
    first_property_name=<value optimized out>) at gobject.c:795
#9  0x00007fb3f8bf1c43 in IA__xfconf_channel_new (
    channel_name=0x112202 "xfce4-session") at xfconf-channel.c:560
#10 0x00007fb3f8bf1b7f in IA__xfconf_channel_get (
    channel_name=0x112202 "xfce4-session") at xfconf-channel.c:529
#11 0x000000000040a70f in main (argc=1, argv=0x7fff00e2f188) at main.c:200

> Also, what's the prototype for xfconf_client_list_channels() in
> xfconf/xfconf-dbus-bindings.h?

static
#ifdef G_HAVE_INLINE
inline
#endif
gboolean
xfconf_client_list_channels (DBusGProxy *proxy, char *** OUT_channels, GError **error)

{
  return dbus_g_proxy_call (proxy, "ListChannels", error, G_TYPE_INVALID, G_TYPE_STRV, OUT_channels, G_TYPE_INVALID);
}

But this file is *not* installed. "find /usr/local/include -name xfconf-dbus-bindings.h" returns nothing.
Comment 16 Olivier Fourdan editbugs 2008-11-02 23:16:35 CET
Rebuilt everything again, with --enable-maintainer-mode, gives another backtrace:

(gdb) run
Starting program: /usr/local/bin/xfce4-session 
[Thread debugging using libthread_db enabled]
[New Thread 0x7f171f48c740 (LWP 27341)]
Detaching after fork from child process 27348.

Program received signal SIGSEGV, Segmentation fault.
0x000000342d426323 in IA__g_type_check_is_value_type (
    type=<value optimized out>) at gtype.c:3261
3261	  if (node && node->mutatable_check_cache)
(gdb) bt
#0  0x000000342d426323 in IA__g_type_check_is_value_type (
    type=<value optimized out>) at gtype.c:3261
#1  0x000000342d42cb9a in IA__g_value_init (value=<value optimized out>, 
    g_type=<value optimized out>) at gvalue.c:77
#2  0x00000035ff20f480 in dbus_g_proxy_call (proxy=<value optimized out>, 
    method=<value optimized out>, error=<value optimized out>, 
    first_arg_type=<value optimized out>) at dbus-gproxy.c:2538
#3  0x0000000000a7f998 in xfconf_channel_get_internal (
    channel=<value optimized out>, property=0x417319 "/splash/Engine", 
    value=0x7fff274d2610) at xfconf-dbus-bindings.h:57
#4  0x0000000000a82cd3 in IA__xfconf_channel_get_string (channel=0xf10930, 
    property=0x417319 "/splash/Engine", default_value=0x417314 "mice")
    at xfconf-channel.c:810
#5  0x000000000040a72c in main (argc=1, argv=0x7fff274d27a8) at main.c:142
(gdb) 


This one is very similar to the backtrace obtained from xfconf-query, xfce4-panel or xfwm4.
Comment 17 Brian J. Tarricone (not reading bugmail) 2008-11-03 02:14:49 CET
Ok, the last bt seems to show it crashing in the DBUS_G_VALUE_ARRAY_COLLECT_ALL() macro inside dbus_g_proxy_call(), which is just a varargs collector.  Weird.  So I don't think the corruption is there.

Do you get a different bt if you set the env var MALLOC_CHECK_=3 (yes, trailing _ is correct)?

And does valgrind say anything?

I wish I could reproduce this... I wonder if it's a 64bit thing.

Do you know when the last time you updated xfconf was before it started crashing?
Comment 18 Olivier Fourdan editbugs 2008-11-03 19:52:12 CET
Created attachment 1951 
File that causes the crash

Brian, I found the culprit: it is this file.

Copy it to ~/.config/xfce4/xfconf/xfce-perchannel-xml/ and all xfconf apps will core at startup.
Comment 19 Olivier Fourdan editbugs 2008-11-03 21:03:44 CET
Hmm, no, sorry, false alarm: xfce4-session still crashes.
Comment 20 Olivier Fourdan editbugs 2008-11-03 21:18:14 CET
I tried rewinding as far back as r27779 (2008-09-09) and it still crashes, so I am really confused now...
Comment 21 Olivier Fourdan editbugs 2008-11-03 22:28:42 CET
Ok focus on xfce4-session...

  svn up -r "{20081011}" => works
  svn up -r "{20081012}" => cores

But I still do not understand why all components core now, I was updating every day and did not detect the issue previously.
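
The date-based narrowing above is a manual bisection. As an illustration only, the same idea can be written as a binary search over revision numbers. `first_bad_revision()` and `example_crashes()` are hypothetical names, and the revision numbers in the example predicate are made up; in practice the predicate would mean "check out this revision, rebuild, and see whether it cores".

```c
#include <assert.h>
#include <stdbool.h>

/* Predicate type: returns true if the given revision crashes. */
typedef bool (*CrashesFn)(int rev);

/* Returns the first crashing revision, given that `good` does not crash
 * and `bad` does. Assumes every revision from the first bad one onwards
 * also crashes. */
static int first_bad_revision(int good, int bad, CrashesFn crashes)
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (crashes(mid))
            bad = mid;   /* the breakage is at mid or earlier */
        else
            good = mid;  /* the breakage is after mid */
    }
    return bad;
}

/* Made-up predicate for demonstration: pretend revision 105 broke things. */
static bool example_crashes(int rev)
{
    return rev >= 105;
}
```

Each step halves the range, so a day with 9 commits needs at most 4 rebuilds. The search only helps if the tested build really contains the suspect code, which is the point raised in comment 24.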
Comment 22 Brian J. Tarricone (not reading bugmail) 2008-11-03 23:00:51 CET
(In reply to comment #21)
> Ok focus on xfce4-session...
> 
>   svn up -r "{20081011}" => works
>   svn up -r "{20081012}" => cores
> 
> But I still do not understand why all components core now, I was updating every
> day and did not detect the issue previously.

Meh, there are 9 commits on 12 October.  Let's see... fortunately a bunch of them only touched xfconf-query or the tests.  So, our crasher is either rev 28167 or rev 28170.  Can you check these 2 out and see which one it is?

If 28167 crashes, can you also check one rev before that to make sure the previous rev doesn't crash?
Comment 23 Brian J. Tarricone (not reading bugmail) 2008-11-03 23:13:25 CET
Oh wait, sorry, I thought you were talking about xfconf revs.  Just a sec...
Comment 24 Brian J. Tarricone (not reading bugmail) 2008-11-03 23:17:12 CET
Does the 20081011 version that works even use xfconf?  That's the day that I switched xfce4-session over.  If it doesn't, then this little exercise has only proved that xfconf is causing your crash, which we already knew.
Comment 25 Olivier Fourdan editbugs 2008-11-05 20:31:18 CET
OK, found it. It turns out the problem was caused by a change in the build script...
Comment 26 Brian J. Tarricone (not reading bugmail) 2008-11-05 21:06:07 CET
Hmm?  Can you explain in more detail?  I'm curious...

Bug #4548

Reported by: Olivier Fourdan
Reported on: 2008-11-01
Last modified on: 2009-07-14

People

Assignee: Brian J. Tarricone (not reading bugmail)
CC List: 0 users

Attachments

xfce4-session.xml (1.67 KB, text/xml)
2008-11-02 14:40 CET , Olivier Fourdan
no flags
File that causes the crash (571 bytes, text/xml)
2008-11-03 19:52 CET , Olivier Fourdan
no flags
