Created attachment 5735 Correctly determine the hostname Since I didn't get a reply via email: > commit 19e9cc2db222fde7f138de86f3cedcda4a4d4295 > Author: Alistair Buxton <a.j.buxton@gmail.com> > Date: Wed Nov 5 19:36:11 2014 +0000 > > Determine the maximum host name length correctly. > > The actual limit on Linux is 64 characters, and on BSD it is 255 > characters. The limit is defined in HOST_NAME_MAX on Posic operating > systems, so use that definition instead of the one hardcoded into > Xfwm, and print a warning if it still fails. > > Too long hostnames would previously cause the hostname to be set > to null, which would cause a segfault when terminating a client. > This patch also adds a null check to that function. > > Signed-off-by: Simon Steinbeiss <simon.steinbeiss@elfenbeinturm.at> > --- > src/client.c | 2 +- > src/display.c | 11 ++++++----- > 2 files changed, 7 insertions(+), 6 deletions(-) > > diff --git a/src/client.c b/src/client.c > index 60430a8..dc00545 100644 > --- a/src/client.c > +++ b/src/client.c > @@ -2573,7 +2573,7 @@ clientTerminate (Client *c) > screen_info = c->screen_info; > display_info = screen_info->display_info; > > - if ((c->hostname) && (c->pid > 0)) > + if ((c->hostname) && (display_info->hostname) && (c->pid > 0)) > { > if (!strcmp (display_info->hostname, c->hostname)) > { > diff --git a/src/display.c b/src/display.c > index 00318d5..3c2e7ba 100644 > --- a/src/display.c > +++ b/src/display.c > @@ -44,9 +44,9 @@ > #include "client.h" > #include "compositor.h" > > -#ifndef MAX_HOSTNAME_LENGTH > -#define MAX_HOSTNAME_LENGTH 32 > -#endif /* MAX_HOSTNAME_LENGTH */ > +#ifndef HOST_NAME_MAX > +#define HOST_NAME_MAX 255 > +#endif /* HOST_NAME_MAX */ > > #ifndef CURSOR_ROOT > #define CURSOR_ROOT XC_left_ptr > @@ -313,9 +313,10 @@ myDisplayInit (GdkDisplay *gdisplay) > display->nb_screens = 0; > display->current_time = CurrentTime; > > - display->hostname = g_new0 (gchar, (size_t) MAX_HOSTNAME_LENGTH); > - if (gethostname ((char *) display->hostname, MAX_HOSTNAME_LENGTH - 1)) > + display->hostname = g_new0 (gchar, (size_t) HOST_NAME_MAX + 1); > + if (gethostname ((char *) display->hostname, HOST_NAME_MAX + 1)) > { > + g_warning ("The display's hostname could not be determined."); > g_free (display->hostname); > display->hostname = NULL; > } > > -- > To stop receiving notification emails like this one, please contact > the administrator of this repository. The above code was bad before but this commit made it worse. First the explanation is wrong, I don't know of any gethostname implementation that returns NULL if a hostname does not fit into the buffer passed to it, rather it will be silently truncated and depending on the implementation without null-termination. The new code assumes POSIX semantics but does not actually request it via the _XOPEN_SOURCE macro which might not be the default on non-Linux/glibc. Then it assumes that the maximum hostname length cannot be greater than 255 characters if HOST_NAME_MAX is not defined. In case both flawed assumptions are combined and an implementation does not provide POSIX semantics by default and allows hostnames larger than 255 bytes you now get a possibly unterminated string which was not possible before. All of the above mess including the NULL check can be simply avoided by a display->hostname = g_strdup(g_get_host_name()); g_get_host_name() cannot fail, it always returns a valid string in a static buffer and g_strdup() should abort on ENOMEM. See attached patch.
gethostname returns NULL when the hostname was fetched successfully. A non-zero return value means it failed, including because the buffer was not big enough: http://linux.die.net/man/2/gethostname The GNU C library does not employ the gethostname() system call; instead, it implements gethostname() as a library function that calls uname(2) and copies up to len bytes from the returned nodename field into name. Having performed the copy, the function then checks if the length of the nodename was greater than or equal to len, and if it is, then the function returns -1 with errno set to ENAMETOOLONG; in this case, a terminating null byte is not included in the returned name. http://www.freebsd.org/cgi/man.cgi?sektion=3&query=gethostname Upon successful completion, the value 0 is returned; otherwise the value -1 is returned and the global variable errno is set to indicate the error. g_get_host_name makes the same assumption: https://git.gnome.org/browse/glib/tree/glib/gutils.c#n972 Except it has a hardcoded limit of 100 chars for the hostname, which is incorrect for BSD: Host names are limited to {HOST_NAME_MAX} characters, not including the trailing null, currently 255. g_get_host_name does not return an error but instead sets the string to "localhost", so we would need to check for this case to prevent clientTerminate from possibly attempting to kill remote PIDs on the local machine.
(In reply to Alistair Buxton from comment #1) > gethostname returns NULL when the hostname was fetched successfully. A > non-zero return value means it failed, including because the buffer was not > big enough: > > http://linux.die.net/man/2/gethostname > > The GNU C library does not employ the gethostname() system call; > instead, > it implements gethostname() as a library function that calls uname(2) > and > copies up to len bytes from the returned nodename field into name. > Having > performed the copy, the function then checks if the length of the > nodename > was greater than or equal to len, and if it is, then the function > returns > -1 with errno set to ENAMETOOLONG; in this case, a terminating null > byte > is not included in the returned name. > > http://www.freebsd.org/cgi/man.cgi?sektion=3&query=gethostname > > Upon successful completion, the value 0 is returned; otherwise the > value -1 is returned and the global variable errno is set to indicate > the > error. That's a detail of glibc and FreeBSD extension and not portable, even if you assume POSIX semantics. From http://pubs.opengroup.org/onlinepubs/9699919799/functions/gethostname.html: DESCRIPTION The gethostname() function shall return the standard host name for the current machine. The namelen argument shall specify the size of the array pointed to by the name argument. The returned name shall be null-terminated, except that if namelen is an insufficient length to hold the host name, then the returned name shall be truncated and it is unspecified whether the returned name is null-terminated. Host names are limited to {HOST_NAME_MAX} bytes. RETURN VALUE Upon successful completion, 0 shall be returned; otherwise, -1 shall be returned. ERRORS No errors are defined. And if you need a concrete example, on Solaris 11.2 and Illumos gethostname(3) will silently truncate the hostname and return 0 if you pass in a buffer that is too small. Furthermore, they don't define HOST_NAME_MAX (even if you request POSIX semantics) and rather use MAXHOSTNAMELEN. I believe it's the same on HP-UX, though that might be a less relevant platform for Xfce. > g_get_host_name makes the same assumption: > > https://git.gnome.org/browse/glib/tree/glib/gutils.c#n972 > > Except it has a hardcoded limit of 100 chars for the hostname, which is > incorrect for BSD: > > Host names are limited to {HOST_NAME_MAX} characters, not including the > trailing null, currently 255. OK, my bad, I didn't check the implementation, that is just ugly. > g_get_host_name does not return an error but instead sets the string to > "localhost", so we would need to check for this case to prevent > clientTerminate from possibly attempting to kill remote PIDs on the local > machine. Hm, ok I withdraw my patch. The current code is still wrong and worse than before, you cannot portably detect truncation by checking the return value of gethostname() and because you pass in HOST_NAME_MAX + 1 as the buffer size, the resulting string may not be null-terminated. Also it might not be a good idea not to use MAX_HOSTNAME_LENGTH to allocate a buffer because even POSIX semantics only impose a minimum size of currently 255 bytes but no maximum limit. Finally if you mean POSIX semantics you should define _XOPEN_SOURCE appropriately and possibly put the compiler in c99 mode for POSIX.1-2001 or later.
This is yet another case of "We need a function that returns exactly what _XGetHostname returns, but we can't use _XGetHostname because it is private" Which has been an issue for 10 years: http://comments.gmane.org/gmane.comp.xfree86.devel/5982 _XGetHostname: http://cgit.freedesktop.org/xorg/lib/libX11/tree/src/XlibInt.c#n1616 And the place where it is used to set the client hostname: http://cgit.freedesktop.org/xorg/lib/libX11/tree/src/WMProps.c#n86 Reasoning on HOST_NAME_MAX: POSIX.1-2001 guarantees that "Host names (not including the terminating null byte) are limited to HOST_NAME_MAX bytes". So if HOST_NAME_MAX is defined then we need a buffer which is HOST_NAME_MAX + 1 to also hold the null byte. The namelen argument shall specify the size of the array pointed to by the name argument. So if the array size is HOST_NAME_MAX + 1 then we need to pass the same to gethostname. There could be a none-standard implementation where the namelen argument means "maximum number of characters allowed not including null byte" in which case the buffer would need to be HOST_NAME_MAX + 2 and namelen would need to be HOST_NAME_MAX + 1. Here is an implementation which does exactly this, in a round about way: http://www.opensource.apple.com/source/cvs/cvs-47/cvs/lib/xgethostname.c ...SunOS 5.5's gethostname whereby it NUL-terminates HOSTNAME even when the name is as long as the supplied buffer. > Also it might not be a good idea not to use MAX_HOSTNAME_LENGTH to allocate a buffer because even POSIX semantics only impose a minimum size of currently 255 bytes but no maximum limit. MAX_HOSTNAME_LENGTH is, as far as I can tell, a completely arbitrary made up number that is not defined anywhere except Xfwm. Why would one arbitrary made up number be better than another? Is the problem here that some system may define HOST_NAME_MAX to an arbitrarily large number such as MAX_INT?
Created attachment 5737 Patch to make the buffer one longer than namelen again.
(In reply to Alistair Buxton from comment #3) > This is yet another case of "We need a function that returns exactly what > _XGetHostname returns, but we can't use _XGetHostname because it is private" > > Which has been an issue for 10 years: > > http://comments.gmane.org/gmane.comp.xfree86.devel/5982 > > _XGetHostname: > > http://cgit.freedesktop.org/xorg/lib/libX11/tree/src/XlibInt.c#n1616 > > And the place where it is used to set the client hostname: > > http://cgit.freedesktop.org/xorg/lib/libX11/tree/src/WMProps.c#n86 > > > > Reasoning on HOST_NAME_MAX: > > POSIX.1-2001 guarantees that "Host names (not including the > terminating null byte) are limited to HOST_NAME_MAX bytes". > > So if HOST_NAME_MAX is defined then we need a buffer which is HOST_NAME_MAX > + 1 to also hold the null byte. > > The namelen argument shall specify the size of the array > pointed to by the name argument. > > So if the array size is HOST_NAME_MAX + 1 then we need to pass the same to > gethostname. The problem is that HOST_NAME_MAX may not be defined (as e.g. Solaris and Illumos which use MAXHOSTNAMELEN) and then you define it as 255 while the platform might actually have a higher limit leading to truncation without null-termination. > There could be a none-standard implementation where the namelen argument > means "maximum number of characters allowed not including null byte" in > which case the buffer would need to be HOST_NAME_MAX + 2 and namelen would > need to be HOST_NAME_MAX + 1. > > Here is an implementation which does exactly this, in a round about way: > > http://www.opensource.apple.com/source/cvs/cvs-47/cvs/lib/xgethostname.c > > ...SunOS 5.5's gethostname whereby it NUL-terminates HOSTNAME > even when the name is as long as the supplied buffer. > > > Also it might not be a good idea not to use MAX_HOSTNAME_LENGTH to allocate a buffer because even POSIX semantics only impose a minimum size of currently 255 bytes but no maximum limit. > > MAX_HOSTNAME_LENGTH is, as far as I can tell, a completely arbitrary made up > number that is not defined anywhere except Xfwm. Why would one arbitrary > made up number be better than another? Is the problem here that some system > may define HOST_NAME_MAX to an arbitrarily large number such as MAX_INT? Sorry for the confusion, I meant it may not be a good idea to rely on the system's HOST_NAME_MAX, I don't see anything that forbids a POSIX-conforming implementation not to impose a limit and to set it to SIZE_MAX or so. That's why I said the code before this commit imposing an arbitrary limit is not nice and fails to reliably detect truncation but at least it makes sure the string is null-terminated by passing the buffer size - 1. Is there any harm later if the hostname is truncated?
Possibly. If a client hangs, the display->hostname from gethostname is compared to the client's WM_CLIENT_MACHINE string, which ultimately comes from _XGetHostname. If they match, Xfwm kills the client's PID. If they don't match, Xfwm assumes it is a remote client and does not kill the PID. Truncating the hostname could lead to a problem if you have many machines named like myverylongmachinenamethatismorethanthirtythreecharacters-01, -02, etc. This becomes more unlikely as the limit gets longer though. Also it would only match if the two versions were truncated at the same point. Xorg truncates to 255+null for example, other implementations might be different. It seems that hostname longer than 31 characters is rare already, since the real issue was that it would segfault Xfwm, and nobody noticed this in 8 years. Perhaps then the best thing to do would be to revert the current patch and then just increase the arbitrary limit to 256 instead of 32?
(In reply to Alistair Buxton from comment #6) > Possibly. If a client hangs, the display->hostname from gethostname is > compared to the client's WM_CLIENT_MACHINE string, which ultimately comes > from _XGetHostname. If they match, Xfwm kills the client's PID. If they > don't match, Xfwm assumes it is a remote client and does not kill the PID. > > Truncating the hostname could lead to a problem if you have many machines > named like myverylongmachinenamethatismorethanthirtythreecharacters-01, -02, > etc. > > This becomes more unlikely as the limit gets longer though. Also it would > only match if the two versions were truncated at the same point. Xorg > truncates to 255+null for example, other implementations might be different. > > It seems that hostname longer than 31 characters is rare already, since the > real issue was that it would segfault Xfwm, and nobody noticed this in 8 > years. > > Perhaps then the best thing to do would be to revert the current patch and > then just increase the arbitrary limit to 256 instead of 32? I guess so, as you've pointed out in comment#3 the challenge is to keep in sync with libX11's idea of the hostname. Might be good to add a comment that the limit corresponds to what XOrg's libX11 does internally because both strings are compared later in the code. The really important thing is that the potential truncated string issue is fixed.
Created attachment 5739 Go back to using MAX_HOSTNAME_LENGTH
Created attachment 5740 Go back to using MAX_HOSTNAME_LENGTH v2 Previous patch still wasn't right. The intention of going back to MAX_HOSTNAME_LENGTH with a sensible default was to allow it to be overridden with a platform specific value from autoconf. Therefore we should assume it contains the number of characters not including NULLs as this is the safest option. The default length of 256 wasn't picked for Xorg's benefit and does not actually match the Xorg buffer size. We aren't really matching what Xorg does when truncation happens at all. It ignores the return value of gethostname and always truncates to 255+NULL, where as on certain platforms we will not accept a truncated hostname, but on others we do - depending on whether gethostname returns an error or not in this case. If truncation happens it's better to not match at all. Besides, we might not even be dealing with Xorg. Instead 256 was picked as it is unambiguously large enough for any platform mentioned so far here.
This really is a can of worms. Just for reference, here is what metacity does when killing a client: https://git.gnome.org/browse/metacity/tree/src/core/delete.c#n178 And compiz: http://bazaar.launchpad.net/~compiz-team/compiz/0.9.12/view/head:/gtk/window-decorator/forcequit.c#L65 These are both equivalent to the last patch I posted, except they use a buffer which is 1 byte smaller. But then metacity also uses HOST_NAME_MAX in another place: https://git.gnome.org/browse/metacity/tree/src/core/window-props.c#n546
Getting the hostname right is not so important really, if the hostname can't be retrieved then simply assume the client is remote. Fixing the segfault that follows should be the main goal IMHO.
Created attachment 5741 Use a local buffer then g_strdup it. Well, the original segfault is already fixed. Unfortunately I introduced a new one on obscure platforms. What about this patch? It solves the problem without increasing memory use and is even a bit simpler than the old code, and closer to what everone else does. nametmp[512] because that's the most I've seen anyone else use, and well, why not?
Created attachment 5742 Use a local buffer then g_strdup it. Sorry, wrong patch.
Created attachment 5778 Extremely paranoid version of hostname lookup. This patch should be completely bullet-proof. A 512 byte buffer is used for the hostname, which is twice as long as any known OS needs. The buffer is allocated with g_new0 so it is zero-initialized. If gethostname() does nothing, we won't get garbage uninitialized data. The buffer is actually one longer than what we tell gethostname(), so buggy implementations that overrun by 1 character aren't a problem. The final element is always set to zero regardless, so it won't crash even on a platform with hostnames longer than 512 and a buggy gethostname() that overruns the buffer. The temp buffer is g_strdup'd and then freed so that we don't waste memory. If gethostname returns any kind of error then the hostname is set to NULL as before.
Reposting (slightly updated) test case from previous bug: In order to test this you need a program which doesn't respond to NET_WM_PING. A suitable test application is available here: https://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/source/1be6ec951a4eed1d79878f88d2eacb298b447112:tests/test-hung-process.c Compile it like this: gcc -o test-hung-process test-hung-process.c `pkg-config --cflags --libs gtk+-2.0` Run this command to set hostname to 60 characters: sudo hostname aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Log out and log in again to update environment. (Don't restart, hostname is not saved.) Now when you start a shell you should see your new, long, hostname. Run the test application, and when the window opens, WAIT FOR AT LEAST 1 SECOND, then click the close button. After a few seconds Xfwm will ask if you want to kill it. Click "yes". The process should be killed as expected, and Xfwm should not crash.
Looks good to me.
I tested both the earlier versions and the latest version of the patch. The latest one worked as expected, since it was also +1'd by Olivier, I pushed it to master and am marking this fixed. http://git.xfce.org/xfce/xfwm4/commit/?id=85dffa9adab8bc2c5d11550bdbc8235291802d4e Please re-open this bugreport if things aren't working as they should.