! Please note that this is a snapshot of our old Bugzilla server, which is read only since May 29, 2020. Please go to gitlab.xfce.org for our new server !
Unable to auto-guess Automake file type by filename
Status:
RESOLVED: FIXED
Product:
Mousepad
Component:
General

Comments

Description Nikita Zlobin 2019-02-15 16:01:33 CET
Automake file type seems to be derivation of Makefile type. Mousepad tries first to guess mime type, and if not found - gtksourceview does it by own, with empty mime info. I just discovered there are no mime types for any autotools special mime types (m4, autoconf, automake). GtkSourceView own guess function is able to guess C sources/headers by own, without content_type. But unable to guess even plain Makefile, thus making it only by getting text/x-makefile content_type.

I tried nearly two times to ask something at gnome irc channel, but was completely ignored for at least several hours of not a day.
Comment 1 Nikita Zlobin 2019-02-15 23:00:08 CET
I tried to test other gtksourceview versions (my stable was 3.24.8). Version 3.24.9 makes no difference. This version seems to be released 5 months ago (looking to tags date at git), but makefile and automake lang files are 1 year old, yet there are no commits, related to guessing. Looked to lang files theirselves - they look correct, at least filename glob is ok, and makefile.lang mime type is too.

Unfortunally there are no branches like 3.x, 4.x - only like gnome-3.24, so i could not get bleeding-edge 3.x version. Also can't find something to test gtksourceview-4.x, though it seems redundant. I'm not sure, is it good idea for me to report it right to them, unless i develop something with that lib.

Btw, when i try to open Makefile, i get:
(mousepad:25215): GtkSourceView-CRITICAL **: gtk_source_language_get_name: assertion 'GTK_SOURCE_IS_LANGUAGE (language)' failed
Comment 2 Andre Miranda editbugs 2019-02-16 03:06:11 CET
Random ideas:
- See how other text editors based on gtksourceview handle this file type.
- Put together a SSCCE with gtksourceview and check how it guesses automake files. This way we can determine if it's something gtksourceview's side or mousepad's.
- Port mousepad to gtk4/gtksourceview-4 :P
Comment 3 Nikita Zlobin 2019-02-16 07:51:13 CET
I thought about port to new gsv :) Besides jokes it makes big sence, as gsv-3 is unlikely to get special support as it is just previous version (not branch) of all gsv. What was real surprise for me is that gsv-4.0.4 depends in gtk-3.20, not on yet unreleased gtk4.

I don't think there is possible difference. I did purest possible GSV test via mousepad by replacing content_type (third) argument with NULL, so gtk_source_language_manager guess function is forced to guess it by own. It did it for C source/header files and shell scripts (ltmain.sh) but is unable to do for both plain Makefile and Makefile.am.

So it is rather time to try fix it.
Did you mean http://www.sscce.org ?
Comment 4 Nikita Zlobin 2019-02-16 09:07:36 CET
I just tried meld, which has support for syntax highlight. It correctly guesses both Makefile and Makefile.am.
Comment 5 Nikita Zlobin 2019-02-16 15:08:28 CET
There is minimal patch, which should be enough - though guess function describes filename parameter as filename, it works better with basename.
Meld, however, uses GFile and g_file_get_content_type(), thus not deadling with uncertainity - that function returns plant/text for Makefile, while g_content_type_guess returns application/octet-stream in same case. In both cases, however, it works equally well if fed by basename (and equally bad with file->filename).
Comment 6 Nikita Zlobin 2019-02-16 15:10:25 CET
Created attachment 8305 
Patch, fixing content type guess
Comment 7 Nikita Zlobin 2019-02-16 15:11:56 CET
Comment on attachment 8305 
Patch, fixing content type guess

In addition, it should also recognize configure.ac as m4, which gives some good syntax highlight for autoconf files.
Comment 8 Nikita Zlobin 2019-02-16 16:50:52 CET
Probably filename is supposed to be basename.
GFile has only path and basename attributes, no name.
GFileInfo has name, and it is equal to basename.
Comment 9 Andre Miranda editbugs 2019-02-17 00:32:15 CET
The patch makes sense, that's how gedit uses gtk_source_language_manager_guess_language (indeed the documentation is unclear).
Can you send me an updated patch with commit info?
Comment 10 Nikita Zlobin 2019-02-17 07:14:09 CET
Created attachment 8307 
Updated patch, exported from git commit
Comment 11 Matthew Brush editbugs 2019-02-17 07:44:52 CET
The new `basename` variable isn't initialized and is only conditionally assigned, so the `g_free` at the end could be called on an uninitialized/garbage pointer. Also inside the conditional where `basename` is assigned, `file->filename` may be `NULL`, which would cause a runtime assertion failure (possibly abort, depending how GLib is compiled) from the `g_path_get_basename` function.
Comment 12 Nikita Zlobin 2019-02-17 15:46:05 CET
Hm, but entire function begins with:
content_type = g_content_type_guess (file->filename, NULL, 0, &result_uncertain);

How should it ever work with null filename? I understand it like "\0", but NULL...
Comment 13 Nikita Zlobin 2019-02-17 17:22:47 CET
Created attachment 8308 
Patch, fixing own language guess function

I somehow missed uninitialized basename, there is updated patch. I also remade it to better handle NULL filename. There are some issues though.

When i changed g_content_type_guess() call to take NULL name, it sets uncertainity, thus last condition depends entirely on filename!=NULL test.

Final code:

GtkSourceLanguage *
mousepad_file_guess_language (MousepadFile *file)
{
  gchar             *content_type;
  gchar             *basename;
  gboolean           result_uncertain = FALSE;
  GtkSourceLanguage *language = NULL;

  if (file->filename != NULL)
    {
      content_type = g_content_type_guess (file->filename, NULL, 0, &result_uncertain);
      basename = g_path_get_basename (file->filename);
      language = gtk_source_language_manager_guess_language (gtk_source_language_manager_get_default (),
                                                             basename,
                                                             result_uncertain ? NULL : content_type);

      g_free (basename);
      g_free (content_type);
    }

  return language;
}

Block, testing uncertainity is replaced by inline test in gtk_source_language_manager_guess_language() call.
Comment 14 Nikita Zlobin 2019-02-17 17:57:55 CET
Created attachment 8309 
Fix language guess function

Further towards perfection...
At least now patch itself makes less brain strain.
Comment 15 Matthew Brush editbugs 2019-02-17 20:11:30 CET
I haven't tested yet, but the code looks better. The only thing I'd change is to use `g_return_val_if_fail` macro to replace the `file->filename` check, if it is never supposed to be `NULL` (I didn't check if this is the case). Also while I'm not personally a fan of sticking all the variable declarations at the top of the function, since all the existing code does it already it might make sense to put that check below the variable declaration block, though it doesn't matter.

It might be a while before I have a chance to test/merge this, so if Andre or anyone with commit rights wants to merge it, it's fine by me.
Comment 16 Nikita Zlobin 2019-02-17 22:14:17 CET
Created attachment 8310 
Fix launguage guess code

I forgot about that class of function, though saw it before - those, with word assert, are in testing section and have rather unwanted effect (probably abort).

As for check before variables block - at least now, as far as i know, automatic variables are allocated when scope is reached. This check doesn't need any of them. Though i'm aware, it is some kind of good taste sign... i'm not sure about preference, can be changed later.
Comment 17 Matthew Brush editbugs 2019-02-17 22:35:07 CET
(In reply to Nikita Zlobin from comment #16)
> [...] As for check before variables block - at least now, as far as i know,
> automatic variables are allocated when scope is reached. This check doesn't
> need any of them. Though i'm aware, it is some kind of good taste sign...
> i'm not sure about preference, can be changed later.

Yeah, it's just an old-school C convention used throughout Mousepad's code, which is the only reason I mentioned it, no biggie.
Comment 18 Git Bot editbugs 2019-02-23 21:03:26 CET
Nikita Zlobin referenced this bugreport in commit 67063863a1620c8bfce367e204f6513eb44cb614

Fix language guess function (Bug #15141)

https://git.xfce.org/apps/mousepad/commit?id=67063863a1620c8bfce367e204f6513eb44cb614
Comment 19 Andre Miranda editbugs 2019-02-23 21:05:54 CET
All looks good, I just moved the g_return_val_if_fail to be after vars declaration to avoid a gcc warning.
Thanks Nikita!

Bug #15141

Reported by:
Nikita Zlobin
Reported on: 2019-02-15
Last modified on: 2019-02-23

People

Assignee:
Matthew Brush
CC List:
2 users

Version

Version:
Unspecified
Target Milestone:
Mousepad 0.4.x

Attachments

Patch, fixing content type guess (1.06 KB, patch)
2019-02-16 15:10 CET , Nikita Zlobin
no flags
Updated patch, exported from git commit (1.29 KB, patch)
2019-02-17 07:14 CET , Nikita Zlobin
no flags
Patch, fixing own language guess function (1.73 KB, patch)
2019-02-17 17:22 CET , Nikita Zlobin
no flags
Fix language guess function (1.74 KB, patch)
2019-02-17 17:57 CET , Nikita Zlobin
no flags
Fix launguage guess code (1.75 KB, patch)
2019-02-17 22:14 CET , Nikita Zlobin
no flags

Additional information