! Please note that this is a snapshot of our old Bugzilla server, which is read only since May 29, 2020. Please go to gitlab.xfce.org for our new server !
Incorrect sort order in thunar with numeric names
Status:
RESOLVED: FIXED

Comments

Description charlie-tca 2009-05-15 15:37:04 CEST
This bug has been reported on Ubuntu Launchpad as:
 https://bugs.launchpad.net/bugs/376156

Binary package hint: thunar

When a folder contains other folders with only numeric names and the highest number is over 100, Thunar does not keep the sort order correct.

Please test with:

#!/bin/bash
dir="foo"
for (( i = 1; i < 140; i++ )); do mkdir -p "$dir/$i"; done
thunar "$dir"
exit 0

The sort order will be correct at first. Now go up the file tree and then back into the folder: the order is correct for 1 to 9 but is garbled after that. It is not random though: the (incorrect) order is the same every time the folder is visited. Re-sorting by name restores the correct order, but it still does not stick.

This has been tested on 2 installations of Xubuntu with the following characteristics:

$ lsb_release -rd
Description: Ubuntu 9.04
Release: 9.04

$ apt-cache policy thunar
thunar:
  Installed: 1.0.0-1ubuntu3
  Candidate: 1.0.0-1ubuntu3
  Version table:
 *** 1.0.0-1ubuntu3 0
        500 http://mirrors.rit.edu jaunty/universe Packages
        100 /var/lib/dpkg/status
Comment 1 craig 2009-07-25 04:12:53 CEST
thunar version 1.0.1, Arch

I have directories like these:
0_Arch
0_download
00000_new
00_backup
00_Vostro
0_dvd

Sorting back and forth by name ascending gives a different ordering each time.
Comment 2 PyroPeter 2009-12-09 21:18:50 CET
Thunar sorts like "sort -n".

I think this is quite annoying, as any other filesystem-related software sorts like "sort".

As "ls" is a kind of 'standard', I consider this a bug. I see that the numerical sort order has its advantages, but it's absolutely counter-intuitive. You could add a setting to switch between ls-style and numerical sorting.
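
For illustration, the difference between the two orderings can be sketched in Python. This is only an analogy for how "sort" and "sort -n" treat purely numeric names; the list of names is made up, and the snippet says nothing about how Thunar implements its comparison.

```python
# Hypothetical folder names, chosen only for illustration.
names = ["1", "2", "10", "11", "100"]

# Plain string comparison, similar in spirit to "sort" (or "ls" in the C locale).
lexical = sorted(names)

# Comparison by numeric value, similar in spirit to "sort -n".
numeric = sorted(names, key=int)

print(lexical)  # ['1', '10', '100', '11', '2']
print(numeric)  # ['1', '2', '10', '11', '100']
```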
Comment 3 Minarto Margoliono 2010-06-26 22:29:53 CEST
I have folders with "word number" pattern (ch 6, ch 98, ch 103 etc)

The first display is messy: not totally random, but some folders are in the wrong position. However, when I re-sort it, the sorting is perfect. It does not happen when the folder contains only numbers from 1-99.
Comment 4 Peter de Ridder editbugs 2012-02-14 22:47:57 CET
Created attachment 4195 
Improve numeric sorting in thunar

This patch is not a complete fix for this bug.
It sorts numbers like these correctly: 11 100 101 1000
But numbers starting with 0 are sorted incorrectly: 0001 010

I'm working on a non-intrusive way of solving this.
Comment 5 Peter de Ridder editbugs 2012-02-15 21:56:26 CET
Created attachment 4205 
Correctly sort numeric names

This will sort numbers correctly, also with leading zeros.

The last diff corrects case-insensitive sorting of Unicode names.
Comment 6 Peter de Ridder editbugs 2012-02-15 23:22:26 CET
Created attachment 4206 
Even more correctly sort numbers

The previous patch still missed some cases for correct sorting.
This version will sort leading zeros as follows:
00.txt 00a.txt 0.txt 0b.txt 010.txt 10.txt
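
One way to obtain the order listed above can be sketched in Python. The attachment itself is not reproduced here, so natural_key and its rules are hypothetical and inferred only from the example order: a name is split into digit runs and text runs, digit runs compare by value, and among equal values the run with more leading zeros sorts first. Case folding of the text runs is likewise an assumption.

```python
import re

DIGITS = "0123456789"

def natural_key(name):
    """Hypothetical sort key, not the code from the attached patch."""
    key = []
    # Split the name into runs of ASCII digits and runs of everything else.
    for chunk in re.findall(r"[0-9]+|[^0-9]+", name):
        if chunk[0] in DIGITS:
            # Compare by value first; for equal values the longer run
            # (the one with more leading zeros) sorts first.
            key.append((0, int(chunk), -len(chunk)))
        else:
            # Text runs compare case-insensitively (an assumption).
            key.append((1, chunk.casefold()))
    return key

names = ["10.txt", "0b.txt", "00a.txt", "010.txt", "0.txt", "00.txt"]
ordered = sorted(names, key=natural_key)
print(ordered)
# ['00.txt', '00a.txt', '0.txt', '0b.txt', '010.txt', '10.txt']
```

The leading tag (0 for digits, 1 for text) keeps tuples of different kinds from being compared element by element, and makes a digit run sort before a text run at the same position.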
Comment 7 Peter de Ridder editbugs 2012-02-16 21:42:16 CET
Created attachment 4214 
Small optimization in number length compare

Walk both numbers in parallel so we only need to iterate for the size of the shortest number.
This could be a first step for hex number support.
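
The idea of walking both numbers in parallel can be sketched as follows. This is a Python illustration under stated assumptions, not the code from the attachment: compare_numbers is a hypothetical helper that receives two strings starting with a run of ASCII digits, skips leading zeros, and stops as soon as the shorter number ends.

```python
DIGITS = "0123456789"

def compare_numbers(a, b):
    """Compare the digit runs at the start of a and b by value.

    Returns -1, 0 or 1. Hypothetical helper for illustration only.
    """
    i = j = 0
    # Leading zeros do not change the value, so skip them.
    while i < len(a) and a[i] == "0":
        i += 1
    while j < len(b) and b[j] == "0":
        j += 1

    first_diff = 0
    while True:
        a_digit = i < len(a) and a[i] in DIGITS
        b_digit = j < len(b) and b[j] in DIGITS
        if not (a_digit and b_digit):
            break
        # Remember only the first (most significant) differing digit.
        if first_diff == 0 and a[i] != b[j]:
            first_diff = -1 if a[i] < b[j] else 1
        i += 1
        j += 1

    if a_digit:          # a has more significant digits, so it is larger
        return 1
    if b_digit:          # b has more significant digits, so it is larger
        return -1
    return first_diff    # same length: the first difference decides

print(compare_numbers("100", "11"))    # 1
print(compare_numbers("099", "0100"))  # -1
print(compare_numbers("0012", "12"))   # 0
```

The loop runs once per digit of the shorter number, and no conversion to an integer type is needed, so very long digit runs cannot overflow.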
Comment 8 Peter de Ridder editbugs 2012-02-17 22:19:04 CET
Created attachment 4219 
Patch created on top of current master HEAD

Created patch on top of the current master HEAD. There should be no conflict when applying this patch.

It would be a good idea to move the numeric sort to either libxfce4util or exo. That way other applications can use the same sorting.
Comment 9 Peter de Ridder editbugs 2012-02-17 22:34:45 CET
Created attachment 4220 
Sort hex numbers

This patch can be applied on top of numeric_sort_5.patch.
With this patch, hex numbers of the same length are sorted correctly.
With numeric sort, a62fea.txt goes before a153dc.txt.
With this patch, a153dc.txt goes before a62fea.txt.
This only works for hex numbers of the same length, since that is the only way to guess that a number is a hex number. Also, if filenames contain hex numbers, they are probably generated with a fixed length.
Comment 10 Jannis Pohlmann editbugs 2012-02-18 15:12:59 CET
A few comments:

I cannot reproduce the bug reported originally with Thunar built from the master branch. It is always 1, 2, 3, ..., 9, 10, 11, ..., 140. The order is always correct. I also tried different amounts of leading zeros in the 1...140 sequence. It still works in master. I also tried the folders from comment 1, and I have to admit the ordering is not predictable and makes little sense. However, unlike the comment suggests, it's not different each time.

Peter, I tried your patch and it seems to work correctly in all cases I tested. I am not sure whether we really want hex ordering. It is, in general, not obvious whether "a62fea.txt" and "a153dc.txt" are hex numbers or non-hex numbers. Second-guessing the user's intentions may not be what we want and it seems impossible to satisfy all use cases with one single approach here. Even assuming that "a 01" is to appear before "b 2" or "0100" should show up after "099" will always annoy some people. The question is: which cases do we want to support and how far do we go with detecting and distinguishing them.

I'm fine with applying the patch though (we should try to make it a little less cryptic though), as it definitely improves the situation.
Comment 11 Peter de Ridder editbugs 2012-02-19 11:11:42 CET
(In reply to comment #10)
> A few comments:
> 
> I cannot reproduce the bug reported originally with Thunar built from the
> master branch. It is always 1, 2, 3, ..., 9, 10, 11, ..., 140. The order is
> always correct. I also tried different amounts of leading zeros in the 1...140
> sequence. It still works in master. I also tried the folders from comment 1,
> and I have to admit the ordering is not predictable and makes little sense.
> However, unlike the comment suggests, it's not different each time.

I got a report on IRC about sorting and, out of all the open sorting bugs, chose this one to attach the patches to. The test case in this report does not always hit the problem, although it can (100 would be sorted before 11 in Thunar master). Because of this unpredictable compare in combination with qsort, it is possible that the directory sorts differently when it is re-sorted.

> Peter, I tried your patch and it seems to work correctly in all cases I tested.
> I am not sure whether we really want hex ordering. It is, in general, not
> obvious whether "a62fea.txt" and "a153dc.txt" are hex numbers or non-hex
> numbers. Second-guessing the user's intentions may not be what we want and it
> seems impossible to satisfy all use cases with one single approach here. Even
> assuming that "a 01" is to appear before "b 2" or "0100" should show up after
> "099" will always annoy some people. The question is: which cases do we want to
> support and how far do we go with detecting and distinguishing them.

The patch for hex sorting was intentionally separate so it could be left out. I created it since it was not that hard. For people viewing images with generated hex names, for example, it is sometimes not clear why the files are sorted out of order. Leading-zero or numeric sorting can annoy some people, but it would still be clear how the files are sorted, whether you agree with it or not. To solve that, you need to be able to enable and disable the different kinds of sorting.

> I'm fine with applying the patch though (we should try to make it a little less
> cryptic though), as it definitely improves the situation.

Try to make it less cryptic in what way? User documentation, code comments or ...?
Comment 12 Peter de Ridder editbugs 2012-02-26 23:30:56 CET
Created attachment 4238 
Numeric compare with leading zeros made less cryptic

Always call skip_leading_zeros to show how leadingzero is calculated.
Comment 13 Jannis Pohlmann editbugs 2012-02-27 01:48:03 CET
I applied and pushed an alternative version of Peter's patch in master. It does not include the hexadecimal feature yet. Also, we should probably add a few examples to the comment of the sort function to explain how this is supposed to work. I'm leaving the bug open for now so that we can discuss the hex issue later.
Comment 14 Jannis Pohlmann editbugs 2012-02-27 01:48:45 CET
Correction: I didn't push yet because my network is currently broken. Will do it tomorrow.
Comment 15 Peter de Ridder editbugs 2012-02-27 02:15:00 CET
Created attachment 4239 
less cryptic HEX sorting

This patch can be applied on top of the not yet pushed master (see previous comment).

This hex check doesn't use the knowledge of previously checked digits, in favor of easier code. See the previous hex patch for a more optimized solution.
Comment 16 Peter de Ridder editbugs 2012-03-26 22:27:47 CEST
After testing, there is a non-determinism in hex sorting.
Since hex sorting and numeric sorting create different sorted sets, they can't be mixed when sorting on a per-element basis.
For example: 023 22 02f
Numeric sort: 02f 22 023
Hex sort: 22 023 02f
When mixing these, you get the following compares, which can't be sorted:
22 < 023
023 < 02f
02f < 22
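
The cycle above can be reproduced with a small Python sketch. It assumes a hypothetical mixed comparison that treats two names as hex only when both are hex strings of the same length, and otherwise falls back to a simplified numeric key; neither function is taken from the attached patches.

```python
import re
from functools import cmp_to_key
from itertools import permutations

HEX = re.compile(r"[0-9a-fA-F]+")

def numeric_key(name):
    """Simplified numeric key (hypothetical): digit runs compare by value."""
    return [(0, int(c), -len(c)) if c[0] in "0123456789" else (1, c)
            for c in re.findall(r"[0-9]+|[^0-9]+", name)]

def mixed_less(a, b):
    # Assumption: hex comparison only for two hex strings of equal length.
    if len(a) == len(b) and HEX.fullmatch(a) and HEX.fullmatch(b):
        return int(a, 16) < int(b, 16)
    return numeric_key(a) < numeric_key(b)

def mixed_cmp(a, b):
    return -1 if mixed_less(a, b) else (1 if mixed_less(b, a) else 0)

names = ["023", "22", "02f"]

# Each order on its own is consistent.
numeric_order = sorted(names, key=numeric_key)            # 02f 22 023
hex_order = sorted(names, key=lambda n: int(n, 16))       # 22 023 02f

# Mixed per pair, the three compares form a cycle.
cycle = (mixed_less("22", "023") and mixed_less("023", "02f")
         and mixed_less("02f", "22"))

# With a cyclic comparator, the result depends on the input order.
results = {tuple(sorted(p, key=cmp_to_key(mixed_cmp)))
           for p in permutations(names)}

print(numeric_order, hex_order, cycle, len(results) > 1)
```

Because the three names form a cycle, no output order satisfies every pairwise compare, which is why the same directory can come out in different orders depending on the order in which it was read.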
Comment 17 Jannis Pohlmann editbugs 2012-03-26 22:40:34 CEST
I think the result of hex sorting would be confusing. Of course the ordering of hex strings will now be confusing as well, but we have to make a decision on what to prioritize. For now, Peter and I have decided to drop hex sorting. I'm marking this bug as fixed.
Comment 18 Jannis Pohlmann editbugs 2012-03-26 22:43:00 CEST
BTW, the commit that made it into master was this one:

commit b501fe002028f83d75959d37023f0812b3675dac
Author: Peter de Ridder <peter@xfce.org>
Date:   Mon Feb 27 00:42:11 2012 +0000

    Improve sorting of file names that include numbers (bug #5359).
    
    This commit improves the handling of leading zeros of numbers in file
    names. Previously, the order was not always predictable and was also
    often incorrect.
Comment 19 Jannis Pohlmann editbugs 2012-03-26 22:47:31 CEST
*** Bug 4269 has been marked as a duplicate of this bug. ***

Bug #5359

Reported by:
charlie-tca
Reported on: 2009-05-15
Last modified on: 2012-03-26
Duplicates (1):
  • 4269 Wrong name sorting when numbers are used in filenames

People

Assignee:
Peter de Ridder
CC List:
6 users

Attachments

Improve numeric sorting in thunar (1.22 KB, patch)
2012-02-14 22:47 CET , Peter de Ridder
no flags
Correctly sort numeric names (3.99 KB, patch)
2012-02-15 21:56 CET , Peter de Ridder
no flags
Even more correctly sort numbers (4.58 KB, patch)
2012-02-15 23:22 CET , Peter de Ridder
no flags
Small optimization in number length compare (4.81 KB, patch)
2012-02-16 21:42 CET , Peter de Ridder
no flags
Patch created on top of current master HEAD (4.82 KB, patch)
2012-02-17 22:19 CET , Peter de Ridder
no flags
Sort hex numbers (2.54 KB, patch)
2012-02-17 22:34 CET , Peter de Ridder
no flags
Numeric compare with leading zeros made less cryptic (5.12 KB, patch)
2012-02-26 23:30 CET , Peter de Ridder
no flags
less cryptic HEX sorting (1.98 KB, patch)
2012-02-27 02:15 CET , Peter de Ridder
no flags