Author Topic: A Question on Sorting  (Read 6859 times)

0 Members and 1 Guest are viewing this topic.

Offline Dimster

  • Forum Resident
  • Posts: 500
    • View Profile
A Question on Sorting
« on: September 17, 2021, 02:01:56 pm »
Basically the question is ... is there any real difference if a sort is low to high or high to low?

In this instance I have a sequential file of data. I do not want to use a Quick Sort or Bubble Sort.Nor do I want to create another Array. Neither do I want to use SWAP. All I want is to do is have 5 Variables - High1, High2, High3, High4, High5. Where High1 will hold the highest of all values. Then input one data value at a time, compare it to the values in the various High variables and slot the value accordingly.

So the test becomes:
 "If VALUE > High5, and VALUE > High4 and VALUE > High3 and VALUE > High2 and VALUE > High1 then High1 = VALUE
   If Value > High5 and VALUE > High4 and VALUE > High3 and VALUE > High 2 and VALUE < High1 then High2  = VALUE
... this test eventually winds down to IF VALUE is less than all but High5 then VALUE = High5

Same litany of a test but this time running from High to Low
 " if VALUE  > High1 then High1 = VALUE
 " If VALUE < High1 and VALUE > High2 then High2 = VALUE
...this test eventually winds down to If VALUE < High1 and VALUE < High2 and VALUE < High3 and VALUE < High4 and VALUE > High5 and then High5 = VALUE

Now I realize there needs to be a lot more to the Test to deal with equal values and the possible shuffle of rankings that involves, and the sheer inefficiency in coding lines this type of a sorting involves but the question that comes to my mind is Does it matter, (in terms of speed and accuracy) with this type of sort routine if it runs high to low or low to high? In terms of accuracy, it seems ( seems the operative word here), that if I sort High to Low, initializing the High at zero helps, as does the sort low to high initializing the Low to say 100 if I know there is likely data with a value lower than that.

P.S. My goto for these kind of stupid questions is my Google box. Who informs me to put it in an array and use a quick sort. Other than that advice it has no idea what I'm talking about.




Offline Petr

  • Forum Resident
  • Posts: 1720
  • The best code is the DNA of the hops.
    • View Profile
Re: A Question on Sorting
« Reply #1 on: September 17, 2021, 02:24:21 pm »
Code: QB64: [Select]
  1.  
  2. 'generate 10.000 values and search 5 maximals
  3.  
  4. DIM values(9999) AS INTEGER
  5. DIM max(4) AS INTEGER
  6.  
  7. 'fill array with 10000 values
  8. FOR fill = 0 TO 9999
  9.     values(fill) = fill * 3
  10.  
  11. 'search 5 max
  12. FOR s = 0 TO 4
  13.     max(s) = SearchMax(values())
  14.  
  15. 'show outputs
  16. FOR s = 0 TO 4
  17.     PRINT max(s)
  18.  
  19.  
  20.  
  21.  
  22. FUNCTION SearchMax (values() AS INTEGER)
  23.     done = LBOUND(values)
  24.     DO UNTIL done = UBOUND(values)
  25.         IF SearchMax < values(done) THEN
  26.             SearchMax = values(done): index = done
  27.         END IF
  28.         done = done + 1
  29.     LOOP
  30.     values(index) = 0 'set maximal value to zero for next search

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: A Question on Sorting
« Reply #2 on: September 17, 2021, 04:35:11 pm »
Code: QB64: [Select]
  1. _Title "Top 5" 'b+ 2021-09-17
  2. Dim top(4)
  3. For i = 1 To 30 ' 30 randoms 0 to 99
  4.     r = Int(Rnd * 100) ' <<<<<<<<<<<<<<<<<< plug in data here
  5.     Print "New data:"; r: Print
  6.     If r > top(0) Then
  7.         For j = 3 To 0 Step -1
  8.             top(j + 1) = top(j)
  9.         Next
  10.         top(0) = r
  11.     ElseIf r > top(1) Then
  12.         For j = 3 To 1 Step -1
  13.             top(j + 1) = top(j)
  14.         Next
  15.         top(1) = r
  16.     ElseIf r > top(2) Then
  17.         For j = 3 To 2 Step -1
  18.             top(j + 1) = top(j)
  19.         Next
  20.         top(2) = r
  21.     ElseIf r > top(3) Then
  22.         top(4) = top(3)
  23.         top(3) = r
  24.     ElseIf r > top(4) Then
  25.         top(4) = r
  26.     End If
  27.  
  28.     For j = 0 To 4 ' progress
  29.         Print top(j)
  30.     Next
  31.     Print "zzz... press any"
  32.     Sleep
  33.     Cls
  34. Print "Top 5"
  35. For j = 0 To 4
  36.     Print top(j)
  37.  
  38.  
  39.  

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: A Question on Sorting
« Reply #3 on: September 17, 2021, 04:44:53 pm »
Now it's a Sub
Code: QB64: [Select]
  1. _Title "Top 5" 'b+ 2021-09-17
  2. Dim Shared top(4)
  3. Dim r(1 To 30)
  4. For i = 1 To 30 ' 30 randoms 0 to 99
  5.     r(i) = Int(Rnd * 100) ' <<<<<<<<<<<<<<<<<< plug in data here
  6. top5 r()
  7. Print "Top 5"
  8. For j = 0 To 4
  9.     Print top(j)
  10.  
  11. Sub top5 (arr())
  12.     For i = LBound(arr) To UBound(arr)
  13.         r = arr(i)
  14.         If r > top(0) Then
  15.             For j = 3 To 0 Step -1
  16.                 top(j + 1) = top(j)
  17.             Next
  18.             top(0) = r
  19.         ElseIf r > top(1) Then
  20.             For j = 3 To 1 Step -1
  21.                 top(j + 1) = top(j)
  22.             Next
  23.             top(1) = r
  24.         ElseIf r > top(2) Then
  25.             For j = 3 To 2 Step -1
  26.                 top(j + 1) = top(j)
  27.             Next
  28.             top(2) = r
  29.         ElseIf r > top(3) Then
  30.             top(4) = top(3)
  31.             top(3) = r
  32.         ElseIf r > top(4) Then
  33.             top(4) = r
  34.         End If
  35.     Next
  36.  
  37.  

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: A Question on Sorting
« Reply #4 on: September 17, 2021, 05:17:21 pm »
Generalized Top 5 to Top N, you say how many of the array you want off the top:
Code: QB64: [Select]
  1. _Title "Top N" 'b+ 2021-09-17  Generalize Top 5
  2. ReDim Shared top(0) '
  3. Dim r(1 To 30) ' random data array to test
  4. For i = 1 To 30 ' 30 randoms 0 to 99
  5.     r(i) = Int(Rnd * 100) ' <<<<<<<<<<<<<<<<<< plug in data here
  6. For N = 1 To 10
  7.     topN r(), N
  8.     Print "Top"; N
  9.     For j = 0 To N - 1
  10.         Print top(j)
  11.     Next
  12.     Print
  13.     Print "zzz... press any to contiue until Top 10"
  14.     Sleep
  15.     Cls
  16.  
  17. Sub topN (arr(), N)
  18.     ReDim top(N - 1)
  19.     start = N - 2
  20.     For i = LBound(arr) To UBound(arr)
  21.         r = arr(i)
  22.         fini = 0
  23.         again:
  24.         If r > top(fini) Then
  25.             For j = start To fini Step -1
  26.                 top(j + 1) = top(j)
  27.             Next
  28.             top(fini) = r
  29.             _Continue
  30.         End If
  31.         fini = fini + 1
  32.         If fini < N Then GoTo again
  33.     Next
  34.  
  35.  

Offline Dimster

  • Forum Resident
  • Posts: 500
    • View Profile
Re: A Question on Sorting
« Reply #5 on: September 18, 2021, 10:44:39 am »
Thanks you guys. I gather there is no getting away from it, an array needs to be defined, rather than individual variables. And I suppose this implies that with an array it's immaterial if the values are calculated low to high or high to low?

I appreciate the input.

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: A Question on Sorting
« Reply #6 on: September 18, 2021, 11:26:44 am »
Regardless of being in an array, or not, there’s not going to be any difference in sorting speeds from high to low, or low to high.

The basic, unfolded logic looks like this:

IF X1 < X2 THEN SWAP X1, X2
IF X1 < X3 THEN SWAP X1, X3
IF X1 < X4 THEN SWAP X1, X4
IF X1 < X5 THEN SWAP X1, X5
IF X2 < X3 THEN SWAP X2, X3
….and so on.

With the above, we have 5 variables (x1, x2, x3, x4, x5), and we swap to place them in lowest to highest order.  The first 4 statements make certain that X1 is the lowest.  The next 3 statements would make certain X2 is the lowest…. and so on.

If we wanted high to low instead, we’d do the EXACT same process except we’d change  < into >.  Same logic.  Same number of steps.  All exactly the same, except we’d change < to >.  There wouldn’t be a single difference in speed and performance between the two.



As an array, the same code would look like:
FOR i = 1 TO 5
    FOR j = i TO 5
        IF X(I) < X(J) THEN SWAP X(I), X(J)
    NEXT
NEXT

Simpler code, but the same basic process controlled by loops to sort from low to high.

And to sort from high to low, what’s needed?

You got it!  Just change that < to >, and that’s it!



Not a bit of difference in the performance between high-to-low and low-to-high one bit! 
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: A Question on Sorting
« Reply #7 on: September 18, 2021, 11:43:04 am »
Or, in your specific case that you asked about:

IF Value < High1 THEN
    High5 = High4: High4 = High3: High3 = High2: High2 = High1: High1 = Value
ELSEIF Value < High2 THEN
    High5 = High4: High4 = High3: High3 = High2: High2 = Value
ELSEIF Value < High3 THEN
…. and so on.

The above would be for low to high input.  For high to low, you’d just swap that < to >.  In the end it’d still perform the same.
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline TempodiBasic

  • Forum Resident
  • Posts: 1792
    • View Profile
Re: A Question on Sorting
« Reply #8 on: September 18, 2021, 02:12:10 pm »
Hi
my philosophical degression:

is faster from high to low or vice versa?  No one is faster than the other. 10miles are the same both running from left to right both running from right to left.

Your method , that has been showed in a fantastic way by All you QB64 master coders , resembles the bubble sort .
Thinking that at start time the array has all its elements = 0, and going on to put values in ordered mode from Highest to Lowest
I think that this bubble sort can become faster merging it with the binary search.

What do you think about this?
Programming isn't difficult, only it's  consuming time and coffee

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: A Question on Sorting
« Reply #9 on: September 18, 2021, 02:47:23 pm »
I don't think the Bubble sort is good for any case except for ease of understanding of how it works.

Also Binary search requires a sorted list to work.
« Last Edit: September 18, 2021, 02:49:13 pm by bplus »

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: A Question on Sorting
« Reply #10 on: September 18, 2021, 05:34:39 pm »
I don't think the Bubble sort is good for any case except for ease of understanding of how it works.

Also Binary search requires a sorted list to work.

Bubble sor….    **QUIVERS UNCONTROLLABLY**

BRRRR!!  I just can’t bring myself to even mention it, if we’re talking about usefulness! 

Still, if I’m completely honest, like most programmers who *hate* its performance, I’ll have to admit to implementing it with very limited datasets just for its simplicity.  I’ll stress here, for emphasis: VERY LIMITED DATASETS.

If I need to sort a deck of cards, then I might fall back onto a bubble sort.  If it’s a grocery list, a bubble sort will do.  If it’s my list of chores for the week, a bubble sort will do.  Usually, any dataset with less than 100 elements in it can be bubble sorted.

Beyond that, I either grab a Binary Insertion Sort, a Quick Sort, or a simple Comb Sort routine and make use of them — for the following reasons:

When the data allows, nothing is faster than a Binary Insertion Sort.  How does this work??

Start with an ordered list of ### items: apple, bat, cat, dog, frog, elephant…
Add a new item to the list: chicken
Do a binary search of your data to find the position where chicken goes: After 3rd word
Shift everything to the right one array element after that and insert the word in its proper place.

One shift of data in memory, coupled with a quick binary position search — it doesn’t get any faster or more efficent than this!  (But, it’s a limited use case sort method.)



Method 2: Quick Sort — works to become the fastest sort method with just about any generalized data set.  The only drawback here is if you recursively sort, I have seen issues with running out of stack space over the years.  Generally, I’ll apply this for moderate sized data sets (say less than 1,000,000,000 data items).



Method 3: Comb Sort —almost as fast as quick sort and still very simple to implement with usually less than 20 lines of code.  Works with all data sets, regardless of size, with no recursion or issues with stack limitations.  If I’m going to take the trouble to copy/paste an existing sort into a different program, *this* is the one I fall back on.  Comb Sort is also what MemSort is based upon, so if you’ve used it from the samples forums here, you’ve used a Comb Sort in your work before.  ;)
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline MWheatley

  • Newbie
  • Posts: 64
    • View Profile
Re: A Question on Sorting
« Reply #11 on: September 18, 2021, 06:40:03 pm »
I don't think the Bubble sort is good for any case except for ease of understanding of how it works.


Agreed.  But gosh, it's easy to use in real-world low-number scenarios.

Malcolm

Offline rickclark58

  • Newbie
  • Posts: 18
    • View Profile
    • YouTube Channel
Re: A Question on Sorting
« Reply #12 on: September 18, 2021, 08:03:49 pm »
For every sorting algorithm, there is a best-case and a worst-case. Generally, the average is a good rule of thumb to go by for general-purpose sorts. They all take a finite amount of time and what you use is largely dependant on the type and quantity of the data. You really need to fit the sort to the data for the best results. https://www.geeksforgeeks.org/time-complexities-of-all-sorting-algorithms/
Rick Clark

Offline Dimster

  • Forum Resident
  • Posts: 500
    • View Profile
Re: A Question on Sorting
« Reply #13 on: September 19, 2021, 08:43:11 am »
Wow, thanks again, lots to consider here. That "SWAPPING" algorythm Steve is similar to where I was heading but was using Select Case instead of the "If' statements.
And Tempodi's analogy of the 10 miles Left to Right or Right to Left pretty well highlights my thinking on the difference between a High to Low sort v's a Low to High but in my case I'm thinking ... travelling Right to Left is easier for a right handed person and vice versa. And walking that 10 mile highway going forward of backward makes a major difference. Conceptually, I have a sequential file and want to pick out the data one item at a time. I want to handle that data just once ( one decision on where it fits in a hierarchy) and onto the end of the sequential file. I wanted to avoid creating arrays and doing a sort or multiple swaps with each piece of data (because I have a lot of sequential files)

I have always considered a sort in High to Low order (always looking for the best rather than the worst or walking that 10 km/mile highway forward) as faster than Low to High. Human bias I guess.

Thanks again.

Offline TempodiBasic

  • Forum Resident
  • Posts: 1792
    • View Profile
Re: A Question on Sorting
« Reply #14 on: September 19, 2021, 11:18:22 am »
HI

about bubblesort
yes it seems to be the oldest and the slowest method of sorting a set of data of the same kind and with a theoretically homogeneus distribution in the range of  possible values. But if you change the kind of distribution the results change,  each sorting method has its ideal condition and its quicksand. For example think of wonderful Quicksort working with this set of data (02134569), here Bubblesort wins easily. While working with this set of data (90654231) I think that QuickSort wins easily vs Bubblesort.


about Dimster question
In the situation described by Dimster, at start point we have no mixed data but all 0 values into the 5 elements' array.
Going on with the input of value, once at time, we must sorting them from the highest to the lowest....after at each new input we must ordering inserting the value in the right position in respect of rule from hightest to lowest.
in simulation:
sorting High      1  2  3  4  5           

at start point    0  0  0  0  0

first = 2            2  0  0  0  0
second = 8       8  2  0  0  0
third = 1           8  2  1  0  0
forth = 6           8  6  2  1  0
fifth = 3            8   6  3  2 1

it seems that Bubblesort idea is the fastest in this approach.

about phylosophy (jump if you're not interested)--------------------------------------------------------
The Protagora's rule teach us that High and Low are relative to the setting and to the metric that we are using. No absolute point of view.
just for an example: going from High to Low is equal to go down or to go worst ,  while going from Low to High is equal to go up or to go better. But just we western humans think High as good and Low as bad....but a lower interest of banch loan is good enough? on the other hand see Yin-Yang idea there is no "high" or "low" but "higher than" and "lower than"
----------------------------------------------------------------------------------------------------------------------------------------------

about sorting algorithm


and
SortDemo.BAS

https://github.com/gondur/dmg8bit/blob/master/SORTDEMO.BAS  to get file


Fine to exchange opinions and share resources
Thanks
Programming isn't difficult, only it's  consuming time and coffee