Sorry your browser is not supported!

You are using an outdated browser that does not support modern web technologies, in order to use this site please update to a new browser.

Browsers supported include Chrome, FireFox, Safari, Opera, Internet Explorer 10+ or Microsoft Edge.

Newcomers DBPro Corner / sorting an array.

Author
Message
saintlee
10
Years of Service
User Offline
Joined: 30th Nov 2013
Location:
Posted: 22nd Jan 2014 00:38
A little bit of advice please chaps.

I need to sort a fairly large array from highest to lowest,

the tricky bit is, I have a second array which needs to keep it's values in relation to the first array

eg.

value1(x)=50: value2(x)=1

after x is sorted, i need the two arrays to still match.

I've managed to do this with a bubblesort





but it's too slow.

I can't quite figure out how to use the 'sort array' command to achieve this, as it will only sort one array or the other, and not keep the association between the two values..

any thoughts of how I can speed this up?

I've looked at various other sorts, but don't really understand how they work, which is a problem when I need to customize them!
Derek Darkly
12
Years of Service
User Offline
Joined: 22nd Sep 2011
Location: Whats Our Vector, Victor?
Posted: 22nd Jan 2014 14:26
Before running this code may I ask,

-What are the number values being used...random scores? What range?
-Is the desired sort simply from high to low?
-What is the specific association between the lists?

Ortu
DBPro Master
16
Years of Service
User Offline
Joined: 21st Nov 2007
Location: Austin, TX
Posted: 22nd Jan 2014 18:58 Edited at: 22nd Jan 2014 19:02
you might try to convert the first array to a UDT so that you can store both a value and an id to each index where the id is the matching array index in the second array. matrix1 plugin can sort UDT arrays on the value type then you can use the id to cross reference the second array. you may need to copy array 2 to a temporary array, and using the .id from array 1 against the temp data, you can empty array 2 and rebuild it in sync with array 1.

not sure how well this might perform...

saintlee
10
Years of Service
User Offline
Joined: 30th Nov 2013
Location:
Posted: 22nd Jan 2014 21:32
thanks for the replies guys.

I haven't played much with the udt's but this might well be worth investing some time into.

derek- I'm writing a program to work out good value bets on football (soccer for those stateside!)fixtures results. (I appreciate db isn't necessarily the best platform for this, but it is what I'm comfortable with and don't have the time to learn a new language)

I have converted a large csv file of over 20,000 results from 3 years worth of results in all the major European leagues into a set of arrays, which contain the full time scores, the half time scores,
the competing teams and the league the games took place in, so I have extensive historical data to test my prediction formula's against.

I have already written a spreadsheet which worked well, but wanted the flexibility of writing my own code.

the specific routine I am struggling with is so the software can actually work out it's own best formula's for accurately predicting a result (for example, a goal in the first half happens in 64% of all of the games in the database, my best formula accurately predicts at 86% for the top 5% of all games in the database, so I have a proven formula to ascertain what the probability is for any given fixture)

so the first field is the score the formula has worked out for the result, the second field is a simple 0 or 1 to identify whether that result actually had a goal in the first half or not.

I then sort the array in ascending order (keeping the association in the 2nd array) and then work out (by the 2nd field) how many of the top 5% scores were actually the result being looked for.

The bubble sort works great, but I now want the software to test different variables against itself to look for the most accurate formula, which means the sort has to be done every time a new set of variables is tested.

It currently takes about 20 seconds , but a bubble sort is far from perfect for a large sort needed continuously.

Also for some strange reason, the sort works a lot quicker when I isolate the routine away from my main code block and run it inside a project folder of its own, not sure why, it may have something to do with the amount of large arrays being used in the main program hogging memory .
saintlee
10
Years of Service
User Offline
Joined: 30th Nov 2013
Location:
Posted: 22nd Jan 2014 21:37
just to add, the range will be from 0-10000 as float values.

and needs sorting from highest to lowest.

The second range field is just a true/false value
Ortu
DBPro Master
16
Years of Service
User Offline
Joined: 21st Nov 2007
Location: Austin, TX
Posted: 23rd Jan 2014 20:55
hmm it sounds like you could just merge this into a single array either with UDT or as a multidimensional array and it should be pretty easy with a single sort

Phaelax
DBPro Master
21
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 24th Jan 2014 02:49
parallel arrays? I'd never suggest that if it can be avoided. And in your case I think it most definitely can be by using a UDT as others have suggested.

Bubble sort is very easy to implement, but it's not that fast. Here's an example of a quick sort.



Duffer
21
Years of Service
User Offline
Joined: 9th Feb 2003
Location: chair
Posted: 26th Jan 2014 11:24
~ saintlee -

Make your life easy, use IanM's Matrix plugin...

a long time dabbler with DBC and DBPro with no actual talent but lots of enthusiasm...
Lukas W
20
Years of Service
User Offline
Joined: 5th Sep 2003
Location: Sweden
Posted: 29th Jan 2014 00:04 Edited at: 29th Jan 2014 00:08
I have a question as well - how would I sort an array based on two values given by a different array where one of values are a string.

To clarify:


where, borrowing the sort function from Phaelax, my current code is


IS that a game, a program, or 3 year old drawing.
Phaelax
DBPro Master
21
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 29th Jan 2014 08:40 Edited at: 29th Jan 2014 08:50
For one thing, I would make group an integer. There's no need to make that a string. If you want the groups to have specific names, then I would store those separately then reference them with an index number.

To sort by multiple values, you need to specify which is more important. You said sort by group and then rarity. So as you sort by group first, if the value it's being compared to in the sorting function is equal to rather than greater or lesser, then compare the next important value. In this case that would be rarity.

How do you implement this into the sort algorithm? Look at the part that compares two values to see which is greater or lesser than. We're going to replace that bit with a function called compareTo(a, b). What this function does is return 1 if a is greater than b, -1 if a is less than b, or 0 if they're equal. This makes it very easy to add any number of columns to the sort without really changing the sort routine at all.






Here's an example using a separate array of group names for reference like I was mentioning.


Lukas W
20
Years of Service
User Offline
Joined: 5th Sep 2003
Location: Sweden
Posted: 30th Jan 2014 03:22
I see, I understand now. Thanks for the help!

Just for fun I added a 3rd value that it will account for while sorting so that not only does it sort by Group, and then Rarity, but it also sort it by item nr.. Like you said, it is easy to add any number of columns.



IS that a game, a program, or 3 year old drawing.
Kevin Picone
21
Years of Service
User Offline
Joined: 27th Aug 2002
Location: Australia
Posted: 30th Jan 2014 14:41
When sorting on a number of keys, I find it's often best to separate them. So if there's only a handful of possible groups than a raddix styled shuffle would be effective. Once the list is sorted on group, do the same Rarity. But this time rather than processing the entire list you're just looking at run of items that share the same group. This should get rid of a lot of the hard work. The the bigger data set the more balanced the impact of sorting it.


You could tweak up what you have though, unfortunately DBpro doesn't cache array/type accesses. So every time you query a field, it drops code to pull the value out of the data structure. This stuff can really have a negative impact on code fragments with a high iteration count. You can manage this by holding values in temp variables.

eg.


Phaelax
DBPro Master
21
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 31st Jan 2014 09:09
Quote: "You could tweak up what you have though, unfortunately DBpro doesn't cache array/type accesses. So every time you query a field, it drops code to pull the value out of the data structure. This stuff can really have a negative impact on code fragments with a high iteration count. You can manage this by holding values in temp variables. "

Good to know!

Kevin Picone
21
Years of Service
User Offline
Joined: 27th Aug 2002
Location: Australia
Posted: 31st Jan 2014 19:05
Phaelax,

yeah.. It can be very handy handy thing to know. Same applies to DB and AppGameKit btw.


Luke,

Just looking over your example and it seems to me that you just pack the three values into a single integer, then just sort the array upon that packed as normal.

In your code Item/ Rarity and group as set like this.

Array(i).Item = rnd(1)
Array(i).rarity = rnd(3)
Array(i).group = rnd(5)

from that we see ITEM is 1 bit, Rarity and 2 bit and group is 3 bit. I suspect you'll have bigger ranges in your actual program, but hopefully you get the gist.


If you wanted them sorted by GROUP -> RARITY -> ITEM where each has a range of 0=255 (8bit) then you could pack these in a value.

We might do this like so,

PackedSortKey = ( (array(i).group and 255) << 16 )
PackedSortKey = PackedSortKey+((array(i).rarity and 255) <<8)
PackedSortKey = PackedSortKey+(array(i).item and 255)

array(i).SortKey =PackedSortKey

Since we're assuming each field is 8bit, you could equally use the RGB() function in this case, it does the same thing.

examples,

If group= 10 , rarity =1 and item = 2 the packed version would equal $0A0102 (in hex) or 655,618 in decimal

If group= 1 , rarity =2 and item = 5 the packed version would equal $010205 (in hex) or 66,053 in decimal

so if sort on the packed key, we get our desired priorities

Phaelax
DBPro Master
21
Years of Service
User Offline
Joined: 16th Apr 2003
Location: Metropia
Posted: 1st Feb 2014 00:38
To expand on what Kevin is talking about, I have a tutorial on binary and bitwise operations.

Lukas W
20
Years of Service
User Offline
Joined: 5th Sep 2003
Location: Sweden
Posted: 1st Feb 2014 05:42 Edited at: 1st Feb 2014 06:05
Thank you, that really helps Kevin. I will read your tutorial too Phaelax because it looks like that could be useful not only for inventory management.

I attach to this post my full inventory, borrowing Phaelax sorting code (a bit modified to fit my need since an empty inventory-slot in my setup will appear as "-1" to the array, and that causes a crash as you can expect) as well as suggestion from Kevin to use a bitwise operations calculation in order to find sorting priority.
The code is a complete mess, but I have tried to comment everything out. I am sure that it would be possible to optimize this snippet a lot, but for now I am too lazy to begin with that.

Edit: need advanced 2d plugin to compile, and I dunno what else (if anything).

Edit 2: lol, fixed a silly bug.. which is hard to explain. but basically it would remove a busy item from inventory so that when the true item was removed it was not removed but yet it was.. eh, like I said: hard to explain >.< Anyway, now it's a working code.

IS that a game, a program, or 3 year old drawing.

Login to post a reply

Server time is: 2024-04-20 17:32:56
Your offset time is: 2024-04-20 17:32:56