Difference between revisions of "Data Structures"

From Kolmafia
Jump to navigation Jump to search
imported>Dj d
(Created page with 'Most of this information was copied directly from ASH Maps Tutorial, by Veracity (http://kolmafia.sourceforge.net/advanced.html#maps) A map is indexed by one data type (the…')
 
(Since scoping rules are an "advanced" topic, move the scope subsection to end of Records section)
 
(39 intermediate revisions by 15 users not shown)
Line 1: Line 1:
Most of this information was copied directly from ASH Maps Tutorial, by Veracity
+
{{TOCright}}
(http://kolmafia.sourceforge.net/advanced.html#maps)
 
  
    A map is indexed by one data type (the key) and associates that key with another (or the same) data type (the value). The key can be any ASH simple data type: boolean, int, float, string, item, zodiac, location, class, stat, skill, effect, familiar, slot, or monster. The value can be any of those or can be an aggregate: another map. This effectively allows multi-dimensional maps and. In fact, that's how the syntax we provide for multi-dimensional aggregates actually operates: maps of maps of maps ...
+
KoLmafia supports complex data structures such as maps and records made from simple [[Data Types|data types]].
  
    You can declare a map any time you can declare a variable: as a top level (global) variable, as a function parameter, or as a local variable in any scope.
+
== Maps ==
 +
If you are new to programming or find the information below confusing, you may want to read [[Map_Guide | A Noob's Guide to Maps]] first.  
  
    You can fetch data from a map any time you can provide a data value: in an expression, as a function parameter, on the right side of an assignment statement, from a "return" statement, as so on. You can pass around entire maps, individual elements, or intermediate maps: "slices".
+
Most of this information was copied directly from ASH Maps Tutorial, by Veracity (http://kolmafia.sourceforge.net/advanced.html#maps)
  
    If you use a map on the left side of an assignment, you set the whole map at once to the new value. If you specify a map and a complete set of indices (of the correct types) on the left side of an assignment statement, you set a single element. If you specify a map and a prefix of indices (of the correct type), you directly set one of the intermediate maps, a "slice".
+
A map is indexed by one data type (the key) and associates that key with another (or the same) data type (the value). The key can be any ASH simple data type: boolean, int, float, string, item, location, class, stat, skill, effect, familiar, slot, or monster. The value can be any ASH data type at all: a simple type, a record, or can be another map. This effectively allows multi-dimensional maps and. In fact, that's how the syntax we provide for multi-dimensional maps actually operate: maps of maps of maps ...
  
    The syntax for declaring the data type of a map:
+
You can declare a map any time you can declare a variable: as a top level (global) variable, as a function parameter, or as a local variable in any scope.
  
        <data type> [ <key type>, ... ] 
+
You can fetch data from a map any time you can provide a data value: in an expression, as a function parameter, on the right side of an assignment statement, from a "return" statement, as so on. You can pass around entire maps, individual elements, or intermediate maps: "slices".
  
    The syntax for referencing an element (or slice) of a map:
+
=== Declarations ===
  
        <variablename>[ <key expression>, ... ] 
+
The syntax for declaring the data type of a map:
  
    All the key expressions will be evaluated at run time. If you specify all the keys the map expects, you fetch data of the type specified by the map. If you specify fewer keys than the map expects, you get an intermediate map, a "slice".
+
<pre>
 +
<data type> [ <key type>, ... ] <aggregate_name>
 +
</pre>
  
    As an example:
+
For example:
  
        boolean [string, string] props;
+
{{CodeSample
 +
|code=<syntaxhighlight>
 +
string [item] map1;
 +
float [class, string, int] another_map;
 +
</syntaxhighlight>}}
  
    might be used to hold "properties" associated with names.
+
=== Assignments ===
  
        props[ "dog", "mammal" ] = true; props[ "dog", "pet" ] = true; props[ "dog", "fun" ] = false; props[ "turtle", "mammal" ] = false; props[ "turtle", "pet" ] = true; props[ "turtle", "fun" ] = false; props[ "aardvark", "mammal" ] = true; props[ "aardvark", "pet" ] = false; props[ "aardvark", "fun" ] = true; 
+
==== Assigning individual values ====
  
    references:
+
If you specify a map and a complete set of indices (of the correct types) on the left side of an assignment statement, you set a single element.
  
        props[ "dog", "mammal"] => true boolean [string] animal = props[ "turtle" ]; animal[ "fun" ] => false 
+
{{CodeSample
 +
|code=<syntaxhighlight>
 +
int[item] my_pricelist;
 +
my_pricelist[ $item[ pail ] ] = 1000;
 +
my_pricelist[ $item[ bum cheek ] ] = 404;
 +
my_pricelist[ $item[ red button ] ] = 100;
 +
my_pricelist[ $item[ red balloon ] ] = 100000;
 +
</syntaxhighlight>}}
  
    You can test the presence of a key in a map using the "contains" operator:
+
==== Aggregate literals ====
 +
You can declare ready to use, anonymous (single use) maps on the fly, called ''aggregate literals'', at any time you can provide a data value. They follow the syntax:
 +
<pre>
 +
{ <key>: <value>, <key>: <value>, <key>: <value> ... }
 +
</pre>
  
        <aggregate reference expression> contains <key expression>  
+
An ''aggregate literal'' can be used to initialize an entire aggregate at once. For example:
  
    Where <aggregate reference expression> must evaluate at run time to a map or slice, and must evaluate at run time to a key of the appropriate type. (Note that that is enforced at parse time; ASH can tell the datatype any expression will produce).
+
{{CodeSample
 +
|code=<syntaxhighlight>
 +
int[item] my_pricelist = {
 +
  $item[pail]: 1000,
 +
  $item[bum cheek]: 404,
 +
  $item[red button]: 100,
 +
  $item[red balloon]: 100000
 +
};
 +
</syntaxhighlight>}}
 +
which achieves the same as what was done in the 'Assigning individual values' example.
  
        props contains "dog" => true props contains "elephant" => false props[ "aardvark" ] contains "fun" => true animal contains "pet" => true animal contains "favorite food" => false 
 
  
    You can remove a key-value association from a map using the "remove" unary operator:
+
Note: when using an aggregate literal for anything else than assignment (e.g. as a return statement, in an expression, or as a function parameter), you need to supply the data types of the aggregate in front of the aggregate literal, even when the expected data types are already known. Example:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
item [int] my_function () {
  
        remove <aggregate reference> 
+
  ...
  
    For clarification, an aggregate reference is "<map name>[ <index 1> ... <index n> ]" where <map name>[ <index 1> ... <index n-1> ] specifies the "slice" and <index n> specifies the "key". Which is just what you expect, if you fully specify the indices; for a single dimensional map, "map[10]" -> "map" is the slice and 10 is the key. The "remove" operator removes the "key" from the "slice". For example:
 
  
        string [int] map1; map1[5] = "foo"; print( count( map1 ) + " " + map1 contains 5 + " " + map1[5] ); print( "remove: " + remove map1[5] ); print( count( map1 ) + " " + map1 contains 5 + " " + map1[5] ); print( "remove: " + remove map1[5] ); int [string, string] map2; map2["me","you"] = 17; print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] ); print( "remove: " + remove map2["me", "you"] ); print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] ); print( "remove: " + remove map2["me", "you"] ); print( count( map2 ) + " " + map2["me"] ); print( "remove: " + remove map2["me"] ); print( count( map2 ) + " " + map2["me"] );
+
  return item [int] {<int>: <item>, <int>: <item>, <int>: <item> ...};
 +
}
 +
</syntaxhighlight>}}
  
    yields:
 
  
        1 true foo remove: foo 0 false remove: 1 true 17 remove: 17 0 false 0 remove: 0 1 aggregate int [string] remove: aggregate int [string] 0 aggregate int [string]
+
Note bis: if the key of your map is integers, providing the keys is not mandatory; the keys will be automatically supplied in order, so:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
string [int] {"this","is","called","a","list"};
 +
</syntaxhighlight>}}
 +
Creates a map that contains:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
0 => this
 +
1 => is
 +
2 => called
 +
3 => a
 +
4 => list
 +
</syntaxhighlight>}}
 +
 
 +
 
 +
It is possible to make an aggregate literal of a multi-dimensional map. Since multi-dimensional maps are effectively maps of maps (of maps) ''(of maps)'' ''(...)'', their aggregate literal would be an aggregate literal, inside an aggregate literal ''(inside an aggregate literal)'' (...). Nested aggregate literals.
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
item[class, slot] basic_equipments = {
 +
  $class[Seal Clubber]: {
 +
    $slot[hat]: $item[seal-skull helmet],
 +
    $slot[weapon]: $item[seal-clubbing club],
 +
    $slot[pants]: $item[old sweatpants]
 +
  },
 +
  $class[Turtle Tamer]: {
 +
    $slot[hat]: $item[helmet turtle],
 +
    $slot[weapon]: $item[turtle totem],
 +
    $slot[pants]: $item[old sweatpants]
 +
  },
 +
  $class[Pastamancer]: {
 +
    $slot[hat]: $item[ravioli hat],
 +
    $slot[weapon]: $item[pasta spoon],
 +
    $slot[pants]: $item[old sweatpants]
 +
  },
 +
  $class[Sauceror]: {
 +
    $slot[hat]: $item[Hollandaise helmet],
 +
    $slot[weapon]: $item[saucepan],
 +
    $slot[pants]: $item[old sweatpants]
 +
  },
 +
  $class[Disco Bandit]: {
 +
    $slot[hat]: $item[disco mask],
 +
    $slot[weapon]: $item[disco ball],
 +
    $slot[pants]: $item[old sweatpants]
 +
  },
 +
  $class[Accordion Thief]: {
 +
    $slot[hat]: $item[mariachi hat],
 +
    $slot[weapon]: $item[stolen accordion],
 +
    $slot[pants]: $item[mariachi pants]
 +
  }
 +
};
 +
</syntaxhighlight>}}
 +
 
 +
==== Map-map assignments ====
 +
If you use a map on the left side of an assignment, you set the whole map at once to the new value.
 +
 
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
int [item] my_pricelist;
 +
int [item] new_pricelist;
 +
 
 +
/* Some code that updates my_pricelist with new_pricelist */
 +
 
 +
my_pricelist = new_pricelist;
 +
 
 +
/* Now my_pricelist and new_pricelist point to the same aggregate;
 +
  changes to an element of one of them will be visible in the other */
 +
</syntaxhighlight>}}
 +
 
 +
 
 +
==== Slices ====
 +
If you specify a map and a prefix of indices (of the correct type), you directly set one of the intermediate maps, a "slice".
 +
 
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
float [string, int, string] my_map;
 +
float [int, string] slice1;
 +
 
 +
/* Some code that fills my_map[ "slice1" ] with slice1 */
 +
my_map[ "slice1" ] = slice1;
 +
</syntaxhighlight>}}
 +
 
 +
=== References ===
 +
 
 +
The syntax for referencing an element (or slice) of a map:
 +
 
 +
<pre>
 +
<aggregate name>[ <key expression>, ... ]
 +
</pre>
 +
 
 +
All the key expressions will be evaluated at run time. If you specify all the keys the map expects, you fetch data of the type specified by the map. If you specify fewer keys than the map expects, you get an intermediate map, a "slice".
 +
 
 +
As an example:
 +
{{
 +
CodeSample|
 +
code=
 +
<syntaxhighlight>
 +
boolean [string, string] props;
 +
</syntaxhighlight>}}
 +
 
 +
might be used to hold "properties" associated with names.
 +
{{
 +
CodeSample|
 +
code=
 +
<syntaxhighlight>
 +
props[ "dog", "mammal" ] = true;
 +
props[ "dog", "pet" ] = true;
 +
props[ "dog", "fun" ] = false;
 +
props[ "turtle", "mammal" ] = false;
 +
props[ "turtle", "pet" ] = true;
 +
props[ "turtle", "fun" ] = false;
 +
props[ "aardvark", "mammal" ] = true;
 +
props[ "aardvark", "pet" ] = false;
 +
props[ "aardvark", "fun" ] = true;
 +
</syntaxhighlight>}}
 +
 +
 
 +
references:
 +
 
 +
<pre>
 +
props[ "dog", "mammal"] => true
 +
boolean [string] animal = props[ "turtle" ];
 +
animal[ "fun" ] => false
 +
</pre>
 +
 
 +
=== Contains ===
 +
 
 +
You can test the presence of a key in a map using the "contains" operator:
 +
 
 +
<pre>
 +
<aggregate reference expression> contains <key expression>
 +
</pre>
 +
 
 +
Where <aggregate reference expression> must evaluate at run time to a map or slice, and must evaluate at run time to a key of the appropriate type. (Note that that is enforced at parse time; ASH can tell the datatype any expression will produce).
 +
 
 +
<pre>
 +
props contains "dog" => true
 +
props contains "elephant" => false
 +
props[ "aardvark" ] contains "fun" => true
 +
animal contains "pet" => true
 +
animal contains "favorite food" => false
 +
</pre>
 +
 
 +
 
 +
Note that only the current slice's keys are evaluated. This means that:
 +
 
 +
<pre>
 +
boolean [ boolean ] [ boolean ] [ boolean ] [ boolean ] [ boolean ] much_truthness;
 +
much_truthness [ true ] [ true ] [ false ] [ true ] [ true ] = true;
 +
 
 +
return much_truthness [ true ] [ true ] contains true;
 +
</pre>
 +
 
 +
yields:
 +
<pre>
 +
Returned: false
 +
</pre>
 +
 
 +
=== Remove ===
 +
 
 +
You can remove a key-value association from a map using the "remove" unary operator:
 +
 
 +
<pre>
 +
remove <aggregate reference>
 +
</pre>
 +
 
 +
For clarification, an aggregate reference is "<map name>[ <index 1> ... <index n> ]" where <map name>[ <index 1> ... <index n-1> ] specifies the "slice" and <index n> specifies the "key". Which is just what you expect, if you fully specify the indices; for a single dimensional map, "map[10]" -> "map" is the slice and 10 is the key. The "remove" operator removes the "key" from the "slice". For example:
 +
 
 +
{{CodeSample|code=<syntaxhighlight>
 +
string [int] map1;
 +
map1[5] = "foo";
 +
print( count( map1 ) + " " + map1 contains 5 + " " + map1[5] );
 +
print( "remove: " + remove map1[5] );
 +
print( count( map1 ) + " " + map1 contains 5 + " "  + map1[5] );
 +
print( "remove: " + remove map1[5] );
 +
int [string, string] map2;
 +
map2["me","you"] = 17;
 +
print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] );
 +
print( "remove: " + remove map2["me", "you"] );
 +
print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] );
 +
print( "remove: " + remove map2["me", "you"] );
 +
print( count( map2 ) + " " + map2["me"] );
 +
print( "remove: " + remove map2["me"] );
 +
print( count( map2 ) + " " + map2["me"] );
 +
</syntaxhighlight>}}
 +
 
 +
yields:
 +
 
 +
<pre>
 +
1 true foo
 +
remove: foo
 +
0 false
 +
remove:
 +
1 true 17
 +
remove: 17
 +
0 false 0
 +
remove: 0
 +
1 aggregate int [string]
 +
remove: aggregate int [string]
 +
0 aggregate int [string]
 +
</pre>
 +
 
 +
=== Clear ===
 +
 
 +
You can remove all <code>key => value</code> entries from a map using the {{f|clear}} function:
 +
 
 +
{{CodeSample|code=<syntaxhighlight>clear( <aggregate> );</syntaxhighlight>}}
 +
 
 +
=== Count ===
 +
 
 +
The {{f|count}} function returns the number of defined keys for the specified aggregate.
 +
 
 +
{{CodeSample|code=<syntaxhighlight>int size = count( <aggregate> );</syntaxhighlight>}}
 +
 
 +
=== Sort ===
 +
 
 +
From http://kolmafia.us/showthread.php?t=1738 and http://kolmafia.us/showthread.php?10729
 +
 
 +
The syntax is:
 +
{{CodeSample|code=<syntaxhighlight>sort aggregate by keyExpr;</syntaxhighlight>}}
 +
 
 +
<code>aggregate</code> is a reference to the object to be sorted - arrays are probably the most useful things to sort, but any mapping type can be used.  But please note that when you sort a map, you change the values that correspond to the index. To sort on a map, you would want to use a multidimensional maps, but note that you can only sort along a single dimension at a time when doing this. Simply put... "sort" is only useful in cases where your data exists entirely in the values of the map; the keys can have no meaning beyond simply being distinct.
 +
 
 +
The reference must not be enclosed in parentheses, as that would look like a call to a function named <code>sort()</code> - which is still perfectly valid, "sort" has not become a [[Reserved Words|reserved word]].
 +
 
 +
<code>keyExpr</code> is an arbitrary expression that defines how the items should be ordered. It is evaluated once for every entry in the aggregate, in a scope with two additional variables implicitly defined: '<code>index</code>' and '<code>value</code>', holding the details of that entry. The value of the <code>keyExpr</code> is used as the sort key; typically it would be an <code>int</code> or <code>string</code>, but can be any ASH type that can be compared via "<" and the other relational operators.
 +
 
 +
The most basic form of sorting would therefore be "<code>sort ... by value</code>", but many useful things can be done with the use of a more complex <code>keyExpr</code> - the only real restriction is that the expression should not modify the object you're sorting. For example, if you had an array of items, you could sort it "<code>by autosell_price(value)</code>". An array of weapon items could be sorted "<code>by -get_power(value)</code>" to put it in decreasing order of power. If the elements of your aggregate are records, you'd need to use something like "<code>by value.fieldName</code>", since the records themselves can't be meaningfully compared.
 +
 
 +
After the sort statement, the aggregate will have exactly the same sets of keys and values as before (even if the keys weren't consecutive), and the iteration order of the keys will be the same, but the values will likely be associated with different keys. The sort is stable - in other words, elements with sort keys that compare as equal will remain in the same order. This means that you can sort on multiple criteria by simply performing separate sorts for each of the criteria, in increasing order of significance.
 +
 
 +
To find out how many things you have, you might do:
 +
{{CodeSample|code=<syntaxhighlight>
 +
item [int] whatGot;
 +
int ctr =0;
 +
 
 +
foreach it in get_inventory() {
 +
  whatGot[ctr] = it;
 +
  ctr+=1;
 +
}
 +
 
 +
sort whatGot by item_amount(value);
 +
 
 +
foreach x, it in whatGot
 +
  print(item_amount(it) + ' of ' + it);
 +
</syntaxhighlight>}}
 +
Note that this use of an optional feature of foreach. The second variable in the foreach is the value of whatGot[x].
 +
 
 +
 
 +
A few more examples of things you can do:
 +
* "<code>by -value</code>" sorts integers in decreasing order (there's no similar trick for <code>string</code> values).
 +
* "<code>by -index</code>" reverses the existing order of an array (or map with integer keys).
 +
* "<code>by random(1000000)</code>" shuffles into a random order.
 +
* "<code>by otherArray[index]</code>" uses values from a parallel array as the sort keys (you'd then need to do "<code>sort otherArray by value;</code>" if you wanted the two arrays to remain in sync).
 +
 
 +
===Iteration===
 +
To iterate through a map, use the '''foreach''' operator. For instance, if you wanted to print out how many of each item you had, you could do something like the following:
 +
{{CodeSample|code=<syntaxhighlight>
 +
int[item] map = get_inventory();
 +
foreach key in map {
 +
    print(key + " (" + map[key] + ")");
 +
}
 +
</syntaxhighlight>}}
 +
 
 +
Multidimensional maps are implemented as maps that map keys to maps. '''int[item][string]map''' is really a mapping of items to int[string] maps. Iteration, therefore, is as follows:
 +
{{CodeSample|code=<syntaxhighlight>
 +
int[item][string] map;
 +
file_to_map("somefile.txt", map);
 +
foreach k1 in map {
 +
    print(k1 + ": ");
 +
    foreach k2 in map[k1] {
 +
        print("\t" + k2 + ": " + map[k1][k2]);
 +
    }
 +
}
 +
</syntaxhighlight>}}
 +
 
 +
Two things to note: First, '''int[item][string]map''' is equivalent to '''int[item, string]map'''. This really comes down to author preference, although the second form is generally more common. Second, the two following foreach loops are equivalent:
 +
{{CodeSample|code=<syntaxhighlight>
 +
int[item][string] map;
 +
foreach k1 in map {
 +
    foreach k2 in map[k1] {
 +
        func(map[k1][k2]);
 +
    }
 +
}
 +
 
 +
foreach k1, k2 in map {
 +
    func(map[k1][k2]);
 +
}
 +
</syntaxhighlight>}}
 +
 
 +
Of course, the latter does not lend itself to, say, only printing the first key once, whereas the former can be used that way (see the preceding example).
 +
 
 +
===Implementation===
 +
Maps in ASH are implemented internally as TreeMaps [http://download.oracle.com/javase/1.5.0/docs/api/java/util/TreeMap.html]. See below for some implications.
 +
 
 +
== Arrays ==
 +
These look and behave like mappings of integers to values, where the keys only take values from 0 to n, but these are implemented as Java Arrays.
 +
 
 +
===Differences between arrays and maps===
 +
 
 +
'''item [12] array;'''
 +
 
 +
Can use keys 0 - 11. You get a runtime error if you use any other key. It always uses memory to hold 12 items, even if you only use a couple of them. But it's a constant time - O(1) - to access any element.
 +
 
 +
'''item [int] map;'''
 +
 
 +
Can use any int as a key. It has constant memory for the Java map, and additional memory for each element in the map, but is O( log n) to access any particular element.
 +
 
 +
If you are able to use (a fairly densely packed set of) integers as keys, your program will be faster and use (potentially) slightly more memory.
 +
 
 +
If you have a sparse set of integers, you can still use an array and get fast access, but you will waste a lot of memory.
 +
 
 +
If you can't use integers as keys or don't want to waste memory on a sparse array, you can have a slower but less memory consuming map.
 +
 
 +
[http://kolmafia.us/showthread.php?6425-Sorting-skills-by-mana-cost&p=48703&viewfull=1#post48703]
 +
 
 +
====Time considerations====
 +
* Given '''if (a == item1 || a == item2 || a == item3)''' and '''if ($items[item1, item2, item3] contains a)''', which is faster?
 +
 
 +
This is going to depend on the number of items in the list, and which one happens to match; if 'a' is almost always item1, then the first form is likely to win on practical grounds, even though it's theoretically slower (O(n) vs. O(log n)).
 +
 
 +
The second form is a definite win assuming no such coincidences of the item chosen, a somewhat larger set of items, and that the code is executed more than once per run of the script. The first lookup in a plural constant actually builds an internal map that allows such queries to be efficiently done; this is deferred because typical use of a plural constant involves only iteration, not lookups.
 +
 
 +
There's always the "profile" command, if you really need to know which is more efficient in a given situation - although it's unlikely that either would have a noticeable effect on your script's performance.
 +
 
 +
[http://kolmafia.us/showthread.php?6425-Sorting-skills-by-mana-cost&p=48728&viewfull=1#post48728]
 +
 
 +
== Records ==
 +
 
 +
(Link to Veracity's post introducing the record [http://kolmafia.us/showthread.php?t=280])
 +
 
 +
Records are custom-made data types, that act like compartmentalized boxes [https://i.imgur.com/rgxLKRC.jpg].
 +
 
 +
They are "variable packages", that allow you to group together data of any type, including other records, in them.
 +
 
 +
Records have "models" (the custom data type of the record itself), and names representing them (though not all of them are creative, such as ''"the_black_one_that_I_bought_a_year_ago"'', or ''"the_one_that_is_over_there_by_the_door"'').
 +
 
 +
Each "compartment" (field) also has a name, as well as an expected data type. ''(Stop trying to put screws "there"; that's my god damn lunchbox, and this is "the spot in which you put the guacamole"!)''
 +
 
 +
Once a "box's model" (record's data type) is declared, it can be used (almost) anywhere that can use a built-in type name (as long as you don't leave the scope in which the record was declared), such as in functions and other records.
 +
 
 +
===Declarations===
 +
The syntax for declaring a new record:
 +
<pre>
 +
record <"model" name> { <data type 1> <field name 1>; <data type 2> <field name 2>; ... }
 +
</pre>
 +
 
 +
For example:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
record human {
 +
    string name;
 +
    int date_of_birth;
 +
    location location_of_birth;
 +
    item [slot] clothes;
 +
};
 +
</syntaxhighlight>}}
 +
 
 +
This declared a new data '''''type''''', a ''mold'', a ''blueprint''... No variable was created here.
 +
 
 +
To actually ''create'' a variable that ''uses'' this "model", you would '''then''' do:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
human that_guy;
 +
human this_guy;
 +
human that_other_guy;
 +
</syntaxhighlight>}}
 +
You now have 3 "human"s, each with a name, date of birth, location of birth, and clothes (though none of them has had those assigned, yet).
 +
 
 +
 
 +
It is possible to declare a record and create a variable of that type at the same time. The following is equivalent to the two previous code samples combined:
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
record human {
 +
    string name;
 +
    int date_of_birth;
 +
    location location_of_birth;
 +
    item [slot] clothes;
 +
} that_guy;
 +
 
 +
human this_guy;
 +
human that_other_guy;
 +
</syntaxhighlight>}}
 +
 
 +
===Accessing the fields===
 +
Once a variable of a recorded data type was created, you may access its fields with <variable name>.<field name>:
 +
 
 +
{{CodeSample
 +
|code=<syntaxhighlight>
 +
that_guy.name = "Andrew";
 +
that_guy.date_of_birth = 08212006;
 +
that_guy.location_of_birth = $location[Noob Cave];
 +
 
 +
that_other_guy.name = "Bob";
 +
that_other_guy.location_of_birth = $location[South of the Border];
 +
 
 +
if ( that_other_guy.clothes [ $slot[pants] ] == $item[none] )
 +
    print( "Put some pants on, " + that_other_guy.name + "!" );
 +
</syntaxhighlight>}}
 +
 
 +
 
 +
===Omitting the "model" name===
 +
When declaring a record, supplying the "model" name (what goes between "record" and the left curly bracket) is not mandatory. If you don't, the record will be "anonymous".
 +
 
 +
An "anonymous" record is like a makeshift bindle; you made it on a whim, it's unique, it's yours, you couldn't replicate it even if you tried, and nobody else wants it.
 +
 
 +
If you omit data type's name, you'll definitely want to supply a variable name after the curly braces, or it will have been a pure waste of time:
 +
{{CodeSample|code=<syntaxhighlight>
 +
record {
 +
  int days_spent_travelling;
 +
item [int] content;
 +
} mah_bindle;
 +
 
 +
mah_bindle.content [0] = $item[ big rock ];
 +
</syntaxhighlight>}}
 +
 
 +
 
 +
===[[New]]===
 +
[[New]] can be used to assign a whole record at once.
 +
 
 +
=== Scope ===
 +
Like variables, records can be defined inside blocks (<code>{}</code>).
 +
 
 +
<syntaxhighlight>
 +
if ( /* some condition */ ) {
 +
  // Records can be defined in if-blocks
 +
  record Dog { string name; };
 +
}
 +
 
 +
void foo() {
 +
  // Records can be defined inside functions
 +
  record Bar { int id; };
 +
}
 +
</syntaxhighlight>
 +
 
 +
A record definition is local to (i.e. only valid within) the scope it is defined:
 +
 
 +
<syntaxhighlight>
 +
{
 +
  // Dog is defined within this block
 +
  // (can be an if-block, for-loop, function, ...)
 +
  record Dog { string name; };
 +
}
 +
 
 +
// Since Dog is not defined here, this is invalid code
 +
Dog mydog; // Error: Unknown variable 'Dog'
 +
</syntaxhighlight>
 +
 
 +
Unlike variables, record definitions cannot shadow other records in outer scopes.
 +
 
 +
<syntaxhighlight>
 +
// 'Dog' record is defined in the global scope
 +
record Dog { string name; };
 +
 
 +
void foo() {
 +
  // Since Dog is already defined in the parent scope, this is invalid code
 +
  record Dog { int id; }; // Error: Record name 'Dog' is already defined
 +
}
 +
</syntaxhighlight>
 +
 
 +
However, since record definitions are local to their scope, this is fine:
 +
 
 +
<syntaxhighlight>
 +
{
 +
  // 'Dog' record is defined inside a block
 +
  // (can be an if-block, for-loop, function, ...)
 +
  record Dog { string name; };
 +
}
 +
 
 +
// Since Dog is not defined in this scope (yet), it can be defined here
 +
// Note: This is poor practice and shouldn't be used in real code
 +
record Dog { int id; };
 +
</syntaxhighlight>
 +
 
 +
[[Category:Scripting]]

Latest revision as of 17:33, 27 February 2021

KoLmafia supports complex data structures such as maps and records made from simple data types.

Maps

If you are new to programming or find the information below confusing, you may want to read A Noob's Guide to Maps first.

Most of this information was copied directly from ASH Maps Tutorial, by Veracity (http://kolmafia.sourceforge.net/advanced.html#maps)

A map is indexed by one data type (the key) and associates that key with another (or the same) data type (the value). The key can be any ASH simple data type: boolean, int, float, string, item, location, class, stat, skill, effect, familiar, slot, or monster. The value can be any ASH data type at all: a simple type, a record, or can be another map. This effectively allows multi-dimensional maps and. In fact, that's how the syntax we provide for multi-dimensional maps actually operate: maps of maps of maps ...

You can declare a map any time you can declare a variable: as a top level (global) variable, as a function parameter, or as a local variable in any scope.

You can fetch data from a map any time you can provide a data value: in an expression, as a function parameter, on the right side of an assignment statement, from a "return" statement, as so on. You can pass around entire maps, individual elements, or intermediate maps: "slices".

Declarations

The syntax for declaring the data type of a map:

<data type> [ <key type>, ... ] <aggregate_name>

For example:


string [item] map1;
float [class, string, int] another_map;


Assignments

Assigning individual values

If you specify a map and a complete set of indices (of the correct types) on the left side of an assignment statement, you set a single element.


int[item] my_pricelist;
my_pricelist[ $item[ pail ] ] = 1000;
my_pricelist[ $item[ bum cheek ] ] = 404;
my_pricelist[ $item[ red button ] ] = 100;
my_pricelist[ $item[ red balloon ] ] = 100000;


Aggregate literals

You can declare ready to use, anonymous (single use) maps on the fly, called aggregate literals, at any time you can provide a data value. They follow the syntax:

{ <key>: <value>, <key>: <value>, <key>: <value> ... }

An aggregate literal can be used to initialize an entire aggregate at once. For example:


int[item] my_pricelist = {
  $item[pail]: 1000,
  $item[bum cheek]: 404,
  $item[red button]: 100,
  $item[red balloon]: 100000
};

which achieves the same as what was done in the 'Assigning individual values' example.


Note: when using an aggregate literal for anything else than assignment (e.g. as a return statement, in an expression, or as a function parameter), you need to supply the data types of the aggregate in front of the aggregate literal, even when the expected data types are already known. Example:

item [int] my_function () {

  ...


  return item [int] {<int>: <item>, <int>: <item>, <int>: <item> ...};
}


Note bis: if the key of your map is integers, providing the keys is not mandatory; the keys will be automatically supplied in order, so:

string [int] {"this","is","called","a","list"};

Creates a map that contains:

0 => this
1 => is
2 => called
3 => a
4 => list


It is possible to make an aggregate literal of a multi-dimensional map. Since multi-dimensional maps are effectively maps of maps (of maps) (of maps) (...), their aggregate literal would be an aggregate literal, inside an aggregate literal (inside an aggregate literal) (...). Nested aggregate literals.

item[class, slot] basic_equipments = {
  $class[Seal Clubber]: {
    $slot[hat]: $item[seal-skull helmet],
    $slot[weapon]: $item[seal-clubbing club],
    $slot[pants]: $item[old sweatpants]
  },
  $class[Turtle Tamer]: {
    $slot[hat]: $item[helmet turtle],
    $slot[weapon]: $item[turtle totem],
    $slot[pants]: $item[old sweatpants]
  },
  $class[Pastamancer]: {
    $slot[hat]: $item[ravioli hat],
    $slot[weapon]: $item[pasta spoon],
    $slot[pants]: $item[old sweatpants]
  },
  $class[Sauceror]: {
    $slot[hat]: $item[Hollandaise helmet],
    $slot[weapon]: $item[saucepan],
    $slot[pants]: $item[old sweatpants]
  },
  $class[Disco Bandit]: {
    $slot[hat]: $item[disco mask],
    $slot[weapon]: $item[disco ball],
    $slot[pants]: $item[old sweatpants]
  },
  $class[Accordion Thief]: {
    $slot[hat]: $item[mariachi hat],
    $slot[weapon]: $item[stolen accordion],
    $slot[pants]: $item[mariachi pants]
  }
};


Map-map assignments

If you use a map on the left side of an assignment, you set the whole map at once to the new value.


int [item] my_pricelist;
int [item] new_pricelist;

/* Some code that updates my_pricelist with new_pricelist */

my_pricelist = new_pricelist;

/* Now my_pricelist and new_pricelist point to the same aggregate; 
   changes to an element of one of them will be visible in the other */


Slices

If you specify a map and a prefix of indices (of the correct type), you directly set one of the intermediate maps, a "slice".


float [string, int, string] my_map;
float [int, string] slice1;

/* Some code that fills my_map[ "slice1" ] with slice1 */
my_map[ "slice1" ] = slice1;


References

The syntax for referencing an element (or slice) of a map:

<aggregate name>[ <key expression>, ... ]

All the key expressions will be evaluated at run time. If you specify all the keys the map expects, you fetch data of the type specified by the map. If you specify fewer keys than the map expects, you get an intermediate map, a "slice".

As an example:

boolean [string, string] props;


might be used to hold "properties" associated with names.

props[ "dog", "mammal" ] = true; 
props[ "dog", "pet" ] = true; 
props[ "dog", "fun" ] = false;
props[ "turtle", "mammal" ] = false;
props[ "turtle", "pet" ] = true;
props[ "turtle", "fun" ] = false;
props[ "aardvark", "mammal" ] = true;
props[ "aardvark", "pet" ] = false;
props[ "aardvark", "fun" ] = true;


references:

props[ "dog", "mammal"] => true
boolean [string] animal = props[ "turtle" ];
animal[ "fun" ] => false

Contains

You can test the presence of a key in a map using the "contains" operator:

<aggregate reference expression> contains <key expression>

Where <aggregate reference expression> must evaluate at run time to a map or slice, and must evaluate at run time to a key of the appropriate type. (Note that that is enforced at parse time; ASH can tell the datatype any expression will produce).

props contains "dog" => true
props contains "elephant" => false
props[ "aardvark" ] contains "fun" => true
animal contains "pet" => true
animal contains "favorite food" => false


Note that only the current slice's keys are evaluated. This means that:

boolean [ boolean ] [ boolean ] [ boolean ] [ boolean ] [ boolean ] much_truthness;
much_truthness [ true ] [ true ] [ false ] [ true ] [ true ] = true;

return much_truthness [ true ] [ true ] contains true;

yields:

Returned: false

Remove

You can remove a key-value association from a map using the "remove" unary operator:

remove <aggregate reference>

For clarification, an aggregate reference is "<map name>[ <index 1> ... <index n> ]" where <map name>[ <index 1> ... <index n-1> ] specifies the "slice" and <index n> specifies the "key". Which is just what you expect, if you fully specify the indices; for a single dimensional map, "map[10]" -> "map" is the slice and 10 is the key. The "remove" operator removes the "key" from the "slice". For example:


string [int] map1;
map1[5] = "foo";
print( count( map1 ) + " " + map1 contains 5 + " " + map1[5] );
print( "remove: " + remove map1[5] );
print( count( map1 ) + " " + map1 contains 5 + " "  + map1[5] );
print( "remove: " + remove map1[5] );
int [string, string] map2;
map2["me","you"] = 17;
print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] );
print( "remove: " + remove map2["me", "you"] );
print( count( map2["me"] ) + " " + map2["me"] contains "you" + " " + map2["me","you"] );
print( "remove: " + remove map2["me", "you"] );
print( count( map2 ) + " " + map2["me"] );
print( "remove: " + remove map2["me"] );
print( count( map2 ) + " " + map2["me"] );


yields:

1 true foo
remove: foo
0 false
remove:
1 true 17
remove: 17
0 false 0
remove: 0
1 aggregate int [string]
remove: aggregate int [string]
0 aggregate int [string]

Clear

You can remove all key => value entries from a map using the clear() function:


clear( <aggregate> );


Count

The count() function returns the number of defined keys for the specified aggregate.


int size = count( <aggregate> );


Sort

From http://kolmafia.us/showthread.php?t=1738 and http://kolmafia.us/showthread.php?10729

The syntax is:

sort aggregate by keyExpr;


aggregate is a reference to the object to be sorted - arrays are probably the most useful things to sort, but any mapping type can be used. But please note that when you sort a map, you change the values that correspond to the index. To sort on a map, you would want to use a multidimensional maps, but note that you can only sort along a single dimension at a time when doing this. Simply put... "sort" is only useful in cases where your data exists entirely in the values of the map; the keys can have no meaning beyond simply being distinct.

The reference must not be enclosed in parentheses, as that would look like a call to a function named sort() - which is still perfectly valid, "sort" has not become a reserved word.

keyExpr is an arbitrary expression that defines how the items should be ordered. It is evaluated once for every entry in the aggregate, in a scope with two additional variables implicitly defined: 'index' and 'value', holding the details of that entry. The value of the keyExpr is used as the sort key; typically it would be an int or string, but can be any ASH type that can be compared via "<" and the other relational operators.

The most basic form of sorting would therefore be "sort ... by value", but many useful things can be done with the use of a more complex keyExpr - the only real restriction is that the expression should not modify the object you're sorting. For example, if you had an array of items, you could sort it "by autosell_price(value)". An array of weapon items could be sorted "by -get_power(value)" to put it in decreasing order of power. If the elements of your aggregate are records, you'd need to use something like "by value.fieldName", since the records themselves can't be meaningfully compared.

After the sort statement, the aggregate will have exactly the same sets of keys and values as before (even if the keys weren't consecutive), and the iteration order of the keys will be the same, but the values will likely be associated with different keys. The sort is stable - in other words, elements with sort keys that compare as equal will remain in the same order. This means that you can sort on multiple criteria by simply performing separate sorts for each of the criteria, in increasing order of significance.

To find out how many things you have, you might do:

item [int] whatGot;
int ctr =0;

foreach it in get_inventory() {
   whatGot[ctr] = it;
   ctr+=1;
}

sort whatGot by item_amount(value);

foreach x, it in whatGot
   print(item_amount(it) + ' of ' + it);

Note that this use of an optional feature of foreach. The second variable in the foreach is the value of whatGot[x].


A few more examples of things you can do:

  • "by -value" sorts integers in decreasing order (there's no similar trick for string values).
  • "by -index" reverses the existing order of an array (or map with integer keys).
  • "by random(1000000)" shuffles into a random order.
  • "by otherArray[index]" uses values from a parallel array as the sort keys (you'd then need to do "sort otherArray by value;" if you wanted the two arrays to remain in sync).

Iteration

To iterate through a map, use the foreach operator. For instance, if you wanted to print out how many of each item you had, you could do something like the following:

int[item] map = get_inventory();
foreach key in map {
    print(key + " (" + map[key] + ")");
}


Multidimensional maps are implemented as maps that map keys to maps. int[item][string]map is really a mapping of items to int[string] maps. Iteration, therefore, is as follows:

int[item][string] map;
file_to_map("somefile.txt", map);
foreach k1 in map {
    print(k1 + ": ");
    foreach k2 in map[k1] {
        print("\t" + k2 + ": " + map[k1][k2]);
    }
}


Two things to note: First, int[item][string]map is equivalent to int[item, string]map. This really comes down to author preference, although the second form is generally more common. Second, the two following foreach loops are equivalent:

int[item][string] map;
foreach k1 in map {
    foreach k2 in map[k1] {
        func(map[k1][k2]);
    }
}

foreach k1, k2 in map {
    func(map[k1][k2]);
}


Of course, the latter does not lend itself to, say, only printing the first key once, whereas the former can be used that way (see the preceding example).

Implementation

Maps in ASH are implemented internally as TreeMaps [1]. See below for some implications.

Arrays

These look and behave like mappings of integers to values, where the keys only take values from 0 to n, but these are implemented as Java Arrays.

Differences between arrays and maps

item [12] array;

Can use keys 0 - 11. You get a runtime error if you use any other key. It always uses memory to hold 12 items, even if you only use a couple of them. But it's a constant time - O(1) - to access any element.

item [int] map;

Can use any int as a key. It has constant memory for the Java map, and additional memory for each element in the map, but is O( log n) to access any particular element.

If you are able to use (a fairly densely packed set of) integers as keys, your program will be faster and use (potentially) slightly more memory.

If you have a sparse set of integers, you can still use an array and get fast access, but you will waste a lot of memory.

If you can't use integers as keys or don't want to waste memory on a sparse array, you can have a slower but less memory consuming map.

[2]

Time considerations

  • Given if (a == item1 || a == item2 || a == item3) and if ($items[item1, item2, item3] contains a), which is faster?

This is going to depend on the number of items in the list, and which one happens to match; if 'a' is almost always item1, then the first form is likely to win on practical grounds, even though it's theoretically slower (O(n) vs. O(log n)).

The second form is a definite win assuming no such coincidences of the item chosen, a somewhat larger set of items, and that the code is executed more than once per run of the script. The first lookup in a plural constant actually builds an internal map that allows such queries to be efficiently done; this is deferred because typical use of a plural constant involves only iteration, not lookups.

There's always the "profile" command, if you really need to know which is more efficient in a given situation - although it's unlikely that either would have a noticeable effect on your script's performance.

[3]

Records

(Link to Veracity's post introducing the record [4])

Records are custom-made data types, that act like compartmentalized boxes [5].

They are "variable packages", that allow you to group together data of any type, including other records, in them.

Records have "models" (the custom data type of the record itself), and names representing them (though not all of them are creative, such as "the_black_one_that_I_bought_a_year_ago", or "the_one_that_is_over_there_by_the_door").

Each "compartment" (field) also has a name, as well as an expected data type. (Stop trying to put screws "there"; that's my god damn lunchbox, and this is "the spot in which you put the guacamole"!)

Once a "box's model" (record's data type) is declared, it can be used (almost) anywhere that can use a built-in type name (as long as you don't leave the scope in which the record was declared), such as in functions and other records.

Declarations

The syntax for declaring a new record:

record <"model" name> { <data type 1> <field name 1>; <data type 2> <field name 2>; ... }

For example:

record human {
    string name;
    int date_of_birth;
    location location_of_birth;
    item [slot] clothes;
};


This declared a new data type, a mold, a blueprint... No variable was created here.

To actually create a variable that uses this "model", you would then do:

human that_guy;
human this_guy;
human that_other_guy;

You now have 3 "human"s, each with a name, date of birth, location of birth, and clothes (though none of them has had those assigned, yet).


It is possible to declare a record and create a variable of that type at the same time. The following is equivalent to the two previous code samples combined:

record human {
    string name;
    int date_of_birth;
    location location_of_birth;
    item [slot] clothes;
} that_guy;

human this_guy;
human that_other_guy;


Accessing the fields

Once a variable of a recorded data type was created, you may access its fields with <variable name>.<field name>:


that_guy.name = "Andrew";
that_guy.date_of_birth = 08212006;
that_guy.location_of_birth = $location[Noob Cave];

that_other_guy.name = "Bob";
that_other_guy.location_of_birth = $location[South of the Border];

if ( that_other_guy.clothes [ $slot[pants] ] == $item[none] )
    print( "Put some pants on, " + that_other_guy.name + "!" );


Omitting the "model" name

When declaring a record, supplying the "model" name (what goes between "record" and the left curly bracket) is not mandatory. If you don't, the record will be "anonymous".

An "anonymous" record is like a makeshift bindle; you made it on a whim, it's unique, it's yours, you couldn't replicate it even if you tried, and nobody else wants it.

If you omit data type's name, you'll definitely want to supply a variable name after the curly braces, or it will have been a pure waste of time:

record {
  	int days_spent_travelling;
	item [int] content;
} mah_bindle;

mah_bindle.content [0] = $item[ big rock ];


New

New can be used to assign a whole record at once.

Scope

Like variables, records can be defined inside blocks ({}).

if ( /* some condition */ ) {
   // Records can be defined in if-blocks
   record Dog { string name; };
}

void foo() {
   // Records can be defined inside functions
   record Bar { int id; };
}

A record definition is local to (i.e. only valid within) the scope it is defined:

{
   // Dog is defined within this block
   // (can be an if-block, for-loop, function, ...)
   record Dog { string name; };
}

// Since Dog is not defined here, this is invalid code
Dog mydog; // Error: Unknown variable 'Dog'

Unlike variables, record definitions cannot shadow other records in outer scopes.

// 'Dog' record is defined in the global scope
record Dog { string name; };

void foo() {
   // Since Dog is already defined in the parent scope, this is invalid code
   record Dog { int id; }; // Error: Record name 'Dog' is already defined
}

However, since record definitions are local to their scope, this is fine:

{
   // 'Dog' record is defined inside a block
   // (can be an if-block, for-loop, function, ...)
   record Dog { string name; };
}

// Since Dog is not defined in this scope (yet), it can be defined here
// Note: This is poor practice and shouldn't be used in real code
record Dog { int id; };