Talk:Data Structures

From Kolmafia
Revision as of 21:40, 4 July 2023 by Mcroft (talk | contribs)

To Do

We really should consider moving things like Veracity's guide to Maps either to the wiki or to GitHub. --Mcroft (talk) 21:40, 4 July 2023 (UTC)

Records

I thought that the Records section would've been helpful in figuring out multidimensional maps -- maybe not; Veracity might've gone a bit too far with that example (which I understand, as she was trying to cover all her bases with one single unifying example). --Heeheehee 06:42, 10 March 2010 (UTC)

An attempt at explaining maps

(by Grotfang)

A map consists of two types of component.

  • The key
  • The value

A normal map (without using a record) can have multiple keys, but only one value. A map is formed by specifying a normal datatype for the value, putting the datatype of the key in square brackets, then giving the map a name:

datatype [key] name;

E.g.

string [int] message;

This in itself is a useful feature, but there are many situations where you may wish to associate two values. For example, in the message map above, I may wish to specify both a message and a name to send it to. There are two ways I could deal with this problem. The first would be to create a second map with a corresponding key. I could then assume that message[key] should be sent to recipient[key]. While this works, it is inefficient and untidy, as well as inflexible.
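
A minimal sketch of that first approach, for comparison. The second map name, recipient, is my own choice for illustration; the only thing tying the two maps together is that they happen to share keys:

string [int] message;
string [int] recipient;
message[1] = "Hi there!";
recipient[1] = "Grotfang";

// Look up the recipient by reusing the key from the message map
foreach key in message {
   print( recipient[key] + ": " + message[key] );
}

Nothing stops the two maps drifting apart (a key present in one but missing from the other), which is a large part of why this gets untidy.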

Mafia supports the ability to customise maps using record. Record tells mafia what each entry in a map means, thereby enabling us to have more than one value to the map. The best way to visualise maps is to see a row on a page:

key   value

We can extend our keys already by specifying multiples:

value [key1 , key2] name;

Produces:

key1   key2   value

What record lets us do is add more values:

key   value1   value2

Each line in a record is the next value and has to include the datatype of that value. Before, we specified the value's datatype when we named the map (since there was only one value to assign a datatype to). Now we have to make a record of ALL the datatypes that will come up, and then when we make our map we use the record name in place of a single datatype:

record kmailer{
   string message;
   string recipient;
};
kmailer [int] my_messages;

The example above creates the following map:

key   value1    value2
int   message   recipient

So far, so simple. The next bit is where it gets interesting. We can no longer assign a value to the map as a whole. Without a record, we can do the following:

string [int] message;
message[1] = "Hi there!";

This makes the string "Hi there!" our value for key = 1:

key   value
1     Hi there!

With a record, we can no longer assign a value to a key alone; we have more than one value and we need to specify which one it is we want to assign something to. So using the record above, to assign a message we would have to do the following:

record kmailer{
   string message;
   string recipient;
};
kmailer [int] my_messages;
my_messages[1].message = "Hi there!";
my_messages[1].recipient = "Grotfang";

Both values are assigned to our "1" key, but we must specify the value we are assigning. The following details what is going on in the map:

key   value1      value2
int   message     recipient
1     Hi there!   Grotfang
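
Reading the values back works the same way: name the field you want after the key. A short sketch using the same record (the foreach loop and the output format are only there for illustration):

record kmailer{
   string message;
   string recipient;
};
kmailer [int] my_messages;
my_messages[1].message = "Hi there!";
my_messages[1].recipient = "Grotfang";

// Each entry is a whole record, so pick out one field at a time
foreach key in my_messages {
   print( my_messages[key].recipient + ": " + my_messages[key].message );
}

This should print Grotfang: Hi there!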

Multiple Keys

This is more complex. With a single key it is easy to see what a map will return: specify the map and key and you get the value back (you need to specify which value you want if you use a record, but the principle is the same).

More than one key is where maps get truly interesting. To all intents and purposes, multiple keys tend to be a way to categorise data into groups and subgroups. The return values get more complicated, though.

Whenever you add a value to a map, all keys MUST be specified. However, when you are retrieving data this is not the case. This works best with an example:

string [string , string] my_map;
my_map["a","b"] = "ab";
print(my_map["a","b"]);

This returns:

ab

The map setup is:

key1   key2   value
a      b      ab

As you might expect. Here we specified a map with two keys and one value. We assigned a value "ab" to the combination of key1 ("a") and key2 ("b"). When we retrieved the value for that combination, we got "ab". So far, so unsurprising.

However, the natural question arises, what if we try to retrieve a value while only specifying one of the keys? The answer is that we need to change our perspective of the map to what I suggested earlier - that of groups and sub-groups.

The way I visualise this is to imagine key1 represents a group and key2 is a subgroup. If two map entries have the same key1, then you should perceive them as having some commonality. What happens when you retrieve using key1 only is that you get back a map of all the map entries that have the specified value of key1. This means (as you are returning an aggregate) you need to specify a new map for the values to be placed in. The following example illustrates this:

string [string , string] my_map;
my_map["a","b"] = "ab";
my_map["b","c"] = "bc";
my_map["b","d"] = "bd";
my_map["c","d"] = "cd";
my_map["e","f"] = "ef";

string [string] my_map2 = my_map["b"];

foreach x in my_map2 {
   print( my_map2[x] );
}

The output is:

bc
bd

The setup for my_map is:

key1   key2   value
a      b      ab
b      c      bc
b      d      bd
c      d      cd
e      f      ef

The setup for my_map2 (containing only entries that have "b" as key1) is:

key   value
c     bc
d     bd

As you can see, my_map["a","b"], which has "b" as key2, is NOT included in the aggregate return. The more keys you have, the more levels of control you get, but you cannot return (I don't think) a map of every key1 that shares a given key2. It's a one-way structure. Of course, you could work around this with a new map that had the keys in reverse.
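
A rough sketch of that workaround, assuming the same my_map as above. The names reversed and by_d are made up for this example; the idea is just to copy every entry into a second map with the two keys swapped, so the old key2 becomes the group:

string [string , string] my_map;
my_map["a","b"] = "ab";
my_map["b","c"] = "bc";
my_map["b","d"] = "bd";
my_map["c","d"] = "cd";
my_map["e","f"] = "ef";

// Build a second map with the keys in the opposite order
string [string , string] reversed;
foreach k1, k2 in my_map {
   reversed[k2, k1] = my_map[k1, k2];
}

// Everything that had "d" as key2 in the original map
string [string] by_d = reversed["d"];
foreach k1 in by_d {
   print( by_d[k1] );
}

This should print bd and cd, the two entries that had "d" as key2. The cost is that the reversed map has to be rebuilt (or kept up to date by hand) whenever my_map changes.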