CONCEPT mappings LAST UPDATE Mon, 15 Mar 1999 DESCRIPTION A step-by-step introduction to mappings: ---------------------------------------- 1. What is a mapping? A mapping is a datatype which allows to store data associated to a key. In other languages they are also known as 'dictionaries' or 'alists'. There are also alists in LPC but they are not a separate datatype but are implemented on top of arrays. Alists are the predecessors of mappings. The keys and the values can be of any type. But most common datatypes for keys are strings, integers and objects. Others like arrays, mappings or closures aren't a good choice because comparision between i.e. arrays often returns false even if they equal in content. This is because the driver compares i.e. two arrays by their internal pointers and not by their content. The reason for this is simple: speed. Mappings are allways treated as references when passing them to functions. This means when you pass a mapping to another object and this object modifies the mapping the modification will take place in a global scope - visible to all objects holding this mapping in a variable. 2. What are mappings good for? The term 'dictionary' probably describes the use of a mapping best. Opposed to arrays mappings don't have a specific order. They provide a mechanism to create a set of associations between values. Such an association consists of a unique key and data that is identified by the key. Think of a dictionary where you have a word and a definition of it. You use the word to lookup its definition. Mappings can be used i.e. to hold aliases for commands. The key would then be the name of the alias and the data the command(s) behind an alias. Or they can be used for the exits of a room. The keys would be the directions where one can go to and the associated data would be the file names of the rooms. But mappings can also be used as a kind of a sparse array. A sparse array is an array where most of the elements aren't used (occupied by 0). I.e. if you want to store values at the position 0, 13 and 37642 of an array you would have to create an array with a size of at least 37643. This costs a lot of memory so a mapping would be more useful because you would then use the numbers 0, 13 and 37642 as a key and not as an index to a position (actually the keys of a mapping are sometimes called indices but this is just because the way data is accessed in a mapping is similar to arrays: by the [] operator). This also allows to query all occupied positions of a sparse array by querying for all the keys of the mapping opposed to an array where you have to iterate over all elements. 3. How do I create a mapping? There are several ways to do so. The most convenient is the following: mapping map; map = ([ key0: value00; ...; value0n, ... : ... ; ...; ... , keyn: valuen0; ...; valuenn ]); As you can see, a key may have more than one value assigned. But the amount of values per key must always be equal. It is even possible to have mappings without any values! Another method is to use the efun mkmapping(). This efun gets two arguments with the first beeing an array of keys and the following beeing arrays of values: mapping map; map = mkmapping (({ key0 , ..., keyn }), ({ value00, ..., value0n }), ({ ... , ..., ... }), ({ valuen0, ..., valuenn })); If the efun only gets one argument, then this argument will be taken as an array of keys and a mapping without values will be returned. An empty mapping can be created by using the above described methods by simply ommitting the keys and values: mapping map; map = ([]); or: map = mkmapping(({}), ({})); Or by using the efun m_allocate(). This efun gets as first argument the amount of keys which will be added soon and an optional second argument specifying the width of the mapping: map = m_allocate(n, width); The value may be a bit confusing since mappings shrink and grow dynamically. This value just tells the driver how 'long' this mapping is going to be so proper memory allocations will be performed to reduce the overhead of memory reallocation. I.e. if you want to read in a file and store the read data in a mapping you probably know the amount of keys. So you allocate a mapping with this efun and tell the driver how much memory should be allocated by specifing a proper value. Thus causing a speedup when adding the read data to the mapping afterwards. The just specifies how many values per key this mapping is going to have. If no width is given, 1 will be taken as default. An empty mapping created with '([])' will always have a width of 1. To create empty mappings with other widths, write it as map = ([:width ]); can be any expression returning an integer value (including function calls), and in fact this notation is just a fancy way of writing map = m_allocate(0, width); 4. How can I modify the data of a mapping? Adding a new key is similiar to modifying the associated data of an existing key: map += ([ key: value0; ...; valuen ]); Or in case only a single value should be modified: map[key, n] = valuen; If is out of range or if doesn't exists and is greater than 0 an "Illegal index" error will be reported. If is equal to 0 or the mapping only has a single value per key one can abbreviate it with: map[key] = value; If there is no (and is equal to 0 or not specified at all) a new one will be added automatically. Multiple values for a single key can be modified with range access: map[key, n1..n2] = ({ valuen1, ..., valuen2 }); map[key, 0..<1] = ({ value0, ..., valuen }); Deletion of a key is done with the -= operator or the efun m_delete(). A mapping can only be substracted by one without any values: map -= ([ key ]); or: map -= ([ key0, ..., keyn ]); The efun takes a mapping as first and a key as second argument: m_delete(map, key); The efun m_delete() returns the mapping but because mappings are handled as references there is no need of an assignment like: map = m_delete(map, key); 5. How can I access the data stored in a mapping? This can be done by: valuen = map[key, n]; Or in case of a mapping with just one value per key: value0 = map[key]; If there is no in the mapping and is 0 or not specified at all (which is the same) a 0 will be returned or if is greater than 0 an "Illegal index" error will be reported. Multiple values for a single key can be accessed by: arr = map[key, n1..n2]; The values are returned in an array. 6. How can I test for the existance of a key? A return value of 0 is sufficient for most applications but sometimes the ambiguity between an existing value of 0 and a nonexisting key can lead to a problem. Therefore one can use the efun member() or mapping_contains() to check if there actually is a key in the mapping: if (member(map, key)) { ... } or: if (mapping_contains(&value0, ..., &valuen, map, key)) { ... } This also shows how one can retrieve all values associated to a key from a mapping in a single step. The '&' is the reference operator which is neccesary to let the efun store the values in the variables. In case of mappings with no values, the efun member() and mapping_contains() are equal in their behaviour and their way of calling because mapping_contains() won't get any reference variables to store the values in (obviously, because there aren't any). Also normally member() is known to return the postion of an element in a list (i.e. a character in a string or data in an array) and if an element couldn't be found -1 is returned. But in the case of mappings there are no such things as order and postion. So member() only returns 0 or 1. 7. How can I copy a mapping? A mapping can be copied with the + operator or by the efun copy_mapping(): newmap = ([]) + map; or: newmap = copy_mapping(map); A mapping should only be copied when it is neccesary to get an own copy of it that must not be shared by other objects. 8. How can I get all keys of a mapping? The efun m_indices() gets a mapping as argument and returns an array holding all keys defined in this mapping: keys = m_indices(map); 9. How can I get all the values of a mapping? The efun m_values() gets a mapping as argument and returns an array holding all the first (second, ...) values of it. values0 = m_values(map); returns the first values values0 = m_values(map, 0); dito values1 = m_values(map, 1); returns the second values etc 10. How can I determine the size of a mapping? Because a mapping is a kind of rectangle it has two sizes: a length and a width. There are three different efuns to query these values. The first two are the efuns sizeof(), which returns the amount of key-value associations (the length of a mapping), and widthof(), which returns the number of values per key (the width). The third is the efun get_type_info(). get_type_info() is meant to be a function to identify a datatype. Its return value is an array of two numerical values. The first specifies the datatype of the argument and the second is a datatype dependend value. In the case of a mapping the first value is T_MAPPING (which is a value defined in ) and the second the amount of values per key (a.k.a. columns or the width of the mapping - actually it would be correct to say that the width of a mapping is the amount of columns plus one for the keys but this is uncommon). 11. What is the best method to iterate over a mapping? First of all the main purpose of a mapping is not meant to be a set of data to iterate over. Afterall the keys in a mapping have no specific but a random order (at least on the LPC side). But still it is possible and sometimes even neccesary to do so. If all key-value associations should be processed then one should use walk_mapping(). If all keys of a mapping should be processed to create a new mapping being a subset of the given one, then filter_mapping() should be used. If all keys are going to be processed and to create a new mapping with the same set of keys as the given mapping, then one would use map_mapping(). But in the case of an iteration that should/can stop even if not all data is processed it is probably wise to iterate over the mapping by first querying for the keys and then to iterate over them with a for() or a while() loop and querying the values by 'hand'. The efun walk_mapping() gets a mapping as first argument and the name of a function as second one. All the following arguments are treated as extras which will be passed to the function specified with the 2nd argument. Instead of a string for the name of a function a closure can be used, too. Nothing will be returned: ... walk_mapping(map, "func", xarg0, ..., xargn); ... void func(mixed key, mixed value0, ..., mixed valuen, mixed xarg0, ..., mixed xargn) { ... } func() will be called for all key-value associations and gets as first argument the key. The next arguments are the values behind the key and are passed as references. The rest of the passed arguments are those specified as extras. Because the values are passed as references (opposed to copies) it is possible to modify them from inside func() by simply assigning new value to the variables , ..., . The efun filter_mapping() calls a function for each key in a mapping and creates a new mapping which only contains key-value associations for which the called function returned true (not equal 0 that is). The first argument is the mapping to iterate over and the second is a function name given as a string or a closure: ... submap = filter_mapping(map, "func", xarg0, ..., xargn); ... int func(mixed key, mixed xarg0, ..., mixed xargn) { ... } func() gets as first argument the key and the others are those passed as extras to filter_mapping(). The efun map_mapping() gets a mapping as first argument and a string as a function name (or again a closure) as second argument. Any additional arguments are again used as extras that will be passed to the iteration function. This efun returns a new mapping with the same keys as the given one. The values returned by the function that is invoked for each key will be used as the associated data behind each key of the new mapping: ... newmap = map_mapping(map, "func", xarg0, ..., xargn); ... mixed func(mixed key, mixed xarg0, ..., mixed xargn) { ... } func() gets as first argument the key and the others are those passed as extras to map_mapping(). Because a function can only return a single value (even when it is an array) it restricts the use of map_mapping() to only allow creation of mappings with a single value per key. 12. Is it possible to join/intersect/cut mappings with another? Joining mappings is only possible, if they have the same width (amount of values per key). One can use the + and += operator: map = map1 + map2 + ... + mapn; map += map1 + map2 + ... + mapn; Intersection of two mappings is only possible by using filter_mapping(). There is no efun or operator which features this. The 'easiest' way may be the following function: mapping intersect_mapping(mapping map1, mapping map2) { closure cl; cl = lambda(({ 'key }), ({ #'member, map2, 'key })); return filter_mapping(map1, cl, map2); } This function returns a new mapping which consists of all key-value associations of for which an equal key could be found in . This function uses a closure which returns 0 or 1 depending on wether a key from is contained in or not. Cutting out all key-value associations of a mapping for which a key could be found in another mapping can be done by using the - and -= operator: mapping cut_mapping(mapping map1, mapping map2) { return map1 - mkmapping(m_indices(map2)); } Because a maping can only be substracted by one without any values we first have to create such by using m_indices() and mkmapping(). 13. What are those mappings without any values (besides keys) good for? Because the way how the driver searches for a key in a mapping is rather fast, those mappings can be used as a set of elements with a fast method for testing if an element is contained in the set. This technique is called hashing (further explanation would lead too far) which is faster than searching for values in array (which is done in a linear fashion). Another (maybe more pratical) use of these mappings are to create a array of unique values out of an array with several equal values: uniques = m_indices(mkmapping(array)); mkmapping() uses to create a mapping without any values but just keys. And because a mapping can only have unique keys all multiple values in are taken as one. The call of m_indices() then returns an array of these unique keys. Actually we only make use of those mappings temporarily. 14. How can I convert an alist into a mapping and vice versa? There are no special efuns which handle such conversions. But it can be done by the following functions: mapping alist_to_mapping(mixed *alist) { return apply(#'mkmapping, alist); } The efun apply() takes a closure and an array of values and passes each element of the array as an argument to the closure. Because an alist consists of an array of arrays with the first beeing the list of keys and the others the values associated to each key passing them as arguments to the efun closure #'mkmapping via apply() causes the creation of a mapping out of an alist. mixed *mapping_to_alist(mapping map) { mixed *alist; symbol *vars; string var; closure cl; int width; width = get_type_info(map)[1]; alist = allocate(width + 1); vars = allocate(width + 2); for (var = "a"; width; var[0]++, width--) { alist[width] = ({}); vars[width] = quote(var); } alist[0] = ({}); vars[0] = 'key; vars[<1] = 'alist; cl = lambda(vars, ({ #'=, 'alist, ({ #'insert_alist }) + vars })); walk_mapping(map, cl, &alist); return alist; } This function is a bit more complicated than the other and detailed description would lead too far of the topic. This function has one restriction: it can only turn a mappings with up to 26 values per key into an alist. But this should be sufficient for probably all applications which use mappings. And Hyps further comment on this: The function mapping_to_alist() is also not that clever because insert_alist() allways creates a new alist. A second (optional) argument to m_values() to specify the value column would be better. Besides this, the conversion of a mapping into an alist could be done by to_array(). 15. Dirty Mappings 'Dirty mappings' are nothing the LPC programmer directly is involved with, however, as it relates to the way mappings are implemented internally by the gamedriver. However, as this term shows up in various driver statistics, it is explained here. There are two fundamental approaches to implement mappings: 1. Store all data entries in an array-like structure, in sorted order. 2. Store all data in a hashtable, each entry allocaed separately. Method 1 is very space efficient, as it doesn't need much overhead per entry; however, insertions and deletions of entries are relatively slow as all other entries need to be moved.     Method 2 is very fast as nothing needs to be moved in memory, however it has a large overhead. The gamedriver uses a hybrid method: at the basis is a mapping implementation based on arrays. However the driver uses a hash table in addition to handle all the ongoing insertions and deletions. Every once in a while, the contents of the hash table are sorted into the base array, reasoning that any entry surviving for longer time in the hash table is worth keeping in a more space-efficient manner. 'Dirty' mappings are such mappings with both an array and a hash part, 'clean' mappings are those with just an array part. HISTORY The ([:width ]) notation was added in LDMud 3.2.9/3.3.208 . SEE ALSO alists(LPC), closures(LPC), structs(LPC), mkmapping(E), walk_mapping(E)