@node Container data types @section Container data types @c Copyright (C) 2019--2024 Free Software Foundation, Inc. @c Permission is granted to copy, distribute and/or modify this document @c under the terms of the GNU Free Documentation License, Version 1.3 or @c any later version published by the Free Software Foundation; with no @c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A @c copy of the license is at . @c Written by Bruno Haible. @c This macro expands to \log in TeX mode, but just 'log' in HTML and info @c modes. @ifnottex @macro log log @end macro @end ifnottex @c This macro expands to \mathopsup in TeX mode, to a superscript in HTML @c mode, and to ^ without braces in info mode. @ifhtml @macro mathopsup {EXP} @sup{\EXP\} @end macro @end ifhtml @ifinfo @macro mathopsup {EXP} ^\EXP\ @end macro @end ifinfo Gnulib provides several generic container data types. They can be used to organize collections of application-defined objects. @node Ordinary containers @subsection Ordinary container data types @multitable @columnfractions .15 .5 .1 .1 .15 @headitem Data type @tab Details @tab Module @tab Main include file @tab Include file for operations with out-of-memory checking @item Sequential list @tab Can contain any number of objects in any given order. Duplicates are allowed, but can optionally be forbidden. @tab @code{list} @tab @code{"gl_list.h"} @tab @code{"gl_xlist.h"} @item Set @tab Can contain any number of objects; the order does not matter. Duplicates (in the sense of the comparator) are forbidden. @tab @code{set} @tab @code{"gl_set.h"} @tab @code{"gl_xset.h"} @item Ordered set @tab Can contain any number of objects in the order of a given comparator function. Duplicates (in the sense of the comparator) are forbidden. @tab @code{oset} @tab @code{"gl_oset.h"} @tab @code{"gl_xoset.h"} @item Map @tab Can contain any number of (key, value) pairs, where keys and values are objects; there are no (key, value1) and (key, value2) pairs with the same key (in the sense of a given comparator function). @tab @code{map} @tab @code{"gl_map.h"} @tab @code{"gl_xmap.h"} @item Ordered map @tab Can contain any number of (key, value) pairs, where keys and values are objects; the (key, value) pairs are ordered by the key, in the order of a given comparator function; there are no (key, value1) and (key, value2) pairs with the same key (in the sense of the comparator function). @tab @code{omap} @tab @code{"gl_omap.h"} @tab @code{"gl_xomap.h"} @end multitable Operations without out-of-memory checking (suitable for use in libraries) are declared in the ``main include file''. Whereas operations with out-of-memory checking (suitable only in programs) are declared in the ``include file for operations with out-of-memory checking''. For each of the data types, several implementations are available, with different performance profiles with respect to the available operations. This enables you to start with the simplest implementation (ARRAY) initially, and switch to a more suitable implementation after profiling your application. The implementation of each container instance is specified in a single place only: in the invocation of the function @code{gl_*_create_empty} that creates the instance. The implementations and the guaranteed average performance for the operations for the ``sequential list'' data type are: @multitable @columnfractions 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 @headitem Operation @tab ARRAY @tab CARRAY @tab LINKED @tab TREE @tab LINKEDHASH with duplicates @tab LINKEDHASH without duplicates @tab TREEHASH with duplicates @tab TREEHASH without duplicates @item @code{gl_list_size} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_list_node_value} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_list_node_set_value} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(1)} @item @code{gl_list_next_node} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_previous_node} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_first_node} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_last_node} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_get_at} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_get_first} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_get_last} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_set_at} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_set_first} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_set_last} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_search} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @item @code{gl_list_search_from} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_search_from_to} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_indexof} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_indexof_from} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_indexof_from_to} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_add_first} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_add_last} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_add_before} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_add_after} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_add_at} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_remove_node} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_remove_at} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_remove_first} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_remove_last} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_remove} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(1)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_list_iterator} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_iterator_from_to} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_list_iterator_next} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(1)} @tab @math{O(1)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_sortedlist_search} @tab @math{O(@log n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_sortedlist_search_from} @tab @math{O(@log n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_sortedlist_indexof} @tab @math{O(@log n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_sortedlist_indexof_from} @tab @math{O(@log n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_sortedlist_add} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @item @code{gl_sortedlist_remove} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O(@log n)} @tab @math{O(n)} @tab @math{O(n)} @tab @math{O((@log n)@mathopsup{2})} @tab @math{O(@log n)} @end multitable The implementations and the guaranteed average performance for the operations for the ``set'' data type are: @multitable @columnfractions 0.4 0.2 0.4 @headitem Operation @tab ARRAY @tab LINKEDHASH, HASH @item @code{gl_set_size} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_set_add} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_set_remove} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_set_search} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_set_iterator} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_set_iterator_next} @tab @math{O(1)} @tab @math{O(1)} @end multitable The implementations and the guaranteed average performance for the operations for the ``ordered set'' data type are: @multitable @columnfractions 0.5 0.25 0.25 @headitem Operation @tab ARRAY @tab TREE @item @code{gl_oset_size} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_oset_add} @tab @math{O(n)} @tab @math{O(@log n)} @item @code{gl_oset_remove} @tab @math{O(n)} @tab @math{O(@log n)} @item @code{gl_oset_search} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_oset_search_atleast} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_oset_iterator} @tab @math{O(1)} @tab @math{O(@log n)} @item @code{gl_oset_iterator_next} @tab @math{O(1)} @tab @math{O(@log n)} @end multitable The implementations and the guaranteed average performance for the operations for the ``map'' data type are: @multitable @columnfractions 0.4 0.2 0.4 @headitem Operation @tab ARRAY @tab LINKEDHASH, HASH @item @code{gl_map_size} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_map_get} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_map_put} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_map_remove} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_map_search} @tab @math{O(n)} @tab @math{O(1)} @item @code{gl_map_iterator} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_map_iterator_next} @tab @math{O(1)} @tab @math{O(1)} @end multitable The implementations and the guaranteed average performance for the operations for the ``ordered map'' data type are: @multitable @columnfractions 0.5 0.25 0.25 @headitem Operation @tab ARRAY @tab TREE @item @code{gl_omap_size} @tab @math{O(1)} @tab @math{O(1)} @item @code{gl_omap_get} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_omap_put} @tab @math{O(n)} @tab @math{O(@log n)} @item @code{gl_omap_remove} @tab @math{O(n)} @tab @math{O(@log n)} @item @code{gl_omap_search} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_omap_search_atleast} @tab @math{O(@log n)} @tab @math{O(@log n)} @item @code{gl_omap_iterator} @tab @math{O(1)} @tab @math{O(@log n)} @item @code{gl_omap_iterator_next} @tab @math{O(1)} @tab @math{O(@log n)} @end multitable For C++, Gnulib provides a C++ template class for each of these container data types. @multitable @columnfractions .30 .20 .25 .25 @headitem Data type @tab C++ class @tab Module @tab Include file @item Sequential list @tab @code{gl_List} @tab @code{list-c++} @tab @code{"gl_list.hh"} @item Set @tab @code{gl_Set} @tab @code{set-c++} @tab @code{"gl_set.hh"} @item Ordered set @tab @code{gl_OSet} @tab @code{oset-c++} @tab @code{"gl_oset.hh"} @item Map @tab @code{gl_Map} @tab @code{map-c++} @tab @code{"gl_map.hh"} @item Ordered map @tab @code{gl_OMap} @tab @code{omap-c++} @tab @code{"gl_omap.hh"} @end multitable @node Specialized containers @subsection Specialized container data types The @code{hamt} module implements the hash array mapped trie (HAMT) data structure. This is a data structure that contains (key, value) pairs. Lookup of a (key, value) pair given the key is on average an @math{O(1)} operation, assuming a good hash function for the keys is employed. The HAMT data structure is useful when you want modifications (additions of pairs, removal, value changes) to be visible only to some part of your program, whereas other parts of the program continue to use the unmodified HAMT. The HAMT makes this possible in a space-efficient manner: the modified and the unmodified HAMT share most of their allocated memory. It is also time-efficient: Every such modification is @math{O(1)} on average, again assuming a good hash function for the keys. A HAMT can be used whenever an ordinary hash table would be used. It does however, provide non-destructive updating operations without the need to copy the whole container. On the other hand, a hash table is simpler so that its performance may be better when non-destructive update operations are not needed. For example, a HAMT can be used to model the dynamic environment in a LISP interpreter. Updating a value in the dynamic environment of one continuation frame would not modify values in earlier frames. To use the module, include @code{hamt.h} in your code. The public interface is documented in that header file. You have to provide a hash function and an equivalence relation, which defines key equality. The module includes a test file @code{test-hamt.c}, which demonstrates how the API can be used. In the current implementation, each inner node of the HAMT can store up to @math{32 = 2^5} entries and subtries. Whenever a collision between the initial bits of the hash values of two entries would happen, the next @math{5} bits of the hash values are examined and the two entries pushed down one level in the trie. HAMTs have the same average access times as hash tables but grow and shrink dynamically, so they use memory more economically and do not have to be periodically resized. They were described and analyzed in @cite{Phil Bagwell (2000). Ideal Hash Trees (Report). Infoscience Department, École Polytechnique Fédérale de Lausanne.} The persistence aspect of the HAMT data structure, which means that each updating operation (like inserting, replacing, or removing an entry) returns a new HAMT while leaving the original one intact, is achieved through structure sharing, which is even safe in the presence of multiple threads when the used C compiler supports atomics. @ifnottex @unmacro log @end ifnottex @ifhtml @unmacro mathopsup @end ifhtml @ifinfo @unmacro mathopsup @end ifinfo