@c Gnulib README @c Copyright 2001, 2003--2024 Free Software Foundation, Inc. @c Permission is granted to copy, distribute and/or modify this document @c under the terms of the GNU Free Documentation License, Version 1.3 or @c any later version published by the Free Software Foundation; with no @c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A @c copy of the license is at . @menu * Gnulib Basics:: * Git Checkout:: * Keeping Up-to-date:: * Contributing to Gnulib:: * Portability guidelines:: * High Quality:: * Joining GNU:: @end menu @node Gnulib Basics @section Gnulib Basics While portability across operating systems is not one of GNU's primary goals, it has helped introduce many people to the GNU system, and is worthwhile when it can be achieved at a low cost. This collection helps lower that cost. Gnulib is intended to be the canonical source for most of the important ``portability'' and/or common files for GNU projects. These are files intended to be shared at the source level; Gnulib is not a typical library meant to be installed and linked against. Thus, unlike most projects, Gnulib does not normally generate a source tarball distribution; instead, developers grab modules directly from the source repository. The easiest, and recommended, way to do this is to use the @command{gnulib-tool} script. Since there is no installation procedure for Gnulib, @command{gnulib-tool} needs to be run directly in the directory that contains the Gnulib source code. You can do this either by specifying the absolute filename of @command{gnulib-tool}, or by using a symbolic link from a place inside your @env{PATH} to the @command{gnulib-tool} file of your preferred Gnulib checkout. For example: @example $ ln -s $HOME/gnu/src/gnulib.git/gnulib-tool $HOME/bin/gnulib-tool @end example @node Git Checkout @section Git Checkout Gnulib is available for anonymous checkout. In any Bourne-shell the following should work: @example $ git clone https://git.savannah.gnu.org/git/gnulib.git @end example For a read-write checkout you need to have a login on @samp{savannah.gnu.org} and be a member of the Gnulib project at @url{https://savannah.gnu.org/projects/gnulib}. Then, instead of the URL @url{https://git.savannah.gnu.org/git/gnulib.git}, use the URL @samp{ssh://@var{user}@@git.savannah.gnu.org/srv/git/gnulib} where @var{user} is your login name on savannah.gnu.org. git resources: @table @asis @item Overview: @url{https://en.wikipedia.org/wiki/Git_(software)} @item Homepage: @url{https://git-scm.com/} @end table When you use @code{git annotate} or @code{git blame} with Gnulib, it's recommended that you use the @option{-w} option, in order to ignore massive whitespace changes that happened in 2009. @node Keeping Up-to-date @section Keeping Up-to-date The best way to work with Gnulib is to check it out of git. To synchronize, you can use @code{git pull}. Subscribing to the @email{bug-gnulib@@gnu.org} mailing list will help you to plan when to update your local copy of Gnulib (which you use to maintain your software) from git. You can review the archives, subscribe, etc., via @url{https://lists.gnu.org/mailman/listinfo/bug-gnulib}. Sometimes, using an updated version of Gnulib will require you to use newer versions of GNU Automake or Autoconf. You may find it helpful to join the autotools-announce mailing list to be advised of such changes. @node Contributing to Gnulib @section Contributing to Gnulib All software here is copyrighted by the Free Software Foundation---you need to have filled out an assignment form for a project that uses the module for that contribution to be accepted here. If you have a piece of code that you would like to contribute, please email @email{bug-gnulib@@gnu.org}. Generally we are looking for files that fulfill at least one of the following requirements: @itemize @item If your @file{.c} and @file{.h} files define functions that are broken or missing on some other system, we should be able to include it. @item If your functions remove arbitrary limits from existing functions (either under the same name, or as a slightly different name), we should be able to include it. @end itemize If your functions define completely new but rarely used functionality, you should probably consider packaging it as a separate library. @menu * Gnulib licensing:: * Indent with spaces not TABs:: * How to add a new module:: @end menu @node Gnulib licensing @subsection Gnulib licensing Gnulib contains code both under GPL and LGPL@. Because several packages that use Gnulib are GPL, the files state they are licensed under GPL@. However, to support LGPL projects as well, you may use some of the files under LGPL@. The ``License:'' information in the files under modules/ clarifies the real license that applies to the module source. Keep in mind that if you submit patches to files in Gnulib, you should license them under a compatible license, which means that sometimes the contribution will have to be LGPL, if the original file is available under LGPL via a ``License: LGPL'' information in the projects' modules/ file. @node Indent with spaces not TABs @subsection Indent with spaces not TABs We use space-only indentation in nearly all files. This includes all @file{*.h}, @file{*.c}, @file{*.y} files, except for the @code{regex} module. Makefile and ChangeLog files are excluded, since TAB characters are part of their format. In order to tell your editor to produce space-only indentation, you can use these instructions. @itemize @item For Emacs: Add these lines to your Emacs initialization file (@file{$HOME/.emacs} or similar): @example ;; In Gnulib, indent with spaces everywhere (not TABs). ;; Exceptions: Makefile and ChangeLog modes. (add-hook 'find-file-hook '(lambda () (if (and buffer-file-name (string-match "/gnulib\\>" (buffer-file-name)) (not (string-equal mode-name "Change Log")) (not (string-equal mode-name "Makefile"))) (setq indent-tabs-mode nil)))) @end example @item For vi (vim): Add these lines to your @file{$HOME/.vimrc} file: @example " Don't use tabs for indentation. Spaces are nicer to work with. set expandtab @end example For Makefile and ChangeLog files, compensate for this by adding this to your @file{$HOME/.vim/after/indent/make.vim} file, and similarly for your @file{$HOME/.vim/after/indent/changelog.vim} file: @example " Use tabs for indentation, regardless of the global setting. set noexpandtab @end example @item For Eclipse: In the ``Window|Preferences'' dialog (or ``Eclipse|Preferences'' dialog on Mac OS), @enumerate @item Under ``General|Editors|Text Editors'', select the ``Insert spaces for tabs'' checkbox. @item Under ``C/C++|Code Style'', select a code style profile that has the ``Indentation|Tab policy'' combobox set to ``Spaces only'', such as the ``GNU [built-in]'' policy. @end enumerate If you use the GNU indent program, pass it the option @option{--no-tabs}. @end itemize @node How to add a new module @subsection How to add a new module @itemize @item Add the header files and source files to @file{lib/}. @item If the module needs configure-time checks, write an Autoconf macro for it in @file{m4/@var{module}.m4}. See @file{m4/README} for details. @item Write a module description @file{modules/@var{module}}, based on @file{modules/TEMPLATE}. @item If the module contributes a section to the end-user documentation, put this documentation in @file{doc/@var{module}.texi} and add it to the ``Files'' section of @file{modules/@var{module}}. Most modules don't do this; they have only documentation for the programmer (= Gnulib user). Such documentation usually goes into the @file{lib/} source files. It may also go into @file{doc/}; but don't add it to the module description in this case. @item Add the module to the list in @file{MODULES.html.sh}. @end itemize @noindent You can test that a module builds correctly with: @example $ ./gnulib-tool --create-testdir --dir=/tmp/testdir module1 ... moduleN $ cd /tmp/testdir $ ./configure && make @end example @noindent Other things: @itemize @item Check the license and copyright year of headers. @item Check that the source code follows the GNU coding standards; see @url{https://www.gnu.org/prep/standards}. @item Add source files to @file{config/srclist*} if they are identical to upstream and should be upgraded in Gnulib whenever the upstream source changes. @item Include header files in source files to verify the function prototypes. @item Make sure a replacement function doesn't cause warnings or clashes on systems that have the function. @item Autoconf functions can use @samp{gl_*} prefix. The @samp{AC_*} prefix is for autoconf internal functions. @item Build files only if they are needed on a platform. Look at the @code{alloca} and @code{fnmatch} modules for how to achieve this. If for some reason you cannot do this, and you have a @file{.c} file that leads to an empty @file{.o} file on some platforms (through some big @code{#if} around all the code), then ensure that the compilation unit is not empty after preprocessing. One way to do this is to @code{#include } or @code{} before the big @code{#if}. @end itemize @node Portability guidelines @section Portability guidelines Gnulib code is intended to be portable to a wide variety of platforms, not just GNU platforms. Gnulib typically attempts to support a platform as long as it is still supported by its provider, even if the platform is not the latest version. @xref{Target Platforms}. Many Gnulib modules exist so that applications need not worry about undesirable variability in implementations. For example, an application that uses the @code{malloc} module need not worry about @code{malloc@ (0)} returning a null pointer on some Standard C platforms; and @code{glob} users need not worry about @code{glob} silently omitting symbolic links to nonexistent files on some platforms that do not conform to POSIX. Gnulib code is intended to port without problem to new hosts, e.g., hosts conforming to recent C and POSIX standards. Hence Gnulib code should avoid using constructs that these newer standards no longer require, without first testing for the presence of these constructs. For example, because C11 made variable length arrays optional, Gnulib code should avoid them unless it first uses the @code{vararrays} module to check whether they are supported. The following subsections discuss some exceptions and caveats to the general Gnulib portability guidelines. @menu * C language versions:: * C99 features assumed:: * C99 features avoided:: * Other portability assumptions:: @end menu @node C language versions @subsection C language versions Currently Gnulib assumes at least a freestanding C99 compiler, possibly operating with a C library that predates C99; with time this assumption will likely be strengthened to later versions of the C standard. Old platforms currently supported include AIX 6.1, HP-UX 11i v1 and Solaris 10, though these platforms are rarely tested. Gnulib itself is so old that it contains many fixes for obsolete platforms, fixes that may be removed in the future. Because of the freestanding C99 assumption, Gnulib code can include @code{}, @code{}, @code{}, @code{}, and @code{} unconditionally; @code{} is also in the C99 freestanding list but is obsolescent as of C23. Gnulib code can also assume the existence of @code{}, @code{}, @code{}, @code{}, @code{}, @code{}, @code{}, @code{}, and @code{}. Similarly, many modules include @code{} even though it's not even in C11; that's OK since @code{} has been around nearly forever. Even if the include files exist, they may not conform to the C standard. However, GCC has a @command{fixincludes} script that attempts to fix most conformance problems. Gnulib currently assumes include files largely conform to C99 or better. People still using ancient hosts should use fixincludes or fix their include files manually. Even if the include files conform, the library itself may not. For example, @code{strtod} and @code{mktime} have some bugs on some platforms. You can work around some of these problems by requiring the relevant modules, e.g., the Gnulib @code{mktime} module supplies a working and conforming @code{mktime}. @node C99 features assumed @subsection C99 features assumed by Gnulib Although the C99 standard specifies many features, Gnulib code is conservative about using them, partly because Gnulib predates the widespread adoption of C99, and partly because many C99 features are not well-supported in practice. C99 features that are reasonably portable nowadays include: @itemize @item A declaration after a statement, or as the first clause in a @code{for} statement. @item @code{long long int}. @item @code{}, although Gnulib code no longer uses it directly, preferring plain @code{bool} via the @code{stdbool} module instead. @xref{stdbool.h}. @item @code{}, assuming the @code{stdint} module is used. @xref{stdint.h}. @item Compound literals and designated initializers. @item Variadic macros.@* @findex __VA_ARGS__ Note: The handling of @code{__VA_ARGS__} in MSVC differs from the one in ISO C 99, see @url{https://stackoverflow.com/questions/5134523/}. But usually this matters only for macros that decompose @code{__VA_ARGS__}. @item @code{static inline} functions. @item @code{__func__}, assuming the @code{func} module is used. @xref{func}. @item The @code{restrict} qualifier, assuming @code{AC_REQUIRE([AC_C_RESTRICT])} is used. This qualifier is sometimes implemented via a macro, so C++ code that uses Gnulib should avoid using @code{restrict} as an identifier. @item Flexible array members (however, see the @code{flexmember} module). @end itemize @node C99 features avoided @subsection C99 features avoided by Gnulib Gnulib avoids some features even though they are standardized by C99, as they have portability problems in practice. Here is a partial list of avoided C99 features. Many other C99 features are portable only if their corresponding modules are used; Gnulib code that uses such a feature should require the corresponding module. @itemize @item Variable length arrays (VLAs) or variably modified types, without checking whether @code{__STDC_NO_VLA__} is defined. See the @code{vararrays} and @code{vla} modules. @item Block-scope variable length arrays, without checking whether either @code{GNULIB_NO_VLA} or @code{__STDC_NO_VLA__} is defined. This lets you define @code{GNULIB_NO_VLA} to pacify GCC when using its @option{-Wvla-larger-than warnings} option, and to avoid large stack usage that may have security implications. @code{GNULIB_NO_VLA} does not affect Gnulib's other uses of VLAs and variably modified types, such as array declarations in function prototype scope. @item Converting to pointers via integer types other than @code{intptr_t} or @code{uintptr_t}. Although the C standard says that values of these integer types, if they exist, should be convertible to and from @code{intmax_t} and @code{uintmax_t} without loss of information, on CHERI platforms such conversions result in integers that, if converted back to a pointer, cannot be dereferenced. @item @code{extern inline} functions, without checking whether they are supported. @xref{extern inline}. @item Type-generic math functions. @item Universal character names in source code. @item @code{}, since GNU programs need not worry about deficient source-code encodings. @item Comments beginning with @samp{//}. This is mostly for style reasons. @end itemize @node Other portability assumptions @subsection Other portability assumptions made by Gnulib Gnulib code makes the following assumptions that go beyond what C and POSIX require: @itemize @item Standard internal types like @code{ptrdiff_t} and @code{size_t} are no wider than @code{long}. The GNU coding standards allow code to make this assumption, POSIX requires implementations to support at least one programming environment where this is true, and such environments are recommended for Gnulib-using applications. When it is easy to port to non-POSIX platforms like MinGW where these types are wider than @code{long}, new Gnulib code should do so, e.g., by using @code{ptrdiff_t} instead of @code{long}. However, it is not always that easy, and no effort has been made to check that all Gnulib modules work on MinGW-like environments. @item @code{int} and @code{unsigned int} are at least 32 bits wide. POSIX and the GNU coding standards both require this. @item Signed integer arithmetic is two's complement. Previously, Gnulib code sometimes also assumed that signed integer arithmetic wraps around, but modern compiler optimizations sometimes do not guarantee this, and Gnulib code with this assumption is now considered to be questionable. @xref{Integer Properties}. Although some Gnulib modules contain explicit support for ones' complement and signed magnitude integer representations, which are allowed by C17 and earlier, these modules are the exception rather than the rule. All practical Gnulib targets use two's complement, which is required by C23. @item There are no ``holes'' in integer values: all the bits of an integer contribute to its value in the usual way. In particular, an unsigned type and its signed counterpart have the same number of bits when you count the latter's sign bit. (As an exception, Gnulib code is portable to CHERI platforms even though this assumption is false for CHERI.) @item Objects with all bits zero are treated as zero or as null pointers. For example, @code{memset@ (A, 0, sizeof@ A)} initializes an array @code{A} of pointers to null pointers. @item The types @code{intptr_t} and @code{uintptr_t} exist, and pointers can be converted to and from these types without loss of information. @item Addresses and sizes behave as if objects reside in a flat address space. In particular: @itemize @item If two nonoverlapping objects have sizes @var{S} and @var{T} represented as @code{ptrdiff_t} or @code{size_t} values, then @code{@var{S} + @var{T}} cannot overflow. @item A pointer @var{P} points within an object @var{O} if and only if @code{(char *) &@var{O} <= (char *) @var{P} && (char *) @var{P} < (char *) (&@var{O} + 1)}. @item Arithmetic on a valid pointer is equivalent to the same arithmetic on the pointer converted to @code{uintptr_t}, except that offsets are multiplied by the size of the pointed-to objects. For example, if @code{P + I} is a valid expression involving a pointer @var{P} and an integer @var{I}, then @code{(uintptr_t) (P + I) == (uintptr_t) ((uintptr_t) P + I * sizeof *P)}. Similar arithmetic can be done with @code{intptr_t}, although more care must be taken in case of integer overflow or negative integers. @item A pointer @code{P} has alignment @code{A} if and only if @code{(uintptr_t) P % A} is zero, and similarly for @code{intptr_t}. @item If an existing object has size @var{S}, and if @var{T} is sufficiently small (e.g., 8 KiB), then @code{@var{S} + @var{T}} cannot overflow. Overflow in this case would mean that the rest of your program fits into @var{T} bytes, which can't happen in realistic flat-address-space hosts. @item Adding zero to a null pointer does not change the pointer. For example, @code{0 + (char *) NULL == (char *) NULL}. @end itemize @end itemize Some system platforms violate these assumptions and are therefore not Gnulib porting targets. @xref{Unsupported Platforms}. @node High Quality @section High Quality We develop and maintain a testsuite for Gnulib. The goal is to have a 100% firm interface so that maintainers can feel free to update to the code in git at @emph{any} time and know that their application will not break. This means that before any change can be committed to the repository, a test suite program must be produced that exposes the bug for regression testing. @node Stable Branches @subsection Stable Branches In Gnulib, we don't use topic branches for experimental work. Therefore, occasionally a broken commit may be pushed in Gnulib. It does not happen often, but it does happen. To compensate for this, Gnulib offers ``stable branches''. These are branches of the Gnulib code that are maintained over some longer period (a year, for example) and include @itemize @item bug fixes, @item portability enhancements (to existing as well as to new platforms), @item updates to @code{config.guess} and @code{config.sub}. @end itemize @noindent Not included in the stable branches are: @itemize @item new features, such as new modules, @item optimizations, @item refactorings, @item complex or risky changes in general, @item updates to @code{texinfo.tex}, @item documentation updates. @end itemize So far, we have four stable branches: @table @code @item stable-202307 A stable branch that starts at the beginning of July 2023. @item stable-202301 A stable branch that starts at the beginning of January 2023. @item stable-202207 A stable branch that starts at the beginning of July 2022. It is no longer updated. @item stable-202201 A stable branch that starts at the beginning of January 2022. It is no longer updated. @end table The two use-cases of stable branches are thus: @itemize @item You want to protect yourself from occasional breakage in Gnulib. @item When making a bug-fix release of your code, you can incorporate bug fixes in Gnulib, by pulling in the newest commits from the same stable branch that you were already using for the previous release. @end itemize @node Writing reliable code @subsection Writing reliable code When compiling and testing Gnulib and Gnulib-using programs, certain compiler options can help improve reliability. First of all, make it a habit to use @samp{-Wall} in all compilation commands. Beyond that, the @code{manywarnings} module enables several forms of static checking in GCC and related compilers (@pxref{manywarnings}). For dynamic checking, you can run @code{configure} with @code{CFLAGS} options appropriate for your compiler. For example: @example ./configure \ CPPFLAGS='-Wall'\ CFLAGS='-g3 -O2'\ ' -D_FORTIFY_SOURCE=2'\ ' -fsanitize=undefined'\ ' -fsanitize-undefined-trap-on-error' @end example @noindent Here: @itemize @bullet @item @code{-D_FORTIFY_SOURCE=2} enables extra security hardening checks in the GNU C library. @item @code{-fsanitize=undefined} enables GCC's undefined behavior sanitizer (@code{ubsan}), and @item @code{-fsanitize-undefined-trap-on-error} causes @code{ubsan} to abort the program (through an ``illegal instruction'' signal). This measure stops exploit attempts and also allows you to debug the issue. @end itemize Without the @code{-fsanitize-undefined-trap-on-error} option, @code{-fsanitize=undefined} causes messages to be printed, and execution continues after an undefined behavior situation. The message printing causes GCC-like compilers to arrange for the program to dynamically link to libraries it might not otherwise need. With GCC, instead of @code{-fsanitize-undefined-trap-on-error} you can use the @code{-static-libubsan} option to arrange for two of the extra libraries (@code{libstdc++} and @code{libubsan}) to be linked statically rather than dynamically, though this typically bloats the executable and the remaining extra libraries are still linked dynamically. It is also good to occasionally run the programs under @code{valgrind} (@pxref{Running self-tests under valgrind}). @include join-gnu.texi