    Better String library
    ---------------------
    
    by Paul Hsieh
    
    The bstring library is an attempt to provide improved string processing
    functionality to the C and C++ languages.  At the heart of the bstring library
    (Bstrlib for short) is the management of "bstring"s which are a significant
    improvement over '\0' terminated char buffers.
    
    ===============================================================================
    
    Motivation
    ----------
    
    The standard C string library has serious problems:
    
        1) Its use of '\0' to denote the end of the string means knowing a
           string's length is O(n) when it could be O(1).
        2) It imposes an interpretation for the character value '\0'.
        3) gets() always exposes the application to a buffer overflow.
        4) strtok() modifies the string it's parsing and thus may not be usable in
           programs which are re-entrant or multithreaded.
        5) fgets has the unusual semantic of ignoring '\0's that occur before
           '\n's are consumed.
        6) There is no memory management, and actions performed such as strcpy,
           strcat and sprintf are common places for buffer overflows.
        7) strncpy() doesn't '\0' terminate the destination in some cases.
        8) Passing NULL to C library string functions causes an undefined NULL
           pointer access.
        9) Parameter aliasing (overlapping, or self-referencing parameters)
           within most C library functions has undefined behavior.
       10) Many C library string function calls take integer parameters with
           restricted legal ranges.  Parameters passed outside these ranges are
           not typically detected and cause undefined behavior.
    
    So the desire is to create an alternative string library that does not suffer
    from the above problems and adds in the following functionality:
    
        1) Incorporate string functionality seen from other languages.
            a) MID$() - from BASIC
            b) split()/join() - from Python
            c) string/char x n - from Perl
        2) Implement analogs to functions that combine stream IO and char buffers
           without creating a dependency on stream IO functionality.
        3) Implement the basic text editor-style functions insert, delete, find,
           and replace.
        4) Implement reference based sub-string access (as a generalization of
           pointer arithmetic.)
        5) Implement runtime write protection for strings.
    
    There is also a desire to avoid "API-bloat".  So functionality that can be
    implemented trivially in other functionality is omitted.  So there is no
    left$() or right$() or reverse() or anything like that as part of the core
    functionality.
    
    Explaining Bstrings
    -------------------
    
    A bstring is basically a header which wraps a pointer to a char buffer.  Let's
    start with the declaration of a struct tagbstring:
    
        struct tagbstring {
            int mlen;
            int slen;
            unsigned char * data;
        };
    
    This definition is considered exposed, not opaque (though it is neither
    necessary nor recommended that low level maintenance of bstrings be performed
    whenever the abstract interfaces are sufficient).  The mlen field (usually)
    describes a lower bound for the memory allocated for the data field.  The
    slen field describes the exact length for the bstring.  The data field is a
    single contiguous buffer of unsigned chars.  Note that the existence of a '\0'
    character in the unsigned char buffer pointed to by the data field does not
    necessarily denote the end of the bstring.
    
    To be a well formed modifiable bstring the mlen field must be at least the
    length of the slen field, and slen must be non-negative.  Furthermore, the
    data field must point to a valid buffer in which access to the first mlen
    characters has been acquired.  So the minimal check for correctness is:
    
        (slen >= 0 && mlen >= slen && data != NULL)
    
    bstrings returned by bstring functions can be assumed to be either NULL or
    satisfy the above property.  (When bstrings are only readable, the mlen >=
    slen restriction is not required; this is discussed later in this section.)
    A bstring itself is just a pointer to a struct tagbstring:
    
        typedef struct tagbstring * bstring;
    
    Note that use of the prefix "tag" in struct tagbstring is required to work
    around the inconsistency between C and C++'s struct namespace usage.  This
    definition is also considered exposed.
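
    As a minimal sketch of the two correctness checks described above, the
    following helpers test a header for the modifiable and the read only
    case.  The helper names are hypothetical and are not part of Bstrlib;
    the struct is repeated locally only so that the fragment compiles
    without bstrlib.h.

```c
#include <stddef.h>

/* Local copy of the exposed declaration, so the sketch stands alone. */
struct tagbstring {
    int mlen;
    int slen;
    unsigned char * data;
};

/* Hypothetical helper (not part of Bstrlib): returns 1 if b passes the
   minimal check for a modifiable bstring, and 0 otherwise. */
int is_writable_wellformed (const struct tagbstring * b) {
    return b != NULL && b->slen >= 0 && b->mlen >= b->slen &&
           b->data != NULL;
}

/* Hypothetical helper: the reduced check for read only use, in which the
   mlen field is not examined. */
int is_readable_wellformed (const struct tagbstring * b) {
    return b != NULL && b->slen >= 0 && b->data != NULL;
}
```

    Note that these checks cannot establish that data really points to mlen
    accessible characters; that part remains the responsibility of whoever
    constructed the header.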
    
    Bstrlib basically manages bstrings allocated as a header and an associated
    data-buffer.  Since the implementation is exposed, they can also be
    constructed manually.  Functions which mutate bstrings assume that the header
    and data buffer have been malloced; the bstring library may perform free() or
    realloc() on both the header and data buffer of any bstring parameter.
    Functions which return bstrings create new bstrings.  The string memory is
    freed by a bdestroy() call (or using the bstrFree macro).
    
    The following related typedef is also provided:
    
        typedef const struct tagbstring * const_bstring;
    
    which is also considered exposed.  These are directly bstring compatible (no
    casting required) but are just used for parameters which are meant to be
    non-mutable.  So in general, bstring parameters which are read as input but
    not meant to be modified will be declared as const_bstring, and bstring
    parameters which may be modified will be declared as bstring.  This convention
    is recommended for user written functions as well.
    
    Since bstrings maintain interoperability with C library char-buffer style
    strings, all functions which modify, update or create bstrings also append a
    '\0' character at index slen of the data buffer (i.e., the position slen + 1
    when counting from 1).  This trailing '\0' character is not required for
    bstrings input to the bstring functions; it is provided solely as a
    convenience for interoperability with standard C char-buffer functionality.
    
    Analogs for the ANSI C string library functions have been created when they
    are necessary, but have also been left out when they are not.  In particular
    there are no functions analogous to fwrite, or puts just for the purposes of
    bstring.  The ->data member of any string is exposed, and therefore can be
    used just as easily as char buffers for C functions which read strings.
    
    For those that wish to hand construct bstrings, the following should be kept
    in mind:
    
        1) While bstrlib can accept constructed bstrings without terminating
           '\0' characters, the rest of the C language string library will not
           function properly on such non-terminated strings.  This is obvious
           but must be kept in mind.
        2) If it is intended that a constructed bstring be written to by the
           bstring library functions then the data portion should be allocated
           by the malloc function and the slen and mlen fields should be entered
           properly.  The struct tagbstring header is not reallocated, and only
           freed by bdestroy.
        3) Writing arbitrary '\0' characters at various places in the string
           will not modify its length as perceived by the bstring library
           functions.  In fact, '\0' is a legitimate non-terminating character
           for a bstring to contain.
        4) For read only parameters, bstring functions do not check the mlen.
           I.e., the minimal correctness requirements are reduced to:
    
                (slen >= 0 && data != NULL)
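
    The points above can be illustrated with a small hand construction.
    This is only a sketch: hand_construct is a hypothetical helper, not a
    Bstrlib function, and the struct is repeated locally so that the
    fragment compiles on its own.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct tagbstring { int mlen; int slen; unsigned char * data; };

/* Hypothetical helper (not part of Bstrlib): fill in a caller supplied
   header so that it describes a copy of the len bytes at blk.  The data
   buffer is obtained from malloc, as point 2 requires for a bstring that
   is to be written to.  Returns 0 on success and -1 on failure. */
int hand_construct (struct tagbstring * b, const void * blk, int len) {
    if (b == NULL || blk == NULL || len < 0 || len == INT_MAX) return -1;
    b->data = malloc ((size_t) len + 1);
    if (b->data == NULL) return -1;
    memcpy (b->data, blk, (size_t) len);
    b->data[len] = (unsigned char) '\0';  /* terminator, for C functions */
    b->slen = len;                        /* exact length of the string */
    b->mlen = len + 1;                    /* bytes obtained from malloc */
    return 0;
}
```

    Constructing from the 5 byte block "ab\0cd" gives slen == 5, while
    strlen on the data member reports 2: the embedded '\0' is ordinary
    content as far as the header is concerned.  In this sketch the data
    buffer is released directly with free().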
    
    Better pointer arithmetic
    -------------------------
    
    One built-in feature of '\0' terminated char * strings, is that it's very easy
    and fast to obtain a reference to the tail of any string using pointer
    arithmetic.  Bstrlib does one better by providing a way to get a reference to
    any substring of a bstring (or any other length delimited block of memory.)
    So rather than just having pointer arithmetic, with bstrlib one essentially
    has segment arithmetic.  This is achieved using the macro blk2tbstr() which
    builds a reference to a block of memory and the macro bmid2tbstr() which
    builds a reference to a segment of a bstring.  Bstrlib also includes
    functions for direct consumption of memory blocks into bstrings, namely
    bcatblk () and blk2bstr ().
    
    One scenario where this can be extremely useful is when a string contains
    many substrings which one would like to pass as read-only reference
    parameters to some string consuming function without the need to allocate
    entire new containers for the string data.  More concretely, imagine parsing
    a command line string whose parameters are space delimited.  With '\0'
    terminated char * strings, such references can only be made to tails of the
    string.
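
    The command line scenario can be sketched without any copying.  The
    helpers below are hypothetical and only written in the spirit of
    bmid2tbstr; they do not reproduce the real macro, and setting mlen to
    -1 to mark a view as not writable is an assumption of this sketch.

```c
#include <stddef.h>

struct tagbstring { int mlen; int slen; unsigned char * data; };

/* Hypothetical helper: make *ref a view of up to len characters of src
   starting at pos.  Nothing is copied.  Out of range requests are
   clamped, possibly down to an empty view. */
void mid_ref (struct tagbstring * ref, const struct tagbstring * src,
              int pos, int len) {
    ref->mlen = -1;                  /* assumption: "not writable" marker */
    ref->slen = 0;
    ref->data = (unsigned char *) "";
    if (src == NULL || src->data == NULL || src->slen < 0) return;
    if (len <= 0) return;
    if (pos < 0) { len += pos; pos = 0; }
    if (len > src->slen - pos) len = src->slen - pos;
    if (len <= 0) return;
    ref->data = src->data + pos;
    ref->slen = len;
}

/* Fill out[0] .. out[max-1] with views of the space delimited words of
   line and return the number of words found (at most max). */
int word_refs (const struct tagbstring * line, struct tagbstring * out,
               int max) {
    int i = 0, n = 0;
    if (line == NULL || line->data == NULL || out == NULL) return 0;
    while (i < line->slen && n < max) {
        int start;
        while (i < line->slen && line->data[i] == ' ') i++;  /* skip gap */
        start = i;
        while (i < line->slen && line->data[i] != ' ') i++;  /* scan word */
        if (i > start) mid_ref (&out[n++], line, start, i - start);
    }
    return n;
}
```

    Each view shares the bytes of the original line, so a word in the
    middle of the line can be passed on as a length delimited parameter
    even though no '\0' follows it.  The views are only valid for as long
    as the buffer of the original line is.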
    
    Improved NULL semantics and error handling
    ------------------------------------------
    
    Unless otherwise noted, if a NULL pointer is passed as a bstring or any other
    detectably illegal parameter, the called function will return with an error
    indicator (either NULL or BSTR_ERR) rather than simply performing a NULL
    pointer access, or having undefined behavior.
    
    To illustrate the value of this, consider the following example:
    
            strcpy (p = malloc (13 * sizeof (char)), "Hello,");
            strcat (p, " World");
    
    This is not correct because malloc may return NULL (due to an out of memory
    condition), and the behaviour of strcpy is undefined if either of its
    parameters are NULL.  However:
    
            bstrcat (p = bfromcstr ("Hello,"), q = bfromcstr (" World"));
            bdestroy (q);
    
    is well defined, because if either p or q are assigned NULL (indicating a
    failure to allocate memory) both bstrcat and bdestroy will recognize it and
    perform no detrimental action.
    
    Note that it is not necessary to check any of the members of a returned
    bstring for internal correctness (in particular the data member does not need
    to be checked against NULL when the header is non-NULL), since this is
    assured by the bstring library itself.
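
    To make the convention concrete, here is a minimal sketch of three
    functions written in the same style: every one of them detects NULL
    (and other detectably malformed parameters) and reports an error
    instead of dereferencing.  The names and the error value -1 are
    assumptions of the sketch; this is not the Bstrlib implementation.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ERR (-1)   /* assumed error indicator for this sketch */
#define SKETCH_OK  (0)

struct tagbstring { int mlen; int slen; unsigned char * data; };

/* Create a heap allocated string from a C string; NULL on any failure. */
struct tagbstring * from_cstr (const char * s) {
    struct tagbstring * b;
    size_t n;
    if (s == NULL) return NULL;
    n = strlen (s);
    if (n >= (size_t) INT_MAX) return NULL;
    b = malloc (sizeof *b);
    if (b == NULL) return NULL;
    b->data = malloc (n + 1);
    if (b->data == NULL) { free (b); return NULL; }
    memcpy (b->data, s, n + 1);
    b->slen = (int) n;
    b->mlen = (int) n + 1;
    return b;
}

/* Append src to dst.  A NULL or malformed parameter yields SKETCH_ERR and
   leaves dst untouched.  The result is built in a fresh buffer before the
   old one is released, so src may even be dst itself. */
int concat (struct tagbstring * dst, const struct tagbstring * src) {
    unsigned char * buf;
    int total;
    if (dst == NULL || dst->data == NULL || dst->slen < 0 ||
        dst->mlen < dst->slen || src == NULL || src->data == NULL ||
        src->slen < 0) return SKETCH_ERR;
    if (src->slen > INT_MAX - 1 - dst->slen) return SKETCH_ERR;
    total = dst->slen + src->slen;
    buf = malloc ((size_t) total + 1);
    if (buf == NULL) return SKETCH_ERR;
    memcpy (buf, dst->data, (size_t) dst->slen);
    memcpy (buf + dst->slen, src->data, (size_t) src->slen);
    buf[total] = (unsigned char) '\0';
    free (dst->data);
    dst->data = buf;
    dst->slen = total;
    dst->mlen = total + 1;
    return SKETCH_OK;
}

/* Release a string made by from_cstr; NULL is reported, not accessed. */
int destroy (struct tagbstring * b) {
    if (b == NULL || b->data == NULL) return SKETCH_ERR;
    free (b->data);
    free (b);
    return SKETCH_OK;
}
```

    With these, concat (p = from_cstr ("Hello,"), q = from_cstr (" World"))
    followed by destroy (q) is safe even if an allocation fails, since a
    NULL p or q simply turns each call into an error return.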
    
    bStreams
    --------
    
    In addition to the bgets and bread functions, bstrlib can abstract streams
    with a high performance read only stream called a bStream.  In general, the
    idea is to open a core stream (with something like fopen) then pass its
    handle as well as a bNread function pointer (like fread) to the bsopen
    function which will return a handle to an open bStream.  Then the functions
    bsread, bsreadln or bsreadlns can be called to read portions of the stream.
    Finally, the bsclose function is called to close the bStream -- it will
    return a handle to the original (core) stream.  So bStreams, essentially,
    wrap other streams.
    
    The bStreams have two main advantages over the bgets and bread (as well as
    fgets/ungetc) paradigms:
    
    1) Improved functionality via the bunread function which allows a stream to
       unread characters, giving the bStream stack-like functionality if so
       desired.
    2) A very high performance bsreadln function.  The C library function fgets()
       (and the bgets function) can typically be written as a loop on top of
       fgetc(), thus paying all of the overhead costs of calling fgetc on a per
       character basis.  bsreadln will read blocks at a time, thus amortizing the
       overhead of fread calls over many characters at once.
    
    However, clearly bStreams are suboptimal or unusable for certain kinds of
    streams (stdin) or certain usage patterns (a few spotty, or non-sequential
    reads from a slow stream.)  For those situations, using bgets will be more
    appropriate.
    
    The semantics of bStreams allows practical construction of layerable data
    streams.  What this means is that by writing a bNread compatible function on
    top of a bStream, one can construct a new bStream on top of it.  This can be
    useful for writing multi-pass parsers that don't actually read the entire
    input more than once and don't require the use of intermediate storage.
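
    The performance argument for block reading can be sketched as follows.
    This is not the bStream implementation: read_fn, line_stream, ls_open,
    ls_readln and mem_read are hypothetical names, and the callback is
    merely assumed to have the same shape as fread.  The reader fetches a
    block at a time through the callback and then serves lines out of its
    own buffer.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: a read callback shaped like fread, with parm standing in
   for the FILE * (or any other stream handle). */
typedef size_t (* read_fn) (void * buf, size_t elsize, size_t nelem,
                            void * parm);

#define CHUNK 16   /* kept small here so that lines span several blocks */

struct line_stream {
    read_fn readfn;
    void * parm;
    unsigned char buf[CHUNK];
    size_t pos, len;          /* unread bytes are buf[pos] .. buf[len-1] */
    int eof;
};

void ls_open (struct line_stream * s, read_fn readfn, void * parm) {
    s->readfn = readfn;
    s->parm = parm;
    s->pos = s->len = 0;
    s->eof = 0;
}

/* Store the next line (with its '\n', if any) in out, which has room for
   cap >= 2 bytes, and '\0' terminate it.  Returns the number of
   characters stored; 0 means end of stream or a bad parameter.  Lines
   longer than cap - 1 characters come back in pieces. */
size_t ls_readln (struct line_stream * s, char * out, size_t cap) {
    size_t n = 0;
    if (s == NULL || s->readfn == NULL || out == NULL || cap < 2) return 0;
    while (n + 1 < cap) {
        if (s->pos == s->len) {          /* block used up: fetch another */
            if (s->eof) break;
            s->len = s->readfn (s->buf, 1, CHUNK, s->parm);
            s->pos = 0;
            if (s->len == 0) { s->eof = 1; break; }
        }
        out[n] = (char) s->buf[s->pos++];
        if (out[n++] == '\n') break;
    }
    out[n] = '\0';
    return n;
}

/* Example core stream: an in-memory buffer that counts read calls. */
struct mem_src { const char * data; size_t len, off; int calls; };

size_t mem_read (void * buf, size_t elsize, size_t nelem, void * parm) {
    struct mem_src * m = parm;
    size_t avail;
    m->calls++;
    if (elsize == 0) return 0;
    avail = (m->len - m->off) / elsize;
    if (nelem > avail) nelem = avail;
    memcpy (buf, m->data + m->off, nelem * elsize);
    m->off += nelem * elsize;
    return nelem;
}
```

    The number of callback invocations grows with the number of blocks, not
    with the number of characters.  Since ls_readln only needs a function
    of the read_fn shape, a second reader could in principle be stacked on
    a function that itself pulls from a line_stream, which is the layering
    idea described above.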
    
    Aliasing
    --------
    
    Aliasing occurs when a function is given two parameters which point to data
    structures which overlap in the memory they occupy.  While this does not
    disturb read only functions, for many libraries this can make functions that
    write to these memory locations malfunction.  This is a common problem of the
    C standard library and especially the string functions in the C standard
    library.
    
    The C standard string library is entirely char by char oriented (as is
    bstring) which makes conforming implementations alias safe for some
    scenarios.  However no actual detection of aliasing is typically performed,
    so it is easy to find cases where the aliasing will cause anomalous or
    undesirable behaviour (consider: strcat (p, p).)  The C99 standard includes
    the "restrict" pointer modifier which allows the compiler to document and
    assume a no-alias condition on usage.  However, only the most trivial cases
    can be caught (if at all) by the compiler at compile time, and thus there is
    no actual enforcement of non-aliasing.
    
    Bstrlib, by contrast, permits aliasing and is completely aliasing safe, in
    the C99 sense of aliasing.  That is to say, under the assumption that
    pointers of incompatible types from distinct objects can never alias, bstrlib
    is completely aliasing safe.  (In practice this means that the data buffer
    portion of any bstring and header of any bstring are assumed to never alias.)
    With the exception of the reference building macros, the library behaves as
    if all read-only parameters are first copied and replaced by temporary
    non-aliased parameters before any writing to any output bstring is performed
    (though actual copying is extremely rarely ever done.)
    
    Besides being a useful safety feature, bstring searching/comparison
    functions can improve to O(1) execution when aliasing is detected.
    
    Note that aliasing detection and handling code in Bstrlib is generally
    extremely cheap.  There is almost never any appreciable performance penalty
    for using aliased parameters.
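
    The following sketch shows why such handling can be cheap.  It is an
    illustration only, with hypothetical function names, and is not taken
    from the Bstrlib sources.  The append remembers an aliased source as an
    offset into the destination before the buffer may be moved by realloc,
    and the equality test returns immediately when both headers describe
    the same bytes.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct tagbstring { int mlen; int slen; unsigned char * data; };

/* Append src to dst in place.  dst->data must come from malloc.  src may
   be dst itself or a view into dst's buffer.  Returns 0, or -1 on error
   (in which case dst is unchanged). */
int append_alias_safe (struct tagbstring * dst,
                       const struct tagbstring * src) {
    int dlen, slen, aliased = 0;
    size_t off = 0;
    uintptr_t d, s;

    if (dst == NULL || dst->data == NULL || dst->slen < 0 ||
        dst->mlen < dst->slen || src == NULL || src->data == NULL ||
        src->slen < 0) return -1;

    dlen = dst->slen;
    slen = src->slen;       /* read now: src may be the same header as dst */
    if (slen > INT_MAX - 1 - dlen) return -1;

    /* Remember the source as an offset if it lies inside dst's buffer,
       because realloc may move that buffer. */
    d = (uintptr_t) dst->data;
    s = (uintptr_t) src->data;
    if (s >= d && s - d < (uintptr_t) dst->mlen) {
        aliased = 1;
        off = (size_t) (s - d);
    }

    if (dst->mlen < dlen + slen + 1) {
        unsigned char * p = realloc (dst->data, (size_t) dlen + slen + 1);
        if (p == NULL) return -1;          /* dst is left untouched */
        dst->data = p;
        dst->mlen = dlen + slen + 1;
    }

    /* memmove, not memcpy: source and target ranges may overlap. */
    memmove (dst->data + dlen, aliased ? dst->data + off : src->data,
             (size_t) slen);
    dst->slen = dlen + slen;
    dst->data[dst->slen] = (unsigned char) '\0';
    return 0;
}

/* Returns 1 if equal, 0 if not, -1 on a bad parameter.  When both
   headers point at the same bytes no scan is needed at all. */
int equal (const struct tagbstring * a, const struct tagbstring * b) {
    if (a == NULL || b == NULL || a->data == NULL || b->data == NULL ||
        a->slen < 0 || b->slen < 0) return -1;
    if (a->slen != b->slen) return 0;
    if (a->data == b->data || a->slen == 0) return 1;
    return memcmp (a->data, b->data, (size_t) a->slen) == 0;
}
```

    The extra work for the aliased case is one range test and one offset,
    which is the sense in which the detection is cheap.  A separate view
    header that pointed into dst before the call may be left dangling if
    the buffer was moved, so it should be rebuilt afterwards.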
    
    Reentrancy
    ----------
    
    Nearly every function in Bstrlib is a leaf function, and is completely
    reentrant with the exception of writing to common bstrings.  The split
    functions which use a callback mechanism require only that the source string
    not be destroyed by the callback function unless the callback function returns
    with an error status (note that Bstrlib functions which return an error do
    not modify the string in any way.)  The string can in fact be modified by the
    callback and the behaviour is deterministic.  See the documentation of the
    various split functions for more details.
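
    One way such a callback mechanism can stay deterministic is to hand the
    callback offsets rather than pointers, and to re-read the header after
    every call.  The sketch below illustrates that idea with hypothetical
    names; it is not the Bstrlib split implementation, and the actual
    interfaces are described with the split functions themselves.

```c
#include <stddef.h>

struct tagbstring { int mlen; int slen; unsigned char * data; };

/* Call cb once per sep delimited field of s, passing the offset and the
   length of the field.  A negative return from cb aborts the walk and is
   passed back to the caller.  Returns 0 when all fields were visited and
   -1 on a bad parameter. */
int split_cb (const struct tagbstring * s, unsigned char sep,
              int (* cb) (void * parm, int ofs, int len), void * parm) {
    int start = 0;
    if (s == NULL || s->data == NULL || s->slen < 0 || cb == NULL)
        return -1;
    for (;;) {
        int i = start, ret;
        /* s->slen and s->data are read afresh on every pass, so changes
           made by the previous callback are taken into account. */
        while (i < s->slen && s->data[i] != sep) i++;
        ret = cb (parm, start, i - start);
        if (ret < 0) return ret;
        if (i >= s->slen) return 0;
        start = i + 1;
    }
}

/* Example callback: count the fields in the int that parm points to. */
int count_fields (void * parm, int ofs, int len) {
    (void) ofs;
    (void) len;
    ++*(int *) parm;
    return 0;
}

/* Example callback: count fields, but abort once two have been seen. */
int stop_after_two (void * parm, int ofs, int len) {
    (void) ofs;
    (void) len;
    return (++*(int *) parm >= 2) ? -1 : 0;
}
```

    For the string "a,bb,,c" and the separator ',' the first callback is
    invoked four times (the third field is empty), while the second one
    ends the walk after two fields by returning a negative value.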
    
    Undefined scenarios
    -------------------
    
    One of the basic important premises for Bstrlib is not to increase the
    propagation of undefined situations from parameters that are otherwise legal
    in and of themselves.  In particular, except for extremely marginal cases,
    usages of bstrings that use the bstring library functions alone cannot lead
    to any undefined action.  But due to C/C++ language and library limitations,
    there is no way to define a non-trivial library that is completely without
    undefined operations.  All such possible undefined operations are described
    below:
    
    1) bstrings or struct tagbstrings that are not explicitly initialized cannot
       be passed as a parameter to any bstring function.
    2) The members of the NULL bstring cannot be accessed directly.  (Though all
       APIs and macros detect the NULL bstring.)
    3) A bstring whose data member has not been obtained from a malloc or
       compatible call and which is write accessible passed as a writable
       parameter will lead to undefined results.  (i.e., do not writeAllow any
       constructed bstrings unless the data portion has been obtained from the
       heap.)
    4) If the headers of two strings alias but are not identical (which can only
       happen via a defective manual construction), then passing them to a
       bstring function in which one is writable is not defined.
    5) If the mlen member is larger than the actual accessible length of the data
       member for a writable bstring, or if the slen member is larger than the
       readable length of the data member for a readable bstring, then the
       corresponding bstring operations are undefined.
    6) Any bstring definition whose header or accessible data portion has been
       assigned to inaccessible or otherwise illegal memory clearly cannot be
       acted upon by the bstring library in any way.
    7) Destroying the source of an incremental split from within the callback
       and not returning with a negative value (indicating that it should abort)
       will lead to undefined behaviour.  (Though *modifying* or adjusting the
       state of the source data, even if those modifications fail within the
       bstrlib API, has well defined behavior.)
    8) Modifying a bstring which is write protected by direct access has
       undefined behavior.
    
    While this may seem like a long list, with the exception of invalid uses of
    the writeAllow macro, and source destruction during an iterative split
    without an accompanying abort, no usage of the bstring API alone can cause
    any undefined scenario to occur.  I.e., the policy of restricting usage of
    bstrings to the bstring API can significantly reduce the risk of runtime
    errors (in practice it should eliminate them) related to string manipulation
    due to undefined action.
    
    C++ wrapper
    -----------
    
    A C++ wrapper has been created to enable bstring functionality for C++ in the
    most natural (for C++ programmers) way possible.  The mandate for the C++
    wrapper is different from the base C bstring library.  Since the C++ language
    has far more abstracting capabilities, the CBString structure is considered
    fully abstracted -- i.e., hand generated CBStrings are not supported (though
    conversion from a struct tagbstring is allowed) and all detectable errors are
    manifest as thrown exceptions.
    
    - The C++ class definitions are all under the namespace Bstrlib.  bstrwrap.h
      enables this namespace (with a using namespace Bstrlib; directive at the
      end) unless the macro BSTRLIB_DONT_ASSUME_NAMESPACE has been defined before
      it is included.
    
    - Erroneous accesses result in an exception being thrown.  The exception
      parameter is of type "struct CBStringException" which is derived from
      std::exception if STL is used.  A verbose description of the error message
      can be obtained from the what() method.
    
    - CBString is a C++ structure derived from a struct tagbstring.  An address
      of a CBString cast to a bstring must not be passed to bdestroy.  The bstring
      C API has been made C++ safe and can be used directly in a C++ project.
    
    - It includes constructors which can take a char, '\0' terminated char
      buffer, tagbstring, (char, repeat-value), a length delimited buffer or a
      CBStringList to initialize it.
    
    - Concatenation is performed with the + and += operators.  Comparisons are
      done with the ==, !=, <, >, <= and >= operators.  Note that == and != use
      the biseq call, while <, >, <= and >= use bstrcmp.
    
    - CBStrings can be directly cast to const character buffers.
    
    - CBStrings can be directly cast to double, float, int or unsigned int so
      long as the CBStrings are decimal representations of those types (otherwise
      an exception will be thrown).  Converting the other way should be done with
      the format(a) method(s).
    
    - CBString contains the length, character and [] accessor methods.  The
      character and [] accessors are aliases of each other.  If the bounds for
      the string are exceeded, an exception is thrown.  To avoid the overhead for
      this check, first cast the CBString to a (const char *) and use [] to
      dereference the array as normal.  Note that the character and [] accessor
      methods allow both reading and writing of individual characters.
    
    - The methods: format, formata, find, reversefind, findcaseless,
      reversefindcaseless, midstr, insert, insertchrs, replace, findreplace,
      findreplacecaseless, remove, findchr, nfindchr, alloc, toupper, tolower,
      gets, read are analogous to the functions that can be found in the C API.
    
    - The caselessEqual and caselessCmp methods are analogous to biseqcaseless
      and bstricmp functions respectively.
    
    - Note that just like the bformat function, the format and formata methods do
      not automatically cast CBStrings into char * strings for "%s"-type
      substitutions:
    
        CBString w("world");
        CBString h("Hello");
        CBString hw;
    
        /* The casts are necessary */
        hw.format ("%s, %s", (const char *)h, (const char *)w);
    
    - The methods trunc and repeat have been added instead of using pattern.
    
    - ltrim, rtrim and trim methods have been added.  These remove characters
      from a given character string set (defaulting to the whitespace characters)
      from either the left, right or both ends of the CBString, respectively.
    
    - The method setsubstr is also analogous in functionality to bsetstr, except
      that it cannot be passed NULL.  Instead the method fill and the fill-style
      constructor have been supplied to enable this functionality.
    
    - The writeprotect(), writeallow() and iswriteprotected() methods are
      analogous to the bwriteprotect(), bwriteallow() and biswriteprotected()
      macros in the C API.  Write protection semantics in CBString are stronger
      than with the C API in that indexed character assignment is checked for
      write protection.  However, unlike with the C API, a write protected
      CBString can be destroyed by the destructor.
    
    - CBStream is a C++ structure which wraps a struct bStream (it's not derived
      from it, since destruction is slightly different).  It is constructed by
      passing in a bNread function pointer and a stream parameter cast to void *.
      This structure includes methods for detecting eof, setting the buffer
      length, reading the whole stream or reading entries line by line or block
      by block, an unread function, and a peek function.
    
    - If STL is available, the CBStringList structure is derived from a vector of
      CBString with various split methods.  The split method has been overloaded
      to accept either a character or CBString as the second parameter (when the
      split parameter is a CBString any character in that CBString is used as a
      separator).  The splitstr method takes a CBString as a substring separator.
      Joins can be performed via a CBString constructor which takes a
      CBStringList as a parameter, or just using the CBString::join() method.
    
    - If there is proper support for std::iostreams, then the >> and << operators
      and the getline() function have been added (with semantics the same as
      those for std::string).
    
    Multithreading
    --------------
    
    A mutable bstring is kind of analogous to a small (two entry) linked list
    allocated by malloc, with all aliasing completely under programmer control.
    I.e., manipulation of one bstring will never affect any other distinct
    bstring unless explicitly constructed to do so by the programmer via hand
    construction or via building a reference.  Bstrlib also does not use any
    static or global storage, so there are no hidden unremovable race conditions.
    Bstrings are also clearly not inherently thread local.  So just like
    char *'s, bstrings can be passed around from thread to thread and shared and
    so on, so long as modifications to a bstring correspond to some kind of
    exclusive access lock as should be expected (or if the bstring is read-only,
    which can be enforced by bstring write protection) for any sort of shared
    object in a multithreaded environment.
    
    Bsafe module
    ------------
    
    For convenience, a bsafe module has been included.  The idea is that if this
    module is included, inadvertent usage of the most dangerous C functions will
    be overridden and lead to an immediate run time abort.  Of course, it should
    be emphasized that usage of this module is completely optional.  The
    intention is essentially to provide an option for creating project safety
    rules which can be enforced mechanically rather than socially.  This is
    useful for larger, or open development projects where it's more difficult to
    enforce social rules or "coding conventions".
    
    Problems not solved
    -------------------
    
    Bstrlib is written for the C and C++ languages, which have inherent weaknesses
    that cannot be easily solved:
    
    1. Memory leaks:  Forgetting to call bdestroy on a bstring that is about to be
       unreferenced is a leak, just as forgetting to call free on a heap buffer
       that is about to be unreferenced is.  Bstrlib itself, though, is leak free.
    2. Read before write usage:  In C, declaring an auto bstring does not
       automatically fill it with legal/valid contents.  This problem has been
       somewhat mitigated in C++.  (The bstrDeclare and bstrFree macros from
       bstraux can be used to help mitigate this problem.)
    
    Other problems not addressed:
    
    3. Built-in mutex usage to automatically avoid all bstring internal race
       conditions in multitasking environments: The problem with trying to
       implement such things at this low a level is that it is typically more
       efficient to use locks in higher level primitives. There is also no
       platform independent way to implement locks or mutexes.
    
    Note that except for spotty support of wide characters, the default C
    standard library does not address any of these problems either.
    
    Configurable compilation options
    --------------------------------
    
    The Better String Library is not an application, it is a library.  To compile
    it, you need to compile bstrlib.c to an object file that is linked to your
    application.  A Makefile might contain entries such as the following to
    accomplish this:
    
    BSTRDIR = $(CDIR)/bstrlib
    INCLUDES = -I$(BSTRDIR)
    BSTROBJS = $(ODIR)/bstrlib.o
    DEFINES =
    CFLAGS = -O3 -Wall -pedantic -ansi -s $(DEFINES)
    
    application: $(ODIR)/main.o $(BSTROBJS)
    	echo Linking: $@
    	$(CC) $< $(BSTROBJS) -o $@
    
    $(ODIR)/%.o : $(BSTRDIR)/%.c
    	echo Compiling: $<
    	$(CC) $(CFLAGS) $(INCLUDES) -c $< -o $@
    
    $(ODIR)/%.o : %.c
    	echo Compiling: $<
    	$(CC) $(CFLAGS) $(INCLUDES) -c $< -o $@
    
    You can configure bstrlib using the standard macro defines passed to
    the compiler.  All configuration options are meant solely for the purpose of
    compiler compatibility.  Configuration options are not meant to change the
    semantics or capabilities of the library, except where it is unavoidable.
    
    Since some C++ compilers don't include the Standard Template Library and some
    have the options of disabling exception handling, a number of macros can be
    used to conditionally compile support for each of these:
    
    BSTRLIB_CAN_USE_STL
    
      - defining this will enable the use of the Standard Template Library.
        Defining BSTRLIB_CAN_USE_STL overrides the BSTRLIB_CANNOT_USE_STL macro.
    
    BSTRLIB_CANNOT_USE_STL
    
      - defining this will disable the use of the Standard Template Library.
        Defining BSTRLIB_CAN_USE_STL overrides the BSTRLIB_CANNOT_USE_STL macro.
    
    BSTRLIB_CAN_USE_IOSTREAM
    
      - defining this will enable the use of streams from class std.  Defining
        BSTRLIB_CAN_USE_IOSTREAM overrides the BSTRLIB_CANNOT_USE_IOSTREAM macro.
    
    BSTRLIB_CANNOT_USE_IOSTREAM
    
      - defining this will disable the use of streams from class std.  Defining
        BSTRLIB_CAN_USE_IOSTREAM overrides the BSTRLIB_CANNOT_USE_IOSTREAM macro.
    
    BSTRLIB_THROWS_EXCEPTIONS
    
      - defining this will enable the exception handling within bstring.
        Defining BSTRLIB_THROWS_EXCEPTIONS overrides the
        BSTRLIB_DOESNT_THROW_EXCEPTIONS macro.
    
    BSTRLIB_DOESNT_THROW_EXCEPTIONS
    
      - defining this will disable the exception handling within bstring.
        Defining BSTRLIB_THROWS_EXCEPTIONS overrides the
        BSTRLIB_DOESNT_THROW_EXCEPTIONS macro.
    
    Note that these macros must be defined consistently throughout all modules
    that use CBStrings including bstrwrap.cpp.
    
    Some older C compilers do not support functions such as vsnprintf.  This is
    handled by the following macro variables:
    
    BSTRLIB_NOVSNP
    
      - defining this indicates that the compiler does not support vsnprintf.
        This will cause bformat and bformata to not be declared.  Note that
        for some compilers, such as Turbo C, this is set automatically.
        Defining BSTRLIB_NOVSNP overrides the BSTRLIB_VSNP_OK macro.
    
    BSTRLIB_VSNP_OK
    
      - defining this will disable the autodetection of compilers that do not
        support vsnprintf.
        Defining BSTRLIB_NOVSNP overrides the BSTRLIB_VSNP_OK macro.
    
    Semantic compilation options
    ----------------------------
    
    Bstrlib comes with very few compilation options for changing the semantics
    of the library.  These are described below.
    
    BSTRLIB_DONT_ASSUME_NAMESPACE
    
      - Defining this before including bstrwrap.h will disable the automatic
        enabling of the Bstrlib namespace for the C++ declarations.
    
    BSTRLIB_DONT_USE_VIRTUAL_DESTRUCTOR
    
      - Defining this will make the CBString destructor non-virtual.
    
    BSTRLIB_MEMORY_DEBUG
    
      - Defining this will cause the bstrlib modules bstrlib.c and bstrwrap.cpp
        to invoke a #include "memdbg.h".  memdbg.h has to be supplied by the user.
    
    Note that these macros must be defined consistently throughout all modules
    that use bstrings or CBStrings including bstrlib.c, bstraux.c and
    bstrwrap.cpp.
    
    ===============================================================================
    
    Files
    -----
    
    Core C files (required for C and C++):
    bstrlib.c       - C implementation of bstring functions.
    bstrlib.h       - C header file for bstring functions.
    
    Core C++ files (required for C++):
    bstrwrap.cpp    - C++ implementation of CBString.
    bstrwrap.h      - C++ header file for CBString.
    
    Base Unicode support:
    utf8util.c      - C implementation of generic utf8 parsing functions.
    utf8util.h      - C header file for generic utf8 parsing functions.
    buniutil.c      - C implementation of utf8 bstring packing and unpacking functions.
    buniutil.h      - C header file for utf8 bstring functions.
    
    Extra utility functions:
    bstraux.c       - C example that implements trivial additional functions.
    bstraux.h       - C header for bstraux.c
    
    Miscellaneous:
    bstest.c        - C unit/regression test for bstrlib.c
    test.cpp        - C++ unit/regression test for bstrwrap.cpp
    bsafe.c         - C runtime stubs to abort usage of unsafe C functions.
    bsafe.h         - C header file for bsafe.c functions.
    
    C modules need only include bstrlib.h and compile/link bstrlib.c to use the
    basic bstring library.  C++ projects need to additionally include bstrwrap.h
    and compile/link bstrwrap.cpp.  For both, there may be a need to make choices
    about feature configuration as described in the "Configurable compilation
    options" section above.
    
    Other files that are included in this archive are:
    
    license.txt     - The BSD license for Bstrlib
    gpl.txt         - The GPL version 2
    security.txt    - A security statement useful for auditing Bstrlib
    porting.txt     - A guide to porting Bstrlib
    bstrlib.txt     - This file
    
    ===============================================================================
    
    The functions
    -------------
    
        extern bstring bfromcstr (const char * str);
    
        Take a standard C library style '\0' terminated char buffer and generate
        a bstring with the same contents as the char buffer.  If an error occurs
        NULL is returned.
    
        So for example:
    
        bstring b = bfromcstr ("Hello");
        if (!b) {
            fprintf (stderr, "Out of memory");
        } else {
            puts ((char *) b->data);
        }
    
        ..........................................................................
    
        extern bstring bfromcstralloc (int mlen, const char * str);
    
        Create a bstring which contains the contents of the '\0' terminated
        char * buffer str.  The memory buffer backing the bstring is at least
        mlen characters in length.  The buffer is also at least size required
        to hold the string with the '\0' terminator.  If an error occurs NULL
        is returned.
    
        So for example:
    
        bstring b = bfromcstralloc (64, someCstr);
        if (b) b->data[63] = 'x';
    
        The idea is that this will set the 64th character of b to 'x' if it is at
        least 64 characters long otherwise do nothing.  And we know this is well
        defined so long as b was successfully created, since it will have been
        allocated with at least 64 characters.
    
        ..........................................................................
    
        extern bstring bfromcstrrangealloc (int minl, int maxl, const char* str);
    
        Create a bstring which contains the contents of the '\0' terminated
        char * buffer str.  The memory buffer backing the string is at least
        minl characters in length, but an attempt is made to allocate up to
        maxl characters.  The buffer is also at least size required to hold
        the string with the '\0' terminator.  If an error occurs NULL is
        returned.
    
        So for example:
    
        bstring b = bfromcstrrangealloc (0, 128, "Hello.");
        if (b) b->data[5] = '!';
    
        The idea is that this will set the 6th character of b to '!' if it was
        allocated otherwise do nothing.  And we know this is well defined so
        long as b was successfully created, since it will have been allocated
        with at least 7 (strlen("Hello.") + 1) characters.
    
        ..........................................................................
    
        extern bstring blk2bstr (const void * blk, int len);
    
        Create a bstring whose contents are described by the contiguous buffer
        pointed to by blk with a length of len bytes.  Note that this function
        creates a copy of the data in blk, rather than simply referencing it.
        Compare with the blk2tbstr macro.  If an error occurs NULL is returned.
    
        ..........................................................................
    
        extern char * bstr2cstr (const_bstring s, char z);
    
        Create a '\0' terminated char buffer which contains the contents of the
        bstring s, except that any contained '\0' characters are converted to the
        character in z.  This returned value should be freed with bcstrfree(), by
        the caller.  If an error occurs NULL is returned.
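
        As a rough illustration of the '\0' substitution described above, here
        is a plain C sketch that works on a raw (pointer, length) block.  The
        name blk_to_cstr_model is hypothetical and is not part of Bstrlib; the
        real bstr2cstr takes a bstring, and its result is released with
        bcstrfree ().

        ```c
        #include <assert.h>
        #include <stdlib.h>

        /* Hypothetical model, not Bstrlib code: copy len bytes from data into
           a freshly malloc'ed, '\0' terminated buffer, replacing each embedded
           '\0' with z.  Returns NULL on a negative length or allocation
           failure.  This sketch asserts on a NULL data pointer. */
        char * blk_to_cstr_model (const unsigned char * data, int len, char z) {
            char * r;
            int i;

            assert (data != NULL);
            if (len < 0) return NULL;
            r = (char *) malloc ((size_t) len + 1);
            if (r == NULL) return NULL;
            for (i = 0; i < len; i++) {
                r[i] = (data[i] == '\0') ? z : (char) data[i];
            }
            r[len] = '\0';
            return r;
        }
        ```

        With the 3 byte block { 'a', '\0', 'b' } and z = '?', the model yields
        the C string "a?b", which the caller of the model releases with free ().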
    
        ..........................................................................
    
        extern int bcstrfree (char * s);
    
        Frees a C-string generated by bstr2cstr ().  This is normally unnecessary
        since it just wraps a call to free (), however, if malloc () and free ()
        have been redefined as macros within the bstrlib module (via macros in
        the memdbg.h backdoor) with some difference in behaviour from the std
        library functions, then this allows a correct way of freeing the memory
        that allows higher level code to be independent from these macro
        redefinitions.
    
        ..........................................................................
    
        extern bstring bstrcpy (const_bstring b1);
    
        Make a copy of the passed in bstring.  The copied bstring is returned if
        there is no error, otherwise NULL is returned.
    
        ..........................................................................
    
        extern int bassign (bstring a, const_bstring b);
    
        Overwrite the bstring a with the contents of bstring b.  Note that the
        bstring a must be a well defined and writable bstring.  If an error
        occurs BSTR_ERR is returned and a is not overwritten.
    
        ..........................................................................
    
        extern int bassigncstr (bstring a, const char * str);
    
        Overwrite the string a with the contents of char * string str.  Note that
        the bstring a must be a well defined and writable bstring.  If an error
        occurs BSTR_ERR is returned and a may be partially overwritten.
    
        ..........................................................................
    
        extern int bassignblk (bstring a, const void * s, int len);
    
        Overwrite the string a with the contents of the block (s, len).  Note that
        the bstring a must be a well defined and writable bstring.  If an error
        occurs BSTR_ERR is returned and a is not overwritten.
    
        ..........................................................................
    
        extern int bassignmidstr (bstring a, const_bstring b, int left, int len);
    
        Overwrite the bstring a with the middle of contents of bstring b
        starting from position left and running for a length len.  left and
        len are clamped to the ends of b as with the function bmidstr.  Note that
        the bstring a must be a well defined and writable bstring.  If an error
        occurs BSTR_ERR is returned and a is not overwritten.
    
        ..........................................................................
    
        extern bstring bmidstr (const_bstring b, int left, int len);
    
        Create a bstring which is the substring of b starting from position left
        and running for a length len (clamped by the end of the bstring b.)  If
        there was no error, the value of this constructed bstring is returned
        otherwise NULL is returned.
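
        The clamping of left and len can be pictured with the following plain
        C sketch.  clamp_range_model is a hypothetical helper, not a Bstrlib
        function, and its treatment of a negative left (shortening len by the
        overhang) is an assumption of this sketch rather than something stated
        above.

        ```c
        #include <assert.h>

        /* Hypothetical model, not Bstrlib code: clamp the request (left, len)
           against a string of length slen.  On return *left is the start and
           *len the number of characters that can really be taken. */
        void clamp_range_model (int slen, int * left, int * len) {
            assert (left != NULL && len != NULL && slen >= 0);
            if (*left < 0) {            /* ASSUMPTION: overhang shortens len */
                *len += *left;
                *left = 0;
            }
            if (*left > slen) *left = slen;                /* starts past end */
            if (*len > slen - *left) *len = slen - *left;  /* runs past end */
            if (*len < 0) *len = 0;                        /* nothing left */
            assert (*left + *len <= slen);
        }
        ```

        For a string of length 5, the request (3, 10) is clamped to (3, 2) and
        the request (-2, 4) is clamped to (0, 2) under this model.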
    
        ..........................................................................
    
        extern int bdelete (bstring s1, int pos, int len);
    
        Removes characters from pos to pos+len-1 and shifts the tail of the
        bstring starting from pos+len to pos.  len must be positive for this call
        to have any effect.  The section of the bstring described by (pos, len)
        is clamped to boundaries of the bstring s1.  The value BSTR_OK is returned
        if the operation is successful, otherwise BSTR_ERR is returned.
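
        The tail shift can be sketched in plain C on an ordinary '\0'
        terminated buffer.  delete_model is a hypothetical helper, not a
        Bstrlib function; unlike bdelete it returns the new length instead of
        BSTR_OK, and its handling of a negative pos is an assumption.

        ```c
        #include <assert.h>
        #include <string.h>

        /* Hypothetical model, not Bstrlib code: delete len characters starting
           at pos from the '\0' terminated buffer s of length slen.  Returns
           the new length, or -1 if slen is negative. */
        int delete_model (char * s, int slen, int pos, int len) {
            assert (s != NULL);
            if (slen < 0) return -1;
            if (pos < 0) { len += pos; pos = 0; }   /* ASSUMPTION: clamp left */
            if (len <= 0 || pos >= slen) return slen;    /* nothing to delete */
            if (len > slen - pos) len = slen - pos;      /* clamp right edge */
            /* shift the tail (and its '\0') from pos+len down to pos */
            memmove (s + pos, s + pos + len, (size_t) (slen - pos - len) + 1);
            assert (s[slen - len] == '\0');
            return slen - len;
        }
        ```

        Deleting (5, 100) from "Hello, world" leaves "Hello" under this model,
        because the section is clamped to the end of the string.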
    
        ..........................................................................
    
        extern int bconcat (bstring b0, const_bstring b1);
    
        Concatenate the bstring b1 to the end of bstring b0.  The value BSTR_OK
        is returned if the operation is successful, otherwise BSTR_ERR is
        returned.
    
        ..........................................................................
    
        extern int bconchar (bstring b, char c);
    
        Concatenate the character c to the end of bstring b.  The value BSTR_OK
        is returned if the operation is successful, otherwise BSTR_ERR is
        returned.
    
        ..........................................................................
    
        extern int bcatcstr (bstring b, const char * s);
    
        Concatenate the char * string s to the end of bstring b.  The value
        BSTR_OK is returned if the operation is successful, otherwise BSTR_ERR is
        returned.
    
        ..........................................................................
    
        extern int bcatblk (bstring b, const void * s, int len);
    
        Concatenate a fixed length buffer (s, len) to the end of bstring b.  The
        value BSTR_OK is returned if the operation is successful, otherwise
        BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int biseq (const_bstring b0, const_bstring b1);
    
        Compare the bstring b0 and b1 for equality.  If the bstrings differ, 0
        is returned, if the bstrings are the same, 1 is returned, if there is an
        error, -1 is returned.  If the lengths of the bstrings differ, this
        function has O(1) complexity.  Contained '\0' characters are not treated
        as a termination character.
    
        Note that the semantics of biseq are not completely compatible with
        bstrcmp because of its different treatment of the '\0' character.
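
        The difference in '\0' treatment can be seen with the standard C
        library alone.  The two helpers below are hypothetical and only mimic
        the two comparison styles on raw (pointer, length) blocks; they are
        not the Bstrlib implementations.

        ```c
        #include <assert.h>
        #include <string.h>

        /* Length-aware equality: an embedded '\0' is ordinary data. */
        int eq_by_length (const char * a, int alen, const char * b, int blen) {
            assert (a != NULL && b != NULL && alen >= 0 && blen >= 0);
            if (alen != blen) return 0;       /* O(1) rejection on length */
            return memcmp (a, b, (size_t) alen) == 0;
        }

        /* strcmp-style equality: comparison stops at the first '\0'. */
        int eq_until_nul (const char * a, const char * b) {
            assert (a != NULL && b != NULL);
            return strcmp (a, b) == 0;
        }
        ```

        For the 5 byte blocks "ab\0cd" and "ab\0ce", eq_by_length reports a
        difference while eq_until_nul reports equality, since the latter never
        looks past the embedded '\0'.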
    
        ..........................................................................
    
        extern int bisstemeqblk (const_bstring b, const void * blk, int len);

        Compare beginning of bstring b with a block of memory of length len for
        equality.  If the beginning of b differs from the memory block (or if b
        is too short), 0 is returned, if they are the same, 1 is returned, if
        there is an error, -1 is returned.
    
        ..........................................................................
    
        extern int biseqcaseless (const_bstring b0, const_bstring b1);
    
        Compare two bstrings for equality without differentiating between case.
        If the bstrings differ other than in case, 0 is returned, if the bstrings
        are the same, 1 is returned, if there is an error, -1 is returned.  If
        the lengths of the bstrings differ, this function is O(1).  '\0'
        termination characters are not treated in any special way.
    
        ..........................................................................
    
        extern int biseqcaselessblk (const_bstring b, const void * blk, int len);
    
        Compare content of b and the array of bytes in blk for length len for
        equality without differentiating between character case.  If the content
        differs other than in case, 0 is returned, if, ignoring case, the content
        is the same, 1 is returned, if there is an error, -1 is returned.  If the
        lengths of the strings differ, this function is O(1).  '\0'
        termination characters are not treated in any special way.
    
        ..........................................................................
    
        extern int bisstemeqcaselessblk (const_bstring b0, const void * blk, int len);
    
        Compare beginning of bstring b0 with a block of memory of length len
        without differentiating between case for equality.  If the beginning of b0
        differs from the memory block other than in case (or if b0 is too short),
        0 is returned, if the bstrings are the same, 1 is returned, if there is an
        error, -1 is returned.
    
        ..........................................................................
    
        extern int biseqblk (const_bstring b, const void * blk, int len);
    
        Compare the string b with the character block blk of length len.  If the
        content differs, 0 is returned, if the content is the same, 1 is returned,
        if there is an error, -1 is returned.  If the lengths of the strings
        differ, this function is O(1).  '\0' characters are not treated in
        any special way.
    
        ..........................................................................
    
        extern int biseqcstr (const_bstring b, const char *s);
    
        Compare the bstring b and char * string s.  The C string s must be '\0'
        terminated at exactly the length of the bstring b, and the contents
        between the two must be identical with the bstring b with no '\0'
        characters for the two contents to be considered equal.  This is
        equivalent to the condition that their current contents will always be
        equal when comparing them in the same format after converting one or the
        other.  If they are equal 1 is returned, if they are unequal 0 is
        returned and if there is a detectable error BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int biseqcstrcaseless (const_bstring b, const char *s);
    
        Compare the bstring b and char * string s.  The C string s must be '\0'
        terminated at exactly the length of the bstring b, and the contents
        between the two must be identical except for case with the bstring b with
        no '\0' characters for the two contents to be considered equal.  This is
        equivalent to the condition that their current contents will always be
        equal ignoring case when comparing them in the same format after
        converting one or the other.  If they are equal, except for case, 1 is
        returned, if they are unequal regardless of case 0 is returned and if
        there is a detectable error BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int bstrcmp (const_bstring b0, const_bstring b1);
    
        Compare the bstrings b0 and b1 for ordering.  If there is an error,
        SHRT_MIN is returned, otherwise a value less than or greater than zero,
        indicating that the bstring pointed to by b0 is lexicographically less
        than or greater than the bstring pointed to by b1 is returned.  If the
        bstring lengths are unequal but the characters up until the length of the
        shorter are equal then a value less than, or greater than zero,
        indicating that the bstring pointed to by b0 is shorter or longer than the
        bstring pointed to by b1 is returned.  0 is returned if and only if the
        two bstrings are the same.  If the lengths of the bstrings differ,
        this function is O(n).  Like its standard C library counterpart, the
        comparison does not proceed past any '\0' termination characters
        encountered.
    
        The seemingly odd error return value merely provides slightly more
        granularity than the undefined situation given in the C library function
        strcmp.  The function otherwise behaves very much like strcmp().
    
        Note that the semantics of bstrcmp are not completely compatible with
        biseq because of its different treatment of the '\0' termination
        character.
    
        ..........................................................................
    
        extern int bstrncmp (const_bstring b0, const_bstring b1, int n);
    
        Compare the bstrings b0 and b1 for ordering for at most n characters.  If
        there is an error, SHRT_MIN is returned, otherwise a value is returned as
        if b0 and b1 were first truncated to at most n characters then bstrcmp
        was called with these new bstrings as parameters.  If the lengths of the
        bstrings differ, this function is O(n).  Like its standard C
        library counterpart, the comparison does not proceed past any '\0'
        termination characters encountered.
    
        The seemingly odd error return value merely provides slightly more
        granularity than the undefined situation given in the C library function
        strncmp.  The function otherwise behaves very much like strncmp().
    
        ..........................................................................
    
        extern int bstricmp (const_bstring b0, const_bstring b1);
    
        Compare two bstrings without differentiating between case.  The return
        value is the difference of the values of the characters where the two
        bstrings first differ, otherwise 0 is returned indicating that the
        bstrings are equal.  If the lengths are different, then a difference from
        0 is given, but if the first extra character is '\0', then it is taken to
        be the value UCHAR_MAX+1.
    
        ..........................................................................
    
        extern int bstrnicmp (const_bstring b0, const_bstring b1, int n);
    
        Compare two bstrings without differentiating between case for at most n
        characters.  If the position where the two bstrings first differ is
        before the nth position, the return value is the difference of the values
        of the characters, otherwise 0 is returned.  If the lengths are different
        and less than n characters, then a difference from 0 is given, but if the
        first extra character is '\0', then it is taken to be the value
        UCHAR_MAX+1.
    
        ..........................................................................
    
        extern int bdestroy (bstring b);
    
        Deallocate the bstring passed.  Passing NULL in as a parameter will have
        no effect.  Note that both the header and the data portion of the bstring
        will be freed.  No other bstring function which modifies one of its
        parameters will free or reallocate the header.  Because of this, in
        general, bdestroy cannot be called on any declared struct tagbstring even
        if it is not write protected.  A bstring which is write protected cannot
        be destroyed via the bdestroy call.  Any attempt to do so will result in
        no action taken, and BSTR_ERR will be returned.
    
        Note to C++ users: Passing in a CBString cast to a bstring will lead to
        undefined behavior (free will be called on the header, rather than the
        CBString destructor.)  Instead just use the ordinary C++ language
        facilities to dealloc a CBString.
    
        ..........................................................................
    
        extern int binstr (const_bstring s1, int pos, const_bstring s2);
    
        Search for the bstring s2 in s1 starting at position pos and looking in a
        forward (increasing) direction.  If it is found then it returns with the
        first position after pos where it is found, otherwise it returns BSTR_ERR.
        The algorithm used is brute force; O(m*n).
    
        ..........................................................................
    
        extern int binstrr (const_bstring s1, int pos, const_bstring s2);
    
        Search for the bstring s2 in s1 starting at position pos and looking in a
        backward (decreasing) direction.  If it is found then it returns with the
        first position at or before pos where it is found, otherwise it returns
        BSTR_ERR.  Note that the current position at pos is tested as well -- so
        to be disjoint from a previous forward search it is recommended that the
        position be backed up (decremented) by one position.  The algorithm used
        is brute force; O(m*n).
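
        The advice about backing up by one position can be seen in a plain C
        sketch of a brute force backward search.  search_backward_model is a
        hypothetical helper working on raw (pointer, length) blocks; it is a
        model of the behaviour described above, not the Bstrlib code, and it
        returns -1 where binstrr would return BSTR_ERR.

        ```c
        #include <assert.h>
        #include <string.h>

        /* Hypothetical model, not Bstrlib code: return the largest index
           i <= pos at which pat (length plen) occurs in s (length slen), or
           -1 if there is none.  The position pos itself is tested. */
        int search_backward_model (const char * s, int slen, int pos,
                                   const char * pat, int plen) {
            int i;

            assert (s != NULL && pat != NULL);
            if (slen < 0 || plen < 0 || pos < 0 || pos > slen) return -1;
            /* start at pos, or at the last start where pat still fits */
            i = (pos > slen - plen) ? slen - plen : pos;
            for (; i >= 0; i--) {
                if (memcmp (s + i, pat, (size_t) plen) == 0) return i;
            }
            return -1;
        }
        ```

        Searching "abcabc" for "abc" from position 5 gives 3.  Searching again
        from 3 gives 3 once more, because position 3 is itself tested, while
        searching from 3 - 1 = 2 moves on to the earlier match at 0.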
    
        ..........................................................................
    
        extern int binstrcaseless (const_bstring s1, int pos, const_bstring s2);
    
        Search for the bstring s2 in s1 starting at position pos and looking in a
        forward (increasing) direction but without regard to case.  If it is
        found then it returns with the first position after pos where it is
        found, otherwise it returns BSTR_ERR. The algorithm used is brute force;
        O(m*n).
    
        ..........................................................................
    
        extern int binstrrcaseless (const_bstring s1, int pos, const_bstring s2);
    
        Search for the bstring s2 in s1 starting at position pos and looking in a
        backward (decreasing) direction but without regard to case.  If it is
        found then it returns with the first position at or before pos where it
        is found, otherwise it returns BSTR_ERR.  Note that the current position
        at pos is tested as well -- so to be disjoint from a previous forward
        search it is recommended that the position be backed up (decremented) by
        one position.  The algorithm used is brute force; O(m*n).
    
        ..........................................................................
    
        extern int binchr (const_bstring b0, int pos, const_bstring b1);
    
        Search for the first position in b0 starting from pos or after, in which
        one of the characters in b1 is found.  This function has an execution
        time of O(b0->slen + b1->slen).  If such a position does not exist in b0,
        then BSTR_ERR is returned.
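
        One way to reach a running time of O(b0->slen + b1->slen) is to build
        a membership table over all byte values first and then scan once.  The
        plain C sketch below shows that idea on raw (pointer, length) blocks;
        first_in_set_model is a hypothetical helper, and whether Bstrlib uses
        exactly this technique is not stated here.

        ```c
        #include <assert.h>
        #include <limits.h>

        /* Hypothetical model, not Bstrlib code: return the first index
           i >= pos in s (length slen) whose byte also occurs in set (length
           setlen), or -1 if there is none. */
        int first_in_set_model (const unsigned char * s, int slen, int pos,
                                const unsigned char * set, int setlen) {
            unsigned char member[UCHAR_MAX + 1] = { 0 };
            int i;

            assert (s != NULL && set != NULL);
            if (slen < 0 || setlen < 0 || pos < 0) return -1;
            for (i = 0; i < setlen; i++) member[set[i]] = 1;  /* O(setlen) */
            for (i = pos; i < slen; i++) {                    /* O(slen) */
                if (member[s[i]]) return i;
            }
            return -1;
        }
        ```

        Scanning "key = value" for any of the characters "=:" from position 0
        gives 4; starting from position 5 there is no further match, so the
        model returns -1.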
    
        ..........................................................................
    
        extern int binchrr (const_bstring b0, int pos, const_bstring b1);
    
        Search for the last position in b0 no greater than pos, in which one of
        the characters in b1 is found.  This function has an execution time
        of O(b0->slen + b1->slen).  If such a position does not exist in b0,
        then BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int bninchr (const_bstring b0, int pos, const_bstring b1);
    
        Search for the first position in b0 starting from pos or after, in which
        none of the characters in b1 is found and return it.  This function has
        an execution time of O(b0->slen + b1->slen).  If such a position does
        not exist in b0, then BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int bninchrr (const_bstring b0, int pos, const_bstring b1);
    
        Search for the last position in b0 no greater than pos, in which none of
        the characters in b1 is found and return it.  This function has an
        execution time of O(b0->slen + b1->slen).  If such a position does not
        exist in b0, then BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int bstrchr (const_bstring b, int c);
    
        Search for the character c in the bstring b forwards from the start of
        the bstring.  Returns the position of the found character or BSTR_ERR if
        it is not found.
    
        NOTE: This has been implemented as a macro on top of bstrchrp ().
    
        ..........................................................................
    
        extern int bstrrchr (const_bstring b, int c);
    
        Search for the character c in the bstring b backwards from the end of the
        bstring.  Returns the position of the found character or BSTR_ERR if it is
        not found.
    
        NOTE: This has been implemented as a macro on top of bstrrchrp ().
    
        ..........................................................................
    
        extern int bstrchrp (const_bstring b, int c, int pos);
    
        Search for the character c in b forwards from the position pos
        (inclusive).  Returns the position of the found character or BSTR_ERR if
        it is not found.
    
        ..........................................................................
    
        extern int bstrrchrp (const_bstring b, int c, int pos);
    
        Search for the character c in b backwards from the position pos in bstring
        (inclusive).  Returns the position of the found character or BSTR_ERR if
        it is not found.
    
        ..........................................................................
    
        extern int bsetstr (bstring b0, int pos, const_bstring b1, unsigned char fill);
    
        Overwrite the bstring b0 starting at position pos with the bstring b1. If
        the position pos is past the end of b0, then the character "fill" is
        appended as necessary to make up the gap between the end of b0 and pos.
        If b1 is NULL, it behaves as if it were a 0-length bstring. The value
        BSTR_OK is returned if the operation is successful, otherwise BSTR_ERR is
        returned.
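
        The role of the fill character can be sketched in plain C on a fixed
        size buffer.  setstr_model is a hypothetical helper, not a Bstrlib
        function: it takes an explicit capacity instead of growing the buffer,
        and it returns the new length (or -1) rather than BSTR_OK or BSTR_ERR.

        ```c
        #include <assert.h>
        #include <string.h>

        /* Hypothetical model, not Bstrlib code: overwrite the '\0' terminated
           buffer dst (length dlen, capacity cap bytes) at pos with src (length
           slen), padding any gap between dlen and pos with fill.  A NULL src
           acts as an empty string.  Returns the new length, or -1 if the
           parameters are bad or the result would not fit. */
        int setstr_model (char * dst, int dlen, int cap, int pos,
                          const char * src, int slen, unsigned char fill) {
            int newlen;

            assert (dst != NULL);
            if (src == NULL) slen = 0;
            if (dlen < 0 || pos < 0 || slen < 0) return -1;
            if (cap <= 0 || dlen >= cap) return -1;
            if (pos > cap - 1 || slen > cap - 1 - pos) return -1;
            if (pos > dlen) memset (dst + dlen, fill, (size_t) (pos - dlen));
            if (slen > 0) memmove (dst + pos, src, (size_t) slen);
            newlen = (pos + slen > dlen) ? pos + slen : dlen;
            dst[newlen] = '\0';
            return newlen;
        }
        ```

        Starting from "abc" in a 16 byte buffer, writing "XY" at position 5
        with fill '.' produces "abc..XY" under this model.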
    
        ..........................................................................
    
        extern int binsert (bstring s1, int pos, const_bstring s2, unsigned char fill);
    
        Inserts the bstring s2 into s1 at position pos.  If the position pos is
        past the end of s1, then the character "fill" is appended as necessary to
        make up the gap between the end of s1 and pos.  The value BSTR_OK is
        returned if the operation is successful, otherwise BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int binsertblk (bstring b, int pos, const void * blk, int len,
                               unsigned char fill);
    
        Inserts the block of characters at blk with length len into b at position
        pos.  If the position pos is past the end of b, then the character "fill"
        is appended as necessary to make up the gap between the end of b and pos.
        Unlike bsetstr, binsertblk does not allow blk to be NULL.
    
        ..........................................................................
    
        extern int binsertch (bstring s1, int pos, int len, unsigned char fill);
    
        Inserts the character fill repeatedly into s1 at position pos for a
        length len.  If the position pos is past the end of s1, then the
        character "fill" is appended as necessary to make up the gap between the
        end of s1 and the position pos + len (exclusive).  The value BSTR_OK is
        returned if the operation is successful, otherwise BSTR_ERR is returned.
    
        ..........................................................................
    
        extern int breplace (bstring b1, int pos, int len, const_bstring b2,
                             unsigned char fill);
    
        Replace a section of a bstring from pos for a length len with the bstring
        b2. If the position pos is past the end of b1 then the character "fill"
        is appended as necessary to make up the gap between the end of b1 and
        pos.
    
        ..........................................................................
    
        extern int bfindreplace (bstring b, const_bstring find,
                                 const_bstring replace, int position);
    
        Replace all occurrences of the find substring with a replace bstring
        after a given position in the bstring b.  The find bstring must have a
        length > 0 otherwise BSTR_ERR is returned.  This function does not
        perform recursive per character replacement; that is to say successive
        searches resume at the position after the last replace.
    
        So for example:
    
            bfindreplace (a0 = bfromcstr("aabaAb"), a1 = bfromcstr("a"),
                          a2 = bfromcstr("aa"), 0);
    
        Should result in changing a0 to "aaaabaaAb".
    
        This function performs exactly (b->slen - position) bstring comparisons,
        and data movement is bounded above by character volume equivalent to size
        of the output bstring.
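
        The "no recursive replacement" rule can be sketched in plain C on
        ordinary '\0' terminated strings.  findreplace_model is a hypothetical
        helper, not a Bstrlib function; it writes into a caller supplied
        buffer instead of modifying its input in place, and it returns the
        output length (or -1) rather than BSTR_OK or BSTR_ERR.

        ```c
        #include <assert.h>
        #include <string.h>

        /* Hypothetical model, not Bstrlib code: copy src to out, replacing
           every occurrence of find located at or after pos with repl.
           Returns the length written, or -1 if find is empty, pos is out of
           range or out (outsz bytes) is too small. */
        int findreplace_model (const char * src, const char * find,
                               const char * repl, int pos,
                               char * out, size_t outsz) {
            size_t slen, flen, rlen, i = 0, o = 0;

            assert (src != NULL && find != NULL && repl != NULL && out != NULL);
            slen = strlen (src); flen = strlen (find); rlen = strlen (repl);
            if (outsz == 0 || flen == 0 || pos < 0 || (size_t) pos > slen)
                return -1;
            while (i < slen) {
                if (i >= (size_t) pos && flen <= slen - i &&
                    memcmp (src + i, find, flen) == 0) {
                    if (rlen >= outsz - o) return -1;   /* keep room for '\0' */
                    memcpy (out + o, repl, rlen);
                    o += rlen;
                    i += flen;    /* resume after the match; repl is not rescanned */
                } else {
                    if (outsz - o <= 1) return -1;      /* keep room for '\0' */
                    out[o++] = src[i++];
                }
            }
            out[o] = '\0';
            return (int) o;
        }
        ```

        Applied to "aabaAb" with find "a" and replace "aa", the model produces
        "aaaabaaAb", matching the example above: each inserted "aa" is skipped
        rather than searched again.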
    
        ..........................................................................
    
        extern int bfindreplacecaseless (bstring b, const_bstring find,
                                 const_bstring replace, int position);
    
        Replace all occurrences of the find substring, ignoring case, with a
        replace bstring after a given position in the bstring b.  The find bstring
        must have a length > 0 otherwise BSTR_ERR is returned.  This function
        does not perform recursive per character replacement; that is to say
        successive searches resume at the position after the last replace.
    
        So for example:
    
            bfindreplacecaseless (a0 = bfromcstr("AAbaAb"), a1 = bfromcstr("a"),
                                  a2 = bfromcstr("aa"), 0);
    
        Should result in changing a0 to "aaaabaaaab".
    
        This function performs exactly (b->slen - position) bstring comparisons,
        and data movement is bounded above by character volume equivalent to size
        of the output bstring.
    
        ..........................................................................
    
        extern int balloc (bstring b, int length);
    
        Increase the allocated memory backing the data buffer for the bstring b
        to a length of at least length.  If the memory backing the bstring b is
        already large enough, no action is performed.  This has no effect on the
        bstring b that is visible to the bstring API.  Usually this function will
        only be used when a minimum buffer size is required coupled with a direct
        access to the ->data member of the bstring structure.
    
        Be warned that like any other bstring function, the bstring must be well
        defined upon entry to this function.  I.e., doing something like:
    
            b->slen *= 2; /* ?? Most likely incorrect */
            balloc (b, b->slen);
    
        is invalid, and should be implemented as:
    
            int t;
            if (BSTR_OK == balloc (b, t = (b->slen * 2))) b->slen = t;
    
        This function will return with BSTR_ERR if b is not detected as a valid
        bstring or length is not greater than 0, otherwise BSTR_OK is returned.
    
        ..........................................................................
    
        extern int ballocmin (bstring b, int length);
    
        Change the amount of memory backing the bstring b to at least length.
        This operation will never truncate the bstring data including the
        extra terminating '\0' and thus will not decrease the length to less than
        b->slen + 1.  Note that repeated use of this function may cause
        performance problems (realloc may be called on the bstring more than
        the O(log(INT_MAX)) times).  This function will return with BSTR_ERR if b
        is not detected as a valid bstring or length is not greater than 0,
        otherwise BSTR_OK is returned.
    
        So for example:
    
        if (BSTR_OK == ballocmin (b, 64)) b->data[63] = 'x';
    
        The idea is that this will set the 64th character of b to 'x' if it is at
        least 64 characters long otherwise do nothing.  And we know this is well
        defined so long as the ballocmin call was successful, since it will
        ensure that b has been allocated with at least 64 characters.
    
        ..........................................................................
    
        extern int btrunc (bstring b, int n);
    
        Truncate the bstring to at most n characters.  This function will return
        with BSTR_ERR if b is not detected as a valid bstring or n is less than
        0, otherwise BSTR_OK is returned.
    
        ..........................................................................
    
        extern int bpattern (bstring b, int len);
    
        Replicate the starting bstring, b, end to end repeatedly until it
        surpasses len characters, then chop the result to exactly len characters.
        This function operates in-place.  This function will return with BSTR_ERR
        if b is NULL or of length 0, otherwise BSTR_OK is returned.
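
        The replicate-then-chop behavior can be illustrated without Bstrlib.  The
        following is a minimal sketch in plain C that applies the same idea to an
        ordinary char buffer; pattern_fill is a hypothetical helper written for
        this illustration and is not part of the library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Bstrlib): repeat the first patlen bytes
   of buf end to end until exactly len bytes are filled, then '\0'
   terminate.  buf must have room for len + 1 bytes.  Returns 0 on success
   and -1 on bad parameters, mirroring the failure on an empty pattern. */
int pattern_fill (char * buf, size_t patlen, size_t len) {
    size_t i;
    if (buf == NULL || patlen == 0) return -1;
    /* Each new byte is a copy of the byte one pattern length earlier. */
    for (i = patlen; i < len; i++) buf[i] = buf[i - patlen];
    buf[len] = '\0';   /* chops the data when len is shorter than patlen */
    return 0;
}
```

        For a buffer initially holding "abc", pattern_fill (buf, 3, 8) leaves
        "abcabcab" in buf.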
    
        ..........................................................................
    
        extern int btoupper (bstring b);
    
        Convert contents of bstring to upper case.  This function will return with
        BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK is returned.
    
        ..........................................................................
    
        extern int btolower (bstring b);
    
        Convert contents of bstring to lower case.  This function will return with
        BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK is returned.
    
        ..........................................................................
    
        extern int bltrimws (bstring b);
    
        Delete whitespace contiguous from the left end of the bstring.  This
        function will return with BSTR_ERR if b is NULL or of length 0, otherwise
        BSTR_OK is returned.
    
        ..........................................................................
    
        extern int brtrimws (bstring b);
    
        Delete whitespace contiguous from the right end of the bstring.  This
        function will return with BSTR_ERR if b is NULL or of length 0, otherwise
        BSTR_OK is returned.
    
        ..........................................................................
    
        extern int btrimws (bstring b);
    
        Delete whitespace contiguous from both ends of the bstring.  This function
        will return with BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK
        is returned.
    
        ..........................................................................
    
        extern struct bstrList* bstrListCreate (void);
    
        Create an empty struct bstrList. The struct bstrList output structure is
        declared as follows:
    
        struct bstrList {
            int qty, mlen;
            bstring * entry;
        };
    
        The entry field is actually an array with qty entries.  The mlen field
        counts the maximum number of bstrings for which there is memory in the
        entry array.
    
        The Bstrlib API does *NOT* include a comprehensive set of functions for
        full management of struct bstrList in an abstracted way.  The reason for
        this is because aliasing semantics of the list are best left to the user
        of this function, and performance varies wildly depending on the
        assumptions made.  For a complete list-of-bstrings data type it is
        recommended that the C++ std::vector<CBString> be used, since its
        semantics and usage are more standard.
    
        ..........................................................................
    
        extern int bstrListDestroy (struct bstrList * sl);
    
        Destroy a struct bstrList structure that was returned by the bsplit
        function.  Note that this will destroy each bstring in the ->entry array
        as well.  See bstrListCreate() above for structure of struct bstrList.
    
        ..........................................................................
    
        extern int bstrListAlloc (struct bstrList * sl, int msz);
    
        Ensure that there is memory for at least msz number of entries for the
        list.
    
        ..........................................................................
    
        extern int bstrListAllocMin (struct bstrList * sl, int msz);
    
        Try to allocate the minimum amount of memory for the list to include at
        least msz entries or sl->qty whichever is greater.
    
        ..........................................................................
    
        extern struct bstrList * bsplit (bstring str, unsigned char splitChar);
    
        Create an array of sequential substrings from str divided by the
        character splitChar.  Successive occurrences of the splitChar will be
        divided by empty bstring entries, following the semantics from the Python
        programming language.  To reclaim the memory from this output structure,
        bstrListDestroy () should be called.  See bstrListCreate() above for
        structure of struct bstrList.
    
        ..........................................................................
    
        extern struct bstrList * bsplits (bstring str, const_bstring splitStr);
    
        Create an array of sequential substrings from str divided by any
        character contained in splitStr.  An empty splitStr causes a single entry
        bstrList containing a copy of str to be returned.  See bstrListCreate()
        above for structure of struct bstrList.
    
        ..........................................................................
    
        extern struct bstrList * bsplitstr (bstring str, const_bstring splitStr);
    
        Create an array of sequential substrings from str divided by the entire
        substring splitStr.  An empty splitStr causes a single entry bstrList
        containing a copy of str to be returned.  See bstrListCreate() above for
        structure of struct bstrList.
    
        ..........................................................................
    
        extern bstring bjoin (const struct bstrList * bl, const_bstring sep);
    
        Join the entries of a bstrList into one bstring by sequentially
        concatenating them with the sep bstring in between.  If sep is NULL, it
        is treated as if it were the empty bstring.  Note that:
    
            bjoin (l = bsplit (b, s->data[0]), s);
    
        should result in a copy of b, if s->slen is 1.  If there is an error NULL
        is returned, otherwise a bstring with the correct result is returned.
        See bstrListCreate() above for structure of struct bstrList.
    
        ..........................................................................
    
        bstring bjoinblk (const struct bstrList * bl, void * blk, int len);
    
        Join the entries of a bstrList into one bstring by sequentially
        concatenating them with the content from blk for length len in between.
        If there is an error NULL is returned, otherwise a bstring with the
        correct result is returned.
    
        ..........................................................................
    
        extern int bsplitcb (const_bstring str, unsigned char splitChar, int pos,
        int (* cb) (void * parm, int ofs, int len), void * parm);
    
        Iterate the set of disjoint sequential substrings over str starting at
        position pos divided by the character splitChar.  The parm passed to
        bsplitcb is passed on to cb.  If the function cb returns a value < 0,
        then further iterating is halted and this value is returned by bsplitcb.
    
        Note: Non-destructive modification of str from within the cb function
        while performing this split is well defined.  bsplitcb behaves in
        sequential lock step with calls to cb.  I.e., after returning from a cb
        that returns a non-negative integer, bsplitcb continues from the position
        1 character after the last detected split character and it will halt
        immediately if the length of str falls below this point.  However, if the
        cb function destroys str, then it *must* return with a negative value,
        otherwise bsplitcb will continue in an undefined manner.
    
        This function is provided as an incremental alternative to bsplit that is
        abortable and which does not impose additional memory allocation.
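
        The callback contract can be sketched in plain C.  In the example below
        count_cb has the callback shape documented above, while struct
        field_count and split_sketch are hypothetical names; split_sketch is only
        a stand-in driver that imitates the documented iteration over a plain
        buffer so that the sketch is self-contained, and it is not Bstrlib's
        implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Callback shape documented for bsplitcb: ofs and len describe one field. */
typedef int (* split_cb) (void * parm, int ofs, int len);

/* Hypothetical context: count fields, aborting once limit is reached. */
struct field_count { int seen, limit; };

int count_cb (void * parm, int ofs, int len) {
    struct field_count * fc = (struct field_count *) parm;
    (void) ofs; (void) len;                 /* positions unused here */
    if (fc->seen >= fc->limit) return -1;   /* negative: halt iteration */
    fc->seen++;
    return 0;                               /* non-negative: keep going */
}

/* Stand-in driver for illustration only.  Returns 0 when every field was
   visited, or the negative value with which cb aborted the iteration. */
int split_sketch (const char * s, int slen, char splitChar, int pos,
                  split_cb cb, void * parm) {
    int i, ret;
    if (s == NULL || cb == NULL || pos < 0 || pos > slen) return -1;
    do {
        for (i = pos; i < slen && s[i] != splitChar; i++) ;
        if ((ret = cb (parm, pos, i - pos)) < 0) return ret;
        pos = i + 1;    /* resume one character after the split character */
    } while (pos <= slen);
    return 0;
}
```

        Over the four characters "a,,b" split on ',' the callback is invoked
        three times, the second time with len equal to 0 for the empty field
        between the two commas.  With Bstrlib itself the same callback would be
        passed as bsplitcb (str, ',', 0, count_cb, &fc).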
    
        ..........................................................................
    
        extern int bsplitscb (const_bstring str, const_bstring splitStr, int pos,
        int (* cb) (void * parm, int ofs, int len), void * parm);
    
        Iterate the set of disjoint sequential substrings over str starting at
        position pos divided by any of the characters in splitStr.  An empty
        splitStr causes the whole str to be iterated once.  The parm passed to
        bsplitscb is passed on to cb.  If the function cb returns a value < 0,
        then further iterating is halted and this value is returned by bsplitscb.
    
        Note: Non-destructive modification of str from within the cb function
        while performing this split is well defined.  bsplitscb behaves in
        sequential lock step with calls to cb.  I.e., after returning from a cb
        that returns a non-negative integer, bsplitscb continues from the position
        1 character after the last detected split character and it will halt
        immediately if the length of str falls below this point.  However, if the
        cb function destroys str, then it *must* return with a negative value,
        otherwise bsplitscb will continue in an undefined manner.
    
        This function is provided as an incremental alternative to bsplits that
        is abortable and which does not impose additional memory allocation.
    
        ..........................................................................
    
        extern int bsplitstrcb (const_bstring str, const_bstring splitStr, int pos,
        int (* cb) (void * parm, int ofs, int len), void * parm);
    
        Iterate the set of disjoint sequential substrings over str starting at
        position pos divided by the entire substring splitStr.  An empty splitStr
        causes each character of str to be iterated.  The parm passed to
        bsplitstrcb is passed on to cb.  If the function cb returns a value < 0,
        then further iterating is halted and this value is returned by
        bsplitstrcb.
    
        Note: Non-destructive modification of str from within the cb function
        while performing this split is well defined.  bsplitstrcb behaves in
        sequential lock step with calls to cb.  I.e., after returning from a cb
        that returns a non-negative integer, bsplitstrcb continues from the
        position immediately after the last detected occurrence of splitStr and
        it will halt immediately if the length of str falls below this point.
        However, if the cb function destroys str, then it *must* return with a
        negative value, otherwise bsplitstrcb will continue in an undefined
        manner.
    
        This function is provided as an incremental alternative to bsplitstr that
        is abortable and which does not impose additional memory allocation.
    
        ..........................................................................
    
        extern bstring bformat (const char * fmt, ...);
    
        Takes the same parameters as printf (), but rather than outputting
        results to stdio, it forms a bstring which contains what would have been
        output. Note that if there is an early generation of a '\0' character,
        the bstring will be truncated to this end point.
    
        Note that %s format tokens correspond to '\0' terminated char * buffers,
        not bstrings.  To print a bstring, first dereference the data element of
        the bstring:
    
            /* b1->data needs to be '\0' terminated, so tagbstrings generated
               by blk2tbstr () might not be suitable. */
            b0 = bformat ("Hello, %s", b1->data);
    
        Note that if the BSTRLIB_NOVSNP macro has been set when bstrlib has been
        compiled the bformat function is not present.
    
        ..........................................................................
    
        extern int bformata (bstring b, const char * fmt, ...);
    
        In addition to the initial output buffer b, bformata takes the same
        parameters as printf (), but rather than outputting results to stdio, it
        appends the results to the initial bstring parameter. Note that if
        there is an early generation of a '\0' character, the bstring will be
        truncated to this end point.
    
        Note that %s format tokens correspond to '\0' terminated char * buffers,
        not bstrings.  To print a bstring, first dereference the data element of
        the bstring:
    
            /* b1->data needs to be '\0' terminated, so tagbstrings generated
               by blk2tbstr () might not be suitable. */
            bformata (b0 = bfromcstr ("Hello"), ", %s", b1->data);
    
        Note that if the BSTRLIB_NOVSNP macro has been set when bstrlib has been
        compiled the bformata function is not present.
    
        ..........................................................................
    
        extern int bassignformat (bstring b, const char * fmt, ...);
    
        After the first parameter, it takes the same parameters as printf (), but
        rather than outputting results to stdio, it outputs the results to
        the bstring parameter b. Note that if there is an early generation of a
        '\0' character, the bstring will be truncated to this end point.
    
        Note that %s format tokens correspond to '\0' terminated char * buffers,
        not bstrings.  To print a bstring, first dereference the data element of
        the bstring:
    
            /* b1->data needs to be '\0' terminated, so tagbstrings generated
               by blk2tbstr () might not be suitable. */
            bassignformat (b0 = bfromcstr ("Hello"), ", %s", b1->data);
    
        Note that if the BSTRLIB_NOVSNP macro has been set when bstrlib has been
        compiled the bassignformat function is not present.
    
        ..........................................................................
    
        extern int bvcformata (bstring b, int count, const char * fmt, va_list arglist);
    
        The bvcformata function formats data under control of the format control
        string fmt and attempts to append the result to b.  The fmt parameter is
        the same as that of the printf function.  The variable argument list is
        replaced with arglist, which has been initialized by the va_start macro.
        The size of the output is upper bounded by count.  If the required output
        exceeds count, the string b is not augmented with any contents and a value
        below BSTR_ERR is returned.  If a value below -count is returned then it
        is recommended that the negative of this value be used as an update to the
        count in a subsequent pass.  On other errors, such as running out of
        memory, parameter errors or numeric wrap around BSTR_ERR is returned.
        BSTR_OK is returned when the output is successfully generated and
        appended to b.
    
        Note: There is no sanity checking of arglist, and this function is
        destructive of the contents of b from the b->slen point onward.  If there
        is an early generation of a '\0' character, the bstring will be truncated
        to this end point.
    
        Although this function is part of the external API for Bstrlib, the
        interface and semantics (length limitations, and unusual return codes)
        are fairly atypical.  The real purpose for this function is to provide an
        engine for the bvformata macro.
    
        Note that if the BSTRLIB_NOVSNP macro has been set when bstrlib has been
        compiled the bvcformata function is not present.
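
        The grow-and-retry protocol described above has a close analogue in
        standard C99.  The sketch below does not call bvcformata; it uses
        vsnprintf instead, which reports the required length directly rather
        than through a negative return code.  format_retry is a hypothetical
        helper name, and the initial guess of 16 bytes is an arbitrary choice
        for this sketch:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format into a malloc'ed buffer, starting with a
   small guess and retrying with the size that vsnprintf reports.  The
   caller frees the result.  NULL is returned on any error. */
char * format_retry (const char * fmt, ...) {
    size_t count = 16;
    char * buf = NULL;
    for (;;) {
        va_list ap;
        int n;
        char * tmp = (char *) realloc (buf, count);
        if (tmp == NULL) { free (buf); return NULL; }
        buf = tmp;
        va_start (ap, fmt);     /* the list is restarted for every pass */
        n = vsnprintf (buf, count, fmt, ap);
        va_end (ap);
        if (n < 0) { free (buf); return NULL; }
        if ((size_t) n < count) return buf;   /* output fit, done */
        count = (size_t) n + 1;               /* updated bound for next pass */
    }
}
```

        As with bvcformata, nothing is committed until a pass fits within the
        current bound; a failed pass only yields a better bound for the next
        attempt.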
    
        ..........................................................................
    
        extern bstring bread (bNread readPtr, void * parm);
        typedef size_t (* bNread) (void *buff, size_t elsize, size_t nelem,
                                   void *parm);
    
        Read an entire stream into a bstring, verbatim.  The readPtr function
        pointer is compatible with fread semantics, except that it need not obtain
        the stream data from a file.  The intention is that parm would contain
        the stream data context/state required (similar to the role of the FILE*
        I/O stream parameter of fread.)
    
        Abstracting the block read function allows for block devices other than
        file streams to be read if desired.  Note that there is an ANSI
        compatibility issue if "fread" is used directly; see the ANSI issues
        section below.
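
        As an illustration of a non-file source, the following sketch shows a
        reader with the bNread shape that serves data from a block of memory.
        struct mem_stream and mem_read are hypothetical names introduced for
        this example and are not part of Bstrlib:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream state: a block of memory and a read cursor. */
struct mem_stream { const unsigned char * data; size_t len, pos; };

/* fread-like reader with the same parameter list as bNread.  It copies
   whole elements only, returns the number of elements copied, and returns
   0 once the block is exhausted. */
size_t mem_read (void * buff, size_t elsize, size_t nelem, void * parm) {
    struct mem_stream * ms = (struct mem_stream *) parm;
    size_t avail;
    if (buff == NULL || ms == NULL || elsize == 0) return 0;
    avail = (ms->len - ms->pos) / elsize;
    if (nelem > avail) nelem = avail;
    memcpy (buff, ms->data + ms->pos, nelem * elsize);
    ms->pos += nelem * elsize;
    return nelem;
}
```

        Given an initialized struct mem_stream ms, a call such as
        bread (mem_read, &ms) would then gather the whole block into a bstring.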
    
        ..........................................................................
    
        extern int breada (bstring b, bNread readPtr, void * parm);
    
        Read an entire stream and append it to a bstring, verbatim.  Behaves
        like bread, except that it appends its results to the bstring b.
        BSTR_ERR is returned on error, otherwise 0 is returned.
    
        ..........................................................................
    
        extern bstring bgets (bNgetc getcPtr, void * parm, char terminator);
        typedef int (* bNgetc) (void * parm);
    
        Read a bstring from a stream.  As many bytes as is necessary are read
        until the terminator is consumed or no more characters are available from
        the stream.  If read from the stream, the terminator character will be
        appended to the end of the returned bstring.  The getcPtr function must
        have the same semantics as the fgetc C library function (i.e., returning
        an integer whose value is negative when there are no more characters
        available, otherwise the value of the next available unsigned character
        from the stream.)  The intention is that parm would contain the stream
        data context/state required (similar to the role of the FILE* I/O stream
        parameter of fgets.)  If no characters are read, or there is some other
        detectable error, NULL is returned.
    
        bgets will never call the getcPtr function more often than necessary to
        construct its output (including a single call, if required, to determine
        that the stream contains no more characters.)
    
        Abstracting the character stream function and terminator character allows
        for different stream devices and string formats other than '\n'
        terminated lines in a file if desired (consider \032 terminated email
        messages, in a UNIX mailbox for example.)
    
        For files, this function can be used analogously to fgets as follows:
    
            fp = fopen ( ... );
            if (fp) b = bgets ((bNgetc) fgetc, fp, '\n');
    
        (Note that only one terminator character can be used, and that '\0' is
        not assumed to terminate the stream in addition to the terminator
        character. This is consistent with the semantics of fgets.)
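
        The cast of fgetc to bNgetc shown above calls fgetc through a pointer of
        a different function type, which the C standard does not guarantee to
        work.  A thin wrapper with exactly the bNgetc shape avoids the cast.  The
        sketch below shows such a wrapper, plus an in-memory character source;
        file_getc, struct mem_chars and mem_getc are hypothetical names used
        only for this illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Wrapper with exactly the bNgetc shape, so no function pointer cast is
   needed.  parm is assumed to be an open FILE *. */
int file_getc (void * parm) {
    return fgetc ((FILE *) parm);
}

/* Hypothetical in-memory stream for sources that are not files. */
struct mem_chars { const unsigned char * data; size_t len, pos; };

/* Returns the next byte as a non-negative value, or a negative value when
   no more characters are available, as fgetc does. */
int mem_getc (void * parm) {
    struct mem_chars * mc = (struct mem_chars *) parm;
    if (mc == NULL || mc->pos >= mc->len) return -1;
    return mc->data[mc->pos++];
}
```

        These would be used as bgets (file_getc, fp, '\n') and
        bgets (mem_getc, &mc, '\n') respectively.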
    
        ..........................................................................
    
        extern int bgetsa (bstring b, bNgetc getcPtr, void * parm, char terminator);
    
        Read from a stream and concatenate to a bstring.  Behaves like bgets,
        except that it appends its results to the bstring b.  The value 1 is
        returned if no characters are read before a negative result is returned
        from getcPtr.  Otherwise BSTR_ERR is returned on error, and 0 is returned
        in other normal cases.
    
        ..........................................................................
    
        extern int bassigngets (bstring b, bNgetc getcPtr, void * parm, char terminator);
    
        Read from a stream and assign to a bstring.  Behaves like bgets,
        except that it assigns the results to the bstring b.  The value 1 is
        returned if no characters are read before a negative result is returned
        from getcPtr.  Otherwise BSTR_ERR is returned on error, and 0 is returned
        in other normal cases.
    
        ..........................................................................
    
        extern struct bStream * bsopen (bNread readPtr, void * parm);
    
        Wrap a given open stream (described by a fread compatible function
        pointer and stream handle) into an open bStream suitable for the bstring
        library streaming functions.
    
        ..........................................................................
    
        extern void * bsclose (struct bStream * s);
    
        Close the bStream, and return the handle to the stream that was
        originally used to open the given stream.  If s is NULL or detectably
        invalid, NULL will be returned.
    
        ..........................................................................
    
        extern int bsbufflength (struct bStream * s, int sz);
    
        Set the length of the buffer used by the bStream.  If sz is the macro
        BSTR_BS_BUFF_LENGTH_GET (which is 0), the length is not set.  If s is
        NULL or sz is negative, the function will return with BSTR_ERR, otherwise
        this function returns with the previous length.
    
        ..........................................................................
    
        extern int bsreadln (bstring r, struct bStream * s, char terminator);
    
        Read a bstring terminated by the terminator character or the end of the
        stream from the bStream (s) and return it into the parameter r.  The
        matched terminator, if found, appears at the end of the line read.  If
        the stream has been exhausted of all available data, before any can be
        read, BSTR_ERR is returned.  This function may read additional characters
        into the stream buffer from the core stream that are not returned, but
        will be retained for subsequent read operations.  When reading from high
        speed streams, this function can perform significantly faster than bgets.
    
        ..........................................................................
    
        extern int bsreadlna (bstring r, struct bStream * s, char terminator);
    
        Read a bstring terminated by the terminator character or the end of the
        stream from the bStream (s) and concatenate it to the parameter r.  The
        matched terminator, if found, appears at the end of the line read.  If
        the stream has been exhausted of all available data, before any can be
        read, BSTR_ERR is returned.  This function may read additional characters
        into the stream buffer from the core stream that are not returned, but
        will be retained for subsequent read operations.  When reading from high
        speed streams, this function can perform significantly faster than bgets.
    
        ..........................................................................
    
        extern int bsreadlns (bstring r, struct bStream * s, bstring terminators);
    
        Read a bstring terminated by any character in the terminators bstring or
        the end of the stream from the bStream (s) and return it into the
        parameter r. This function may read additional characters from the core
        stream that are not returned, but will be retained for subsequent read
        operations.
    
        ..........................................................................
    
        extern int bsreadlnsa (bstring r, struct bStream * s, bstring terminators);
    
        Read a bstring terminated by any character in the terminators bstring or
        the end of the stream from the bStream (s) and concatenate it to the
        parameter r.  If the stream has been exhausted of all available data,
        before any can be read, BSTR_ERR is returned.  This function may read
        additional characters from the core stream that are not returned, but
        will be retained for subsequent read operations.
    
        ..........................................................................
    
        extern int bsread (bstring r, struct bStream * s, int n);
    
        Read a bstring of length n (or, if it is fewer, as many bytes as is
        remaining) from the bStream.  This function will read the minimum
        required number of additional characters from the core stream.  When the
        stream is at the end of the file BSTR_ERR is returned, otherwise BSTR_OK
        is returned.
    
        ..........................................................................
    
        extern int bsreada (bstring r, struct bStream * s, int n);
    
        Read a bstring of length n (or, if it is fewer, as many bytes as is
        remaining) from the bStream and concatenate it to the parameter r.  This
        function will read the minimum required number of additional characters
        from the core stream.  When the stream is at the end of the file BSTR_ERR
        is returned, otherwise BSTR_OK is returned.
    
        ..........................................................................
    
        extern int bsunread (struct bStream * s, const_bstring b);
    
        Insert a bstring into the bStream at the current position.  These
        characters will be read prior to those that actually come from the core
        stream.
    
        ..........................................................................
    
        extern int bspeek (bstring r, const struct bStream * s);
    
        Return the number of currently buffered characters from the bStream that
        will be read prior to reads from the core stream, and append it to the
        parameter r.
    
        ..........................................................................
    
        extern int bssplitscb (struct bStream * s, const_bstring splitStr,
        int (* cb) (void * parm, int ofs, const_bstring entry), void * parm);
    
        Iterate the set of disjoint sequential substrings over the stream s
        divided by any character from the bstring splitStr.  The parm passed to
        bssplitscb is passed on to cb.  If the function cb returns a value < 0,
        then further iterating is halted and this return value is returned by
        bssplitscb.
    
        Note: At the point of calling the cb function, the bStream pointer is
        pointed exactly at the position right after having read the split
        character.  The cb function can act on the stream by causing the bStream
        pointer to move, and bssplitscb will continue by starting the next split
        at the position of the pointer after the return from cb.
    
        However, if the cb causes the bStream s to be destroyed then the cb must
        return with a negative value, otherwise bssplitscb will continue in an
        undefined manner.
    
        This function is provided as a way to incrementally parse through a file
        or other generic stream that in total size may otherwise exceed the
        practical or desired memory available.  As with the other split callback
        based functions this is abortable and does not impose additional memory
        allocation.
    
        ..........................................................................
    
        extern int bssplitstrcb (struct bStream * s, const_bstring splitStr,
        int (* cb) (void * parm, int ofs, const_bstring entry), void * parm);
    
        Iterate the set of disjoint sequential substrings over the stream s
        divided by the entire substring splitStr.  The parm passed to
        bssplitstrcb is passed on to cb.  If the function cb returns a
        value < 0, then further iterating is halted and this return value is
        returned by bssplitstrcb.
    
        Note: At the point of calling the cb function, the bStream pointer is
        pointed exactly at the position right after having read the split
        string.  The cb function can act on the stream by causing the bStream
        pointer to move, and bssplitstrcb will continue by starting the next
        split at the position of the pointer after the return from cb.
    
        However, if the cb causes the bStream s to be destroyed then the cb must
        return with a negative value, otherwise bssplitstrcb will continue in an
        undefined manner.
    
        This function is provided as a way to incrementally parse through a file
        or other generic stream that in total size may otherwise exceed the
        practical or desired memory available.  As with the other split callback
        based functions this is abortable and does not impose additional memory
        allocation.
    
        ..........................................................................
    
        extern int bseof (const struct bStream * s);
    
        Return the de facto "EOF" (end of file) state of a stream (1 if the
        bStream is in an EOF state, 0 if not, and BSTR_ERR if stream is closed or
        detectably erroneous.)  When the readPtr callback returns a value <= 0
        the stream reaches its "EOF" state. Note that bsunread with non-empty
        content will essentially turn off this state, and the stream will not be
        in its "EOF" state so long as it is possible to read more data out of it.
    
        Also note that the semantics of bseof() are slightly different from
        something like feof().  I.e., reaching the end of the stream does not
        necessarily guarantee that bseof() will return with a value indicating
        that this has happened.  bseof() will only return indicating that it has
        reached the "EOF" and an attempt has been made to read past the end of
        the bStream.
    
    The macros
    ----------
    
        The macros described below are shown in a prototype form indicating their
        intended usage.  Note that the parameters passed to these macros will be
        referenced multiple times.  As with all macros, programmer care is
        required to guard against unintended side effects.
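
        The multiple-reference hazard can be demonstrated with a stand-in macro.
        Everything in the sketch below (struct sketch_str, SKETCH_CHARE,
        sketch_index and sketch_evals) is hypothetical and is not the Bstrlib
        definition; it only shows why an argument with side effects is risky
        when a macro mentions its parameter more than once:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the Bstrlib definitions. */
struct sketch_str { int slen; const unsigned char * data; };

/* Bounds-checked character fetch; note that p appears three times. */
#define SKETCH_CHARE(b, p, e) \
    (((b) != NULL && (p) >= 0 && (p) < (b)->slen) ? (b)->data[(p)] : (e))

/* Counts how often an index expression is actually evaluated. */
int sketch_evals = 0;
int sketch_index (int i) { sketch_evals++; return i; }
```

        With sp pointing at a sketch_str holding "abc", the expression
        SKETCH_CHARE (sp, sketch_index (1), '?') calls sketch_index three times,
        so an argument such as i++ would advance i more than once.  Evaluating
        such arguments into a local variable first avoids the problem.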
    
        int blengthe (const_bstring b, int err);
    
        Returns the length of the bstring.  If the bstring is NULL err is
        returned.
    
        ..........................................................................
    
        int blength (const_bstring b);
    
        Returns the length of the bstring.  If the bstring is NULL, the length
        returned is 0.
    
        ..........................................................................
    
        int bchare (const_bstring b, int p, int c);
    
        Returns the p'th character of the bstring b.  If the position p refers to
        a position that does not exist in the bstring or the bstring is NULL,
        then c is returned.
    
        ..........................................................................
    
        char bchar (const_bstring b, int p);
    
        Returns the p'th character of the bstring b.  If the position p refers to
        a position that does not exist in the bstring or the bstring is NULL,
        then '\0' is returned.
    
        ..........................................................................
    
        char * bdatae (bstring b, char * err);
    
        Returns the char * data portion of the bstring b.  If b is NULL, err is
        returned.
    
        ..........................................................................
    
        char * bdata (bstring b);
    
        Returns the char * data portion of the bstring b.  If b is NULL, NULL is
        returned.
    
        ..........................................................................
    
        char * bdataofse (bstring b, int ofs, char * err);
    
        Returns the char * data portion of the bstring b offset by ofs.  If b is
        NULL, err is returned.
    
        ..........................................................................
    
        char * bdataofs (bstring b, int ofs);
    
        Returns the char * data portion of the bstring b offset by ofs.  If b is
        NULL, NULL is returned.
    
        ..........................................................................
    
        struct tagbstring var = bsStatic ("...");
    
        The bsStatic macro allows for static declarations of literal string
        constants as struct tagbstring structures.  The resulting tagbstring does
        not need to be freed or destroyed.  Note that this macro is only well
        defined for string literal arguments.  For more general string pointers,
        use the btfromcstr macro.
    
        The resulting struct tagbstring is permanently write protected.  Attempts
        to write to this struct tagbstring from any bstrlib function will lead to
        BSTR_ERR being returned.  Invoking the bwriteallow macro onto this struct
        tagbstring has no effect.
    
        ..........................................................................
    
        <void * blk, int len> <- bsStaticBlkParms ("...")
    
        The bsStaticBlkParms macro emits a pair of comma separated parameters
        corresponding to the block parameters for the block functions in Bstrlib
        (i.e., blk2bstr, bcatblk, blk2tbstr, bisstemeqblk, bisstemeqcaselessblk.)
        Note that this macro is only well defined for string literal arguments.
    
        Examples:
    
        bstring b = blk2bstr (bsStaticBlkParms ("Fast init. "));
        bcatblk (b, bsStaticBlkParms ("No frills fast concatenation."));
    
        These are faster than using bfromcstr() and bcatcstr() respectively
        because the length of the inline string is known as a compile time
        constant.  Also note that separate struct tagbstring declarations for
        holding the output of a bsStatic() macro are not required.
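
        The idea of a literal's length being a compile time constant can be
        sketched in plain C as follows.  DEMO_STATIC_BLK_PARMS and
        demo_copy_blk are hypothetical names invented for this sketch, and
        the macro is only one way such a pair could be produced; it is not
        presented as the library's definition.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro: expands to "pointer, length" for a string literal.
   The "" q "" concatenation only compiles when q is a literal, and
   sizeof(q) - 1 is a compile time constant that excludes the final '\0'. */
#define DEMO_STATIC_BLK_PARMS(q) \
    ((const void *)("" q "")), ((int)sizeof(q) - 1)

/* Hypothetical block consumer: copies at most cap bytes of the block into
   dst and returns the number of bytes copied. */
static int demo_copy_blk(char *dst, int cap, const void *blk, int len) {
    int n = (len < cap) ? len : cap;
    if (n < 0) n = 0;
    memcpy(dst, blk, (size_t)n);
    return n;
}

static void demo_static_blk_example(void) {
    char buf[32];
    int n = demo_copy_blk(buf, (int)sizeof buf,
                          DEMO_STATIC_BLK_PARMS("Fast init. "));
    assert(n == 11);  /* length known without any call to strlen */
    assert(memcmp(buf, "Fast init. ", 11) == 0);
}
```

        Passing a char pointer variable instead of a literal to such a macro
        fails to compile, which matches the note that the macro is only well
        defined for string literal arguments.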
    
        ..........................................................................
    
        void btfromcstr (struct tagbstring& t, const char * s);
    
        Fill in the tagbstring t with the '\0' terminated char buffer s.  This
        action is purely reference oriented; no memory management is done.  The
        data member is just assigned s, and slen is assigned the strlen of s.
        The s parameter is accessed exactly once in this macro.
    
        The resulting struct tagbstring is initially write protected.  Attempts
        to write to this struct tagbstring in a write protected state from any
        bstrlib function will lead to BSTR_ERR being returned.  Invoke the
        bwriteallow on this struct tagbstring to make it writeable (though this
        requires that s be obtained from a function compatible with malloc.)
    
        ..........................................................................
    
        void btfromblk (struct tagbstring& t, void * s, int len);
    
        Fill in the tagbstring t with the data buffer s with length len.  This
        action is purely reference oriented; no memory management is done.  The
        data member of t is just assigned s, and slen is assigned len.  Note that
        the buffer is not appended with a '\0' character.  The s and len
        parameters are accessed exactly once each in this macro.
    
        The resulting struct tagbstring is initially write protected.  Attempts
        to write to this struct tagbstring in a write protected state from any
        bstrlib function will lead to BSTR_ERR being returned.  Invoke the
        bwriteallow on this struct tagbstring to make it writeable (though this
        requires that s be obtained from a function compatible with malloc.)
    
        ..........................................................................
    
        void btfromblkltrimws (struct tagbstring& t, void * s, int len);
    
        Fill in the tagbstring t with the data buffer s with length len after it
        has been left trimmed.  This action is purely reference oriented; no
        memory management is done.  The data member of t is just assigned to a
        pointer inside the buffer s.  Note that the buffer is not appended with a
        '\0' character.  The s and len parameters are accessed exactly once each
        in this macro.
    
        The resulting struct tagbstring is permanently write protected.  Attempts
        to write to this struct tagbstring from any bstrlib function will lead to
        BSTR_ERR being returned.  Invoking the bwriteallow macro onto this struct
        tagbstring has no effect.
    
        ..........................................................................
    
        void btfromblkrtrimws (struct tagbstring& t, void * s, int len);
    
        Fill in the tagbstring t with the data buffer s with length len after it
        has been right trimmed.  This action is purely reference oriented; no
        memory management is done.  The data member of t is just assigned to a
        pointer inside the buffer s.  Note that the buffer is not appended with a
        '\0' character.  The s and len parameters are accessed exactly once each
        in this macro.
    
        The resulting struct tagbstring is permanently write protected.  Attempts
        to write to this struct tagbstring from any bstrlib function will lead to
        BSTR_ERR being returned.  Invoking the bwriteallow macro onto this struct
        tagbstring has no effect.
    
        ..........................................................................
    
        void btfromblktrimws (struct tagbstring& t, void * s, int len);
    
        Fill in the tagbstring t with the data buffer s with length len after it
        has been left and right trimmed.  This action is purely reference
        oriented; no memory management is done.  The data member of t is just
        assigned to a pointer inside the buffer s.  Note that the buffer is not
        appended with a '\0' character.  The s and len parameters are accessed
        exactly once each in this macro.
    
        The resulting struct tagbstring is permanently write protected.  Attempts
        to write to this struct tagbstring from any bstrlib function will lead to
        BSTR_ERR being returned.  Invoking the bwriteallow macro onto this struct
        tagbstring has no effect.
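
        The "purely reference oriented" trimming described for the
        btfromblk*trimws macros can be pictured with the sketch below.  It
        is a standalone model: struct demo_view and demo_view_trimws are
        hypothetical names, and isspace() is assumed as the definition of
        white space.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical reference-only view: it points into a buffer owned by
   someone else and never allocates or frees anything. */
struct demo_view {
    const unsigned char *data;
    int slen;
};

/* Trim white space from both ends of s[0..len) by adjusting the view only.
   Nothing is copied, written, or '\0' terminated. */
static void demo_view_trimws(struct demo_view *t, const void *s, int len) {
    const unsigned char *p = (const unsigned char *)s;
    int left = 0, right = len;

    assert(t != NULL);
    if (p == NULL || len <= 0) {
        t->data = (const unsigned char *)"";
        t->slen = 0;
        return;
    }
    while (left < right && isspace(p[left])) left++;
    while (right > left && isspace(p[right - 1])) right--;
    t->data = p + left;
    t->slen = right - left;
}

static void demo_view_trimws_example(void) {
    static const char raw[] = "  key = value \n";
    struct demo_view v;
    demo_view_trimws(&v, raw, (int)sizeof raw - 1);
    assert(v.slen == 11);
    assert(memcmp(v.data, "key = value", 11) == 0);
    assert(v.data == (const unsigned char *)raw + 2);  /* still inside raw */
}
```

        Because the view ends in the middle of the original buffer, the byte
        after it is not '\0', so it cannot be handed to functions that
        expect a terminated char buffer.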
    
        ..........................................................................
    
        void bmid2tbstr (struct tagbstring& t, bstring b, int pos, int len);
    
        Fill the tagbstring t with the substring from b, starting from position
        pos with a length len.  The segment is clamped by the boundaries of
        the bstring b.  This action is purely reference oriented; no memory
        management is done.  Note that the buffer is not appended with a '\0'
        character.  Note that the t parameter to this macro may be accessed
        multiple times.  Note that the contents of t will become undefined
        if the contents of b change or are destroyed.
    
        The resulting struct tagbstring is permanently write protected.  Attempts
        to write to this struct tagbstring in a write protected state from any
        bstrlib function will lead to BSTR_ERR being returned.  Invoking the
        bwriteallow macro on this struct tagbstring will have no effect.
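
        The clamping of the segment can be sketched as follows.  This is one
        plausible reading of "clamped by the boundaries", written as a
        standalone model with the hypothetical names struct demo_span and
        demo_span_mid; the exact edge case behaviour of the real macro
        should be checked against the header.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reference-only substring (no ownership, no '\0'). */
struct demo_span {
    const unsigned char *data;
    int slen;
};

/* Point t at src[pos .. pos+len), clipped to the range [0, srclen).
   An empty or fully out-of-range request yields an empty span. */
static void demo_span_mid(struct demo_span *t, const unsigned char *src,
                          int srclen, int pos, int len) {
    assert(t != NULL);
    t->data = (const unsigned char *)"";
    t->slen = 0;
    if (src == NULL || srclen <= 0 || len <= 0) return;
    if (pos < 0) { len += pos; pos = 0; }        /* clip before the start */
    if (pos >= srclen) return;
    if (len > srclen - pos) len = srclen - pos;  /* clip past the end */
    if (len <= 0) return;
    t->data = src + pos;
    t->slen = len;
}

static void demo_span_mid_example(void) {
    static const unsigned char src[] = "Hello, world";
    struct demo_span t;

    demo_span_mid(&t, src, 12, 7, 100);   /* runs past the end */
    assert(t.slen == 5 && memcmp(t.data, "world", 5) == 0);

    demo_span_mid(&t, src, 12, -3, 5);    /* starts before position 0 */
    assert(t.slen == 2 && memcmp(t.data, "He", 2) == 0);

    demo_span_mid(&t, src, 12, 20, 1);    /* entirely out of range */
    assert(t.slen == 0);
}
```

        The span borrows src, so it is only meaningful while src is alive
        and unchanged, which mirrors the warning above about the contents of
        t becoming undefined.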
    
        ..........................................................................
    
        bstring bfromStatic("...");
    
        Allocate a bstring with the contents of a string literal.  Returns
        NULL if an error has occurred (ran out of memory).  The string literal
        parameter is enforced as literal at compile time.
    
        ..........................................................................
    
        int bcatStatic (bstring b, "...");
    
        Append a string literal to bstring b.  Returns 0 if successful, or
        BSTR_ERR if some error has occurred.  The string literal parameter is
        enforced as literal at compile time.
    
        ..........................................................................
    
        int binsertStatic (bstring s1, int pos, " ... ", char fill);
    
        Inserts the string literal into s1 at position pos.  If the position pos
        is past the end of s1, then the character "fill" is appended as necessary
        to make up the gap between the end of s1 and pos.  The value BSTR_OK is
        returned if the operation is successful, otherwise BSTR_ERR is returned.
    
        ..........................................................................
    
        int bassignStatic (bstring b, " ... ");
    
        Assign the contents of a string literal to the bstring b.  The string
        literal parameter is enforced as literal at compile time.
    
        ..........................................................................
    
        int biseqStatic (const_bstring b, " ... ");
    
        Compare the string b with the string literal.  If the content differs, 0
        is returned, if the content is the same, 1 is returned, if there is an
        error, -1 is returned.  If the lengths of the strings differ, this
        function is O(1).  '\0' characters are not treated in any special way.
    
        ..........................................................................
    
        int biseqcaselessStatic (const_bstring b, " ... ");
    
        Compare content of b and the string literal for equality without
        differentiating between character case.  If the content differs other
        than in case, 0 is returned, if, ignoring case, the content is the same,
        1 is returned, if there is an error, -1 is returned.  If the lengths of
        the strings differ, this function is O(1).  '\0' characters are
        not treated in any special way.
    
        ..........................................................................
    
        int bisstemeqStatic (bstring b, " ... ");
    
        Compare beginning of bstring b with a string literal for equality.  If
        the beginning of b differs from the memory block (or if b is too short),
        0 is returned, if the bstrings are the same, 1 is returned, if there is
        an error, -1 is returned.  The string literal parameter is enforced as
        literal at compile time.
    
        ..........................................................................
    
        int bisstemeqcaselessStatic (bstring b, " ... ");
    
        Compare beginning of bstring b with a string literal without
        differentiating between case for equality.  If the beginning of b differs
        from the memory block other than in case (or if b is too short), 0 is
        returned, if the bstrings are the same, 1 is returned, if there is an
        error, -1 is returned.  The string literal parameter is enforced as
        literal at compile time.
    
        ..........................................................................
    
        bstring bjoinStatic (const struct bstrList * bl, " ... ");
    
        Join the entries of a bstrList into one bstring by sequentially
        concatenating them with the string literal in between.  If there is an
        error NULL is returned, otherwise a bstring with the correct result is
        returned.  See bstrListCreate() above for structure of struct bstrList.
    
        ..........................................................................
    
        void bvformata (int& ret, bstring b, const char * format, lastarg);
    
        Append the bstring b with printf like formatting with the format control
        string, and the arguments taken from the ... list of arguments after
        lastarg passed to the containing function.  If the containing function
        does not have ... parameters or lastarg is not the last named parameter
        before the ... then the results are undefined.  If successful, the
        results are appended to b and BSTR_OK is assigned to ret.  Otherwise
        BSTR_ERR is assigned to ret.
    
        Example:
    
        void dbgerror (FILE * fp, const char * fmt, ...) {
            int ret;
            bstring b;
            bvformata (ret, b = bfromcstr ("DBG: "), fmt, fmt);
            if (BSTR_OK == ret) fputs ((char *) bdata (b), fp);
            bdestroy (b);
        }
    
        Note that if the BSTRLIB_NOVSNP macro was set when bstrlib had been
        compiled the bvformata macro will not link properly.  If the
        BSTRLIB_NOVSNP macro has been set, the bvformata macro will not be
        available.
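
        For readers unfamiliar with forwarding a ... argument list, the
        following standalone sketch shows the general C99 pattern of
        appending formatted output to a growing buffer.  It uses a plain
        char buffer rather than a bstring, the names demo_vappend and
        demo_appendf are hypothetical, and the measure-then-write use of
        vsnprintf is a common idiom rather than a description of how
        bvformata is implemented.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append formatted text to the heap buffer *buf of current length *len.
   Returns 0 on success and -1 on failure (buffer left untouched). */
static int demo_vappend(char **buf, size_t *len, const char *fmt,
                        va_list ap) {
    va_list ap2;
    char *p;
    int n;

    va_copy(ap2, ap);                  /* first pass only measures */
    n = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (n < 0) return -1;

    p = realloc(*buf, *len + (size_t)n + 1);
    if (p == NULL) return -1;
    vsnprintf(p + *len, (size_t)n + 1, fmt, ap);  /* second pass writes */
    *buf = p;
    *len += (size_t)n;
    return 0;
}

/* Variadic front end: fmt is the last named parameter before the ... */
static int demo_appendf(char **buf, size_t *len, const char *fmt, ...) {
    va_list ap;
    int r;
    va_start(ap, fmt);
    r = demo_vappend(buf, len, fmt, ap);
    va_end(ap);
    return r;
}

static void demo_appendf_example(void) {
    char *buf = NULL;
    size_t len = 0;
    assert(demo_appendf(&buf, &len, "DBG: ") == 0);
    assert(demo_appendf(&buf, &len, "%s=%d", "x", 42) == 0);
    assert(len == 9 && strcmp(buf, "DBG: x=42") == 0);
    free(buf);
}
```

        As with bvformata, the front end must name the last parameter before
        the ... correctly when starting the argument list.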
    
        ..........................................................................
    
        void bwriteprotect (struct tagbstring& t);
    
        Disallow bstring from being written to via the bstrlib API.  Attempts to
        write to the resulting tagbstring from any bstrlib function will lead to
        BSTR_ERR being returned.
    
        Note: bstrings which are write protected cannot be destroyed via bdestroy.
    
        Note to C++ users: Setting a CBString as write protected will not prevent
        it from being destroyed by the destructor.
    
        ..........................................................................
    
        void bwriteallow (struct tagbstring& t);
    
        Allow bstring to be written to via the bstrlib API.  Note that such an
        action makes the bstring both writable and destroyable.  If the bstring is
        not legitimately writable (as is the case for struct tagbstrings
        initialized with a bsStatic value), the results of this are undefined.
    
        Note that invoking the bwriteallow macro may increase the number of
        reallocs by one more than necessary for every call to bwriteallow
        interleaved with any bstring API which writes to this bstring.
    
        ..........................................................................
    
        int biswriteprotected (struct tagbstring& t);
    
        Returns 1 if the bstring is write protected, otherwise 0 is returned.
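
        The interplay of bwriteprotect, bwriteallow and biswriteprotected
        can be modelled with a single integer field.  The sketch below is a
        hypothetical model, with invented demo_ names and an invented
        encoding, meant only to show how "reversible" and "permanent"
        protection can coexist; it makes no claim about the encoding the
        library actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: mlen > 0 means writable with that capacity,
   mlen == -1 means protected but reversible, and mlen < -1 means
   permanently protected. */
struct demo_tstr {
    int mlen;
    int slen;
    unsigned char *data;
};

enum { DEMO_OK = 0, DEMO_ERR = -1 };

static int demo_is_protected(const struct demo_tstr *t) {
    return t->mlen <= 0;
}

static void demo_protect(struct demo_tstr *t) {
    if (t->mlen >= 0) t->mlen = -1;
}

/* Only the reversible state is undone.  The original capacity was lost
   when protecting, so the smallest consistent value is restored. */
static void demo_allow(struct demo_tstr *t) {
    if (t->mlen == -1) t->mlen = t->slen + 1;
}

/* Every mutating call checks the flag first and fails without writing. */
static int demo_set_char(struct demo_tstr *t, int pos, unsigned char c) {
    if (t == NULL || t->data == NULL || demo_is_protected(t)) return DEMO_ERR;
    if (pos < 0 || pos >= t->slen) return DEMO_ERR;
    t->data[pos] = c;
    return DEMO_OK;
}

static void demo_protect_example(void) {
    unsigned char buf[8] = "abc";
    struct demo_tstr t = { 8, 3, buf };
    struct demo_tstr lit = { -2, 3, (unsigned char *)"abc" };

    demo_protect(&t);
    assert(demo_set_char(&t, 0, 'X') == DEMO_ERR && buf[0] == 'a');
    demo_allow(&t);
    assert(demo_set_char(&t, 0, 'X') == DEMO_OK && buf[0] == 'X');

    demo_allow(&lit);                 /* no effect on permanent protection */
    assert(demo_is_protected(&lit));
}
```

        In this model the recorded capacity shrinks to the minimum after
        demo_allow, so a later growing write would have to reallocate even
        if the buffer was originally larger.  That is one way to picture the
        extra realloc mentioned in the note on bwriteallow.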
    
    ===============================================================================
    
    Unicode functions
    -----------------
    
        The two modules utf8util.c and buniutil.c implement basic functions for
        parsing and collecting Unicode data in the UTF8 format.  Unicode is
        described by a sequence of "code points" which are values between 0 and
        1114111 inclusive mapped to symbol content corresponding to nearly all
        the standardized scripts of the world.
    
        The semantics of Unicode code points is varied and complicated.  The
        base support of the better string library does not attempt to perform
        any interpretation of these code points.  The better string library
        solely provides support for iterating through unicode code points,
        appending and extracting code points to and from bstrings, and parsing
        UTF8 and UTF16 from raw data.
    
        The types cpUcs4 and cpUcs2 are defined as 4 byte and 2 byte encoding
        formats corresponding to UCS4 and UCS2 respectively.  To test
        if a raw code point is valid, the macro isLegalUnicodeCodePoint() has
        been defined.  The utf8 iterator is defined by struct utf8Iterator.  To
        test if the iterator has more code points to walk through the macro
        utf8IteratorNoMore() has been defined.
    
        To use these functions, compile and link utf8util.c and buniutil.c.
    
        ..........................................................................
    
        extern void utf8IteratorInit (struct utf8Iterator* iter,
                                      unsigned char* data, int slen);
    
        Initialize a unicode utf8 iterator to traverse an array of utf8 encoded
        code points pointed to by data, with length slen from the start.  The
        iterator iter is only valid for as long as the array it is pointed to
        is valid and not modified.
    
        ..........................................................................
    
        extern void utf8IteratorUninit (struct utf8Iterator* iter);
    
        Invalidate utf8 iterator.  After calling this, the iterator iter should
        yield false when passed to the utf8IteratorNoMore() macro.
    
        ..........................................................................
    
        extern cpUcs4 utf8IteratorGetNextCodePoint (struct utf8Iterator* iter,
                                                    cpUcs4 errCh);
    
        Parse code point the iterator is pointing at and advance the iterator to
        the next code point.  If the iterator was pointing at a valid code point
        the code point is returned, otherwise, errCh will be returned.
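
        What "parse a code point and advance" involves can be sketched with
        a standalone UTF8 decoder.  The function demo_utf8_next is a
        hypothetical helper, not part of utf8util.c.  It assumes that a
        malformed sequence is skipped one byte at a time and that surrogates
        and values above 1114111 are rejected; the library's iterator may
        make different choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from s[*pos .. len).  On success advance *pos past
   the sequence and return the code point.  On malformed input advance
   *pos by one byte and return errCh. */
static uint32_t demo_utf8_next(const unsigned char *s, int len, int *pos,
                               uint32_t errCh) {
    int i, need, k;
    uint32_t cp, lowest;
    unsigned char b;

    assert(s != NULL && pos != NULL && *pos >= 0 && *pos < len);
    i = *pos;
    b = s[i];
    *pos = i + 1;                      /* default: consume a single byte */

    if (b < 0x80) return b;            /* plain ASCII */
    else if ((b & 0xE0) == 0xC0) { need = 1; cp = b & 0x1Fu; lowest = 0x80; }
    else if ((b & 0xF0) == 0xE0) { need = 2; cp = b & 0x0Fu; lowest = 0x800; }
    else if ((b & 0xF8) == 0xF0) { need = 3; cp = b & 0x07u; lowest = 0x10000; }
    else return errCh;                 /* stray continuation or bad lead */

    if (need > len - i - 1) return errCh;          /* truncated sequence */
    for (k = 1; k <= need; k++) {
        if ((s[i + k] & 0xC0) != 0x80) return errCh;
        cp = (cp << 6) | (s[i + k] & 0x3Fu);
    }
    if (cp < lowest) return errCh;                 /* overlong encoding */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return errCh;

    *pos = i + need + 1;
    return cp;
}

static void demo_utf8_example(void) {
    /* "A", U+00E9, U+20AC, U+1F600, then one invalid byte */
    static const unsigned char msg[] = {
        0x41, 0xC3, 0xA9, 0xE2, 0x82, 0xAC, 0xF0, 0x9F, 0x98, 0x80, 0xFF
    };
    int pos = 0;
    assert(demo_utf8_next(msg, 11, &pos, 0xFFFD) == 0x41);
    assert(demo_utf8_next(msg, 11, &pos, 0xFFFD) == 0xE9);
    assert(demo_utf8_next(msg, 11, &pos, 0xFFFD) == 0x20AC);
    assert(demo_utf8_next(msg, 11, &pos, 0xFFFD) == 0x1F600);
    assert(demo_utf8_next(msg, 11, &pos, 0xFFFD) == 0xFFFD);
    assert(pos == 11);
}
```

        A "current code point" style query corresponds to calling the same
        decoder on a copy of the position so that the caller's position is
        left unchanged.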
    
        ..........................................................................
    
        extern cpUcs4 utf8IteratorGetCurrCodePoint (struct utf8Iterator* iter,
                                                    cpUcs4 errCh);
    
        Parse code point the iterator is pointing at.  If the iterator was
        pointing at a valid code point the code point is returned, otherwise,
        errCh will be returned.
    
        ..........................................................................
    
        extern int utf8ScanBackwardsForCodePoint (unsigned char* msg, int len,
                                                  int pos, cpUcs4* out);
    
        From the position "pos" in the array msg of length len, search for the
        last position before or at pos from which a valid Unicode code point
        can be parsed.  If such an offset is found it is returned, otherwise
        a negative value is returned.  The code point parsed is put into *out if
        it is not NULL.
    
        ..........................................................................
    
        extern int buIsUTF8Content (const_bstring bu);
    
        Scan a bstring and determine if it is made entirely of valid unicode
        code points.  If it is, 1 is returned, otherwise 0 is returned.
    
        ..........................................................................
    
        extern int buAppendBlkUcs4 (bstring b, const cpUcs4* bu, int len,
                                    cpUcs4 errCh);
    
        Append the code points passed in the UCS4 format (raw numbers) in the
        array bu of length len.  Any unparsable characters are replaced by errCh.
        If errCh is not a valid Unicode code point, then parsing errors will cause
        BSTR_ERR to be returned.
    
        ..........................................................................
    
        extern int buGetBlkUTF16 (cpUcs2* ucs2, int len, cpUcs4 errCh,
                                  const_bstring bu, int pos);
    
        Convert a string of UTF8 codepoints (bu), skipping the first pos, into a
        sequence of UTF16 encoded code points.  Returns the number of UCS2 16-bit
        words written to the output.  No more than len words are written to the
        target array ucs2.  If any code point in bu is unparsable, it will be
        translated to errCh.
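
        The reason the return value counts 16-bit words rather than code
        points is that UTF16 needs two words (a surrogate pair) for code
        points above 65535.  The standalone sketch below shows that encoding
        step for a single code point; demo_utf16_encode is a hypothetical
        helper and does not model the errCh or len handling of
        buGetBlkUTF16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF16.  Writes 1 or 2 words to out and returns
   that count, or 0 if cp is a surrogate value or above U+10FFFF. */
static int demo_utf16_encode(uint32_t cp, uint16_t out[2]) {
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;              /* fits in a single word */
        return 1;
    }
    cp -= 0x10000;                          /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}

static void demo_utf16_example(void) {
    uint16_t w[2];
    assert(demo_utf16_encode(0x20AC, w) == 1 && w[0] == 0x20AC);
    assert(demo_utf16_encode(0x1F600, w) == 2);
    assert(w[0] == 0xD83D && w[1] == 0xDE00);
    assert(demo_utf16_encode(0xD800, w) == 0);  /* lone surrogate rejected */
}
```

        A caller converting a whole string therefore has to budget up to two
        output words per input code point.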
    
        ..........................................................................
    
        extern int buAppendBlkUTF16 (bstring bu, const cpUcs2* utf16, int len,
                                     cpUcs2* bom, cpUcs4 errCh);
    
        Append an array of UCS2 code points (utf16) to UTF8 codepoints (bu).  Any
        invalid code point is replaced by errCh.  If errCh is itself not a
        valid code point, then this translation will halt upon the first error
        and return BSTR_ERR.  Otherwise BSTR_OK is returned.  If a byte order mark
        has been previously read, it may be passed in as bom, otherwise if *bom is
        set to 0, it will be filled in with the BOM as read from the first
        character if it is a BOM.
    
    ===============================================================================
    
    The bstest module
    -----------------
    
    The bstest module is just a unit test for the bstrlib module.  For correct
    implementations of bstrlib, it should execute with 0 failures being reported.
    This test should be utilized if modifications/customizations to bstrlib have
    been performed.  It tests each core bstrlib function with bstrings of every
    mode (read-only, NULL, static and mutable) and ensures that the expected
    semantics are observed (including results that should indicate an error). It
    also tests for aliasing support.  Passing bstest is a necessary but not a
    sufficient condition for ensuring the correctness of the bstrlib module.
    
    
    The test module
    ---------------
    
    The test module is just a unit test for the bstrwrap module.  For correct
    implementations of bstrwrap, it should execute with 0 failures being
    reported.  This test should be utilized if modifications/customizations to
    bstrwrap have been performed.  It tests each core bstrwrap function with
    CBStrings write protected or not and ensures that the expected semantics are
    observed (including expected exceptions.)  Note that exceptions cannot be
    disabled to run this test.  Passing test is a necessary but not a sufficient
    condition for ensuring the correctness of the bstrwrap module.
    
    ===============================================================================
    
    Using Bstring and CBString as an alternative to the C library
    -------------------------------------------------------------
    
    First let us give a table of C library functions and the alternative bstring
    functions and CBString methods that should be used instead of them.
    
    C-library         Bstring alternative             CBString alternative
    ---------         -------------------             --------------------
    gets              bgets                           ::gets
    strcpy            bassign                         = operator
    strncpy           bassignmidstr                   ::midstr
    strcat            bconcat                         += operator
    strncat           bconcat + btrunc                += operator + ::trunc
    strtok            bsplit, bsplits                 ::split
    sprintf           b(assign)format                 ::format
    snprintf          b(assign)format + btrunc        ::format + ::trunc
    vsprintf          bvformata                       bvformata
    
    vsnprintf         bvformata + btrunc              bvformata + btrunc
    vfprintf          bvformata + fputs               use bvformata + fputs
    strcmp            biseq, bstrcmp                  comparison operators.
    strncmp           bstrncmp, memcmp                bstrncmp, memcmp
    strlen            ->slen, blength                 ::length
    strdup            bstrcpy                         constructor
    strset            bpattern                        ::fill
    strstr            binstr                          ::find
    strpbrk           binchr                          ::findchr
    stricmp           bstricmp                        cast & use bstricmp
    strlwr            btolower                        cast & use btolower
    strupr            btoupper                        cast & use btoupper
    strrev            bReverse (aux module)           cast & use bReverse
    strchr            bstrchr                         cast & use bstrchr
    strspnp           use strspn                      use strspn
    ungetc            bsunread                        bsunread
    
    The top 9 C functions listed here are troublesome in that they impose memory
    management in the calling function.  The Bstring and CBString interfaces have
    built-in memory management, so there is far less code with far less potential
    for buffer overrun problems.  strtok can only be reliably called as a "leaf"
    calculation, since it (quite bizarrely) maintains hidden internal state.  And
    gets is well known to be broken no matter what.  The Bstrlib alternatives do
    not suffer from those sorts of problems.
    
    The substitute for strncat can be performed with higher performance by using
    the blk2tbstr macro to create a presized second operand for bconcat.
    
    C-library         Bstring alternative             CBString alternative
    ---------         -------------------             --------------------
    strspn            strspn acceptable               strspn acceptable
    strcspn           strcspn acceptable              strcspn acceptable
    strnset           strnset acceptable              strnset acceptable
    printf            printf acceptable               printf acceptable
    puts              puts acceptable                 puts acceptable
    fprintf           fprintf acceptable              fprintf acceptable
    fputs             fputs acceptable                fputs acceptable
    memcmp            memcmp acceptable               memcmp acceptable
    
    Remember that Bstring (and CBString) functions will automatically append the
    '\0' character to the character data buffer.  So by simply accessing the data
    buffer directly, ordinary C string library functions can be called directly
    on them.  Note that bstrcmp is not the same as memcmp in exactly the same way
    that strcmp is not the same as memcmp.
    
    C-library         Bstring alternative             CBString alternative
    ---------         -------------------             --------------------
    fread             balloc + fread                  ::alloc + fread
    fgets             balloc + fgets                  ::alloc + fgets
    
    These are odd ones because of the exact sizing of the buffer required.  The
    Bstring and CBString alternatives require that the buffers be forced to
    hold at least the prescribed length, after which fread or fgets can be used
    directly.  However, the automatic memory management of Bstring and CBString
    will typically make the use of fgets and fread to read specifically sized
    strings unnecessary.
    
    Implementation Choices
    ----------------------
    
    Overhead:
    .........
    
    The bstring library has more overhead versus straight char buffers for most
    functions.  This overhead is essentially just the memory management and
    string header allocation.  This overhead usually only shows up for small
    string manipulations.  The performance loss has to be considered in
    light of the following:
    
    1) What would be the performance loss of trying to write this management
       code in one's own application?
    2) Since the bstring library source code is given, a sufficiently powerful
       modern inlining globally optimizing compiler can remove function call
       overhead.
    
    Since the data type is exposed, a developer can replace any unsatisfactory
    function with their own inline implementation.  And that is besides the main
    point of what the better string library is mainly meant to provide.  Any
    overhead lost has to be compared against the value of the safe abstraction
    for coupling memory management and string functionality.
    
    Performance of the C interface:
    ...............................
    
    The algorithms used have performance advantages versus the analogous C
    library functions.  For example:
    
    1. bfromcstr/blk2bstr/bstrcpy versus strcpy/strdup.  By using memmove instead
       of strcpy, the break condition of the copy loop is based on an independent
       counter (that should be allocated in a register) rather than having to
       check the results of the load.  Modern out-of-order executing CPUs can
       parallelize the final branch mis-predict penalty with the loading of the
       source string.  Some CPUs will also tend to have better built-in hardware
       support for counted memory moves than load-compare-store.  (This is a
       minor, but non-zero gain.)
    2. biseq versus strcmp.  If the strings are unequal in length, biseq will
       return in O(1) time.  If the strings are aliased, or have aliased data
       buffers, biseq will return in O(1) time.  strcmp will always be O(k),
       where k is the length of the common prefix or the whole string if they are
       identical.
    3. ->slen versus strlen.  ->slen is obviously always O(1), while strlen is
       always O(n) where n is the length of the string.
    4. bconcat versus strcat.  Both rely on precomputing the length of the
       destination string argument, which will favor the bstring library.  On
       iterated concatenations the performance difference can be enormous.
    5. bsreadln versus fgets.  The bsreadln function reads large blocks at a time
       from the given stream, then parses out lines from the buffers directly.
       Some C libraries will implement fgets as a loop over single fgetc calls.
       Testing indicates that the bsreadln approach can be several times faster
       for fast stream devices (such as a file that has been entirely cached.)
    6. bsplits/bsplitscb versus strspn.  Accelerators for the set of match
       characters are generated only once.
    7. binstr versus strstr.  The binstr implementation unrolls the loops to
       help reduce loop overhead.  This will matter if the target string is
       long and source string is not found very early in the target string.
       With strstr, while it is possible to unroll the source contents, it is
       not possible to do so with the destination contents in a way that is
       effective because every destination character must be tested against
       '\0' before proceeding to the next character.
    8. bReverse versus strrev.  The C function must find the end of the string
       first before swapping character pairs.
    9. bstrrchr versus no comparable C function.  It's not hard to write some C
       code to search for a character from the end going backwards.  But there
       is no way to do this without computing the length of the string with
       strlen.
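
    The O(1) early exits described in item 2 follow directly from storing the
    length alongside the data.  The sketch below is a standalone model of that
    idea; struct demo_cstr and demo_iseq are hypothetical names and the code is
    not taken from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-counted string used only for this sketch. */
struct demo_cstr {
    int slen;
    const unsigned char *data;
};

/* Returns 1 if equal, 0 if different, -1 on invalid input. */
static int demo_iseq(const struct demo_cstr *a, const struct demo_cstr *b) {
    if (a == NULL || b == NULL || a->data == NULL || b->data == NULL ||
        a->slen < 0 || b->slen < 0) return -1;
    if (a->slen != b->slen) return 0;       /* O(1): lengths differ */
    if (a->data == b->data) return 1;       /* O(1): same buffer */
    return memcmp(a->data, b->data, (size_t)a->slen) == 0;
}

static void demo_iseq_example(void) {
    static const unsigned char x[] = { 'a', 'b', '\0', 'c', 'd', '\0' };
    static const unsigned char y[] = { 'a', 'b', '\0', 'c', 'e', '\0' };
    struct demo_cstr sx = { 5, x }, sy = { 5, y }, shorter = { 2, x };

    /* strcmp stops at the embedded '\0' and reports the two as equal */
    assert(strcmp((const char *)x, (const char *)y) == 0);
    assert(demo_iseq(&sx, &sy) == 0);       /* counted compare sees "d" vs "e" */
    assert(demo_iseq(&sx, &shorter) == 0);  /* decided by length alone */
    assert(demo_iseq(&sx, &sx) == 1);       /* decided by aliasing alone */
}
```

    The same example shows the semantic difference noted in the motivation:
    with a stored length, '\0' is just another character.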
    
    Practical testing indicates that in general Bstrlib is never significantly
    slower than the C library for common operations, while very often having a
    performance advantage that ranges from significant to massive.  Even for
    functions like b(n)inchr versus str(c)spn() (where, in theory, there is no
    advantage for the Bstrlib architecture) the performance of Bstrlib is vastly
    superior to most tested C library implementations.
    
    Some of Bstrlib's extra functionality also leads to inevitable performance
    advantages over typical C solutions.  For example, using the blk2tbstr macro,
    one can (in O(1) time) generate an internal substring by reference while not
    disturbing the original string.  If disturbing the original string is not an
    option, typically, a comparable char * solution would have to make a copy of
    the substring to provide similar functionality.  Another example is reverse
    character set scanning -- the str(c)spn functions only scan in a forward
    direction which can complicate some parsing algorithms.
    
    Where high performance char * based algorithms are available, Bstrlib can
    still leverage them by accessing the ->data field on bstrings.  So
    realistically Bstrlib can never be significantly slower than any standard
    '\0' terminated char * based solutions.
    
    Performance of the C++ interface:
    .................................
    
    The C++ interface has been designed with an emphasis on abstraction and safety
    first.  However, since it is substantially a wrapper for the C bstring
    functions, for longer strings the performance comments described in the
    "Performance of the C interface" section above still apply. Note that the
    (CBString *) type can be directly cast to a (bstring) type, and passed as
    parameters to the C functions (though a CBString must never be passed to
    bdestroy.)
    
    Probably the most controversial choice is performing full bounds checking on
    the [] operator.  This decision was made because 1) the fast alternative of
    not bounds checking is still available by first casting the CBString to a
    (const char *) buffer or to a (struct tagbstring) then dereferencing .data
    and 2) the lack of bounds checking is seen as one of the main weaknesses of
    C/C++ versus other languages.  Performing this check on every access makes
    individual character extraction slower than in other languages in this one
    respect (other languages' compilers will normally dedicate more resources to
    hoisting or removing bounds checks as necessary) but otherwise brings C++ up
    to the level of other languages in terms of functionality.
    
    It is common for other C++ libraries to leverage the abstractions provided by
    C++ to use reference counting and "copy on write" policies.  While these
    techniques can speed up some scenarios, they impose a problem with respect to
    thread safety.  bstrings and CBStrings can be properly protected with
    "per-object" mutexes, meaning that two bstrlib calls can be made and execute
    simultaneously, so long as the bstrings and CBStrings are distinct.  With a
    reference count and alias before copy on write policy, global mutexes are
    required that prevent multiple calls to the strings library from executing
    simultaneously regardless of whether or not the strings represent the same
    string.
    
    One interesting trade off in CBString is that the default constructor is not
    trivial.  I.e., it always prepares a ready to use memory buffer.  The purpose
    is to ensure that there is a uniform internal composition for any functioning
    CBString that is compatible with bstrings.  It also means that the other
    methods in the class are not forced to perform "late initialization" checks.
    In the end it means that construction of CBStrings is slower than other
    comparable C++ string classes.  Initial testing, however, indicates that
    CBString outperforms std::string and MFC's CString, for example, in all other
    operations.  So to work around this weakness it is recommended that CBString
    declarations be pushed outside of inner loops.
    
    Practical testing indicates that with the exception of the caveats given
    above (constructors and safe index character manipulations) the C++ API for
    Bstrlib generally outperforms popular standard C++ string classes.  Amongst
    the standard libraries and compilers, the quality of concatenation operations
    varies wildly and very little care has gone into search functions.  Bstrlib
    dominates those performance benchmarks.
    
    Memory management:
    ..................
    
    The bstring functions which write and modify bstrings will automatically
    reallocate the backing memory for the char buffer whenever it is required to
    grow.  The algorithm for resizing chosen is to snap up to sizes that are a
    power of two which are sufficient to hold the intended new size.  Memory
    reallocation is not performed when the required size of the buffer is
    decreased.  This behavior can be relied on, and is necessary to make the
    behaviour of balloc deterministic.  This trades off additional memory usage
    for decreasing the frequency for required reallocations:
    
    1. For any bstring whose size never exceeds n, its buffer is not ever
       reallocated more than log_2(n) times for its lifetime.
    2. For any bstring whose size never exceeds n, its buffer is never more than
       2*(n+1) in length.  (The extra characters beyond 2*n are to allow for the
       implicit '\0' which is always added by the bstring modifying functions.)
    
    Decreasing the buffer size when the string decreases in size would violate 1)
    above and in real world cases lead to pathological heap thrashing.  Similarly,
    allocating more tightly than "least power of 2 greater than necessary" would
    lead to a violation of 1) and have the same potential for heap thrashing.
    
    Property 2) needs emphasizing.  Although the memory allocated is always a
    power of 2, for a bstring that grows linearly in size, its buffer memory also
    grows linearly, not exponentially.  The reason is that the amount of extra
    space increases with each reallocation, which decreases the frequency of
    future reallocations.
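
    The two properties can be checked with a small model of the policy.  This
    is not Bstrlib's allocator: the function names are hypothetical, and the
    minimum buffer size of 8 is an assumption made for the sketch.

```c
#include <assert.h>
#include <limits.h>

/* Smallest power of two that is >= needed, with an assumed floor of 8. */
static int demo_snap_up(int needed) {
    int cap = 8;
    assert(needed >= 0 && needed <= INT_MAX / 2);
    while (cap < needed) cap *= 2;
    return cap;
}

/* Append one character at a time up to final_len and count how often the
   modelled buffer has to be (re)allocated.  The +1 leaves room for '\0'. */
static int demo_count_allocs(int final_len, int *final_cap) {
    int cap = 0, allocs = 0, len;
    for (len = 0; len <= final_len; len++) {
        if (len + 1 > cap) {
            cap = demo_snap_up(len + 1);
            allocs++;
        }
    }
    if (final_cap) *final_cap = cap;
    return allocs;
}
```

    Under these assumptions, growing a string to 1000 characters one character
    at a time costs 8 allocations in total and ends with a 1024 byte buffer,
    which is within the 2*(n+1) bound of property 2.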
    
    Obviously, given that bstring writing functions may reallocate the data
    buffer backing the target bstring, one should not attempt to cache the data
    buffer address and use it after such bstring functions have been called.
    This includes making reference struct tagbstrings which alias to a writable
    bstring.
    
    balloc or bfromcstralloc can be used to preallocate the minimum amount of
    space used for a given bstring.  This will reduce even further the number of
    times the data portion is reallocated.  If the length of the string is never
    more than one less than the memory length then there will be no further
    reallocations.
    
    Note that invoking the bwriteallow macro may increase the number of reallocs
    by one more than necessary for every call to bwriteallow interleaved with any
    bstring API which writes to this bstring.
    
    The library does not use any mechanism for automatic clean up for the C API.
    Thus explicit clean up via calls to bdestroy() is required to avoid memory
    leaks.
    
    Constant and static tagbstrings:
    ................................
    
    A struct tagbstring can be write protected from any bstrlib function using
    the bwriteprotect macro.  A write protected struct tagbstring can then be
    reset to being writable via the bwriteallow macro.  There is, of course, no
    protection from attempts to directly access the bstring members.  Modifying a
    bstring which is write protected by direct access has undefined behavior.
    
    static struct tagbstrings can be declared via the bsStatic macro.  They are
    considered permanently unwritable.  Such struct tagbstrings are declared
    such that attempts to write to them are not well defined.  Invoking either
    bwriteallow or bwriteprotect on static struct tagbstrings has no effect.
    
    struct tagbstring's initialized via btfromcstr or blk2tbstr are protected by
    default but can be made writeable via the bwriteallow macro.  If bwriteallow
    is called on such struct tagbstring's, it is the programmer's responsibility
    to ensure that:
    
    1) the buffer supplied was allocated from the heap.
    2) bdestroy is not called on this tagbstring (unless the header itself has
       also been allocated from the heap.)
    3) free is called on the buffer to reclaim its memory.
    
    bwriteallow and bwriteprotect can be invoked on ordinary bstrings (they have
    to be dereferenced with the (*) operator to get the levels of indirection
    correct) to give them write protection.
    
    Buffer declaration:
    ...................
    
    The memory buffer is actually declared "unsigned char *" instead of "char *".
    The reason for this is to trigger compiler warnings whenever uncasted char
    buffers are assigned to the data portion of a bstring.  This will draw more
    diligent programmers into taking a second look at the code where they
    have carelessly left off the typically required cast.  (Research from
    AT&T/Lucent indicates that additional programmer eyeballs is one of the most
    effective mechanisms at ferreting out bugs.)
    
    Function pointers:
    ..................
    
    The bgets, bread and bStream functions use function pointers to obtain
    strings from data streams.  The function pointer declarations have been
    specifically chosen to be compatible with the fgetc and fread functions.
    While this may seem to be a convoluted way of implementing fgets and fread
    style functionality, it has been specifically designed this way to ensure
    that there is no dependency on a single narrowly defined set of device
    interfaces, such as just stream I/O.  In the embedded world, it's quite
    possible to have environments where such interfaces may not exist in the
    standard C library form.  Furthermore, the generalization that this opens up
    allows for more sophisticated uses for these functions (performing an fgets
    like function on a socket, for example.) By using function pointers, it also
    allows such abstract stream interfaces to be created using the bstring library
    itself while not creating a circular dependency.
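
    As a rough illustration of this design, the sketch below reads lines from a
    block of memory through an fgetc-shaped callback.  It does not use Bstrlib;
    demo_getc_fn, demo_mem_stream, demo_mem_getc and demo_read_line are all
    hypothetical names, and the reader writes into a fixed char buffer rather
    than a bstring to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as fgetc, but with void * in place of FILE * (hypothetical
   typedef name; Bstrlib's own type for this is bNgetc). */
typedef int (*demo_getc_fn)(void *parm);

/* A "stream" that is just a block of memory. */
struct demo_mem_stream {
    const char *buf;
    size_t len;
    size_t pos;
};

/* fgetc-like callback: next byte as an int >= 0, or -1 at end of data. */
static int demo_mem_getc(void *parm) {
    struct demo_mem_stream *s = (struct demo_mem_stream *) parm;
    if (s->pos >= s->len) return -1;
    return (unsigned char) s->buf[s->pos++];
}

/* fgets-like reader that only sees the callback, never the device.
   Returns the number of characters stored, terminator included. */
static size_t demo_read_line(char *out, size_t cap, demo_getc_fn getc_fn,
                             void *parm, char terminator) {
    size_t n = 0;
    int c;
    assert(out != NULL && cap > 0 && getc_fn != NULL);
    while (n + 1 < cap && (c = getc_fn(parm)) >= 0) {
        out[n++] = (char) c;
        if (c == (unsigned char) terminator) break;
    }
    out[n] = '\0';
    return n;
}
```

    The reader never mentions FILE, so the same routine would work over a
    socket or a generated sequence by supplying a different callback and
    parameter.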
    
    Use of int's for sizes:
    .......................
    
    This is just a recognition that 16bit platforms with requirements for strings
    that are larger than 64K and 32bit+ platforms with requirements for strings
    that are larger than 4GB are pretty marginal.  The main focus is for 32bit
    platforms, and emerging 64bit platforms with reasonable < 4GB string
    requirements.  Using ints allows for negative values which have meaning
    internally to bstrlib.
    
    Semantic consideration:
    .......................
    
    Certain care needs to be taken when copying and aliasing bstrings.  A bstring
    is essentially a pointer type which points to a multipart abstract data
    structure.  Thus usage, and lifetime of bstrings have semantics that follow
    these considerations.  For example:
    
        bstring a, b;
        struct tagbstring t;
    
        a = bfromcstr("Hello"); /* Create new bstring and copy "Hello" into it. */
        b = a;                  /* Alias b to the contents of a.                */
        t = *a;                 /* Create a current instance pseudo-alias of a. */
        bconcat (a, b);         /* Double a and b, t is now undefined.          */
        bdestroy (a);           /* Destroy the contents of both a and b.        */
    
    Variables of type bstring are really just references that point to real
    bstring objects.  The equal operator (=) creates aliases, and the asterisk
    dereference operator (*) creates a kind of alias to the current instance (which
    is generally not useful for any purpose.)  Using bstrcpy() is the correct way
    of creating duplicate instances.  The ampersand operator (&) is useful for
    creating aliases to struct tagbstrings (remembering that constructed struct
    tagbstrings are not writable by default.)
    
    CBStrings use complete copy semantics for the equal operator (=), and thus do
    not have these sorts of issues.
    
    Debugging:
    ..........
    
    Bstrings have a simple, exposed definition and construction, and the library
    itself is open source.  So most debugging is going to be fairly
    straightforward.  But the memory for bstrings comes from the heap, which can
    often be corrupted indirectly, and it might not be obvious what has happened
    even from direct examination of the contents in a debugger or a core dump.
    There are some tools such as Purify, Insure++ and Electric Fence which can
    help solve such problems; however, another common approach is to directly
    instrument the calls to malloc, realloc, calloc, free, memcpy, memmove and/or
    other calls by overriding them with macro definitions.
    
    Although the user could hack on the Bstrlib sources directly as necessary to
    perform such an instrumentation, Bstrlib comes with a built-in mechanism for
    doing this.  Defining the macro BSTRLIB_MEMORY_DEBUG and providing an
    include file named memdbg.h forces the core Bstrlib modules to attempt to
    include this file.  In such a file, macros could be defined which override
    Bstrlib's usage of the C standard library.
    
    Rather than calling malloc, realloc, free, memcpy or memmove directly, Bstrlib
    emits the macros bstr__alloc, bstr__realloc, bstr__free, bstr__memcpy and
    bstr__memmove in their place respectively.  By default these macros are simply
    assigned to be equivalent to their corresponding C standard library function
    call.  However, if they are given earlier macro definitions (via the back
    door include file) they will not be given their default definition.  In this
    way Bstrlib's interface to the standard library can be changed but without
    having to directly redefine or link standard library symbols (both of which
    are not strictly ANSI C compliant.)
    
    An example definition might include:
    
        #define bstr__alloc(sz) X_malloc ((sz), __LINE__, __FILE__)
    
    which might help contextualize heap entries in a debugging environment.
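
    A minimal sketch of what such a memdbg.h could contain is shown below.
    X_malloc is only the placeholder name used in the definition above, not a
    function that Bstrlib supplies, and demo_alloc_calls is a hypothetical
    counter added for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical memdbg.h contents.  X_malloc is a placeholder name, not
   part of Bstrlib; it logs each request and then defers to malloc. */
static unsigned long demo_alloc_calls = 0;

static void *X_malloc(size_t sz, int line, const char *file) {
    void *p;
    assert(file != NULL);
    p = malloc(sz);
    demo_alloc_calls++;
    fprintf(stderr, "alloc #%lu: %lu bytes -> %p (%s:%d)\n",
            demo_alloc_calls, (unsigned long) sz, p, file, line);
    return p;
}

#define bstr__alloc(sz) X_malloc ((sz), __LINE__, __FILE__)
```

    Because the macro expands at the point of use, __FILE__ and __LINE__ record
    the location of the allocating call rather than that of the logging
    function.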
    
    The NULL parameter and sanity checking of bstrings is part of the Bstrlib
    API, and thus Bstrlib itself does not present any different modes which would
    correspond to "Debug" or "Release" modes.  Bstrlib always contains mechanisms
    which one might think of as debugging features, but retains the performance
    and small memory footprint one would normally associate with release mode
    code.
    
    Integration with Microsoft's Visual Studio debugger:
    ....................................................
    
    Microsoft's Visual Studio debugger has a capability of customizable mouse
    float over data type descriptions.  This is accomplished by editing the
    AUTOEXP.DAT file to include the following:
    
        ; new for CBString
        tagbstring =slen=<slen> mlen=<mlen> <data,st>
        Bstrlib::CBStringList =count=<size()>
    
    In Visual C++ 6.0 this file is located in the directory:
    
        C:\Program Files\Microsoft Visual Studio\Common\MSDev98\Bin
    
    and in Visual Studio .NET 2003 it's located here:
    
        C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Packages\Debugger
    
    This will improve the ability of debugging with Bstrlib under Visual Studio.
    
    Security
    --------
    
    Bstrlib does not come with explicit security features outside of its fairly
    comprehensive error detection, coupled with its strict semantic support.
    That is to say that certain common security problems, such as buffer overrun,
    constant overwrite, arbitrary truncation etc, are far less likely to happen
    inadvertently.  Where it does help, Bstrlib maximizes its advantage by
    providing developers a simple adoption path that lets them leave less secure
    string mechanisms behind.  The library will not leave developers wanting, so
    they will be less likely to add new code using a less secure string library
    to add functionality that might be missing from Bstrlib.
    
    That said there are a number of security ideas not addressed by Bstrlib:
    
    1. Race condition exploitation (i.e., verifying a string's contents, then
    raising the privilege level and executing it as a shell command as two
    non-atomic steps) is well beyond the scope of what Bstrlib can provide.  It
    should be noted that MFC's built-in string mutex actually does not solve this
    problem either -- it just removes immediate data corruption as a possible
    outcome of such exploit attempts (it can be argued that this is worse, since
    it will leave no trace of the exploitation).  In general race conditions have
    to be dealt with by careful design and implementation; a string library
    cannot assist with them.
    
    2. Any kind of access control or security attributes to prevent usage in
    dangerous interfaces such as system().  Perl includes a "trust" attribute
    which can be endowed upon strings that are intended to be passed to such
    dangerous interfaces.  However, Perl's solution reflects its own limitations
    -- notably that it is not a strongly typed language.  In the example code for
    Bstrlib, there is a module called taint.cpp.  It demonstrates how to write a
    simple wrapper class for managing "untainted" or trusted strings using the
    type system to prevent questionable mixing of ordinary untrusted strings with
    untainted ones then passing them to dangerous interfaces.  In this way the
    security correctness of the code reduces to auditing the direct usages of
    dangerous interfaces or promotions of tainted strings to untainted ones.
    
    3. Encryption of string contents is way beyond the scope of Bstrlib.
    Maintaining encrypted string contents in the futile hopes of thwarting things
    like using system-level debuggers to examine sensitive string data is likely
    to be a wasted effort (imagine a debugger that runs at a higher level than a
    virtual processor where the application runs).  For more standard encryption
    usages, since the bstring contents are simply binary blocks of data, this
    should pose no problem for usage with other standard encryption libraries.
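
    The type-system approach described in item 2 can be approximated even in
    plain C.  The sketch below is not the contents of taint.cpp; the names
    demo_trusted, demo_promote and demo_run are hypothetical, and the whitelist
    is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Distinct wrapper type: a plain char * cannot be passed where a
   demo_trusted is required, so promotion has to be explicit. */
typedef struct { const char *text; } demo_trusted;

/* The single audited promotion point.  Returns 0 on success and -1 if
   the input is empty or contains a character outside the whitelist. */
static int demo_promote(const char *raw, demo_trusted *out) {
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz0123456789_-. ";
    assert(out != NULL);
    if (raw == NULL || raw[0] == '\0') return -1;
    if (strspn(raw, allowed) != strlen(raw)) return -1;
    out->text = raw;
    return 0;
}

/* Stand-in for a dangerous interface such as system(); it accepts only
   the trusted type.  Here it just reports the command length. */
static size_t demo_run(demo_trusted cmd) {
    return strlen(cmd.text);
}
```

    With this arrangement an audit only has to cover demo_promote and the
    callers of demo_run, since the compiler rejects any attempt to hand an
    ordinary char * to the dangerous interface.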
    
    Compatibility
    -------------
    
    The Better String Library is known to compile and function correctly with the
    following compilers:
    
      - Microsoft Visual C++
      - Watcom C/C++
      - Intel's C/C++ compiler (Windows)
      - The GNU C/C++ compiler (cygwin and Linux on PPC64)
      - Borland C
      - Turbo C
    
    Setting of configuration options should be unnecessary for these compilers
    (unless exceptions are being disabled or STLport has been added to WATCOM
    C/C++).  Bstrlib has been developed with an emphasis on portability.  As such
    porting it to other compilers should be straightforward.  This package
    includes a porting guide (called porting.txt) which explains what issues may
    exist for porting Bstrlib to different compilers and environments.
    
    ANSI issues
    -----------
    
    1. The function pointer types bNgetc and bNread have prototypes which are very
    similar to, but not exactly the same as fgetc and fread respectively.
    Basically the FILE * parameter is replaced by void *.  The purpose of this
    was to allow one to create other functions with fgetc and fread like
    semantics without being tied to ANSI C's file streaming mechanism.  I.e., one
    could very easily adapt it to sockets, or simply reading a block of memory,
    or procedurally generated strings (for fractal generation, for example.)
    
    The problem is that invoking the functions (bNgetc)fgetc and (bNread)fread is
    not technically legal in ANSI C.  The reason is that the compiler is only
    able to coerce the function pointers themselves into the target type; it is
    unable to perform any cast (implicit or otherwise) on the parameters
    passed once invoked.  I.e., if internally void * and FILE * need some kind of
    mechanical coercion, the compiler will not properly perform this conversion,
    which leads to undefined behavior.
    
    Apparently a platform from Data General called "Eclipse" and another from
    Tandem called "NonStop" have a different representation for pointers to bytes
    and pointers to words, for example, where coercion via casting is necessary.
    (Actual confirmation of the existence of such machines is hard to come by, so
    it is prudent to be skeptical about this information.)  However, this is not
    an issue for any known contemporary platforms.  One may conclude that such
    platforms are effectively apocryphal even if they do exist.
    
    To correctly work around this problem to the satisfaction of the ANSI
    limitations, one needs to create wrapper functions for fgetc and/or
    fread with the prototypes of bNgetc and/or bNread respectively which perform
    no action other than to explicitly cast the void * parameter to a
    FILE *, and simply pass the remaining parameters straight to the function
    pointer call.
    
    The wrappers themselves are trivial:
    
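        #include <stdio.h>

        /* fread with the FILE * parameter received as void * (bNread shape). */
        size_t freadWrap (void * buff, size_t esz, size_t eqty, void * parm) {
            return fread (buff, esz, eqty, (FILE *) parm);
        }

        /* fgetc with the FILE * parameter received as void * (bNgetc shape). */
        int fgetcWrap (void * parm) {
            return fgetc ((FILE *) parm);
        }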
    
    These have not been supplied in bstrlib or bstraux to prevent unnecessary
    linking with file I/O functions.
    
    2. vsnprintf is not available on all compilers.  Because of this, the bformat
    and bformata functions (and format and formata methods) are not guaranteed to
    work properly.  For those compilers that don't have vsnprintf, the
    BSTRLIB_NOVSNP macro should be set before compiling bstrlib, and the format
    functions/method will be disabled.
    
    The more recent ANSI C standards have specified the required inclusion of a
    vsnprintf function.
    
    3. The bstrlib function names are not unique in the first 6 characters.  This
    is only an issue for older C compiler environments which do not store more
    than 6 characters for function names.
    
    4. The bsafe module defines macros and function names which are part of the
    C library.  This simply overrides the definition as expected on all platforms
    tested, however it is not sanctioned by the ANSI standard.  This module is
    clearly optional and should be omitted on platforms which disallow its
    undefined semantics.
    
    In practice the real issue is that some compilers in some modes of operation
    can/will inline these standard library functions on a module by module basis
    as they appear in each.  The linker will thus have no opportunity to override
    the implementation of these functions for those cases.  This can lead to
    inconsistent behaviour of the bsafe module on different platforms and
    compilers.
    
    ===============================================================================
    
    Comparison with Microsoft's CString class
    -----------------------------------------
    
    Although developed independently, CBStrings have very similar functionality to
    Microsoft's CString class.  However, the bstring library has significant
    advantages over CString:
    
    1. Bstrlib is a C-library as well as a C++ library (using the C++ wrapper).
    
        - Thus it is compatible with more programming environments and
          available to a wider population of programmers.
    
    2. The internal structure of a bstring is considered exposed.
    
        - A single contiguous block of data can be cut into read-only pieces by
          simply creating headers, without allocating additional memory to create
          reference copies of each of these sub-strings.
        - In this way, using bstrings in a totally abstracted way becomes a choice
          rather than an imposition.  Further this choice can be made differently
          at different layers of applications that use it.
    
    3. Static declaration support precludes the need for constructor
       invocation.
    
        - Allows for static declarations of constant strings that have no
          additional constructor overhead.
    
    4. Bstrlib is not attached to another library.
    
        - Bstrlib is designed to be easily plugged into any other library
          collection, without dependencies on other libraries or paradigms (such
          as "MFC".)
    
    The bstring library also comes with a few additional functions that are not
    available in the CString class:
    
        - bsetstr
        - bsplit
        - bread
        - breplace (this is different from CString::Replace())
        - Writable indexed characters (for example a[i]='x')
    
    Interestingly, although Microsoft did implement mid$(), left$() and right$()
    functional analogues (these are functions from GWBASIC) they seem to have
    forgotten that mid$() could be also used to write into the middle of a string.
    This functionality exists in Bstrlib with the bsetstr() and breplace()
    functions.
    
    Among the disadvantages of Bstrlib is that there is no special support for
    localization or wide characters.  Such things are considered beyond the scope
    of what bstrings are trying to deliver.  CString essentially supports the
    older UCS-2 version of Unicode via wchar_t as an application-wide compile
    time switch.
    
    CString's also use built-in mechanisms for ensuring thread safety under all
    situations.  While this makes writing thread safe code that much easier, this
    built-in safety feature has a price -- the inner loops of each CString method
    run in their own critical section (grabbing and releasing a light weight mutex
    on every operation.)  The usual way to decrease the impact of a critical
    section performance penalty is to amortize more operations per critical
    section.  But since the implementation of CStrings is fixed as a one critical
    section per-operation cost, there is no way to leverage this common
    performance enhancing idea.
    
    The search facilities in Bstrlib are comparable to those in MFC's CString
    class, though it is missing locale specific collation.  But because Bstrlib
    is interoperable with C's char buffers, it will allow programmers to write
    their own string searching mechanism (such as Boyer-Moore), or be able to
    choose from a variety of available existing string searching libraries (such
    as those for regular expressions) without difficulty.
    
    Microsoft used a very non-ANSI conforming trick in its implementation to
    allow printf() to use the "%s" specifier to output a CString correctly.  This
    can be convenient, but it is inherently not portable.  CBString requires an
    explicit cast, while bstring requires the data member to be dereferenced.
    Microsoft's own documentation recommends casting, instead of relying on this
    feature.
    
    Comparison with C++'s std::string
    ---------------------------------
    
    This is the C++ language's standard STL based string class.
    
    1. There is no C implementation.
    2. The [] operator is not bounds checked.
    3. Missing a lot of useful functions like printf-like formatting.
    4. Some sub-standard std::string implementations (SGI) are necessarily unsafe
       to use with multithreading.
    5. Limited by STL's std::iostream which in turn is limited by ifstream which
       can only take input from files.  (Compare to CBStream's API which can take
       abstracted input.)
    6. Extremely uneven performance across implementations.
    
    Comparison with ISO C TR 24731 proposal
    ---------------------------------------
    
    Following the ISO C99 standard, Microsoft has proposed a group of C library
    extensions which are supposedly "safer and more secure".  This proposal is
    expected to be adopted by the ISO C standard which follows C99.
    
    The proposal reveals itself to be very similar to Microsoft's "StrSafe"
    library. The functions are basically the same as other standard C library
    string functions except that destination parameters are paired with an
    additional length parameter of type rsize_t.  rsize_t is the same as size_t,
    however, the range is checked to make sure it's between 1 and RSIZE_MAX.
    Like Bstrlib, the functions perform a "parameter check".  Unlike Bstrlib,
    when a parameter check fails, rather than simply outputting accumulatable
    error statuses, they call a user settable global error function handler, and
    upon return of control perform no (additional) detrimental action.  The
    proposal covers basic string functions as well as a few non-reentrant
    functions (asctime, ctime, and strtok).
    
    1. Still based solely on char * buffers (and therefore strlen() and strcat()
       are still O(n), and there are no faster streq() comparison functions.)
    2. No growable string semantics.
    3. Requires manual buffer length synchronization in the source code.
    4. No attempt to enhance functionality of the C library.
    5. Introduces a new error scenario (strings exceeding RSIZE_MAX length).
    
    The hope is that by exposing the buffer length requirements there will be
    fewer buffer overrun errors.  However, the error modes are really just
    transformed, rather than removed.  The real problem of buffer overflows is
    that they all happen as a result of erroneous programming.  So forcing
    programmers to manually deal with buffer limits, will make them more aware of
    the problem but doesn't remove the possibility of erroneous programming.  So
    a programmer that erroneously mixes up the rsize_t parameters is no better off
    than a programmer that introduces potential buffer overflows through other
    more typical lapses.  So at best this may reduce the rate of erroneous
    programming, rather than making any attempt at removing failure modes.
    
    The error handler can discriminate between types of failures, but does not
    take into account any callsite context.  So the problem is that the error is
    going to be manifest in a piece of code, but there is no pointer to that
    code.  It would seem that passing in the call site __FILE__, __LINE__ as
    parameters would be very useful, but the API clearly doesn't support such a
    thing (it would increase code bloat even more than the extra length
    parameter does, and would require macro tricks to implement).
    
    The Bstrlib C API takes the position that error handling needs to be done at
    the callsite, and just tries to make it as painless as possible.  Furthermore,
    error modes are removed by supporting auto-growing strings and aliasing.  For
    capturing errors in more central code fragments, Bstrlib's C++ API uses
    exception handling extensively, which is superior to the leaf-only error
    handler approach.
    
    Comparison with Managed String Library CERT proposal
    ----------------------------------------------------
    
    The main webpage for the managed string library:
    http://www.cert.org/secure-coding/managedstring.html
    
    Robert Seacord at CERT has proposed a C string library that he calls the
    "Managed String Library" for C. Like Bstrlib, it introduces a new type
    which is called a managed string. The structure of a managed string
    (string_m) is like a struct tagbstring but missing the length field.  This
    internal structure is considered opaque. The length is, like the C standard
    library, always computed on the fly by searching for a terminating NUL on
    every operation that requires it. So it suffers from every performance
    problem that the C standard library suffers from. Interoperating with C
    string APIs (like printf, fopen, or anything else that takes a string
    parameter) requires copying to additionally allocated buffers that have to
    be manually freed -- this makes this library probably slower and more
    cumbersome than any other string library in existence.
    
    The library gives a fully populated error status as the return value of every
    string function.  The hope is to be able to diagnose all problems
    specifically from the return code alone.  Comparing this to Bstrlib, which
    always returns one consistent error message, might make it seem that Bstrlib
    would be harder to debug; but this is not true.  With Bstrlib, if an error
    occurs there is always enough information from just knowing there was an error
    and examining the parameters to deduce exactly what kind of error has
    happened.  The managed string library thus gives up nested function calls
    while achieving little benefit, while Bstrlib does not.
    
    One interesting feature that "managed strings" has is the idea of data
    sanitization via character set whitelisting.  That is to say, a globally
    definable filter that makes any attempt to put invalid characters into strings
    lead to an error and not modify the string.  The author gives the following
    example:
    
        // create valid char set
        if (retValue = strcreate_m(&str1, "abc") ) {
          fprintf(
            stderr,
            "Error %d from strcreate_m.\n",
            retValue
          );
        }
        if (retValue = setcharset(str1)) {
          fprintf(
            stderr,
            "Error %d from  setcharset().\n",
            retValue
          );
        }
        if (retValue = strcreate_m(&str1, "aabbccabc")) {
          fprintf(
            stderr,
            "Error %d from strcreate_m.\n",
            retValue
          );
        }
        // create string with invalid char set
        if (retValue = strcreate_m(&str1, "abbccdabc")) {
          fprintf(
            stderr,
            "Error %d from strcreate_m.\n",
            retValue
          );
        }
    
    Which we can compare with a more Bstrlib way of doing things:
    
        bstring bCreateWithFilter (const char * cstr, const_bstring filter) {
          bstring b = bfromcstr (cstr);
          if (NULL != b && BSTR_ERR != bninchr (b, 0, filter)) {
            fprintf (stderr, "Filter violation.\n");
            bdestroy (b);
            b = NULL;
          }
          return b;
        }
    
        struct tagbstring charFilter = bsStatic ("abc");
        bstring str1 = bCreateWithFilter ("aabbccabc", &charFilter);
        bstring str2 = bCreateWithFilter ("aabbccdabc", &charFilter);
    
    The first thing we should notice is that with the Bstrlib approach you can
    have different filters for different strings if necessary.  Furthermore,
    selecting a charset filter in the Managed String Library is uni-contextual.
    That is to say, there can only be one such filter active for the entire
    program, which means its usage is not well defined for intermediate library
    usage (a library that uses it will interfere with user code that uses it, and
    vice versa.)  It is also likely to be poorly defined in multi-threading
    environments.
    
    There is also a question as to whether the data sanitization filter is checked
    on every operation, or just on creation operations.  Since the charset can be
    set arbitrarily at run time, it might be set *after* some managed strings have
    been created.  This would seem to imply that all functions should run this
    additional check every time if there is an attempt to enforce this.  This
    would make things tremendously slow.  On the other hand, if it is assumed that
    only creates and other operations that take char *'s as input need be checked
    because the charset was only supposed to be set once, before any other
    managed string was created, then one can see that it's easy to cover
    Bstrlib with equivalent functionality via a few wrapper calls such as the
    example given above.
    
    And finally we have to question the value of sanitization in the first
    place.  For example, for httpd servers, there is generally a requirement
    that the URLs parsed have some form that avoids undesirable translation to
    local file system filenames or resources.  The problem is that, because of
    the way URLs can be encoded, a URL must be completely parsed and translated
    to know if it is using certain invalid character combinations.  That is to
    say, merely filtering each character one at a time is not necessarily the
    right way to ensure that a string has safe contents.
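    
    As an illustration, the following sketch in plain C shows an encoded URL
    path that passes a per-character whitelist and yet decodes to a directory
    traversal.  All function names here are hypothetical, the whitelist is an
    arbitrary choice, and the decoder handles only simple %XX escapes; it is
    not meant as a complete URL parser:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hex_value (int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Returns 1 if every character of s appears in allowed, otherwise 0. */
static int passes_whitelist (const char * s, const char * allowed) {
    assert (s != NULL && allowed != NULL);
    return s[strspn (s, allowed)] == '\0';
}

/* Decode %XX escapes from src into dst (capacity dstlen).  Returns 0 on
   success, -1 on a malformed escape or if dst is too small. */
static int percent_decode (char * dst, size_t dstlen, const char * src) {
    size_t o = 0;
    assert (dst != NULL && src != NULL);
    while (*src != '\0') {
        int c = (unsigned char) *src++;
        if (c == '%') {
            int hi = hex_value ((unsigned char) src[0]);
            int lo = (hi < 0) ? -1 : hex_value ((unsigned char) src[1]);
            if (lo < 0) return -1;
            c = hi * 16 + lo;
            src += 2;
        }
        if (o + 1 >= dstlen) return -1;
        dst[o++] = (char) c;
    }
    if (dstlen == 0) return -1;
    dst[o] = '\0';
    return 0;
}

/* Returns 1 if the (already decoded) path contains "..", otherwise 0. */
static int has_dotdot (const char * path) {
    assert (path != NULL);
    return strstr (path, "..") != NULL;
}
```

    The string "%2e%2e%2fetc%2fpasswd" consists only of lower case letters,
    digits and '%', so a character whitelist of that kind accepts it; only
    after decoding does the "../etc/passwd" content become visible to a
    check such as has_dotdot().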
    
    In the article that describes this proposal, it is claimed that it fairly
    closely approximates the existing C API semantics.  On this point we should
    compare this "closeness" with Bstrlib:
    
                          Bstrlib                     Managed String Library
                          -------                     ----------------------
    
    Pointer arithmetic    Segment arithmetic          N/A
    
    Use in C Std lib      ->data, or bdata{e}         getstr_m(x,*) ... free(x)
    
    String literals       bsStatic, bsStaticBlk       strcreate_m()
    
    Transparency          Complete                    None
    
    It's pretty clear that the semantic mapping from C strings to Bstrlib is fairly
    straightforward, and that in general semantic capabilities are the same or
    superior in Bstrlib.  On the other hand the Managed String Library is either
    missing semantics or changes things fairly significantly.
    
    Comparison with Annexia's c2lib library
    ---------------------------------------
    
    This library is available at:
    http://www.annexia.org/freeware/c2lib
    
    1. Still based solely on char * buffers (and therefore strlen() and strcat()
       are still O(n), and there are no faster streq() comparison functions.)
       Their suggestion that alternatives which wrap the string data type (such as
       bstring does) impose a difficulty in interoperating with the C language's
       ordinary C string library is not founded.
    2. Introduction of memory (and vector?) abstractions imposes a learning
       curve, and some kind of memory usage policy that is outside of the strings
       themselves (and therefore must be maintained by the developer.)
    3. The API is massive, and filled with all sorts of trivial (pjoin) and
       controversial (pmatch -- regular expressions are not sufficiently
       standardized, and there is a very large difference in performance between
       compiled and non-compiled REs) functions.  Bstrlib takes a decidedly
       minimal approach -- none of the functionality in c2lib is difficult or
       challenging to implement on top of Bstrlib (except the regex stuff, which
       is going to be difficult, and controversial no matter what.)
    4. Understanding why c2lib is the way it is pretty much requires a working
       knowledge of Perl.  Bstrlib requires only knowledge of the C string library
       while providing just a very select few worthwhile extras.
    5. It is attached to a lot of cruft like a matrix math library (that doesn't
       include any functions for getting the determinant, eigenvectors,
       eigenvalues, the matrix inverse, test for singularity, test for
       orthogonality, a Gram-Schmidt orthogonalization, LU decomposition ... I
       mean why bother?)
    
    Convincing a development house to use c2lib is likely quite difficult.  It
    introduces too much, while not being part of any kind of standards body.  The
    code must therefore be trusted, or maintained by those that use it.  While
    bstring offers nothing more on this front, since it's so much smaller, covers
    far less in terms of scope, and will typically improve string performance,
    the barrier to usage should be much smaller.
    
    Comparison with stralloc/qmail
    ------------------------------
    
    More information about this library can be found here:
    http://www.canonical.org/~kragen/stralloc.html or here:
    http://cr.yp.to/lib/stralloc.html
    
    1. Library is very very minimal.  A little too minimal.
    2. Untargeted source parameters are not declared const.
    3. Slightly different expected emphasis (like _cats function which takes an
       ordinary C string char buffer as a parameter.)  It's clear that the
       remainder of the C string library is still required to perform more
       useful string operations.
    
    The struct declaration for their string header is essentially the same as that
    for bstring.  But it's clear that this was a quickly written hack whose goals
    are clearly a subset of what Bstrlib supplies.  For anyone who is served by
    stralloc, Bstrlib is a complete substitute that just adds more functionality.
    
    stralloc actually uses the interesting policy that a NULL data pointer
    indicates an empty string.  In this way, non-static empty strings can be
    declared without construction.  This advantage is minimal, since static empty
    bstrings can be declared inline without construction, and if the string needs
    to be written to it should be constructed from an empty string (or its first
    initializer) in any event.
    
    wxString class
    --------------
    
    This is the string class used in the wxWindows project.  A description of
    wxString can be found here:
    http://www.wxwindows.org/manuals/2.4.2/wx368.htm#wxstring
    
    This C++ library is similar to CBString.  However, it is littered with
    trivial functions (IsAscii, UpperCase, RemoveLast etc.)
    
    1. There is no C implementation.
    2. The memory management strategy is to allocate a bounded fixed amount of
       additional space on each resize, meaning that it does not have the
       log_2(n) property that Bstrlib has (it will thrash very easily, cause
       massive fragmentation in common heap implementations, and can easily be a
       common source of performance problems).
    3. The library uses a "copy on write" strategy, meaning that it has to deal
       with multithreading problems.
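    
    The difference between the two resize policies in point 2 can be estimated
    with a small simulation in plain C.  The function name, the starting
    capacity and the increment used below are assumptions chosen for
    illustration; they are not taken from wxString or from Bstrlib:

```c
#include <assert.h>
#include <stddef.h>

/* Count the resizes needed for a buffer of capacity `start` to reach at
   least `target` bytes.  If increment > 0, each resize adds that fixed
   number of bytes; if increment == 0, each resize doubles the capacity. */
static unsigned long count_resizes (size_t start, size_t target,
                                    size_t increment) {
    size_t cap = start;
    unsigned long resizes = 0;
    assert (start > 0);   /* doubling a zero capacity would never grow */
    while (cap < target) {
        cap = (increment > 0) ? cap + increment : cap * 2;
        resizes++;
    }
    return resizes;
}
```

    Growing from 16 bytes to 1048576 bytes takes 16 resizes with doubling,
    but 65535 resizes with a fixed 16 byte increment.  Since each resize may
    copy the whole buffer, the fixed increment policy copies a quadratic
    number of bytes overall, while doubling keeps the total linear.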
    
    Vstr
    ----
    
    This is a highly orthogonal C string library with an emphasis on
    networking/realtime programming.  It can be found here:
    http://www.and.org/vstr/
    
    1. The convoluted internal structure does not contain a '\0' char * compatible
       buffer, so interoperability with the C library is a non-starter.
    2. The API and implementation are very large (owing to its orthogonality) and
       can lead to difficulty in understanding its exact functionality.
    3. An obvious dependency on GNU tools (confusing make configure step).
    4. Uses a reference counting system, meaning that it is not likely to be
       thread safe.
    
    The implementation has an extreme emphasis on performance for nontrivial
    actions (adds, inserts and deletes are all constant or roughly O(#operations)
    time) following the "zero copy" principle.  This trades off performance of
    trivial functions (character access, char buffer access/coercion, alias
    detection) which become significantly slower, as well as incremental
    accumulative costs for its searching/parsing functions.  Whether or not Vstr
    wins any particular performance benchmark will depend a lot on the benchmark,
    but it should handily win on some, while losing dreadfully on others.
    
    The learning curve for Vstr is very steep, and it doesn't come with any
    obvious way to build for Windows or other platforms without GNU tools.  At
    least one mechanism (the iterator) introduces a new undefined scenario
    (writing to a Vstr while iterating through it.)  Vstr has a very large
    footprint, and is very ambitious in its total functionality.  Vstr has no C++
    API.
    
    Vstr usage requires context initialization via vstr_init() which must be run
    in a thread-local context.  Given the totally reference based architecture
    this means that sharing Vstrings across threads is not well defined, or at
    least not safe from race conditions.  This API is clearly geared to the older
    standard of fork() style multitasking in UNIX, and is not safely transportable
    to modern shared memory multithreading available in Linux and Windows.  There
    is no portable external solution making the library thread safe (since it
    requires a mutex around each Vstr context -- not each string.)
    
    In the documentation for this library, a big deal is made of its self hosted
    s(n)printf-like function.  This is an issue for older compilers that don't
    include vsnprintf(), but also an issue because Vstr has a slow conversion to
    '\0' terminated char * mechanism.  That is to say, using "%s" to format data
    that originates from Vstr would be slow without some sort of native function
    to do so.  Bstrlib sidesteps the issue by relying on what snprintf-like
    functionality does exist and having a high performance conversion to a char *
    compatible string so that "%s" can be used directly.
    
    Str Library
    -----------
    
    This is a fairly extensive string library that includes full Unicode support
    and is targeted at the goal of outperforming MFC and STL.  The architecture,
    similarly to MFC's CStrings, is a copy on write reference counting mechanism.
    
    http://www.utilitycode.com/str/default.aspx
    
    1. Commercial.
    2. C++ only.
    
    This library, like Vstr, uses a ref counting system.  There is only so deeply
    I can analyze it, since I don't have a license for it.  However, performance
    improvements over MFC and STL don't seem like a sufficient reason to
    move your source base to it.  For example, in the future, Microsoft may
    improve the performance of CString.
    
    It should be pointed out that performance testing of Bstrlib has indicated
    that its relative performance advantage versus MFC's CString and STL's
    std::string is at least as high as that for the Str library.
    
    libmib astrings
    ---------------
    
    A handful of functional extensions to the C library that add dynamic string
    functionality.
    http://www.mibsoftware.com/libmib/astring/
    
    This package basically references strings through char ** pointers and assumes
    they are pointing to the top of an allocated heap entry (or NULL, in which
    case memory will be newly allocated from the heap.)  So it's still up to the
    user to mix and match the older C string functions with these functions
    whenever pointer arithmetic is used (i.e., there is no leveraging of the type
    system to assert semantic differences between references and base strings as
    Bstrlib does since no new types are introduced.)  Unlike Bstrlib, exact string
    length meta data is not stored, thus requiring a strlen() call on *every*
    string writing operation.  The library is very small, covering only a handful
    of C's functions.
    
    While this is better than nothing, it is clearly slower than even the
    standard C library, less safe and less functional than Bstrlib.
    
    To explain the advantage of using libmib, their website shows an example of
    how dangerous C code:
    
        char buf[256];
        char *pszExtraPath = ";/usr/local/bin";
    
        strcpy(buf,getenv("PATH")); /* oops! could overrun! */
        strcat(buf,pszExtraPath); /* Could overrun as well! */
    
        printf("Checking...%s\n",buf); /* Some printfs overrun too! */
    
    is avoided using libmib:
    
        char *pasz = 0;      /* Must initialize to 0 */
        char *paszOut = 0;
        char *pszExtraPath = ";/usr/local/bin";
    
        if (!astrcpy(&pasz,getenv("PATH"))) /* malloc error */ exit(-1);
        if (!astrcat(&pasz,pszExtraPath)) /* malloc error */ exit(-1);
    
        /* Finally, a "limitless" printf! we can use */
        asprintf(&paszOut,"Checking...%s\n",pasz);fputs(paszOut,stdout);
    
        astrfree(&pasz); /* Can use free(pasz) also. */
        astrfree(&paszOut);
    
    However, compare this to Bstrlib:
    
        bstring b, out;
    
        bcatcstr (b = bfromcstr (getenv ("PATH")), ";/usr/local/bin");
        out = bformat ("Checking...%s\n", bdatae (b, "<Out of memory>"));
        /* if (out && b) */ fputs (bdatae (out, "<Out of memory>"), stdout);
        bdestroy (b);
        bdestroy (out);
    
    Besides being shorter, we can see that error handling can be deferred right
    to the very end.  Also, unlike the above two versions, if getenv() returns
    with NULL, the Bstrlib version will not exhibit undefined behavior.
    Initialization starts with the relevant content rather than an extra
    autoinitialization step.
    
    libclc
    ------
    
    An attempt to add to the standard C library with a number of common useful
    functions, including additional string functions.
    http://libclc.sourceforge.net/
    
    1. Uses standard char * buffer, and adopts C 99's usage of "restrict" to pass
       the responsibility to guard against aliasing to the programmer.
    2. Adds no safety or memory management whatsoever.
    3. Most of the supplied string functions are completely trivial.
    
    The goals of libclc and Bstrlib are clearly quite different.
    
    fireString
    ----------
    
    http://firestuff.org/
    
    1. Uses standard char * buffer, and adopts C 99's usage of "restrict" to pass
       the responsibility to guard against aliasing to the programmer.
    2. Mixes char * and length wrapped buffers (estr) functions, doubling the API
       size, with safety limited to only half of the functions.
    
    Firestring was originally just a wrapper of char * functionality with extra
    length parameters.  However, it has been augmented with the inclusion of the
    estr type which has similar functionality to stralloc.  But firestring does
    not nearly cover the functional scope of Bstrlib.
    
    Safe C String Library
    ---------------------
    
    A library written for the purpose of increasing safety and power to C's string
    handling capabilities.
    http://www.zork.org/safestr/safestr.html
    
    1. While the safestr_* functions are safe in and of themselves, interoperating
       with char * strings has dangerous unsafe modes of operation.
    2. The architecture of safestr causes the base pointer to change.  Thus,
       it's not practical/safe to store a safestr in multiple locations if any
       single instance can be manipulated.
    3. Dependent on an additional error handling library.
    4. Uses reference counting, meaning that it is either not thread safe or
       slow and not portable.
    
    I think the idea of reallocating (and hence potentially changing) the base
    pointer is a serious design flaw that is fatal to this architecture.  True
    safety is obtained by having automatic handling of all common scenarios
    without creating implicit constraints on the user.
    
    Because of its automatic temporary clean up system, it cannot use "const"
    semantics on input arguments.  Interesting anomalies such as:
    
        safestr_t s, t;
        s = safestr_replace (t = SAFESTR_TEMP ("This is a test"),
                             SAFESTR_TEMP (" "), SAFESTR_TEMP ("."));
        /* t is now undefined. */
    
    are possible.  If one defines a function which takes a safestr_t as a
    parameter, then the function would not know whether or not the safestr_t is
    defined after it passes it to a safestr library function.  The author's
    recommended method for working around this problem is to examine the
    attributes of the safestr_t within the function which is to modify any of
    its parameters and play games with its reference count.  I think, therefore,
    that the whole SAFESTR_TEMP idea is also fatally broken.
    
    The library implements immutability, optional non-resizability, and a "trust"
    flag.  This trust flag is interesting, and suggests that applying any
    arbitrary sequence of safestr_* function calls on any set of trusted strings
    will result in a trusted string.  It seems to me, however, that if one wanted
    to implement a trusted string semantic, one might do so by actually creating
    a different *type* and only implement the subset of string functions that are
    deemed safe (i.e., user input would be excluded, for example.)  This, in
    essence, would allow the compiler to enforce trust propagation at compile
    time rather than run time.  Non-resizability is also interesting, however,
    it seems marginal (i.e., to want a string that cannot be resized, yet can be
    modified and yet where a fixed sized buffer is undesirable.)
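    
    A rough sketch in plain C of the "different type" idea follows.  The type
    and function names are hypothetical, and the validation rule (lower case
    letters only) is an arbitrary stand-in for whatever a real application
    would consider trustworthy:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two wrappers around a C string that the compiler treats as distinct. */
struct untrusted_str { const char * s; };
struct trusted_str   { const char * s; };

/* External input always enters the program as untrusted. */
static struct untrusted_str untrusted_from_input (const char * input) {
    struct untrusted_str u;
    u.s = input;
    return u;
}

/* The only path from untrusted to trusted is an explicit validation.
   Returns 1 and fills *out on success, 0 if the input is rejected. */
static int validate (struct untrusted_str in, struct trusted_str * out) {
    assert (out != NULL);
    if (in.s == NULL) return 0;
    if (in.s[strspn (in.s, "abcdefghijklmnopqrstuvwxyz")] != '\0') return 0;
    out->s = in.s;
    return 1;
}

/* Accepts trusted strings only.  Passing a struct untrusted_str here is
   rejected by the compiler as an incompatible argument type. */
static size_t use_trusted (struct trusted_str t) {
    return strlen (t.s);
}
```

    With this arrangement there is no trust flag to consult at run time; a
    call such as use_trusted (untrusted_from_input (line)) simply fails to
    compile, and only functions written against struct trusted_str can
    produce or consume trusted data.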
    
    Libsrt
    ------
    
    This is a length based string library based on a slightly different strategy.
    The string contents are appended to the end of the header directly so strings
    only require a single allocation.  However, whenever a reallocation occurs,
    the header is replicated and the base pointer for the string is changed.
    That means references to the string are only valid so long as they are not
    resized after any such reference is cached.  The internal structure maintains
    some state used to accelerate Unicode manipulation.  This makes
    sustainable usage of the library essentially opaque.  This also creates a
    bottleneck for whatever extensions to the library one desires (write all
    extensions on top of the base library, put in a request to the author, or
    dedicate an expert to learn the internals of the library).  The library is
    committed to Unicode representation of its string data, and therefore cannot
    be used as a generic buffer library.
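    
    The base pointer issue can be sketched in plain C as follows.  Both
    structures and both functions are hypothetical and simplified; the first
    mimics the single allocation layout described above (it is not Libsrt's
    actual code), the second a header with a separate data pointer in the
    manner of a bstring:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Single allocation: header and contents live in one block. */
struct inl_str {
    size_t len;
    size_t cap;
    char   data[];   /* C99 flexible array member */
};

/* Append src.  Returns the string, which may have MOVED, or NULL on
   allocation failure (in which case s itself is still valid). */
static struct inl_str * inl_append (struct inl_str * s, const char * src) {
    size_t n, len;
    assert (src != NULL);
    n = strlen (src);
    len = (s != NULL) ? s->len : 0;
    if (s == NULL || len + n + 1 > s->cap) {
        size_t cap = 2 * (len + n + 1);
        struct inl_str * t = realloc (s, sizeof *t + cap);
        if (t == NULL) return NULL;
        t->cap = cap;
        t->len = len;
        s = t;   /* any other copy of the old pointer is now dangling */
    }
    memcpy (s->data + s->len, src, n + 1);
    s->len += n;
    return s;
}

/* Stable header: only ->data moves, so pointers to the header stay valid. */
struct hdl_str {
    size_t len;
    size_t cap;
    char * data;
};

/* Append src.  Returns 0 on success, -1 on allocation failure. */
static int hdl_append (struct hdl_str * s, const char * src) {
    size_t n;
    assert (s != NULL && src != NULL);
    n = strlen (src);
    if (s->len + n + 1 > s->cap) {
        size_t cap = 2 * (s->len + n + 1);
        char * p = realloc (s->data, cap);
        if (p == NULL) return -1;
        s->data = p;
        s->cap = cap;
    }
    memcpy (s->data + s->len, src, n + 1);
    s->len += n;
    return 0;
}
```

    With the first layout every holder of the string must be updated with the
    return value of inl_append().  With the second, any number of places may
    keep a struct hdl_str * across appends, at the price of a second
    allocation for the contents.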
    
    SDS
    ---
    
    Sds uses a strategy very similar to Libsrt.  However, it uses some dynamic
    headers to decrease the overhead for very small strings.  This requires an
    extra switch statement for access to each string attribute.  The source code
    appears to use gcc/clang extensions, and thus it is not portable.
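    
    The cost of a dynamic header can be seen in a simplified sketch.  The
    layout below, with just two header sizes and a kind byte stored directly
    before the contents, is a hypothetical illustration in portable C and is
    not Sds's actual layout or API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { HDR_SMALL = 0, HDR_LARGE = 1 };   /* hypothetical header kinds */

/* Layout: [length field][1 kind byte][contents ...]['\0']
   HDR_SMALL has a 1 byte length field, HDR_LARGE a 4 byte one.
   The caller holds a pointer to the contents, not to the header. */
static char * dyn_new (const char * src) {
    size_t n, hdr;
    unsigned char * base;
    assert (src != NULL);
    n = strlen (src);
    if (n > 0xFFFFFFFFu) return NULL;
    hdr = (n < 256) ? 1 : 4;
    base = malloc (hdr + 1 + n + 1);
    if (base == NULL) return NULL;
    if (hdr == 1) {
        base[0] = (unsigned char) n;
        base[1] = HDR_SMALL;
    } else {
        uint32_t len32 = (uint32_t) n;
        memcpy (base, &len32, sizeof len32);   /* avoids unaligned stores */
        base[4] = HDR_LARGE;
    }
    memcpy (base + hdr + 1, src, n + 1);
    return (char *) base + hdr + 1;
}

/* Every attribute access must first decode which header is in use. */
static size_t dyn_len (const char * s) {
    const unsigned char * p = (const unsigned char *) s;
    switch (p[-1]) {
        case HDR_SMALL:
            return p[-2];
        case HDR_LARGE: {
            uint32_t len32;
            memcpy (&len32, p - 5, sizeof len32);
            return len32;
        }
    }
    return 0;   /* unknown header kind */
}

/* Release a string obtained from dyn_new(). */
static void dyn_free (char * s) {
    size_t hdr;
    if (s == NULL) return;
    hdr = (((unsigned char *) s)[-1] == HDR_SMALL) ? 1 : 4;
    free (s - hdr - 1);
}
```

    A string shorter than 256 bytes pays only 2 bytes of header here, but
    dyn_len() needs a branch on every call, whereas reading the slen field of
    a bstring is a single load.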
    
    ===============================================================================
    
    Examples
    --------
    
        Dumping a line numbered file:
    
        FILE * fp;
        int i, ret;
        struct bstrList * lines;
        struct tagbstring prefix = bsStatic ("-> ");
    
        if (NULL != (fp = fopen ("bstrlib.txt", "rb"))) {
            bstring b = bread ((bNread) fread, fp);
            fclose (fp);
            if (NULL != (lines = bsplit (b, '\n'))) {
                for (i=0; i < lines->qty; i++) {
                    binsert (lines->entry[i], 0, &prefix, '?');
                    printf ("%04d: %s\n", i, bdatae (lines->entry[i], "NULL"));
                }
                bstrListDestroy (lines);
            }
            bdestroy (b);
        }
    
    For numerous other examples, see bstraux.c, bstraux.h and the example archive.
    
    ===============================================================================
    
    License
    -------
    
    The Better String Library is available under either the BSD license (see the
    accompanying license.txt) or the GNU General Public License version 2 (see the
    accompanying gpl.txt) at the option of the user.
    
    ===============================================================================
    
    Acknowledgements
    ----------------
    
    The following individuals have made significant contributions to the design
    and testing of the Better String Library:
    
    Bjorn Augestad
    Clint Olsen
    Darryl Bleau
    Fabian Cenedese
    Graham Wideman
    Ignacio Burgueno
    International Business Machines Corporation
    Ira Mica
    John Kortink
    Manuel Woelker
    Marcel van Kervinck
    Michael Hsieh
    Mike Steinert
    Richard A. Smith
    Simon Ekstrom
    Wayne Scott
    Zed A. Shaw
    
    ===============================================================================