Buffer Object (BO)

A buffer object is a convenient container to store arbitrary data. It's also possible to use them to store text if desired, in which case using UTF-8 is recommended.

Types

There are multiple BO types. Every BO also contains a type field indicating the type of the object. Here, the type constants are referred to as dynamic or runtime types, and the C types as static types. BO types may refer to either. The relation between static and dynamic types can be found in the table below.

Pointers to dpa_u_bo_*_t* types vs. dpa_u_any_bo_*_t* types

The dpa_u_any_bo_*_t* types are const opaque types. They are to be passed to functions expecting a variant or derived type of the corresponding dpa_u_bo_*_t type. Pointers to dpa_u_bo_*_t types must never point to a derived type, nor, in the case of a variant type, to a type that the variant cannot contain. That way, a = *b is a safe operation for dpa_u_bo_*_t pointers. The BO objects contain a type field, and this rule ensures that the type field of a base type is never set to the type constant of a derived type.

Use cases

For different use-cases, there exist different types of buffer objects.

The dpa_u_bo_simple_ro_t is useful in cases where only a simple buffer is needed, and there are no additional needs. It can also be used to force a buffer's data to be copied in functions which need to store a reference to the BO.

The dpa_u_bo_unique_t is useful where comparing 2 BOs and/or getting their hash should be fast, that is, in O(1). It is also very memory efficient. However, there is an initial overhead for creating such a BO, which can only be done using the dpa_u_bo_intern function. This type may store either a dpa_u_bo_inline_t or a dpa_u_bo_unique_hashmap_t, but those two types are not intended to be used directly. While dpa_u_bo_inline_t can only store data up to a certain size, dpa_u_bo_unique_hashmap_t will only store data bigger than that. dpa_u_bo_inline_t takes advantage of automatic storage duration and data copying, while dpa_u_bo_unique_hashmap_t is immutable and reference counted. For simplicity, all these types can be passed to dpa_u_bo_ref and dpa_u_bo_put, although for dpa_u_bo_inline_t this is simply a no-op.
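The idea behind interning can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the names demo_entry and demo_intern are hypothetical, a linked list stands in for the hash map, and nothing is reference counted or freed. It only shows why two interned buffers with equal content can be compared by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; not the library's dpa_u_bo_unique_hashmap_t. */
struct demo_entry {
  size_t size;
  struct demo_entry* next;
  char data[]; /* the bytes are stored together with the entry */
};

/* A single list instead of a real hash map, to keep the sketch short. */
static struct demo_entry* demo_table;

/* Return the one shared entry for this content, creating it if needed. */
static const struct demo_entry* demo_intern(const void* data, size_t size){
  for(struct demo_entry* e = demo_table; e; e = e->next)
    if(e->size == size && memcmp(e->data, data, size) == 0)
      return e; /* already known: hand out the existing entry */
  struct demo_entry* e = malloc(sizeof(*e) + size);
  if(!e)
    return NULL;
  e->size = size;
  memcpy(e->data, data, size);
  e->next = demo_table;
  demo_table = e;
  return e;
}
```

Creating an entry costs a lookup and possibly an allocation, but afterwards equality is a single pointer comparison, which matches the trade-off described above.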

The hashed and reference counted BOs can, as their names imply, hold a hash or a reference count respectively. A hashed type contains the hash in the BO object.
A reference count, on the other hand, will match the lifetime of the data a BO points to, and it is recommended to allocate the reference count together with the data whenever possible. More specifically, a dpa_u_refcount_freeable_t or a type derived from it is used, which allows automatically freeing the BO when it's no longer referenced. It's also possible to execute a destructor callback at the same time if needed. The reference count types serve an additional purpose: when they contain a static reference count, that is a hint that the BO's data will never be freed, which lets functions know that they do not need to copy string literals and similar data.
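The behaviour described above can be sketched with a small stand-in type. The struct demo_refcount and the functions demo_ref and demo_put are hypothetical and only model the concept; they are not the definitions of dpa_u_refcount_freeable_t or its functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a freeable reference count. */
struct demo_refcount {
  size_t count;                           /* number of live references */
  bool is_static;                         /* true: the data is never freed */
  void (*destroy)(struct demo_refcount*); /* optional destructor, may be NULL */
};

/* Take a reference. A static count is left untouched. */
static void demo_ref(struct demo_refcount* rc){
  if(!rc->is_static)
    rc->count++;
}

/* Release a reference. Returns true if this was the last one. */
static bool demo_put(struct demo_refcount* rc){
  if(rc->is_static)
    return false; /* a static count never reaches 0 */
  assert(rc->count > 0);
  if(--rc->count)
    return false;
  if(rc->destroy)
    rc->destroy(rc); /* would free the data allocated with the count */
  return true;
}
```

In this model, a function that sees is_static set knows the data outlives every caller, so it could keep the pointer without copying.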

The dpa_u_any_bo_*_t* types may point to a derived type of their corresponding type. The dpa_u_any_bo_ro_t* type is notable, because it may point to any type of BO. The only special case is DPA_U_BO_UNIQUE_HASHMAP. Because dpa_u_bo_unique_hashmap_t is internally a pointer to an entry in a hash map, the any types always point to that entry directly, rather than to an enclosing variant type. Such an any pointer therefore can't refer to the original variant BO even if it was derived from one, which may affect its lifetime. However, this is usually not necessary anyway, and it avoids an unnecessary dereference step. It is also still possible to create a new, temporary variant BO from an any pointer containing a DPA_U_BO_UNIQUE_HASHMAP.

If a pointer to a dpa_u_bo_*_t is passed to a function, that usually means the function is used to return a BO of that type, or to modify an existing BO.

Because there are limitations in what can be done using inheritance and association, which limit what the dpa_u_any_bo_*_t* types can be used for, the trait types exist. These are opaque types that can point to a BO with certain traits. For example, dpa_u_bo_with_hash_ro_t* can refer to any BO type which may contain a hash, which dpa_u_any_bo_hashed_ro_t* can't do, because dpa_u_any_bo_refcounted_hashed_ro_t* can't inherit from dpa_u_any_bo_refcounted_ro_t* and dpa_u_any_bo_hashed_ro_t* at the same time.
The dpa_u_bo_gc_ro_t type was specifically added to cover both reference counted and inline BOs. This is useful because a unique BO could contain an inline BO.

Readonly (ro), immutability, and non-readonly types

Technically, all BO types have a readonly (ro) version. When that type is used, the data the BO contains is const qualified and will only be accessed for reading through that BO, but if there is a pointer to the data, that pointer is not const qualified. The dpa_u_bo_inline_ro_t is special. It's just the const version of the dpa_u_bo_inline_t type, rather than a distinct type, because the BO object itself contains its data directly. Also, even ro types that may contain a dpa_u_bo_inline_t won't use dpa_u_bo_inline_ro_t, because then they would not be assignable anymore. The const version of the ro buffer, however, will contain the const version of dpa_u_bo_inline_t, which is a dpa_u_bo_inline_ro_t. Also, dpa_u_bo_unique_hashmap_t does not end in _ro_t, but it is a readonly type.

Some types do not have a non-readonly variant. These types are not only readonly, but immutable. This includes the dpa_u_bo_unique_hashmap_t and the dpa_u_bo_refcounted_hashed_ro_t type. The dpa_u_bo_inline_ro_t does have a non-ro variant, but it can still be considered immutable. The dpa_u_bo_refcounted_hashed_ro_t has no non-readonly type because changing the data of a hashed BO will lead to it having an invalid hash, and a reference count only makes sense with multiple instances of the BO. In other words, if such a BO existed, it would only end up with a wrong hash eventually. For simplicity's sake, functions of this library assume dpa_u_bo_refcounted_ro_t to be immutable as well, although this doesn't necessarily have to be true. Be careful when converting a dpa_u_bo_refcounted_t to a dpa_u_bo_refcounted_ro_t. If the BO may change in the future, it's recommended to first convert it to a dpa_u_bo_simple_ro_t, or to copy it directly, before passing it to functions which may want to keep a reference to an immutable BO. For this, there is also the conversion macro dpa_u_bo_maybe_not_immutable, which can handle this for any BO type.

All static & runtime types

Base types: They all have the same small size.
Base type variants: Can store various base types and have the same size.
Derived types: Contain additional useful properties.
Any types: Opaque types, must be used as a pointer. They may point to the corresponding base type itself, but can also be thought of as pointing to the variants which can store it. They can also point to the types derived from the base type.
Trait types: Opaque types, must be used as a pointer. Point to types with specific properties.
Runtime types: DPA_U_BO_INLINE, DPA_U_BO_UNIQUE_HASHMAP, DPA_U_BO_SIMPLE, DPA_U_BO_HASHED, DPA_U_BO_REFCOUNTED, DPA_U_BO_REFCOUNTED_HASHED

Base types:
dpa_u_bo_inline_t
dpa_u_bo_unique_hashmap_t
dpa_u_bo_simple_ro_t
dpa_u_bo_simple_t

Base type variants:
dpa_u_bo_unique_t
dpa_u_bo_ro_t
dpa_u_bo_t

Derived types:
dpa_u_bo_hashed_ro_t
dpa_u_bo_hashed_t
dpa_u_bo_refcounted_ro_t
dpa_u_bo_refcounted_t
dpa_u_bo_refcounted_hashed_ro_t

Any types:
dpa_u_any_bo_inline_t*
dpa_u_any_bo_unique_hashmap_t*
dpa_u_any_bo_simple_ro_t*
dpa_u_any_bo_simple_t*
dpa_u_any_bo_unique_t*
dpa_u_any_bo_ro_t*
dpa_u_any_bo_t*
dpa_u_any_bo_hashed_ro_t*
dpa_u_any_bo_hashed_t*
dpa_u_any_bo_refcounted_ro_t*
dpa_u_any_bo_refcounted_t*
dpa_u_any_bo_refcounted_hashed_ro_t*

Trait types:
dpa_u_bo_with_hash_ro_t*
dpa_u_bo_with_hash_t*
dpa_u_bo_gc_ro_t*
dpa_u_bo_with_refcount_ro_t*
dpa_u_bo_with_refcount_t*
dpa_u_bo_with_refcount_and_hash_ro_t*

Layout of Types

The ro and non-ro types have the same layout.

For accessing properties of an existing object, using the macros is always recommended.

The layout of variant types is unspecified and subject to change. Use the type conversion macros to create them. The same applies to the opaque types and the dpa_u_bo_unique_hashmap_t type.

dpa_u_bo_inline_t

Type | Name | Description
unsigned : 4 | type | DPA_U_BO_INLINE
size : 4 | size | The size of the BO. Currently, dpa_u_bo_inline_t is assumed to never be bigger than 16 bytes, but it may be smaller than that.
char[DPA_U_BO_INLINE_MAX_SIZE] | data | An array which contains the data. One byte less than the size of dpa_u_bo_inline_t
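As an illustration of this layout, here is a minimal sketch of a struct that keeps a 4 bit type tag, a 4 bit size and the data bytes in one object. The names demo_bo_inline and demo_inline_make are hypothetical, and the maximum size of 15 is an assumed value, since the real limit is platform dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_INLINE_MAX_SIZE = 15 }; /* assumed value for this sketch */

/* Hypothetical mirror of the table above; not the library's definition. */
struct demo_bo_inline {
  unsigned type : 4;               /* runtime type tag */
  unsigned size : 4;               /* 0..15 fits into 4 bits */
  char data[DEMO_INLINE_MAX_SIZE]; /* the bytes live inside the object */
};

/* Build an inline BO by copying len bytes. The caller must check the length. */
static struct demo_bo_inline demo_inline_make(const void* src, size_t len){
  assert(len <= DEMO_INLINE_MAX_SIZE);
  struct demo_bo_inline bo = { .type = 1, .size = (unsigned)len }; /* 1: value of DPA_U_BO_INLINE */
  memcpy(bo.data, src, len);
  return bo;
}
```

Because the data is part of the object, assigning one such struct to another copies the bytes as well, so the copy does not depend on the lifetime of the original.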

dpa_u_bo_simple_t / dpa_u_bo_simple_ro_t

Type | Name | Description
unsigned : 4 | type | DPA_U_BO_SIMPLE
size_t : sizeof(size_t)-1 | size | The size of the BO.
void* / const void* | data | A pointer to the data

dpa_u_bo_hashed_t / dpa_u_bo_hashed_ro_t

Type | Name | Description
dpa_u_bo_simple_t / dpa_u_bo_simple_ro_t | bo_simple | See dpa_u_bo_simple_t
dpa_u_hash_t | hash | From this field and the data, the hash can be derived in O(1)

dpa_u_bo_refcounted_t / dpa_u_bo_refcounted_ro_t

Type | Name | Description
dpa_u_bo_simple_t / dpa_u_bo_simple_ro_t | bo_simple | See dpa_u_bo_simple_t
dpa_u_refcount_freeable_t* | refcount | A pointer to the reference counter

dpa_u_bo_refcounted_hashed_ro_t

Type | Name | Description
dpa_u_bo_refcounted_ro_t | bo_refcounted | See dpa_u_bo_refcounted_ro_t
dpa_u_hash_t | hash | From this field and the data, the hash can be derived in O(1)

dpa_u_bo_unique_hashmap_stats_t

See also dpa_u_bo_unique_hashmap_stats.

Type | Name | Description
size_t | empty_count | How many buckets of the hash map are unused
size_t | collision_count | If an entry is added to a bucket that is already occupied, that is a collision. This field indicates how often that happened.
size_t | total_buckets | The total amount of buckets that currently exist
size_t | entry_count | The total amount of entries
double | load_factor | entry_count divided by total_buckets

Functions

All the functions listed here are actually macros. The real functions are suffixed with _p. Macros which use generics also have a version suffixed with _g, which allows nesting generics.
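The general pattern of a macro that dispatches to real functions with a _p suffix can be sketched with a C11 generic selection. The types and names below are hypothetical and only show the pattern; how the library's _g versions differ is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for two BO types. */
struct demo_simple { size_t size; const void* data; };
struct demo_inline { unsigned char size; char data[15]; };

/* The real functions carry a _p suffix, following the naming described above. */
static size_t demo_get_size_simple_p(const struct demo_simple* bo){ return bo->size; }
static size_t demo_get_size_inline_p(const struct demo_inline* bo){ return bo->size; }

/* The user-facing name is a macro that selects the function by static type. */
#define demo_get_size(X) _Generic((X), \
    struct demo_simple*: demo_get_size_simple_p, \
    const struct demo_simple*: demo_get_size_simple_p, \
    struct demo_inline*: demo_get_size_inline_p, \
    const struct demo_inline*: demo_get_size_inline_p \
  )(X)
```

Since the selection happens at compile time, passing a pointer to an unsupported type is rejected by the compiler instead of failing at runtime.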

Property access functions

dpa_u_bo_get_type

Returns the type of a BO.

This macro will not check the type fields for types where only 1 type is possible, it'll just return that type constant. The type returned is always an enum, but which enum depends on the BO in question. This is so that when it's used in a switch, the compiler only warns about the possible type values if they have no case. It's always safe to cast to enum dpa_u_bo_any_type, and it's recommended to always use the constants in that enum.

dpa_u_bo_data

Get the data property of the BO, as a pointer. If, for the given BO type, this is always a pointer, then this will be an lvalue, and you can assign a value to it.

dpa_u_bo_set_size(bo, size)

Set the size of a BO. For some BO types, such as the immutable dpa_u_bo_unique_hashmap_t, the size cannot be set. Static types which could refer to such a type can't have the size set directly either, even if their dynamic type is one where the size can be set. In that case, the BO first needs to be converted to a compatible mutable type.

dpa_u_bo_get_size

Get the size of a BO.

dpa_u_bo_get_refcount

Returns a pointer to the reference count, which has the type dpa_u_refcount_freeable_t. Takes only BOs which may have a reference count. If the BO turns out not to have one after all, returns 0.

For incrementing / decrementing the reference count, it's recommended to just use dpa_u_bo_ref and dpa_u_bo_put directly, instead of first getting the reference count and then using dpa_u_refcount_ref and dpa_u_refcount_put on it.

dpa_u_bo_set_refcount

Sets the pointer to the reference count.

Conversion Macros

dpa_u_bo_maybe_not_immutable / dpa_u_bo_mni

If a BO has a dynamic type of DPA_U_BO_REFCOUNTED, it'll be changed to DPA_U_BO_SIMPLE. The static type may also change if the old type cannot hold that dynamic type. This is useful to ensure that a function which needs to store a reference to a buffer for longer than its own runtime has to make a copy.
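The effect on the dynamic type can be sketched as follows. The struct demo_bo and the function demo_maybe_not_immutable are hypothetical models, with enum values taken from the constants listed in this document. The real macro also adjusts the static type, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the type constants listed in this document. */
enum demo_type { DEMO_SIMPLE = 3, DEMO_REFCOUNTED = 5, DEMO_REFCOUNTED_HASHED = 6 };

/* Hypothetical BO model with a runtime type tag. */
struct demo_bo {
  enum demo_type type;
  size_t size;
  const void* data;
  void* refcount; /* only meaningful for the reference counted types */
};

/* Demote a possibly mutable refcounted BO to a simple one. */
static struct demo_bo demo_maybe_not_immutable(struct demo_bo bo){
  if(bo.type == DEMO_REFCOUNTED){
    bo.type = DEMO_SIMPLE; /* the receiver can no longer just take a reference */
    bo.refcount = NULL;
  }
  return bo;
}
```

A receiver that only sees a simple BO has no reference count to hold on to, so it has to copy the data if it wants to keep it.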

dpa_u_bo_intern

Interns a BO. Returns a dpa_u_bo_unique_t. This function needs to compare strings internally, and may resize & reorganize internal data structures. Reference counted BOs will not be copied, instead, their reference count is incremented.

dpa_u_t_bo_*

Convert the BO to the specified static BO type. Always returns an lvalue. The lifetime of the BO will be the same as the enclosing block scope; a compound literal is used to achieve this. For inline BOs, the lifetime of the data is the same as the lifetime of the BO object itself. The runtime type will change if necessary.
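The compound literal technique mentioned above can be shown in isolation. The macro demo_t_simple and the struct demo_simple are hypothetical; the sketch only demonstrates that a compound literal is an lvalue whose address can be taken and which, inside a function, lives until the end of the enclosing block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a simple BO. */
struct demo_simple { size_t size; const void* data; };

/* Expands to a compound literal: an unnamed object that is an lvalue. */
#define demo_t_simple(DATA, SIZE) \
  ((struct demo_simple){ .size = (SIZE), .data = (DATA) })

/* Any function taking a pointer can be handed the temporary directly. */
static size_t demo_size_of(const struct demo_simple* bo){ return bo->size; }
```

For example, &demo_t_simple("abc", 3) is a valid expression, and the resulting pointer may be used and written through until the enclosing block ends.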

dpa_u_tp_bo_*

Same as dpa_u_t_bo_* but returns a pointer. The pointer itself may not be an lvalue, but the type it points to will not be const, and will have the lifetime of the enclosing block scope. These macros can also be used to convert to the any bo types.

dpa_u_v_bo_*

Convert the BO to the specified static BO type if it is guaranteed to succeed and the lifetime of the data does not change. Its runtime type will change if necessary. It may or may not return the same object, and it may not be an lvalue.

dpa_u_p_bo_*

Cast the BO to a pointer to the specified static BO type if it is compatible. It does not take BO types where this is not possible. The pointer will point to the original BO object, except if the source BO is a variant type that is being converted to a dpa_u_bo_unique_hashmap_t or an opaque pointer type, because that type is already a pointer.

dpa_u_up_bo_*

Cast the BO to a pointer to the specified static BO type. For types where this is only possible for some dynamic types, it returns 0 if the cast turns out not to be possible at runtime. The pointer will point to the original BO object, except if the source BO is a variant type that is being converted to a dpa_u_bo_unique_hashmap_t or an opaque pointer type, because that type is already a pointer.
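A runtime-checked cast of this kind can be sketched with a tagged union. The types demo_variant, demo_simple and demo_inline as well as the function demo_up_simple are hypothetical; the actual layout of the library's variant types is unspecified.

```c
#include <assert.h>
#include <stddef.h>

enum demo_type { DEMO_INLINE = 1, DEMO_SIMPLE = 3 };

struct demo_simple { enum demo_type type; size_t size; const void* data; };
struct demo_inline { enum demo_type type; unsigned char size; char data[15]; };

/* A variant that can hold either base type; both start with the type tag. */
union demo_variant {
  enum demo_type type;
  struct demo_simple simple;
  struct demo_inline inline_bo;
};

/* Checked cast: succeeds only if the runtime type matches, otherwise NULL. */
static struct demo_simple* demo_up_simple(union demo_variant* v){
  return v->type == DEMO_SIMPLE ? &v->simple : NULL;
}
```

On success the returned pointer refers to the original object, so changes made through it are visible in the variant.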

dpa_u_bo_ptr

Turns a BO into its dpa_u_any_bo_* version. The pointer will point to the original BO object, except if the source BO is a variant type containing a DPA_U_BO_UNIQUE_HASHMAP, because that type contains a dpa_u_bo_unique_hashmap_t, which is already a pointer.

Other functions

dpa_u_bo_ref

Takes a reference, incrementing the BO's reference count.

For inline BOs, this is a no-op.

dpa_u_bo_put

Releases a reference, decrementing the reference count of the BO's data.

When no references are left, the BO's data is freed. Some BOs may also have their own destructor function, which will then be executed.

Some BOs may have a static reference count. Such a reference count will never hit 0, and the data will never be freed.

For inline BOs, this is a no-op.

int dpa_u_bo_compare(a,b)

Compares 2 BOs.

Comparing exclusively between BOs of the static types dpa_u_bo_unique_t, dpa_u_bo_unique_hashmap_t and dpa_u_bo_inline_t, or respectively the dynamic types DPA_U_BO_UNIQUE_HASHMAP and DPA_U_BO_INLINE, is an O(1) operation. For any other type, it's O(N), where N is the length of the data.

Returns 0 if equal, -1 or 1 otherwise. For most BOs, the size is compared first, then the content using memcmp. But for two DPA_U_BO_UNIQUE_HASHMAP BOs, only the addresses of the BO objects are compared, instead of the data, because they are unique. In that case, the order may differ from the order obtained for the same data referred to by a different BO type, in non-transitive ways. If you use this for sorting, ensure you never have both a DPA_U_BO_UNIQUE_HASHMAP BO and a BO of a different type that refers to the same data, or the sorting will not work as expected.
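The size-first ordering used for most BOs can be sketched like this. The struct demo_simple and the function demo_compare are hypothetical, and the address comparison for DPA_U_BO_UNIQUE_HASHMAP is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a simple BO. */
struct demo_simple { size_t size; const void* data; };

/* Compare the size first, and only then the content. Returns -1, 0 or 1. */
static int demo_compare(struct demo_simple a, struct demo_simple b){
  if(a.size != b.size)
    return a.size < b.size ? -1 : 1; /* a shorter buffer always sorts first */
  int r = a.size ? memcmp(a.data, b.data, a.size) : 0;
  return (r > 0) - (r < 0);          /* normalize the memcmp result */
}
```

Note that with this ordering "b" sorts before "aa", because the sizes already differ and the content is never looked at.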

int dpa_u_bo_compare_lexicographic(a,b)

Compares the content of 2 BOs.

This uses memcmp to compare the content of the 2 BOs, and then compares the size. It does not have the limitations dpa_u_bo_compare has, but is often slower.
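A content-first comparison can be sketched as follows. The struct demo_simple and the function demo_compare_lexicographic are hypothetical, and normalizing the result to -1, 0 or 1 is an assumption of this sketch rather than something stated for the library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a simple BO. */
struct demo_simple { size_t size; const void* data; };

/* Compare the common prefix first; the size only breaks ties. */
static int demo_compare_lexicographic(struct demo_simple a, struct demo_simple b){
  size_t n = a.size < b.size ? a.size : b.size;
  int r = n ? memcmp(a.data, b.data, n) : 0;
  if(r)
    return (r > 0) - (r < 0);
  return (a.size > b.size) - (a.size < b.size);
}
```

Here "aa" sorts before "b", and "ab" sorts before "abc", which is the order a dictionary would use.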

dpa_u_bo_unique_hashmap_stats_t dpa_u_bo_unique_hashmap_stats()

This function is mainly for debugging purposes. It returns statistics about the hash map storing dpa_u_bo_unique_hashmap_t entries, such as how many entries there are, how many collisions, and so on. See dpa_u_bo_unique_hashmap_stats_t for details.

void dpa_u_bo_unique_verify()

This function is for debugging purposes only. Verifies that the internal hash map storing the unique BOs is in a valid state. On success, does nothing. On error, aborts. There is pretty much never a need to use this function.

Constants

enum dpa_u_bo_any_type

Name | Value
DPA_U_BO_INLINE | 1
DPA_U_BO_UNIQUE_HASHMAP | 2
DPA_U_BO_SIMPLE | 3
DPA_U_BO_HASHED | 4
DPA_U_BO_REFCOUNTED | 5
DPA_U_BO_REFCOUNTED_HASHED | 6

Other constants

Name | Value
DPA_U_BO_INLINE_MAX_SIZE | Currently assumed to be <= 15. Platform dependent. One byte less than the size of dpa_u_bo_inline_t
dpa_u_mask_* | There is such a constant for every static BO type. It is a bitmask of all dynamic types it can contain. See the table above.
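How such a bitmask can be used is sketched below. The encoding of one bit per type constant, shifted by the constant's value, is an assumption made for this example, and all names are hypothetical; the actual values of the dpa_u_mask_* constants are not given in this document.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the enum table above; the names are hypothetical. */
enum demo_bo_any_type {
  DEMO_BO_INLINE = 1,
  DEMO_BO_UNIQUE_HASHMAP,
  DEMO_BO_SIMPLE,
  DEMO_BO_HASHED,
  DEMO_BO_REFCOUNTED,
  DEMO_BO_REFCOUNTED_HASHED
};

/* Assumed encoding: one bit per dynamic type. */
#define DEMO_TYPE_BIT(T) (1u << (T))

/* Mask for a static type that can hold inline and unique hashmap BOs. */
#define DEMO_MASK_BO_UNIQUE \
  (DEMO_TYPE_BIT(DEMO_BO_INLINE) | DEMO_TYPE_BIT(DEMO_BO_UNIQUE_HASHMAP))

/* Check whether a static type, described by its mask, can hold a dynamic type. */
static bool demo_mask_contains(unsigned mask, enum demo_bo_any_type type){
  return (mask & DEMO_TYPE_BIT(type)) != 0;
}
```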

Macros

dpa_u_case_*

There is such a macro for every static type. It expands to a series of case labels for the type constants the static type can contain. For example:

    dpa_u_case_bo_unique: puts("dpa_u_bo_unique_t");

expands to:

    case DPA_U_BO_INLINE: case DPA_U_BO_UNIQUE_HASHMAP: puts("dpa_u_bo_unique_t");

This may seem very useful at first, but many static types overlap in the dynamic types they can contain, so in practice, the dynamic types often need to be specified explicitly anyway.
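The expansion described above can be reproduced with a small self-contained sketch. The enum, the macro demo_case_bo_unique and the function demo_describe are hypothetical mirrors, using the values from the enum table in this document.

```c
#include <assert.h>
#include <string.h>

/* Values mirror the enum table in this document; the names are hypothetical. */
enum demo_bo_any_type {
  DEMO_BO_INLINE = 1,
  DEMO_BO_UNIQUE_HASHMAP,
  DEMO_BO_SIMPLE,
  DEMO_BO_HASHED,
  DEMO_BO_REFCOUNTED,
  DEMO_BO_REFCOUNTED_HASHED
};

/* Expands to all case labels but the final colon, which the user writes. */
#define demo_case_bo_unique case DEMO_BO_INLINE: case DEMO_BO_UNIQUE_HASHMAP

/* Name the group a dynamic type belongs to. */
static const char* demo_describe(enum demo_bo_any_type type){
  switch(type){
    demo_case_bo_unique: return "unique";
    case DEMO_BO_SIMPLE: return "simple";
    default: return "other";
  }
}
```

Two such macros whose type sets overlap could not be used in the same switch, because duplicate case labels are not allowed. That is the practical limitation mentioned above.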

DPA_U_BO_UNIQUE_CSTRING / DPA_U_BO_DECLARE_UNIQUE_CSTRING / DPA_U_BO_DEFINE_UNIQUE_CSTRING

This is for when you need a string constant as a dpa_u_bo_unique_t. It can only be used in file scope.

All these macros take the name of a getter function first, and then a C string constant. The getter will return the desired dpa_u_bo_unique_t. DPA_U_BO_DECLARE_UNIQUE_CSTRING declares the getter, whereas DPA_U_BO_DEFINE_UNIQUE_CSTRING defines it. The getter is an inline function. If the string fits into an inline BO, this is equivalent to constructing one and returning it. If it doesn't, the string is interned on program startup, and the getter simply copies the unique BO.

The macro DPA_U_BO_UNIQUE_CSTRING is defined as DPA_U_BO_DEFINE_UNIQUE_CSTRING if DPA_U_GEN_DEF was set before including bo.h, and as DPA_U_BO_DECLARE_UNIQUE_CSTRING otherwise. This way, you can put the C strings you need as unique BOs into a single file, include it where you need the declarations, and compile it with DPA_U_GEN_DEF set to get the needed definitions. That way, you don't need to keep 2 files in sync.
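The declare/define switch can be sketched in a single file. Everything prefixed with DEMO_ or demo_ is hypothetical, and the getters return plain C strings rather than unique BOs, so this only shows the preprocessor pattern, not the library's macros.

```c
#include <assert.h>
#include <string.h>

/* In a real project, only the one file that emits the definitions sets this. */
#define DEMO_GEN_DEF

#define DEMO_DECLARE_CSTRING(NAME, STR) const char* NAME(void);
#define DEMO_DEFINE_CSTRING(NAME, STR)  const char* NAME(void){ return (STR); }

/* One macro name, two meanings, selected before the list is processed. */
#ifdef DEMO_GEN_DEF
#define DEMO_CSTRING(NAME, STR) DEMO_DEFINE_CSTRING(NAME, STR)
#else
#define DEMO_CSTRING(NAME, STR) DEMO_DECLARE_CSTRING(NAME, STR)
#endif

/* The shared list: written once, used for declarations and definitions. */
DEMO_CSTRING(demo_greeting, "hello")
DEMO_CSTRING(demo_farewell, "goodbye")
```

In practice the list would live in a header of its own, so that every other file including it without DEMO_GEN_DEF only sees the declarations.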