StringLib C: Library Adding String Type to C

Introduction

StringLib C is a library for C99 and later (or C89 implementations that provide stdint.h). It defines a ‘string’ type and a set of functions that work with it, allowing easy string manipulation, reading and writing without having to worry about memory allocation (scary stuff, really).

All dynamic allocation is done automatically: a newly declared string is treated as an empty string (“”), and calling the string_delete(string*) function frees the memory and leaves the string empty once again.

The ‘string’ type is a structure with no visible members, which prevents the user from modifying it directly, but it still has a well defined size.

C11+ additional features (not required):

The library checks for support of the _Generic keyword and, if it is present, uses it to overload functions through macros: this lets many functions accept different parameter types (for example a pointer to string instead of a string literal or a pointer to char).

The option is automatically disabled if not supported, but you can also disable it with a macro.

How string type works

WARNING: this paragraph deals with structure obfuscation, which is done to prevent (as much as possible) inexperienced users from editing members that are not intended to be edited separately; if you hate obfuscation and/or obfuscation is against your religion, please close your browser and clear your history before it’s too late.

The string type inside stringlib.h is not defined as a pointer to a structure but as an actual structure with no named members (hidden members, Russian hackers won’t mess with my code this time [please ignore if you are Russian]); this is done through unnamed bit fields whose widths are computed with the sizeof operator. Note that ISO C leaves the behaviour of a structure with no named members undefined, so the trick relies on the compiler accepting it.

typedef struct
#ifndef _STRINGLIB_OVERLOADING_WAIT_
{
 uintptr_t:8*sizeof(uintptr_t);
 size_t:8*sizeof(size_t);
 size_t:8*sizeof(size_t);
 uintptr_t:8*sizeof(uintptr_t);
 uintptr_t:8*sizeof(uintptr_t);
}_string;

#if !defined(string) || defined(_STRINGLIB_FORCE_DEFINE_)
#undef string
#define string _string
#endif // string

While the hypothetical accessible structure should be:

struct _string_accessible
{
 
 char *getString;
 
 size_t getSize;
 
 size_t getSizeChars;
 
 void *stringAllocation;
 void *stringSignature;
};

This gives the type the combined size of a pointer to char (the actual string), two size_t values holding the string sizes, and two pointers to void (used by the functions to track dynamic allocation), despite having no visible members. The members can therefore be changed only through the stringlib.h functions or by reading the object’s memory through a cast pointer, which prevents memory allocation faults caused by inexperienced users editing them by mistake.

Having the char pointer as first member allows the string to be printed by simply passing it as a parameter of printf. This is however discouraged, since passing a structure where printf expects a char pointer is not defined by the C standard; I advise passing string_getString(string) or using the dedicated string_print(string, ...) function instead (the optional second parameter determines whether a new line is printed).

The stringlib.c file defines five macros, one for reading and writing each of the type’s members.

#define _STRINGLIB_ACCESS_STRING_(STRING)\
 (*(*((char***)((_string*[]){(&(STRING))}))))
#define _STRINGLIB_ACCESS_SIZE_(STRING)\
 (*((size_t*)&(((uint8_t*)(&(STRING)))[sizeof(uintptr_t)])))
#define _STRINGLIB_ACCESS_SIZECHARS_(STRING)\
 (*((size_t*)&(((uint8_t*)(&(STRING)))[sizeof(uintptr_t)+sizeof(size_t)])))
#define _STRINGLIB_ACCESS_ALLOCATION_(STRING)\
 (*((void**)&(((uint8_t*)(&(STRING)))[sizeof(uintptr_t)+2*sizeof(size_t)])))
#define _STRINGLIB_ACCESS_SIGNATURE_(STRING)\
 (*((void**)&(((uint8_t*)(&(STRING)))[2*sizeof(uintptr_t)+2*sizeof(size_t)])))
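The byte-offset idea behind these macros can be illustrated on an ordinary named structure. What follows is a minimal sketch and not the library’s code: struct mirror, read_size_at and write_size_at are hypothetical names, and the sketch uses offsetof and memcpy instead of pointer casts, so it does not depend on a char pointer and uintptr_t having the same size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the same member order as the accessible structure. */
struct mirror
{
    char *getString;
    size_t getSize;
    size_t getSizeChars;
    void *stringAllocation;
    void *stringSignature;
};

/* Read the getSize member through a raw byte offset instead of its name. */
static size_t read_size_at(const void *object)
{
    size_t value;
    memcpy(&value,
           (const unsigned char *)object + offsetof(struct mirror, getSize),
           sizeof value);
    return value;
}

/* Write the getSize member through the same byte offset. */
static void write_size_at(void *object, size_t value)
{
    memcpy((unsigned char *)object + offsetof(struct mirror, getSize),
           &value,
           sizeof value);
}
```

The library macros instead yield an lvalue by casting the address, which is more compact but assumes the member offsets equal sums of sizeof(uintptr_t) and sizeof(size_t).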

In the following sections the string type is referred to as the hypothetical accessible structure, for convenience and readability.

How string allocation works

Every function in the library checks string allocation via the string_isAllocated(string) function (which users do not need to call themselves); this makes it possible to use all the functions on a newly declared string without generating errors, since such a string is treated as an empty string (“”).

int (string_isAllocated)(_string string_a)
{
 return (string_a.stringAllocation == _STRINGLIB_ALLOCATION_TRUE_ &&\
 string_a.stringSignature == _STRINGLIB_ALLOCATION_TRUE_ &&\
 (string_a.getSize == string_a.getSizeChars+1));
}

The return value of string_isAllocated(string) is then used inside (and in some cases before) the string_init(string*) function (again, not needed by the user), which sets the string to an empty string with a size of 1 byte.

size_t (string_init)(_string *string_a)
{
 int stringMallocFail = 0;
 char *tempStrMalloc = NULL;

 
 if (string_isAllocated(*string_a))
 {
 free(string_a->getString);
 
 string_a->stringAllocation = _STRINGLIB_ALLOCATION_FALSE_;
 string_a->stringSignature = _STRINGLIB_ALLOCATION_FALSE_;
 }
 string_a->getString = NULL;

 
 tempStrMalloc = (char*) malloc(1 * sizeof(char));
 while (tempStrMalloc == NULL)
 {
 tempStrMalloc = (char*) malloc(1 * sizeof(char));
 
 if (++stringMallocFail == _STRINGLIB_MAX_ALLOC_FAILS_)
 {free(tempStrMalloc); printf("_string memory initialization failed\n"); return 0;};
 }
 string_a->getString = tempStrMalloc;
 string_a->getString[0] = '\0';

 string_a->getSize = 1;
 string_a->getSizeChars = 0;

 
 string_a->stringAllocation = _STRINGLIB_ALLOCATION_TRUE_;
 string_a->stringSignature = _STRINGLIB_ALLOCATION_TRUE_;
 return 1;
}

To count as allocated, a string must have both its stringAllocation and stringSignature members set to _STRINGLIB_ALLOCATION_TRUE_ and its getSize member exactly 1 greater than getSizeChars. This makes it nearly impossible for a freshly declared string to be mistaken for an initialized one (the world will most certainly end first). In any case, calling a function that allocates a new size (and deallocates the previous one if necessary), such as string_set(string*, char*), string_scan(string*) or just string_init(string*), BEFORE functions that work on the previously allocated size should rule out even this remote possibility.

void (string_delete)(_string *string_a)
{
 
 if (string_isAllocated(*string_a))
 {
 free(string_a->getString);
 
 string_a->stringAllocation = _STRINGLIB_ALLOCATION_FALSE_;
 string_a->stringSignature = _STRINGLIB_ALLOCATION_FALSE_;
 }
 string_a->getString = NULL;
}

Deallocation of a string is done by calling the function string_delete(string*), which frees the char pointer and sets allocation and signature to _STRINGLIB_ALLOCATION_FALSE_; any other function can still be used over a previously deleted string.
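The allocation check, initialization and deletion described above can be sketched on a named structure. This is only an illustration under stated assumptions: struct sketch_string, SKETCH_TAG and the sketch_ functions are hypothetical names, the real value of _STRINGLIB_ALLOCATION_TRUE_ is not shown in this article, and the retry loop of string_init is left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sentinel: the address of a private object marks "allocated". */
static char sketch_tag_object;
#define SKETCH_TAG ((void *)&sketch_tag_object)

struct sketch_string
{
    char *text;
    size_t size;       /* bytes allocated, terminator included */
    size_t size_chars; /* characters before the terminator */
    void *allocation;
    void *signature;
};

/* Allocated means: both markers set and the two sizes differ by exactly 1. */
static int sketch_is_allocated(const struct sketch_string *s)
{
    return s->allocation == SKETCH_TAG &&
           s->signature == SKETCH_TAG &&
           s->size == s->size_chars + 1;
}

/* Reset to an empty string; returns 1 on success, 0 if malloc fails. */
static int sketch_init(struct sketch_string *s)
{
    char *fresh = malloc(1);
    if (fresh == NULL)
        return 0;
    if (sketch_is_allocated(s))
        free(s->text); /* release the previous buffer only if it is ours */
    fresh[0] = '\0';
    s->text = fresh;
    s->size = 1;
    s->size_chars = 0;
    s->allocation = SKETCH_TAG;
    s->signature = SKETCH_TAG;
    return 1;
}

/* Free the buffer if it is ours and clear the markers. */
static void sketch_delete(struct sketch_string *s)
{
    if (sketch_is_allocated(s))
    {
        free(s->text);
        s->allocation = NULL;
        s->signature = NULL;
    }
    s->text = NULL;
}
```

Because sketch_delete clears both markers, a later sketch_init sees the object as unallocated and does not free the old pointer a second time, which is why a deleted string can be reused.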

Text and binary file reading/writing

The library also provides dedicated functions for file reading and writing.

size_t (string_write)(_string string_a, FILE *file_a, ...)
{
 size_t initPos = 0;
 size_t pos = 0;
 char c_return = '\r';
 va_list valist;
 if (string_isAllocated(string_a))
 {
 while (*(string_a.getString+pos)!= '\0')
 {
 
 if (*(string_a.getString+pos)== '\n')
 {
 *(string_a.getString+pos)='\0';
 fputs(string_a.getString+initPos, file_a);
 fwrite(&c_return, sizeof(char), 1, file_a);
 fputc('\n', file_a);
 *(string_a.getString+pos)='\n';
 initPos = pos+1;
 }
 ++pos;
 }
 
 fputs(string_a.getString+initPos, file_a);
 }
 
 va_start(valist, file_a);
 if (va_arg(valist, int)) fputc('\n', file_a);
 va_end(valist);
 return string_a.getSize;
}

The text file writing function does not simply write the string as-is: for every new line within the string it also writes an additional carriage return character (don’t worry, it’s only visible with a hexadecimal editor) before the new line character, allowing the string_read and string_readAppend functions to determine whether the string continues after a new line.
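The marking scheme can be sketched without the library. In this hypothetical write_marked function (not part of StringLib), every '\n' inside the text is written as "\r\n"; unlike string_write it copies byte by byte and never modifies the source text, even temporarily.

```c
#include <stdio.h>

/* Write text to file, emitting "\r\n" for every '\n' found inside it.
   Returns the number of bytes written, or 0 if a write fails. */
static size_t write_marked(const char *text, FILE *file)
{
    size_t written = 0;

    for (; *text != '\0'; ++text)
    {
        if (*text == '\n')
        {
            /* marker telling a reader that the string continues */
            if (fputc('\r', file) == EOF)
                return 0;
            ++written;
        }
        if (fputc(*text, file) == EOF)
            return 0;
        ++written;
    }
    return written;
}
```

A reader can then treat "\r\n" as a line break inside the string and a bare '\n' as the end of the string, provided the file is opened in a mode that does not translate line endings.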

size_t (string_writeBin)(_string string_a, FILE *file_a)
{
 if (!string_isAllocated(string_a))
 {string_a.getSize = 0; fwrite(&(string_a.getSize), sizeof(size_t), 1, file_a); return 0;}
 
 fwrite(&(string_a.getSize), sizeof(size_t), 1, file_a);
 
 fwrite(string_a.getString, sizeof(char), string_a.getSize, file_a);
 return string_a.getSize;
}

The simpler binary writing function writes the string size followed by the string characters or simply writes 0 for unallocated strings; string_readBin on the other hand reads the first number, then allocates the string.
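A length-prefixed record of this kind can be sketched as follows. The names write_prefixed and read_prefixed are hypothetical, and the reading side is an assumption about how such a format is typically consumed, since the body of string_readBin is not shown here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write a size_t byte count followed by that many bytes.
   Returns 1 on success, 0 on a write error. */
static int write_prefixed(const char *bytes, size_t size, FILE *file)
{
    if (fwrite(&size, sizeof size, 1, file) != 1)
        return 0;
    return size == 0 || fwrite(bytes, 1, size, file) == size;
}

/* Read one record written by write_prefixed.
   Returns a malloc'd buffer the caller must free and stores its size,
   or returns NULL for an empty record or on any failure. */
static char *read_prefixed(FILE *file, size_t *size_out)
{
    size_t size = 0;
    char *buffer;

    *size_out = 0;
    if (fread(&size, sizeof size, 1, file) != 1 || size == 0)
        return NULL;
    buffer = malloc(size);
    if (buffer == NULL)
        return NULL;
    if (fread(buffer, 1, size, file) != size)
    {
        free(buffer);
        return NULL;
    }
    *size_out = size;
    return buffer;
}
```

Since the prefix is a raw size_t, a file written this way is only readable by programs built with the same size_t width and byte order.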

Function overloading

This feature works only on C11 implementations or compiler versions supporting the _Generic keyword (GCC 4.9+, Clang 3.0+, xlC 12.01[V2R1]+); support can be checked with the _STRINGLIB_OVERLOADING_ macro.

The _Generic() keyword allows type checking in C and can be used inside a macro function for overloading purposes (just like C++, but NO, not really), as in the example:

#define string_set(A, B)\
 _Generic(B,\
 char*: (string_set)((_string*)A, (char*)B),\
 _string*: (string_isAllocated)(*((_string*)B))?(string_set)((_string*)A, ((string_getString)(*((_string*)((void*)B+0))))+0):(string_set)((_string*)A, ""),\
 default: (string_set)((_string*)A, ""))

In this example, generic selection allows a pointer to string to be passed in the function instead of a pointer to char, thus giving the user more flexibility.
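For readers new to generic selection, here is a smaller self-contained sketch of the same idea. Everything in it (struct text, length_of_chars, length_of_text, text_length) is hypothetical and unrelated to the library’s API. It selects a function by type and then calls it, which avoids the casts needed when every branch is a complete expression, as in the string_set macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper type standing in for the library's string. */
struct text
{
    const char *data;
};

static size_t length_of_chars(const char *s)
{
    return strlen(s);
}

static size_t length_of_text(const struct text *t)
{
    return strlen(t->data);
}

/* Pick a function from the argument's type, then call it with the argument. */
#define text_length(X)                          \
    _Generic((X),                               \
        char *: length_of_chars,                \
        const char *: length_of_chars,          \
        struct text *: length_of_text,          \
        const struct text *: length_of_text)(X)
```

With these definitions, text_length("abc") resolves to length_of_chars and text_length(&t), for a struct text t, resolves to length_of_text; any other argument type is rejected at compile time because there is no default branch.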

Simple overloading

StringLib’s simple overloading consists of macro definitions of functions in which only the number of passed parameters (or whether the last parameter is passed) is checked, without checking the parameter types; it replaces the overloading method described above when the _Generic keyword is not supported, and is also used in some functions that only need to check the number of parameters.

Here are two examples of simple overloading:


#define string_appendPos(A, B, C...) ((string_appendPos)(A, B, (size_t)C+0))

#define string_print(A, B...) ((string_print)(A, (sizeof((int[]){B}))?B+0:1))

While the former simply passes the argument C or argument 0 if empty, the latter creates a new array containing the extra parameters and checks its size.

Both generic selection overloading and simple overloading can be bypassed by placing parentheses around the function name when calling it, as in (string_set)(&string_a, "text").
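An optional trailing parameter can also be emulated with standard __VA_ARGS__ alone. The sketch below uses a different technique from the two library macros: it appends the default value to the argument list and lets a helper macro keep only the first four arguments. The names join_line and JOIN_LINE_PICK are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy text into out, followed by a newline when newline is non-zero.
   Returns the length the result needs, as reported by snprintf. */
static int join_line(char *out, size_t capacity, const char *text, int newline)
{
    return snprintf(out, capacity, "%s%s", text, newline ? "\n" : "");
}

/* Keep the first four arguments and discard whatever follows. */
#define JOIN_LINE_PICK(OUT, CAP, TEXT, NEWLINE, ...) \
    ((join_line)(OUT, CAP, TEXT, NEWLINE))

/* Append the default (1): it is used only when the caller passed three arguments. */
#define join_line(...) JOIN_LINE_PICK(__VA_ARGS__, 1, )
```

Here join_line(buf, sizeof buf, "hi") appends a new line, join_line(buf, sizeof buf, "hi", 0) does not, and (join_line)(buf, sizeof buf, "hi", 1) calls the function directly because the parenthesized name is not followed by an opening parenthesis and so is not expanded as a macro.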

Implementation example

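#include <stdio.h>
#include "stringlib.h"

int main(void)
{
    string string_a;
    string string_b;
    FILE *foo;

    string_set(&string_a, "hello world\nnew line test");
    string_newline(&string_a, "nothing to see here");

    string_print(string_a);
    /* string_getSize returns size_t, so cast it to match the format */
    printf("SIZE: %lu\n", (unsigned long)string_getSize(string_a));

    /* write the string to a text file */
    foo = fopen("text.txt", "w");
    if (foo == NULL) { string_delete(&string_a); return 1; }
    string_write(string_a, foo);
    fclose(foo);

    /* read it back into a second string */
    foo = fopen("text.txt", "r");
    if (foo == NULL) { string_delete(&string_a); return 1; }
    string_read(&string_b, foo);
    fclose(foo);

    string_print(string_b);
    printf("SIZE: %lu\n", (unsigned long)string_getSize(string_b));
    string_delete(&string_a);
    string_delete(&string_b);
    return 0;
}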

Additional info

StringLib C is currently in BETA phase; any help, suggestion, compliment or straight-up insult is welcome.

If you try the library please give your feedback (here or on the SourceForge page… or both).

Full documentation

Basic Functions

const char *const string_getString(_string string_a);

converts string to char pointer.

size_t string_getSize(_string string_a); 

returns string size

Checking functions

int string_contains(_string string_a, int charnum, ...); 

checks if string contains string literal or one of the characters (check the header file for the proper use of charnum, which depends on whether the _Generic keyword is supported)

int string_equals(_string string_a, const void *string_v);

checks if string is the same as the string literal

Input/output functions

size_t string_set(_string *string_a, const void *string_b); 

sets string to string literal

size_t string_scan(_string *string_a); 

sets string to user input

size_t string_append(_string *string_a, const void *string_v);
size_t string_appendPos(_string *string_a, const void *string_v, ...); 

appends string literal to string

size_t string_scanAppend(_string *string_a);
size_t string_scanAppendPos(_string *string_a, ...); 

appends user input to string

size_t string_newline(_string *string_a, ...); 

creates new line, optionally appends string literal (can pass stdin to append user input)

size_t string_cut(_string *string_a, size_t pos, ...); 

cuts string from position ‘pos’ to the optional end position

size_t string_override(_string *string_a, const void* string_v, ...); 

writes string literal over the string, starting from the optional position (if specified)

void string_swap(_string *string_a, _string *string_b); 

swaps two strings

void string_delete(_string *string_a); 

deletes a string, deallocating memory

void string_print(_string string_a, ...); 

prints a string to output console

File Input/output functions

size_t string_write(_string string_a, FILE *file_a, ...);

write to text file

size_t string_writeBin(_string string_a, FILE *file_a);

write to binary file

size_t string_read(_string *string_a, FILE *file_a, ...); 
size_t string_readAppend(_string *string_a, FILE *file_a, ...);

read from text file

size_t string_readBin(_string *string_a, FILE *file_a);

read from binary file

 

Unnecessary functions (used by other functions but not necessary for user)

size_t string_init(_string *string_a);

initializes string

int string_isAllocated(_string string_a);

checks if the string is allocated

History

[21/03/2017] First beta upload

[30/03/2017] Major changes to structure access methods

[09/04/2017] Changed string access method, optimized algorithm complexity

[19/04/2017] Added simple overloading (prevents bugs in linux)

[09/05/2017] Split string_contains function; string_contains now returns position+1

[13/05/2017] string_contains can now check from any starting position
