Vuo  1.2.6
VuoText.c File Reference

Description

VuoText implementation.

Macros

#define UTF8_ACCEPT   0
 Enough bytes have been read for a character. (VuoText_isValidUtf8 is based on code by Bjoern Hoehrmann.)
 
#define UTF8_REJECT   1
 The byte is not allowed to occur at its position.
 

Functions

VuoText VuoText_makeFromJson (json_object *js)
 Decodes the JSON object js, expected to contain a UTF-8 string, to create a new value.
 
json_object * VuoText_getJson (const VuoText value)
 Encodes value as a JSON object.
 
VuoText VuoText_truncateWithEllipsis (const VuoText subject, int maxLength, VuoTextTruncation where)
 If subject is less than or equal to maxLength Unicode characters long, returns subject.
 
char * VuoText_getSummary (const VuoText value)
 Creates a new UTF-8 C string from value, or, if it's more than 50 Unicode characters long, creates a truncated copy ending in an ellipsis.
 
VuoText VuoText_make (const char *unquotedString)
 Creates a VuoText value from an unquoted string (unlike VuoText_makeFromString(), which expects a quoted string).
 
VuoText VuoText_makeWithMaxLength (const void *data, const size_t maxLength)
 Creates a VuoText value from an untrusted source (one that might not contain a null terminator within its memory page).
 
VuoText VuoText_makeFromCFString (const void *cfs)
 Creates a VuoText value from a CFStringRef.
 
static bool VuoText_isValidUtf8 (const unsigned char *data, unsigned long size)
 Returns true if data is valid UTF-8 text.
 
VuoText VuoText_makeFromData (const unsigned char *data, const unsigned long size)
 Attempts to interpret data as UTF-8 text.
 
VuoText VuoText_makeFromUtf32 (const uint32_t *data, size_t size)
 Creates a new VuoText string from an array of UTF-32 values.
 
VuoText VuoText_makeFromMacRoman (const char *string)
 Creates a new VuoText from a MacRoman-encoded string.
 
size_t VuoText_length (const VuoText text)
 Returns the number of Unicode characters in the text.
 
size_t VuoText_byteCount (const VuoText text)
 Returns the number of bytes in the text, not including the null terminator.
 
bool VuoText_isEmpty (const VuoText text)
 Returns true if text is empty (is NULL or is non-NULL with zero length).
 
bool VuoText_areEqual (const VuoText text1, const VuoText text2)
 Returns true if the two texts represent the same Unicode string (even if they use different UTF-8 encodings or Unicode character decompositions).
 
static bool isLessThan (const VuoText text1, const VuoText text2, CFStringCompareFlags flags)
 Helper for VuoText_isLessThan*().
 
bool VuoText_isLessThan (const VuoText text1, const VuoText text2)
 Returns true if text1 is ordered before text2 in a case-sensitive lexicographic ordering (which treats different UTF-8 encodings and Unicode character decompositions as equivalent).
 
bool VuoText_isLessThanCaseInsensitive (const VuoText text1, const VuoText text2)
 Returns true if text1 is ordered before text2 in a case-insensitive lexicographic ordering (which treats different UTF-8 encodings and Unicode character decompositions as equivalent).
 
bool VuoText_isLessThanNumeric (const VuoText text1, const VuoText text2)
 Returns true if the number in text1 is less than the number in text2.
 
bool VuoText_compare (VuoText text1, VuoTextComparison comparison, VuoText text2)
 Returns true if text1 matches text2 based on comparison.
 
size_t VuoText_findFirstOccurrence (const VuoText string, const VuoText substring, const size_t startIndex)
 Returns the index (starting at 1) of the first instance of substring in string at index >= startIndex.
 
size_t VuoText_findLastOccurrence (const VuoText string, const VuoText substring)
 Returns the index (starting at 1) of the last instance of substring in string.
 
VuoList_VuoInteger VuoText_findOccurrences (const VuoText string, const VuoText substring)
 Returns a list containing all occurrences of substring in string.
 
VuoText VuoText_substring (const VuoText string, int startIndex, int length)
 Returns the substring of string starting at index startIndex and spanning length Unicode characters.
 
VuoText VuoText_append (VuoText *texts, size_t textsCount)
 Returns a string consisting of the elements in the texts array concatenated together.
 
VuoText * VuoText_split (VuoText text, VuoText separator, bool includeEmptyParts, size_t *partsCount)
 Splits text into parts (basically the inverse of VuoText_append()).
 
VuoText VuoText_replace (VuoText subject, VuoText stringToFind, VuoText replacement)
 Returns a new string in which each occurrence of stringToFind in subject has been replaced with replacement.
 
VuoText VuoText_insert (const VuoText string, int startIndex, const VuoText newText)
 Returns a new string with newText inserted at the startIndex.
 
VuoText VuoText_removeAt (const VuoText string, int startIndex, int length)
 Returns a new string where characters from startIndex to startIndex + length are removed.
 
char * VuoText_format (const char *format,...)
 Returns a new string formatted using the printf-style format string.
 
VuoText VuoText_trim (const VuoText text)
 Returns a new string consisting of text without the whitespace at the beginning and end.
 
static bool VuoText_isASCII7 (VuoText text)
 Returns true if all byte values in text are between 0 and 127.
 
VuoText VuoText_changeCase (const VuoText text, VuoTextCase textCase)
 Returns a new string with the text characters cased in the textCase style.
 
uint32_t * VuoText_getUtf32Values (const VuoText text, size_t *length)
 Returns an array of 32-bit Unicode (UTF-32) values, one for each character in text.
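
Several of the functions above count Unicode characters rather than bytes (compare VuoText_length and VuoText_byteCount). The following is a minimal sketch of that distinction, assuming the text is valid, null-terminated UTF-8 and that a "character" means a single code point. The sketch_* names are hypothetical, and this is not Vuo's implementation, which may count characters differently (for example, for combining sequences).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes before the null terminator (0 for NULL). */
static size_t sketch_byteCount(const char *text)
{
    return text ? strlen(text) : 0;
}

/* Hypothetical helper: number of code points, assuming valid UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that
 * does NOT match that pattern starts a new code point. */
static size_t sketch_length(const char *text)
{
    size_t count = 0;
    if (!text)
        return 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; ++p)
        if ((*p & 0xC0) != 0x80)
            ++count;
    return count;
}

/* Usage: "héllo" has 5 characters but occupies 6 bytes in UTF-8. */
void sketch_lengthDemo(void)
{
    const char *text = "h\xC3\xA9llo";
    assert(sketch_length(text) == 5);
    assert(sketch_byteCount(text) == 6);
}
```

Indices and lengths passed to functions such as VuoText_substring are described in Unicode characters, so a byte offset taken from strlen() would be the wrong unit for any non-ASCII text.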
 

Variables

static const uint8_t utf8d []
 The first part maps bytes to character classes, the second part encodes a deterministic finite automaton using these character classes as transitions.
 

Macro Definition Documentation

#define UTF8_ACCEPT   0

VuoText_isValidUtf8 is based on code by Bjoern Hoehrmann.

Copyright (c) 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>. See http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ for details.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Enough bytes have been read for a character.

#define UTF8_REJECT   1

The byte is not allowed to occur at its position.

Function Documentation

static bool isLessThan (const VuoText text1, const VuoText text2, CFStringCompareFlags flags)

Helper for VuoText_isLessThan*().

static bool VuoText_isASCII7 (VuoText text)

Returns true if all byte values in text are between 0 and 127.
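
As an illustration of the check described above, here is a minimal sketch assuming the text is a null-terminated byte string. The name sketch_isASCII7 is hypothetical, and treating NULL as ASCII-only is an assumption, since the source does not say how VuoText_isASCII7 handles NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if every byte is in the range 0..127.
 * Assumption: NULL is treated like empty text (vacuously true). */
static bool sketch_isASCII7(const char *text)
{
    if (!text)
        return true;
    for (const unsigned char *p = (const unsigned char *)text; *p; ++p)
        if (*p > 127)
            return false;
    return true;
}

/* Usage: any multi-byte UTF-8 sequence contains bytes above 127. */
void sketch_isASCII7Demo(void)
{
    assert(sketch_isASCII7("Vuo 1.2.6"));
    assert(!sketch_isASCII7("caf\xC3\xA9")); /* "café" in UTF-8 */
}
```

A check like this is typically useful as a fast path: for pure 7-bit ASCII, one byte equals one character, so byte-wise operations are safe without full Unicode handling.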

static bool VuoText_isValidUtf8 (const unsigned char *data, unsigned long size)

Returns true if data is valid UTF-8 text.

Variable Documentation

static const uint8_t utf8d[]
Initial value:
= {
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
0xa,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x4,0x3,0x3,
0xb,0x6,0x6,0x6,0x5,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,
0x0,0x1,0x2,0x3,0x5,0x8,0x7,0x1,0x1,0x1,0x4,0x6,0x1,0x1,0x1,0x1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1,
1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,
1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,3,1,1,1,1,1,1,
1,3,1,1,1,1,1,3,1,3,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
}

The first part maps bytes to character classes, the second part encodes a deterministic finite automaton using these character classes as transitions.
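
To show how a table of this shape can drive validation, here is a minimal sketch. It repeats the table so that it compiles on its own, and it uses the indexing scheme from Bjoern Hoehrmann's published decoder, where the next state is read from utf8d[256 + state*16 + class]. The function name sketch_isValidUtf8 is hypothetical, and the loop is an assumption about how VuoText_isValidUtf8 might use the table, since its body is not shown in this reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_ACCEPT 0 /* enough bytes have been read for a character */
#define UTF8_REJECT 1 /* the byte is not allowed to occur at its position */

static const uint8_t utf8d[] = {
    /* Part 1: byte value (0x00..0xff) -> character class */
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
    0xa,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x4,0x3,0x3,
    0xb,0x6,0x6,0x6,0x5,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,
    /* Part 2: transitions, 16 entries per state (states 0..8) */
    0x0,0x1,0x2,0x3,0x5,0x8,0x7,0x1,0x1,0x1,0x4,0x6,0x1,0x1,0x1,0x1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1,
    1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,
    1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,3,1,1,1,1,1,1,
    1,3,1,1,1,1,1,3,1,3,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
};

/* Hypothetical validator: feeds each byte through the automaton. */
static bool sketch_isValidUtf8(const unsigned char *data, unsigned long size)
{
    assert(data != NULL || size == 0);
    uint32_t state = UTF8_ACCEPT;
    for (unsigned long i = 0; i < size; ++i)
    {
        uint32_t type = utf8d[data[i]];          /* class of this byte */
        state = utf8d[256 + state * 16 + type];  /* next DFA state */
        if (state == UTF8_REJECT)
            return false;                        /* byte not allowed here */
    }
    /* Any state other than ACCEPT means the data ended mid-character. */
    return state == UTF8_ACCEPT;
}
```

With this sketch, "\xC3\xA9" (é) and "\xF0\x9F\x98\x80" (a four-byte sequence) are accepted, while the overlong "\xC0\x80", the truncated "\xE2\x82", and the surrogate encoding "\xED\xA0\x80" are rejected.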