unicode_u_ucs4_native (3) Linux Manual Page
unicode_u_ucs4_native, unicode_u_ucs2_native, unicode_convert_init, unicode_convert, unicode_convert_deinit, unicode_convert_tocbuf_init, unicode_convert_tou_init, unicode_convert_fromu_init, unicode_convert_uc, unicode_convert_tocbuf_toutf8_init, unicode_convert_tocbuf_fromutf8_init, unicode_convert_toutf8, unicode_convert_fromutf8, unicode_convert_tobuf, unicode_convert_tou_tobuf, unicode_convert_fromu_tobuf – unicode character set conversion
Synopsis
#include <courier-unicode.h>
extern const char unicode_u_ucs4_native[];
extern const char unicode_u_ucs2_native[];
unicode_convert_handle_t unicode_convert_init(const char *src_chset, const char *dst_chset, int (*cb_func)(const char *, size_t, void *), void *cb_arg);
int unicode_convert(unicode_convert_handle_t handle, const char *text, size_t cnt);
int unicode_convert_deinit(unicode_convert_handle_t handle, int *errptr);
unicode_convert_handle_t unicode_convert_tocbuf_init(const char *src_chset, const char *dst_chset, char **cbufptr_ret, size_t *cbufsize_ret, int nullterminate);
unicode_convert_handle_t unicode_convert_tocbuf_toutf8_init(const char *src_chset, char **cbufptr_ret, size_t *cbufsize_ret, int nullterminate);
unicode_convert_handle_t unicode_convert_tocbuf_fromutf8_init(const char *dst_chset, char **cbufptr_ret, size_t *cbufsize_ret, int nullterminate);
unicode_convert_handle_t unicode_convert_tou_init(const char *src_chset, unicode_char **ucptr_ret, size_t *ucsize_ret, int nullterminate);
unicode_convert_handle_t unicode_convert_fromu_init(const char *dst_chset, char **cbufptr_ret, size_t *cbufsize_ret, int nullterminate);
int unicode_convert_uc(unicode_convert_handle_t handle, const unicode_char *text, size_t cnt);
char *unicode_convert_toutf8(const char *text, const char *charset, int *error);
char *unicode_convert_fromutf8(const char *text, const char *charset, int *error);
char *unicode_convert_tobuf(const char *text, const char *charset, const char *dstcharset, int *error);
int unicode_convert_tou_tobuf(const char *text, size_t text_l, const char *charset, unicode_char **uc, size_t *ucsize, int *error);
int unicode_convert_fromu_tobuf(const unicode_char *utext, size_t utext_l, const char *charset, char **c, size_t *csize, int *error);
Description
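unicode_u_ucs4_native[] contains the string “UCS-4BE” or “UCS-4LE”, matching the platform's native byte order. unicode_u_ucs2_native[] likewise contains “UCS-2BE” or “UCS-2LE”.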
Collecting converted text into a buffer
Call unicode_convert_tocbuf_init() instead of unicode_convert_init(), then call unicode_convert() and unicode_convert_deinit() normally. As with unicode_convert_init(), the first two parameters specify the source and the destination character sets. unicode_convert_tocbuf_toutf8_init() is an alias that specifies UTF-8 as the destination character set. unicode_convert_tocbuf_fromutf8_init() is an alias that specifies UTF-8 as the source character set. These functions supply an output function that collects the converted text into a malloc()ed buffer. If unicode_convert_deinit() returns 0, *cbufptr_ret is set to the malloc()ed buffer, and *cbufsize_ret is set to the number of converted characters, which is the size of that buffer.
- Note
If the converted string is an empty string, *cbufsize_ret gets set to 0, but *cbufptr_ret still gets initialized (to a dummy malloced buffer). A non-zero nullterminate places a trailing \0 character after the converted string (this is included in *cbufsize_ret).
Converting between character sets and unicode
unicode_convert_tou_init() converts character text into a unicode_char buffer. It works just like unicode_convert_tocbuf_init(), except that only the source character set gets specified and the output buffer is a unicode_char buffer. A non-zero nullterminate terminates the converted unicode characters with a U+0000. unicode_convert_fromu_init() converts unicode_chars to the output character set, and also works like unicode_convert_tocbuf_init(). In this case, unicode_convert_uc() works just like unicode_convert(), except that the input sequence is a unicode_char sequence and the count parameter is the number of unicode characters.
One-shot conversions
unicode_convert_toutf8() converts the specified text in the specified character set into a UTF-8 string, returning a malloc()ed buffer. If error is not NULL, *error gets set to a non-zero value if a character conversion error occurred and some characters could not be converted, even when unicode_convert_toutf8() returns a non-NULL value. unicode_convert_fromutf8() does a similar conversion from UTF-8 text to the specified character set. unicode_convert_tobuf() does a similar conversion between two different character sets.
unicode_convert_tou_tobuf() calls unicode_convert_tou_init(), feeds the character string through unicode_convert(), then calls unicode_convert_deinit(). If this function returns 0, *uc and *ucsize are set to a malloc()ed buffer and its size, holding the unicode_char array.
unicode_convert_fromu_tobuf() calls unicode_convert_fromu_init(), feeds the unicode array through unicode_convert_uc(), then calls unicode_convert_deinit(). If this function returns 0, *c and *csize are set to a malloc()ed buffer and its size, holding the char array.
See Also
courier-unicode(7), unicode_convert_tocase(3), unicode_default_chset(3).
Author
Sam Varshavchik
