Foreword:
---------

With the ongoing development of MorphOS and AmigaOS4, and their efforts to stop treating UTF-8 as a stepchild, I do hope that they will render this library useless (no, I'm not kidding). That means you should first check whether your OS already supports a given task, and fall back on the functions provided by Uni library only if it does not. Unfortunately, I lack information about how far Unicode support has been established in MorphOS 2.0 and in the upcoming OS4.1.

Introduction:
-------------

Uni library is a support library for Unicode code points in the range from 0 to 1'114'109, and is thus not limited to the Basic Multilingual Plane (range 0 to 65'535). You can determine code point attributes (UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER etc.), and you can change these attributes for a code point by mapping the code point to its counterpart.

Because I haven't found a shared library with support functions that cover UTF-8 strings, I've built them into Uni library as well, for example UTF8StrCmp(). Furthermore, transcoding of strings from one format to another is also implemented, for example through UTF16ToUTF8().

Thus, it's a shared library for three tasks:

  - Determining code point attributes / mapping code points.
  - Handling of UTF-8 multibyte sequences.
  - Transcoding strings.

The enclosed documentation was drawn up in HTML, and I spent a lot of time clarifying some misleading terms that are frequently used by people who do not fully understand what Unicode and its related terms stand for. Okay, I'm not an expert myself; however, please read the documentation I provided before you study the API of this library. It will be to your benefit.

Changes:
--------

This new version of Uni library was upgraded to adopt the character encoding scheme of the Unicode Standard, Version 5.1.0, as published by the Unicode Consortium, as far as my limited implementation can support it.
In addition, this new version fixes a bug that surfaced when a UTF-32 or UTF-16 string was transcoded to UTF-8: the UTF-8 string buffer had to be at least four bytes larger than actually required (ouch...).

UniCodeChart() supports 32 new code charts, bringing the total to 201 code charts.

Notes on transcoding single-byte character encoding schemes:
------------------------------------------------------------

I'll release an additional archive (UniLibSupp, already used by a 3rd party) that should make it easier for you to transcode strings by utilizing IANA-IDs, which are also used by the operating system's Locale library from version 50 onward (MorphOS, AmigaOS4).

Functions:
----------

The API provides these functions:

Code Points Attribute Information

  UniIsAlpha()  UniIsAttr()  UniIsCon()  UniIsDigit()  UniIsLower()
  UniIsNSM()  UniIsPrint()  UniIsPunct()  UniIsSpace()  UniIsTitle()
  UniIsUpper()  UniToLower()  UniToTitle()  UniToUpper()  UniCodeChart()

UTF-8 String Information

  UTF8IsLegal()  UTF8LegalStart()  UTF8NextChar()  UTF8PrevChar()
  UTF8CharAtIndex()  UTF8StrInfo()  UTF8StrLen()  UTF8StrOfSize()
  UTF8StrVisibleLen()

UTF-8 String Comparison / Modifiers

  UTF8StrCat()  UTF8StrCmp()  UTF8StrCmpI()  UTF8StrCpy()  UTF8StrFind()
  UTF8StrMatch()  UTF8StrNCat()  UTF8StrNCmp()  UTF8StrNCmpI()
  UTF8StrNCpy()  UTF8StrPaste()  UTF8StrReplace()  UTF8StrTerminate()
  UTF8StrToken()  UTF8StrToLower()  UTF8StrToTitle()  UTF8StrToUpper()

Miscellaneous (Wide Char) String Functions

  UTF16StrLen()  UTF32StrLen()  UTF16CharAsUTF8Len()  UTF32CharAsUTF8Len()

Transcodings

  LatinToUTF8()  UTF8ToLatin()  UTF16ToUTF8()  UTF32ToUTF8()
  UTF8ToUTF16()  UTF8ToUTF16Char()  UniResultIsSurrogate()
  UTF8ToUTF32()  UTF8ToUTF32Char()

Encodings

  UniCheckEncoding()  UniBomHasSize()  UniSwitchEncoding()
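
Illustration (not part of the library):

The following is a minimal, portable C sketch of two of the tasks described above: counting code points in a UTF-8 string, and transcoding UTF-16 to UTF-8 into a buffer that is exactly large enough. It does not call Uni library and does not reproduce its API. All function names (sketch_utf8_len, sketch_utf8_strlen, sketch_utf16_to_utf8), their signatures, and the error convention are assumptions made for this example only. Please consult the enclosed HTML documentation for the real prototypes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT a Uni library function): number of bytes needed
   to encode code point cp as UTF-8, or 0 if cp cannot be encoded. */
static size_t sketch_utf8_len(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogates are not characters */
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Hypothetical helper: count the code points of a NUL-terminated UTF-8
   string by counting every byte that is not a continuation byte (10xxxxxx).
   Assumes the input is already well-formed UTF-8. */
static size_t sketch_utf8_strlen(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80) n++;
    return n;
}

/* Hypothetical helper: transcode a NUL-terminated UTF-16 string (native byte
   order) to UTF-8. dst_size counts the terminating NUL. Returns the number
   of bytes written without the NUL, or (size_t)-1 on malformed input or if
   the buffer is too small. */
static size_t sketch_utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    while (*src) {
        uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {            /* high surrogate */
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF) return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            src++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                         /* stray low surrogate */
        }
        size_t len = sketch_utf8_len(cp);
        if (out + len + 1 > dst_size) return (size_t)-1; /* keep room for NUL */
        switch (len) {
        case 1:
            dst[out++] = (char)cp;
            break;
        case 2:
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
            break;
        default: /* 4 bytes */
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
            break;
        }
    }
    if (dst_size == 0) return (size_t)-1;              /* no room for the NUL */
    dst[out] = '\0';
    return out;
}
```

In this sketch, a UTF-16 string holding "A", "ä", "€" and one character outside the Basic Multilingual Plane (a surrogate pair) needs 1 + 2 + 3 + 4 = 10 bytes of UTF-8 plus the terminating NUL. A buffer of exactly 11 bytes is accepted, and a buffer of 10 bytes is rejected. The byte-counting length function reports 4 code points for the result.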
Copyright © 2004-2024 by Björn Hagström All Rights Reserved