author: Minteck <contact@minteck.org> 2021-12-21 16:52:28 +0100
committer: Minteck <contact@minteck.org> 2021-12-21 16:52:28 +0100
commit: 46e43f4bde4a35785b4997b81e86cd19f046b69b (patch)
tree: c53c2f826f777f9d6b2d249dab556feb72a6c3a6 /node_modules/wcwidth/docs/index.md
Commit
Diffstat (limited to 'node_modules/wcwidth/docs/index.md')
-rw-r--r-- node_modules/wcwidth/docs/index.md 65
1 files changed, 65 insertions, 0 deletions
diff --git a/node_modules/wcwidth/docs/index.md b/node_modules/wcwidth/docs/index.md
new file mode 100644
index 0000000..5c5126d
--- /dev/null
+++ b/node_modules/wcwidth/docs/index.md
@@ -0,0 +1,65 @@
+### JavaScript port of Markus Kuhn's wcwidth() implementation
+
+The following explanation comes from the original C implementation:
+
+This is an implementation of wcwidth() and wcswidth() (defined in
+IEEE Std 1003.1-2001) for Unicode.
+
+http://www.opengroup.org/onlinepubs/007904975/functions/wcwidth.html
+http://www.opengroup.org/onlinepubs/007904975/functions/wcswidth.html
+
+In fixed-width output devices, Latin characters all occupy a single
+"cell" position of equal width, whereas ideographic CJK characters
+occupy two such cells. Interoperability between terminal-line
+applications and (teletype-style) character terminals using the
+UTF-8 encoding requires agreement on which character should advance
+the cursor by how many cell positions. No established formal
+standards exist at present on which Unicode character shall occupy
+how many cell positions on character terminals. These routines are
+a first attempt at defining such behavior based on simple rules
+applied to data provided by the Unicode Consortium.
+
+For some graphical characters, the Unicode standard explicitly
+defines a character-cell width via the definition of the East Asian
+Fullwidth (F), Wide (W), Halfwidth (H), and Narrow (Na) classes.
+In all these cases, there is no ambiguity about which width a
+terminal shall use. For characters in the East Asian Ambiguous (A)
+class, the width choice depends purely on a preference for backward
+compatibility with either historic CJK or Western practice.
+Choosing single-width for these characters is easy to justify as
+the appropriate long-term solution, as the CJK practice of
+displaying these characters as double-width comes from historic
+implementation simplicity (8-bit encoded characters were displayed
+single-width and 16-bit ones double-width, even for Greek,
+Cyrillic, etc.) and not any typographic considerations.
+
+Much less clear is the choice of width for the Not East Asian
+(Neutral) class. Existing practice does not dictate a width for any
+of these characters. It would nevertheless make sense
+typographically to allocate two character cells to characters such
+as EM SPACE or VOLUME INTEGRAL, which cannot be
+represented adequately with a single-width glyph. The following
+routines at present merely assign a single-cell width to all
+neutral characters, in the interest of simplicity. This is not
+entirely satisfactory and should be reconsidered before
+establishing a formal standard in this area. At the moment, the
+question of which Not East Asian (Neutral) characters should be
+represented by double-width glyphs cannot yet be answered by
+applying a simple rule from the Unicode database content. Setting
+up a proper standard for the behavior of UTF-8 character terminals
+will require a careful analysis not only of each Unicode character,
+but also of each presentation form, something the author of these
+routines has so far avoided doing.
+
+http://www.unicode.org/unicode/reports/tr11/
+
+Markus Kuhn -- 2007-05-26 (Unicode 5.0)
+
+Permission to use, copy, modify, and distribute this software
+for any purpose and without fee is hereby granted. The author
+disclaims all warranties with regard to this software.
+
+Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
+
+
+
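
Reviewer note: the width rules described in the added file can be illustrated with a small sketch. This is not the code shipped in the `wcwidth` package, and it is not Markus Kuhn's table. The function names `sketchWcwidth` and `sketchWcswidth` and the two heavily abbreviated range tables are assumptions made for illustration only. The sketch follows the policy stated in the text: combining characters take zero cells, East Asian Wide and Fullwidth characters take two, and everything else that is printable (including the Ambiguous and Neutral classes) takes one.

```javascript
// Hypothetical, heavily abbreviated tables of inclusive code point ranges.
// The real implementation derives much larger tables from Unicode data.
const COMBINING = [
  [0x0300, 0x036f], // Combining Diacritical Marks
  [0x200b, 0x200f], // zero-width space and directional marks
];
const WIDE = [
  [0x1100, 0x115f], // Hangul Jamo initial consonants
  [0x3040, 0x30ff], // Hiragana and Katakana
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xac00, 0xd7a3], // Hangul Syllables
  [0xff01, 0xff60], // Fullwidth Forms
];

// True if the code point falls inside any [lo, hi] pair.
function inRanges(cp, ranges) {
  return ranges.some(([lo, hi]) => cp >= lo && cp <= hi);
}

// Cell width of one code point: -1 for control characters,
// 0 for NUL and combining characters, 2 for wide, otherwise 1.
function sketchWcwidth(cp) {
  if (cp === 0) return 0;
  if (cp < 0x20 || (cp >= 0x7f && cp < 0xa0)) return -1;
  if (inRanges(cp, COMBINING)) return 0;
  if (inRanges(cp, WIDE)) return 2;
  return 1; // Narrow, Halfwidth, Ambiguous, and Neutral all get one cell
}

// Total cell width of a string, or -1 if it contains a control character.
function sketchWcswidth(str) {
  let total = 0;
  for (const ch of str) { // iterates by code point, not UTF-16 unit
    const w = sketchWcwidth(ch.codePointAt(0));
    if (w < 0) return -1;
    total += w;
  }
  return total;
}
```

Under these assumptions, `sketchWcswidth("abc")` gives 3, `sketchWcswidth("日本語")` gives 6, and `sketchWcswidth("e\u0301")` gives 1 because the combining acute accent adds no cells. Returning 1 for Greek alpha (U+03B1), an East Asian Ambiguous character, reflects the single-width choice argued for in the text.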