Get width of a Unicode string in fixed-width display cells, accounting for combining characters, emoji, flags, Hangul, East Asian Width, default ignorable characters, and a few more edge cases.
npm install @cto.af/string-width
Full documentation is available.
import {StringWidth, AMBIGUOUS, POTENTIAL_EMOJI} from '@cto.af/string-width'
const sw = new StringWidth()
sw.width('foo') // 3
sw.width('\u{1F4A9}') // 2: Emoji take two cells
sw.width('#\ufe0f\u20e3') // 2: More complicated emoji
sw.break('foobar', 3) // [
// {string: 'foo', cells: 3},
// {string: 'bar', cells: 3, last: true}
// ]
const custom = new StringWidth({
  locale: 'ko-KR',
  isCJK: true,
  extraWidths: new Map([
    // This example is not actually useful, but demonstrates how to customize.
    // 'K' now has ambiguous East Asian Width
    [0x4b, AMBIGUOUS],
    // 'O' might now start an Emoji sequence
    [0x4f, POTENTIAL_EMOJI],
    // 'R' now has a width of 3 cells
    [0x52, 3]
  ])
})
widths.js at build time. POTENTIAL_EMOJI (14) is a sentinel for "possible emoji".
AMBIGUOUS (15) is a sentinel for "ambiguous East Asian Width", and all the
rest of the values are the width for that code point. The default result
from the Trie is 1.
includeANSI option is enabled.
Some code points have ambiguous length, which depends upon whether we are
counting in a CJK context or not. By default, StringWidth will look at the
locale that is given (or derived from the environment), and use the default
script of that locale to decide if this is a Chinese, Japanese, or Korean
context. The script identifiers 'Hans', 'Hant', 'Jpan', and 'Kore' signal CJK
context. If desired, this detection can be overridden by passing in the isCJK
field in the constructor options.
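The detection described above can be approximated with the standard Intl.Locale API. This is a minimal sketch, not the library's actual implementation: looksCJK and ambiguousCells are hypothetical helper names, and the two-cells-in-CJK rule for ambiguous code points is assumed from the usual UAX #11 convention.

```javascript
// Hypothetical helpers, not part of @cto.af/string-width.
// The four script subtags come from the paragraph above.
const CJK_SCRIPTS = new Set(['Hans', 'Hant', 'Jpan', 'Kore']);

function looksCJK(locale, isCJK) {
  // An explicit boolean override wins, mirroring the isCJK option.
  if (typeof isCJK === 'boolean') {
    return isCJK;
  }
  // maximize() fills in the likely script subtag,
  // e.g. 'ko-KR' becomes 'ko-Kore-KR'.
  const {script} = new Intl.Locale(locale).maximize();
  return CJK_SCRIPTS.has(script);
}

// Assumed convention (UAX #11): an ambiguous-width code point takes two
// cells in a CJK context and one cell elsewhere.
function ambiguousCells(locale, isCJK) {
  return looksCJK(locale, isCJK) ? 2 : 1;
}

console.log(looksCJK('ko-KR')); // true
console.log(looksCJK('en-US')); // false
console.log(looksCJK('en-US', true)); // true: override wins
console.log(ambiguousCells('ja')); // 2
```

Passing isCJK explicitly is the safer choice when the output device is known, since the locale only hints at how a terminal will render ambiguous characters.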
The break(string, N) method slices a string into chunks, each of which is at
most N cells wide. This logic was so entangled with the width calculation that
it made sense to include it in this library. It is useful for strings longer
than N cells that need a hyphen inserted between segments, ensuring that the
hyphen never lands in the middle of a grapheme cluster.
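The hyphenation use case can be sketched as follows. The hyphenate helper is hypothetical and is not provided by the library. It only assumes that chunks have the {string, cells, last} shape shown in the break() example earlier, and the sample array is copied from that example rather than produced by a live call.

```javascript
// Hypothetical helper: append a hyphen to every chunk except the one
// flagged as last, then put each chunk on its own line.
function hyphenate(chunks) {
  return chunks
    .map(({string, last}) => (last ? string : `${string}-`))
    .join('\n');
}

// Sample data in the documented shape of sw.break('foobar', 3).
const chunks = [
  {string: 'foo', cells: 3},
  {string: 'bar', cells: 3, last: true},
];

console.log(hyphenate(chunks));
// foo-
// bar
```

Because the hyphen occupies one extra cell, a caller that needs every line to fit in N columns would presumably ask break() for chunks of N - 1 cells.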
When a new Unicode version is released, delete the tools/*.txt files, then
run npm run build to regenerate the Trie.