Get the last character of a string in modern JavaScript, allowing for astral characters such as emoji that use surrogate pairs (two code units)


Unicode characters (code points) outside the Basic Multilingual Plane (BMP) are encoded in UTF-16 as two chars (code units), called a surrogate pair.

'ab' is two code units and two code points. (So two chars and two characters.)

'a😀' is three code units and two code points. (So three chars and two characters.)
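
The two counts can be checked directly. A minimal sketch, assuming U+1F600 as the example astral character (any code point outside the BMP should behave the same way); the variable names are only for illustration:

```javascript
// U+1F600 lies outside the BMP, so UTF-16 stores it as two code units.
const bmpOnly = 'ab';
const withAstral = 'a\u{1F600}';

// .length counts code units.
const unitsBmp = bmpOnly.length;
const unitsAstral = withAstral.length;

// The string iterator (used by spread) yields whole code points.
const pointsBmp = [...bmpOnly].length;
const pointsAstral = [...withAstral].length;

// Indexing works per code unit, so the last index holds only
// the second half (low surrogate) of the pair.
const lastUnit = withAstral[withAstral.length - 1];

console.log(unitsBmp, unitsAstral);   // 2 3
console.log(pointsBmp, pointsAstral); // 2 2
console.log(lastUnit === '\uDE00');   // true
```

This is why `str[str.length - 1]` is not enough for the question as asked.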

My code does not need to work with old versions of JavaScript; ES6 or anything more recent is fine.

How can I access the last character, irrespective of whether it's an astral character or not?

Splitting the string into "all but last character" and "final character" is also fine.


There are 2 answers

Andreas (best answer)

Spreading a string splits it into its code points:

[...'a😀'].pop()
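
The same idea can cover the "all but last character" split from the question. A sketch, where `splitLast` is a hypothetical helper name and not part of any library:

```javascript
// Hypothetical helper: split a string into everything before the
// last code point, and the last code point itself.
function splitLast(str) {
  const points = [...str];         // one array element per code point
  const last = points.pop() ?? ''; // '' when the string is empty
  return [points.join(''), last];
}

const [head, last] = splitLast('abc\u{1F600}');
console.log(head); // abc
console.log(last === '\u{1F600}'); // true
```

Note that this spreads the whole string just to reach its end, which is simple but does work proportional to the string length.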
hippietrail

I knew from answers to other SO questions that both Array.from() and regular expressions with the /u flag correctly handle non-BMP Unicode characters, but I didn't think either was likely to be the best answer.

Maybe I was wrong, so here are two solutions:

Array.from()

// .at(-1) picks the last code point whatever the string length
let c = Array.from('a😀').at(-1);
console.log(c);

u flag

let c = 'a😀'.match(/.$/u)[0];
console.log(c);

This second approach can be extended to answer the second part of my question too:

let [,l,r] = 'abcd'.match(/(.*)(.)/u);
console.log(l);
console.log(r);

(No anchor needed as the .* will be greedy.)
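
One caveat with both patterns: without the `s` (dotAll) flag, `.` does not match line terminators, so a string that ends in a newline produces no match at all. A sketch that adds the `s` flag (ES2018), with `splitLastRegex` as a hypothetical helper name:

```javascript
// Hypothetical helper: regex-based split into "all but last" and "last".
function splitLastRegex(str) {
  // u: match by code point; s: let . match line terminators as well
  const m = str.match(/^(.*)(.)$/us);
  // The pattern needs at least one code point, so only '' fails to match.
  return m ? [m[1], m[2]] : ['', ''];
}

console.log(splitLastRegex('abc\u{1F600}')[0]); // abc
console.log('a\n'.match(/.$/u));                // null (no s flag)
console.log(splitLastRegex('a\n')[1] === '\n'); // true
```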