How to convert one emoji character to Unicode codepoint number in JavaScript?

Active 3 hr ago
Viewed 126 times

6 Answers



I added a script to do this conversion on the browser side:

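function emojiUnicode(emoji) {
   var comp;
   if (emoji.length === 1) {
      // A single UTF-16 code unit is already the code point.
      comp = emoji.charCodeAt(0);
   } else {
      // Surrogate pair: combine the high and low surrogates.
      comp = (
         (emoji.charCodeAt(0) - 0xD800) * 0x400 +
         (emoji.charCodeAt(1) - 0xDC00) + 0x10000
      );
      if (comp < 0) {
         // Not a surrogate pair after all: fall back to the first code unit.
         comp = emoji.charCodeAt(0);
      }
   }
   return comp.toString(16);
}
// emojiUnicode("😀") returns "1f600"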
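
As a shorter alternative, here is a minimal sketch that relies on String.prototype.codePointAt() (ES2015), which reads a full code point directly, so the surrogate arithmetic is not needed. The names emojiToHex and emojiToHexAll are hypothetical and not part of the original answer. Note that some emoji (skin tones, flags, ZWJ sequences) consist of several code points, which is why the second helper returns an array.

```javascript
// Hypothetical helper (not from the original answer): returns the hex
// code point of the first character in the string.
function emojiToHex(emoji) {
  const codePoint = emoji.codePointAt(0); // undefined for an empty string
  if (codePoint === undefined) {
    throw new RangeError("expected a non-empty string");
  }
  return codePoint.toString(16);
}

// Hypothetical helper: converts every code point in the string.
function emojiToHexAll(emoji) {
  // Array.from iterates by code point, not by UTF-16 code unit.
  return Array.from(emoji, (ch) => ch.codePointAt(0).toString(16));
}

console.log(emojiToHex("\u{1F600}"));             // "1f600"
console.log(emojiToHexAll("\u{1F44D}\u{1F3FD}")); // ["1f44d", "1f3fd"]
```

The second call uses a thumbs-up followed by a skin-tone modifier, which renders as one glyph but is two code points.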

Parsing emoji in JavaScript is… not easy.
Code point: A numerical representation of a specific Unicode character.
Character code: Another name for a code point.
Hexadecimal: A way to represent code points in base 16.

Here’s an example of using these methods, courtesy of

console.log("😸".charCodeAt(0)); // prints 55357, WRONG!

console.log("😸".codePointAt(0)); // prints 128568, correct
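
The two results differ because JavaScript strings are sequences of 16-bit UTF-16 code units, and any character above U+FFFF is stored as a surrogate pair. charCodeAt(0) returns only the first half of that pair, while codePointAt(0) combines both halves. The sketch below assumes the character is U+1F638, chosen because its decimal value matches the 128568 quoted above.

```javascript
// U+1F638 is assumed here because its decimal value (128568) matches
// the output quoted above.
const cat = "\u{1F638}";

console.log(cat.length);         // 2 (two UTF-16 code units)
console.log(cat.charCodeAt(0));  // 55357 (0xD83D, high surrogate only)
console.log(cat.charCodeAt(1));  // 56888 (0xDE38, low surrogate)
console.log(cat.codePointAt(0)); // 128568 (0x1F638, the full code point)
console.log([...cat].length);    // 1 (string iteration goes by code point)
```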

The static String.fromCodePoint() method returns a string created by using the specified sequence of code points.
Parameters: A sequence of code points.
Return value: A string created by using the specified sequence of code points.
Exceptions: A RangeError is thrown if an invalid Unicode code point is given (e.g. "RangeError: NaN is not a valid code point").

String.fromCodePoint(num1, num2)
String.fromCodePoint(num1, num2, ..., numN)
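
String.fromCodePoint() is the inverse of codePointAt(), so it can turn a hex string back into the character. The following is a minimal sketch; hexToEmoji is a hypothetical helper name, and it assumes the input is a bare hex string such as "1f600" without a "U+" prefix.

```javascript
// Hypothetical helper: turns a hex string such as "1f600" back into a character.
function hexToEmoji(hex) {
  const codePoint = parseInt(hex, 16); // NaN if hex does not start with a hex digit
  // Throws RangeError for NaN, negative values, or values above 0x10FFFF.
  return String.fromCodePoint(codePoint);
}

console.log(hexToEmoji("1f600"));                    // 😀
console.log(String.fromCodePoint(0x1f600, 0x1f603)); // 😀😃

try {
  hexToEmoji("not hex");
} catch (err) {
  console.log(err instanceof RangeError); // true
}
```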

java - How to convert a Unicode code point to its character representation?
unicode - How to delete emoticon code with JavaScript?
python - How to convert an int representing a UTF-8 character to a Unicode code point?

How to convert this 😀 into this 1f600 in JavaScript?


The Intl extension provides a function to return the code point for a character. As it returns an integer, you just need to convert it to a hex string.

I am trying to convert emoji to Unicode with PHP, more info:

How to convert this 😃 into this U+1F603 with PHP?

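function convert_emoji($var) {
    // Sketch, assuming IntlChar::ord() is the Intl function meant above; it returns an integer code point.
    return sprintf('U+%X', IntlChar::ord($var));
}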

The Unicode Character Database, a text document listing the names, code points and properties of all Unicode characters

UTF-8, the dominant encoding on the World Wide Web (used in over 95% of websites as of 2020, and up to 100% for some languages)[2] and on most Unix-like operating systems, uses one byte[note 1] (8 bits) for the first 128 code points, and up to 4 bytes for other characters.[3] The first 128 Unicode code points represent the ASCII characters, which means that any ASCII text is also a UTF-8 text.

There is also a Medieval Unicode Font Initiative focused on special Latin medieval characters. Part of these proposals have been already included into Unicode.

Although syntax rules may affect the order in which characters are allowed to appear, XML (including XHTML) documents, by definition,[75] comprise characters from most of the Unicode code points, with the exception of:

The UCS-2 and UTF-16 encodings specify the Unicode Byte Order Mark (BOM) for use at the beginnings of text files, which may be used for byte ordering detection (or byte endianness detection). The BOM, code point U+FEFF, has the important property of unambiguity on byte reorder, regardless of the Unicode encoding used; U+FFFE (the result of byte-swapping U+FEFF) does not equate to a legal character, and U+FEFF in places other than the beginning of text conveys the zero-width non-break space (a character with no appearance and no effect other than preventing the formation of ligatures).
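
The byte ordering detection described above can be sketched as follows. This is only an illustration: detectUtf16ByteOrder is a hypothetical helper, and it assumes the input is already known to be UTF-16 (the bytes FF FE also begin the UTF-32 little-endian BOM, which this sketch does not distinguish).

```javascript
// Hypothetical helper: inspects the first two bytes of a UTF-16 byte stream.
function detectUtf16ByteOrder(bytes) {
  if (bytes.length >= 2) {
    // U+FEFF stored high byte first.
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return "big-endian";
    // U+FEFF stored low byte first (reads as U+FFFE if misinterpreted).
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return "little-endian";
  }
  return "unknown"; // no BOM present
}

console.log(detectUtf16ByteOrder(Uint8Array.of(0xfe, 0xff, 0x00, 0x41))); // "big-endian"
console.log(detectUtf16ByteOrder(Uint8Array.of(0xff, 0xfe, 0x41, 0x00))); // "little-endian"
console.log(detectUtf16ByteOrder(Uint8Array.of(0x00, 0x41)));             // "unknown"
```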

The same character converted to UTF-8 becomes the byte sequence EF BB BF. The Unicode Standard allows that the BOM "can serve as signature for UTF-8 encoded text where the character set is unmarked".[69] Some software developers have adopted it for other encodings, including UTF-8, in an attempt to distinguish UTF-8 from local 8-bit code pages. However, RFC 3629, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8, but discusses the cases where this may not be possible. In addition, the large restriction on possible patterns in UTF-8 (for instance there cannot be any lone bytes with the high bit set) means that it should be possible to distinguish UTF-8 from other character encodings without relying on the BOM.
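
The UTF-8 byte sequences mentioned above can be checked in JavaScript with the standard TextEncoder API, which always encodes to UTF-8. This is a small sketch; toHex is a hypothetical formatting helper, and the sample characters other than U+FEFF are chosen only for illustration.

```javascript
const encoder = new TextEncoder(); // always encodes to UTF-8

// Hypothetical helper: formats bytes as space-separated uppercase hex.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

console.log(toHex(encoder.encode("\uFEFF")));    // "EF BB BF" (the BOM in UTF-8)
console.log(encoder.encode("A").length);         // 1 (ASCII takes one byte)
console.log(encoder.encode("\u{1F600}").length); // 4 (an emoji above U+FFFF takes four)
console.log(toHex(encoder.encode("\u{1F600}"))); // "F0 9F 98 80"
```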

