JavaScript, convert Unicode string to JavaScript escape?

I have a variable that contains a string consisting of Japanese characters. How would I go about converting this to its JavaScript escape form?

"み".charCodeAt(0).toString(16);

This gives you the UTF-16 code unit of the character as a hex string ("307f"). To convert a whole string, run it through a loop:

String.prototype.toUnicode = function() {
   var result = "";
   for (var i = 0; i < this.length; i++) {
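      // charCodeAt returns a UTF-16 code unit (0x0000-0xffff), so 4 hex digits suffice;
      // characters above U+FFFF come out as two escapes (a surrogate pair)
      result += "\\u" + ("000" + this.charCodeAt(i).toString(16)).slice(-4);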
   }
   return result;
};

"みどりいろ".toUnicode(); //"\u307f\u3069\u308a\u3044\u308d"
"Mi Do Ri I Ro".toUnicode(); //"\u004d\u0069\u0020\u0044\u006f\u0020\u0052\u0069\u0020\u0049\u0020\u0052\u006f"
"Green".toUnicode(); //"\u0047\u0072\u0065\u0065\u006e"
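The same loop can be written without extending String.prototype. The following is a minimal sketch under that assumption; the function name toUnicodeEscape is hypothetical and does not come from the answer above.

```javascript
// Hypothetical helper (not part of the original answer): escapes every
// UTF-16 code unit of `str` as \uXXXX without touching String.prototype.
function toUnicodeEscape(str) {
  let result = "";
  for (let i = 0; i < str.length; i++) {
    // charCodeAt returns a code unit in the range 0x0000-0xffff,
    // so padding to four hex digits is always enough.
    result += "\\u" + str.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return result;
}

console.log(toUnicodeEscape("みどりいろ")); // \u307f\u3069\u308a\u3044\u308d
console.log(toUnicodeEscape("\u{1F600}")); // \ud83d\ude00 (one escape per surrogate)
```

A standalone function avoids the usual risk of prototype extensions clashing with other code or future language features.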

Topics: Unicode encoding of source files, using Unicode in a string, getting the proper length of a string, ES6 Unicode code point escapes.

If the file is fetched using HTTP (or HTTPS), the Content-Type header can specify the encoding:

Content-Type: application/javascript; charset=utf-8

The charset attribute of the script tag can also declare it:

<script src="./app.js" charset="utf-8">

as can the charset meta tag of the document:

...
<head>
   <meta charset="utf-8" />
</head>
...

const s1 = '\u00E9' //é
const s2 = '\u0065\u0301' //é
s1.length //1
s2.length //2

const s3 = 'e\u0301' //é
s3.length === 2 //true
s2 === s3 //true
s1 !== s3 //true

s1.normalize() === s3.normalize() //true

const s4 = '🐶'
s4.length //2
'👩‍❤️‍👩'.length //8

;[...'🐶'].length //1
require('punycode').ucs2.decode('🐶').length //1
require('punycode').ucs2.decode('👩‍❤️‍👩').length //6
;[...'👩‍❤️‍👩'].length //6

'\u{XXXXX}'
'\x61' // a
'\x2A' // *
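
To tie the snippets above together, here is a small sketch that contrasts code unit length with code point length and shows that a code point escape and a surrogate pair produce the same string. The helper names codeUnitLength and codePointLength are made up for illustration.

```javascript
// Hypothetical helpers: the names are illustrative only.
function codeUnitLength(str) {
  return str.length; // counts UTF-16 code units
}

function codePointLength(str) {
  return [...str].length; // the string iterator walks code points
}

const dog = '\u{1F436}'; // 🐶 written as an ES6 code point escape

console.log(codeUnitLength(dog));       // 2
console.log(codePointLength(dog));      // 1
console.log(dog === '\uD83D\uDC36');    // true (same string as the surrogate pair)

const composed = '\u00E9';    // é as a single code point
const decomposed = 'e\u0301'; // e followed by a combining acute accent

console.log(composed === decomposed); // false
console.log(composed.normalize('NFC') === decomposed.normalize('NFC')); // true
```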

Probably the most important concept about Unicode in JavaScript is to treat strings as sequences of code units, as they really are. Escape sequences in a JavaScript string are used to express code units based on code point numbers. JavaScript has 3 escape types, one of which was introduced in ECMAScript 2015. Because strings are sequences of code units, you could reasonably expect that string comparison involves the evaluation of code units for a match.

An astral code point requires two code units: a surrogate pair. As you saw in the previous example, to encode U+1F600 (😀) in UTF-16 a surrogate pair is used: 0xD83D 0xDE00.

console.log('\uD83D\uDE00'); // => '😀'
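
The surrogate pair for an astral code point can be computed with the standard UTF-16 arithmetic. This is a sketch only; toSurrogatePair is a hypothetical name, not a built-in.

```javascript
// Hypothetical helper: splits an astral code point (U+10000..U+10FFFF)
// into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("not an astral code point");
  }
  const offset = codePoint - 0x10000;      // 20-bit value
  const high = 0xd800 + (offset >> 10);    // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);   // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(0x1f600);
console.log(pair.map((unit) => unit.toString(16))); // [ 'd83d', 'de00' ]
console.log(String.fromCharCode(pair[0], pair[1]) === '\u{1F600}'); // true
```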


Definition and Usage

The String.fromCharCode() method converts Unicode values (UTF-16 code units) to characters.

String.fromCharCode()
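
String.fromCharCode() is also the building block for going in the opposite direction, from escape text back to characters. The sketch below assumes the input contains literal backslash-u sequences of exactly four hex digits; fromUnicodeEscape is a hypothetical name.

```javascript
console.log(String.fromCharCode(0x307f));                 // み
console.log(String.fromCharCode(0x307f, 0x3069, 0x308a)); // みどり

// Hypothetical helper: turns the text \uXXXX (a backslash, "u" and four
// hex digits) back into the character it names. Other text is left alone.
function fromUnicodeEscape(escaped) {
  return escaped.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(fromUnicodeEscape("\\u307f\\u3069\\u308a\\u3044\\u308d")); // みどりいろ
```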

escape(str) returns a new string in which certain characters have been escaped.

Note: This function was used mostly for URL queries (the part of a URL following ?), not for escaping ordinary String literals, which use the format "\xHH". (HH are two hexadecimal digits, and the form \xHH\xHH is used for higher-plane Unicode characters.) Escaped characters in String literals can be expanded by replacing the \x with %, then using the decodeURIComponent() function.

escape(str)
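
To show why escape() does not produce JavaScript string escapes, here is a short comparison. It assumes a runtime that still ships the legacy global escape(), as browsers and Node.js do.

```javascript
// escape() is a legacy global; it emits %XX and %uXXXX, not \xXX or \uXXXX.
console.log(escape("abc"));    // abc
console.log(escape("\u00E9")); // %E9
console.log(escape("み"));     // %u307F

// encodeURIComponent() percent-encodes the UTF-8 bytes instead.
console.log(encodeURIComponent("み")); // %E3%81%BF
```

Neither form can be pasted into a string literal as is, which is why the answers here build the \uXXXX text by hand.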

Demo: http://mothereff.in/js-escapes

We could save some bytes in the output by using escapes of the form \xab instead of \u1234; the code for that is more compact, too. Octal escapes can only be used for charCodes smaller than 256, and the test results show that they're only shorter than Unicode/hex escapes for charCodes < 64.

The basic version escapes every character of the string as \uXXXX:

function unicodeEscape(str) {
   // [\s\S] matches every code unit, including line breaks
   return str.replace(/[\s\S]/g, function(character) {
      return '\\u' + ('0000' + character.charCodeAt(0).toString(16)).slice(-4);
   });
}
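
A variant that replaces only the non-ASCII characters and uses the shorter \xHH form where it applies might look like the following. This is a sketch, not the code from the linked demo; escapeNonAscii is a hypothetical name, and the printable range 0x20 to 0x7e is an assumption about what should be left alone.

```javascript
// Hypothetical helper: leaves printable ASCII untouched and escapes
// everything else, using \xHH below 0x100 and \uHHHH otherwise.
function escapeNonAscii(str) {
  return str.replace(/[^\x20-\x7e]/g, (character) => {
    const code = character.charCodeAt(0);
    return code < 0x100
      ? "\\x" + code.toString(16).padStart(2, "0")
      : "\\u" + code.toString(16).padStart(4, "0");
  });
}

console.log(escapeNonAscii("Green \u00E9 みどり")); // Green \xe9 \u307f\u3069\u308a
```

Note that quotes and backslashes in the input pass through unchanged, so the result is not automatically safe to paste inside a string literal.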

Having recently written about character references in HTML and escape sequences in CSS, I figured it would be interesting to look into JavaScript character escapes as well.

ECMAScript 6 introduces a new kind of escape sequence in strings, namely Unicode code point escapes. Additionally, it defines String.fromCodePoint and String#codePointAt, both of which accept code points rather than UCS-2/UTF-16-like code units.

It's a bit confusing that the spec refers to \xHH escape sequences as “hexadecimal”, since Unicode escapes use hex as well.

Octal escapes additionally produce syntax errors in strict mode.

There’s only one exception to this rule:

'abc\def' == 'abcdef'; // true
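
For the ES6 code point escapes mentioned above, a converter can iterate by code point instead of by code unit. The following is a minimal sketch; toCodePointEscape is a hypothetical name, and the output is only valid in engines that support the \u{...} syntax.

```javascript
// Hypothetical helper: emits one \u{...} escape per code point, so an
// astral character yields a single escape instead of a surrogate pair.
function toCodePointEscape(str) {
  let result = "";
  for (const character of str) { // for...of iterates by code point
    result += "\\u{" + character.codePointAt(0).toString(16) + "}";
  }
  return result;
}

console.log(toCodePointEscape("み\u{1F600}")); // \u{307f}\u{1f600}
console.log(String.fromCodePoint(0x1f600) === "\uD83D\uDE00"); // true
```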

Unicode escape (exactly four hexadecimal digits): \uHHHH

  > '\u007A' === 'z'
  true

A 4-digit Unicode escape \uHHHH contributes a UTF-16 code unit.
A 4-digit Unicode escape \uHHHH becomes a single code point.

Hex escape (exactly two hexadecimal digits): \xHH

  > '\x7A' === 'z'
  true
