JavaScript: String

  • Can be defined using single or double quotes
  • Concatenation with + is risky: implicit type conversion can give a number where you expected a string (or the reverse). Template literals can be used instead; the result is always a string.
  • Escape character: backslash → \
  • Can insert a Unicode character using → \u{0030} where 0030 is the code point in hex → 30 (hex) / 48 (dec) → the character zero
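
A small sketch of the points above, using made-up values (a, b and the variable names are only for illustration). It shows how + switches between addition and concatenation depending on operand order, while a template literal always gives a string:

```javascript
// Hypothetical values for illustration.
const a = 1;
const b = 2;

// + is evaluated left to right: the numeric addition happens first here.
const surprising = a + b + " items";       // "3 items"
// Once a string is on the left, everything after it is concatenated.
const alsoSurprising = "Items: " + a + b;  // "Items: 12"

// A template literal always produces a string.
const viaTemplate = `Items: ${a + b}`;     // "Items: 3"

// Escapes: backslash before a quote, \u{...} for a code point given in hex.
const escaped = 'It\'s';                   // It's
const zero = "\u{0030}";                   // "0" (hex 30 = decimal 48)
```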

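A string is stored as a sequence of UTF-16 code units. Each Unicode code point (broadly the modern equivalent of an old-world ASCII character code) takes one or two units; codePointAt and for...of work with whole code points.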

let str = "Testing";
str.codePointAt(0); // → 84 (T)

for (const next of str) {   // iterating a string yields one code point at a time
	console.log(next.codePointAt(0));
}

let strCodePoints = [0x48, 0x49, 0x4A];
let newStr = String.fromCodePoint(...strCodePoints); // → "HIJ"
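
The difference between code units and code points only shows up for characters outside the Basic Multilingual Plane. A minimal sketch, using an emoji chosen purely as an example:

```javascript
// U+1F600 (grinning face) needs two UTF-16 code units.
const emoji = "\u{1F600}";

const unitCount = emoji.length;        // 2: length counts UTF-16 code units
const pointCount = [...emoji].length;  // 1: spreading iterates by code point
const cp = emoji.codePointAt(0);       // 128512 (0x1F600)

// fromCodePoint is the inverse of codePointAt.
const roundTrip = String.fromCodePoint(cp) === emoji; // true
```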

String object methods (see String)

  • indexOf
  • lastIndexOf
  • startsWith
  • endsWith
  • includes
  • substring
  • slice
  • split
  • trim
  • padStart
  • padEnd
  • toUpperCase
  • toLowerCase
  • localeCompare
  • repeat
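
A quick tour of the methods listed above, as a sketch on a made-up sample string (the variable names are only for illustration):

```javascript
const trimmed = "  Hello, World  ".trim();       // "Hello, World"

const first = trimmed.indexOf("o");              // 4
const last = trimmed.lastIndexOf("o");           // 8
const starts = trimmed.startsWith("Hello");      // true
const ends = trimmed.endsWith("World");          // true
const has = trimmed.includes(", ");              // true

const sub = trimmed.substring(7, 12);            // "World" (start index, end index)
const tail = trimmed.slice(-5);                  // "World" (negative counts from the end)
const parts = trimmed.split(", ");               // ["Hello", "World"]

const padLeft = "7".padStart(3, "0");            // "007"
const padRight = "7".padEnd(3, "*");             // "7**"
const upper = trimmed.toUpperCase();             // "HELLO, WORLD"
const lower = trimmed.toLowerCase();             // "hello, world"
const repeated = "ab".repeat(3);                 // "ababab"

// localeCompare returns a negative number, zero or a positive number.
const aBeforeB = "a".localeCompare("b") < 0;     // true
```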

Encoding strings for URI & URI components

encodeURI(...)
encodeURIComponent('test?')
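
To see how the two differ: encodeURI is meant for a whole URI and leaves characters such as : / ? & = alone, while encodeURIComponent is meant for a single piece (for example a query value) and encodes those too. A sketch using example.com as a placeholder address:

```javascript
// Whole URI: only characters that are never legal in a URI (here, spaces) are encoded.
const wholeUrl = encodeURI("https://example.com/my page?q=a b");
// "https://example.com/my%20page?q=a%20b"

// Single component: reserved characters such as ? and & are encoded as well.
const component = encodeURIComponent("test?");   // "test%3F"

// Typical use: encode only the value being placed into the query string.
const searchUrl = `https://example.com/search?q=${encodeURIComponent("cats & dogs")}`;
// "https://example.com/search?q=cats%20%26%20dogs"
```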

Template Literals

Use backticks (`) to build a string with embedded ${...} expressions, similar to the C ‘printf’ function. Can be used instead of string concatenation (String1 + String2) to avoid issues with unexpected type conversions.

let myGreeting = `Hello there ${myNameVar}`
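
The placeholder can hold any expression, not just a variable name, and a template literal may span several lines. A short sketch with made-up values (myNameVar, price and qty are assumptions for the example):

```javascript
const myNameVar = "Ada";   // hypothetical value
const price = 4.5;         // hypothetical value
const qty = 3;             // hypothetical value

const myGreeting = `Hello there ${myNameVar}`;               // "Hello there Ada"
const totalLine = `Total: ${(price * qty).toFixed(2)}`;      // "Total: 13.50"

// Line breaks inside the backticks become part of the string.
const multi = `line one
line two`;
```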

Tagged Template Literals

A function can be placed in front of a template literal as a "tag". It is called with the literal text sections and the substituted values. For example:

const func1 = (textSection, ...values) => {
	let result = textSection[0];
	for(let i = 0; i < values.length ; i++){
		result += `*${values[i]}*${textSection[i+1]}`;
	}
	return result;
}

let replaceText = "marvellous";
const myText = func1`This is ${replaceText}. And here is ${replaceText} again.`;

The literal text around the placeholders is split into the array “textSection”, and the replacement values are gathered into the array “values” by the rest parameter (...values). This would give the output:

This is *marvellous*. And here is *marvellous* again.

There is a special tagged template literal used to ignore backslashes:

const out = String.raw`c:\users\noddy`

Here the \u and \n are kept as literal text rather than being treated as escape sequences. Note that String.raw still evaluates ${...} substitutions, and a backtick still ends the literal.
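
A small check of that behaviour, as a sketch (the variable names are only for illustration):

```javascript
const raw = String.raw`c:\users\noddy`;

// The same text written as an ordinary string needs doubled backslashes.
const sameAsEscaped = raw === "c:\\users\\noddy";   // true
const rawLength = raw.length;                       // 14

// Substitutions are still evaluated; only the escape processing is skipped.
const withSub = String.raw`${1 + 1}\n`;             // the characters 2, backslash, n
const withSubLength = withSub.length;               // 3
```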

Regular Expressions

Special characters

  • . matches any one character
  • * previous item can be present 0, 1 or more times
  • + previous item can be present 1 or more times (must be present at least once)
  • ? previous item is optional (present 0 or 1 times)
  • | lists alternatives. a|b|c would match a or b or c.
  • () groups items together. “^.(e|a).$” would match mat and met but not meat (without the ^ and $ anchors it would also match part of meat, e.g. “mea”).
  • [] encapsulates character classes, such as [0-9], [kK], [A-Za-z]
  • ^ inside [] is the complement (i.e. [^0-9] matches any character except the digits zero to nine); outside [] it indicates the start of the text.
  • $ indicates the end of the text

Predefined characters

  • \d digits
  • \s white space
  • \w word characters
  • \D non-digits
  • \S non white space
  • \W non word characters
  • \u{pppp} Unicode code point pppp in hex (requires the u flag, e.g. /\u{0030}/u)
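
The special and predefined characters above can be tried out with test(). A sketch with made-up sample strings; the results are collected in an object called checks, a name chosen just for this example:

```javascript
const checks = {
  groupMet: /^.(e|a).$/.test("met"),          // true
  groupMeat: /^.(e|a).$/.test("meat"),        // false: four characters, pattern needs three
  optional: /^colou?r$/.test("color"),        // true: the u is optional
  starZero: /^ab*c$/.test("ac"),              // true: * allows zero b's
  plusZero: /^ab+c$/.test("ac"),              // false: + needs at least one b
  complement: /^[^0-9]+$/.test("abc"),        // true: no digits present
  classes: /^\d+\s\w+$/.test("42 apples"),    // true: digits, white space, word characters
  nonDigit: /\D/.test("12345"),               // false: every character is a digit
  codePoint: /^\u{1F600}$/u.test("\u{1F600}") // true: \u{...} with the u flag
};
```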

A regular expression literal is enclosed in slash characters rather than quotes

let myRegEx = /[0-9]+Q/ 

is a regular expression for one or more digits followed by the letter Q.

myRegEx.test("AAA12345Q"); //  → true

If the regular expression was

let myRegEx = /^[0-9]+Q$/
myRegEx.test("AAA12345Q"); //  → false because the text must start (“^”) with a digit but starts with A.

/[0-9]+Q/.exec("AAA12345Q"); //  → an array with one entry, 12345Q, which matches the regular expression
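
The array returned by exec carries a little more than the matched text. A sketch that adds a capture group (the parentheses are an addition for this example) to pull out just the digits:

```javascript
const match = /([0-9]+)Q/.exec("AAA12345Q");

const whole = match[0];       // "12345Q": the full match
const digits = match[1];      // "12345": the first () group
const position = match.index; // 3: where the match starts in the input

// exec returns null when there is no match.
const noMatch = /^[0-9]+Q$/.exec("AAA12345Q"); // null
```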