Methods of RegExp and String

In this article we'll cover, in depth, the various methods that work with regular expressions.

str.match(regexp)

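The method str.match(regexp) finds matches for regexp in the string str.

It has 3 modes:

If the regexp doesn't have the flag g, it returns the first match as an array with capturing groups, plus the properties index (the position of the match) and input (the input string, equal to str):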

let str = "I love JavaScript";

let result = str.match(/Java(Script)/);

alert( result[0] );     // JavaScript (full match)
alert( result[1] );     // Script (first capturing group)
alert( result.length ); // 2

// Additional information:
alert( result.index );  // 7 (match position)
alert( result.input );  // I love JavaScript (source string)

If the regexp has the flag g, it returns an array of all matches as strings, without capturing groups or other details.

let str = "I love JavaScript";

let result = str.match(/Java(Script)/g);

alert( result[0] ); // JavaScript
alert( result.length ); // 1

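If there are no matches, null is returned, whether or not the flag g is set.

That's an important nuance. If there are no matches, we get null, not an empty array. It's easy to forget this and make a mistake, e.g.: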
```
let str = "I love JavaScript";

let result = str.match(/HTML/);

alert(result); // null
alert(result.length); // Error: Cannot read property 'length' of null
```
If we want the result to always be an array, we can write:
```
let result = str.match(regexp) || [];
```

str.matchAll(regexp)

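This is a relatively recent addition to the language (ES2020). Old browsers may need a polyfill.

The method str.matchAll(regexp) is a "newer, improved" variant of str.match.

It's used mainly to search for all matches together with all their groups.

There are 3 differences from match:

It returns an iterable object with matches instead of an array. We can make a regular array from it using Array.from.
Every match is returned as an array with capturing groups (the same format as str.match without the flag g).
If there are no results, it returns an empty iterable object instead of null.

Usage example: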

let str = '<h1>Hello, world!</h1>';
let regexp = /<(.*?)>/g;

let matchAll = str.matchAll(regexp);

alert(matchAll); // [object RegExp String Iterator], not array, but an iterable

matchAll = Array.from(matchAll); // array now

let firstMatch = matchAll[0];
alert( firstMatch[0] );  // <h1>
alert( firstMatch[1] );  // h1
alert( firstMatch.index );  // 0
alert( firstMatch.input );  // <h1>Hello, world!</h1>

If we use for..of to loop over the matchAll matches, then we don't need Array.from any more.
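A minimal sketch of the loop form, reusing the `<h1>` string from the example above. The variable names are illustrative, and `console.log` stands in for `alert` so the snippet also runs outside a browser.

```javascript
const html = '<h1>Hello, world!</h1>';
const tagRegexp = /<(.*?)>/g;

// Each iteration yields one match array: [fullMatch, group1], plus index and input
const found = [];
for (const match of html.matchAll(tagRegexp)) {
  found.push(`${match[0]} (${match[1]}) at ${match.index}`);
}
console.log(found); // [ '<h1> (h1) at 0', '</h1> (/h1) at 17' ]

// No matches: we get an empty iterable rather than null,
// so converting it to an array gives an empty array
const none = Array.from(html.matchAll(/<p>/g));
console.log(none.length); // 0
```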

str.split(regexp|substr, limit)

Splits the string using the regexp (or a substring) as a delimiter.

We can use split with strings, like this:

alert('12-34-56'.split('-')) // array of ['12', '34', '56']

But we can split by a regular expression the same way:

alert('12, 34, 56'.split(/,\s*/)) // array of ['12', '34', '56']
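The signature above also lists a second argument, limit, which the examples don't exercise. As a small sketch (the sample strings are made up for illustration), limit caps how many items end up in the resulting array:

```javascript
// With a string delimiter: keep at most 2 items
const parts = '12-34-56'.split('-', 2);
console.log(parts); // [ '12', '34' ]

// With a regexp delimiter: commas with optional spaces around them, at most 3 items
const regexParts = '12, 34,56 , 78'.split(/\s*,\s*/, 3);
console.log(regexParts); // [ '12', '34', '56' ]
```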

str.search(regexp)

The method str.search(regexp) returns the position of the first match, or -1 if none is found:

let str = "A drop of ink may make a million think";

alert( str.search( /ink/i ) ); // 10 (first match position)

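An important limitation: search only finds the first match.

If we need the positions of further matches, we should use other means, such as finding them all with str.matchAll(regexp).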
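For instance, one possible way to collect every match position is to map each matchAll result to its index property. This is a sketch using the same sentence as above, with illustrative variable names:

```javascript
const sentence = "A drop of ink may make a million think";

// Array.from accepts a mapping function, so we can pull out just the positions
const positions = Array.from(sentence.matchAll(/ink/gi), match => match.index);

console.log(positions); // [ 10, 35 ]
```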

str.replace(str|regexp, str|func)

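This is a generic method for searching and replacing, one of the most useful ones. The Swiss army knife of search and replace.

We can use it without regular expressions, to search for and replace a substring: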

// replace a dash by a colon
alert('12-34-56'.replace("-", ":")) // 12:34-56

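There's a pitfall, though.

When the first argument of replace is a string, it only replaces the first match.

You can see that in the example above: only the first "-" is replaced by ":".

To find all hyphens, we need to use the regexp /-/g rather than the string "-", with the obligatory flag g: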

// replace all dashes by a colon
alert( '12-34-56'.replace( /-/g, ":" ) )  // 12:34:56

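The second argument is a replacement string. We can use special character sequences in it:

Symbols	Action in the replacement string
`$&`	inserts the whole match
`` $` ``	inserts the part of the string before the match
`$'`	inserts the part of the string after the match
`$n`	if `n` is a 1-2 digit number, inserts the contents of the n-th capturing group; for details see Capturing groups
`$<name>`	inserts the contents of the parentheses with the given `name`; for details see Capturing groups
`$$`	inserts the character `$`

For instance: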

let str = "John Smith";

// swap first and last name
alert(str.replace(/(john) (smith)/i, '$2, $1')) // Smith, John
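The other sequences from the table can be sketched the same way. The sample strings and group names below are made up for illustration:

```javascript
// $& inserts the whole match
const withMatch = "I love HTML".replace(/HTML/, "$& and JavaScript");
console.log(withMatch); // I love HTML and JavaScript

// $` and $' insert the text before and after the match
const around = "a-b".replace("-", "[$`|$']");
console.log(around); // a[a|b]b

// $<name> inserts the contents of a named group
const dateRegexp = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const reordered = "2024-03-15".replace(dateRegexp, "$<day>.$<month>.$<year>");
console.log(reordered); // 15.03.2024

// $$ inserts a literal dollar sign; here it is followed by $& (the match)
const priced = "price: 10".replace(/\d+/, "$$$&");
console.log(priced); // price: $10
```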

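For situations that require "smart" replacements, the second argument can be a function.

It will be called for each match, and the returned value will be inserted as the replacement.

The function is called with the arguments func(match, p1, p2, ..., pn, offset, input, groups):

match – the match,
p1, p2, ..., pn – the contents of the capturing groups (if there are any),
offset – the position of the match,
input – the source string,
groups – an object with named groups (passed only if the regexp has named groups).

If there are no parentheses in the regexp, then there are only 3 arguments: func(match, offset, input).

For example, let's uppercase all matches: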

let str = "html and css";

let result = str.replace(/html|css/gi, str => str.toUpperCase());

alert(result); // HTML and CSS

Replace each match by its position in the string:

alert("Ho-Ho-ho".replace(/ho/gi, (match, offset) => offset)); // 0-3-6

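In the example below there are two parentheses, so the replacement function is called with 5 arguments: the first is the full match, then the 2 parentheses, and after them (not used in the example) the match position and the source string: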

let str = "John Smith";

let result = str.replace(/(\w+) (\w+)/, (match, name, surname) => `${surname}, ${name}`);

alert(result); // Smith, John

If there are many groups, it's convenient to use rest parameters to access them:

let str = "John Smith";

let result = str.replace(/(\w+) (\w+)/, (...match) => `${match[2]}, ${match[1]}`);

alert(result); // Smith, John

Or, if we're using named groups, then the groups object with them is always the last argument, so we can obtain it like this:

let str = "John Smith";

let result = str.replace(/(?<name>\w+) (?<surname>\w+)/, (...match) => {
  let groups = match.pop();

  return `${groups.surname}, ${groups.name}`;
});

alert(result); // Smith, John

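Using a function gives us the ultimate replacement power, because it gets all the information about the match, has access to outer variables and can do everything.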
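To illustrate the point about outer variables, here is a sketch of a tiny placeholder substitution. The `values` table, the `{name}` placeholder syntax and the sample template are all hypothetical, chosen only for this example:

```javascript
// Hypothetical lookup table that lives outside the replacer function
const values = { user: "Ann", city: "Oslo" };

const template = "Hello, {user} from {city}! Unknown: {planet}";

// The replacer closes over `values`; placeholders without an entry are left as they are
const filled = template.replace(/\{(\w+)\}/g, (match, key) =>
  Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
);

console.log(filled); // Hello, Ann from Oslo! Unknown: {planet}
```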

str.replaceAll(str|regexp, str|func)

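This method is essentially the same as str.replace, with two major differences:

If the first argument is a string, it replaces all occurrences of the string, while replace replaces only the first occurrence.
If the first argument is a regular expression without the g flag, there'll be an error (a TypeError). With the g flag, it works the same as replace.

The main use case for replaceAll is replacing all occurrences of a string.

Like this: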

// replace all dashes by a colon
alert('12-34-56'.replaceAll("-", ":")) // 12:34:56
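The second difference can be checked with a short sketch: passing a regexp without the g flag throws, while the same regexp with g behaves like replace. The try..catch wrapper and variable names are just for illustration.

```javascript
let errorName = null;
try {
  '12-34-56'.replaceAll(/-/, ':'); // regexp without the g flag
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError

// With the g flag the call succeeds and replaces every dash
const replaced = '12-34-56'.replaceAll(/-/g, ':');
console.log(replaced); // 12:34:56
```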

regexp.exec(str)

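The regexp.exec(str) method returns a match for regexp in the string str. Unlike the previous methods, it's called on a regexp, not on a string.

It behaves differently depending on whether the regexp has the flag g.

If there's no g, then regexp.exec(str) returns the first match, exactly as str.match(regexp) does. This behavior doesn't bring anything new.

But if there's the flag g, then:

A call to regexp.exec(str) returns the first match and saves the position immediately after it in the property regexp.lastIndex.
The next such call starts the search from position regexp.lastIndex, returns the next match and saves the position after it in regexp.lastIndex.
…And so on.
If there are no matches, regexp.exec returns null and resets regexp.lastIndex to 0.

So, repeated calls return all matches one after another, using the property regexp.lastIndex to keep track of the current search position.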
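The steps above can be traced call by call. This is a sketch on a short made-up string, recording lastIndex after each call:

```javascript
const wordRegexp = /\w+/g;
const text = "Hi all";

const first = wordRegexp.exec(text);
const afterFirst = wordRegexp.lastIndex;
console.log(first[0], afterFirst);   // Hi 2

const second = wordRegexp.exec(text); // search resumes from position 2
const afterSecond = wordRegexp.lastIndex;
console.log(second[0], afterSecond); // all 6

const third = wordRegexp.exec(text);  // nothing left to find
const afterThird = wordRegexp.lastIndex;
console.log(third, afterThird);      // null 0
```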

In the past, before the method str.matchAll was added to JavaScript, calls of regexp.exec were used in a loop to get all matches with groups:

let str = 'More about JavaScript at https://javascript.js.cn';
let regexp = /javascript/ig;

let result;

while (result = regexp.exec(str)) {
  alert( `Found ${result[0]} at position ${result.index}` );
  // Found JavaScript at position 11, then
  // Found javascript at position 33
}

This works now as well, although for newer browsers str.matchAll is usually more convenient.

We can use regexp.exec to search from a given position by manually setting lastIndex.

For instance:

let str = 'Hello, world!';

let regexp = /\w+/g; // without flag "g", lastIndex property is ignored
regexp.lastIndex = 5; // search from 5th position (from the comma)

alert( regexp.exec(str) ); // world

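If the regexp has the flag y, then the search is performed exactly at the position regexp.lastIndex, not any further.

Let's replace the flag g with y in the example above. There will be no matches, as there's no word at position 5: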

let str = 'Hello, world!';

let regexp = /\w+/y;
regexp.lastIndex = 5; // search exactly at position 5

alert( regexp.exec(str) ); // null

That's convenient for situations when we need to "read" something from the string by a regexp at the exact position, not somewhere further.
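For contrast, here is a sketch of a successful sticky match on the same string: a word does start at position 7, so the search at exactly that position succeeds. Variable names are illustrative.

```javascript
const greeting = 'Hello, world!';
const stickyWord = /\w+/y;

// "world" starts at position 7, so a sticky search there matches
stickyWord.lastIndex = 7;
const atSeven = stickyWord.exec(greeting);
const afterMatch = stickyWord.lastIndex;
console.log(atSeven[0], atSeven.index, afterMatch); // world 7 12

// Position 5 holds a comma, so the sticky search fails and lastIndex resets to 0
stickyWord.lastIndex = 5;
const atFive = stickyWord.exec(greeting);
console.log(atFive, stickyWord.lastIndex); // null 0
```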

regexp.test(str)

The method regexp.test(str) looks for a match and returns true/false depending on whether one exists.

For instance:

let str = "I love JavaScript";

// these two tests do the same
alert( /love/i.test(str) ); // true
alert( str.search(/love/i) != -1 ); // true

An example with a negative answer:

let str = "Bla-bla-bla";

alert( /love/i.test(str) ); // false
alert( str.search(/love/i) != -1 ); // false

If the regexp has the flag g, then regexp.test looks from the regexp.lastIndex property and updates this property, just like regexp.exec.

So we can use it to search from a given position:

let regexp = /love/gi;

let str = "I love JavaScript";

// start the search from position 10:
regexp.lastIndex = 10;
alert( regexp.test(str) ); // false (no match)

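If we apply the same global regexp to different inputs, it may lead to a wrong result, because the regexp.test call advances the regexp.lastIndex property, so the search in another string may start from a non-zero position.

For instance, here we call regexp.test twice on the same text, and the second call fails: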

let regexp = /javascript/g;  // (regexp just created: regexp.lastIndex=0)

alert( regexp.test("javascript") ); // true (regexp.lastIndex=10 now)
alert( regexp.test("javascript") ); // false

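That's exactly because regexp.lastIndex is non-zero in the second test.

To work around that, we can set regexp.lastIndex = 0 before each search. Or, instead of calling methods on the regexp, we can use the string methods str.match/search/..., which don't use lastIndex.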
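Both workarounds can be sketched side by side. The `inputs` array is made up for illustration; the point is that every input is checked from position 0.

```javascript
const jsRegexp = /javascript/g;
const inputs = ["javascript", "javascript", "I like javascript"];

// Workaround 1: reset lastIndex before every test call
const viaTest = inputs.map(input => {
  jsRegexp.lastIndex = 0;
  return jsRegexp.test(input);
});
console.log(viaTest); // [ true, true, true ]

// Workaround 2: use a string method, which always searches from the start
const viaSearch = inputs.map(input => input.search(jsRegexp) !== -1);
console.log(viaSearch); // [ true, true, true ]
```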