How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops

(Translated from the original English question of the same title.)

How can I use regular expressions in Excel and take advantage of Excel's powerful grid-like setup for data manipulation?

- In-cell function to return a matched pattern or replaced value in a string.
- Sub to loop through a column of data and extract matches to adjacent cells.
- What setup is necessary?
- What are Excel's special characters for regular expressions?

I understand Regex is not ideal for many situations (To use or not to use regular expressions?) since Excel can use Left, Mid, Right, Instr type commands for similar manipulations.


Reply #2

Regular expressions are used for pattern matching.

To use them in Excel, follow these steps:

Step 1: Add VBA reference to "Microsoft VBScript Regular Expressions 5.5"

- Select the "Developer" tab (I don't have this tab what do I do?)
- Select the "Visual Basic" icon from the 'Code' ribbon section
- In the "Microsoft Visual Basic for Applications" window select "Tools" from the top menu.
- Select "References"
- Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include it in your workbook.
- Click "OK"

Step 2: Define your pattern

Basic definitions:

-   Range.

  E.g. a-z matches any lower case letter from a to z
  E.g. 0-5 matches any number from 0 to 5

[]  Match exactly one of the objects inside these brackets.

  E.g. [a] matches the letter a
  E.g. [abc] matches a single letter which can be a, b or c
  E.g. [a-z] matches any single lower case letter of the alphabet.
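The bracket and range rules above can be tried out quickly outside Excel. The sketch below uses Python's re module as a stand-in, on the assumption that these basic character classes behave the same way in the VBScript engine. The helper name first_match is made up for illustration.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text that matches pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(first_match(r"[abc]", "xyzb"))  # a single character that is a, b or c
print(first_match(r"[a-z]", "ABCd"))  # any single lower case letter
print(first_match(r"[0-5]", "789"))   # no digit from 0 to 5 is present
```

A result of None means no character in the input belongs to the bracketed set.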

()  Groups different matches for return purposes. See examples below.

{}  Multiplier for repeated copies of the pattern defined before it.

  E.g. [a]{2} matches two consecutive lower case letters a: aa
  E.g. [a]{1,3} matches at least one and up to three lower case letters a: a, aa, aaa

+   Match at least one, or more, of the pattern defined before it.

  E.g. a+ will match consecutive a's: a, aa, aaa, and so on

?   Match zero or one of the pattern defined before it.

  E.g. the pattern may or may not be present but can only be matched one time.
  E.g. [a-z]? matches an empty string or any single lower case letter.

*   Match zero or more of the pattern defined before it.

  E.g. wildcard for a pattern that may or may not be present.
  E.g. [a-z]* matches an empty string or a string of lower case letters.
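To see how the multipliers differ, the following sketch checks whole strings against a few of the patterns listed above. It is written with Python's re module rather than VBA, assuming the quantifier syntax is shared by both engines. matches_whole is an illustrative helper, not part of any answer here.

```python
import re

def matches_whole(pattern, text):
    """True when the entire text (not just a part of it) matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches_whole(r"[a]{2}", "aa"))      # exactly two a's
print(matches_whole(r"[a]{1,3}", "aaaa"))  # four a's exceeds the upper bound
print(matches_whole(r"a+", ""))            # + needs at least one a
print(matches_whole(r"[a-z]?", ""))        # ? accepts zero occurrences
print(matches_whole(r"[a-z]*", "hello"))   # * accepts any number of letters
```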

.   Matches any character except newline \n

  E.g. a. matches a two character string starting with a and ending with anything except \n

|   OR operator

  E.g. a|b matches either a or b

^   NOT operator (when it is the first character inside [])

  E.g. [^0-9] character can not be a digit
  E.g. [^aA] character can not be lower case a or upper case A

\   Escapes the special character that follows (overrides the behavior above)

  E.g. \., \\, \(, \?, \$, \^
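A short check of the dot, OR, NOT and escape rules, again sketched in Python on the assumption that these four constructs mean the same thing there as in VBScript. The sample strings are invented for the demonstration.

```python
import re

dot_ok = re.fullmatch(r"a.", "ax") is not None        # a followed by any character
dot_newline = re.fullmatch(r"a.", "a\n") is not None  # . does not match a newline
either = re.findall(r"cat|dog", "cat, dog, cow")      # | accepts either alternative
non_digits = re.findall(r"[^0-9]", "a1b2")            # ^ inside [] negates the set
literal_dots = re.findall(r"\.", "v1.2.3")            # \. is a literal dot

print(dot_ok, dot_newline, either, non_digits, literal_dots)
```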

Anchoring Patterns:

^   Match must occur at start of string

  E.g. ^a First character must be lower case letter a
  E.g. ^[0-9] First character must be a number.

$   Match must occur at end of string

  E.g. a$ Last character must be lower case letter a
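The two anchors can be illustrated with a pair of small checks. This is a Python sketch with made-up function names. It assumes single-line input, since with multi-line matching switched on the anchors also apply at line breaks.

```python
import re

def starts_with_digit(text):
    """True when the first character is a digit (pattern ^[0-9])."""
    return re.search(r"^[0-9]", text) is not None

def ends_with_a(text):
    """True when the last character is a lower case a (pattern a$)."""
    return re.search(r"a$", text) is not None

print(starts_with_digit("1abc"), starts_with_digit("abc1"))
print(ends_with_a("banana"), ends_with_a("band"))
```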

Precedence table:

Order  Name                Representation
1      Parentheses         ( )
2      Multipliers         ? + * {m,n} {m, n}?
3      Sequence & Anchors  abc ^ $
4      Alternation         |
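The practical effect of this ordering is easiest to see by comparing patterns with and without parentheses. The sketch below does so in Python, assuming the same precedence rules apply there. The variable names and test strings are illustrative only.

```python
import re

# Multipliers bind tighter than sequence: ab+ means a(b+), not (ab)+
seq_a = re.fullmatch(r"ab+", "abbb") is not None
seq_b = re.fullmatch(r"ab+", "abab") is not None
grouped = re.fullmatch(r"(ab)+", "abab") is not None

# Alternation binds loosest: ^ab|cd$ means (^ab)|(cd$)
alt_loose = re.search(r"^ab|cd$", "abXX") is not None
alt_grouped = re.search(r"^(ab|cd)$", "abXX") is not None

print(seq_a, seq_b, grouped, alt_loose, alt_grouped)
```

Parentheses are therefore needed whenever an anchor or multiplier should apply to a whole alternation or sequence.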

Predefined Character Abbreviations:

abr  same as          meaning
\d   [0-9]            Any single digit
\D   [^0-9]           Any single character that's not a digit
\w   [a-zA-Z0-9_]     Any word character
\W   [^a-zA-Z0-9_]    Any non-word character
\s   [ \r\t\n\f]      Any space character
\S   [^ \r\t\n\f]     Any non-space character
\n   [\n]             New line
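As a rough check of the abbreviations, the sketch below applies \d, \w and \s to an invented sample string in Python. The re.ASCII flag is passed because Python's abbreviations otherwise also cover non-ASCII characters, which the bracket forms in the table do not.

```python
import re

text = "Order 66\tshipped\n"  # sample text made up for this demonstration

digits = re.findall(r"\d", text, re.ASCII)   # same as [0-9]
words = re.findall(r"\w+", text, re.ASCII)   # runs of [a-zA-Z0-9_]
spaces = re.findall(r"\s", text, re.ASCII)   # space, tab and newline here
same_as_class = digits == re.findall(r"[0-9]", text)

print(digits, words, spaces, same_as_class)
```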

Example 1: Run as macro

The following example macro looks at the value in cell A1 to see if the first 1 or 2 characters are digits. If so, they are removed and the rest of the string is displayed. If not, then a box appears telling you that no match is found. Cell A1 values of 12abc will return abc, value of 1abc will return abc, value of abc123 will return "Not matched" because the digits were not at the start of the string.

Private Sub simpleRegex()
    Dim strPattern As String: strPattern = "^[0-9]{1,2}"
    Dim strReplace As String: strReplace = ""
    Dim regEx As New RegExp
    Dim strInput As String
    Dim Myrange As Range

    Set Myrange = ActiveSheet.Range("A1")

    If strPattern <> "" Then
        strInput = Myrange.Value

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.Test(strInput) Then
            'Show the input with the matched leading digits removed
            MsgBox (regEx.Replace(strInput, strReplace))
        Else
            MsgBox ("Not matched")
        End If
    End If
End Sub
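For readers who want to test the pattern itself without opening Excel, here is a minimal Python sketch of the same logic. It is not the macro above: the function name strip_leading_digits is hypothetical, and a plain return value takes the place of the message box.

```python
import re

def strip_leading_digits(value):
    """Remove 1 or 2 leading digits, or report that nothing matched."""
    pattern = r"^[0-9]{1,2}"
    # re.MULTILINE mirrors the macro's .MultiLine = True setting
    if re.search(pattern, value, flags=re.MULTILINE):
        return re.sub(pattern, "", value, flags=re.MULTILINE)
    return "Not matched"

for sample in ("12abc", "1abc", "abc123"):
    print(sample, "->", strip_leading_digits(sample))
```

The three samples are the cell values mentioned in the description of Example 1.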

Example 2: Run as an in-cell function

This example is the same as example 1 but is set up to run as an in-cell function (note that the pattern in this version allows up to three leading digits). To use, change the code to this:

Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String

    strPattern = "^[0-9]{1,3}"

    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = ""

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            'The function's return value becomes the cell's value
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "Not matched"
        End If
    End If
End Function

Place your strings ("12abc") in cell A1. Enter this formula =simpleCellRegex(A1) in cell B1 and the result will be "abc".

Example 3: Loop Through Range

This example is the same as example 1 but loops through a range of cells.

Private Sub simpleRegex()
    Dim strPattern As String: strPattern = "^[0-9]{1,2}"
    Dim strReplace As String: strReplace = ""
    Dim regEx As New RegExp
    Dim strInput As String
    Dim Myrange As Range
    Dim cell As Range

    Set Myrange = ActiveSheet.Range("A1:A5")

    'Test every cell in the range against the same pattern
    For Each cell In Myrange
        If strPattern <> "" Then
            strInput = cell.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            If regEx.Test(strInput) Then
                MsgBox (regEx.Replace(strInput, strReplace))
            Else
                MsgBox ("Not matched")
            End If
        End If
    Next
End Sub

Example 4: Splitting apart different patterns

This example loops through a range (A1, A2 & A3) and looks for a string starting with three digits followed by a single alpha character and then 4 numeric digits. The output splits apart the pattern matches into adjacent cells by using the (). $1 represents the first pattern matched within the first set of ().

Private Sub splitUpRegexPattern()
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim Myrange As Range
    Dim C As Range

    Set Myrange = ActiveSheet.Range("A1:A3")

    For Each C In Myrange
        strPattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"

        If strPattern <> "" Then
            strInput = C.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            If regEx.test(strInput) Then
                'Write each captured group into its own column to the right
                C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                C.Offset(0, 3) = regEx.Replace(strInput, "$3")
            Else
                C.Offset(0, 1) = "(Not matched)"
            End If
        End If
    Next
End Sub

Results: [screenshot of the worksheet output not included]
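The way the three sets of () divide a value can be sketched as follows, in Python and with an invented sample value ("123a4567"). The name split_code is hypothetical. Unlike regEx.Replace(strInput, "$1"), which leaves any text outside the match in place, this sketch returns only the captured groups.

```python
import re

# Same pattern as in Example 4: three digits, one letter, four digits
PATTERN = re.compile(r"(^[0-9]{3})([a-zA-Z])([0-9]{4})")

def split_code(value):
    """Return the three captured parts, or None when the pattern is absent."""
    m = PATTERN.search(value)
    return m.groups() if m else None

print(split_code("123a4567"))  # the three parts, one per adjacent cell
print(split_code("12a4567"))   # only two leading digits, so no match
```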

Additional Pattern Examples

String  Regex Pattern                Explanation
a1aaa   [a-zA-Z][0-9][a-zA-Z]{3}     Single alpha, single digit, three alpha characters
a1aaa   [a-zA-Z]?[0-9][a-zA-Z]{3}    May or may not have preceding alpha character
a1aaa   [a-zA-Z][0-9][a-zA-Z]{0,3}   Single alpha, single digit, 0 to 3 alpha characters
a1aaa   [a-zA-Z][0-9][a-zA-Z]*       Single alpha, single digit, followed by any number of alpha characters
</i8>   \<\/[a-zA-Z][0-9]\>          Exact non-word character except any single alpha followed by any single digit
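Each row of the table can be confirmed mechanically. The sketch below runs the listed patterns against their sample strings in Python, assuming the patterns carry over unchanged from the VBScript engine. The variable names are illustrative.

```python
import re

patterns = [
    r"[a-zA-Z][0-9][a-zA-Z]{3}",
    r"[a-zA-Z]?[0-9][a-zA-Z]{3}",
    r"[a-zA-Z][0-9][a-zA-Z]{0,3}",
    r"[a-zA-Z][0-9][a-zA-Z]*",
]

# Every pattern in the first four rows should accept the whole string a1aaa
results = [re.fullmatch(p, "a1aaa") is not None for p in patterns]

# Last row: escaped <, / and > are matched literally
tag_ok = re.fullmatch(r"\<\/[a-zA-Z][0-9]\>", "</i8>") is not None

print(results, tag_ok)
```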

Reply #3

To make use of regular expressions directly in Excel formulas the following UDF (user defined function) can be of help. It more or less directly exposes regular expression functionality as an Excel function.

How it works

It takes 2-3 parameters.

1. A text to use the regular expression on.
2. A regular expression.
3. A format string specifying how the result should look. It can contain $0, $1, $2, and so on. $0 is the entire match, $1 and up correspond to the respective match groups in the regular expression. Defaults to $0.

Some examples

Extracting an email address:

=regex("Peter Gordon: some@, 47", "\w+@\w+\.\w+")
=regex("Peter Gordon: some@, 47", "\w+@\w+\.\w+", "$0")

Results in: some@

Extracting several substrings:

=regex("Peter Gordon: some@, 47", "^(.+): (.+), (\d+)$", "E-Mail: $2, Name: $1")

Results in: E-Mail: some@, Name: Peter Gordon

To take apart a combined string in a single cell into its components in multiple cells:

=regex("Peter Gordon: some@, 47", "^(.+): (.+), (\d+)$", "$" & 1)
=regex("Peter Gordon: some@, 47", "^(.+): (.+), (\d+)$", "$" & 2)

Results in: Peter Gordon some@ ...
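The format-string idea ($0 for the whole match, $1 and up for groups) can be sketched independently of Excel. The Python function below is an approximation written for illustration, not a port of the UDF: it fills all $n tags in a single pass, raises an error for a tag that is too high instead of returning a cell error, and the sample text with its example.com address is invented.

```python
import re

def regex(text, match_pattern, output_pattern="$0"):
    """Fill $0, $1, ... in output_pattern from the first match in text."""
    m = re.search(match_pattern, text, re.MULTILINE)
    if m is None:
        return False  # same signal the UDF uses for "no match"

    def fill(tag):
        n = int(tag.group(1))
        if n > m.re.groups:  # tag refers to a group the pattern lacks
            raise ValueError("no group $%d in pattern" % n)
        return m.group(n) or ""

    return re.sub(r"\$(\d+)", fill, output_pattern)

sample = "Ada Lovelace: ada@example.com, 36"  # hypothetical input
print(regex(sample, r"\w+@\w+\.\w+"))
print(regex(sample, r"^(.+): (.+), (\d+)$", "E-Mail: $2, Name: $1"))
```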

How to use

To use this UDF do the following (roughly based on this Microsoft page. They have some good additional info there!):

1. In Excel in a Macro enabled file ('.xlsm') push ALT+F11 to open the Microsoft Visual Basic for Applications Editor.
2. Add a VBA reference to the Regular Expressions library (shamelessly copied from Portland Runners++ answer):
   1. Click on Tools -> References
   2. Find Microsoft VBScript Regular Expressions 5.5 in the list and tick the checkbox next to it.
   3. Click OK.
3. Click on Insert Module. If you give your module a different name make sure the Module does not have the same name as the UDF below (e.g. naming the Module Regex and the function regex causes #NAME! errors).
4. In the big text window in the middle insert the following:

Function regex(strInput As String, matchPattern As String, Optional ByVal outputPattern As String = "$0") As Variant
    Dim inputRegexObj As New VBScript_RegExp_55.RegExp, outputRegexObj As New VBScript_RegExp_55.RegExp, outReplaceRegexObj As New VBScript_RegExp_55.RegExp
    Dim inputMatches As Object, replaceMatches As Object, replaceMatch As Object
    Dim replaceNumber As Integer

    With inputRegexObj
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = matchPattern
    End With
    'Finds the $0, $1, ... tags in the format string
    With outputRegexObj
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\$(\d+)"
    End With
    With outReplaceRegexObj
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
    End With

    Set inputMatches = inputRegexObj.Execute(strInput)
    If inputMatches.Count = 0 Then
        regex = False
    Else
        Set replaceMatches = outputRegexObj.Execute(outputPattern)
        For Each replaceMatch In replaceMatches
            replaceNumber = replaceMatch.SubMatches(0)
            outReplaceRegexObj.Pattern = "\$" & replaceNumber

            If replaceNumber = 0 Then
                outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).Value)
            Else
                If replaceNumber > inputMatches(0).SubMatches.Count Then
                    'regex = "A too high $ tag found. Largest allowed is $" & inputMatches(0).SubMatches.Count & "."
                    regex = CVErr(xlErrValue)
                    Exit Function
                Else
                    outputPattern = outReplaceRegexObj.Replace(outputPattern, inputMatches(0).SubMatches(replaceNumber - 1))
                End If
            End If
        Next
        regex = outputPattern
    End If
End Function

5. Save and close the Microsoft Visual Basic for Applications Editor window.

Reply #4

Here is my attempt:

Function RegParse(ByVal pattern As String, ByVal html As String)
    Dim regex As RegExp
    Dim mStr As String
    Set regex = New RegExp

    With regex
        .IgnoreCase = True   'ignore case while the regex engine performs the search
        .pattern = pattern   'set the regex pattern
        .Global = False      'restrict the regex to the first match only

        If .Test(html) Then  'test whether the pattern matches or not
            mStr = .Execute(html)(0)         '.Execute(html)(0) provides the string that matches the regex
            RegParse = .Replace(mStr, "$1")  '.Replace swaps that string for whatever the first set of parentheses ($1) captured
        Else
            RegParse = "#N/A"
        End If
    End With
End Function
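A rough Python equivalent of this attempt, for experimenting with patterns before pasting them into Excel. The name reg_parse and the sample HTML are made up, and the sketch assumes the pattern contains at least one set of parentheses, as the $1 in the function above also requires.

```python
import re

def reg_parse(pattern, html):
    """Return what the first group captured in the first match, else '#N/A'."""
    m = re.search(pattern, html, re.IGNORECASE)  # case is ignored, as above
    if m is None:
        return "#N/A"
    return m.group(1)  # assumes the pattern defines group 1

print(reg_parse(r"<b>(.*?)</b>", "<B>Hi</B> there"))
print(reg_parse(r"<b>(.*?)</b>", "plain text"))
```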

Reply #5

I needed to use this as a cell function (like SUM or VLOOKUP) and found that it was easy to:

1. Make sure you are in a Macro Enabled Excel File (save as xlsm).
2. Open developer tools Alt+F11
3. Add Microsoft VBScript Regular Expressions 5.5 as in other answers

Create the following function either in workbook or in its own module:

Function REGPLACE(myRange As Range, matchPattern As String, outputPattern As String) As Variant
    Dim regex As New VBScript_RegExp_55.RegExp
    Dim strInput As String

    strInput = myRange.Value

    With regex
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = matchPattern
    End With

    'Replace every match, expanding $1, $2, ... in outputPattern
    REGPLACE = regex.Replace(strInput, outputPattern)
End Function

Then you can use it in a cell with =REGPLACE(B1, "(\w) (\d+)", "$1$2") (ex: "A 243" to "A243")
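The same replace-with-groups behavior can be sketched in Python. Python writes group references as \g<n> rather than $n, so the hypothetical regplace helper below translates the $n form first. It assumes the output pattern contains no backslashes of its own.

```python
import re

def regplace(value, match_pattern, output_pattern):
    """Replace every match; $n in output_pattern refers to group n."""
    # Turn $1, $2, ... into Python's \g<1>, \g<2>, ... group references
    py_template = re.sub(r"\$(\d+)", r"\\g<\1>", output_pattern)
    return re.sub(match_pattern, py_template, value, flags=re.MULTILINE)

print(regplace("A 243", r"(\w) (\d+)", "$1$2"))
```

The input and patterns are the ones from the cell formula above.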

Reply #6

Expanding on patszim's answer for those in a rush.

1. Open Excel workbook.
2. Alt+F11 to open VBA/Macros window.
3. Add a reference to regex under Tools then References, selecting Microsoft VBScript Regular Expression 5.5
4. Insert a new module (code needs to reside in the module otherwise it doesn't work).
5. In the newly inserted module, add the following code:

Function RegxFunc(strInput As String, regexPattern As String) As String
    Dim regEx As New RegExp
    Dim matches As Object

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .pattern = regexPattern
    End With

    If regEx.Test(strInput) Then
        'Return only the first match found in the input
        Set matches = regEx.Execute(strInput)
        RegxFunc = matches(0).Value
    Else
        RegxFunc = "not matched"
    End If
End Function

The regex pattern is placed in one of the cells and absolute referencing is used on it. The function will be tied to the workbook that it's created in.

If there's a need for it to be used in different workbooks, store the function in Personal.XLSB
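The point of the absolute reference is that one shared pattern is applied to every row, for instance a formula such as =RegxFunc(A1, $C$1) filled down a column (the cell addresses are assumptions for illustration). A Python sketch of that arrangement, with invented sample data and a hypothetical regx_func helper:

```python
import re

def regx_func(text, pattern):
    """Return the first match of pattern in text, or 'not matched'."""
    m = re.search(pattern, text, re.MULTILINE)
    return m.group(0) if m else "not matched"

shared_pattern = r"[0-9]+"                   # plays the role of the fixed pattern cell
column = ["abc123", "no digits", "x9y"]      # plays the role of the data column

results = [regx_func(value, shared_pattern) for value in column]
print(results)
```

Changing shared_pattern once changes the result for every row, which is what the fixed pattern cell achieves in the worksheet.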