VB and VBA Users Source Code: Using Regular Expressions to match strings
By: Andrew Baker
Date: Friday, April 20, 2001
Category: Visual Basic General
Article:
A regular expression is a series of characters that defines a pattern. The pattern is compared to a target string to see whether it occurs anywhere in that string. Often the characters in a pattern simply match themselves, such as when looking for all occurrences of the pattern "the" in "the quick brown fox chased the lazy dog."

You can also use special characters, called metacharacters, to indicate character positioning, grouping, and repetition. Most of you are probably familiar with the character "*" (asterisk) as a wildcard when doing a directory search in DOS. The "*" is also a regular expression metacharacter, although in a regular expression it means "repeat the preceding character zero or more times" rather than "any characters". You can also search for sets of characters. For example, the regular expression "[a-c]" will match any "a", "b", or "c" in the target string.

The regular expression engine in VBScript includes several metacharacters and sequences that allow more complex pattern matching, including the following:

"^" matches at the beginning of a string, so "^i" will match "is" but not "mi".

"$" matches at the end of a string, so "i$" will match "mi" but not "is".

"*" matches the preceding character zero or more times, so "fo*" matches "f", "fo", and "foo".

"+" matches the preceding character one or more times, so "fo+" matches "fo" and "foo" but not "f".

"?" matches the preceding character zero or one time, so "a?ve?" matches the "ve" in "never".

"." matches any single character except the newline character, so "a.b" matches "aab" and "a3b", but not "ab".

"|" is used for alternative matching, as in "a|b", which will match either "a" or "b".

"{n}" matches the preceding character exactly n times. For example, "e{2}" will match the "ee" in "feed" but finds no match in "fed".

"{n,m}" matches the preceding character at least n times but not more than m times. "o{1,3}" will match the "oo" in "food" or the "o" in "sod", but a single match covers at most three characters, so the first match in "soooooie" is only the first three o's.

Brackets are used to express character and digit sets and ranges. For example, the expression "[abcd]" will match any one of the enclosed characters in the target string. The whole lower-case alphabet can be expressed using "[a-z]". To include both upper- and lower-case letters, write the expression like this: "[a-zA-Z]". To search for digits in a string, use "[0-9]".

Negative character and digit sets and ranges can also be expressed. "[^abc]" will match any character other than "a", "b", or "c". You can also write an expression for a negative range, such as "[^m-q]".

'Note: the file "VBSCRIPT.DLL" must be installed and registered
'before this code will work.
Sub Test()
    Dim oRegExp As Object, sEmailName As String

    Set oRegExp = CreateObject("VBScript.RegExp")
    'Check for a word followed by an "@" symbol, then a word,
    'then a "." and finally another word, e.g. myname@myserver.type.
    '"^" and "$" anchor the pattern so that the whole string must match.
    'This is only a simple shape check: "\w" covers letters, digits and "_",
    'so addresses with extra dots or hyphens are rejected.
    oRegExp.Pattern = "^\w+@\w+\.\w+$"
    sEmailName = InputBox("E-mail Address", , "myname@myserver.com")
    'Test the e-mail address
    If oRegExp.Test(sEmailName) Then
        MsgBox "Your e-mail address is correct.", vbInformation
    Else
        MsgBox "Invalid e-mail address.", vbExclamation
    End If
    Set oRegExp = Nothing
End Sub
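One way to try out the individual metacharacters is a small test harness like the sketch below. It assumes the same late-bound "VBScript.RegExp" object used in the routine above. The names IsMatch and DemoMetacharacters are hypothetical helpers made up for this illustration and are not part of any library. Results are written to the Immediate window with Debug.Print.

```vb
'Hypothetical helper (not a library function): returns True if
'sPattern matches anywhere in sTarget.
Function IsMatch(ByVal sPattern As String, ByVal sTarget As String) As Boolean
    Dim oRegExp As Object
    Set oRegExp = CreateObject("VBScript.RegExp")
    oRegExp.Pattern = sPattern
    IsMatch = oRegExp.Test(sTarget)
    Set oRegExp = Nothing
End Function

'Illustrative driver: checks a few metacharacters against sample strings.
Sub DemoMetacharacters()
    Dim oRegExp As Object, oMatches As Object

    Debug.Print IsMatch("^i", "is")         'True:  "i" is at the start
    Debug.Print IsMatch("^i", "mi")         'False
    Debug.Print IsMatch("i$", "mi")         'True:  "i" is at the end
    Debug.Print IsMatch("fo*", "f")         'True:  zero o's are allowed
    Debug.Print IsMatch("fo+", "f")         'False: at least one "o" is needed
    Debug.Print IsMatch("a.b", "a3b")       'True
    Debug.Print IsMatch("a.b", "ab")        'False: "." must consume a character
    Debug.Print IsMatch("e{2}", "fed")      'False
    Debug.Print IsMatch("[^abc]", "abc")    'False: every character is excluded

    'Execute returns the matched text rather than just True/False.
    Set oRegExp = CreateObject("VBScript.RegExp")
    oRegExp.Pattern = "o{1,3}"
    Set oMatches = oRegExp.Execute("soooooie")
    Debug.Print oMatches(0).Value           'ooo (first match only)
    Set oMatches = Nothing
    Set oRegExp = Nothing
End Sub
```

Test only reports whether a match exists, while Execute returns the matches themselves. With the Global property left at its default of False, Execute stops after the first match.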