$reg = "^This"; // will match the word "This" at the beginning of a string
$Reg = "This$"; // will match the word "This" at the end of a string
$REG = "^This$"; // will match the string "This"
Here is the list of special characters you can use in PHP regular expressions:
| Matching metacharacters | ||
| Character | Matches | Example |
|---|---|---|
| . | Any character except new line | "..." matches "abC", "12f", "1+ ", or any three characters |
| [...] | Character set | "[AN]BC" matches "ABC" and "NBC" but not "BBC" |
| [^...] | Negated character set | "[^AN]BC" matches "BBC" and "CBC" but not "ABC" or "NBC" |
| Matching character classes | ||
| Character class | Matches | Example |
| [[:alpha:]] | any letter | "[[:alpha:]]+" matches "PHP", "JavaScript", and "text" but not "123" |
| [[:digit:]] | any digit | "[[:digit:]]" matches "12", "23", and "1text" but not "abc" |
| [[:alnum:]] | any letter or digit | "[[:alnum:]]" matches "PHP", "12", and "text21" but not "\n\t\n" |
| [[:space:]] | any whitespace | "[[:space:]]" matches "PHP is a script language", "JavaScript\tPHP", and "text\nanother text" but not "123" |
| [[:upper:]] | any uppercase letter | "[[:upper:]]" matches "PHP" and "JavaScript" but not "text" |
| [[:lower:]] | any lowercase letter | "[[:lower:]]" matches "JavaScript" and "text" but not "PHP" |
| [[:punct:]] | any punctuation mark | "[[:punct:]]" matches "PHP,Javascript", "JavaScript!", and "text:" but not "123" |
| [[:xdigit:]] | any hexadecimal digit | "[[:xdigit:]]" matches "A", "10Ae", and "123" but not "tst" |
| Counting metacharacters | ||
| Character | Matches last character | Example |
| * | Zero or more times | "Ja*vaScript" matches "JvaScript", "JavaScript", and "JaaaavaScript" but not "JuvaScript" |
| ? | Zero or one time | "Ja?vaScript" matches "JvaScript" or "JavaScript" but not "JaavaScript" |
| + | One or more times | "Ja+vaScript" matches "JavaScript" or "JaaaavaScript" but not "JvaScript" |
| {n} | Exactly n times | "Ja{2}vaScript" matches "JaavaScript" but not "JvaScript" or "JaaaavaScript" |
| {n,} | n or more times | "Ja{2,}vaScript" matches "JaavaScript" or "JaaaavaScript" but not "JvaScript" |
| {n,m} | At least n and at most m times | "Ja{2,3}vaScript" matches "JaavaScript" or "JaaavaScript" but not "JvaScript" or "JaaaaavaScript" |
| Positional metacharacters | ||
| Character | Matches located | Example |
| ^ | At the beginning of the string | "^Fred" matches "Fred is OK" but not "I'm with Fred" or "Is Fred here?" |
| $ | At the end of the string | "Fred$" matches "I'm with Fred" but not "Fred is OK" or "Is Fred here?" |
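A few rows from the tables above can be checked directly. This is only a sketch: the ereg() family was removed in PHP 7, so it uses preg_match(), which accepts the same metacharacters and POSIX character classes once the pattern is wrapped in delimiters. The helper name matches() and the sample strings are assumptions made for illustration.

```php
<?php
// Hypothetical helper: true when $pattern (written without delimiters)
// matches somewhere in $subject. The "/" delimiters are added here
// because preg_match() requires them.
function matches(string $pattern, string $subject): bool
{
    return preg_match('/' . $pattern . '/', $subject) === 1;
}

$results = [
    'set'        => matches('[AN]BC', 'NBC'),                     // character set
    'negatedSet' => matches('[^AN]BC', 'ABC'),                    // negated set
    'alpha'      => matches('^[[:alpha:]]+$', 'PHP'),             // letters only
    'digitsOnly' => matches('^[[:digit:]]+$', '1text'),           // digits only
    'range'      => matches('Ja{2,3}vaScript', 'JaaaaavaScript'), // five a's
    'start'      => matches('^Fred', 'Fred is OK'),               // anchored at start
    'end'        => matches('Fred$', 'Is Fred here?'),            // anchored at end
];

foreach ($results as $name => $matched) {
    echo $name . ': ' . var_export($matched, true) . "\n";
}
```

The helper breaks if the pattern itself contains a "/", so a real program would choose a different delimiter or escape the pattern first.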
$pattern = "^[a-zA-Z0-9_]+@[a-zA-Z0-9_]+(\\.[a-zA-Z0-9_]+)+";
if( ereg($pattern, $email) ){
    echo "E-mail address is valid";
}
else{
    echo "Invalid e-mail address";
}
These two functions can accept a third argument. This optional argument is an array passed by
reference. The first element of the array is the whole match, and the remaining elements
are the substrings matched by the parenthesized groups in the pattern.
$pattern = "^([a-zA-Z0-9_]+)@([a-zA-Z0-9_]+)\\.([a-zA-Z0-9_]+)$";
if( ereg($pattern, $email, $arr) ){
    echo "E-mail address is valid";
    // isset() ends the loop cleanly after the last captured element
    for($i=0; isset($arr[$i]); $i++)
        echo "\$arr[$i] = $arr[$i]\n";
}
else{
    echo "Invalid e-mail address";
}
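Because ereg() is no longer available in PHP 7 and later, the same idea can be tried with preg_match(), which fills its third argument in the same way. This is a sketch under that substitution; the sample address is an assumption.

```php
<?php
$email = 'john_doe@example.com'; // sample input, chosen for illustration

// Same pattern as above, wrapped in "/" delimiters for preg_match().
$pattern = '/^([a-zA-Z0-9_]+)@([a-zA-Z0-9_]+)\.([a-zA-Z0-9_]+)$/';

if (preg_match($pattern, $email, $arr)) {
    echo "E-mail address is valid\n";
    // $arr[0] is the whole match; $arr[1], $arr[2], ... are the groups.
    foreach ($arr as $i => $value) {
        echo "\$arr[$i] = $value\n";
    }
} else {
    echo "Invalid e-mail address\n";
}
```

Iterating with foreach avoids depending on how the array ends, which is the weak point of an index-based loop.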
eregi() behaves identically to ereg(), except it ignores case distinctions
when matching letters.
$card = "1234-2345-5677-5675";
$pattern = "[0-9]{4}";
$Card = ereg_replace($pattern, "****", $card);
echo "$card"; // stays the same
echo "$Card"; // now equals ****-****-****-****
As you can guess, eregi_replace() behaves like ereg_replace(), but ignores case distinctions.
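On PHP 7 and later, where the ereg functions no longer exist, roughly the same replacements can be sketched with preg_replace(); the i modifier plays the role of the case-insensitive variant. The sample strings are assumptions.

```php
<?php
$card = '1234-2345-5677-5675';

// Counterpart of the ereg_replace() call: mask every group of four digits.
$masked = preg_replace('/[0-9]{4}/', '****', $card);

// Counterpart of eregi_replace(): the "i" modifier ignores case.
$text = 'PHP and php and Php';
$normalized = preg_replace('/php/i', 'PHP', $text);

echo $masked . "\n";
echo $normalized . "\n";
```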
The complete documentation can be found on the PHP site. Please also use this page to play with different regular expressions and PHP functions.
int preg_match(pattern, string [, matcharray]);
int preg_match_all(pattern, string, matcharray[, flag]);
As a pattern these functions take Perl-style regular expressions; that is, expressions in the form /^start+/. The only difference is that in PHP these expressions must be written as quoted strings, delimiters included:
$pregexep = "/^(\d{3}) \d{3}-\d{4}$/";
Yes, we can use the same symbols:
| Meta-character | Description |
|---|---|
| \d | any decimal digit |
| \D | any character that is not a decimal digit |
| \s | any whitespace character |
| \S | any character that is not a whitespace character |
| \w | any "word" character |
| \W | any "non-word" character |
| \b | word boundary |
| \B | not a word boundary |
If the third (optional) argument is provided, it is filled with the results of the search. The first element of this array contains the substring of the subject string that matches the whole pattern, the second element contains the part of the match that corresponds to the first sub-pattern (the pattern inside the first pair of parentheses), and so on. Note that preg_match() stops searching after the first match is found. To find all matches in a string, use preg_match_all().
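A small sketch of how the match array is filled. The subject string and the extra capturing groups are assumptions added for illustration.

```php
<?php
$subject = 'Call 555 123-4567 or 555 987-6543';

// Three capturing groups: area code, prefix, line number.
$pattern = '/(\d{3}) (\d{3})-(\d{4})/';

$found = preg_match($pattern, $subject, $matches);

// $found is 1 because a match exists. Only the first phone number is
// reported, since preg_match() stops after the first match.
print_r($matches);
```

Here $matches[0] holds the whole first match and $matches[1] to $matches[3] hold the three groups; the second phone number is never examined.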
This function takes the same arguments as preg_match(), but the result array is two-dimensional. It contains all substrings of the subject string that match the pattern, as well as all sub-pattern matches. How exactly these elements are stored depends on the value of the last (optional) parameter, flag. We prefer the value PREG_SET_ORDER for this argument: in this case each element of the result array is exactly the array that preg_match() would return for that match. The following example illustrates how to extract some opening and closing tags from an HTML file and also the text between them:
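As a rough sketch of that kind of tag extraction, assuming a tiny inline HTML snippet instead of a real file and tags without attributes:

```php
<?php
$html = '<b>bold text</b> and <i>italic text</i>';

// (\w+)  captures the tag name,
// (.*?)  lazily captures the text between the tags,
// \1     requires the closing tag to repeat the captured name.
$pattern = '/<(\w+)>(.*?)<\/\1>/';

$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each $m looks like a preg_match() result:
// $m[0] is the whole match, $m[1] the tag name, $m[2] the inner text.
foreach ($matches as $m) {
    echo "tag: $m[1], text: $m[2]\n";
}
```

A regular expression like this handles only simple, non-nested markup; it is meant to show the shape of the result array, not to parse arbitrary HTML.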
string preg_replace(pattern, replacement, string);
The first argument of the function is the regular expression pattern to be replaced. The second argument is the string to replace the pattern with, and the last argument is the string in which to make the replacement. If a replacement took place, preg_replace() returns the new string; otherwise it returns the original string.
The replacement argument may contain references of the form \\n or (since PHP 4.0.4) $n, with the latter form being the preferred one. Every such reference is replaced by the text captured by the n'th parenthesized pattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern.
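The $n references can be seen in a short sketch; the date and the other sample strings are assumptions. The replacement is written in single quotes so that PHP does not try to interpolate $1, $2, and so on as variables.

```php
<?php
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

// $1 = year, $2 = month, $3 = day: reorder them as day/month/year.
$reordered = preg_replace($pattern, '$3/$2/$1', '2024-03-15');

// No match, so the original string comes back unchanged.
$unchanged = preg_replace($pattern, '$3/$2/$1', 'no date here');

// $0 refers to the whole match: wrap every run of digits in brackets.
$bracketed = preg_replace('/\d+/', '[$0]', 'a1b22');

echo $reordered . "\n" . $unchanged . "\n" . $bracketed . "\n";
```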
array preg_split(pattern, string);
This function takes two arguments: the pattern to split on and the string to split. It returns an array of the resulting pieces.
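A minimal sketch of preg_split(), assuming a made-up line in which the items are separated by an inconsistent mix of commas, semicolons, and spaces:

```php
<?php
$line = "apple, banana;cherry   date";

// Split on any run of whitespace, commas, or semicolons.
$parts = preg_split('/[\s,;]+/', $line);

print_r($parts);
```

Using a character set with + treats a sequence such as ", " as a single separator, so no empty elements appear between the items.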
Note: some of the regular expression functions have more options than described in this lecture. Please consult the PHP documentation for the complete description. You can use the Perl-style regular expression playground to try your own regular expressions.
| Function | Returns | Description |
|---|---|---|
| ltrim(string) | String | Strips whitespace from the beginning of the specified string |
| rtrim(string) | String | Removes trailing whitespace |
| chop(string) | String | Alias of rtrim(). Removes trailing whitespace from the specified string |
| ord(string) | Integer | Returns the ASCII code of the first character of the specified string |
| chr(ascii) | String | Returns the character represented by the specified ASCII code |
| strchr(haystack, needle) | String | Finds the first occurrence of the needle in the haystack |
| strlen(string) | Integer | Returns the length of the specified string |
| | String | Returns length characters of the string from the position specified by start |
| strpos(string, substr) | Integer | Returns the numeric position of the first occurrence of substr in string |
| strrpos(string, substr) | Integer | Returns the numeric position of the last occurrence of substr in string |
| stripcslashes(string) | String | Returns a string with backslashes stripped off. Recognizes C-like \n, \r ..., octal and hexadecimal representation. |
| strtolower(string) | String | Returns string with all alphabetic characters converted to lowercase. |
| strtoupper(string) | String | Returns string with all alphabetic characters converted to uppercase. |
| nl2br (string) | String | Returns string with '<br />' inserted before all newlines. |
| crypt(string[, salt]) | String | Returns a one-way hash of the specified string using the optional salt (two characters for the standard DES-based algorithm) |
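Several of the functions in the table can be tried on a few made-up strings. The sample values below are assumptions chosen only to make the results easy to follow.

```php
<?php
$raw = "  Hello, World!  ";

$left  = ltrim($raw);                        // leading whitespace removed
$right = rtrim($raw);                        // trailing whitespace removed

$code = ord('A');                            // ASCII code of "A"
$char = chr(65);                             // character for ASCII code 65

$length = strlen('Hello');                   // number of characters
$first  = strpos('Hello, World!', 'o');      // position of the first "o"
$last   = strrpos('Hello, World!', 'o');     // position of the last "o"

// strchr() returns the rest of the string starting at the first needle.
$domain = strchr('user@example.com', '@');

$lower = strtolower('PHP');
$upper = strtoupper('php');

// nl2br() inserts "<br />" before each newline and keeps the newline.
$html = nl2br("line one\nline two");

var_dump($left, $right, $code, $char, $length, $first, $last, $domain, $lower, $upper, $html);
```

Positions returned by strpos() and strrpos() are counted from zero, and both return false rather than a number when the substring is not found.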