Tuesday, May 20, 2014

We will look at a few more examples of this regular expression search.We looked at how match called matchhere and matchhere in turn was recursive or called matchstar. We will now look at say grep functionality.
grep is implemented this way. We read strings from a file a buffersize at a time, null terminate the buffer.We run match over this buffer, increment the match count, print it and continue.
In other words, grep is a continuous search of the match over the entire file a buffer at a time.
Now if we want the matchstar to implement the leftmost longest search for
Suppose we were to implement the ^ and the $ sign implementation as well.
In this case, we will have to watch out for the following cases:
^ - match at the beginning
$ - match at the end
if both ^ and $ then match the string with the literal such that the beginning and the end is the same.
if both ^ and $ are specified and there is no literal, then match even empty strings.
literals before ^ and after $ are not permitted.
Let us now look at a way to use regex for validating a phone number. Note that phone numbers can come in different forms some with brackets, others with or without hyphens and some missing the country code etc.
Given a ten digit US phone number, where the 3 digit area code can be optional, the regex would look something like this:
^(\(\d{3}\))|^\d{3}[-]?)?\d{3}[-]?\d{4}$
Here the first character in the regex or the caret after the vertical bar denotes the start of a sentence with the phone number
The first ( denotes a group but the \( denotes the literal in an optional area code
\d matches a digit which will have the number of occurrences as specified in the { } braces
The | bar indicates that either the area code with paranthesis or without may be present.
Then there are characters to close the ones specified above
? makes the group optional while $ matches the end of a line.

No comments:

Post a Comment