grep is implemented this way. We read strings from a file a buffersize at a time, null terminate the buffer.We run match over this buffer, increment the match count, print it and continue.
In other words, grep is a continuous search of the match over the entire file a buffer at a time.
Now if we want the matchstar to implement the leftmost longest search for
Suppose we were to implement the ^ and the $ sign implementation as well.
In this case, we will have to watch out for the following cases:
^ - match at the beginning
$ - match at the end
if both ^ and $ then match the string with the literal such that the beginning and the end is the same.
if both ^ and $ are specified and there is no literal, then match even empty strings.
literals before ^ and after $ are not permitted.
Let us now look at a way to use regex for validating a phone number. Note that phone numbers can come in different forms some with brackets, others with or without hyphens and some missing the country code etc.
Given a ten digit US phone number, where the 3 digit area code can be optional, the regex would look something like this:
^(\(\d{3}\))|^\d{3}[-]?)?\d{3}[-]?\d{4}$
Here the first character in the regex or the caret after the vertical bar denotes the start of a sentence with the phone number
The first ( denotes a group but the \( denotes the literal in an optional area code
\d matches a digit which will have the number of occurrences as specified in the { } braces
The | bar indicates that either the area code with paranthesis or without may be present.
Then there are characters to close the ones specified above
? makes the group optional while $ matches the end of a line.