re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.
As the re.match documentation says:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Note: If you want to locate a match anywhere in string, use search() instead.
re.search searches the entire string, as the documentation says:
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
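So if you need to match at the beginning of the string, use match. It tries the pattern only at the start position, so it can give up sooner than search on a string that does not match. To match the entire string, use match with a pattern that ends in $ (or \Z), or re.fullmatch on Python 3.4 and later. Otherwise use search.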
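As a rough illustration of the three cases above (start of string, entire string, anywhere), here is a minimal sketch. The sample text and variable names are made up for the example, and re.fullmatch assumes Python 3.4 or later.

```python
import re

text = "something else"  # illustrative sample string

# match: anchored at the start, but the pattern may stop before the end.
at_start = re.match(r"some", text)
not_at_start = re.match(r"else", text)

# fullmatch: the pattern has to account for the whole string.
partial_only = re.fullmatch(r"some", text)
whole_string = re.fullmatch(r"\w+ \w+", text)

# search: scans forward until the pattern matches somewhere.
anywhere = re.search(r"else", text)

print(at_start)      # a match object for 'some'
print(not_at_start)  # None, because 'else' is not at the start
print(partial_only)  # None, because 'some' does not cover the whole string
print(whole_string)  # a match object covering the whole string
print(anywhere)      # a match object for 'else'
```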
The documentation has a specific section on match vs. search that also covers multiline strings:
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The “match” operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.
Now, enough talk. Time to see some example code:
# example code (Python 3):
import re

string_with_newlines = """something
someotherthing"""

print(re.match('some', string_with_newlines))  # matches
print(re.match('someother', string_with_newlines))  # won't match
print(re.match('^someother', string_with_newlines, re.MULTILINE))  # also won't match
print(re.search('someother', string_with_newlines))  # finds something
print(re.search('^someother', string_with_newlines, re.MULTILINE))  # also finds something

m = re.compile('thing$', re.MULTILINE)
print(m.match(string_with_newlines))  # no match
print(m.match(string_with_newlines, pos=4))  # matches
# Flags go to re.compile; the second argument of a compiled pattern's search() is pos.
print(m.search(string_with_newlines))  # also matches
Python | re.search() vs re.match()
Prerequisite: Regex in Python
Use of re.search() and re.match() –
re.search() and re.match() are both functions of Python's re module. Each takes a pattern and a string, and returns a match object if the pattern is found; otherwise it returns None.
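Because a failed match yields None rather than raising an error, the result is usually checked before it is used. A minimal sketch of that idiom follows; first_word is a hypothetical helper written for this example, not part of the re module.

```python
import re

def first_word(text):
    """Return the leading run of word characters, or None if there is none."""
    m = re.match(r"\w+", text)
    # re.match returns None on failure, so test before calling .group()
    return m.group() if m is not None else None

print(first_word("hello world"))  # hello
print(first_word("  hello"))      # None, since the string starts with spaces
```

Calling .group() directly on a None result would raise AttributeError, which is why the check comes first.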
re.search() vs re.match() –
The two functions differ in where they look. Both return only the first match of the pattern in the string, but re.match() tries the pattern only at the beginning of the string and returns a match object if it matches there. If the pattern matches only somewhere later in the string, re.match() returns None.
re.search(), in contrast, scans the whole string, including every line of a multi-line string, and returns the first location where the pattern matches.
# import re module
import re

Substring = 'string'

String1 = '''We are learning regex with geeksforgeeks
regex is very useful for string matching.
It is fast too.'''

String2 = '''string We are learning regex with geeksforgeeks
regex is very useful for string matching.
It is fast too.'''

# Use of re.search() Method
print(re.search(Substring, String1, re.IGNORECASE))

# Use of re.match() Method
print(re.match(Substring, String1, re.IGNORECASE))

# Use of re.search() Method
print(re.search(Substring, String2, re.IGNORECASE))

# Use of re.match() Method
print(re.match(Substring, String2, re.IGNORECASE))
Conclusion :
- For String1, re.search() returns a match object; its span shows where the first match was found (index 69 in the original output; the exact index depends on the whitespace inside String1).
- For String1, re.match() returns None because the match is on the second line of the string, and re.match() succeeds only if the pattern matches at the beginning of the string.
- re.IGNORECASE makes the matching case-insensitive.
- Both re.search() and re.match() return only the first occurrence of the pattern in the string and ignore any others.
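The last point can be checked by comparing re.search() with re.finditer(), which is not used in the example above but is the standard re function for iterating over every non-overlapping match. This is only a small sketch with an invented sample string:

```python
import re

# Illustrative text containing the word 'string' twice.
text = "regex is useful for string matching; a string can span lines"

# search stops at the first occurrence.
first = re.search(r"string", text)

# finditer yields one match object per non-overlapping occurrence.
all_spans = [m.span() for m in re.finditer(r"string", text)]

print(first.span())
print(all_spans)
```

The first span in the list is the same one that re.search() reports; the remaining spans are the occurrences that re.search() and re.match() never reach.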