java regex

Difference between matches() and find() in Java Regex


I am trying to understand the difference between matches() and find().

From what I understand of the Javadoc, matches() will search the entire string even if it finds what it is looking for, and find() will stop when it finds what it is looking for.

If that assumption is correct, I cannot see when you would want to use matches() instead of find(), unless you want to count the number of matches it finds.

In my opinion the String class should then have find() instead of matches() as an inbuilt method.

So to summarize:

  1. Is my assumption correct?
  2. When is it useful to use matches() instead of find()?

Solution

  • matches() tries to match the expression against the entire string. It behaves as if the pattern had an implicit ^ at the start and $ at the end (strictly speaking, \A and \z), so it will not look for a substring. Hence the output of this code:

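    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class MatchesVsFind {
        public static void main(String[] args) {
            Pattern p = Pattern.compile("\\d\\d\\d");
            Matcher m = p.matcher("a123b");
            System.out.println(m.find());    // "123" occurs inside the input
            System.out.println(m.matches()); // the whole input is not three digits

            p = Pattern.compile("^\\d\\d\\d$");
            m = p.matcher("123");
            System.out.println(m.find());    // the anchored pattern fits "123"
            System.out.println(m.matches()); // the whole input is three digits
        }
    }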
    
    /* output:
    true
    false
    true
    true
    */
    

    123 is a substring of a123b, so find() returns true. matches() has to match the whole input a123b, which is not exactly three digits, so it returns false.
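On the two points raised in the question: counting occurrences is a job for find() called in a loop, not for matches(), and String.matches() follows the same whole-input rule as Matcher.matches(). Below is a minimal sketch of both; the class name and the countMatches helper are made up for illustration and are not part of the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountWithFind {
    // Hypothetical helper: counts non-overlapping matches by calling find() repeatedly.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) { // each call resumes after the previous match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // find() in a loop sees both three-digit runs.
        System.out.println(countMatches("\\d\\d\\d", "a123b456c")); // 2

        // String.matches() requires the whole string to match, like Matcher.matches().
        System.out.println("a123b".matches("\\d\\d\\d")); // false
        System.out.println("123".matches("\\d\\d\\d"));   // true
    }
}
```

So matches() is the natural choice for validating a complete value (for example, "is this string exactly three digits?"), while find() is for searching inside a longer text.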

    The official docs for Matcher also spell out the difference between matches and find:

    A matcher is created from a pattern by invoking the pattern's matcher method. Once created, a matcher can be used to perform three different kinds of match operations:

    • The matches method attempts to match the entire input sequence against the pattern.
    • The find method scans the input sequence looking for the next subsequence that matches the pattern.
    • The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
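The three operations quoted above can be compared side by side. This is only a sketch: the class and helper names are invented for the example, and \d+ is an arbitrary pattern chosen to make the differences visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeOperationsDemo {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches(): the entire input must match the pattern.
    static boolean wholeInput(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the pattern must match at the start; trailing text is allowed.
    static boolean prefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): scans for the next matching subsequence; loop to collect them all.
    static List<String> allMatches(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = DIGITS.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "123abc456";
        System.out.println(wholeInput(input)); // false
        System.out.println(prefix(input));     // true
        System.out.println(allMatches(input)); // [123, 456]
        System.out.println(wholeInput("123")); // true
        System.out.println(prefix("abc123"));  // false
    }
}
```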