ios, email, nsdatadetector

How to detect email addresses within arbitrary strings


I'm using the following code to detect an email address in a string. It works fine except for addresses whose local part is purely numeric, such as "536264846@gmail.com". Is there a way to work around this Apple bug? Any help will be appreciated!

NSString *string = @"536264846@gmail.com";
NSError *error = nil;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
NSArray *matches = [detector matchesInString:string
                                     options:0
                                       range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    if ([match.URL.scheme isEqualToString:@"mailto"]) {
        // Strip the "mailto:" prefix to get the bare address.
        NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
        NSLog(@"email :%@",email);
    } else {
        NSLog(@"[match URL] :%@",[match URL]);
    }
}

Edit: log result is: [match URL] :http://gmail.com


Solution

  • What I did in the past:

    I put the regular expression in a text file, so I didn't need to escape it for an Objective-C string literal:

    ^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$

    Defined an ivar:

    NSRegularExpression *reg;
    

    Created the regular expression:

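    NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
    NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
    // Drop any trailing newline the text editor added; it would otherwise become part of the pattern.
    pattern = [pattern stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    NSError *error = nil;
    reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
    // Check the return value, not the error pointer, to detect failure.
    assert(reg != nil);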
    

    Then wrote a method to do the comparison:

    - (BOOL)isValidEmail:(NSString *)string
    {
        // The pattern is anchored with ^ and $, so the whole string must be an address.
        NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
        return match != nil;
    }
    

    EDIT: I've turned the above into a project on GitHub.

    EDIT2: For an alternative that is less rigorous but faster, see the comment section of this question.
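
The method above validates a whole string, because the pattern is anchored with ^ and $. To pull addresses out of the middle of arbitrary text, as the question asks, an unanchored pattern can be run with matchesInString:options:range:. Below is a minimal sketch of that idea. The function name FindEmailCandidates and the loose pattern are my own assumptions for illustration, not the regular expression from the answer and not the alternative mentioned in the comments; the pattern is far less strict than the RFC-style one and will accept some invalid addresses.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): returns every
// substring of `text` that looks like an email address.
static NSArray<NSString *> *FindEmailCandidates(NSString *text) {
    if (text.length == 0) {
        return @[];
    }

    // Deliberately loose pattern (an assumption for illustration): a local part
    // of letters, digits and a few symbols, an "@", then a dotted domain ending
    // in a TLD of two or more letters. It is unanchored, so it matches mid-string.
    NSString *pattern = @"[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\\.)+[A-Z]{2,}";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    // Collect the matched substrings in the order they appear.
    NSMutableArray<NSString *> *found = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [found addObject:[text substringWithRange:match.range]];
    }
    return found;
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"Contact 536264846@gmail.com or support@example.org today.";
        for (NSString *email in FindEmailCandidates(text)) {
            NSLog(@"email: %@", email);
        }
    }
    return 0;
}
```

With the sample text this should pick up both addresses, including the purely numeric local part that NSDataDetector turned into http://gmail.com. If you need strictness, each candidate could then be passed to isValidEmail: from the answer.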