regex rust

Capture multiple instances in same line of complex expression


I'm trying to craft a regular expression in Rust for a log file containing strings like the following:

[2025-01-01T08:17:29.791951550Z INFO] Values: [(311154184, Some(389971313710868251)), (311154187, Some(389967898428572732)), (311154182, Some(389971313710868251)), (311154174, Some(389971313710868251)), (311154178, Some(389971313710868251)), (311154197, Some(389811146843151022)), (311154171, Some(389971313710868251)), (311154167, Some(389971313710868251)), (311154185, Some(389967898428572732)), (311154168, Some(389971313710868251)), (311154191, Some(389967898428572732)), (311154196, Some(389811259875653181)), (311154199, Some(0)), (311154192, Some(389967898428572732)), (311154172, Some(389971313710868251)), (311154181, Some(389971313710868251)), (311154177, Some(389971313710868251)), (311154176, Some(389971313710868251)), (311154179, Some(389971313710868251)), (311154183, Some(389971313710868251)), (311154186, Some(389967898428572732)), (311154175, Some(389971313710868251)), (311154173, Some(389971313710868251)), (311154180, Some(389971313710868251)), (311154190, Some(389967898428572732)), (311154170, Some(389971313710868251)), (311154165, None), (311154189, Some(389967898428572732)), (311154169, Some(389971313710868251)), (311154166, None), (311154195, Some(389967898428572732)), (311154198, Some(287087126717485928)), (311154193, Some(389967898428572732)), (311154188, Some(389967898428572732)), (311154194, Some(389967898428572732))]

Each entry under "Values" is a tuple of a u64 and an Option<u64>, so some appear as (311154170, Some(389971313710868251)) and others as (311154165, None).

I'd like to capture all of them and end up with a Vec of tuples just like the above, but I can't quite get the regex right. I've tried a few iterations, including this:

((\((\d+), (Some\((\d+)\)\)|None\)), )+)

but it seems to either miss items or capture the entire list plus only a few of the tuples. Any assistance is welcome.


Solution

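  • Here is a solution using the regex crate. A repeated capture group such as (...)+ only retains its last repetition, so instead of matching the whole list at once, the pattern below matches a single tuple and captures_iter walks over every occurrence in the line. The result is a Vec<(u64, Option<u64>)>.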

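    use regex::Regex;

    fn main() {
        let string = "<YOUR_STRING_HERE>";

        // Match one tuple at a time: group 1 is the first number,
        // group 2 is either `Some(<digits>)` or `None`.
        let re = Regex::new(r"\((\d+), (Some\(\d+\)|None)\)").unwrap();
        let result: Vec<(u64, Option<u64>)> = re
            .captures_iter(string)
            .map(|c| {
                // `extract` returns the whole match plus both capture groups.
                let (_, [first, second]): (&str, [&str; 2]) = c.extract();
                (
                    first.parse::<u64>().unwrap(),
                    if second == "None" {
                        None
                    } else {
                        // Strip the leading `Some(` (5 bytes) and the trailing `)`.
                        // Note: `.ok()` turns a value too large for u64 into None.
                        second[5..(second.len() - 1)].parse::<u64>().ok()
                    },
                )
            })
            .collect();
        println!("{:?}", result);
    }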
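
For comparison, the same extraction can be sketched without the regex crate, using only standard-library string methods. This is a swapped-in technique, not the approach from the answer above. The helper names parse_tuple and parse_values are hypothetical, and the sketch assumes the list always follows the literal label "Values: [" and that items are separated by exactly ", " as in the sample log line.

```rust
/// Parses one item shaped like `(123, Some(456)` or `(123, None`
/// (the tuple's closing `)` has already been consumed by the split below).
/// Hypothetical helper, not part of the original answer.
fn parse_tuple(item: &str) -> Option<(u64, Option<u64>)> {
    let (first, second) = item.strip_prefix('(')?.split_once(", ")?;
    let first = first.parse::<u64>().ok()?;
    let second = match second {
        "None" => None,
        some => Some(
            some.strip_prefix("Some(")?
                .strip_suffix(')')?
                .parse::<u64>()
                .ok()?,
        ),
    };
    Some((first, second))
}

/// Collects every tuple that follows the assumed "Values: [" label.
/// Returns an empty Vec when the label or the closing ")]" is missing.
fn parse_values(line: &str) -> Vec<(u64, Option<u64>)> {
    let list = match line.split_once("Values: [") {
        Some((_, rest)) => rest.trim_end(),
        None => return Vec::new(),
    };
    // Drop the list's "]" and the last tuple's ")" so that every
    // piece produced by the split has the same shape.
    let list = match list.strip_suffix(")]") {
        Some(inner) => inner,
        None => return Vec::new(),
    };
    // Items that fail to parse are skipped rather than causing a panic.
    list.split("), ").filter_map(parse_tuple).collect()
}
```

Calling parse_values on a log line in the format shown in the question should give the same Vec<(u64, Option<u64>)> as the regex version. The trade-off is that this sketch is tied to the exact spacing and label of the log format, whereas the regex finds tuples anywhere in the line.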