r string split grep

Regex pattern to include string while excluding other patterns


I have a long table including such texts in each line "

The levels are separated by "|".

I want to split the lines in R according to how many levels they contain, such as

Subset 1

Subset 2

Subset 3

Subset 4

Thanks in advance


Solution

  • Sounds like you want to count the number of "|" characters and split accordingly.

    # fixed() is namespaced too, so this runs without library(stringr)
    split(your_vector, stringr::str_count(your_vector, pattern = stringr::fixed("|")))
    # $`0`
    # [1] "x__lorem_01" "x__lorem.05"
    # 
    # $`1`
    # [1] "x__lorem|y__ipsum"       "x__lorem|y__ipsum004_01"
    # 
    # $`2`
    # [1] "x__lorem.05|y__ipsum|z__dolor02_sit"
    # 
    # $`3`
    # [1] "x__lorem.05|y__ipsum|z__dolor02_sit|t__consectetur.adipiscing02"
    

    This uses the following sample data:

    your_vector <- c(
        "x__lorem_01",
        "x__lorem|y__ipsum",
        "x__lorem.05",
        "x__lorem.05|y__ipsum|z__dolor02_sit",
        "x__lorem.05|y__ipsum|z__dolor02_sit|t__consectetur.adipiscing02",
        "x__lorem|y__ipsum004_01" 
    )
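
If you would rather not depend on stringr, the same counting idea can be sketched in base R. This is only an illustration under the assumption that "|" appears solely as a level separator; `count_pipes`, `levels_by_depth` and `two_levels` are hypothetical names, not anything from the question. The last lines also show how a plain regex with `grep` could pick out a single level, since the title asks about a pattern.

```r
your_vector <- c(
  "x__lorem_01",
  "x__lorem|y__ipsum",
  "x__lorem.05",
  "x__lorem.05|y__ipsum|z__dolor02_sit",
  "x__lorem.05|y__ipsum|z__dolor02_sit|t__consectetur.adipiscing02",
  "x__lorem|y__ipsum004_01"
)

# Hypothetical helper: number of literal "|" characters in each string.
# fixed = TRUE makes "|" a plain character instead of regex alternation.
count_pipes <- function(x) {
  lengths(regmatches(x, gregexpr("|", x, fixed = TRUE)))
}

# Group the strings by separator count; list names are "0", "1", "2", "3".
levels_by_depth <- split(your_vector, count_pipes(your_vector))

# Regex alternative for one level only: strings with exactly one "|".
# Inside [^|] the pipe is literal; outside it has to be escaped as \\|.
two_levels <- grep("^[^|]*\\|[^|]*$", your_vector, value = TRUE)
```

`levels_by_depth[["0"]]` then holds the single-level strings, `levels_by_depth[["1"]]` the two-level ones, and so on. The `grep` form is handy when only one subset is needed; the `split` form gives all of them in one pass.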