python python-3.x regex pep

How to use the Python packaging library with a custom regex?


I'm trying to build and tag artifacts. The environment name gets appended to the end of the release, e.g. 1.0.0-stg or 1.0.0-sndbx. Neither of these is PEP 440 compliant, so parsing them raises the following error:

raise InvalidVersion(f"Invalid version: '{version}'")
packaging.version.InvalidVersion: Invalid version: '1.0.0-stg'

Using the packaging library I know I can access the regex by doing:

from packaging import version
version.VERSION_PATTERN

However, my question is: how can I customize the regex so that it also accepts these environment suffixes?


Solution

  • The default version.VERSION_PATTERN uses named regex groups to separate the parts of a version. Besides the epoch, release, and local segments, it defines three kinds of release suffix: pre-release, post-release, and dev release.

    v?
    (?:
        (?:(?P<epoch>[0-9]+)!)?                           # epoch
        (?P<release>[0-9]+(?:\.[0-9]+)*)                  # release segment
        (?P<pre>                                          # pre-release
            [-_\.]?
            (?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview))
            [-_\.]?
            (?P<pre_n>[0-9]+)?
        )?
        (?P<post>                                         # post release
            (?:-(?P<post_n1>[0-9]+))
            |
            (?:
                [-_\.]?
                (?P<post_l>post|rev|r)
                [-_\.]?
                (?P<post_n2>[0-9]+)?
            )
        )?
        (?P<dev>                                          # dev release
            [-_\.]?
            (?P<dev_l>dev)
            [-_\.]?
            (?P<dev_n>[0-9]+)?
        )?
    )
    (?:\+(?P<local>[a-z0-9]+(?:[-_\.][a-z0-9]+)*))?       # local version
    

    So you should add the stg and sndbx environment names to the alternation of whichever group fits. If a suffix is not listed in any group, the pattern does not match and InvalidVersion is raised, as with 1.0.0-stg in the question. A suffix is parsed as a local version only when it follows a +, e.g. 1.0.0+stg.

    If those are pre-releases, you should add them to the pre_l group.

    (?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview|stg|sndbx))
    

    If they are post-release environments, change the post_l group instead:

    (?P<post_l>post|rev|r|stg|sndbx)
    

    And if those are dev release environments, the dev_l group should be updated in the same way.

    (?P<dev_l>dev|stg|sndbx)
    

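    Once you have decided on the type of release and extended the matching group, you can override version.VERSION_PATTERN. The example below adds both names to the dev_l group. Note that the packaging documentation expects the pattern to be compiled with the re.VERBOSE and re.IGNORECASE flags, and that Version may already hold its own compiled copy of the pattern, so reassigning the constant might not by itself change what Version(...) accepts. Compile the customized pattern yourself if you need to be sure.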

    from packaging import version
    
    version.VERSION_PATTERN = r"""
        v?
        (?:
            (?:(?P<epoch>[0-9]+)!)?                           # epoch
            (?P<release>[0-9]+(?:\.[0-9]+)*)                  # release segment
            (?P<pre>                                          # pre-release
                [-_\.]?
                (?P<pre_l>(a|b|c|rc|alpha|beta|pre|preview))
                [-_\.]?
                (?P<pre_n>[0-9]+)?
            )?
            (?P<post>                                         # post release
                (?:-(?P<post_n1>[0-9]+))
                |
                (?:
                    [-_\.]?
                    (?P<post_l>post|rev|r)
                    [-_\.]?
                    (?P<post_n2>[0-9]+)?
                )
            )?
            (?P<dev>                                          # dev release
                [-_\.]?
                (?P<dev_l>dev|stg|sndbx)
                [-_\.]?
                (?P<dev_n>[0-9]+)?
            )?
        )
        (?:\+(?P<local>[a-z0-9]+(?:[-_\.][a-z0-9]+)*))?       # local version
    """
    

    NOTE: The two environment identifiers do not have to go into the same group. For example, stg can be added to pre_l and sndbx to dev_l.
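
As a minimal sketch of putting a customized pattern to work, the snippet below compiles it directly with the standard re module instead of relying on how packaging consumes the constant. It uses a trimmed copy of the pattern (release and dev segments only) to stay short, and anchors it the same way a full-string version check would. The names CUSTOM_PATTERN and split_version are hypothetical, chosen for illustration; they are not part of packaging.

```python
import re

# Trimmed copy of the pattern above: only the release and dev segments are
# kept, and dev_l is extended with the environment names from the question.
CUSTOM_PATTERN = r"""
    v?
    (?P<release>[0-9]+(?:\.[0-9]+)*)      # release segment
    (?P<dev>                              # dev release or environment suffix
        [-_\.]?
        (?P<dev_l>dev|stg|sndbx)
        [-_\.]?
        (?P<dev_n>[0-9]+)?
    )?
"""

# The pattern is verbose and case-insensitive, so both flags are required.
# It is unanchored by design, so anchor it to validate a whole string.
_regex = re.compile(r"^\s*" + CUSTOM_PATTERN + r"\s*$", re.VERBOSE | re.IGNORECASE)


def split_version(text):
    """Hypothetical helper: return (release tuple, suffix label, suffix number)."""
    match = _regex.match(text)
    if match is None:
        raise ValueError(f"Invalid version: {text!r}")
    release = tuple(int(part) for part in match.group("release").split("."))
    label = match.group("dev_l")
    number = int(match.group("dev_n") or 0)  # a missing number counts as 0
    return release, (label.lower() if label else None), number


print(split_version("1.0.0-stg"))    # ((1, 0, 0), 'stg', 0)
print(split_version("1.0.0-sndbx"))  # ((1, 0, 0), 'sndbx', 0)
```

A suffix that is not listed in the group, such as 1.0.0-prod, still fails to match and raises ValueError here, which mirrors the InvalidVersion behaviour from the question.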