different between python and ruby when parsing URL path, which is valid?

I have a URL string as:

url = "https://foo.bar.com/path/to/aaa.bbb/ccc.ddd;dc_trk_aid=486652617;tfua=;gdpr=;gdpr_consent=?&339286293"

when using Python

from urllib.parse import urlparse

url_obj = urlparse(url)
url_obj.path  # `path/to/aaa.bbb/ccc.ddd`

when using ruby

url_obj = URI.parse(url)

url_obj.path # `path/to/aaa.bbb/ccc.ddd;dc_trk_aid=486652617;tfua=;gdpr=;gdpr_consent=`

I guess python is consider ; is not part of the url path, which one is 'correct'?

Solution

Python's urllib is wrong. RFC 3986 Uniform Resource Identifier (URI): Generic Syntax, Section 3.3 Path explicitly gives this exact syntax as an example for a valid path [bold emphasis mine]:

Aside from dot-segments in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same. Parameter types may be defined by scheme-specific semantics, but in most cases the syntax of a parameter is specific to the implementation of the URI's dereferencing algorithm.

The correct interpretation of the example URI you posted is the following:

scheme = https
authority = foo.bar.com
- userinfo = empty
- host = foo.bar.com
- port = empty, derived from the scheme to be 443
path = /path/to/aaa.bbb/ccc.ddd;dc_trk_aid=486652617;tfua=;gdpr=;gdpr_consent=, consisting of the following four path segments:
1. path
2. to
3. aaa.bbb
4. ccc.ddd;dc_trk_aid=486652617;tfua=;gdpr=;gdpr_consent=
query = &339286293
fragment = empty