HEX

File: //usr/local/lib/python3.10/dist-packages/httpx/__pycache__/_urlparse.cpython-310.pyc
o

���grH�
@s�dZddlmZddlZddlZddlZddlZddlmZdZ	dZ
dZe�d	�Z
d
�dd�ed
d�D��Zd
�dd�ed
d�D��Zd
�dd�ed
d�D��Zd
�dd�ed
d�D��Zd
�dd�ed
d�D��Zd
�dd�ed
d�D��Ze�djdddddd��Ze�djdddd��Ze�d�e�d�e�d�e�d�e�d�e�d�e�d�e�d�d �Ze�d!�Ze�d"�ZGd#d$�d$ej�ZdFdGd*d+�ZdHd-d.�ZdId3d4�Z dJd:d;�Z!dKd<d=�Z"dLd?d@�Z#dMdBdC�Z$dMdDdE�Z%dS)Na�
An implementation of `urlparse` that provides URL validation and normalization
as described by RFC3986.

We rely on this implementation rather than the one in Python's stdlib, because:

* It provides more complete URL validation.
* It properly differentiates between an empty querystring and an absent querystring,
  to distinguish URLs with a trailing '?'.
* It handles scheme, hostname, port, and path normalization.
* It supports IDNA hostnames, normalizing them to their encoded form.
* The API supports passing individual components, as well as the complete URL string.

Previously we relied on the excellent `rfc3986` package to handle URL parsing and
validation, but this module provides a simpler alternative, with less indirection
required.
�)�annotationsN�)�
InvalidURLizBABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~z!$&'()*+,;=z%[A-Fa-f0-9]{2}�cC�g|]
}|dvrt|��qS))� �"�<�>�`��chr��.0�i�r�:/usr/local/lib/python3.10/dist-packages/httpx/_urlparse.py�
<listcomp>,�rr�cCr))rr�#r	r
rrrrrr2rcCr))	rrrr	r
�?r�{�}rrrrrr8s
�cCr�)rrrr	r
rrrr�/�:�;�=�@�[�\�]�^�|rrrrrrC���cCrrrrrrrrMr%cCr))rrrr	r
rrrrrrrrr r!r"r#r$rrrrrrZr%z�(?:(?P<scheme>{scheme}):)?(?://(?P<authority>{authority}))?(?P<path>{path})(?:\?(?P<query>{query}))?(?:#(?P<fragment>{fragment}))?z([a-zA-Z][a-zA-Z0-9+.-]*)?z[^/?#]*z[^?#]*z[^#]*z.*��scheme�	authority�path�query�fragmentzA(?:(?P<userinfo>{userinfo})@)?(?P<host>{host}):?(?P<port>{port})?z(\[.*\]|[^:@]*))�userinfo�host�portz[^@]*z(\[.*\]|[^:]*))r'r(r)r*r+r,r-r.z ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$z^\[.*\]$c@sveZdZUded<ded<ded<ded<ded<ded	<ded
<eddd
��Zeddd��Zddd�Zddd�ZdS)�ParseResult�strr'r,r-�
int | Noner.r)�
str | Noner*r+�returncCsVd�|jr|j�d�ndd|jvrd|j�d�n|j|jdur'd|j��g�Sdg�S)Nr�@�:�[�])�joinr,r-r.��selfrrrr(�s����zParseResult.authoritycCsBd�d|jvrd|j�d�n|j|jdurd|j��g�Sdg�S)Nrr5r6r7)r8r-r.r9rrr�netloc�s����zParseResult.netloc�kwargscKs:|s|S|j|j|j|j|jd�}|�|�tdi|��S)Nr&r�r)r'r(r)r*r+�update�urlparse)r:r<�defaultsrrr�	copy_with�s�
zParseResult.copy_withcCsl|j}d�|jr|j�d�nd|rd|��nd|j|jdur$d|j��nd|jdur2d|j��g�Sdg�S)Nrr5�//�?�#)r(r8r'r)r*r+)r:r(rrr�__str__�s����zParseResult.__str__N)r3r0)r<r2r3r/)	�__name__�
__module__�__qualname__�__annotations__�propertyr(r;rArErrrrr/�s
	
r/�urlr0r<r2r3c"Ks�t|�tkr
td��tdd�|D��r.tdd�|D��}|�|�}d|�d|�d�}t|��d|vrC|d}t|t�r?t|�n||d<d	|vrZ|�	d	�pMd
}|�
d�\|d<}|d<d
|vsbd|vr�t|�	d
d
�pjd
td�}t|�	dd
�pvd
t
d�}	|	r�|�d|	��n||d<d|vr�|�	d�p�d
}
|
�
d�\|d<}|d<|s�d|d<d|vr�|�d�p�d
}d|vr�|�d�r�|�d�s�d|�d�|d<|��D]O\}
}|du�rt|�tkr�td|
�d���tdd�|D���rtdd�|D��}|�|�}d|
�d|�d|�d�}t|��t|
�|��std|
�d���q�t�|�}|du�s%J�|��}|�d|d��p3d
}|�d |d ��p>d
}|�d|d��pId
}|�d|d�}|�d!|d!�}t�|�}|du�sfJ�|��}|�d|d��ptd
}|�d|d��pd
}|�d|d�}|��}t|td�}t|�}t||�}|d
k}|d
k�p�|d
k�p�|du}t|||d"�|�s�|�r�t|�}t|td�}|du�r�dnt|td�} |du�r�dnt|t d�}!t!|||||| |!�S)#NzURL too longcs�"�|]}|��o|��VqdS�N��isascii�isprintable�r�charrrr�	<genexpr>��� zurlparse.<locals>.<genexpr>cs�$�|]
}|��r|��s|VqdSrMrNrQrrrrS�s�"z.Invalid non-printable ASCII character in URL, z
 at position �.r.r;rr5r-�username�password��safer,�raw_pathrCr)r*r6r7zURL component 'z
' too longcsrLrMrNrQrrrrSrTcsrUrMrNrQrrrrSs���
�z-Invalid non-printable ASCII character in URL z component, zInvalid URL component '�'r'r(r+)�
has_scheme�
has_authority)"�len�MAX_URL_LENGTHr�any�next�find�
isinstance�intr0�pop�	partition�quote�
USERNAME_SAFE�
PASSWORD_SAFE�get�
startswith�endswith�items�COMPONENT_REGEX�	fullmatch�	URL_REGEX�match�	groupdict�AUTHORITY_REGEX�lower�
USERINFO_SAFE�encode_host�normalize_port�
validate_path�normalize_path�	PATH_SAFE�
QUERY_SAFE�	FRAG_SAFEr/)"rKr<rR�idx�errorr.r;�_rWrXr[�	seperatorr-�key�value�	url_match�url_dictr'r(r)r*�frag�authority_match�authority_dictr,�
parsed_scheme�parsed_userinfo�parsed_host�parsed_portr]r^�parsed_path�parsed_query�parsed_fragrrrr?�s�
�
�
����


��r?r-cCs�|sdSt�|�r!zt�|�W|Stjy td|����wt�|�rGzt�|dd��Wntjy@td|����w|dd�S|��rWd}t	|�
�t|d�Szt�
|�
���d�WStjyqtd	|����w)
NrzInvalid IPv4 address: r���zInvalid IPv6 address: z"`{}%|\rY�asciizInvalid IDNA hostname: )�IPv4_STYLE_HOSTNAMErr�	ipaddress�IPv4Address�AddressValueErrorr�IPv6_STYLE_HOSTNAME�IPv6AddressrOrhru�
SUB_DELIMS�idna�encode�decode�	IDNAError)r-�WHATWG_SAFErrrrw\s0
��
	��rwr.�str | int | Noner'r1cCsd|dus|dkr
dSzt|�}Wntytd|����wdddddd��|�}||kr0dS|S)NrzInvalid port: ��Pi�)�ftp�http�https�ws�wss)re�
ValueErrorrrk)r.r'�port_as_int�default_portrrrrx�s
��rxr)r]�boolr^�NonecCsR|r
|r
|�d�s
td��|s#|s%|�d�rtd��|�d�r'td��dSdSdS)z�
    Path validation rules that depend on if the URL contains
    a scheme or authority component.

    See https://datatracker.ietf.org/doc/html/rfc3986.html#section-3.3
    �/z7For absolute URLs, path must be empty or begin with '/'rBz3Relative URLs cannot have a path starting with '//'r5z2Relative URLs cannot have a path starting with ':'N)rlr)r)r]r^rrrry�s

�rycCsvd|vr|S|�d�}d|vrd|vr|Sg}|D]}|dkr q|dkr0|r/|dgkr/|��q|�|�qd�|�S)z�
    Drop "." and ".." segments from a URL path.

    For example:

        normalize_path("/path/./to/somewhere/..") == "/path/to"
    rVr�z..r)�splitrf�appendr8)r)�
components�output�	componentrrrrz�s	
�
rz�stringcCsd�dd�|�d�D��S)NrcSsg|]}d|d���qS)�%�02Xr)r�byterrrr�szPERCENT.<locals>.<listcomp>zutf-8)r8r�)r�rrr�PERCENT�sr�rZcs.t|�|���s|Sd��fdd�|D��S)z1
    Use percent-encoding to quote a string.
    rcs g|]}|�vr
|nt|��qSr)r�rQ��NON_ESCAPED_CHARSrrr�s z#percent_encoded.<locals>.<listcomp>)�UNRESERVED_CHARACTERS�rstripr8)r�rZrr�r�percent_encoded�s
�r�c
Cs�g}d}t�t|�D]*}|��|��}}|�d�}||kr-|||�}|�t||d��|�|�|}q
|t|�krJ||d�}	|�t|	|d��d�	|�S)a�
    Use percent-encoding to quote a string, omitting existing '%xx' escape sequences.

    See: https://www.rfc-editor.org/rfc/rfc3986#section-2.1

    * `string`: The string to be percent-escaped.
    * `safe`: A string containing characters that may be treated as safe, and do not
        need to be escaped. Unreserved characters are always treated as safe.
        See: https://www.rfc-editor.org/rfc/rfc3986#section-2.3
    rrYNr)
�re�finditer�PERCENT_ENCODED_REGEX�start�end�groupr�r�r_r8)
r�rZ�parts�current_positionrr�start_position�end_position�matched_text�leading_text�
trailing_textrrrrh�s


rhr=)rKr0r<r2r3r/)r-r0r3r0)r.r�r'r0r3r1)r)r0r]r�r^r�r3r�)r)r0r3r0)r�r0r3r0)r�r0rZr0r3r0)&�__doc__�
__future__rr�r��typingr��_exceptionsrr`r�r��compiler�r8�ranger}r|r{rirjrv�formatrqrtror�r��
NamedTupler/r?rwrxryrzr�r�rhrrrr�<module>s��
������
��
�������

7

/