File: //usr/local/lib/python3.10/dist-packages/charset_normalizer/__pycache__/models.cpython-310.pyc
o

;��gj0�@s�ddlmZddlmZddlmZddlmZddlm	Z	ddl
mZmZm
Z
mZddlmZmZdd	lmZmZmZGd
d�d�ZGdd
�d
�ZeeefZe
eZGdd�d�ZdS)�)�annotations)�aliases)�sha256)�dumps)�sub)�Any�Iterator�List�Tuple�)�RE_POSSIBLE_ENCODING_INDICATION�TOO_BIG_SEQUENCE)�	iana_name�is_multi_byte_encoding�
unicode_rangec@sHeZdZ		dGdHdd�ZdIdd�ZdIdd�ZedJdd��ZdKdd�ZdKdd�Z	dLdd �Z
edKd!d"��ZedMd$d%��ZedNd&d'��Z
edNd(d)��ZedMd*d+��ZedKd,d-��ZedJd.d/��ZedJd0d1��ZedJd2d3��ZedJd4d5��ZedOd6d7��ZedPd9d:��ZedNd;d<��ZedMd=d>��ZedMd?d@��ZdQdRdCdD�ZedKdEdF��ZdS)S�CharsetMatchN�payload�bytes�guessed_encoding�str�mean_mess_ratio�float�has_sig_or_bom�bool�	languages�CoherenceMatches�decoded_payload�
str | None�preemptive_declarationcCsL||_||_||_||_||_d|_g|_d|_d|_d|_	||_
||_dS)N�)�_payload�	_encoding�_mean_mess_ratio�
_languages�_has_sig_or_bom�_unicode_ranges�_leaves�_mean_coherence_ratio�_output_payload�_output_encoding�_string�_preemptive_declaration)�selfrrrrrrr�r-�D/usr/local/lib/python3.10/dist-packages/charset_normalizer/models.py�__init__s

zCharsetMatch.__init__�other�object�returncCs>t|t�st|t�rt|�|jkSdS|j|jko|j|jkS)NF)�
isinstancerrr�encoding�fingerprint�r,r0r-r-r.�__eq__*s


zCharsetMatch.__eq__cCs�t|t�st�t|j|j�}t|j|j�}|dkr%|dkr%|j|jkS|dkr@|dkr@t|j�tkr:|j|jkS|j	|j	kS|j|jkS)zQ
        Implemented to make sorted available upon CharsetMatches items.
        g{�G�z�?g{�G�z�?)
r3r�
ValueError�abs�chaos�	coherence�lenr r
�multi_byte_usage)r,r0�chaos_difference�coherence_differencer-r-r.�__lt__1s
zCharsetMatch.__lt__cCsdtt|��t|j�S)Ng�?)r<r�raw�r,r-r-r.r=GszCharsetMatch.multi_byte_usagecCs"|jdurt|j|jd�|_|jS)N�strict)r*rr r!rBr-r-r.�__str__Ks
zCharsetMatch.__str__cCsd|j�d|j�d�S)Nz<CharsetMatch 'z' bytes(z)>)r4r5rBr-r-r.�__repr__QszCharsetMatch.__repr__�NonecCs8t|t�r	||krtd�|j���d|_|j�|�dS)Nz;Unable to add instance <{}> as a submatch of a CharsetMatch)r3rr8�format�	__class__r*r&�appendr6r-r-r.�add_submatchTs��zCharsetMatch.add_submatchcC�|jS�N)r!rBr-r-r.r4_�zCharsetMatch.encoding�	list[str]cCsDg}t��D]\}}|j|kr|�|�q|j|kr|�|�q|S)z�
        Encoding name are known by many name, using this could help when searching for IBM855 when it's listed as CP855.
        )r�itemsr4rI)r,�
also_known_as�u�pr-r-r.�encoding_aliasescs


�zCharsetMatch.encoding_aliasescCrKrL�r$rBr-r-r.�bomprMzCharsetMatch.bomcCrKrLrTrBr-r-r.�byte_order_marktrMzCharsetMatch.byte_order_markcCsdd�|jD�S)z�
        Return the complete list of possible languages found in decoded sequence.
        Usually not really useful. Returned list may be empty even if 'language' property return something != 'Unknown'.
        cSsg|]}|d�qS)rr-)�.0�er-r-r.�
<listcomp>~�z*CharsetMatch.languages.<locals>.<listcomp>�r#rBr-r-r.rxszCharsetMatch.languagescCsp|js1d|jvr
dSddlm}m}t|j�r||j�n||j�}t|�dks+d|vr-dS|dS|jddS)z�
        Most probable language found in decoded sequence. If none were detected or inferred, the property will return
        "Unknown".
        �ascii�Englishr)�encoding_languages�mb_encoding_languageszLatin Based�Unknown)r#�could_be_from_charset�charset_normalizer.cdr^r_rr4r<)r,r^r_rr-r-r.�language�s
��zCharsetMatch.languagecCrKrL)r"rBr-r-r.r:�rMzCharsetMatch.chaoscCs|jsdS|jddS)Nrrrr[rBr-r-r.r;�szCharsetMatch.coherencecC�t|jddd�S�N�d�)�ndigits)�roundr:rBr-r-r.�
percent_chaos��zCharsetMatch.percent_chaoscCrdre)rir;rBr-r-r.�percent_coherence�rkzCharsetMatch.percent_coherencecCrK)z+
        Original untouched bytes.
        )r rBr-r-r.rA�szCharsetMatch.raw�list[CharsetMatch]cCrKrL)r&rBr-r-r.�submatch�rMzCharsetMatch.submatchcC�t|j�dkS�Nr)r<r&rBr-r-r.�has_submatch�szCharsetMatch.has_submatchcCs@|jdur|jSdd�t|�D�}ttdd�|D���|_|jS)NcSsg|]}t|��qSr-)r)rW�charr-r-r.rY�rZz*CharsetMatch.alphabets.<locals>.<listcomp>cSsh|]}|r|�qSr-r-)rW�rr-r-r.�	<setcomp>�rZz)CharsetMatch.alphabets.<locals>.<setcomp>)r%r�sorted�list)r,�detected_rangesr-r-r.�	alphabets�s

zCharsetMatch.alphabetscCs|jgdd�|jD�S)z�
        The complete list of encoding that output the exact SAME str result and therefore could be the originating
        encoding.
        This list does include the encoding available in property 'encoding'.
        cSsg|]}|j�qSr-)r4)rW�mr-r-r.rY�sz6CharsetMatch.could_be_from_charset.<locals>.<listcomp>)r!r&rBr-r-r.ra�sz"CharsetMatch.could_be_from_charset�utf_8r4cs~�jdus
�j|kr<|�_t��}�jdur5�j��dvr5tt�fdd�|dd�dd�}||dd�}|�|d��_�jS)	z�
        Method to get re-encoded bytes payload using given target encoding. Default to UTF-8.
        Any errors will be simply ignored by the encoder NOT replaced.
        N)zutf-8�utf8rzcs<|j|��d|��d��|��dt�j��dd��S)Nrr�_�-)�string�span�replace�groupsrr))ryrBr-r.�<lambda>�s
�z%CharsetMatch.output.<locals>.<lambda>i r)�countr�)r)rr+�lowerrr�encoder()r,r4�decoded_string�patched_headerr-rBr.�output�s 
�

�
zCharsetMatch.outputcCst|�����S)zw
        Retrieve the unique SHA256 computed using the transformed (re-encoded) payload. Not the original one.
        )rr��	hexdigestrBr-r-r.r5�szCharsetMatch.fingerprint)NN)rrrrrrrrrrrrrr)r0r1r2r)r2r�r2r)r0rr2rF)r2rN�r2r)r2r)r2rm)rz)r4rr2r)�__name__�
__module__�__qualname__r/r7r@�propertyr=rDrErJr4rSrUrVrrcr:r;rjrlrArnrqrxrar�r5r-r-r-r.r
sX�




	rc@sbeZdZdZdddd�Zd d	d
�Zd!dd�Zd"dd�Zd#dd�Zd$dd�Z	d%dd�Z
d%dd�ZdS)&�CharsetMatchesz�
    Container with every CharsetMatch items ordered by default from most probable to the less one.
    Act like a list(iterable) but does not implements all related methods.
    N�results�list[CharsetMatch] | NonecCs|r	t|�|_dSg|_dSrL)ru�_results)r,r�r-r-r.r/�szCharsetMatches.__init__r2�Iterator[CharsetMatch]ccs�|jEdHdSrL�r�rBr-r-r.�__iter__�s�zCharsetMatches.__iter__�item�	int | strrcCsJt|t�r
|j|St|t�r#t|d�}|jD]}||jvr"|Sqt�)z�
        Retrieve a single item either by its position or encoding name (alias may be used here).
        Raise KeyError upon invalid index or encoding not present in results.
        F)r3�intr�rrra�KeyError)r,r��resultr-r-r.�__getitem__s





�zCharsetMatches.__getitem__r�cCs
t|j�SrL�r<r�rBr-r-r.�__len__s
zCharsetMatches.__len__rcCrorpr�rBr-r-r.�__bool__szCharsetMatches.__bool__rFcCs|t|t�std�t|j����t|j�tkr0|j	D]}|j
|j
kr/|j|jkr/|�|�dSq|j	�
|�t|j	�|_	dS)z~
        Insert a single match. Will be inserted accordingly to preserve sort.
        Can be inserted as a submatch.
        z-Cannot append instance '{}' to CharsetMatchesN)r3rr8rGrrHr<rAr
r�r5r:rJrIru)r,r��matchr-r-r.rIs
��

�zCharsetMatches.append�CharsetMatch | NonecCs|jsdS|jdS)zQ
        Simply return the first match. Strict equivalent to matches[0].
        Nrr�rBr-r-r.�best)s
zCharsetMatches.bestcCs|��S)zP
        Redundant method, call the method best(). Kept for BC reasons.
        )r�rBr-r-r.�first1szCharsetMatches.firstrL)r�r�)r2r�)r�r�r2r)r2r�r�)r�rr2rF)r2r�)r�r�r��__doc__r/r�r�r�r�rIr�r�r-r-r-r.r��s





r�c@s.eZdZddd�Zeddd��Zddd�ZdS)�CliDetectionResult�pathrr4rrSrN�alternative_encodingsrcrxrrr:rr;�unicode_path�is_preferredcCsF||_|
|_||_||_||_||_||_||_||_|	|_	||_
dSrL)r�r�r4rSr�rcrxrr:r;r�)r,r�r4rSr�rcrxrr:r;r�r�r-r-r.r/=s
zCliDetectionResult.__init__r2�dict[str, Any]cCs2|j|j|j|j|j|j|j|j|j|j	|j
d�S)N�r�r4rSr�rcrxrr:r;r�r�r�rBr-r-r.�__dict__Ws�zCliDetectionResult.__dict__cCst|jddd�S)NT�)�ensure_ascii�indent)rr�rBr-r-r.�to_jsongszCliDetectionResult.to_jsonN)r�rr4rrSrNr�rNrcrrxrNrrr:rr;rr�rr�r)r2r�r�)r�r�r�r/r�r�r�r-r-r-r.r�<s

r�N)�
__future__r�encodings.aliasesr�hashlibr�jsonr�rer�typingrrr	r
�constantrr
�utilsrrrrr�rr�CoherenceMatchrr�r-r-r-r.�<module>siC