File: //usr/local/lib/python3.10/dist-packages/charset_normalizer/__pycache__/api.cpython-310.pyc
o

;��gYX�@s:ddlmZddlZddlmZddlmZddlmZm	Z	m
Z
mZddlm
Z
mZmZmZddlmZdd	lmZmZdd
lmZmZmZmZmZmZmZe�d�Ze� �Z!e!�"e�#d��	
								d2d3d$d%�Z$	
								d2d4d(d)�Z%	
								d2d5d,d-�Z&	
								d6d7d0d1�Z'dS)8�)�annotationsN)�PathLike)�BinaryIO�)�coherence_ratio�encoding_languages�mb_encoding_languages�merge_coherence_ratios)�IANA_SUPPORTED�TOO_BIG_SEQUENCE�TOO_SMALL_SEQUENCE�TRACE)�
mess_ratio)�CharsetMatch�CharsetMatches)�any_specified_encoding�cut_sequence_chunks�	iana_name�identify_sig_or_bom�
is_cp_similar�is_multi_byte_encoding�should_strip_sig_or_bom�charset_normalizerz)%(asctime)s | %(levelname)s | %(message)s��皙�����?TF皙�����?�	sequences�bytes | bytearray�steps�int�
chunk_size�	threshold�float�cp_isolation�list[str] | None�cp_exclusion�preemptive_behaviour�bool�explain�language_threshold�enable_fallback�returnrc
2Cs�	t|ttf�std�t|����|rtj}
t�t	�t�
t�t|�}|dkrGt�
d�|r;t�t	�t�
|
p9tj�tt|dddgd�g�S|dur]t�td	d
�|��dd�|D�}ng}|durut�td
d
�|��dd�|D�}ng}|||kr�t�td|||�d}|}|dkr�|||kr�t||�}t|�tk}t|�tk}
|r�t�td�|��n|
r�t�td�|��g}|r�t|�nd}|dur�|�|�t�td|�t�}g}g}d}d}d}t�}t�}t|�\}}|du�r|�|�t�tdt|�|�|�d�d|v�r|�d�|tD�]5}|�r$||v�r$�q|�r.||v�r.�q||v�r5�q|�|�d}||k}|�oFt|�}|dv�rX|�sXt�td|��q|dv�ri|�sit�td|��qzt|�}Wnt t!f�y�t�td|�Y�qwz9|
�r�|du�r�t"|du�r�|dtd��n	|t|�td��|d�nt"|du�r�|n|t|�d�|d�}Wn+t#t$f�y�}zt|t$��s�t�td|t"|��|�|�WYd}~�qd}~wwd} |D]
}!t%||!��r�d} n�q�| �rt�td||!��qt&|�sdnt|�|t||��}"|�o&|du�o&t|�|k}#|#�r1t�td |�tt|"�d!�}$t'|$d"�}$d}%d}&g}'g}(zLt(|||"||||||�	D]=})|'�|)�|(�t)|)||du�ordt|�k�opd"kn��|(d#|k�r�|%d7}%|%|$k�s�|�r�|du�r�n�qSWn!t#�y�}zt�td$|t"|��|$}%d}&WYd}~nd}~ww|&�s�|
�r�|�s�z|td%�d�j*|d&d'�Wn#t#�y�}zt�td(|t"|��|�|�WYd}~�qd}~ww|(�r�t+|(�t|(�nd}*|*|k�s|%|$k�rH|�|�t�td)||%t,|*d*d+d,��|	�rF|dd|fv�rF|&�sFt|||dg||d-�}+||k�r<|+}n
|dk�rD|+}n|+}�qt�td.|t,|*d*d+d,��|�s^t-|�},nt.|�},|,�rqt�td/�|t"|,���g}-|dk�r�|'D]})t/|)||,�r�d0�|,�nd�}.|-�|.��qzt0|-�}/|/�r�t�td1�|/|��t|||*||/|
du�s�||ddfv�r�|nd|d-�}0|�|0�||ddfv�r�|*d2k�r�|*dk�r�t�
d3|0j1�|�r�t�t	�t�
|
�t|0g�S|�|0�t|��r+|du�s||v�r+d|v�r+d|v�r+|�2�}1t�
d3|1j1�|�r$t�t	�t�
|
�t|1g�S||k�rLt�
d4|�|�rCt�t	�t�
|
�t||g�S�qt|�dk�r�|�s^|�s^|�rdt�td5�|�rtt�
d6|j1�|�|�n2|�r||du�s�|�r�|�r�|j3|j3k�s�|du�r�t�
d7�|�|�n
|�r�t�
d8�|�|�|�r�t�
d9|�2�j1t|�d�nt�
d:�|�r�t�t	�t�
|
�|S);af
    Given a raw bytes sequence, return the best possible charsets usable to render str objects.
    If there are no results, it is a strong indicator that the source is binary/not text.
    By default, the process will extract 5 blocks of 512 bytes each to assess the mess and coherence of a given sequence,
    and will give up on a particular code page after 20% of measured mess. Those criteria are customizable at will.

    The preemptive behavior DOES NOT replace the traditional detection workflow; it prioritizes a particular code page
    but never takes it for granted. It can improve performance.

    You may want to focus your attention on some code pages and/or exclude others; use cp_isolation and cp_exclusion
    for that purpose.

    This function will strip the SIG in the payload/sequence every time, except for UTF-16 and UTF-32.
    By default the library does not set up any handler other than the NullHandler. If you choose to set the 'explain'
    toggle to True, it will alter the logger configuration to add a StreamHandler that is suitable for debugging.
    A custom logging format and handler can be set manually.
    z3Expected object of type bytes or bytearray, got: {}rz<Encoding detection on empty bytes, assuming utf_8 intention.�utf_8gF�Nz`cp_isolation is set. use this flag for debugging purpose. limited list of encoding allowed : %s.z, cS�g|]}t|d��qS�F�r��.0�cp�r5�A/usr/local/lib/python3.10/dist-packages/charset_normalizer/api.py�
<listcomp>[�zfrom_bytes.<locals>.<listcomp>zacp_exclusion is set. use this flag for debugging purpose. limited list of encoding excluded : %s.cSr/r0r1r2r5r5r6r7fr8z^override steps (%i) and chunk_size (%i) as content does not fit (%i byte(s) given) parameters.rz>Trying to detect encoding from a tiny portion of ({}) byte(s).zIUsing lazy str decoding because the payload is quite large, ({}) byte(s).z@Detected declarative mark in sequence. Priority +1 given for %s.zIDetected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.�ascii>�utf_16�utf_32z\Encoding %s won't be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.>�utf_7zREncoding %s won't be tested as-is because detection is unreliable without BOM/SIG.z2Encoding %s does not provide an IncrementalDecoderg��A)�encodingz9Code page %s does not fit given bytes sequence at ALL. %sTzW%s is deemed too similar to code page %s and was consider unsuited already. Continuing!zpCode page %s is a multi byte encoding table and it appear that at least one character was encoded using n-bytes.�����zaLazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. %sgj�@�strict)�errorsz^LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %szc%s was excluded because of initial chaos probing. Gave up %i time(s). Computed mean chaos is %f %%.�d�)�ndigits)�preemptive_declarationz=%s passed initial chaos probing. Mean measured chaos is %f %%z&{} should target any language(s) of {}�,z We detected language {} using {}rz.Encoding detection: %s is most likely the one.zoEncoding detection: %s is most likely the one as we detected a BOM or SIG within the beginning of the sequence.zONothing got out of the detection process. 
Using ASCII/UTF-8/Specified fallback.z7Encoding detection: %s will be used as a fallback matchz:Encoding detection: utf_8 will be used as a fallback matchz:Encoding detection: ascii will be used as a fallback matchz]Encoding detection: Found %s as plausible (best-candidate) for content. With %i alternatives.z=Encoding detection: Unable to determine any suitable charset.)4�
isinstance�	bytearray�bytes�	TypeError�format�type�logger�level�
addHandler�explain_handler�setLevelr
�len�debug�
removeHandler�logging�WARNINGrr�log�joinr rrr�append�setrr
�addrr�ModuleNotFoundError�ImportError�str�UnicodeDecodeError�LookupErrorr�range�maxrr�decode�sum�roundrrrr	r=�best�fingerprint)2rrr!r"r$r&r'r)r*r+�previous_logger_level�length�is_too_small_sequence�is_too_large_sequence�prioritized_encodings�specified_encoding�tested�tested_but_hard_failure�tested_but_soft_failure�fallback_ascii�fallback_u8�fallback_specified�results�early_stop_results�sig_encoding�sig_payload�
encoding_iana�decoded_payload�bom_or_sig_available�strip_sig_or_bom�is_multi_byte_decoder�e�similar_soft_failure_test�encoding_soft_failed�r_�multi_byte_bonus�max_chunk_gave_up�early_stop_count�lazy_str_hard_failure�	md_chunks�	md_ratios�chunk�mean_mess_ratio�fallback_entry�target_languages�	cd_ratios�chunk_languages�cd_ratios_merged�
current_match�probable_resultr5r5r6�
from_bytes!s ��



��������
�

�




�����
��	
����
����
��
��
�
&��
�
���������
��

�
���
	
�
��
���
��


�


�

�


�

����
��




�


r��fprc

Cst|��|||||||||	�
S)z�
    Same as the function from_bytes, but using a file pointer that is already open.
    Will not close the file pointer.
    )r��read)
r�rr!r"r$r&r'r)r*r+r5r5r6�from_fp s�r��path�str | bytes | PathLikec
CsHt|d��}
t|
|||||||||	�
Wd�S1swYdS)z�
    Same as the function from_bytes, but with one extra step: opening and reading the given file path in binary mode.
    Can raise IOError.
    �rbN)�openr�)r�rr!r"r$r&r'r)r*r+r�r5r5r6�	from_path>s�$�r��fp_or_path_or_payload�!PathLike | str | BinaryIO | bytesc
Cs�t|ttf�rt||||||||||	d�
}
|
St|ttf�r0t||||||||||	d�
}
|
St||||||||||	d�
}
|
S)a)
    Detect whether the given input (file, bytes, or path) points to a binary file, i.e. not a string.
    Based on the same main heuristic algorithms and default kwargs, with the sole exception that fallback matches
    are disabled, to be stricter about content that is ASCII-compatible but unlikely to be a string.
    )	rr!r"r$r&r'r)r*r+)rHr_rr�rJrIr�r�)r�rr!r"r$r&r'r)r*r+�guessesr5r5r6�	is_binary]s\�-������
r�)	rrrNNTFrT)rrrr r!r r"r#r$r%r&r%r'r(r)r(r*r#r+r(r,r)r�rrr r!r r"r#r$r%r&r%r'r(r)r(r*r#r+r(r,r)r�r�rr r!r r"r#r$r%r&r%r'r(r)r(r*r#r+r(r,r)	rrrNNTFrF)r�r�rr r!r r"r#r$r%r&r%r'r(r)r(r*r#r+r(r,r()(�
__future__rrV�osr�typingr�cdrrrr	�constantr
rrr
�mdr�modelsrr�utilsrrrrrrr�	getLoggerrN�
StreamHandlerrQ�setFormatter�	Formatterr�r�r�r�r5r5r5r6�<module>st$

��� �!�