File: //usr/libexec/oracle-cloud-agent/plugins/osms/charset_normalizer/api.pyc
a

ٓ�fiR�
@s�ddlZddlmZddlmZmZmZmZmZddl	m
Z
mZmZm
Z
ddlmZmZmZmZddlmZddlmZmZdd	lmZmZmZmZmZmZmZe� d
�Z!e�"�Z#e#�$e�%d��dee&e'fe(e(e)eee*eee*e+e+e)e+ed�dd�Z,dee(e(e)eee*eee*e+e+e)e+ed�dd�Z-d ee*e&efe(e(e)eee*eee*e+e+e)e+ed�dd�Z.d!eee*ee&fe(e(e)eee*eee*e+e+e)e+e+d�dd�Z/dS)"�N)�PathLike)�BinaryIO�List�Optional�Set�Union�)�coherence_ratio�encoding_languages�mb_encoding_languages�merge_coherence_ratios)�IANA_SUPPORTED�TOO_BIG_SEQUENCE�TOO_SMALL_SEQUENCE�TRACE)�
mess_ratio)�CharsetMatch�CharsetMatches)�any_specified_encoding�cut_sequence_chunks�	iana_name�identify_sig_or_bom�
is_cp_similar�is_multi_byte_encoding�should_strip_sig_or_bom�charset_normalizerz)%(asctime)s | %(levelname)s | %(message)s��皙�����?TF皙�����?)�	sequences�steps�
chunk_size�	threshold�cp_isolation�cp_exclusion�preemptive_behaviour�explain�language_threshold�enable_fallback�returnc
/Cs	t|ttf�s td�t|����|r>tj}
t�t	�t�
t�t|�}|dkr�t�
d�|rvt�t	�t�
|
prtj�tt|dddgd�g�S|dur�t�td	d
�|��dd�|D�}ng}|dur�t�td
d
�|��dd�|D�}ng}|||k�rt�td|||�d}|}|dk�r:|||k�r:t||�}t|�tk}t|�tk}
|�rlt�td�|��n|
�r�t�td�|��g}|�r�t|�nd}|du�r�|�|�t�td|�t�}g}g}d}d}d}t�}t|�\}}|du�r|�|�t�tdt|�|�|�d�d|v�r.|�d�|tD�]�}|�rP||v�rP�q6|�rd||v�rd�q6||v�rr�q6|�|�d}||k}|�o�t|�}|dv�r�|�s�t�td|��q6|dv�r�|�s�t�td|��q6zt|�}Wn,t t!f�yt�td|�Y�q6Yn0zr|
�r^|du�r^t"|du�rB|dtd��n|t|�td��|d�n&t"|du�rn|n|t|�d�|d�}Wnbt#t$f�y�}zDt|t$��s�t�td|t"|��|�|�WYd}~�q6WYd}~n
d}~00d}|D]} t%|| ��r�d}�q�q�|�r*t�td|| ��q6t&|�s6dnt|�|t||��}!|�of|du�oft|�|k}"|"�r|t�td |�tt|!�d!�}#t'|#d"�}#d}$d}%g}&g}'z�t(|||!||||||�	D]|}(|&�|(�|'�t)|(||du�o�dt|�k�o�d"kn��|'d#|k�r|$d7}$|$|#k�s4|�r�|du�r��q>�q�WnBt#�y�}z(t�td$|t"|��|#}$d}%WYd}~n
d}~00|%�s|
�r|�sz|td%�d�j*|d&d'�WnRt#�y}z8t�td(|t"|��|�|�WYd}~�q6WYd}~n
d}~00|'�rt+|'�t|'�nd})|)|k�s6|$|#k�r�|�|�t�td)||$t,|)d*d+d,��|	�r6|dd|fv�r6|%�s6t|||dg|�}*||k�r�|*}n|dk�r�|*}n|*}�q6t�td-|t,|)d*d+d,��|�s�t-|�}+nt.|�}+|+�rt�td.�|t"|+���g},|dk�rF|&D],}(t/|(||+�r2d/�|+�nd�}-|,�|-��qt0|,�}.|.�rht�td0�|.|��|�t|||)||.|��||ddfv�r�|)d1k�r�t�
d2|�|�r�t�t	�t�
|
�t||g�S||k�r6t�
d3|�|�rt�t	�t�
|
�t||g�S�q6t|�dk�r�|�s8|�s8|�rDt�td4�|�rdt�
d5|j1�|�|�nd|�rt|du�s�|�r�|�r�|j2|j2k�s�|du�r�t�
d6�|�|�n|�r�t�
d7�|�|�|�r�t�
d8|�3�j1t|�d�n
t�
d9�|�	rt�t	�t�
|
�|S):af
    Given a raw bytes sequence, return the best possible charsets usable to render str objects.
    If there are no results, it is a strong indicator that the source is binary/not text.
    By default, the process extracts 5 blocks of 512 bytes each to assess the mess and coherence of a given sequence,
    and gives up on a particular code page once 20% mess has been measured. Those criteria are customizable at will.

    The preemptive behavior DOES NOT replace the traditional detection workflow; it prioritizes a particular code page
    but never takes it for granted. It can improve performance.

    You may want to focus on some code pages and/or exclude others; use cp_isolation and cp_exclusion for that
    purpose.

    This function strips the SIG from the payload/sequence every time, except for UTF-16 and UTF-32.
    By default the library does not set up any handler other than the NullHandler. If you set the 'explain'
    toggle to True, it alters the logger configuration to add a StreamHandler that is suitable for debugging.
    A custom logging format and handler can be set manually.
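
The sampling and give-up rules described in the docstring can be sketched roughly as follows. This is a simplified, standard-library-only illustration, not the library's implementation: `plan_chunks` and `should_give_up` are hypothetical names, and the defaults (5 steps, 512-byte blocks, 0.2 threshold) are taken from the docstring and the default arguments visible in this dump.

```python
from typing import List, Sequence, Tuple


def plan_chunks(length: int, steps: int = 5, chunk_size: int = 512) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets of the blocks to sample (illustrative only)."""
    if length == 0:
        return []
    # When the payload is too short for the requested plan, sample it as one block.
    if steps * chunk_size >= length:
        return [(0, length)]
    stride = length // steps
    offsets = [(start, min(start + chunk_size, length)) for start in range(0, length, stride)]
    # Integer division can leave a trailing extra start offset; keep only `steps` blocks.
    return offsets[:steps]


def should_give_up(mess_ratios: Sequence[float], threshold: float = 0.2) -> bool:
    """Reject a code page when the mean measured mess reaches the threshold."""
    mean_mess = sum(mess_ratios) / len(mess_ratios) if mess_ratios else 0.0
    return mean_mess >= threshold


print(plan_chunks(10_000))
print(should_give_up([0.1, 0.5]))
```

Raising `threshold` makes the sketch more tolerant of noisy input, while lowering `steps` or `chunk_size` trades accuracy for speed; the real function exposes the same three knobs as keyword arguments.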
    z4Expected object of type bytes or bytearray, got: {0}rz<Encoding detection on empty bytes, assuming utf_8 intention.�utf_8gF�Nz`cp_isolation is set. use this flag for debugging purpose. limited list of encoding allowed : %s.z, cSsg|]}t|d��qS�F�r��.0�cp�r2�z/sparta/input/_build_configuration/image_build+validate/lib/bmcenv/lib64/python3.9/site-packages/charset_normalizer/api.py�
<listcomp>[�zfrom_bytes.<locals>.<listcomp>zacp_exclusion is set. use this flag for debugging purpose. limited list of encoding excluded : %s.cSsg|]}t|d��qSr-r.r/r2r2r3r4fr5z^override steps (%i) and chunk_size (%i) as content does not fit (%i byte(s) given) parameters.rz>Trying to detect encoding from a tiny portion of ({}) byte(s).zIUsing lazy str decoding because the payload is quite large, ({}) byte(s).z@Detected declarative mark in sequence. Priority +1 given for %s.zIDetected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.�ascii>�utf_32�utf_16z\Encoding %s won't be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.>�utf_7zREncoding %s won't be tested as-is because detection is unreliable without BOM/SIG.z2Encoding %s does not provide an IncrementalDecoderg��A)�encodingz9Code page %s does not fit given bytes sequence at ALL. %sTzW%s is deemed too similar to code page %s and was consider unsuited already. Continuing!zpCode page %s is a multi byte encoding table and it appear that at least one character was encoded using n-bytes.�����zaLazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. %sgj�@�strict)�errorsz^LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %szc%s was excluded because of initial chaos probing. Gave up %i time(s). Computed mean chaos is %f %%.�d�)�ndigitsz=%s passed initial chaos probing. Mean measured chaos is %f %%z&{} should target any language(s) of {}�,z We detected language {} using {}rz.Encoding detection: %s is most likely the one.zoEncoding detection: %s is most likely the one as we detected a BOM or SIG within the beginning of the sequence.zONothing got out of the detection process. 
Using ASCII/UTF-8/Specified fallback.z7Encoding detection: %s will be used as a fallback matchz:Encoding detection: utf_8 will be used as a fallback matchz:Encoding detection: ascii will be used as a fallback matchz]Encoding detection: Found %s as plausible (best-candidate) for content. With %i alternatives.z=Encoding detection: Unable to determine any suitable charset.)4�
isinstance�	bytearray�bytes�	TypeError�format�type�logger�level�
addHandler�explain_handler�setLevelr�len�debug�
removeHandler�logging�WARNINGrr�log�join�intrrr�append�setrr
�addrr�ModuleNotFoundError�ImportError�str�UnicodeDecodeError�LookupErrorr�range�maxrr�decode�sum�roundr
rr	rr:�fingerprint�best)/r r!r"r#r$r%r&r'r(r)�previous_logger_level�length�is_too_small_sequence�is_too_large_sequence�prioritized_encodings�specified_encoding�tested�tested_but_hard_failure�tested_but_soft_failure�fallback_ascii�fallback_u8�fallback_specified�results�sig_encoding�sig_payload�
encoding_iana�decoded_payload�bom_or_sig_available�strip_sig_or_bom�is_multi_byte_decoder�e�similar_soft_failure_test�encoding_soft_failed�r_�multi_byte_bonus�max_chunk_gave_up�early_stop_count�lazy_str_hard_failure�	md_chunks�	md_ratios�chunk�mean_mess_ratio�fallback_entry�target_languages�	cd_ratios�chunk_languages�cd_ratios_mergedr2r2r3�
from_bytes!s���



��������

�

�




��������
�
$
�
��
��
�
&��
��������
$
�
����

�
��
��������


�

������
��	



�


r�)�fpr!r"r#r$r%r&r'r(r)r*c

Cst|��|||||||||	�
S)z�
    Same as from_bytes, but takes a file pointer that is already open.
    Will not close the file pointer.
    )r��read)
r�r!r"r#r$r%r&r'r(r)r2r2r3�from_fp�s�r�)�pathr!r"r#r$r%r&r'r(r)r*c
CsHt|d��*}
t|
|||||||||	�
Wd�S1s:0YdS)z�
    Same as from_bytes, with one extra step: the given file path is opened and read in binary mode.
    Can raise IOError.
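
The relationship between the three entry points (path, file pointer, raw bytes) can be illustrated with a small sketch. It assumes a stand-in detector, `naive_detect`, which is hypothetical and far cruder than the real heuristics; `detect_from_fp` and `detect_from_path` are likewise illustrative names that only mirror the delegation pattern described in the docstrings.

```python
import io
from typing import BinaryIO, Callable, List

Detector = Callable[[bytes], List[str]]


def naive_detect(payload: bytes) -> List[str]:
    """Hypothetical stand-in for from_bytes: codecs that decode the payload strictly."""
    matches = []
    for name in ("ascii", "utf_8"):
        try:
            payload.decode(name, errors="strict")
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


def detect_from_fp(fp: BinaryIO, detect: Detector = naive_detect) -> List[str]:
    # Read everything from the already-open handle; the caller keeps ownership of it.
    return detect(fp.read())


def detect_from_path(path: str, detect: Detector = naive_detect) -> List[str]:
    # Open in binary mode; the handle is closed on exit. May raise OSError (IOError is its alias).
    with open(path, "rb") as fp:
        return detect_from_fp(fp, detect)


buffer = io.BytesIO("héllo".encode("utf_8"))
print(detect_from_fp(buffer), buffer.closed)
```

Keeping the file-pointer variant from closing its argument lets callers reuse or rewind the handle, which is why only the path variant uses a `with` block.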
    �rbN)�openr�)r�r!r"r#r$r%r&r'r(r)r�r2r2r3�	from_paths�r�)�fp_or_path_or_payloadr!r"r#r$r%r&r'r(r)r*c
Cszt|ttf�r,t||||||||||	d�
}
nHt|ttf�rXt||||||||||	d�
}
nt||||||||||	d�
}
|
S)a)
    Detect whether the given input (file, bytes, or path) points to a binary file, i.e. not a string.
    Based on the same main heuristic algorithms and default kwargs, with the sole exception that fallback matches
    are disabled, to be stricter with content that is ASCII-compatible but unlikely to be a string.
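
A minimal sketch of the dispatch this docstring describes: paths are opened and read, bytes are used directly, and anything else is treated as a readable file object. The helper `_decodes_as_text` is a hypothetical strict check (valid UTF-8 without NUL bytes) standing in for the real detection heuristics, and `is_binary_sketch` is an illustrative name, not part of the library.

```python
import io
from os import PathLike
from typing import BinaryIO, Union


def _decodes_as_text(payload: bytes) -> bool:
    """Hypothetical strict check with no fallback; not the library's heuristic."""
    if b"\x00" in payload:
        return False
    try:
        payload.decode("utf_8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def is_binary_sketch(source: Union[str, PathLike, bytes, bytearray, BinaryIO]) -> bool:
    """Dispatch on the input type, then report whether no text interpretation was found."""
    if isinstance(source, (str, PathLike)):
        with open(source, "rb") as fp:
            payload = fp.read()
    elif isinstance(source, (bytes, bytearray)):
        payload = bytes(source)
    else:
        # Assume a binary file object that is already open.
        payload = source.read()
    return not _decodes_as_text(payload)


print(is_binary_sketch(b"hello"))
print(is_binary_sketch(io.BytesIO(b"\xff\xfe\xfd")))
```

The result is simply the negation of "at least one plausible decoding exists", which is why disabling fallback matches makes the check stricter.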
    )	r!r"r#r$r%r&r'r(r))rDr\rr�rFrEr�r�)r�r!r"r#r$r%r&r'r(r)�guessesr2r2r3�	is_binary3sX����
�
r�)	rrrNNTFrT)	rrrNNTFrT)	rrrNNTFrT)	rrrNNTFrF)0rR�osr�typingrrrrr�cdr	r
rr�constantr
rrr�mdr�modelsrr�utilsrrrrrrr�	getLoggerrJ�
StreamHandlerrM�setFormatter�	FormatterrFrErV�floatr\�boolr�r�r�r�r2r2r2r3�<module>s�$
��


�Z�

� �

�!�

�