Current File : //lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyc

�
��abc@s�dZddlZddlZddlZddlmZddlmZmZm	Z	ddl
mZddlm
Z
ddlmZdd	lmZd
efd��YZdS(s
Module containing the UniversalDetector detector class, which is the primary
class a user of ``chardet`` should use.

:author: Mark Pilgrim (initial port to Python)
:author: Shy Shalom (original C code)
:author: Dan Blanchard (major refactoring for 3.0)
:author: Ian Cordasco
i����Ni(tCharSetGroupProber(t
InputStatetLanguageFiltertProbingState(tEscCharSetProber(tLatin1Prober(tMBCSGroupProber(tSBCSGroupProbertUniversalDetectorcBs�eZdZdZejd�Zejd�Zejd�Zidd6dd6d	d
6dd6d
d6dd6dd6dd6Z	e
jd�Zd�Z
d�Zd�ZRS(sq
    The ``UniversalDetector`` class underlies the ``chardet.detect`` function
    and coordinates all of the different charset probers.

    To get a ``dict`` containing an encoding and its confidence, you can simply
    run:

    .. code::

            u = UniversalDetector()
            u.feed(some_bytes)
            u.close()
            detected = u.result

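    MINIMUM_THRESHOLD = 0.20
    HIGH_BYTE_DETECTOR = re.compile(b'[\x80-\xFF]')
    ESC_DETECTOR = re.compile(b'(\033|~{)')
    WIN_BYTE_DETECTOR = re.compile(b'[\x80-\x9F]')
    ISO_WIN_MAP = {'iso-8859-1': 'Windows-1252',
                   'iso-8859-2': 'Windows-1250',
                   'iso-8859-5': 'Windows-1251',
                   'iso-8859-6': 'Windows-1256',
                   'iso-8859-7': 'Windows-1253',
                   'iso-8859-8': 'Windows-1255',
                   'iso-8859-9': 'Windows-1254',
                   'iso-8859-13': 'Windows-1257'}
cCsqd|_g|_d|_d|_d|_d|_d|_||_t	j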
t�|_d|_
|j�dS(N(tNonet_esc_charset_probert_charset_proberstresulttdonet	_got_datat_input_statet
_last_chartlang_filtertloggingt	getLoggert__name__tloggert_has_win_bytestreset(tselfR((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyt__init__Qs									cCs�idd6dd6dd6|_t|_t|_t|_tj|_d|_	|j
rg|j
j�nx|jD]}|j�qqWdS(s�
        Reset the UniversalDetector and all of its probers back to their
        initial states.  This is called by ``__init__``, so you only need to
        call this directly in between analyses of different documents.
        tencodinggt
confidencetlanguagetN(
R	RtFalseR
RRRt
PURE_ASCIIRRR
RR(Rtprober((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR^s					cCsy|jr
dSt|�sdSt|t�s;t|�}n|js{|jtj�rwidd6dd6dd6|_n�|jtj	tj
f�r�idd6dd6dd6|_n�|jd	�r�id
d6dd6dd6|_nl|jd�ridd6dd6dd6|_n<|jtjtjf�rOid
d6dd6dd6|_nt
|_|jddk	r{t
|_dSn|jtjkr�|jj|�r�tj|_q�|jtjkr�|jj|j|�r�tj|_q�n|d|_|jtjkr�|js(t|j�|_n|jj|�tjkrui|jjd6|jj�d6|jj d6|_t
|_qun�|jtjkru|j!s�t"|j�g|_!|jt#j$@r�|j!j%t&��n|j!j%t'��nx`|j!D]U}|j|�tjkr�i|jd6|j�d6|j d6|_t
|_Pq�q�W|j(j|�rut
|_)qundS(s�
        Takes a chunk of a document and feeds it through all of the relevant
        charset probers.

        After calling ``feed``, you can check the value of the ``done``
        attribute to see if you need to continue feeding the
        ``UniversalDetector`` more data, or if it has made a prediction
        (in the ``result`` attribute).

        .. note::
           You should always call ``close`` when you're done feeding in your
           document if ``done`` is not already ``True``.
        Ns	UTF-8-SIGRg�?RRRsUTF-32s��sX-ISO-10646-UCS-4-3412s��sX-ISO-10646-UCS-4-2143sUTF-16i����(*R
tlent
isinstancet	bytearrayRt
startswithtcodecstBOM_UTF8RtBOM_UTF32_LEtBOM_UTF32_BEtBOM_LEtBOM_BEtTrueR	RRRtHIGH_BYTE_DETECTORtsearcht	HIGH_BYTEtESC_DETECTORRt	ESC_ASCIIR
RRtfeedRtFOUND_ITtcharset_nametget_confidenceRRRRtNON_CJKtappendRRtWIN_BYTE_DETECTORR(Rtbyte_strR ((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR1os~		




		
	
	

	c	Cs>|jr|jSt|_|js5|jjd�n1|jtjkrhidd6dd6dd6|_n�|jtj	krfd}d}d}xD|jD]9}|s�q�n|j�}||kr�|}|}q�q�W|rf||j
krf|j}|jj�}|j�}|jd	�r?|jr?|jj||�}q?ni|d6|d6|jd6|_qfn|jj�tjkr7|jddkr7|jjd
�x�|jD]�}|s�q�nt|t�rx^|jD]+}|jjd|j|j|j��q�Wq�|jjd|j|j|j��q�Wq7n|jS(
s�
        Stop analyzing the current document and come up with a final
        prediction.

        :returns:  The ``result`` attribute, a ``dict`` with the keys
                   ``encoding``, ``confidence``, and ``language``.
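
Since ``close`` returns a ``dict`` whose ``encoding`` may be ``None``, code that consumes the result has to handle a missing or unusable guess. The sketch below shows one way to do that. ``decode_with_result`` and its ``fallback`` parameter are hypothetical and not part of chardet; the result dictionaries in the usage lines are hand-written examples, not real detector output.

```python
def decode_with_result(data, result, fallback='utf-8'):
    """Decode `data` using a detection result dict, or `fallback` if unusable.

    Hypothetical helper: `result` is assumed to have the `encoding` key
    described in the docstring above.
    """
    encoding = result.get('encoding') or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Codec name unknown to Python, or the guess was wrong:
        # fall back and replace undecodable bytes.
        return data.decode(fallback, errors='replace')


# Usage with hand-written result dicts:
guessed = {'encoding': 'Windows-1252', 'confidence': 0.73, 'language': ''}
text = decode_with_result(b'caf\xe9', guessed)

no_guess = {'encoding': None, 'confidence': 0.0, 'language': None}
plain = decode_with_result(b'abc', no_guess)
```

The ``LookupError`` branch matters because some names that appear in this file, such as ``X-ISO-10646-UCS-4-3412``, are not codecs that Python's standard library can decode.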
        sno data received!tasciiRg�?RRRgsiso-8859s no probers hit minimum thresholds%s %s confidence = %sN(R
RR+RRtdebugRRRR.R	RR4tMINIMUM_THRESHOLDR3tlowerR$RtISO_WIN_MAPtgetRtgetEffectiveLevelRtDEBUGR"Rtprobers(	Rtprober_confidencetmax_prober_confidencet
max_proberR R3tlower_charset_nameRtgroup_prober((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pytclose�s`				

		
(Rt
__module__t__doc__R;tretcompileR,R/R7R=RtALLRRR1RG(((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR3s"


		m(RIR%RRJtcharsetgroupproberRtenumsRRRt	escproberRtlatin1proberRtmbcsgroupproberRtsbcsgroupproberRtobjectR(((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyt<module>$s