Current File : /opt/alt/python310/lib64/python3.10/html/__pycache__/parser.cpython-310.pyc
Source file recorded in the bytecode: /opt/alt/python310/lib64/python3.10/html/parser.py

Module docstring:
    A parser for HTML and XHTML.

Imports: re, _markupbase, and unescape (from html)
Public name: HTMLParser

Regular expressions compiled at module level:
    [&<]
    &[a-zA-Z#]
    &([a-zA-Z][-.a-zA-Z0-9]*)[^a-zA-Z0-9]
    &#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]
    <[a-zA-Z]
    >
    --\s*>
    ([a-zA-Z][^\t\n\r\f />\x00]*)(?:\s|/(?!>))*
    ((?<=[\'"\s/])[^\s/>][^\s/=>]*)(\s*=+\s*(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?(?:\s|/(?!>))*
    </\s*([a-zA-Z][-.a-zA-Z0-9:_]*)\s*>

Verbose-mode pattern for locating the end of a start tag:
    <[a-zA-Z][^\t\n\r\f />\x00]*       # tag name
    (?:[\s/]*                          # optional whitespace before attribute name
      (?:(?<=['"\s/])[^\s/>][^\s/=>]*  # attribute name
        (?:\s*=+\s*                    # value indicator
          (?:'[^']*'                   # LITA-enclosed value
            |"[^"]*"                   # LIT-enclosed value
            |(?!['"])[^>\s]*           # bare value
           )
          \s*                          # possibly followed by a space
         )?(?:\s|/(?!>))*
       )*
     )?
    \s*                                # trailing whitespace

class HTMLParser(_markupbase.ParserBase)

    Find tags and other markup and call handler functions.

    Usage:
        p = HTMLParser()
        p.feed(data)
        ...
        p.close()

    Start tags are handled by calling self.handle_starttag() or
    self.handle_startendtag(); end tags by self.handle_endtag(). The
    data between tags is passed from the parser to the derived class
    by calling self.handle_data() with the data as argument (the data
    may be split up in arbitrary chunks). If convert_charrefs is
    True the character references are converted automatically to the
    corresponding Unicode character (and self.handle_data() is no
    longer split in chunks), otherwise they are passed by calling
    self.handle_entityref() or self.handle_charref() with the string
    containing respectively the named or numeric reference as the
    argument.

    Elements whose content is treated as CDATA: script, style

    __init__(self, *, convert_charrefs=True)
        Initialize and reset this instance.
        If convert_charrefs is True (the default), all character
        references are automatically converted to the corresponding
        Unicode characters.

    reset(self)
        Reset this instance. Loses all unprocessed data.
        Sets rawdata to '', lasttag to '???', interesting to
        interesting_normal and cdata_elem to None, then calls
        _markupbase.ParserBase.reset(self).

    feed(self, data)
        Feed data to the parser.
        Call this as often as you want, with as little or as much text
        as you want (may include '\n').
        Appends data to self.rawdata and calls self.goahead(0).

    close(self)
        Handle any buffered data.
        Calls self.goahead(1).

    get_starttag_text(self)
        Return full source of start tag: '<...>'.

    set_cdata_mode(self, elem)
        Stores elem.lower() in self.cdata_elem and sets self.interesting
        to re.compile('</\s*%s\s*>' % self.cdata_elem, re.I).

    clear_cdata_mode(self)
        Restores self.interesting to interesting_normal and sets
        self.cdata_elem to None.

    Other method names listed in the class body: goahead,
    parse_html_declaration, parse_bogus_comment, parse_pi,
    parse_starttag, check_for_whole_start_tag, parse_endtag,
    handle_startendtag, handle_starttag, handle_endtag, handle_charref,
    handle_entityref, handle_data, handle_comment, handle_decl,
    handle_pi, unknown_decl.
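
The usage pattern in the HTMLParser class docstring (create a parser, call feed() as often as needed, then close()) can be sketched as follows. This is a minimal illustration rather than part of the module: EventCollector and collect_events are hypothetical names introduced here, and the sketch assumes the standard html.parser module is importable.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical subclass that records handler calls in a list."""

    def __init__(self):
        # convert_charrefs=True is the default; it is spelled out here
        # because the docstring describes its effect on handle_data().
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect_events(chunks):
    """Feed each chunk to a fresh parser, close it, and return the events."""
    parser = EventCollector()
    for chunk in chunks:
        parser.feed(chunk)   # may be called with as little or as much text as you want
    parser.close()           # handle any buffered data
    return parser.events


if __name__ == "__main__":
    for event in collect_events(['<p class="x">a &amp; b</p>']):
        print(event)
```

Because convert_charrefs is left at True, the `&amp;` reference reaches handle_data() already converted to `&`. With convert_charrefs=False it would instead be reported through handle_entityref().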