File: //lib64/python3.8/html/__pycache__/parser.cpython-38.pyc
U
    e5d9E  �                   @   s�   d Z ddlZddlZddlZddlmZ dgZe�d�Ze�d�Z	e�d�Z
e�d�Ze�d	�Ze�d
�Z
e�d�Ze�d�Ze�d
�Ze�dej�Ze�d
�Ze�d�ZG dd� dej�ZdS )zA parser for HTML and XHTML.�    N)�unescape�
HTMLParserz[&<]z
&[a-zA-Z#]z%&([a-zA-Z][-.a-zA-Z0-9]*)[^a-zA-Z0-9]z)&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]z	<[a-zA-Z]�>z--\s*>z+([a-zA-Z][^\t\n\r\f />\x00]*)(?:\s|/(?!>))*z]((?<=[\'"\s/])[^\s/>][^\s/=>]*)(\s*=+\s*(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?(?:\s|/(?!>))*aF  
  <[a-zA-Z][^\t\n\r\f />\x00]*       # tag name
  (?:[\s/]*                          # optional whitespace before attribute name
    (?:(?<=['"\s/])[^\s/>][^\s/=>]*  # attribute name
      (?:\s*=+\s*                    # value indicator
        (?:'[^']*'                   # LITA-enclosed value
          |"[^"]*"                   # LIT-enclosed value
          |(?!['"])[^>\s]*           # bare value
         )
        \s*                          # possibly followed by a space
       )?(?:\s|/(?!>))*
     )*
   )?
  \s*                                # trailing whitespace
z#</\s*([a-zA-Z][-.a-zA-Z0-9:_]*)\s*>c                   @   s�   e Zd ZdZdZdd�dd�Zdd� Zd	d
� Zdd� Zd
Z	dd� Z
dd� Zdd� Zdd� Z
dd� Zd9dd�Zdd� Zdd� Zdd � Zd!d"� Zd#d$� Zd%d&� Zd'd(� Zd)d*� Zd+d,� Zd-d.� Zd/d0� Zd1d2� Zd3d4� Zd5d6� Zd7d8� Zd
S ):r   aE  Find tags and other markup and call handler functions.
    Usage:
        p = HTMLParser()
        p.feed(data)
        ...
        p.close()
    Start tags are handled by calling self.handle_starttag() or
    self.handle_startendtag(); end tags by self.handle_endtag().  The
    data between tags is passed from the parser to the derived class
    by calling self.handle_data() with the data as argument (the data
    may be split up in arbitrary chunks).  If convert_charrefs is
    True the character references are converted automatically to the
    corresponding Unicode character (and self.handle_data() is no
    longer split in chunks), otherwise they are passed by calling
    self.handle_entityref() or self.handle_charref() with the string
    containing respectively the named or numeric reference as the
    argument.
    )ZscriptZstyleT)�convert_charrefsc                C   s   || _ | ��  dS )z�Initialize and reset this instance.
        If convert_charrefs is True (the default), all character references
        are automatically converted to the corresponding Unicode characters.
        N)r   �reset)�selfr   � r   �#/usr/lib64/python3.8/html/parser.py�__init__W   s    zHTMLParser.__init__c                 C   s(   d| _ d| _t| _d| _tj�| � dS )z1Reset this instance.  Loses all unprocessed data.� z???N)�rawdata�lasttag�interesting_normal�interesting�
cdata_elem�_markupbase�
ParserBaser   �r   r   r   r	   r   `   s
    zHTMLParser.resetc                 C   s   | j | | _ | �d� dS )z�Feed data to the parser.
        Call this as often as you want, with as little or as much text
        as you want (may include '\n').
        r   N)r   �goahead�r   �datar   r   r	   �feedh   s    zHTMLParser.feedc                 C   s   | � d� dS )zHandle any buffered data.�   N)r   r   r   r   r	   �closeq   s    zHTMLParser.closeNc                 C   s   | j S )z)Return full source of start tag: '<...>'.)�_HTMLParser__starttag_textr   r   r   r	   �get_starttag_textw   s    zHTMLParser.get_starttag_textc                 C   s$   |� � | _t�d| j tj�| _d S )Nz</\s*%s\s*>)�lowerr   �re�compile�Ir   )r   �elemr   r   r	   �set_cdata_mode{   s    
zHTMLParser.set_cdata_modec                 C   s   t | _d | _d S �N)r   r   r   r   r   r   r	   �clear_cdata_mode   s    zHTMLParser.clear_cdata_modec                 C   sX  | j }d}t|�}||k �r�| jrv| jsv|�d|�}|dk r�|�dt||d ��}|dkrpt�d��	||�sp�q�|}n*| j
�	||�}|r�|�� }n| jr��q�|}||k r�| jr�| js�| �t
|||� �� n| �|||� � | �||�}||kr��q�|j}|d|��rJt�||��r"| �|�}	n�|d|��r:| �|�}	nn|d|��rR| �|�}	nV|d|��rj| �|�}	n>|d	|��r�| �|�}	n&|d
 |k �r�| �d� |d
 }	n�q�|	dk �r<|�s��q�|�d|d
 �}	|	dk �r�|�d|d
 �}	|	dk �r|d
 }	n|	d
7 }	| j�r*| j�s*| �t
|||	� �� n| �|||	� � | �||	�}q|d|��r�t�||�}|�r�|�� d
d� }
| �|
� |�� }	|d|	d
 ��s�|	d
 }	| �||	�}qn<d||d � k�r�| �|||d
 � � | �||d
 �}�q�q|d|��r�t�||�}|�rP|�d
�}
| �|
� |�� }	|d|	d
 ��sB|	d
 }	| �||	�}qt�||�}|�r�|�r�|�� ||d � k�r�|�� }	|	|k�r�|}	| �||d
 �}�q�n.|d
 |k �r�| �d� | �||d
 �}n�q�qdstd��q|�rF||k �rF| j�sF| j�r(| j�s(| �t
|||� �� n| �|||� � | �||�}||d � | _ d S )Nr   �<�&�"