File: //home/arjun/projects/env/lib/python3.10/site-packages/weasyprint/__pycache__/html.cpython-310.pyc
&we�,�@s�dZzddlmZWneyddlmZYnwdd�ZddlZddlmZmZdd	lm	Z	dd
l
mZddlm
Z
ddlmZdd
lmZddlmZe�Zeed�Zeed�Zeed�Zeeed�Zeeed�Zeed�ZdZe�de�d��Zdd�Zdd�ZiZ dd�Z!dd�Z"dd �Z#e"d!�d"d#��Z$e"d$�d%d&��Z%e"d'�d(d)��Z&e"d*�d+d,��Z'e"d-�d.d/��Z(e"d0�d1d2��Z)d3d4�Z*d5d6�Z+e�d7ej,�Z-d8d9�Z.dS):aVSpecific handling for some HTML elements, especially replaced elements.

Replaced elements (eg. <img> elements) are rendered externally and behave as an
atomic opaque box in CSS. In general, they may or may not have intrinsic
dimensions. But the only replaced elements currently supported in WeasyPrint
are images with intrinsic dimensions.

�)�files)�	read_textcCst|�|�d�S)Nzutf-8)rr)�package�resource�r�H/home/arjun/projects/env/lib/python3.10/site-packages/weasyprint/html.pyrsrN�)�CSS�css)�get_child_text)�CounterStyle)�boxes)�SVGImage)�LOGGER)�get_url_attributezhtml5_ua.csszhtml5_ua_form.csszhtml5_ph.css)�string�
counter_style�rz 	

z[^z]+cCs|������S)a4Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.

    This is used for `ASCII case-insensitive
    <https://whatwg.org/C#ascii-case-insensitive>`_ matching.

    This is different from the :meth:`str.lower` method of Unicode strings
    which also affect non-ASCII characters,
    sometimes mapping them into the ASCII range:

    >>> keyword = 'Bac\N{KELVIN SIGN}ground'
    >>> assert keyword.lower() == 'background'
    >>> assert ascii_lower(keyword) != keyword.lower()
    >>> assert ascii_lower(keyword) == 'bac\N{KELVIN SIGN}ground'

    )�encode�lower�decoderrrr�ascii_lower-srcs(t�|�dd��}t�fdd�|D��S)zDReturn whether element has a ``rel`` attribute with given link type.�rel�c3s�|]	}t|��kVqdS)N)r)�.0�token��	link_typerr�	<genexpr>Ds�z(element_has_link_type.<locals>.<genexpr>)�HTML_SPACE_SEPARATED_TOKENS_RE�findall�get�any)�elementr�tokensrrr�element_has_link_typeAsr%cCs$|jtvrt|j||||�S|gS)zbHandle HTML elements that need special care.

    :returns: a (possibly empty) list of boxes.
    )�element_tag�
HTML_HANDLERS�tag)r#�box�get_image_from_uri�base_urlrrr�handle_elementKs

�r,cs�fdd�}|S)zDReturn a decorator registering a function handling ``tag`` elements.cs|t�<|S)z;Decorator registering a function handling ``tag`` elements.)r')�function�r(rr�	decoratorYszhandler.<locals>.decoratorr)r(r/rr.r�handlerWsr0cCs@d|jdvr
tjntj}||j|j||�}|j|_|j|_|S)z�Wrap an image in a replaced box.

    That box is either block-level or inline-level, depending on what the
    element should be.

    �block�display)�styler
�BlockReplacedBox�InlineReplacedBoxr(�
string_set�bookmark_label)r#r)�image�type_�new_boxrrr�make_replaced_box`s�r;�imgcCs�t|d|�}|�d�}|r=|||jdd�}|dur!t|||�gS|r/tj�||�g|_|gS|dkr5gS|dus;J�gS|rKtj�||�g|_|gSgS)z�Handle ``<img>`` elements.

    Return either an image or the alt-text.

    See: https://www.w3.org/TR/html5/embedded-content-1.html#the-img-element

    �src�alt�image_orientation)�url�orientationNr)rr!r3r;r
�TextBox�anonymous_from�children)r#r)r*r+r=r>r8rrr�
handle_imgrs&	

�rE�embedcCsNt|d|�}|�dd���}|r%||||jdd�}|dur%t|||�gSgS)z�Handle ``<embed>`` elements, return either an image or nothing.

    See: https://www.w3.org/TR/html5/embedded-content-0.html#the-embed-element

    r=�typerr?�r@�forced_mime_typerAN�rr!�stripr3r;)r#r)r*r+r=r9r8rrr�handle_embed�s�rL�objectcCsPt|d|�}|�dd���}|r%||||jdd�}|dur%t|||�gS|gS)z�Handle ``<object>`` elements, return either an image or the fallback.

    See: https://www.w3.org/TR/html5/embedded-content-0.html#the-object-element

    �datarGrr?rHNrJ)r#r)r*r+rNr9r8rrr�
handle_object�s�rO�colgroupcs>t�tj�rtdd�|D��s�fdd�t�j�D��_�gS)�Handle the ``span`` attribute.css�|]}|jdkVqdS)�colNr.)r�childrrrr�s�z"handle_colgroup.<locals>.<genexpr>csg|]	}tj��g��qSr)r
�TableColumnBoxrC)r�_�r)rr�
<listcomp>�s��z#handle_colgroup.<locals>.<listcomp>)�
isinstancer
�TableColumnGroupBoxr"�range�spanrD�r#r)�_get_image_from_uri�	_base_urlrrVr�handle_colgroup�s
�r_rRcs4t�tj�r�jdkr�fdd�t�j�D�S�gS)rQrcsg|]}����qSr)�copy)r�_irVrrrW�szhandle_col.<locals>.<listcomp>)rXr
rTr[rZr\rrVr�
handle_col�srbz{http://www.w3.org/2000/svg}svgc
Csj|jd}|jd}z	t||||�}Wnty-}zt�d|�gWYd}~Sd}~wwt|||�gS)zUHandle ``<svg>`` elements.

    Return either an image or the fallback content.

    �url_fetcher�contextzFailed to load inline SVG: %sN)�keywordsr�	Exceptionr�errorr;)r#r)r*r+rcrdr8�	exceptionrrr�
handle_svg�s

��ricCs�d}d}d}g}g}d}d}g}i}	|jj�dd�}
|j�ddd�D]�}|j}|jdkr5|dur5t|�}q"|jdkr�t|�dd��}|�dd�}
|d	kratt	|
�
d
��D]}||vr_|�|�qTq"|dkrk|�|
�q"|dkrv|duru|
}q"|d
kr�|dur�|
}q"|dkr�|dur�t||
�}q"|dkr�|dur�t||
�}q"|r�||	vr�|
|	|<q"|jdkr�t
|d�r�t|d|j�}|�dd�}|dur�t�d�q"|�||f�q"t|||||||||
|	d�
S)a2Get metadata dictionary out of HTML object.

    Relevant specs:

    https://www.whatwg.org/html#the-title-element
    https://www.whatwg.org/html#standard-metadata-names
    https://wiki.whatwg.org/wiki/MetaExtensions
    https://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions

    N�lang�title�meta�link�namer�contentre�,�author�description�	generatorzdcterms.createdzdcterms.modified�
attachment�hrefz'Missing href in <link rel="attachment">)
rkrrrsre�authors�created�modified�attachmentsrj�custom)�
etree_element�attribr!�wrapper_element�	query_allr(rr�map�strip_whitespace�split�append�parse_w3c_dater%rr+rrg�dict)�htmlrkrrrsrervrwrxryrzrjr#rnro�keywordr@�attachment_titlerrr�get_html_metadata�st


����
�
�����r�cCs
|�t�S)z�Use the HTML definition of "space character",
    not all Unicode Whitespace.

    https://www.whatwg.org/html#strip-leading-and-trailing-whitespace
    https://www.whatwg.org/html#space-character

    )rK�HTML_WHITESPACErrrrr�#s
r�aG
    ^
    [ \t\n\f\r]*
    (?P<year>\d\d\d\d)
    (?:
        -(?P<month>0\d|1[012])
        (?:
            -(?P<day>[012]\d|3[01])
            (?:
                T(?P<hour>[01]\d|2[0-3])
                :(?P<minute>[0-5]\d)
                (?:
                    :(?P<second>[0-5]\d)
                    (?:\.\d+)?  # Second fraction, ignored
                )?
                (?:
                    Z |  # UTC
                    (?P<tz_hour>[+-](?:[01]\d|2[0-3]))
                    :(?P<tz_minute>[0-5]\d)
                )
            )?
        )?
    )?
    [ \t\n\f\r]*
    $
cCs t�|�r|St�d||�dS)zYParse datetimes as defined by the W3C.

    See https://www.w3.org/TR/NOTE-datetime

    z#Invalid date in <meta name="%s"> %rN)�W3C_DATE_RE�matchr�warning)�	meta_namerrrrr�Qs

�r�)/�__doc__�importlib.resourcesr�ImportErrorr�rerr	r
r�css.countersr�formatting_structurer
�imagesr�loggerr�urlsr�HTML5_UA_COUNTER_STYLE�HTML5_UA�
HTML5_UA_FORM�HTML5_PH�HTML5_UA_STYLESHEET�HTML5_UA_FORM_STYLESHEET�HTML5_PH_STYLESHEETr��compilerrr%r'r,r0r;rErLrOr_rbrir�r��VERBOSEr�r�rrrr�<module>sh	�