a *eg@sdZddlZddlZddlZddlmZmZddlmZddl m Z m Z m Z m Z mZddlmZmZmZmZmZmZddlmZgdZed Zgd Zed d ZGd ddeZGdddedZ Gddde Z!Gddde"Z#GdddZ$e$Z%Gddde&Z'GdddZ(ddZ)GdddZ*e*Z+d d!Z,Gd"d#d#Z-Gd$d%d%eZ.Gd&d'd'eZ/Gd(d)d)e e/dZ0Gd*d+d+Z1Gd,d-d-e0Z2d.d/Z3Gd0d1d1e/Z4Gd2d3d3e0e4dZ5dS)4z pygments.lexer ~~~~~~~~~~~~~~ Base lexer classes. :copyright: Copyright 2006-2024 by the Pygments team, see AUTHORS. :license: BSD, see LICENSE for details. N) apply_filtersFilter)get_filter_by_name)ErrorTextOther Whitespace _TokenType) get_bool_opt get_int_opt get_list_optmake_analysatorFuture guess_decode) regex_opt) Lexer RegexLexerExtendedRegexLexerDelegatingLexer LexerContextincludeinheritbygroupsusingthisdefaultwordsline_rez.*? ))sutf-8)szutf-32)szutf-32be)szutf-16)szutf-16becCsdS)Nxr r D/usr/local/lib/python3.9/site-packages/pip/_vendor/pygments/lexer.py"r$c@seZdZdZddZdS) LexerMetaz This metaclass automagically converts ``analyse_text`` methods into static methods which always return float values. cCs(d|vrt|d|d<t||||S)N analyse_text)r type__new__)Zmcsnamebasesdr r r#r)+szLexerMeta.__new__N)__name__ __module__ __qualname____doc__r)r r r r#r&%sr&c@sneZdZdZdZgZgZgZgZdZ dZ dZ dZ ddZ ddZdd Zd d Zd d ZdddZddZdS)ra" Lexer for a specific language. See also :doc:`lexerdevelopment`, a high-level guide to writing lexers. Lexer classes have attributes used for choosing the most appropriate lexer based on various criteria. .. autoattribute:: name :no-value: .. autoattribute:: aliases :no-value: .. autoattribute:: filenames :no-value: .. autoattribute:: alias_filenames .. autoattribute:: mimetypes :no-value: .. autoattribute:: priority Lexers included in Pygments should have two additional attributes: .. autoattribute:: url :no-value: .. autoattribute:: version_added :no-value: Lexers included in Pygments may have additional attributes: .. autoattribute:: _example :no-value: You can pass options to the constructor. The basic options recognized by all lexers and processed by the base `Lexer` class are: ``stripnl`` Strip leading and trailing newlines from the input (default: True). ``stripall`` Strip all leading and trailing whitespace from the input (default: False). ``ensurenl`` Make sure that the input ends with a newline (default: True). This is required for some lexers that consume input linewise. .. versionadded:: 1.3 ``tabsize`` If given and greater than 0, expand tabs in the input (default: 0). ``encoding`` If given, must be an encoding name. This encoding will be used to convert the input string to Unicode, if it is not already a Unicode string (default: ``'guess'``, which uses a simple UTF-8 / Locale / Latin1 detection. Can also be ``'chardet'`` to use the chardet library, if it is installed. ``inencoding`` Overrides the ``encoding`` if given. NrcKs||_t|dd|_t|dd|_t|dd|_t|dd|_|dd |_|d pZ|j|_g|_ t |d d D]}| |qpd S)a This constructor takes arbitrary options as keyword arguments. Every subclass must first process its own options and then call the `Lexer` constructor, since it processes the basic options like `stripnl`. An example looks like this: .. sourcecode:: python def __init__(self, **options): self.compress = options.get('compress', '') Lexer.__init__(self, **options) As these options must all be specifiable as strings (due to the command line usage), there are various utility functions available to help with that, see `Utilities`_. stripnlTstripallFensurenltabsizerencodingguessZ inencodingfiltersr N) optionsr r1r2r3r r4getr5r7r add_filter)selfr8filter_r r r#__init__szLexer.__init__cCs2|jrd|jjd|jdSd|jjdSdS)Nz)r8 __class__r-r;r r r#__repr__szLexer.__repr__cKs*t|tst|fi|}|j|dS)z8 Add a new stream filter to this lexer. N) isinstancerrr7append)r;r<r8r r r#r:s zLexer.add_filtercCsdS)a A static method which is called for lexer guessing. It should analyse the text and return a float in the range from ``0.0`` to ``1.0``. If it returns ``0.0``, the lexer will not be selected as the most probable one, if it returns ``1.0``, it will be selected immediately. This is used by `guess_lexer`. The `LexerMeta` metaclass automatically wraps this function so that it works like a static method (no ``self`` or ``cls`` parameter) and the return value is automatically converted to `float`. If the return value is an object that is boolean `False` it's the same as if the return values was ``0.0``. Nr )textr r r#r'szLexer.analyse_textc Cst|ts|jdkr"t|\}}n|jdkrz tdWn.tyf}ztd|WYd}~n d}~00d}tD].\}}||rp|t|d|d}qqp|durt |dd}|| dpd d}|}n&||j}|d r|td d}n|d r|td d}| d d }| d d }|j rD|}n|jrV|d }|jdkrn||j}|jr|d s|d 7}|S)zVApply preprocessing such as decoding the input, removing BOM and normalizing newlines.r6chardetzchardet is not vendored by pipzkTo enable chardet encoding guessing, please install the chardet library from http://chardet.feedparser.org/Nreplaceir5ruz   r)rBstrr5r ImportError _encoding_map startswithlendecoderEdetectr9rFr2stripr1r4 expandtabsr3endswith)r;rD_edecodedbomr5encr r r#_preprocess_lexer_inputsJ               zLexer._preprocess_lexer_inputFcs4fdd}|}|s0t|j}|S)ae This method is the basic interface of a lexer. It is called by the `highlight()` function. It must process the text and return an iterable of ``(tokentype, value)`` pairs from `text`. Normally, you don't need to override this method. The default implementation processes the options recognized by all lexers (`stripnl`, `stripall` and so on), and then yields all tokens from `get_tokens_unprocessed()`, with the ``index`` dropped. If `unfiltered` is set to `True`, the filtering mechanism is bypassed even if filters are defined. c3s$D]\}}}||fVq dSN)get_tokens_unprocessed)rStvr;rDr r#streamersz"Lexer.get_tokens..streamer)rXrr7)r;rDZ unfilteredr^streamr r]r# get_tokenss  zLexer.get_tokenscCstdS)aS This method should process the text and return an iterable of ``(index, tokentype, value)`` tuples where ``index`` is the starting position of the token within the input text. It must be overridden by subclasses. It is recommended to implement it as a generator to maximize effectiveness. N)NotImplementedErrorr]r r r#rZs zLexer.get_tokens_unprocessed)F)r-r.r/r0r*aliases filenamesZalias_filenames mimetypespriorityurlZ version_addedZ_exampler=rAr:r'rXr`rZr r r r#r1s";1 r) metaclassc@s$eZdZdZefddZddZdS)ra  This lexer takes two lexer as arguments. A root lexer and a language lexer. First everything is scanned using the language lexer, afterwards all ``Other`` tokens are lexed using the root lexer. The lexers from the ``template`` lexer package use this base lexer. cKs<|fi||_|fi||_||_tj|fi|dSrY) root_lexerlanguage_lexerneedlerr=)r;Z _root_lexerZ_language_lexerZ_needler8r r r#r=-szDelegatingLexer.__init__cCsd}g}g}|j|D]H\}}}||jurP|rF|t||fg}||7}q||||fq|rx|t||ft||j|S)N)rirZrjrCrM do_insertionsrh)r;rDZbuffered insertionsZ lng_bufferir[r\r r r#rZ3s   z&DelegatingLexer.get_tokens_unprocessedN)r-r.r/r0rr=rZr r r r#r#s rc@seZdZdZdS)rzI Indicates that a state should include rules from another state. Nr-r.r/r0r r r r#rJsrc@seZdZdZddZdS)_inheritzC Indicates the a state should inherit from its superclass. cCsdS)Nrr r@r r r#rAUsz_inherit.__repr__N)r-r.r/r0rAr r r r#rpQsrpc@s eZdZdZddZddZdS)combinedz: Indicates a state combined from multiple states. cGs t||SrY)tupler))clsargsr r r#r)`szcombined.__new__cGsdSrYr )r;rtr r r#r=cszcombined.__init__N)r-r.r/r0r)r=r r r r#rq[srqc@sFeZdZdZddZdddZdddZdd d Zd d Zd dZ dS) _PseudoMatchz: A pseudo match object constructed from a string. cCs||_||_dSrY)_text_start)r;startrDr r r#r=msz_PseudoMatch.__init__NcCs|jSrY)rwr;argr r r#rxqsz_PseudoMatch.startcCs|jt|jSrY)rwrMrvryr r r#endtsz_PseudoMatch.endcCs|r td|jS)Nz No such group) IndexErrorrvryr r r#groupwsz_PseudoMatch.groupcCs|jfSrY)rvr@r r r#groups|sz_PseudoMatch.groupscCsiSrYr r@r r r# groupdictsz_PseudoMatch.groupdict)N)N)N) r-r.r/r0r=rxr{r}r~rr r r r#ruhs   rucsdfdd }|S)zL Callback that yields multiple actions for each group in the match. Nc3stD]\}}|durqqt|turR||d}|r||d||fVq||d}|dur|r|||d|_||t||d||D]}|r|Vqq|r||_dS)N) enumerater(r r}rxposrur{)lexermatchctxrnactiondataitemrtr r#callbacks$  zbygroups..callback)Nr )rtrr rr#rsrc@seZdZdZdS)_ThiszX Special singleton used for indicating the caller class. Used by ``using``. Nror r r r#rsrc sjidvr:d}t|ttfr.|d<n d|fd<turTdfdd }nd fdd }|S) a Callback that processes the match with a different lexer. The keyword arguments are forwarded to the lexer, except `state` which is handled separately. `state` specifies the state that the new lexer will start in, and can be an enumerable such as ('root', 'inline', 'string') or a simple string which is assumed to be on top of the root state. Note: For that to work, `_other` must not be an `ExtendedRegexLexer`. statestackrootNc3srr"|j|jfi}n|}|}|j|fiD]\}}}||||fVqD|rn||_dSrY)updater8r?rxrZr}r{rrrrZlxsrnr[r\) gt_kwargskwargsr r#rs  zusing..callbackc3sf|jfi}|}|j|fiD]\}}}||||fVq8|rb||_dSrY)rr8rxrZr}r{rr_otherrrr r#rs  )N)N)poprBlistrrr)rrrrr rr#rs     rc@seZdZdZddZdS)rz Indicates a state or state action (e.g. #pop) to apply. For example default('#pop') is equivalent to ('', Token, '#pop') Note that state tuples may be used as well. .. versionadded:: 2.0 cCs ||_dSrY)r)r;rr r r#r=szdefault.__init__N)r-r.r/r0r=r r r r#rsrc@s"eZdZdZdddZddZdS) rz Indicates a list of literal words that is transformed into an optimized regex that matches any of the words. .. versionadded:: 2.0 rkcCs||_||_||_dSrY)rprefixsuffix)r;rrrr r r#r=szwords.__init__cCst|j|j|jdS)Nrr)rrrrr@r r r#r9sz words.getN)rkrk)r-r.r/r0r=r9r r r r#rs rc@sJeZdZdZddZddZddZdd Zdd d Zd dZ ddZ d S)RegexLexerMetazw Metaclass for RegexLexer, creates the self._tokens attribute from self.tokens on the first instantiation. cCs t|tr|}t||jS)zBPreprocess the regular expression component of a token definition.)rBrr9recompiler)rsregexrflagsrr r r#_process_regexs zRegexLexerMeta._process_regexcCs&t|tus"t|s"Jd||S)z5Preprocess the token component of a token definition.z0token type must be simple type or callable, not )r(r callable)rstokenr r r#_process_tokenszRegexLexerMeta._process_tokencCst|trf|dkrdS||vr$|fS|dkr0|S|dddkrRt|dd SdsdJd|nt|trd |j}|jd 7_g}|D].}||ksJd ||||||q|||<|fSt|tr|D] }||vs|d vsJd|q|SdsJd |dS)z=Preprocess the state transition action of a token definition.#pop#pushNz#pop:Fzunknown new state z_tmp_%drzcircular state ref )rrzunknown new state def )rBrIintrq_tmpnameextend_process_staterr)rs new_state unprocessed processedZ tmp_stateitokensZistater r r#_process_new_states<     z!RegexLexerMeta._process_new_statec Cst|tsJd||ddks2Jd|||vrB||Sg}||<|j}||D]@}t|tr||ksJd|||||t|q\t|trq\t|tr||j ||}| t dj d|fq\t|tusJd|z||d||}WnLtyX} z2td |dd |d |d | | WYd} ~ n d} ~ 00||d } t|dkr|d}n||d||}| || |fq\|S)z%Preprocess a single state definition.zwrong state name r#zinvalid state name zcircular state reference rkNzwrong rule def zuncompilable regex z in state z of z: r)rBrIflagsrrrrprrrrCrrrr(rrr Exception ValueErrorrrM) rsrrrtokensrZtdefrrexerrrr r r#r)s@     < zRegexLexerMeta._process_stateNcCs<i}|j|<|p|j|}t|D]}||||q$|S)z-Preprocess a dictionary of token definitions.) _all_tokensrrr)rsr* tokendefsrrr r r#process_tokendefTs  zRegexLexerMeta.process_tokendefc Csi}i}|jD]}|jdi}|D]\}}||}|durz|||<z|t}WntynYq(Yn0|||<q(||d}|durq(||||d<z|t} WntyYq(0|| ||<q(q|S)a Merge tokens from superclasses in MRO order, returning a single tokendef dictionary. Any state that is not defined by a subclass will be inherited automatically. States that *are* defined by subclasses will, by default, override that state in the superclass. If a subclass wishes to inherit definitions from a superclass, it can use the special value "inherit", which will cause the superclass' state definition to be included at that point in the state. rNr)__mro____dict__r9itemsindexrrr) rsr inheritablectoksrrZcuritemsZ inherit_ndxZ new_inh_ndxr r r# get_tokendefs\s0       zRegexLexerMeta.get_tokendefscOsRd|jvr:i|_d|_t|dr(|jr(n|d||_tj |g|Ri|S)z:Instantiate cls after preprocessing its token definitions._tokensrtoken_variantsrk) rrrhasattrrrrrr(__call__)rsrtkwdsr r r#rs zRegexLexerMeta.__call__)N) r-r.r/r0rrrrrrrr r r r#rs#+ 1rc@s$eZdZdZejZiZdddZdS)rz Base for simple stateful regular expression-based lexers. Simplifies the lexing process so that you need only provide a list of states and regular expressions. rc csd}|j}t|}||d}|D]*\}}} |||} | r"|durrt|turb||| fVn||| EdH| }| durJt| tr| D]D} | dkrt|dkr| q| dkr| |dq| | qndt| t rt | t|kr|dd=n || d=n.| dkr*| |dnds>Jd| ||d}qq"zP||d krd g}|d }|t d fV|d7}Wq|t||fV|d7}WqtyYqYq0qdS) z~ Split ``text`` into (tokentype, text) pairs. ``stack`` is the initial stack (default: ``['root']``) rrNrrrFwrong state def: rGr)rrr(r r}r{rBrrrMrrCrabsrrr|) r;rDrrrZ statestack statetokensrexmatchrrmrr r r#rZsR              z!RegexLexer.get_tokens_unprocessedN)r) r-r.r/r0r MULTILINErrrZr r r r#rsrc@s"eZdZdZdddZddZdS)rz9 A helper object that holds lexer position data. NcCs*||_||_|pt||_|p"dg|_dS)Nr)rDrrMr{r)r;rDrrr{r r r#r=szLexerContext.__init__cCsd|jd|jd|jdS)Nz LexerContext(z, ))rDrrr@r r r#rAszLexerContext.__repr__)NN)r-r.r/r0r=rAr r r r#rs rc@seZdZdZdddZdS)rzE A RegexLexer that uses a context object to store its state. Nc cs<|j}|st|d}|d}n|}||jd}|j}|D]b\}}}|||j|j} | r:|durt|tur|j|| fV| |_n$||| |EdH|s||jd}|durt |t r|D]P} | dkrt |jdkr|j q| dkr|j |jdq|j | qnnt |trZt|t |jkrL|jdd=n |j|d=n2|dkrx|j |jdndsJd |||jd}q6q:zz|j|jkrWq8||jd krdg|_|d}|jtd fV|jd7_Wq6|jt||jfV|jd7_Wq6ty4Yq8Yq60q6dS) z Split ``text`` into (tokentype, text) pairs. If ``context`` is given, use this lexer context instead. rrrNrrrFrrG)rrrrDrr{r(r r}rBrrrMrrCrrrrr|) r;rDcontextrrrrrrrrr r r#rZs`          z)ExtendedRegexLexer.get_tokens_unprocessed)NN)r-r.r/r0rZr r r r#r src cst|}zt|\}}Wnty6|EdHYdS0d}d}|D]\}}}|durZ|}d} |r|t||kr|| ||} | r||| fV|t| 7}|D]"\} } } || | fV|t| 7}q||} zt|\}}Wq^tyd}YqYq^0q^| t|krD|||| dfV|t|| 7}qD|r|pDd}|D]$\}}}|||fV|t|7}qJzt|\}}Wntyd}YqYn0q6dS)ag Helper for lexers which must combine the results of several sublexers. ``insertions`` is a list of ``(index, itokens)`` pairs. Each ``itokens`` iterable should be inserted at position ``index`` into the token stream given by the ``tokens`` argument. The result is a combined token stream. TODO: clean up the code here. NTrF)iternext StopIterationrM)rmrrrrealposZinsleftrnr[r\ZoldiZtmpvalZit_indexZit_tokenZit_valuepr r r#rlSsN        rlc@seZdZdZddZdS)ProfilingRegexLexerMetaz>Metaclass for ProfilingRegexLexer, collects regex timing info.csLt|tr t|j|j|jdn|t|tjffdd }|S)Nrcs`jdfddg}t}|||}t}|dd7<|d||7<|S)Nrrrr) _prof_data setdefaulttimer)rDrendposinfot0rest1rsZcompiledrrr r# match_funcsz:ProfilingRegexLexerMeta._process_regex..match_func) rBrrrrrrsysmaxsize)rsrrrrr rr#rs   z&ProfilingRegexLexerMeta._process_regexN)r-r.r/r0rr r r r#rsrc@s"eZdZdZgZdZdddZdS)ProfilingRegexLexerzFDrop-in replacement for RegexLexer that does profiling of its regexes.rc#sjjit||EdHjj}tdd|Dfdddd}tdd|D}t t djj t ||ft d t d d t d |D]}t d |qt d dS)NcssN|]F\\}}\}}|t|ddddd|d|d||fVqdS)zu'z\\\NAi)reprrPrF).0rrnr[r r r# sz=ProfilingRegexLexer.get_tokens_unprocessed..cs |jSrY)_prof_sort_indexr!r@r r#r$r%z.T)keyreversecss|]}|dVqdS)Nr )rr"r r r#rr%z2Profiling result for %s lexing %d chars in %.3f mszn==============================================================================================================z$%-20s %-64s ncalls tottime percall)rrzn--------------------------------------------------------------------------------------------------------------z%-20s %-65s %5d %8.4f %8.4f) r?rrCrrZrsortedrsumprintr-rM)r;rDrrawdatarZ sum_totalr,r r@r#rZs(   z*ProfilingRegexLexer.get_tokens_unprocessedN)r)r-r.r/r0rrrZr r r r#rsr)6r0rrrZpip._vendor.pygments.filterrrZpip._vendor.pygments.filtersrZpip._vendor.pygments.tokenrrrrr Zpip._vendor.pygments.utilr r r r rrZpip._vendor.pygments.regexoptr__all__rrrK staticmethodZ_default_analyser(r&rrrIrrprrrrqrurrrrrrrrrrrlrrr r r r#sF       s' 2 (aH@