a :jgK@sTdZddlZddlmZddlmZmZmZddlm Z m Z m Z ddlmZdd lmZdd lTdd lmZdd lmZmZmZmZmZm Z!gd Z"ej#ej$ddZ$ddZ%e$e%ddZ&e$e%ddZ'e$e%ddZ(e$e%ddZ)e$e%ddZ*e$e%ddZ+ddZedGd d!d!e Z,edd'd#d$Z edd(d%d&Z dS))an This module contains a set of functions for vectorized string operations and methods. .. note:: The `chararray` class exists for backwards compatibility with Numarray, it is not recommended for new development. Starting from numpy 1.4, if one needs arrays of strings, it is recommended to use arrays of `dtype` `object_`, `bytes_` or `str_`, and use the free functions in the `numpy.char` module for fast vectorized string operations. Some methods will only be available if the corresponding string method is available in your version of Python. The preferred alias for `defchararray` is `numpy.char`. N) set_module)bytes_str_ character)ndarrayarrayasarraycompare_chararrays) overrides)*)multiply) _partition _rpartition_split_rsplit _splitlines_join)5equal not_equal greater_equal less_equalgreaterlessZstr_lenaddrmod capitalizecentercountdecodeencodeendswith expandtabsfindindexisalnumisalphaisdigitislowerisspaceistitleisupperjoinljustlowerlstrip partitionreplacerfindrindexrjust rpartitionrsplitrstripsplit splitlines startswithstripswapcasetitle translateupperzfill isnumeric isdecimalr r r chararrayz numpy.char)modulecCs||fSNx1Zx2rHrHB/usr/local/lib/python3.9/site-packages/numpy/_core/defchararray.py_binary_op_dispatcher5srLcCst||ddS)ao Return (x1 == x2) element-wise. Unlike `numpy.equal`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. Examples -------- >>> y = "aa " >>> x = "aa" >>> np.char.equal(x, y) array(True) See Also -------- not_equal, greater_equal, less_equal, greater, less z==Tr rIrHrHrKr9srcCst||ddS)a Return (x1 != x2) element-wise. Unlike `numpy.not_equal`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. See Also -------- equal, greater_equal, less_equal, greater, less Examples -------- >>> x1 = np.array(['a', 'b', 'c']) >>> np.char.not_equal(x1, 'b') array([ True, False, True]) z!=Tr rIrHrHrKrZsrcCst||ddS)a Return (x1 >= x2) element-wise. Unlike `numpy.greater_equal`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. See Also -------- equal, not_equal, less_equal, greater, less Examples -------- >>> x1 = np.array(['a', 'b', 'c']) >>> np.char.greater_equal(x1, 'b') array([False, True, True]) z>=Tr rIrHrHrKr{srcCst||ddS)a Return (x1 <= x2) element-wise. Unlike `numpy.less_equal`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. See Also -------- equal, not_equal, greater_equal, greater, less Examples -------- >>> x1 = np.array(['a', 'b', 'c']) >>> np.char.less_equal(x1, 'b') array([ True, True, False]) z<=Tr rIrHrHrKrsrcCst||ddS)a Return (x1 > x2) element-wise. Unlike `numpy.greater`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. See Also -------- equal, not_equal, greater_equal, less_equal, less Examples -------- >>> x1 = np.array(['a', 'b', 'c']) >>> np.char.greater(x1, 'b') array([False, False, True]) >Tr rIrHrHrKrsrcCst||ddS)a Return (x1 < x2) element-wise. Unlike `numpy.greater`, this comparison is performed by first stripping whitespace characters from the end of the string. This behavior is provided for backward-compatibility with numarray. Parameters ---------- x1, x2 : array_like of str or unicode Input arrays of the same shape. Returns ------- out : ndarray Output array of bools. See Also -------- equal, not_equal, greater_equal, less_equal, greater Examples -------- >>> x1 = np.array(['a', 'b', 'c']) >>> np.char.less(x1, 'b') array([True, False, False]) >> a = np.array(["a", "b", "c"]) >>> np.strings.multiply(a, 3) array(['aaa', 'bbb', 'ccc'], dtype='>> i = np.array([1, 2, 3]) >>> np.strings.multiply(a, i) array(['a', 'bb', 'ccc'], dtype='>> np.strings.multiply(np.array(['a']), i) array(['a', 'aa', 'aaa'], dtype='>> a = np.array(['a', 'b', 'c', 'd', 'e', 'f']).reshape((2, 3)) >>> np.strings.multiply(a, 3) array([['aaa', 'bbb', 'ccc'], ['ddd', 'eee', 'fff']], dtype='>> np.strings.multiply(a, i) array([['a', 'bb', 'ccc'], ['d', 'ee', 'fff']], dtype='d?Zd@dAZ dBdCZ!dDdEZ"dFdGZ#dHdIZ$dJdKZ%dLdMZ&ddNdOZ'dPdQZ(ddRdSZ)dTdUZ*ddVdWZ+ddXdYZ,ddZd[Z-dd\d]Z.d^d_Z/dd`daZ0ddbdcZ1ddddeZ2ddfdgZ3ddhdiZ4ddjdkZ5dldmZ6dndoZ7ddpdqZ8drdsZ9dtduZ:dvdwZ;dxdyZ= 2`` and ``order='F'``, in which case `strides` is in "Fortran order". Methods ------- astype argsort copy count decode dump dumps encode endswith expandtabs fill find flatten getfield index isalnum isalpha isdecimal isdigit islower isnumeric isspace istitle isupper item join ljust lower lstrip nonzero put ravel repeat replace reshape resize rfind rindex rjust rsplit rstrip searchsorted setfield setflags sort split splitlines squeeze startswith strip swapaxes swapcase take title tofile tolist tostring translate transpose upper view zfill Parameters ---------- shape : tuple Shape of the array. itemsize : int, optional Length of each array element, in number of characters. Default is 1. unicode : bool, optional Are the array elements of type unicode (True) or string (False). Default is False. buffer : object exposing the buffer interface or str, optional Memory address of the start of the array data. Default is None, in which case a new array is created. offset : int, optional Fixed stride displacement from the beginning of an axis? Default is 0. Needs to be >=0. strides : array_like of ints, optional Strides for the array (see `~numpy.ndarray.strides` for full description). Default is None. order : {'C', 'F'}, optional The order in which the array data is stored in memory: 'C' -> "row major" order (the default), 'F' -> "column major" (Fortran) order. Examples -------- >>> charar = np.char.chararray((3, 3)) >>> charar[:] = 'a' >>> charar chararray([[b'a', b'a', b'a'], [b'a', b'a', b'a'], [b'a', b'a', b'a']], dtype='|S1') >>> charar = np.char.chararray(charar.shape, itemsize=5) >>> charar[:] = 'abc' >>> charar chararray([[b'abc', b'abc', b'abc'], [b'abc', b'abc', b'abc'], [b'abc', b'abc', b'abc']], dtype='|S5') rFNrCc Cs~|r t}nt}t|}t|tr*|} d}nd} |durNtj||||f|d} ntj||||f||||d} | durz| | d<| S)Norder)bufferoffsetstridesrV.)rrint isinstancestrr__new__) subtypeshapeitemsizeunicoderWrXrYrVdtypeZfillerselfrHrHrKr]s( zchararray.__new__cCs|jjdvr|t|S|S)NSUbc)rbcharviewtype)rcZarrcontextZ return_scalarrHrHrK__array_wrap__s zchararray.__array_wrap__cCs|jjdvrtddS)Nrdz-Can only create a chararray from string data.)rbrerQ)rcobjrHrHrK__array_finalize__s zchararray.__array_finalize__cCs8t||}t|tr4|}t|dkr0d}n|}|S)Nr)r __getitem__r[rr9len)rcrjvaltemprHrHrKrms   zchararray.__getitem__cCs t||S)zg Return (self == other) element-wise. See Also -------- equal )rrcotherrHrHrK__eq__szchararray.__eq__cCs t||S)zk Return (self != other) element-wise. See Also -------- not_equal )rrqrHrHrK__ne__szchararray.__ne__cCs t||S)zo Return (self >= other) element-wise. See Also -------- greater_equal )rrqrHrHrK__ge__szchararray.__ge__cCs t||S)zl Return (self <= other) element-wise. See Also -------- less_equal )rrqrHrHrK__le__szchararray.__le__cCs t||S)zh Return (self > other) element-wise. See Also -------- greater )rrqrHrHrK__gt__&szchararray.__gt__cCs t||S)ze Return (self < other) element-wise. See Also -------- less )rrqrHrHrK__lt__0szchararray.__lt__cCs t||S)z Return (self + other), that is string concatenation, element-wise for a pair of array_likes of str or unicode. See Also -------- add rrqrHrHrK__add__:s zchararray.__add__cCs t||S)z Return (other + self), that is string concatenation, element-wise for a pair of array_likes of `bytes_` or `str_`. See Also -------- add ryrqrHrHrK__radd__Es zchararray.__radd__cCstt||Sz Return (self * i), that is string multiple concatenation, element-wise. See Also -------- multiply r rrcrSrHrHrK__mul__Ps zchararray.__mul__cCstt||Sr|r}r~rHrHrK__rmul__[s zchararray.__rmul__cCstt||S)z Return (self % i), that is pre-Python 2.6 string formatting (interpolation), element-wise for a pair of array_likes of `bytes_` or `str_`. See Also -------- mod )r rr~rHrHrK__mod__fs zchararray.__mod__cCstSrG)NotImplementedrqrHrHrK__rmod__rszchararray.__rmod__cCs||||S)a Return the indices that sort the array lexicographically. For full documentation see `numpy.argsort`, for which this method is in fact merely a "thin wrapper." Examples -------- >>> c = np.array(['a1b c', '1b ca', 'b ca1', 'Ca1b'], 'S5') >>> c = c.view(np.char.chararray); c chararray(['a1b c', '1b ca', 'b ca1', 'Ca1b'], dtype='|S5') >>> c[c.argsort()] chararray(['1b ca', 'Ca1b', 'a1b c', 'b ca1'], dtype='|S5') )Z __array__argsort)rcZaxiskindrVrHrHrKruszchararray.argsortcCs tt|S)z Return a copy of `self` with only the first character of each element capitalized. See Also -------- char.capitalize )r rrcrHrHrKrs zchararray.capitalize cCstt|||S)z Return a copy of `self` with its elements centered in a string of length `width`. See Also -------- center )r rrcwidthZfillcharrHrHrKrs zchararray.centercCst||||S)z Returns an array with the number of non-overlapping occurrences of substring `sub` in the range [`start`, `end`]. See Also -------- char.count )r rcsubstartendrHrHrKr s zchararray.countcCs t|||S)zn Calls ``bytes.decode`` element-wise. See Also -------- char.decode )r!rcencodingerrorsrHrHrKr!s zchararray.decodecCs t|||S)zp Calls :meth:`str.encode` element-wise. See Also -------- char.encode )r"rrHrHrKr"s zchararray.encodecCst||||S)z Returns a boolean array which is `True` where the string element in `self` ends with `suffix`, otherwise `False`. See Also -------- char.endswith )r#)rcsuffixrrrHrHrKr#s zchararray.endswithcCstt||S)z Return a copy of each string element where all tab characters are replaced by one or more spaces. See Also -------- char.expandtabs )r r$)rctabsizerHrHrKr$s zchararray.expandtabscCst||||S)z For each element, return the lowest index in the string where substring `sub` is found. See Also -------- char.find )r%rrHrHrKr%s zchararray.findcCst||||S)z Like `find`, but raises :exc:`ValueError` when the substring is not found. See Also -------- char.index )r&rrHrHrKr&s zchararray.indexcCst|S)z Returns true for each element if all characters in the string are alphanumeric and there is at least one character, false otherwise. See Also -------- char.isalnum )r'rrHrHrKr's zchararray.isalnumcCst|S)z Returns true for each element if all characters in the string are alphabetic and there is at least one character, false otherwise. See Also -------- char.isalpha )r(rrHrHrKr(s zchararray.isalphacCst|S)z Returns true for each element if all characters in the string are digits and there is at least one character, false otherwise. See Also -------- char.isdigit )r)rrHrHrKr) s zchararray.isdigitcCst|S)z Returns true for each element if all cased characters in the string are lowercase and there is at least one cased character, false otherwise. See Also -------- char.islower )r*rrHrHrKr*s zchararray.islowercCst|S)z Returns true for each element if there are only whitespace characters in the string and there is at least one character, false otherwise. See Also -------- char.isspace )r+rrHrHrKr+&s zchararray.isspacecCst|S)z Returns true for each element if the element is a titlecased string and there is at least one character, false otherwise. See Also -------- char.istitle )r,rrHrHrKr,3s zchararray.istitlecCst|S)z Returns true for each element if all cased characters in the string are uppercase and there is at least one character, false otherwise. See Also -------- char.isupper )r-rrHrHrKr-?s zchararray.isuppercCs t||S)z Return a string which is the concatenation of the strings in the sequence `seq`. See Also -------- char.join )r.)rcseqrHrHrKr.Ls zchararray.joincCstt|||S)z Return an array with the elements of `self` left-justified in a string of length `width`. See Also -------- char.ljust )r r/rrHrHrKr/Xs zchararray.ljustcCs tt|S)z Return an array with the elements of `self` converted to lowercase. See Also -------- char.lower )r r0rrHrHrKr0ds zchararray.lowercCs t||S)z For each element in `self`, return a copy with the leading characters removed. See Also -------- char.lstrip )r1rccharsrHrHrKr1ps zchararray.lstripcCstt||S)zu Partition each element in `self` around `sep`. See Also -------- partition )r r2rcseprHrHrKr2|szchararray.partitioncCst||||dur|ndS)z For each element in `self`, return a copy of the string with all occurrences of substring `old` replaced by `new`. See Also -------- char.replace Nr)r3)rcoldnewr rHrHrKr3s zchararray.replacecCst||||S)z For each element in `self`, return the highest index in the string where substring `sub` is found, such that `sub` is contained within [`start`, `end`]. See Also -------- char.rfind )r4rrHrHrKr4s zchararray.rfindcCst||||S)z Like `rfind`, but raises :exc:`ValueError` when the substring `sub` is not found. See Also -------- char.rindex )r5rrHrHrKr5s zchararray.rindexcCstt|||S)z Return an array with the elements of `self` right-justified in a string of length `width`. See Also -------- char.rjust )r r6rrHrHrKr6s zchararray.rjustcCstt||S)zv Partition each element in `self` around `sep`. See Also -------- rpartition )r r7rrHrHrKr7szchararray.rpartitioncCs t|||S)z For each element in `self`, return a list of the words in the string, using `sep` as the delimiter string. See Also -------- char.rsplit )r8rcrmaxsplitrHrHrKr8s zchararray.rsplitcCs t||S)z For each element in `self`, return a copy with the trailing characters removed. See Also -------- char.rstrip )r9rrHrHrKr9s zchararray.rstripcCs t|||S)z For each element in `self`, return a list of the words in the string, using `sep` as the delimiter string. See Also -------- char.split )r:rrHrHrKr:s zchararray.splitcCs t||S)z For each element in `self`, return a list of the lines in the element, breaking at line boundaries. See Also -------- char.splitlines )r;)rckeependsrHrHrKr;s zchararray.splitlinescCst||||S)z Returns a boolean array which is `True` where the string element in `self` starts with `prefix`, otherwise `False`. See Also -------- char.startswith )r<)rcprefixrrrHrHrKr<s zchararray.startswithcCs t||S)z For each element in `self`, return a copy with the leading and trailing characters removed. See Also -------- char.strip )r=rrHrHrKr=s zchararray.stripcCs tt|S)z For each element in `self`, return a copy of the string with uppercase characters converted to lowercase and vice versa. See Also -------- char.swapcase )r r>rrHrHrKr> s zchararray.swapcasecCs tt|S)z For each element in `self`, return a titlecased version of the string: words start with uppercase characters, all remaining cased characters are lowercase. See Also -------- char.title )r r?rrHrHrKr?s zchararray.titlecCstt|||S)aB For each element in `self`, return a copy of the string where all characters occurring in the optional argument `deletechars` are removed, and the remaining characters have been mapped through the given translation table. See Also -------- char.translate )r r@)rctableZ deletecharsrHrHrKr@"s zchararray.translatecCs tt|S)z Return an array with the elements of `self` converted to uppercase. See Also -------- char.upper )r rArrHrHrKrA0s zchararray.uppercCstt||S)z Return the numeric string left-filled with zeros in a string of length `width`. See Also -------- char.zfill )r rB)rcrrHrHrKrB<s zchararray.zfillcCst|S)z For each element in `self`, return True if there are only numeric characters in the element. See Also -------- char.isnumeric )rCrrHrHrKrCHs zchararray.isnumericcCst|S)z For each element in `self`, return True if there are only decimal characters in the element. See Also -------- char.isdecimal )rDrrHrHrKrDTs zchararray.isdecimal)rFNrNrT)NF)rNN)r)rN)NN)NN)rN)r)rN)rN)r)N)N)rN)rN)r)NN)N)NN)N)rN)N)N)=__name__ __module__ __qualname____doc__r]rirkrmrsrtrurvrwrxrzr{rrrrrrrrr r!r"r#r$r%r&r'r(r)r*r+r,r-r.r/r0r1r2r3r4r5r6r7r8r9r:r;r<r=r>r?r@rArBrCrDrHrHrHrKrE2sz                                rETcCst|ttfrX|dur*t|tr&d}nd}|dur:t|}t||}t|||||dSt|ttfrnt|}t|trRt |j j t rRt|ts| t}|dur|j}t |j j tr|d}|durt |j j trd}nd}|rt}nt}|durt||d}|s<||jks<|s*t|ts<|rNt|trN||t|f}|St|trt |j j tr|dur|}|rt}nt}|durt|||dd}nt|||f|dd}| tS)a Create a `~numpy.char.chararray`. .. note:: This class is provided for numarray backward-compatibility. New code (not concerned with numarray compatibility) should use arrays of type `bytes_` or `str_` and use the free functions in :mod:`numpy.char` for fast vectorized string operations instead. Versus a NumPy array of dtype `bytes_` or `str_`, this class adds the following functionality: 1) values automatically have whitespace removed from the end when indexed 2) comparison operators automatically remove whitespace from the end when comparing values 3) vectorized string operations are provided as methods (e.g. `chararray.endswith `) and infix operators (e.g. ``+, *, %``) Parameters ---------- obj : array of str or unicode-like itemsize : int, optional `itemsize` is the number of characters per scalar in the resulting array. If `itemsize` is None, and `obj` is an object array or a Python list, the `itemsize` will be automatically determined. If `itemsize` is provided and `obj` is of type str or unicode, then the `obj` string will be chunked into `itemsize` pieces. copy : bool, optional If true (default), then the object is copied. Otherwise, a copy will only be made if ``__array__`` returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (`itemsize`, unicode, `order`, etc.). unicode : bool, optional When true, the resulting `~numpy.char.chararray` can contain Unicode characters, when false only 8-bit characters. If unicode is None and `obj` is one of the following: - a `~numpy.char.chararray`, - an ndarray of type :class:`str_` or :class:`bytes_` - a Python :class:`str` or :class:`bytes` object, then the unicode setting of the output array will be automatically determined. order : {'C', 'F', 'A'}, optional Specify the order of the array. If order is 'C' (default), then the array will be in C-contiguous order (last-index varies the fastest). If order is 'F', then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is 'A', then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous). NTF)r`rarWrVrU)rbrVZsubok)r[bytesr\rnrElisttupleasnarrayr issubclassrbrgrrfr`rrZastyperZobjecttolistnarray)rjr`copyrarVr_rbrorHrHrKr ash?        r cCst||d||dS)a Convert the input to a `~numpy.char.chararray`, copying the data only if necessary. Versus a NumPy array of dtype `bytes_` or `str_`, this class adds the following functionality: 1) values automatically have whitespace removed from the end when indexed 2) comparison operators automatically remove whitespace from the end when comparing values 3) vectorized string operations are provided as methods (e.g. `chararray.endswith `) and infix operators (e.g. ``+``, ``*``, ``%``) Parameters ---------- obj : array of str or unicode-like itemsize : int, optional `itemsize` is the number of characters per scalar in the resulting array. If `itemsize` is None, and `obj` is an object array or a Python list, the `itemsize` will be automatically determined. If `itemsize` is provided and `obj` is of type str or unicode, then the `obj` string will be chunked into `itemsize` pieces. unicode : bool, optional When true, the resulting `~numpy.char.chararray` can contain Unicode characters, when false only 8-bit characters. If unicode is None and `obj` is one of the following: - a `~numpy.char.chararray`, - an ndarray of type `str_` or `unicode_` - a Python str or unicode object, then the unicode setting of the output array will be automatically determined. order : {'C', 'F'}, optional Specify the order of the array. If order is 'C' (default), then the array will be in C-contiguous order (last-index varies the fastest). If order is 'F', then the returned array will be in Fortran-contiguous order (first-index varies the fastest). Examples -------- >>> np.char.asarray(['hello', 'world']) chararray(['hello', 'world'], dtype='sP         !   24