“Python Regexp 2.7” team

Currently registered issues:

1) Atomic Grouping / Possessive Quantifiers


2) Named Match Groups as Match Attributes


3) Match objects support Array Indexing


4) Add support for Perl-Style Relative Back References


5) Allow Parenthetically Well-Nested Comments in Regular Expressions


6) Add support for fixed-width Expression matching, enabling the undocumented Template option


7) Better compiled expression Cache


8) Emacs / Perl like Named Character Sets


9) Engine Cleanups, Documentation and general Improvements


9-1) New Engine Proposal that replaces pseudo-recursion with a Single Loop


9-1-1) New Engine Proposal that replaces pseudo-recursion with Three Nested Loops


9-2) New Engine designed by Matthew Barnett


9-3) New Engine based on Thompson Nondeterministic Finite Automaton (NFA)

    [ http://bugs.python.org/issue1662581 ]
    [ http://bugs.python.org/issue1721518 ]
    [ http://swtch.com/~rsc/regexp/regexp1.html ]

10) Reduce use of Magic Numbers by sharing Constants between the C-Engine and Python


11) Catch-All for any other Perl 5.10.0 / 6.0 features we may wish to add


12) Clarify elements of the Documentation about how Regular Expression Comment nesting works


13) Add a grouptuple method to the Match object which would return a 3-tuple for each match group


14) Allow UNICODE Match Group Identifiers


15) Add __doc__ strings to the Pattern_Type, Match_Type and Scanner_Type classes


16) Implement various FIXMEs


16-1) Allow the deletion of the string attribute associated with a Match object


17) Variable-Length Positive and Negative Look-Behind Expressions


18) Allow for Strings to be scanned in Reverse by a given Regular Expression Pattern


19) Allow In-Line Pattern Flags to be Positionally Dependent


20) Allow In-Line Pattern Flags to be Negated


21) Allow Scoped In-Line Pattern Flags


22) Change how a Zero-Width Pattern splits a string


23) Fix inconsistencies in how Character Ranges work in Case-Insensitive Character Classes


24) Fix missing character bug in findall / finditer methods


25) Allow sub-expressions of size greater than 65535


26) Allow Capture Groups in Look-Behind expressions


27) Allow UNICODE (\u, \U) escape sequences in Regular Expressions

    [ http://bugs.python.org/issue3665 ]
    [ http://bugs.python.org/file11235/re_unicode_escapes.diff (Georg Brandl) ]

28) Add Flags parameter to re.split, re.sub and re.subn

    [ http://bugs.python.org/issue3482 ]
    [ http://bugs.python.org/issue3255 ]
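
The motivation for item 28 can be illustrated with the usual workaround when the module-level function takes no flags: compile the pattern with the flags first, then call the method on the compiled pattern. This is a minimal sketch only; the helper name split_with_flags is hypothetical and is not taken from either tracker issue.

```python
import re

def split_with_flags(pattern, string, flags=0, maxsplit=0):
    """Hypothetical helper: split `string` on `pattern` compiled with `flags`."""
    # Compiling first is the workaround when the split function itself
    # accepts no flags argument.
    return re.compile(pattern, flags).split(string, maxsplit=maxsplit)

# Case-insensitive split on the letter "x".
parts = split_with_flags(r"x", "aXbxc", flags=re.IGNORECASE)
print(parts)
```

The same compile-then-call approach applies to sub and subn on a compiled pattern.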

29) re.sub / re.subn should allow Unmatched Group replacement via empty string

    [ http://bugs.python.org/issue1519638 ]
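
The behaviour requested in item 29 can be approximated with a replacement function that treats a group that did not participate in the match as an empty string. This is a sketch under stated assumptions: group_or_empty and uppercase_key are illustrative names, not part of the re module or of the linked issue.

```python
import re

def group_or_empty(match, index):
    """Return the text of group `index`, or '' if the group did not participate."""
    text = match.group(index)
    return "" if text is None else text

def uppercase_key(match):
    # Group 2 is optional, so match.group(2) may be None; substitute ''.
    return match.group(1).upper() + group_or_empty(match, 2)

result = re.sub(r"(\w+)(=\d+)?", uppercase_key, "a=1 b")
print(result)
```

Here the second word has no "=digits" part, so its optional group is unmatched and contributes nothing to the replacement.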

30) re.escape should only escape non-alphanumeric characters that are known Regular Expression operators

    [ http://bugs.python.org/issue2650 ]
    [ http://bugs.python.org/file10080/re.patch (Russ Cox) ]
    [ http://bugs.python.org/file10084/re.patch (Russ Cox) ]
    [ http://bugs.python.org/file10130/re.patch (Lorenz Quack - add Frozen Set to store characters) ]
    [ http://bugs.python.org/file10215/re_patch.diff (Rafael Zanella -- combo w/ dict) ]
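
The idea in item 30 can be sketched as an escape function that backslash-escapes only characters with a special meaning in a pattern and leaves everything else alone. The character set below is an assumption chosen for illustration, not the list used by any of the patches above; the name escape_operators is likewise hypothetical. Storing the characters in a frozenset echoes the note on one of the patches.

```python
import re

# Assumed set of operator characters; illustrative only.
_OPERATORS = frozenset(r".^$*+?{}[]\|()")

def escape_operators(text):
    """Backslash-escape only the characters listed in _OPERATORS."""
    return "".join("\\" + ch if ch in _OPERATORS else ch for ch in text)

escaped = escape_operators("1+1=2?")
print(escaped)
```

Characters such as "-" or "#" are deliberately left unescaped in this sketch, so the result is only safe outside character classes and outside verbose mode.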

31) Make sure \w properly matches non-Roman scripts
    a) Verify Regexp2.7 uses UNICODE 5.x
    b) Verify whether Mc, Mn and Me character classes should be classified as Spaces or Words

    [ http://bugs.python.org/issue1693050 ]

32) Add support for immutable bytes and mutable buffer objects in the place of basestring types

    [ http://bugs.python.org/issue1282 ]
    [ http://bugs.python.org/issue1708652 ]
    [ http://www.python.org/dev/peps/pep-3137 ]

33) Ignore redundant repeat operators (e.g. in '(x*)*', '(x*)?', '(x*){n}' and '(x*){n,m}')

    [ http://bugs.python.org/issue2537 ]
    [ http://bugs.python.org/issue1633953 ]
    [ http://bugs.python.org/issue1456280 ]
    [ http://bugs.python.org/issue214033 - (x?)? - No longer pertinent ]
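
To show why the outer repeat in item 33 is redundant, the sketch below checks that '(x*)*' and 'x*' accept exactly the same strings on a few samples. It assumes a present-day re module that compiles the nested form without error, and it compares only which strings are accepted, not the contents of the capture group. The name same_language is hypothetical.

```python
import re

nested = re.compile(r"(x*)*")  # outer repeat adds nothing to what is accepted
simple = re.compile(r"x*")     # same accepted strings without the redundancy

def same_language(samples):
    """True if both patterns agree on whether each sample matches in full."""
    return all(
        (nested.fullmatch(s) is None) == (simple.fullmatch(s) is None)
        for s in samples
    )

print(same_language(["", "x", "xxx", "xy", "y"]))
```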

34) Add support for exact (start to finish) matches via an exact method on the pattern object (e.g. exact == search(r'\A...\Z', ...) == match(r'...\Z', ...))

    [ http://bugs.python.org/issue1708652 ]
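
The equivalence stated in item 34 can be sketched as a small helper that wraps the pattern in a non-capturing group and appends \Z before calling match. The function name exact mirrors the proposed method but is a hypothetical stand-in here; it is a module-level function rather than a method on the pattern object, and it assumes the pattern carries no leading global inline flags.

```python
import re

def exact(pattern, string, flags=0):
    """Hypothetical stand-in for the proposed exact method.

    The non-capturing group keeps a top-level alternation such as 'a|ab'
    from binding the trailing \\Z to only its last branch.
    """
    return re.match(r"(?:%s)\Z" % pattern, string, flags)

print(exact(r"\d+", "123") is not None)
print(exact(r"\d+", "123a") is not None)
```

Later Python 3 releases provide re.fullmatch, which covers the same use case.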

35) Add support for PCRE-style Regular Expression subroutines

    [ http://bugs.python.org/issue694374 ]
    [ http://manpages.courier-mta.org/htmlman3/pcresyntax.3.html ]

36) Add option to make \Z and \z operate as they do in Perl / PCRE, e.g. re.PERL, re.E, (?E)

    [ Suggestion by Matthew Barnett ]

99) Solve the mysterious PyObjectDel / Py_DECREF debug memory issue

    [ http://bugs.python.org/issue3299 ]
    [ http://bugs.python.org/file10891/_sre-2.patch (Victor Stinner) ]
    [ http://bugs.python.org/file10892/_curses_panel.patch (Victor Stinner) ]
    [ http://bugs.python.org/file10893/pyobject_del.patch (Victor Stinner) ]

A number of these issues have combined solutions as well as the core solutions listed here; see https://code.launchpad.net/~pythonregexp2.7 for a complete list of all current branches.

This is a group engaged in updating the current Regular Expression engine in the Python programming language to support enhanced features, various bug fixes and better documentation.

Team details

Membership policy:
Moderated Team

Mailing list

This team does not use Launchpad to host a mailing list.