Portal Wizard uses wrong regular expression for emails

Bug #1199386 reported by Marcel van der Boom (HS-Development BV)
44
This bug affects 8 people
Affects Status Importance Assigned to Milestone
Odoo Addons (MOVED TO GITHUB)
Fix Released
Medium
OpenERP R&D Addons Team 1

Bug Description

During testing with my email-address '<email address hidden>' I noticed a couple of things:

1. The partner contact with that email-address can't be given access to the portal by the portal wizard, it fails on creation of the user in addons/mail/res_user.py:68 :

        if not data.get('login', False):
            raise osv.except_osv(_('Invalid Action!'),...)))

2. The login field is empty, because the portal wizard uses the email field to create a login in addons/portal/wizard/portal_wizard.py:177 :

            'login': extract_email(wizard_user.email),

The extract_email function in portal_wizard.py uses 'email_re' to determine what the login field should look like.
'email_re' is defined as in multiple places, which is the first problem. It at least in tools/mail.py and tools/import_email.py. (The latter contains email information from the 'steel-sa.com ' (?!?) domain by the way.)

email_re = re.compile(r"""
    ([a-zA-Z][\w\.-]*[a-zA-Z0-9] # username part
    @ # mandatory @ sign
    [a-zA-Z0-9][\w\.-]* # domain must start with a letter ... Ged> why do we include a 0-9 then?
     \.
     [a-z]{2,3} # TLD
    )
    """, re.VERBOSE)

Problems I see:

1. The email_re is defined in multiple places; make it one;
2. It forces to have email-addresses with at least 2 characters for the user part, of which the first must contain a letter; addresses like <email address hidden> won't work then; is there a reason for this?
3. the TLD must contain 2 or 3 characters, which is unsuitable for domains like .name or .info

Revision history for this message
Marcel van der Boom (HS-Development BV) (mrb) wrote :

Patch for email regular expression to:
- allow for 1 letter username parts
- support modern TLD (.name and .info for example)

Revision history for this message
Denis Karataev (dskarataev) wrote :

Guys I don't know why you are silent but your email_re expression is really ugly. For example, there is no chance to validate email for domains .name, .avia, .travel, .info etc. You allow only 2-3 symbols, WHY???

Also we faced the same problem - your expression doesn't work for numeral email names like <email address hidden> if despite of that Google allowed it! For example here in Russia we often use numeral emails!

It's so pity to see that kind of bad code and it's also pity to understand that you even didn't answer to this bugreport from SUMMER?! 6 months, guys! The man even provided the patch to you. What only you could do it's to check and to merge it. Is it so hard?...

Revision history for this message
Denis Karataev (dskarataev) wrote :

I subscribed Olivier Dony and openerp-community list for notifications of this bug. Please pay attention to it.

The problem in two words is that OpenERP highly restrict us with email addresses we can use for users. So we can't use for example: <email address hidden>, <email address hidden>, <email address hidden> etc. It's why we can't add a portal user for customer if its email doesn't match to this ugly designed expression.

Please, keep in mind that this bug also posted in July, so it's already 6 months ago! Does everyone like that time of bugs consideration?

Revision history for this message
Denis Karataev (dskarataev) wrote : Re: [Openerp-community] [Bug 1199386] [NEW] Email address regular expression needs work

I subscribed Olivier Dony and openerp-community list for notifications of
this bug. Please pay attention to it.

The problem in two words is that OpenERP highly restrict us with email
addresses we can use for users. So we can't use for example:
<email address hidden>, <email address hidden>, <email address hidden> etc. It's why we can't
add a portal user for customer if its email doesn't match to this ugly
designed expression.

Please, keep in mind that this bug also posted in July, so it's already 6
months ago! Does everyone like that time of bugs consideration?

2014/1/14 Launchpad Bug Tracker <email address hidden>

> You have been subscribed to a public bug by Denis Karataev (dskarataev):
>
> During testing with my email-address '<email address hidden>' I noticed a
> couple of things:
>
> 1. The partner contact with that email-address can't be given access to
> the portal by the portal wizard, it fails on creation of the user in
> addons/mail/res_user.py:68 :
>
> if not data.get('login', False):
> raise osv.except_osv(_('Invalid Action!'),...)))
>
> 2. The login field is empty, because the portal wizard uses the email
> field to create a login in addons/portal/wizard/portal_wizard.py:177 :
>
> 'login': extract_email(wizard_user.email),
>
> The extract_email function in portal_wizard.py uses 'email_re' to
> determine what the login field should look like.
> 'email_re' is defined as in multiple places, which is the first problem.
> It at least in tools/mail.py and tools/import_email.py. (The latter
> contains email information from the 'steel-sa.com ' (?!?) domain by the
> way.)
>
> email_re = re.compile(r"""
> ([a-zA-Z][\w\.-]*[a-zA-Z0-9] # username part
> @ # mandatory
> @ sign
> [a-zA-Z0-9][\w\.-]* # domain must start with a
> letter ... Ged> why do we include a 0-9 then?
> \.
> [a-z]{2,3} # TLD
> )
> """, re.VERBOSE)
>
> Problems I see:
>
> 1. The email_re is defined in multiple places; make it one;
> 2. It forces to have email-addresses with at least 2 characters for the
> user part, of which the first must contain a letter; addresses like
> <email address hidden> won't work then; is there a reason for this?
> 3. the TLD must contain 2 or 3 characters, which is unsuitable for domains
> like .name or .info
>
> ** Affects: openobject-addons
> Importance: Undecided
> Status: New
>
> ** Affects: openobject-server
> Importance: Undecided
> Status: New
>
> --
> Email address regular expression needs work
> https://bugs.launchpad.net/bugs/1199386
> You received this bug notification because you are a member of OpenERP
> Community, which is subscribed to the bug report.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openerp-community
> Post to : <email address hidden>
> Unsubscribe : https://launchpad.net/~openerp-community
> More help : https://help.launchpad.net/ListHelp
>

--
Denis Karataev.

Revision history for this message
Etienne Hirt (hirt) wrote : Re: Email address regular expression needs work

You may have a look at the complete replacement of RE for email-parsing in
https://code.launchpad.net/~hirt/ocb-addons/6.1_fix_mailaddress-parsing

Hope this helps
Etienne

Revision history for this message
Alexandre Fayolle - camptocamp (alexandre-fayolle-c2c) wrote :

Here are various occurences of regular expressions looking very strongly like email matching regexps I found with grep in the trunk branches:

ack --type=python 're\.(compile|search|match).*@' server/trunk addons/trunk web/trunk/
server/trunk/openerp/addons/base/ir/ir_mail_server.py
123:name_with_email_pattern = re.compile(r'("[^<@>]+")\s*<([^ ,<@]+@[^> ,]+)>')
124:address_pattern = re.compile(r'([^ ,<@]+@[^> ,]+)')

server/trunk/openerp/tools/mail.py
58: part = re.compile(r"(<(([^a<>]|a[^<>\s])[^<>]*)@[^<>]+>)", re.IGNORECASE | re.DOTALL)
487:email_re = re.compile(r"""([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})""", re.VERBOSE)
490:single_email_re = re.compile(r"""^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$""", re.VERBOSE)
498:reference_re = re.compile("<.*-open(?:object|erp)-(\\d+)(?:-([\w.]+))?.*@(.*)>", re.UNICODE)

addons/trunk/survey/wizard/survey_answer.py
704: if re.match("^[a-zA-Z0-9._%-+]+@[a-zA-Z0-9._%-]+.[a-zA-Z]{2,6}$", val1) == None:
743: if re.match("^[a-zA-Z0-9._%-+]+@[a-zA-Z0-9._%-]+.[a-zA-Z]{2,6}$", val1) == None:
906: if re.match("^[a-zA-Z0-9._%-+]+@[a-zA-Z0-9._%-]+.[a-zA-Z]{2,6}$", val) == None:
942: if re.match("^[a-zA-Z0-9._%-+]+@[a-zA-Z0-9._%-]+.[a-zA-Z]{2,6}$", val) == None:

addons/trunk/crm/base_partner_merge.py
741: re_email = re.compile(r".*@")

This badly needs unification and consistency.

Revision history for this message
Guewen Baconnier @ Camptocamp (gbaconnier-c2c) wrote :

This is a must read when dealing with regular expressions for emails: http://davidcel.is/blog/2012/09/06/stop-validating-email-addresses-with-regex/

Revision history for this message
Stefan Rijnhart (Opener) (stefan-opener) wrote :

Brilliant example of how OpenERP should stop reinventing the wheel. What Etienne was hinting at: we replaced the regexp in OCB 6.1 with getaddresses from the standard email.utils package. It parses all the email adresses that are mentioned in this thread fine.

Revision history for this message
Ferdinand (office-chricar) wrote :

thanks Stefan for making the point

Revision history for this message
Leonardo Pistone (lepistone) wrote :

Denis,

I totally agree about the topic, i.e. that regexps like that are bad, and they have been around for too long, and so on.

Still, I personally disagree with this somewhat abusive tone to be used in a discussion on an open source project.

Everyone has his priorities, and the OCB branches are there exactly for the fixes that OpenERP SA doesn't merge for whatever reason.

Of course we can influence or lobby them to do the way we want, but I still believe this can be done with a slightly softer tone.

my 2 cents,
leo

Revision history for this message
Denis Karataev (dskarataev) wrote :

Leonardo,

I'm sorry if it seems abusive, but I don't think that it's normal even to not check the bug report within 6 months, but unfortunately OpenERP has many same bug reports without any answers. It's not good for commercial product. Especially when the bug explanation already includes the patch. At least they could set status and priority for the bug report and write something like "we agree, it designed bad. we'll try to fix it as soon as possible". Don't you think so?

What about email_re expression I'm also very surprising how people with "bad" emails like <email address hidden> worked with portal module until now? It's not a small trouble, it doesn't let people to work with portal, they can't dismiss it and continue to work.

It's no problem for me to fix it with my OpenERP instance, but I guess all we want to help OpenERP S.A. to fix as more as possible bugs in official branch. But when someone added a bug explanation and still don't get any answer, it's not good workflow.

Only with the best wishes,
Denis.

Revision history for this message
Stefan Rijnhart (Opener) (stefan-opener) wrote :

@Olivier Thank you for solving this problem based on the suggested approach (while even improving on it).

Reference: http://bazaar.launchpad.net/~openerp/openobject-server/7.0/revision/5198

Revision history for this message
Denis Karataev (dskarataev) wrote :

Already in 7.0? Olivier rocks! Thank you a lot!

Revision history for this message
Olivier Dony (Odoo) (odo-openerp) wrote :

Hi,
This has been fixed in addons 7.0 at revision 9764 rev-id: <email address hidden>, using the corrected email_split function. As a result, updating both server and addons 7.0 to the latest version will get you the complete fix.

Concerning the server-part of the big report, I'd prefer addressing it on bug 1165531, as this bug is (at least partly) a duplicate. The unification of the email parsing will be continued in trunk, where extra regular expressions have been introduced again. Splitting the 2 concerns (specific portal problem vs improvement/unification of email parsing) seems easier to track.

Changed in openobject-addons:
assignee: nobody → OpenERP R&D Addons Team 1 (openerp-dev-addons1)
importance: Undecided → Medium
milestone: none → 7.0
status: New → Fix Released
no longer affects: openobject-server
summary: - Email address regular expression needs work
+ Portal Wizard uses wrong regular expression for emails
Revision history for this message
Joël Grand-Guillaume @ camptocamp (jgrandguillaume-c2c) wrote :

Thanks Olivier for the fix, appreciate that !

Revision history for this message
Marco Di Francesco (marco-difra) wrote : AUTO: Marco Di Francesco is out of the office. (returning Mon 12/01/2015)

I am out of the office from Wed 20/08/2014 until Mon 12/01/2015.

I will respond to your message when I return.

Note: This is an automated response to your message "[Openerp-community]
[Bug 1199386] FW:<email address hidden> 7" sent on 25/11/2014 7.21.21.

This is the only notification you will receive while this person is away.

Revision history for this message
Marco Di Francesco (marco-difra) wrote :

I am out of the office from Tue 23/12/2014 until Mon 12/01/2015.

I will respond to your message when I return.

Note: This is an automated response to your message "[Openerp-community]
[Bug 1199386] AUTO: Marco Di Francesco is out of the office. (returning Mon
12/01/2015)" sent on 23/12/2014 8.19.28.

This is the only notification you will receive while this person is away.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.