[nycphp-talk] Validating Email Addresses
Daniel Convissor
danielc at analysisandsolutions.com
Thu Dec 2 18:16:59 EST 2004
On Thu, Dec 02, 2004 at 01:21:25PM -0500, James B. Wetterau Jr. wrote:
> > Daniel Convissor wrote:
> > >
> > > '^[a-z0-9_.=+-]+@([a-z0-9-]+\.)+([a-z]{2,6})$'
>
> I feel that that regex is way underblown.
>
> It misses many valid email addresses. It's perfectly legitimate to
> have characters other than [a-z0-9_.=+-] in the local part of the
> address.
Back when I did a study of RFC 822, those were the characters that the
RFC said were valid. Perhaps I made a mistake. But I haven't heard
of any problems with my expression.
But RFC 822 has been replaced by RFC 2822, and I believe the newer RFC
allows more characters in the local part. Again, I haven't heard any
complaints about my expression. Nonetheless, looking at
ftp://ftp.rfc-editor.org/in-notes/rfc2822.txt shows the following:
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
Most of those characters are quite unusual for email addresses.
<grumble>
fine
</grumble>
Let's "do the right thing..." Here's a tweak of my ereg()
expression that includes all the legit characters in the local part:
'^[-!#$%&\'*+/=?^_`{|}~a-z0-9.]+@([a-z0-9-]+\.)+([a-z]{2,6})$'
The one flaw is that it allows the local part to be just a
period: .@foo.com
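To see that flaw concretely, here is a minimal sketch that runs the one-line expression against a few addresses. It assumes preg_match() in place of ereg(), so the "/" delimiters, the "i" and "D" modifiers, the loose_accepts() helper and the sample addresses are all illustrative assumptions rather than part of the original expression.

```php
<?php
// Sketch only: the one-line expression above, ported from ereg() to PCRE.
// The "/" delimiter forces "\/" inside the class; "i" ignores case and
// "D" keeps "$" from matching before a trailing newline.
$loose = '/^[-!#$%&\'*+\/=?^_`{|}~a-z0-9.]+@([a-z0-9-]+\.)+([a-z]{2,6})$/iD';

// Hypothetical helper, not part of the original post.
function loose_accepts($expr, $addr)
{
    return preg_match($expr, $addr) === 1;
}

// Sample addresses are made up for illustration.
foreach (array('dan@example.com', '.@foo.com', 'a..b@foo.com') as $addr) {
    echo $addr, ': ', loose_accepts($loose, $addr) ? 'accepted' : 'rejected', "\n";
}
```

All three come back accepted, because the period sits inside the same character class as every other local-part character and nothing forces a non-period character around it.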
So, while we're at it, let's take it all the way:
// Local part: atoms separated by single literal periods (note the "\.").
$expr = '^[-!#$%&\'*+/=?^_`{|}~a-z0-9]+'
      . '(\.[-!#$%&\'*+/=?^_`{|}~a-z0-9]+)*'
      . '@([a-z0-9-]+\.)+([a-z]{2,6})$';
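As a rough usage sketch, the same pattern can be wrapped in a small function. This version assumes preg_match(), since ereg() was removed in PHP 7, and it writes the separator between atoms as an escaped "\." so that only a literal period is accepted there. The function name is_valid_address(), the $atom variable, the "i" and "D" modifiers and the sample addresses are hypothetical choices for illustration.

```php
<?php
// Sketch only: hypothetical wrapper around the three-part expression.
function is_valid_address($addr)
{
    // One atom: a run of the RFC 2822 atext characters.
    // "\/" is escaped only because "/" is the PCRE delimiter here.
    $atom = '[-!#$%&\'*+\/=?^_`{|}~a-z0-9]+';

    // Local part: atoms joined by single literal periods.
    // Domain: dot-separated labels ending in a 2-6 letter TLD.
    // "i" ignores case; "D" makes "$" match only at the very end.
    $expr = '/^' . $atom . '(\.' . $atom . ')*'
          . '@([a-z0-9-]+\.)+([a-z]{2,6})$/iD';

    return preg_match($expr, $addr) === 1;
}

// Made-up sample addresses.
$samples = array(
    'dan@example.com',
    "o'reilly+list@mail.example.org",
    '.@foo.com',
    'a..b@foo.com',
    'dan.@foo.com',
);
foreach ($samples as $addr) {
    echo $addr, ': ', is_valid_address($addr) ? 'valid' : 'invalid', "\n";
}
```

The first two samples pass. The last three fail, because a period is now allowed only between two non-empty atoms, which rules out leading, trailing and doubled periods.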
I've needed to do this for a while. Thanks for the excuse.
--Dan
--
T H E A N A L Y S I S A N D S O L U T I O N S C O M P A N Y
data intensive web and database programming
http://www.AnalysisAndSolutions.com/
4015 7th Ave #4, Brooklyn NY 11232 v: 718-854-0335 f: 718-854-0409