[nycphp-talk] Sneaking in unwanted characters
Adam Maccabee Trachtenberg
adam at trachtenberg.com
Wed Sep 10 19:26:46 EDT 2003
On Wed, 10 Sep 2003, Chris Shiflett wrote:
> Most of the RFCs either have example regular expressions or a very specific
> grammar that can be used to build one. I've seen that one in the back of the
> O'Reilly book, and my instinct tells me it shouldn't have to be that
> complicated. :-)
The full valid e-mail spec is really nasty, because you can have
comments inside the address and other weird things. Here is the regex
from PHP Cookbook that allows most real-world addresses, but not
everything that's okay:
/
^ # anchor at the beginning
[^@\s]+ # name is all characters except @ and whitespace
@ # the @ divides name and domain
(
[-a-z0-9]+ # (sub)domains are letters, numbers, and hyphens
\. # separated by a period
)+ # and we can have one or more of them
(
[a-z]{2} # TLDs can be a two-letter alphabetical country code
|com|net # or one of
|edu|org # many
|gov|mil # possible
|int|biz # three-letter
|pro # combinations
|info|arpa # or even
|aero|coop # a few
|name # four-letter ones
|museum # plus one that's six letters long!
)
$ # anchor at the end
/ix # and everything is case-insensitive
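For anyone who wants to try the pattern, here is a minimal sketch of dropping it into preg_match(). The function name looks_like_email() is made up for illustration, the type declarations assume PHP 7 or later, and the inline comments are shortened so they contain no apostrophes or slashes, which would end the single-quoted string or the / delimiter early.

```php
<?php
// Hypothetical helper (not from the book): wraps the pattern quoted above.
function looks_like_email(string $address): bool
{
    // Single-quoted so that \s and \. reach PCRE unchanged.
    $pattern = '/
        ^                # anchor at the beginning
        [^@\s]+          # name: anything except @ and whitespace
        @                # the @ divides name and domain
        (
          [-a-z0-9]+     # (sub)domains: letters, numbers, hyphens
          \.             # separated by a period
        )+               # one or more of them
        (
          [a-z]{2}       # two-letter country code
          |com|net|edu|org|gov|mil|int|biz|pro
          |info|arpa|aero|coop|name|museum
        )
        $                # anchor at the end
    /ix';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $address) === 1;
}

var_dump(looks_like_email('adam@trachtenberg.com'));
var_dump(looks_like_email('user@localhost'));
```

The TLD list is fixed, so an address under any TLD that is not listed (for example user@example.travel) is rejected. Also note that without the D modifier, $ matches just before a trailing newline, so trim() the input first if that matters.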
Alternatively, check out imap_rfc822_parse_adrlist().
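A rough sketch of that route, assuming the IMAP extension is loaded. The wrapper parse_address_list() and the example.* addresses are invented for illustration. The second argument to imap_rfc822_parse_adrlist() is the default host applied to bare mailbox names.

```php
<?php
// Hypothetical wrapper: returns null when the IMAP extension is missing.
function parse_address_list(string $list, string $defaultHost): ?array
{
    if (!function_exists('imap_rfc822_parse_adrlist')) {
        return null;
    }

    $result = [];
    foreach (imap_rfc822_parse_adrlist($list, $defaultHost) as $entry) {
        // Each entry is an object; properties may be absent, hence the ??.
        $result[] = [
            'mailbox' => $entry->mailbox ?? '',
            'host'    => $entry->host ?? '',
        ];
    }
    return $result;
}

$parsed = parse_address_list('Adam <adam@example.com>, chris', 'example.net');
if ($parsed === null) {
    echo "IMAP extension not available\n";
} else {
    foreach ($parsed as $addr) {
        // Assumption to verify locally: the underlying c-client library
        // flags unparseable entries with the host ".SYNTAX-ERROR.".
        $ok = $addr['host'] !== '.SYNTAX-ERROR.';
        echo $addr['mailbox'], '@', $addr['host'], $ok ? '' : ' (invalid)', "\n";
    }
}
```

This hands the RFC 822 grammar, comments and all, to the parser instead of a regex, at the cost of depending on an extension that is not always installed.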
-adam
--
adam at trachtenberg.com
author of o'reilly's php cookbook
avoid the holiday rush, buy your copy today!