Commit Graph

8 Commits

Author SHA1 Message Date
medmunds
d3ac4a1542 Drop Python 3.3-specific tests 2018-05-15 11:36:00 -07:00
medmunds
802a56c87d Inbound: fix 8bit Unicode parsing as escape sequences on Python 3
Work around Python 3 email parser change that can turn Unicode
characters into \u escape sequences when parsing a message (or
attachment) that uses "Content-Transfer-Encoding: 8bit".
2018-04-02 16:41:05 -07:00
medmunds
dbe48d48af Inbound: add parse_raw_mime_bytes and parse_raw_mime_file
Useful for cases where an ESP could send a raw 8bit message
(whose charset is something other than utf-8).

Also reworks the earlier workaround for Python 2.7 email.parser.Parser
header unfolding bugs so that it handles any text-like, file-like IO
stream, without trying to manipulate the entire message as a single string.
2018-04-01 17:26:27 -07:00
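
A minimal sketch of the idea behind this commit, using only the standard library's `email.message_from_bytes` and `email.message_from_binary_file` on Python 3. It is not Anymail's implementation, and the signatures of `parse_raw_mime_bytes` and `parse_raw_mime_file` are not shown here; the sample message is made up for illustration.

```python
import io
from email import message_from_binary_file, message_from_bytes, policy

# Illustrative raw 8bit message whose charset is not utf-8
RAW = (
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + "Gr\u00fc\u00dfe\n".encode("iso-8859-1")

# Parse directly from bytes, so the declared charset is applied to the
# original octets instead of to an already-decoded string
from_bytes = message_from_bytes(RAW, policy=policy.default)

# Parse from any binary file-like object without building one big string
from_file = message_from_binary_file(io.BytesIO(RAW), policy=policy.default)

text_from_bytes = from_bytes.get_content()
text_from_file = from_file.get_content()
```

Parsing from bytes matters here because decoding the whole message as utf-8 before parsing would corrupt the ISO-8859-1 body.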
medmunds
3928f6ea5e Inbound: fix charset handling in .text, .html, .get_content_text()
Make `AnymailInboundMessage.text`, `.html` and `.get_content_text()`
usually do the right thing for non-UTF-8 messages/attachments. Fixes
an incorrect UnicodeDecodeError when receiving a message encoded in
(e.g.) ISO-8859-1, and improves handling for inbound messages that
were not properly encoded by the sender.

* Decode using the message's (or attachment's) declared charset
  by default (rather than always defaulting to 'utf-8'); you can
  still override with `get_content_text(charset=...)`
* Add `errors` param to `get_content_text()`, defaulting to 'replace'.
  Mis-encoded messages will now use the Unicode replacement character
  rather than raising errors. (Use `get_content_text(errors='strict')`
  for the previous behavior.)
2018-04-01 17:26:17 -07:00
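
The decoding behavior described in this commit can be approximated with the standard library as follows. `decode_part_text` is a hypothetical helper written for this sketch, not Anymail's `get_content_text()`; it only mirrors the two bullets above (declared charset by default, `errors='replace'` by default).

```python
from email import message_from_bytes, policy

def decode_part_text(part, charset=None, errors="replace"):
    """Hypothetical helper: decode a text part using its declared charset."""
    raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
    if charset is None:
        # Fall back to utf-8 only when the part declares no charset
        charset = part.get_content_charset("utf-8")
    return raw.decode(charset, errors=errors)

HEADERS = (
    b"Content-Type: text/plain; charset=%s\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
)
# Correctly labeled ISO-8859-1 message
latin1_msg = message_from_bytes(
    HEADERS % b"iso-8859-1" + b"Caf\xe9\n", policy=policy.default)
# Mis-encoded message: claims utf-8 but contains a Latin-1 byte
bad_utf8_msg = message_from_bytes(
    HEADERS % b"utf-8" + b"Caf\xe9\n", policy=policy.default)

latin1_text = decode_part_text(latin1_msg)      # decoded as ISO-8859-1
replaced_text = decode_part_text(bad_utf8_msg)  # bad byte becomes U+FFFD
```

Passing `errors="strict"` to the helper raises `UnicodeDecodeError` for the mis-encoded message, which corresponds to the previous behavior mentioned above.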
medmunds
eab11ed53e Inbound: test parsing RFC2231 MIME header parameters
And decide not to work around a Python 3.3 bug accessing MIME headers
that have non-ASCII characters in params. The bug is fixed in the
Python 3.4 email package (and didn't exist in Python 2.7). Python 3.3
was only supported with Django 1.8.
2018-03-24 17:47:06 -07:00
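
For context, an RFC2231-encoded MIME parameter looks like the `filename*` value below. This is a standard-library sketch of what such a test might check on Python 3.4 or later, not the test added in this commit; the header value and filename are invented for illustration.

```python
from email import message_from_string

# Attachment part whose filename parameter uses RFC2231 encoding
# (charset, empty language tag, then percent-encoded octets)
RAW = (
    "Content-Type: text/plain\n"
    "Content-Disposition: attachment; filename*=UTF-8''na%C3%AFve.txt\n"
    "\n"
    "body\n"
)

part = message_from_string(RAW)

# get_filename() collapses the RFC2231 value into a decoded str
filename = part.get_filename()
```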
medmunds
3d27e3fe6b Inbound: decode Unicode and other non-ASCII email headers on Python 2
In AnymailInboundMessage, work around Python 2 email.parser.Parser's
lack of handling for RFC2047-encoded email headers. (The Python 3 email
package already decodes these automatically.)

Improves inbound handling on Python 2 for all ESPs that provide raw
MIME email or raw headers with inbound events. (Mailgun, Mandrill,
SendGrid, SparkPost.)
2018-03-24 10:03:18 -07:00
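
RFC2047 "encoded words" are how non-ASCII text is carried in email headers. The sketch below decodes them with the standard library's `email.header` helpers; `decode_rfc2047` is a hypothetical name, and this shows only the general technique, not the workaround Anymail applies on Python 2.

```python
from email.header import decode_header, make_header

def decode_rfc2047(value):
    """Hypothetical helper: decode RFC2047 encoded words in a header value."""
    # decode_header splits the value into (bytes_or_str, charset) chunks;
    # make_header reassembles them into a Header that converts to text
    return str(make_header(decode_header(value)))

utf8_subject = decode_rfc2047("=?utf-8?q?Caf=C3=A9?=")
latin1_subject = decode_rfc2047("=?iso-8859-1?q?Caf=E9?=")
plain_subject = decode_rfc2047("Hello")
```

All three calls return ordinary text; the two encoded inputs decode to the same string even though they use different charsets.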
medmunds
70094cf3bc Inbound: correctly parse long (folded) headers in raw MIME messages
Work around Python 2 email.parser.Parser bug handling RFC5322 folded
headers. Fixes problems where long headers in inbound mail (e.g.,
Subject) get truncated or have unexpected spaces.

This change also updates AnymailInboundMessage.parse_raw_mime to use
the improved "default" email.policy on Python 3 (rather than the
default "compat32" policy). This likely fixes several other parsing
bugs that will still affect code running on Python 2.

Improves inbound parsing for all ESPs that provide raw MIME email.
(Mailgun, Mandrill, SendGrid, SparkPost)
2018-03-23 19:00:42 -07:00
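
A short standard-library illustration of the folded-header case on Python 3, parsing with `email.policy.default` as this commit describes. The message text is invented, and the snippet does not reproduce the Python 2 workaround itself.

```python
from email import message_from_string, policy

# A long Subject header that the sender folded across two lines (RFC5322):
# the continuation line starts with whitespace
RAW = (
    "From: sender@example.com\n"
    "Subject: This is a long subject line that the sending server\n"
    " folded onto a second line\n"
    "\n"
    "Body text\n"
)

msg = message_from_string(RAW, policy=policy.default)

# With policy.default the header value comes back unfolded,
# with no embedded line breaks
subject = msg["Subject"]
```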
Mike Edmunds
b57eb94f64 Add inbound mail handling
Add normalized event, signal, and webhooks for inbound mail.

Closes #43
Closes #86
2018-02-02 10:38:53 -08:00