Message encodings: "LookupError: unknown encoding: ..." #111

Closed
opened 2022-10-29 21:23:27 +02:00 by pfm · 1 comment
Collaborator

Some messages contain Content-Type declarations that cause errors.

Example:

Content-Type: text/plain; charset="iso-2022-jp"
	boundary="--98943748626111287826

The header above seems to be parsed incorrectly, leading to the following error:

LookupError: unknown encoding: iso-2022-jp"
	boundary="--98943748626111287826

The error is raised on the last line of this fragment of Python's built-in email.message module:

        if hasattr(payload, 'encode'):
            if charset is None:
                self._payload = payload
                return
            if not isinstance(charset, Charset):
                charset = Charset(charset)
            payload = payload.encode(charset.output_charset)
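
The failure can probably be reproduced outside the application with the standard library alone. The following is a minimal sketch, assuming the default (compat32) parsing policy and using the header exactly as quoted in this report; the `body` text and the variable names are placeholders for illustration.

```python
import email

# Header copied from the report: there is no semicolon between the
# charset and boundary parameters.
raw = (
    'Content-Type: text/plain; charset="iso-2022-jp"\n'
    '\tboundary="--98943748626111287826\n'
    '\n'
    'body\n'
)

msg = email.message_from_string(raw)
charset = msg.get_content_charset()

# Without the semicolon, the folded boundary line ends up inside the
# charset value instead of being parsed as a separate parameter.
print(repr(charset))

try:
    "body".encode(charset)
except LookupError as exc:
    print("LookupError:", exc)
```

The point of the sketch is that the charset value handed to `encode` is not a plain codec name, which is what triggers the `LookupError`.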
pfm added this to the Open beta disroot test milestone 2022-10-29 21:23:27 +02:00
pfm added the ISSUE label 2022-10-29 21:23:27 +02:00
Author
Collaborator

I'm closing this ticket because the message above is an invalid MIME message.

It should include a semicolon after charset:

Content-Type: text/plain; charset="iso-2022-jp";
	boundary="--98943748626111287826

We can't do much with an invalid input message, so perhaps it makes sense to leave this as it is.

If it happens frequently enough for us to notice, we can re-open this ticket.
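
Should that happen, one possible mitigation would be to sanitize the charset value before it is used for encoding. The sketch below is only an illustration under that assumption: `clean_charset` and its `default` parameter are hypothetical names, not part of this project or of the `email` package, and falling back to UTF-8 is an arbitrary choice.

```python
import codecs
import re


def clean_charset(raw, default="utf-8"):
    """Return a codec name Python knows, or `default` if none can be recovered.

    Hypothetical helper; not part of the email package.
    """
    if not raw:
        return default
    # Drop leading quotes and whitespace, then keep only the text up to
    # the first quote, semicolon, or whitespace character.
    candidate = re.split(r'[";\s]', raw.lstrip('" \t\r\n'), maxsplit=1)[0]
    if not candidate:
        return default
    try:
        # codecs.lookup raises LookupError for unknown encodings.
        return codecs.lookup(candidate).name
    except LookupError:
        return default


# The malformed value from the error message in this ticket.
broken = 'iso-2022-jp"\n\tboundary="--98943748626111287826'
print(clean_charset(broken))
```

This trades strictness for robustness: an invalid message is decoded on a best-effort basis instead of raising, which may or may not be desirable here.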

pfm closed this issue 2022-12-29 22:37:25 +01:00