
IMAP Message Gets UnicodeDecodeError: 'utf-8' Codec Can't Decode

After 5 hours of trying, it's time to get some help. I sifted through all the Stack Overflow questions related to this but couldn't find the answer. The code is a Gmail parser - works for

Solution 1:

Here is an example of how to retrieve and read mail parts with imapclient and the email.* modules from the Python standard library:

from imapclient import IMAPClient
import email
from email import policy


def walk_parts(part, level=0):
    # print the content type, indented by nesting depth
    print(' ' * 4 * level + part.get_content_type())
    # do something with part content (applies encoding by default)
    # part.get_content()
    if part.is_multipart():
        # recurse into each subpart of a multipart container
        for subpart in part.get_payload():
            walk_parts(subpart, level + 1)


# context manager ensures the session is cleaned up
with IMAPClient(host="your_mail_host") as client:
    client.login('user', 'password')

    # select some folder
    client.select_folder('INBOX')

    # do something with folder, e.g. search & grab unseen mails
    messages = client.search('UNSEEN')
    for uid, message_data in client.fetch(messages, 'RFC822').items():
        email_message = email.message_from_bytes(
            message_data[b'RFC822'], policy=policy.default)
        print(uid, email_message.get('From'), email_message.get('Subject'))

    # alternatively search for specific mails
    msgs = client.search(['SUBJECT', 'some subject'])

    ## do something with a specific mail:
    ## fetch a single mail with UID 12345
    raw_mails = client.fetch([12345], 'RFC822')

    # parse the mail (very expensive for big mails with attachments!)
    mail = email.message_from_bytes(
        raw_mails[12345][b'RFC822'], policy=policy.default)

    # Now you have a python object representation of the mail and can dig
    # into it. Since a mail can be composed of several subparts we have
    # to walk the subparts.

    # walk all parts at once
    for part in mail.walk():
        # do something with that part
        print(part.get_content_type())
    # or recurse yourself into sub parts until you find the interesting part
    walk_parts(mail)

See the docs for email.message.EmailMessage. They cover everything you need to read a mail message.
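As a small follow-up to the walk above, here is a sketch of pulling out the text body and the attachments once a message has been parsed with policy.default. To keep it runnable without a mail server, the message is built locally; the addresses, subject, body text and the data.bin attachment are all made up for illustration. With imapclient the bytes would come from the fetch result instead.

```python
import email
from email import policy
from email.message import EmailMessage

# Build a hypothetical message locally so the example runs without a server.
original = EmailMessage()
original["From"] = "sender@example.com"
original["Subject"] = "Report"
original.set_content("Café")
original.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)
raw = original.as_bytes()

# Parse the raw bytes, as you would with message_data[b'RFC822'].
mail = email.message_from_bytes(raw, policy=policy.default)

# get_body() picks the best matching body part; it may return None.
body = mail.get_body(preferencelist=("plain",))
text = body.get_content() if body is not None else ""

# iter_attachments() skips the body part and yields the remaining parts.
attachments = [
    (part.get_filename(), part.get_content())
    for part in mail.iter_attachments()
]

print(text.strip())
print(attachments)
```

get_content() decodes text parts with the charset declared in the part's headers, so there is no manual .decode() call anywhere.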

Solution 2:

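Use 'ISO-8859-1' instead of 'utf-8' when decoding. ISO-8859-1 maps every byte to a character, so the decode never raises, but text that was actually sent in a different encoding will come out garbled.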
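To illustrate the trade-off of this approach, here is a minimal sketch with made-up sample bytes (not taken from the question): decoding as ISO-8859-1 gets rid of the exception, but it silently produces wrong characters when the bytes were really UTF-8.

```python
# Hypothetical sample data: the same character in two encodings.
latin1_bytes = "é".encode("iso-8859-1")  # b'\xe9'
utf8_bytes = "é".encode("utf-8")         # b'\xc3\xa9'

# A lone 0xE9 byte is not valid UTF-8, so this raises.
try:
    latin1_bytes.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# ISO-8859-1 accepts any byte sequence ...
as_latin1 = latin1_bytes.decode("iso-8859-1")

# ... including UTF-8 bytes, which then come out as mojibake.
mojibake = utf8_bytes.decode("iso-8859-1")

print(utf8_failed, as_latin1, mojibake)
```

So this works when the mail really is ISO-8859-1, and hides the problem otherwise.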

Solution 3:

I had the same issue. After a lot of research I realized that I simply needed to use the message_from_bytes function from email rather than message_from_string. A raw message can contain parts in charsets other than UTF-8, so decoding the whole thing as UTF-8 can fail; message_from_bytes leaves the decoding to the parser.

So in your code, simply replace:

raw_email_str = raw_email.decode('utf-8')
email_message = email.message_from_string(raw_email_str)

with

email_message = email.message_from_bytes(raw_email)

That should work like a charm :)
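For what it's worth, here is a self-contained sketch of the difference. The raw message is hypothetical (hand-written headers and an ISO-8859-1 body sent as 8bit), and policy.default is an extra not in the snippet above; it is only there so get_content() is available.

```python
import email
from email import policy

# Hypothetical raw message: the body byte 0xE9 is "é" in ISO-8859-1
# and is not valid UTF-8.
raw_email = (
    b"From: sender@example.com\r\n"
    b"Subject: Test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"Caf\xe9\r\n"
)

# The old approach: decoding the whole message as UTF-8 fails.
try:
    raw_email.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Parsing the bytes directly works; the body is decoded later using
# the charset declared in the Content-Type header.
email_message = email.message_from_bytes(raw_email, policy=policy.default)
body_text = email_message.get_content()

print(decode_failed, email_message["Subject"], body_text.strip())
```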
