Getting the HTML content of an email/message body with the Gmail API

23 votes

3 answers

26826 views

Data Engineer

Asked 2025-04-18 11:10

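Is there a way to get the HTML version of a message body through the Gmail API?

I have already gone through the message.get documentation and tried setting the format parameter to full, minimal and raw, but none of them helped. I still get back only the plain text of the message.

The description of the format values:

"full": Returns the parsed email message content in the payload field; the raw field is not used. (default)

"minimal": Returns only email message metadata such as identifiers and labels; it does not return the email headers, body, or payload.

"raw": Returns the entire email message content in the raw field as a string; the payload field is not used. This includes the identifiers, labels, metadata, MIME structure, and small body parts (typically less than 2KB).

Can't we simply get the message body as HTML? Or is there some other way to do this, so that users see almost no difference between the mail in my app and in Gmail?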

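3 Answers

Here is the full tutorial:

1- I assume you have already finished creating your credentials; you can check that here.

2- This is how to fetch a MIME message: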

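 // Requires: java.io.ByteArrayInputStream, java.util.Base64, java.util.Properties,
 // javax.mail.Session, javax.mail.internet.MimeMessage,
 // com.google.api.services.gmail.model.Message
 public static String getMimeMessage(String messageId)
            throws Exception {

        // getService() is defined in step 3
        Message message = getService().users().messages().get("me", messageId).setFormat("raw").execute();

        // The raw field is base64url-encoded, so decode it with the URL-safe decoder
        byte[] emailBytes = Base64.getUrlDecoder().decode(message.getRaw());

        Properties props = new Properties();
        Session session = Session.getDefaultInstance(props, null);

        // Parse the decoded RFC 822 bytes into a MimeMessage
        MimeMessage email = new MimeMessage(session, new ByteArrayInputStream(emailBytes));

        return getText(email); // getText() is defined in step 4
    }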

3- This is the code that creates the Gmail instance:

private static Gmail getService() throws Exception {
    final NetHttpTransport HTTP_TRANSPORT = GoogleNetHttpTransport.newTrustedTransport();
    // Load client secrets.
    InputStream in = SCFManager.class.getResourceAsStream(CREDENTIALS_FILE_PATH);
    if (in == null) {
        throw new FileNotFoundException("Resource not found: " + CREDENTIALS_FILE_PATH);
    }
    GoogleClientSecrets clientSecrets = GoogleClientSecrets.load(JSON_FACTORY, new InputStreamReader(in));

    // Build flow and trigger user authorization request.
    GoogleAuthorizationCodeFlow flow = new GoogleAuthorizationCodeFlow.Builder(
            HTTP_TRANSPORT, JSON_FACTORY, clientSecrets, SCOPES)
            .setDataStoreFactory(new FileDataStoreFactory(new java.io.File(TOKENS_DIRECTORY_PATH)))
            .setAccessType("offline")
            .build();
    LocalServerReceiver receiver = new LocalServerReceiver.Builder().setPort(8888).build();
    Credential credential = new AuthorizationCodeInstalledApp(flow, receiver).authorize("user");

    return new Gmail.Builder(HTTP_TRANSPORT, JSON_FACTORY, credential)
            .setApplicationName(APPLICATION_NAME)
            .build();
}

4- This is how to parse the MIME message:

 public static String getText(Part p) throws
            MessagingException, IOException {
        if (p.isMimeType("text/*")) {
            String s = (String) p.getContent(); 
            return s;
        }

        if (p.isMimeType("multipart/alternative")) {
            // prefer html text over plain text
            Multipart mp = (Multipart) p.getContent();
            String text = null;
            for (int i = 0; i < mp.getCount(); i++) {
                Part bp = mp.getBodyPart(i);
                if (bp.isMimeType("text/plain")) {
                    if (text == null) {
                        text = getText(bp);
                    }
                    continue;
                } else if (bp.isMimeType("text/html")) {
                    String s = getText(bp);
                    if (s != null) {
                        return s;
                    }
                } else {
                    return getText(bp);
                }
            }
            return text;
        } else if (p.isMimeType("multipart/*")) {
            Multipart mp = (Multipart) p.getContent();
            for (int i = 0; i < mp.getCount(); i++) {
                String s = getText(mp.getBodyPart(i));
                if (s != null) {
                    return s;
                }
            }
        }

        return null;
    }

5- If you are wondering how to get the message IDs, this is how to list them:

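 // Requires: java.util.Collections, java.util.List, java.util.stream.Collectors
 public static List<String> listTodayMessageIds() throws Exception {
        ListMessagesResponse response = getService()
                .users()
                .messages()
                .list("me")
                .setQ("newer_than:1d") // assumption: "today" means the last 24 hours
                .execute(); // returns only the first page of results

        if (response != null && response.getMessages() != null && !response.getMessages().isEmpty()) {
            return response.getMessages().stream().map(Message::getId).collect(Collectors.toList());
        } else {
            // An empty list spares callers a null check
            return Collections.emptyList();
        }
    }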

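Note:

If you later want to query the HTML content in a "JavaScript-like way", I suggest you take a look at the jsoup library. It is very intuitive and easy to work with: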

Document jsoup = Jsoup.parse(body);

Elements tds = jsoup.getElementsByTag("td");
Elements ps = tds.get(0).getElementsByTag("p");

Hope this helps :-)

Answered 2025-04-18 by Python Master

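Both FULL and RAW give you the text or HTML parts, whichever you are after. With FULL you get a parsed representation: nested JSON dictionaries that you have to walk, level by level, to find the text or HTML part. With RAW you get the entire message in RFC 822 format (base64url-encoded) in the Message.raw field. You can hand that to the MIME library of whatever programming language you use and let it find the part you are interested in. MIME structure is fairly complicated. Usually you will see a top-level "multipart" type that directly contains the text or HTML parts, but that is not guaranteed, because the structure can be nested arbitrarily deep! :)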
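Since the parts can be nested to any depth, walking the FULL payload is naturally recursive. Here is a minimal sketch of that walk, under some assumptions: `HtmlPartFinder`, `Part`, `findHtml` and `encode` are made-up names that only mimic the shape of the payload (a MIME type, base64url body data, child parts), not classes from the Gmail client library, and the body is assumed to be UTF-8, whereas a real message states its charset in the Content-Type header.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

/** Hypothetical stand-in for walking the nested payload of a full-format message. */
final class HtmlPartFinder {

    /** Minimal model of a MIME part: type, optional base64url body data, child parts. */
    static final class Part {
        final String mimeType;
        final String data;       // base64url-encoded body, or null for containers
        final List<Part> parts;  // child parts, or null for leaf parts

        Part(String mimeType, String data, List<Part> parts) {
            this.mimeType = mimeType;
            this.data = data;
            this.parts = parts;
        }
    }

    /** Depth-first search for the first text/html part; returns its decoded body or null. */
    static String findHtml(Part part) {
        if (part == null) {
            return null;
        }
        if ("text/html".equalsIgnoreCase(part.mimeType) && part.data != null) {
            byte[] bytes = Base64.getUrlDecoder().decode(part.data);
            return new String(bytes, StandardCharsets.UTF_8); // assumption: UTF-8 body
        }
        if (part.parts != null) {
            for (Part child : part.parts) {
                String html = findHtml(child);
                if (html != null) {
                    return html;
                }
            }
        }
        return null;
    }

    /** Demo helper: encodes text as unpadded base64url. */
    static String encode(String text) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Part plain = new Part("text/plain", encode("Hello"), null);
        Part html = new Part("text/html", encode("<p>Hello</p>"), null);
        Part alternative = new Part("multipart/alternative", null, Arrays.asList(plain, html));
        // The HTML part sits two levels down, inside multipart/mixed.
        Part root = new Part("multipart/mixed", null, Arrays.asList(alternative));
        System.out.println(findHtml(root));
    }
}
```

The search returns the first text/html part it meets in document order, which is usually the message body, though an HTML attachment placed earlier in the tree would win instead.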

Answered 2025-04-18 by Python Master

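An email message that has both HTML and plain-text content consists of multiple parts, and the part whose mimeType is "text/html" is the one that holds the HTML content. You can find it with logic like this: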

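// In a full-format response the parts live under message.payload.parts.
// find() returns the first matching part (filter() would return an array).
var part = message.payload.parts.find(function(part) {
  return part.mimeType == 'text/html';
});
// urlSafeBase64Decode is this answer's own helper; it is not defined here.
var html = part ? urlSafeBase64Decode(part.body.data) : null;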
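The body data is encoded with the URL-safe base64 alphabet ("-" and "_" in place of "+" and "/"), which is why a plain base64 decoder may reject it. As a rough sketch of what a helper like `urlSafeBase64Decode` could look like in Java, using only the standard library: the class name `BodyDecoder` is made up for illustration, and the body is assumed to be UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper mirroring the urlSafeBase64Decode call used above. */
final class BodyDecoder {

    /** Decodes base64url text (padding optional) into a string. */
    static String urlSafeBase64Decode(String data) {
        byte[] bytes = Base64.getUrlDecoder().decode(data);
        return new String(bytes, StandardCharsets.UTF_8); // assumption: UTF-8 body
    }

    public static void main(String[] args) {
        // Unpadded base64url for "<b>Hi?</b>"; note the '-' and '_' characters.
        String encoded = "PGI-SGk_PC9iPg";
        System.out.println(urlSafeBase64Decode(encoded));

        // The standard (non-URL) decoder rejects the same input.
        try {
            Base64.getDecoder().decode(encoded);
        } catch (IllegalArgumentException e) {
            System.out.println("standard decoder rejected the URL-safe alphabet");
        }
    }
}
```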

Answered 2025-04-18 by Python Master
