Convert pure HTML email to Plain Text format
 
Is there any way to convert pure HTML emails to plain text format?

- If an email contains HTML + plain text, this is easy: just delete the HTML part.
- If an email contains only HTML, pressing Delete does nothing (which is fine, because you can't delete the whole text body).

What I tried:
- Export/import: the email body stays exactly as it was received (export does not convert the body).

Maybe an idea for new functionality? ;-) The Bat! can already render HTML emails as plain text, so it could just save the result. It could be placed in the menu under Export messages to / Plain text message files (.MSG). The user could later import this file and delete the old message.

Or maybe there is some other tool/plugin/action that can do it?
 
It's an interesting idea, but I'm not aware of a command or plug-in that does this.

For occasional use, to convert a single message, you can:

- Drag the message to the Outbox of the account (or Ctrl-Drag to copy it instead of moving it);
- Double-click on it to open it in the editor;
- In the editor's status bar, click on the text format box and set it to Plain Text;
- Save the message using 'Put Message in the Outbox';
- Drag it back to the inbox.

(However, note that this will change the time/date of the message.)

Theoretically, it should be possible to make a filter that converts an HTML message to plain text (using the 'create a formatted message' action, with the 'strip the original attachments' option, and 'plain text' in the template format options), but I couldn't get that to work: the messages that it generated still had an HTML attachment.
I volunteer as a moderator to help keep the forum tidy. I do not work for Ritlabs SRL.
 
I also tried the Copy / Edit method (then Unpark), but as you say it changes the time/date of the message. You can't use it on old messages because you lose the original date. Also, when you compare the output you find that a lot of email headers are stripped out. The Bat! keeps only the mandatory fields.

If you do it at the moment the email is received, you get only a small time shift, which is fine as long as you don't need the additional information from the email headers (sender info, IP...).

About the filter: the HTML attachment can probably be deleted only when another plain text part exists.
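
To check which of the two cases an exported message falls into, something like the following Python sketch could be used outside The Bat!. It relies only on the standard `email` package; the function name `classify_body` and its return labels are made up for this illustration, and it assumes the message was exported to an .eml file first.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def classify_body(msg: EmailMessage) -> str:
    """Return 'html+plain', 'html-only', 'plain' or 'other' for a parsed message."""
    # Collect the content types of all non-attachment leaf parts.
    types = {
        part.get_content_type()
        for part in msg.walk()
        if not part.is_multipart() and not part.is_attachment()
    }
    has_plain = "text/plain" in types
    has_html = "text/html" in types
    if has_html and has_plain:
        return "html+plain"   # the easy case: the HTML part can simply be dropped
    if has_html:
        return "html-only"    # the hard case discussed in this thread
    if has_plain:
        return "plain"
    return "other"


if __name__ == "__main__":
    raw = b"Subject: demo\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<p>Hi</p>\r\n"
    print(classify_body(message_from_bytes(raw, policy=policy.default)))
```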
Edited: Boris Test - 03 September 2019 10:18:55
 
Another solution (the best one so far):

- Export the message to .eml.
- Open it in an editor and delete the base64/HTML part.
- Copy the plain text rendering from The Bat! into the edited email.
- Delete the Content-Type and Content-Transfer-Encoding email headers.
- Save and import it back into The Bat!

Email headers, date and time are untouched. Everything works. There may be problems with encodings, but for me it looks OK.
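
The manual steps above can be sketched in Python with the standard `email` package. This is only an illustration under some assumptions: the message has already been exported to .eml, the plain text rendering has been copied out of The Bat! by hand, and the helper name `replace_body_with_plain_text` is made up here. Instead of leaving Content-Type and Content-Transfer-Encoding out entirely, the sketch writes fresh ones for a UTF-8 text/plain body, which should sidestep the encoding problem mentioned above.

```python
from email import message_from_bytes, policy

# CRLF line endings, and keep the original folding of headers read from the file.
KEEP_HEADERS = policy.SMTP.clone(refold_source="none")


def replace_body_with_plain_text(raw_eml: bytes, plain_text: str) -> bytes:
    """Swap the body of an exported message for plain text, keeping other headers."""
    msg = message_from_bytes(raw_eml, policy=KEEP_HEADERS)
    # Drops the payload and every Content-* header (Content-Type,
    # Content-Transfer-Encoding, ...); Date, From, Received etc. stay in place.
    msg.clear_content()
    # Writes a text/plain body plus new Content-* headers that match it.
    msg.set_content(plain_text, charset="utf-8")
    return msg.as_bytes()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python convert.py in.eml rendered.txt out.eml
    src, text_file, dst = sys.argv[1:4]
    with open(src, "rb") as f, open(text_file, encoding="utf-8") as t:
        result = replace_body_with_plain_text(f.read(), t.read())
    with open(dst, "wb") as out:
        out.write(result)
```

The result can then be imported back into The Bat! like the hand-edited file.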

But this is time-consuming work. It's impossible to convert all messages manually this way. Some tool would be handy. This functionality could significantly reduce the size of mailboxes!
Edited: Boris Test - 03 September 2019 09:33:27
 
I played a little with filters. I used the Create a formatted message action (as Daniel van Rooijen proposed), which works fine and converts a pure HTML email to plain text. The problem is that filtering always creates a new email header. There is a special setting for that, Do not preserve header of the original message, which I thought should solve this problem, but whether it is checked or unchecked the result is the same (no original header).

There is another post that advises the same: https://www.ritlabs.com/en/auth-forums/forum4/topic6257/message23889/#message23889

Tested with 8.8.0.1 (Voyager); tomorrow I will test it with the latest version.
Edited: Boris Test - 04 September 2019 13:14:31
 
Quote
Boris Test wrote:
Another solution (for this time best):

- Export message to .eml.
- Edit in some editor and [..] [..]

But this is time consuming work. It's impossible manually convert all messages with this way. Some tool will be handy.

When you make a filter, one of the actions that it can do is to run an external program ('Run external action'). It can also export the message. So, if you can find a suitable HTML-2-TXT conversion tool, it might be possible to create a filter that does all the above very quickly..?
 
Quote
Daniel van Rooijen wrote:
Run external action
This can be handy ...

Today I tested with the latest version, 8.8.9.11, but Do not preserve header of the original message still does not work. Maybe this setting means something else? Or maybe it works only in some cases?

I can't find any useful information about this setting in the help. If this setting worked, everything would be done. (The Create formatted message filter works OK.)
 
I am currently reading about filters and found this:

> Unfortunately for you, you cannot set the date header manually with TB nor with a macro. The dateheader is set automatically based on the current time.

This is a problem because we lose the original email date if we use filters. I tried to set it manually with %HDRDate=%ODATE, but the result was always the current date/time. So only Do not preserve header of the original message can save us in this case (if it works) :)
Edited: Boris Test - 04 September 2019 11:14:39
 
I tried another approach: the Export message filter action. It has a setting RFC 822 message (.MSG/.EML, plain text), but an HTML message is exported as HTML, not as plain text. So this is not the right way either. The "plain text" description is confusing in this case.

 
I find this puzzling. I agree that Outlook creates excessive HTML, but other email clients tend to add just the basics. And Outlook (from my experiences) always includes both HTML and plain text, which is easy to manage as you stated earlier. The big items are unwanted graphics and TB! lets you avoid that in browser choice. Assuming you compose in plain text and reply in plain text, and remove HTML where both are included, that doesn't appear to leave a large percentage of HTML wastage. Your efforts seem to require much of your time to clean out HTML and how much storage space will it save? A possible concern is that in removing HTML you may find that you are modifying the message's intent, such as bulleted items or emphasized text where the author was letting the formatting communicate parts of the message. Anyway, best of luck in your pursuits.
 
Quote
david kirk wrote:
Your efforts seem to require much of your time to clean out HTML and how much storage space will it save?
In my case I deal with badly formatted HTML emails coming from one corporation. Each email is 92 KB in size (pure HTML). If I convert it to plain text with all the content (losing only the formatting and GUI junk), I get just 2 KB. For ~1000 emails that is a 92 MB vs a 2 MB inbox. I am sure TB! is happier with the second option (speed, work, backups, search, memory...), and this is only one correspondent, one inbox and more than 1000 emails :-)
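
The arithmetic behind those figures, as a tiny Python sketch. It uses decimal units (1 MB = 1000 KB) to match the round numbers in the post; the function name is made up for the example.

```python
def mailbox_size_mb(messages: int, kb_per_message: float) -> float:
    """Total size in MB, using decimal units (1 MB = 1000 KB)."""
    return messages * kb_per_message / 1000


html_mb = mailbox_size_mb(1000, 92)    # 92.0
plain_mb = mailbox_size_mb(1000, 2)    # 2.0
saving = 1 - plain_mb / html_mb        # fraction of space saved

print(f"{html_mb} MB vs {plain_mb} MB, saving {saving:.1%}")
```

So for this one inbox the plain text version would take up roughly 2% of the original space.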
Edited: Boris Test - 05 September 2019 13:46:56
 
I believe regex can probably do this, but my skills there are basic. There are a few examples of using regex on several websites that may give you some ideas. Good luck.
 
Quote
david kirk wrote:
I believe regex can probably do this,
Regex can help in some cases (stripping some info, etc.). The problem is that if your source email is BASE64 encoded, regex doesn't help: the body must be decoded first. Writing a regex to convert HTML to plain text is also not a good approach.
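
To illustrate the decode-first point, here is a rough Python sketch that lets the standard `email` package undo the base64 (or quoted-printable) encoding and then uses `html.parser` instead of a regex to pull out the text. The class and function names are made up, and the tag lists are a simplistic assumption; the output will not match The Bat!'s own plain text rendering.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script>/<style>, breaking lines on block tags."""

    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3", "table"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_part_to_text(msg: EmailMessage) -> str:
    """Decode the text/html part of a parsed message and return its text."""
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        raise ValueError("message has no text/html part")
    # get_content() reverses the transfer encoding and decodes the charset.
    html = part.get_content()
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(extractor.chunks).splitlines())
    return "\n".join(line for line in lines if line)
```

Usage would be along the lines of `html_part_to_text(message_from_bytes(raw, policy=policy.default))`, where `raw` holds the bytes of an exported .eml file.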

I have a support ticket open, but I am waiting for some people to return from vacation. There is one setting that could solve it all (maybe it's just a bug and they will fix it). All this functionality already exists inside TB! I will keep you informed.
Edited: Boris Test - 08 September 2019 11:30:50
 
About Do not preserve header of the original message: it's an official bug now, and it will be fixed in a future release.

> We confirm that issue - the original headers are not preserved anyway. We have submitted a report to our Bug Tracker. Thank you!
 
The Bat! 9.0.16.2 (BETA)
[-] 0001811: The Sorting Office option "Do not preserve header of the original messages" is always applied regardless of the settings

Fix is coming.. :)
 
Yes, the fix is there in The Bat! v9.1.6. If the option "Do not preserve header of the original messages" is turned off for the "Create formatted message" action in the Sorting Office, then the RFC-822 headers of the original message are added as an attachment (with Content-type "text/rfc822-headers", as described in RFC-1892) to the newly created message.
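
For anyone post-processing such messages outside The Bat!, the attached headers can be read back with Python's standard `email` package. This is a sketch only: it assumes the converted message has been exported to .eml and that the attachment is labelled text/rfc822-headers as described above; the function name `original_headers` is made up.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Optional


def original_headers(msg: EmailMessage) -> Optional[EmailMessage]:
    """Return the headers stored in a text/rfc822-headers part, or None."""
    for part in msg.walk():
        if part.get_content_type() == "text/rfc822-headers":
            # decode=True undoes base64/quoted-printable if the part uses it.
            payload = part.get_payload(decode=True)
            # A bare header block parses as a message with an empty body.
            return message_from_bytes(payload, policy=policy.default)
    return None
```

The returned object can be queried like any message, e.g. `original_headers(msg)["Date"]` for the original date.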
 
A little summary after some time:

The fix for "Do not preserve header of the original messages" doesn't work as expected: it adds the headers of the original message as an attachment, so it can't be used as a solution for this task. Original email headers in an external file are useless in a macro scenario.

The ticket was closed with the state: Couldn't be solved

Another proposed solution:
What about creating a new macro command, e.g. %NHEADERS (RFC-822 headers of the new message)? Then we could just assign %NHEADERS=%HEADERS (RFC-822 headers of the original message) and the job is done. Or some other syntax like %SETHEADERS(%HEADERS). Assigning each header value individually doesn't make sense because we lose a lot of original information from the headers. We need one command that applies to all headers.

List of message headers macros: https://www.ritlabs.com/en/support/help/73/#6674
Edited: Boris Test - 31 March 2020 16:37:16