Potential plaintext parser issue


Continuing the discussion from I receive an undefined method error when submitting a remote execution job so that we don’t derail a user support topic:

@Ivan_Necas So firstly, this isn’t the OneBox itself that’s the issue - I think it’s an edge-case combination of what your email client sent and what Discourse generally does to handle inbound email. Let me break it down:

To avoid it re-parsing, here’s a screenshot of what Discourse thinks it got from your client:

Two things to note here:

  1. The GitHub URL is on a line by itself - that’s what triggers the OneBox, URLs within a line aren’t treated that way
  2. There’s a comma on the next line all by itself

This is confirmed in the database, looking at what’s stored, there are definitely newlines there. Would be interesting to see what’s in your Sent folder, and where those newlines came from.

Where I think we’ve found an issue is with that comma - Discourse tries pretty aggressively to remove things like signatures and suchlike. I think it’s interpreting that comma with some kind of “Is this a line of all punctuation?” regex (which would normally match the start of signatures etc) and thus stripping the rest of the paragraph. Still hunting in the code for where that’s defined…

I need to upgrade our Discourse tonight anyway, and I see a few things in the patch notes relating to plaintext email. After that I’ll run some tests via private messages and see if I can replicate the behaviour after upgrade, and report upstream if so - although it is weird that your client decided to break the line in that way, I think. In the short term, not adding punctuation to long URLs would probably avoid hitting this (and it’s one less character to type :stuck_out_tongue:)


OK, replicated, it’s definitely the comma on a single line. If you send the exact same content without it, eg:

As I've suggested in
I will revert the changes in tasks that caused this troubles and release a
new version
that will still ship the service scripts in 1.16

Then the whole thing renders fine, OneBox and all. A bit of an edge case where punctuation has been shifted onto it’s own line, but I’ll raise it upstream all the same as I think their parser is being harsh here.


Tracking this here: