DomCrawler filterXpath for emails

In my project I am trying to use filterXPath on emails. I fetch an email via IMAP and load its HTML body into a DomCrawler instance:

$crawler = new Crawler();
$crawler->addHtmlContent($mail->textHtml); // HTML body of the mail, UTF-8 encoded

Now to my issue. I only want the plain text of the mail body, but it should keep all line breaks, spaces etc. In other words, it should look exactly like the rendered mail, just as plain text without HTML (still containing \n, \r etc.).

For that reason I tried using $crawler->filterXPath('//body/descendant-or-self::*/text()') to get every text node inside the mail.

However, my test mail contains HTML like:

<p>
    <u>
        <span>
            <a href="mailto:[email protected]">
                <span style="color:#0563C1">[email protected]</span>
            </a>
        </span>
    </u>
    <span>
</span>
    <span>·</span>
    <span>
        <b>
            <a href="http://www.example.com">
                <span style="color:#0563C1">www.example.com</span>
            </a>
        </b>
    <p/>
    </span>
</p>

In my mail this looks like [email protected] · www.example.com (in one single line).

With my filterXPath call I get multiple nodes, which results in the following (multiple lines):

[email protected]
· www.example.com

I know that the <span> containing only a line break (a \r) is probably the problem, but since I can't change the HTML in the mail, I need another solution. As mentioned before, in the mail it is only a single line.

Please keep in mind that my solution has to work for every mail. I do not know what the mail HTML looks like, and it can change every time, so I need a generic solution.

I already tried using strip_tags too, but this does not change the result at all.


My current approach:

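use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent($mail->textHtml);

$text = "";
// Visit every text node below <body>
foreach ($crawler->filterXPath('//body/descendant-or-self::*/text()') as $element) {
    $part = trim($element->textContent);
    if ($part !== "") { // strict check so a node containing "0" is kept
        $text .= "|" . $part . "|\n"; // pipes make whitespace visible
    }
}
echo $text;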

//OUTPUT
|[email protected]|
|·|
| |
|www.example.com|
| |

Source: Symfony Questions
