Why Does SimplePie Replace Some Characters With Gibberish?

Sometimes when you use SimplePie to load and output an RSS feed, some characters, like quotation marks and apostrophes, are replaced with gibberish like €‡™. You may wonder what’s wrong, and go searching for a way to keep the unsightly garbage from appearing.

You have an encoding issue.

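Most RSS feeds are encoded as UTF-8, as are many web pages, and SimplePie outputs UTF-8 by default. If you put SimplePie output on a page that the browser reads as some other encoding, multi-byte characters such as curly quotes get split into several wrong ones, and you’ll get the weird characters.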
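To see where the gibberish comes from, here is a minimal sketch that takes a curly apostrophe and decodes its UTF-8 bytes as if they were Windows-1252. It assumes PHP 7 or later with the mbstring extension enabled, and Windows-1252 is only a stand-in for whatever non-UTF-8 encoding a browser might fall back to.

```php
<?php
// A right single quotation mark (U+2019), the "curly apostrophe" common in feeds.
$apostrophe = "\u{2019}";

// In UTF-8 this one character takes three bytes.
echo bin2hex($apostrophe), "\n"; // prints "e28099"

// Simulate a reader that treats those three bytes as Windows-1252:
// each byte is mapped to its own character and re-encoded as UTF-8.
$misread = mb_convert_encoding($apostrophe, 'UTF-8', 'Windows-1252');

echo $misread, "\n"; // prints "â€™"
```

One character in the feed turns into three on the page, which is why the garbage tends to cluster around quotes, apostrophes, and dashes while plain ASCII text looks fine.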

“But…my page is UTF-8! I have a <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> tag in my header!”

Actually, there’s more to it than that. In addition to specifying the charset in a meta tag in your page’s head, your server also has to declare UTF-8 in the HTTP Content-Type header it sends with the page. If you use Firefox, choose Tools -> Page Info from the menu. In the resulting dialog box, note the two references to the encoding and charset.

Page Info - Character Encoding

Note that on Webmaster-Source, as you can see in the screenshot above, the page’s meta tag specifies UTF-8 and the server encoding is set to UTF-8 as well. Thus, I shouldn’t have any problems with SimplePie output. If you have one set to one encoding and the other set to a different one, you probably will have problems.
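If you would rather check the server side from a script than from the Page Info dialog, something like the following sketch could work. Both function names are hypothetical helpers written for this example, not part of SimplePie, and get_headers() makes a real HTTP request, so it needs network access and allow_url_fopen enabled.

```php
<?php
// Hypothetical helper: pull the charset out of a Content-Type header value.
function charset_from_content_type(string $value): ?string
{
    if (preg_match('/charset\s*=\s*"?([^\s;"]+)/i', $value, $m)) {
        return strtolower($m[1]);
    }
    return null; // no charset parameter present
}

// Hypothetical helper: report the charset a server declares for a URL.
function server_charset(string $url): ?string
{
    $headers = get_headers($url, true); // performs a real HTTP request
    if ($headers === false) {
        return null;
    }
    foreach ($headers as $name => $value) {
        if (is_string($name) && strcasecmp($name, 'Content-Type') === 0) {
            // After redirects the value can be an array; use the last entry.
            $value = is_array($value) ? end($value) : $value;
            return charset_from_content_type($value);
        }
    }
    return null;
}

// Example call (commented out because it hits the network):
// var_dump(server_charset('http://www.webmaster-source.com'));
```

If the result is null or anything other than utf-8 while your meta tag says UTF-8, the two declarations disagree.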

With a PHP script, you can set the encoding with a header() statement at the top, before anything is output. (WordPress already does this, in case you were wondering.)

header("Content-Type: text/html; charset=utf-8");
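The "before anything is output" part is what usually trips people up, since even a blank line before the opening PHP tag counts as output. As a rough sketch, you could guard the call with headers_sent() so a late call is reported instead of silently ignored. The function name send_utf8_header is made up for this example.

```php
<?php
// Hypothetical helper: send the UTF-8 Content-Type header if it is still possible.
// Returns true if the header was set, false if output has already started.
function send_utf8_header(): bool
{
    if (headers_sent($file, $line)) {
        // Too late: something was already output at $file:$line.
        error_log("Cannot set charset, output started at $file:$line");
        return false;
    }
    header('Content-Type: text/html; charset=utf-8');
    return true;
}

$headerSet = send_utf8_header();
```

Call it at the very top of the script, before any HTML and before echoing the SimplePie output.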

If you have UTF-8 specified in both places, then your problems should be gone.

  • http://www.kawentzmann.de Kahuna Kawentzmann

    Hi, I have set everything to UTF-8. It looks like yours in Firefox. But still the simplepie output in the footer contains gibberish, while it is fine on the wordpress.

    http://www.kawentzmann.de
    vs.
    http://www.kawentzmann.de/wordpress/

    Would appreciate any help!

    • http://www.webmaster-source.com Matt

      That’s odd. I assume the Last.fm line is the part you’re talking about?

      Your encoding types look like they check out…

      How are you loading it? SimplePie? Or some other script?