Displaying HTML Entities in Magpie RSS

by Stuart Bowness

We use the Magpie RSS reader to display the list of ‘Recent Journal Entries’ on the front page. It does its job well, but it sometimes fails to display HTML entities like apostrophes and quotes correctly. Instead, they appear as a ‘?’. This problem comes up with some frequency on the Magpie mailing list, but I wasn’t able to find a working solution there.

As it turns out, Magpie doesn’t automatically detect the character encoding of the RSS it reads. You have to manually define the character encoding of both the input RSS and the output XHTML. Since we use UTF-8, adding these two lines of PHP before including Magpie with include('rss_fetch.inc'); did the trick:

define('MAGPIE_INPUT_ENCODING', 'UTF-8');
define('MAGPIE_OUTPUT_ENCODING', 'UTF-8');

If that doesn’t work, you need to check the character encoding yourself. The input encoding is defined at the top of the RSS file, in a line like this: <?xml version="1.0" encoding="UTF-8"?>. The output encoding is that of your XHTML page, usually defined in the Content-Type <meta> tag.
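
Checking the input encoding can also be done in code. The following is a minimal sketch, not part of Magpie: detect_xml_encoding() is a hypothetical helper that reads the encoding attribute from the XML declaration of a feed you have already downloaded as a string. It falls back to UTF-8 when no encoding is declared, which is an assumption that ignores UTF-16 feeds.

```php
<?php
// Illustrative helper, not part of Magpie: returns the encoding named in
// the XML declaration of a feed, or $fallback if none is declared.
function detect_xml_encoding(string $xml, string $fallback = 'UTF-8'): string
{
    // The declaration sits at the very top, so only the start is inspected.
    $head = substr($xml, 0, 200);
    $pattern = '/<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']/i';
    if (preg_match($pattern, $head, $matches)) {
        // Normalise case so "utf-8" and "UTF-8" compare equal.
        return strtoupper($matches[1]);
    }
    return $fallback;
}

// Sample feed text made up for this example.
$feed = '<?xml version="1.0" encoding="ISO-8859-1"?><rss version="2.0"></rss>';
echo detect_xml_encoding($feed), "\n";
```

The returned value could then be passed to define('MAGPIE_INPUT_ENCODING', ...) before Magpie is included, in place of the hard-coded 'UTF-8'.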

38 Responses to “Displaying HTML Entities in Magpie RSS”

  1. This fix didn’t work for me. What did, was defining the output encoding before loading Magpie, as follows:


    define('MAGPIE_OUTPUT_ENCODING', 'UTF-8');
    require('magpierss/rss_fetch.inc');

    Magpie then spits out UTF-8 encoded characters, which can be converted to HTML entities quite easily, e.g.:


    $output = htmlentities($output, ENT_QUOTES, "UTF-8");

    This solution doesn’t require hacking the Magpie source.

    Matthew

  2. Tim Houghton says:

    Thanks Matthew! Your ‘define’ solution worked brilliantly for me (although I had to change the curly single quotes in your comment to straight single quotes, but I’m sure that was a result of the commenting system here and not your intention).

  3. Todd Kempf says:

    Matthew – thanks for your post. Your htmlentities call was exactly what I needed…

  4. Stuart says:

    We have updated our page based on Matthew’s tip with a few modifications. Hope it helps!

  5. Ian says:

    Thank you very much for posting this…I believe you’ve just saved me a TON of time.

  6. Chris Kelley says:

    Excellent solution – worked perfectly, thanks for saving me the headache.

  7. Konrad Burman says:

    Sweet, I was looking for exactly this solution! Thank you!

  8. sweet and easy fix–thanks Nathan and Matthew!

  9. Cliff says:

    Thanks man… it really did the job!

  10. Elmar says:

    Thanks. Exactly what i needed … after i realized, that magpierss was killing my german umlauts

  11. Hein says:

    Great work! Thanks!

  12. dev says:

    thanks for the quick fix!

  13. Esther says:

    This worked for me too… Excellent!

  14. Arnau says:

    Thank you! :)

  15. Simon says:

    great thanks, worked for me

  16. Abed says:

    My thanks to Nathan & Matthew. Precisely what I was looking for.

  17. john karalis says:

    Hello guys, I think I have the same problem. I am using MagpieRSS under Pligg, and in some RSS imports I can’t see the same HTML entities correctly. For instance I can’t see

  18. name req says:

    Erm, where to put these two lines!????!?!?!?!

  19. Before you include the magpie rss in your include syntax.

  20. Great article, this addresses the problem perfectly. Simple, concise, well done.

    Thank you.

  21. manish says:

    thnx for soln

  22. Geoffrey says:

    I am using Magpie RSS to get RSS data from Hong Kong Government, and it works great, All organised into a nice Array for my Manipulation, and with the Proxy option, Brilliant, Fits my Requirement pretty well.
    Question now is, is it possible to easily get the Magpie RSS to return a HTML page that is not an RSS feed?
    I have tried the normal Link, and it Returns = 1.
    If I could get it to do this, I don’t need to write a second retrieve, Cache and Proxy application, when Magpie does 90% of what I need.
    If there isnt a quick solution, a pointer into which area I should look at would be helpful!

    Thanks

  23. I’m not sure if Magpie by default has the capability to return an html page instead of an rss feed. This is something that could easily be programmed though. It’s just as simple as looping through the array and displaying the results in an html format.

  24. Thanks for posting this solution – a big help!

  25. Benni says:

    Thanks for this great advice. It helped me out a lot.

  26. Suzette says:

    Where does the output line of text go? After $rss = fetch_rss( $url );? I am so confused!
    $output = htmlentities($output, ENT_QUOTES, "UTF-8");

    When I add the line define('MAGPIE_OUTPUT_ENCODING', 'UTF-8'); above, I get weird characters instead of ?, I now get things like:

    London Miles Gallery continues to bring it in the UK…
    Viva La Art Benefit Auction this Saturday…

    Any help would be much appreciated!

  27. You declare the lines before you include Magpie, i.e.:

    define('MAGPIE_INPUT_ENCODING', 'UTF-8');
    define('MAGPIE_OUTPUT_ENCODING', 'UTF-8');
    include('rss_fetch.inc');

  28. Suzette says:

    I guess my question was about the usage of htmlentities(), I figured it out, I used it only on the title and it worked perfect:

    $title = $item['title'];
    $title = htmlentities($title, ENT_QUOTES, "UTF-8");

  29. Francesco says:

    Thank you very much for the quick tip! Char encoding problems are always the weirdest thing to debug!
    Francesco

  30. Vats Thakur says:

    thanks great post!

  31. rallat says:

    Thanks man
    Your advice save me the day :)

  32. Luke says:

    Hello, thanks for the post but I have been unable to get this working at all. I’ve tried the htmlentities() method, setting output encoding to UTF-8 as well as trying iso-8859-1… nothing seems to work; Magpie keeps returning ‘?’ characters. Could this be because I am having to use require_once('rss_fetch.inc'); instead of include('rss_fetch.inc');? Currently using MagpieRSS v0.7a.

    Cheers.

  33. Luke says:

    All good, just had to add the define’s inside rss_fetch.inc. :S

  34. gihan says:

    Thankx Stuart you saved me from lot of works

  35. Ben Brown says:

    Thank you, kind internet gentlepeople! You just fixed my bug.

  36. Kris says:

    Thank you! This worked incredibly without much hassle at all – if only I’d found it before all the previous hassle :P

    @ Suzette – I tried it quickly below the line “$rss = fetch_rss( $url );” but it worked above. I guess it needs to do the conversion before the rss_fetch function. Also you would need to change $output to $rss in any of your cases (and mine :P )

    Thanks!

  37. Thanks for the little tutorial. Any idea why, when I set the output to UTF-8, all of the content would simply disappear? I set it to ISO-8859-1, I have the question marks; I set it to UTF-8, no data comes through at all. I have empty HTML.

  38. rd says:

    Still doesn’t work for me..May I know where to put this line?
    $output = htmlentities($output, ENT_QUOTES, "UTF-8");
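
Several comments above ask where the htmlentities() line belongs. The following is a minimal sketch of one way to apply it, following the per-title approach from comment 28. It assumes Magpie was loaded with MAGPIE_OUTPUT_ENCODING set to UTF-8, so that titles arrive as UTF-8 strings. The $items array is a hand-written stand-in for the items Magpie returns, and escape_title() is a hypothetical helper, not part of Magpie.

```php
<?php
// Illustrative helper, not part of Magpie: converts one item's title
// from UTF-8 text to HTML entities, as in comment 28.
function escape_title(array $item): string
{
    // Items without a title are treated as having an empty one.
    $title = $item['title'] ?? '';
    return htmlentities($title, ENT_QUOTES, 'UTF-8');
}

// Stand-in data for this example. In real use the items would come
// from Magpie after calling fetch_rss($url).
$items = [
    ['title' => "It\u{2019}s a \u{201C}quoted\u{201D} title"],
    ['title' => 'Tom & Jerry'],
];

// The conversion happens here, at output time, once per title.
foreach ($items as $item) {
    echo escape_title($item), "\n";
}
```

The point of the sketch is the ordering: the define() lines go before Magpie is included, while htmlentities() is applied afterwards to each string just before it is printed, not to the whole fetched feed.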
