So, I noticed earlier today that none of my posts were showing up in Technorati. I’ve found that whenever this happens, and it happens relatively often, the problem stems from the HTML on this site not validating properly. Since I’m using XHTML 1.0 Strict, this means that anything the least bit unusual generally breaks the page.
I have a couple of custom fixes I’ve had to make to my WordPress installation, and they disappear every time I upgrade WordPress. I have to figure this out, by the way, because I can’t be the only one affected by this, so I must be doing something wrong… or it’s one of my custom-written plugins. Hmmm.
Anyway, I tried embedding the YouTube player today and that destroyed the page. First of all, the embed tag isn’t part of HTML 4 or XHTML 1.0, so I can’t include it in the post and still validate, which means I can’t let Firefox users see the player. So, I decided to use SWFObject, a JS-based Flash embedder. It went downhill from there.
Between X-Valid and the default wpautop and wptexturize, the whole script tag was getting destroyed. The CDATA block was getting ripped out, the > in the CDATA close tag was getting converted to &gt;, and there were <p> tags everywhere.
Even disabling wpautop and wptexturize (via the TextControl plugin) and disabling X-Valid for that post didn’t help. Everything was fine except something kept killing the final > on the CDATA close tag, which is supposed to be ]]>. After worrying about it, then thinking that I’d need to make another patch to my WordPress install (noooooo!), I just gave up, removed the CDATA blocks, and made sure my JS didn’t have any characters that would break validation.
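For what it’s worth, that last step can be checked mechanically. Here’s a rough sketch, assuming the only things that matter are <, & and the ]]> sequence, which are what make an inline script invalid XHTML when it isn’t wrapped in CDATA. The function name findXhtmlUnsafe is made up for illustration; it isn’t part of WordPress or SWFObject.

```javascript
// Hypothetical helper (not from WordPress or SWFObject): lists the
// markup-significant sequences in an inline script that would need a
// CDATA wrapper, or escaping, to stay valid XHTML.
function findXhtmlUnsafe(source) {
  const problems = [];
  const patterns = [
    { name: "<", regex: /</g },
    { name: "&", regex: /&/g },
    { name: "]]>", regex: /\]\]>/g },
  ];
  for (const { name, regex } of patterns) {
    // Record every occurrence with its position in the script text.
    for (const match of source.matchAll(regex)) {
      problems.push({ sequence: name, index: match.index });
    }
  }
  return problems;
}

// A script with comparisons and && needs CDATA (or a rewrite).
console.log(findXhtmlUnsafe("if (a < b && c) {}"));
// A plain call with no special characters comes back empty.
console.log(findXhtmlUnsafe('so.write("player");'));
```

An empty result means the script can sit in the post without a CDATA block, which is the state I ended up forcing by hand.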
This isn’t a workable solution, so I’ll have to dig further. A quick Google search suggests that it might be the rich editor, but I’ll have to look later. Holy crap, though, this is an insanely annoying bug.
Update: And my custom fixes have been removed. CDATA elements still break validation, but the other issue I mentioned, the one that needed a custom patch, is now resolved. Turns out it was this ancient (by Internet standards) tagging plugin I was using to display the tags on each post. I’ve switched to Christine Davis’s excellent Ultimate Tag Warrior plugin. This plugin should be held up as a poster child for well-thought-out plugins, at least from a user standpoint. It imported the tags from the old plugin and set them up. There was one problem with the + signs I used for multi-word tags, but a little SQL magic and we’re all set. Still looking for a fix for the CDATA issue, but that will have to wait. I need to get to bed so I can finish a spec tomorrow morning.

June 19th, 2006 at 2:39 PM
why not go to Transitional instead of Strict for <embed>s?
June 19th, 2006 at 2:59 PM
This is the only place I can screw up that gets traffic. It’s a good learning exercise to see what breaks with Strict, so I’m going to stick with Strict so I can learn. I’ve found a ton of small bugs with WordPress and wpautop and wptexturize. I may even report them one day…
The ESPN SOB js or the open source SWFObject one that I’m using here essentially hides this stuff from validators anyway and only browsers that actually know about embed will get it.
Sujal
May 4th, 2007 at 5:11 PM
I ran into the same problem trying to embed a Flash video into my posts. The culprit is the function the_content(…) in wp-includes/post-template.php. It contains a str_replace specifically to garble the ]]> tags into ]]&gt;. Not sure why you would want to special-case that (except with the intent of breaking things just like this…).
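A rough sketch of what that replacement does to an inline script, written in JavaScript rather than PHP so it runs standalone. The function name escapeCdataClose and the sample post are made up; the real code is the str_replace inside the_content().

```javascript
// Hypothetical stand-in for the replacement described above: every
// "]]>" in the post content becomes "]]&gt;".
function escapeCdataClose(content) {
  return content.split("]]>").join("]]&gt;");
}

// Made-up post body with a CDATA-wrapped inline script.
const post = [
  '<script type="text/javascript">',
  "//<![CDATA[",
  "var ready = 1 > 0;",
  "//]]>",
  "</script>",
].join("\n");

// The lone ">" inside the script is untouched; only the CDATA
// close marker is rewritten.
console.log(escapeCdataClose(post));
```

After the replacement the CDATA section opens but never closes, so the page is no longer well-formed, which matches the validation failure described in the post.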