<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Matthew Turland &#187; XML</title>
	<atom:link href="http://matthewturland.com/tag/xml/feed/" rel="self" type="application/rss+xml" />
	<link>http://matthewturland.com</link>
	<description></description>
	<lastBuildDate>Tue, 24 Jan 2012 04:03:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Renaming a DOMNode in PHP</title>
		<link>http://matthewturland.com/2010/02/09/renaming-a-domnode-in-php/</link>
		<comments>http://matthewturland.com/2010/02/09/renaming-a-domnode-in-php/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 01:07:14 +0000</pubDate>
		<dc:creator>Matthew Turland</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[DOM]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[Web Scraping]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://matthewturland.com/?p=218</guid>
		<description><![CDATA[A recent work assignment had me using PHP to pull HTML data into a DOMDocument instance and renaming some elements, such as b to strong or i to em. As it turns out, renaming elements using the DOM extension is rather tedious. Version 3 of the DOM standard introduces a renameNode() method, but the PHP [...]]]></description>
			<content:encoded><![CDATA[<p>A recent work assignment had me using PHP to pull HTML data into a <code><a title="PHP: DOMDocument - Manual" href="http://php.net/manual/en/class.domdocument.php">DOMDocument</a></code> instance and renaming some elements, such as <a title="HTML element - Wikipedia, the free encyclopedia" href="http://en.wikipedia.org/wiki/HTML_element#Presentation">b to strong or i to em</a>. As it turns out, renaming elements using the DOM extension is rather tedious.</p>
<p>Version 3 of the DOM standard introduces a <code><a title="Document Object Model Core" href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#Document3-renameNode">renameNode()</a></code> method, but the PHP DOM extension doesn&#8217;t currently support it.</p>
<p>The <code><a title="PHP: DOMNode - Manual" href="http://php.net/manual/en/class.domnode.php#domnode.props.nodename">$nodeName</a></code> property of the <code><a title="PHP: DOMNode - Manual" href="http://php.net/manual/en/class.domnode.php">DOMNode</a></code> class is read-only, so it can&#8217;t be changed that way.</p>
<p>A node can be created with a different name in the same document, but if you specify a value to go along with it, any entities in that value are automatically encoded, so it&#8217;s not possible to pass in the intended inner content of a node if it contains other nodes.</p>
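<p>A minimal sketch of that limitation, using made-up element names and sample text as assumptions: the value passed to <code>createElement()</code> ends up as text content, so markup inside it is not parsed into child elements.</p>
<pre class="brush: php; title: ; notranslate">// Illustrative only: the element names and sample value are assumptions.
$doc = new DOMDocument();
$strong = $doc-&gt;createElement('strong', 'some &lt;em&gt;nested&lt;/em&gt; text');
$doc-&gt;appendChild($strong);

// The value becomes a text node rather than an em child element.
var_dump($strong-&gt;getElementsByTagName('em')-&gt;length); // int(0)
var_dump($strong-&gt;firstChild instanceof DOMText);        // bool(true)</pre>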
<p>The only method I&#8217;ve found that works is to replicate the attributes and child nodes of the original node. Attributes are fairly easy, but I ran into an issue replicating children where only the first child of any given node was replicated within its intended replacement and the remaining children were omitted. Here&#8217;s the original code that was exhibiting this behavior.</p>
<pre class="brush: php; title: ; notranslate">foreach ($oldNode-&gt;childNodes as $childNode) {
    $newNode-&gt;appendChild($childNode);
}</pre>
<p>The reason for this behavior is that the <code><a title="PHP: DOMNode - Manual" href="http://php.net/manual/en/class.domnode.php#domnode.props.childnodes">$childNodes</a></code> property of <code>$oldNode</code> is a live list that is implicitly modified when <code>$childNode</code> is transferred from it to <code>$newNode</code>, so the iterator&#8217;s pointer to the next child in <code>$childNodes</code> is no longer accurate.</p>
<p>To get around this, I took advantage of the fact that any node with any child nodes will always have a <code><a title="PHP: DOMNode - Manual" href="http://php.net/manual/en/class.domnode.php#domnode.props.firstchild">$firstChild</a></code> property pointing to the first one. The modified code that takes this approach is below and has the behavior I originally set out to implement.</p>
<pre class="brush: php; title: ; notranslate">while ($oldNode-&gt;firstChild) {
    $newNode-&gt;appendChild($oldNode-&gt;firstChild);
}</pre>
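<p>An alternative sketch, shown only for comparison with the loop above: copy the child list into a plain array with <code>iterator_to_array()</code> before moving anything, so that transferring nodes can&#8217;t disturb the iteration. The sample document and its element names are assumptions for illustration.</p>
<pre class="brush: php; title: ; notranslate">// Illustrative document; the element names are assumptions.
$doc = new DOMDocument();
$doc-&gt;loadXML('&lt;root&gt;&lt;old&gt;&lt;a/&gt;&lt;b/&gt;&lt;c/&gt;&lt;/old&gt;&lt;new/&gt;&lt;/root&gt;');
$oldNode = $doc-&gt;getElementsByTagName('old')-&gt;item(0);
$newNode = $doc-&gt;getElementsByTagName('new')-&gt;item(0);

// Snapshot the child list into an array, then move each child.
foreach (iterator_to_array($oldNode-&gt;childNodes) as $childNode) {
    $newNode-&gt;appendChild($childNode);
}

var_dump($newNode-&gt;childNodes-&gt;length); // int(3)
var_dump($oldNode-&gt;childNodes-&gt;length); // int(0)</pre>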
<p>If you&#8217;re curious, below is the full code segment for renaming a node.</p>
<pre class="brush: php; title: ; notranslate">$newNode = $oldNode-&gt;ownerDocument-&gt;createElement('new_element_name');
if ($oldNode-&gt;attributes-&gt;length) {
    foreach ($oldNode-&gt;attributes as $attribute) {
        $newNode-&gt;setAttribute($attribute-&gt;nodeName, $attribute-&gt;nodeValue);
    }
}
while ($oldNode-&gt;firstChild) {
    $newNode-&gt;appendChild($oldNode-&gt;firstChild);
}
$oldNode-&gt;parentNode-&gt;replaceChild($newNode, $oldNode);</pre>
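<p>For reuse, the same steps can be wrapped in a function. This is only a sketch: the name <code>renameElement()</code> and the sample markup are assumptions, it expects a node that is still attached to a parent, and it ignores namespaced attributes. It calls <code>replaceChild()</code> on <code>$oldNode-&gt;parentNode</code> because the node being replaced has to be a direct child of the node the method is called on.</p>
<pre class="brush: php; title: ; notranslate">// Hypothetical helper; the name renameElement() is an assumption.
function renameElement(DOMElement $oldNode, $newName)
{
    $newNode = $oldNode-&gt;ownerDocument-&gt;createElement($newName);
    // Copy attributes by name and value.
    foreach ($oldNode-&gt;attributes as $attribute) {
        $newNode-&gt;setAttribute($attribute-&gt;nodeName, $attribute-&gt;nodeValue);
    }
    // Move children one at a time; firstChild is re-read on every pass.
    while ($oldNode-&gt;firstChild) {
        $newNode-&gt;appendChild($oldNode-&gt;firstChild);
    }
    // Swap the nodes via the direct parent of $oldNode.
    $oldNode-&gt;parentNode-&gt;replaceChild($newNode, $oldNode);
    return $newNode;
}

// Usage with made-up markup.
$doc = new DOMDocument();
$doc-&gt;loadXML('&lt;p&gt;Some &lt;b class="x"&gt;bold &lt;i&gt;text&lt;/i&gt;&lt;/b&gt; here&lt;/p&gt;');
renameElement($doc-&gt;getElementsByTagName('b')-&gt;item(0), 'strong');
echo $doc-&gt;saveXML($doc-&gt;documentElement);
// &lt;p&gt;Some &lt;strong class="x"&gt;bold &lt;i&gt;text&lt;/i&gt;&lt;/strong&gt; here&lt;/p&gt;</pre>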
<p>Another potential &#8220;gotcha&#8221; is the argument order of the <code><a title="PHP: DOMNode::replaceChild - Manual" href="http://php.net/manual/en/domnode.replacechild.php">replaceChild()</a></code> method, which is the new node followed by the old node rather than the reverse that most people might expect. Thanks to <a title="joshua may (notjosh) on Twitter" href="http://twitter.com/notjosh">Joshua May</a> for pointing that one out to me; I might never have understood why I was getting a <a title="PHP: DOMNode::appendChild - Manual" href="http://php.net/manual/en/domnode.appendchild.php#domnode.appendchild.errors">&#8220;Not Found Error&#8221;</a> <code><a title="PHP: DOMException - Manual" href="http://php.net/manual/en/class.domexception.php">DOMException</a></code> otherwise.</p>
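<p>A small sketch of that argument-order pitfall, using a made-up document: with the arguments reversed, the second argument is not a child of the node the method is called on, which raises a <code>DOMException</code> whose code is <code>DOM_NOT_FOUND_ERR</code>.</p>
<pre class="brush: php; title: ; notranslate">// Illustrative document; the element names are assumptions.
$doc = new DOMDocument();
$doc-&gt;loadXML('&lt;root&gt;&lt;old/&gt;&lt;/root&gt;');
$oldNode = $doc-&gt;documentElement-&gt;firstChild;
$newNode = $doc-&gt;createElement('new');

$caught = false;
try {
    // Arguments reversed: $newNode is not yet a child of the root element.
    $doc-&gt;documentElement-&gt;replaceChild($oldNode, $newNode);
} catch (DOMException $e) {
    $caught = ($e-&gt;getCode() === DOM_NOT_FOUND_ERR);
}
var_dump($caught); // bool(true)

// Correct order: the new node first, then the node being replaced.
$doc-&gt;documentElement-&gt;replaceChild($newNode, $oldNode);</pre>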
]]></content:encoded>
			<wfw:commentRss>http://matthewturland.com/2010/02/09/renaming-a-domnode-in-php/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>DomQuery Update</title>
		<link>http://matthewturland.com/2009/03/28/domquery-update/</link>
		<comments>http://matthewturland.com/2009/03/28/domquery-update/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 02:26:41 +0000</pubDate>
		<dc:creator>Matthew Turland</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[DOM]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[I think it&#8217;s mostly flown under the radar, but one of my smaller projects is a class called DomQuery that is built on top of DOM and the SPL ArrayObject. The functionality it provides is somewhat similar to jQuery, but it&#8217;s different in that it does so programmatically through the API rather than using an [...]]]></description>
			<content:encoded><![CDATA[<p>I think it&#8217;s mostly flown under the radar, but one of my smaller projects is a class called <a href="http://matthewturland.com/2008/05/25/domquery/" title="i should be coding :: domquery">DomQuery</a> that is built on top of <a href="http://us2.php.net/dom" title="PHP: DOM - Manual">DOM</a> and the <a href="http://us3.php.net/spl" title="PHP: SPL - Manual">SPL</a> <a href="http://php.net/manual/en/class.arrayobject.php" title="PHP: ArrayObject - Manual">ArrayObject</a>. The functionality it provides is somewhat similar to <a href="http://jquery.com/" title="jQuery: The Write Less, Do More, JavaScript Library">jQuery</a>, but it&#8217;s different in that it does so programmatically through the API rather than using an expression parser.</p>
<p>This post is mainly to inform anyone who might be interested that I&#8217;ve moved the project from its old home at <a href="http://www.assembla.com/" title="Welcome | Assembla">Assembla</a> to a new repository on <a href="http://github.com/elazar/domquery" title="elazar's domquery at master - GitHub">github</a>. I&#8217;ve been enjoying my use of <a href="http://git-scm.com/" title="Git - Fast Version Control System">git</a> for version control of other projects and it seems an appropriate place to house DomQuery to allow other people to play with it. I haven&#8217;t had time recently to make many updates, but hope that will change in the short term. If you haven&#8217;t used DomQuery, why not try it today?</p>
]]></content:encoded>
			<wfw:commentRss>http://matthewturland.com/2009/03/28/domquery-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DomQuery</title>
		<link>http://matthewturland.com/2008/05/25/domquery/</link>
		<comments>http://matthewturland.com/2008/05/25/domquery/#comments</comments>
		<pubDate>Sun, 25 May 2008 04:26:53 +0000</pubDate>
		<dc:creator>Matthew Turland</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Ever since I started working with the jQuery JavaScript library, I&#8217;ve loved it. It offers the power to do a lot with only a little code and makes features offered by the JavaScript DOM implementation much easier to access. My interest in web scraping prompted me to consider creating an equivalent of sorts for PHP. [...]]]></description>
			<content:encoded><![CDATA[<p>Ever since I started working with the <a title="jQuery: The Write Less, Do More, JavaScript Library" href="http://jquery.com">jQuery</a> JavaScript library, I&#8217;ve loved it. It offers the power to do a lot with only a little code and makes features offered by the JavaScript <a title="Document Object Model - Wikipedia, the free encyclopedia" href="http://en.wikipedia.org/wiki/Document_Object_Model"><acronym title="Document Object Model">DOM</acronym></a> implementation much easier to access. My interest in web scraping prompted me to consider creating an equivalent of sorts for PHP.</p>
<p>This obviously doesn&#8217;t include some features specific to the client side or any that require evaluating CSS, but it does include many for extracting data from a valid XML or HTML document. I&#8217;ve posted my initial work on the concept in a <a title="DomQuery" href="http://github.com/elazar/domquery">GitHub repository</a>. The code there is commented with docblocks and includes unit tests with over 99% code coverage. Comments and suggestions are welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://matthewturland.com/2008/05/25/domquery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
