<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MarkupAsAnApi &#187; screen scraping</title>
	<atom:link href="http://www.markupasanapi.com/tag/screen-scraping/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.markupasanapi.com</link>
	<description>Publish once, publish everywhere</description>
	<lastBuildDate>Mon, 07 Sep 2009 03:56:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Scraping HTML with innerHTML or jQuery</title>
		<link>http://www.markupasanapi.com/2007/09/11/scraping-html-with-innerhtml-or-jquery/</link>
		<comments>http://www.markupasanapi.com/2007/09/11/scraping-html-with-innerhtml-or-jquery/#comments</comments>
		<pubDate>Tue, 11 Sep 2007 08:54:30 +0000</pubDate>
		<dc:creator>halans</dc:creator>
				<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[DOM]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[innerHTML]]></category>
		<category><![CDATA[jquery]]></category>
		<category><![CDATA[screen scraping]]></category>
		<category><![CDATA[screenscraping]]></category>

		<guid isPermaLink="false">http://www.markupasanapi.com/?p=5</guid>
		<description><![CDATA[A couple of nice write-ups on how to scrape HTML using innerHTML at Pathfinder Development: A common solution has been to proxy and scrape an application with a combination of XQuery and TagSoup (to fix the ugly, broken HTML, dontcha know), but it is possible to do this purely in the browser. or with jQuery, [...]<p>Post from <a href="http://www.halans.com">Jean-Jacques Halans</a> <a href="http://www.markupasanapi.com">MarkupAsAnApi</a> blog.<br/><br/><a href="http://www.markupasanapi.com/2007/09/11/scraping-html-with-innerhtml-or-jquery/">Scraping HTML with innerHTML or jQuery</a></p>
]]></description>
			<content:encoded><![CDATA[<p>Two nice write-ups on scraping HTML in the browser. The first, at <a title="Pathfinder" href="http://www.pathf.com/blogs/2007/09/parsing-html-wi/">Pathfinder Development</a>, uses innerHTML:</p>
<blockquote><p>A common solution has been to proxy and scrape an application with a combination of XQuery and TagSoup (to fix the ugly, broken HTML, dontcha know), but it is possible to do this purely in the browser.</p></blockquote>
<p>The second, by <a title="Scraping with jQuery" href="http://jan.varwig.org/archiv/scraping-pages-with-jquery">Jan Varwig</a>, uses jQuery:</p>
<blockquote><p>Fortunately, just the day before, I discovered <a href="http://www.jquery.com/">jQuery</a>, a Javascript framework with strong support for <a href="http://docs.jquery.com/DOM/Traversing/Selectors">finding DOM-Nodes via CSS, XPath and some custom selectors</a>. The tricky part now was to get jQuery to access the DOM-Tree of the schedule page on kino.de.</p></blockquote>
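<p>To make the in-browser approach concrete, here is a minimal sketch. It is not code from either write-up. It assumes the HTML is already available as a string (for example, fetched from the same origin or through a proxy), and the function name <code>scrapeLinks</code> and its <code>doc</code> parameter are hypothetical names chosen for illustration:</p>

```javascript
// Sketch: parse an HTML string by assigning it to the innerHTML of a
// detached element, then read the links back out of the resulting DOM.
// `scrapeLinks` and `doc` are hypothetical names, not part of any library.
function scrapeLinks(html, doc) {
  // Fall back to the page's own document when running in a browser.
  const owner = doc || document;

  // A detached container: the browser's HTML parser builds a DOM tree
  // from the string, and nothing is attached to the visible page.
  const container = owner.createElement('div');
  container.innerHTML = html;

  // Collect the text and the href attribute of every anchor found.
  return Array.from(container.getElementsByTagName('a')).map((a) => ({
    text: a.textContent.trim(),
    href: a.getAttribute('href'),
  }));
}
```

<p>In a browser, passing a fragment such as <code>&lt;a href="/home"&gt;Home&lt;/a&gt;</code> returns a single entry with the text "Home" and the href "/home". The same container could just as well be queried with jQuery selectors instead of <code>getElementsByTagName</code>.</p>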
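<p>Of course, screen scraping would be so much easier if sites were built with Web Standards: valid, semantic markup can be queried directly, without a clean-up step for broken HTML.</p>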
]]></content:encoded>
			<wfw:commentRss>http://www.markupasanapi.com/2007/09/11/scraping-html-with-innerhtml-or-jquery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
