<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>acloudtree &#187; Database</title>
	<atom:link href="http://www.acloudtree.com/category/database/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.acloudtree.com</link>
	<description>Programming, Computers, Writing, Economics, and Life</description>
	<lastBuildDate>Tue, 07 Feb 2012 00:03:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>(Nerd) Mechanize &amp; Javascript</title>
		<link>http://www.acloudtree.com/nerd-mechanize-javascript/</link>
		<comments>http://www.acloudtree.com/nerd-mechanize-javascript/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 17:02:04 +0000</pubDate>
		<dc:creator>jared.folkins</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[OOP]]></category>
		<category><![CDATA[REGEX]]></category>
		<category><![CDATA[Scraping]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Mechanize]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.acloudtree.com/?p=101</guid>
		<description><![CDATA[This is from the Mechanize site; I wish I had read it before I started. Since Javascript is completely visible to the client, it cannot be used to prevent a scraper from following links. But it can make life difficult, and until someone writes a Javascript interpreter for Perl or a Mechanize clone to control [...]]]></description>
			<content:encoded><![CDATA[<p>This is from the Mechanize site; I wish I had read it before I started.</p>
<blockquote><p>Since Javascript is completely visible to the client, it cannot be used to prevent a scraper from following links. But it can make life difficult, and until someone writes a Javascript interpreter for Perl or a Mechanize clone to control Firefox, there will be no general solution. But if you want to scrape specific pages, then a solution is always possible.</p>
<p>One typical use of Javascript is to perform argument checking before posting to the server. The URL you want is probably just buried in the Javascript function. Do a regular expression match on $mech-&gt;content() to find the link that you want and $mech-&gt;get it directly (this assumes that you know what you are looking for in advance).</p>
<p>In more difficult cases, the Javascript is used for URL mangling to satisfy the needs of some middleware. In this case you need to figure out what the Javascript is doing (why are these URLs always really long?). There is probably some function with one or more arguments which calculates the new URL. Step one: using your favorite browser, get the before and after URLs and save them to files. Edit each file, converting the argument separators (&#8216;?&#8217;, &#8216;&amp;&#8217; or &#8216;;&#8217;) into newlines. Now it is easy to use diff or comm to find out what Javascript did to the URL. Step 2 &#8211; find the function call which created the URL &#8211; you will need to parse and interpret its argument list. Using the Javascript Debugger Extension for Firefox may help with the analysis. At this point, it is fairly trivial to write your own function which emulates the Javascript for the pages you want to process.</p>
<p>Here&#8217;s another approach that answers the question, &#8220;It works in Firefox, but why not Mech?&#8221; Everything the web server knows about the client is present in the HTTP request. If two requests are identical, the results should be identical. So the real question is &#8220;What is different between the mech request and the Firefox request?&#8221;</p>
<p>The Firefox extension &#8220;Tamper Data&#8221; is an effective tool for examining the headers of the requests to the server. Compare that with what LWP is sending. Once the two are identical, the action of the server should be the same as well.</p>
<p>I say &#8220;should&#8221;, because this is an oversimplification &#8211; some values are naturally unique, e.g. a SessionID, but if a SessionID is present, that is probably sufficient, even though the value will be different between the LWP request and the Firefox request. The server could use the session to store information which is troublesome, but that&#8217;s not the first place to look (and highly unlikely to be relevant when you are requesting the login page of your site).</p>
<p>Generally the problem is to be found in missing or incorrect POSTDATA arguments, Cookies, User-Agents, Accepts, etc. If you are using mech, then redirects and cookies should not be a problem, but are listed here for completeness. If you are missing headers, $mech-&gt;add_header can be used to add the headers that you need.</p></blockquote>
<p><a title="Mechanize" href="http://search.cpan.org/dist/WWW-Mechanize/lib/WWW/Mechanize/FAQ.pod#I_have_this_web_page_that_has_JavaScript_on_it,_and_my_Mech_program_doesn%27t_work." target="_blank">LINK</a></p>
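<p>The first two suggestions in the quoted FAQ can be sketched in a few lines. This is only an illustration in Python using the standard library, not WWW::Mechanize itself; the function names and the URL pattern are assumptions made for the example, and the pattern only catches absolute http(s) URLs that appear as quoted strings in the page source.</p>

```python
import re
from difflib import ndiff

# Hypothetical pattern: assumes the target URL appears as a quoted,
# absolute http(s) string somewhere in the page's inline Javascript.
URL_IN_JS = re.compile(r"""["'](https?://[^"']+)["']""")


def find_js_urls(page_source):
    """Return every quoted absolute URL found in the page source, in order."""
    return URL_IN_JS.findall(page_source)


def split_url_args(url):
    """Put each URL argument on its own item by splitting on '?', '&' and ';'."""
    return re.split(r"[?&;]", url)


def diff_urls(before, after):
    """Return only the argument lines that were removed ('- ') or added ('+ ')."""
    lines = ndiff(split_url_args(before), split_url_args(after))
    return [line for line in lines if line.startswith(("- ", "+ "))]
```

<p>For example, comparing <code>http://example.com/s?q=perl&amp;page=1</code> with <code>http://example.com/s?q=perl&amp;page=1&amp;token=abc123</code> through <code>diff_urls</code> leaves a single line, <code>+ token=abc123</code>, which is the argument the Javascript added. Working out how that value is computed is still the manual step the FAQ describes.</p>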
]]></content:encoded>
			<wfw:commentRss>http://www.acloudtree.com/nerd-mechanize-javascript/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Bend Oregon Real Estate and why it holds</title>
		<link>http://www.acloudtree.com/bend-oregon-real-estate-and-why-it-holds/</link>
		<comments>http://www.acloudtree.com/bend-oregon-real-estate-and-why-it-holds/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 23:22:49 +0000</pubDate>
		<dc:creator>jared.folkins</dc:creator>
				<category><![CDATA[bend]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Economy]]></category>
		<category><![CDATA[housing]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[oregon]]></category>
		<category><![CDATA[Politics]]></category>
		<category><![CDATA[bubble]]></category>
		<category><![CDATA[burden]]></category>
		<category><![CDATA[debt]]></category>
		<category><![CDATA[Estate]]></category>
		<category><![CDATA[foreclosure]]></category>
		<category><![CDATA[Real]]></category>

		<guid isPermaLink="false">http://www.acloudtree.com/bend-oregon-real-estate-and-why-it-holds/</guid>
		<description><![CDATA[The other day, I was talking with a good buddy. And like always, economics and housing came up in the conversation. “Jared,” he started, “I was speaking with my wife,” stroking his thick beard and recalling as he went, “and you know how you told me your annual wage? Is that just your wage?” I nodded. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">The other day, I was talking with a good buddy. And like always, economics and housing came up in the conversation.</p>
<p class="MsoNormal">“Jared,” he started, “I was speaking with my wife,” stroking his thick beard and recalling as he went, “and you know how you told me your annual wage? Is that <u>just</u> your wage?”</p>
<p class="MsoNormal">I nodded.</p>
<p class="MsoNormal">“Well, why don’t you go and negotiate yourself into a home? I know you’re smart. But you can’t <u>put your life on hold</u> just to get the lowest home price.” Our conversation moved on from there. But it got me thinking.</p>
<p class="MsoNormal"><em>Am I putting my life on hold?</em></p>
<p class="MsoNormal"><span id="more-86"></span></p>
<p class="MsoNormal">And since I like to think and self-reflect, I spent the next two days doing so. Here is what I came up with.</p>
<p class="MsoNormal">My friends purchased a home in 2006 for $400k. Homes in the area currently sell for $190k-$220k, and prices are still falling. This couple would love to start having children, but they literally cannot afford them. They need both incomes, and the hit they would take from paying for childcare would tip their sinking ship. Not to mention his hours were recently cut.</p>
<p class="MsoNormal">That’s putting your life on hold.</p>
<p class="MsoNormal">A single guy I know purchased a home for $300k. After he purchased the home, his pay was significantly cut, and he is extremely worried about layoffs. Being proactive, he found a better job near Seattle but doesn’t know if he should take it, since he is unable to sell his home.</p>
<p class="MsoNormal">That’s putting your life on hold.</p>
<p class="MsoNormal">Another family was struggling to make ends meet. They had heard that I had written a program to collect the county’s data from the county website. They had me over for dinner, and I showed them the data I had collected and the trends it was showing. The next day they dropped the selling price on their rental by $50k and, miraculously, were able to sell the home. A month later Lehman Brothers collapsed. After that, the man lost his job. He ended up finding another one, but they had to move for it, and unfortunately they were unable to sell their private residence. He is currently trying to rent it out but cannot find anyone to cover the mortgage. So the family goes with state-funded health care for their children, and no health care for themselves, while dropping $1,000 a month on a home they don&#8217;t use.</p>
<p class="MsoNormal">That’s putting your life on hold.</p>
<p class="MsoNormal">A couple moved here in 2005. They wanted to buy a home, but asked why my wife and I were waiting. I went on to show them the data I had collected and encouraged them to wait things out. She recently lost her job with the school district up North. And his job was on the ropes. Being a teacher but unable to find work, guess what they did. They moved to freaking China.</p>
<p class="MsoNormal">That’s putting your life on hold.</p>
<p class="MsoNormal">One of my wife’s childhood friends noticed that prices in their area had declined by 50%. So in April of this year, what did she and her husband do? They bought a home. The day after they signed, her husband lost his job. This month they missed their first payment, and they are worried that they will have to move back into her parents’ house.</p>
<p class="MsoNormal">That’s putting your life on hold.</p>
<p class="MsoNormal">My wife and I look around us and feel incredibly blessed. In October we will hopefully have a beautiful baby girl come bouncing into our world. Things have actually been very busy in preparation for the baby. There are baby showers, shopping, parenting classes, and more showers, plus putting together the crib and picking out baby names. The anticipation grows from these events, making me long to meet my little girl.</p>
<p class="MsoNormal">And this is when I realize that our life is moving too fast for anything or anyone to slow it down. And should I <u><strong>never</strong></u> own a home, so that we can <u><strong>afford</strong></u> my daughter the best possible upbringing and my family won’t be debt-burdened and have to <u><strong>put their lives or their dreams on hold</strong></u>, well then, I am totally cool with that.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.acloudtree.com/bend-oregon-real-estate-and-why-it-holds/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Let my dataset change your mindset</title>
		<link>http://www.acloudtree.com/let-my-dataset-change-your-mindset/</link>
		<comments>http://www.acloudtree.com/let-my-dataset-change-your-mindset/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 01:11:43 +0000</pubDate>
		<dc:creator>jared.folkins</dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[Economy]]></category>
		<category><![CDATA[OOP]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Web Design]]></category>
		<category><![CDATA[bubble]]></category>
		<category><![CDATA[county]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[deschutes]]></category>
		<category><![CDATA[hans]]></category>
		<category><![CDATA[housing]]></category>
		<category><![CDATA[power]]></category>
		<category><![CDATA[rosling]]></category>
		<category><![CDATA[set]]></category>

		<guid isPermaLink="false">http://www.acloudtree.com/let-my-dataset-change-your-mindset/</guid>
		<description><![CDATA[If you follow the link provided, it will take you to a brilliant video by Hans Rosling. My hope is that anyone watching at least begins to understand the power of data: maybe not the exact methods for manipulating it, but at least the desire to know more about it. This video is quite [...]]]></description>
			<content:encoded><![CDATA[<p>If you follow the <strong><a href="http://www.ted.com/talks/hans_rosling_at_state.html" title="Link" target="_blank">link</a></strong> provided, it will take you to a brilliant video by Hans Rosling. My hope is that anyone watching at least begins to understand the power of data: maybe not the exact methods for manipulating it, but at least the desire to know more about it.</p>
<p>This video is quite timely for me, for I have thought long and hard about putting together a series of posts on data analysis, particularly on the work I have done with the Deschutes County Clerk&#8217;s data.</p>
<p>For those of you who don&#8217;t know, in the fall of 2007 I created a program that went to the county website and pulled public records and sales data concerning housing. It loaded this data into a usable database, and I used it to educate family and friends on when our housing bubble started and how bad it really was.</p>
<p>I built a small website with this data, but stopped my work on it because of several complex reasons.</p>
<p>Anyway, if people are interested in how to analyze data, or would appreciate an ongoing conversation about it, let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.acloudtree.com/let-my-dataset-change-your-mindset/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

