<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cogito ergo sum &#187; mac</title>
	<atom:link href="http://www.corgitoergosum.net/category/mac/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.corgitoergosum.net</link>
	<description>Do not try to bend the spoon. Instead only try to realize the truth.</description>
	<lastBuildDate>Thu, 29 Dec 2011 00:16:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Every teardrop is a waterfall</title>
		<link>http://www.corgitoergosum.net/2011/10/24/every-teardrop-is-a-waterfall/</link>
		<comments>http://www.corgitoergosum.net/2011/10/24/every-teardrop-is-a-waterfall/#comments</comments>
		<pubDate>Mon, 24 Oct 2011 04:46:59 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[mac]]></category>
		<category><![CDATA[music]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=1045</guid>
		<description><![CDATA[I don&#8217;t mean to drag this on forever, but I have to say that Coldplay&#8217;s performance at the Celebrating Steve event was truly triumphant and uplifting, overcoming (without denying) the melancholy and sorrow that hung so heavily in the air. Music touches people&#8217;s souls, and at its most intense, takes them on a trip far, far [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t mean to drag this on forever, but I have to say that Coldplay&#8217;s performance at the <a href="http://events.apple.com.edgesuite.net/10oiuhfvojb23/event/index.html" target="_blank">Celebrating Steve</a> event was truly triumphant and uplifting, overcoming (without denying) the melancholy and sorrow that hung so heavily in the air. Music touches people&#8217;s souls, and at its most intense, takes them on a trip far, far away and back, with renewed faith that things have been healed a little.</p>
<div id="attachment_1056" class="wp-caption alignnone" style="width: 610px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-11.40.44-AM.png" rel="lightbox[1045]"><img src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-11.40.44-AM-1024x576.png" alt="" title="Screen Shot 2011-10-24 at 11.40.44 AM" width="600" /></a><p class="wp-caption-text">Viva La Vida. Long Live Life.</p></div>
<div id="attachment_1060" class="wp-caption alignnone" style="width: 610px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-12.20.56-PM.png" rel="lightbox[1045]"><img src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-12.20.56-PM.png" alt="" title="Screen Shot 2011-10-24 at 12.20.56 PM" width="600"/></a><p class="wp-caption-text">Fix It. Lights will guide your soul.</p></div>
<p><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-11.48.28-AM.png" rel="lightbox[1045]"><img src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/10/Screen-Shot-2011-10-24-at-11.48.28-AM-1024x576.png" alt="" title="Screen Shot 2011-10-24 at 11.48.28 AM" width="600" class="alignnone size-large wp-image-1061" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/10/24/every-teardrop-is-a-waterfall/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fanboys</title>
		<link>http://www.corgitoergosum.net/2011/10/09/fanboys/</link>
		<comments>http://www.corgitoergosum.net/2011/10/09/fanboys/#comments</comments>
		<pubDate>Sun, 09 Oct 2011 15:14:15 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[mac]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[apple]]></category>
		<category><![CDATA[presentation]]></category>
		<category><![CDATA[stevejobs]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=1031</guid>
		<description><![CDATA[While reading Revolution in the Valley some years back, one of the things that moved me was how Apple evangelists would voluntarily go off to electronics stores and offer to &#8220;fix&#8221; the retail Apple experience &#8211; ensuring software was running properly, answering questions from customers and sales reps, and the like. They were the first [...]]]></description>
			<content:encoded><![CDATA[<p><img alt="Fanboys" src="http://www.movieposter.com/posters/archive/main/98/MPW-49080" title="Fanboys" class="alignnone" width="500" height="741" /></p>
<p>While reading <a href="http://www.amazon.com/Revolution-Valley-Insanely-Great-Story/dp/0596007191" target="_blank">Revolution in the Valley</a> some years back, one of the things that moved me was how Apple evangelists would voluntarily go off to electronics stores and offer to &#8220;fix&#8221; the retail Apple experience &#8211; ensuring software was running properly, answering questions from customers and sales reps, and the like. They were the first-generation Apple Geniuses.</p>
<p>At one time or another, every Mac user undoubtedly comes across the chance to do a bit of evangelizing. Mine came about three or four years ago, when our department was still flat-out filled with XP laptops (with the anti-virus kicking in and commanding 99% CPU hourly) and only a handful of folks owned Apple machines. Having been refused a MacBook by IT, I brought in my own instead. That sparked a bit of interest among my teammates (and a grunt from management), especially those planning to replace their notebooks.</p>
<p>After hearing remarks like how the &#8220;Mac OS looks like Windows&#8221;, I had an urge to set the record straight and tell the team the story of Apple and Steve Jobs. I consider it essential knowledge for anyone who works in the field.</p>
<p>Here are the slides I showed them that afternoon. A fitting end to an unforgettable week, I guess.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/10/09/fanboys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The legacy of Steve Jobs</title>
		<link>http://www.corgitoergosum.net/2011/10/06/the-legacy-of-steve-jobs/</link>
		<comments>http://www.corgitoergosum.net/2011/10/06/the-legacy-of-steve-jobs/#comments</comments>
		<pubDate>Thu, 06 Oct 2011 08:42:30 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[life]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[apple mac jobs]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=1014</guid>
		<description><![CDATA[Like many others, I was shocked by the news of Steve Jobs&#8217;s passing when I woke up this morning. Sure, the writing had been on the wall for quite some time and his publicized resignation more or less paved the way for what was to come, but that doesn&#8217;t make the news any easier to [...]]]></description>
			<content:encoded><![CDATA[<div class="wp-caption alignnone" style="width: 610px"><a href="http://www.flickr.com/photos/13105039@N04/2220459219/"><img alt="Photo courtesy of turtlemom_nancy @flickr" src="http://farm3.static.flickr.com/2058/2220459219_ebbf69efe7_z.jpg?zz=1" title="Photo courtesy of turtlemom_nancy @flickr" width="600" /></a><p class="wp-caption-text">Photo courtesy of turtlemom_nancy @flickr</p></div>
<p>Like many others, I was shocked by the news of Steve Jobs&#8217;s passing when I woke up this morning. Sure, the writing had been on the wall for quite some time and his publicized resignation more or less paved the way for what was to come, but that doesn&#8217;t make the news any easier to swallow when it strikes.</p>
<p>The world as we know it will never be the same. For many years to come, people will still be discovering how much they have truly lost, and occasionally pondering how Steve would have done things differently if he were still around.</p>
<p>Most people weigh his contributions by the products he helped create, his innovative vision or his business acumen. Those are impressive, but I&#8217;m more touched by the human side of the man. His relentlessness in pursuing true greatness. The drive to create works of wonder that are in harmony with nature, people, the world. His refusal to settle for the status quo. His eagerness to shine among the flock. And the sense of pride and satisfaction when he saw people appreciating his work. In that respect, his mentality was no different from that of a fourth grader striving hard to win the praise of parents and teachers.</p>
<p>Some people single that out as a fault in Jobs&#8217; character: that his biological parents put him up for adoption created a vacuum in his heart he spent a lifetime trying to fill. In my opinion, that aspect was a gift rather than a fault. It was probably one of the greatest sources of strength in Steve&#8217;s life, and as a result the single most important quality behind the high standards carried through to most of Apple&#8217;s products.</p>
<p>If only more people in the world could retain this child-like innocence and act upon it in everything they do, the human race would benefit from more amazing breakthroughs than an array of new laptops and mobile phones every year. That would be the greatest legacy that Steve Jobs could have left behind.</p>
<div class="wp-caption alignnone" style="width: 510px"><img alt="jobs macintosh" src="http://ph.cdn.photos.upi.com/collection/upi/30c15bb46bfe191c989972d1ae4bb2c1/Apple-Chairman-Steve-Jobs-shown-during-the-demonstration-of-the-Macintosh-computer-at-annual-shareholders-meeting_1.jpg" title="jobs macintosh" width="500" height="414" /><p class="wp-caption-text">steve jobs 1955-2011</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/10/06/the-legacy-of-steve-jobs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Three Pillars of Social Reader Relevancy (I)</title>
		<link>http://www.corgitoergosum.net/2011/07/14/the-three-pillars-of-relevancy-i/</link>
		<comments>http://www.corgitoergosum.net/2011/07/14/the-three-pillars-of-relevancy-i/#comments</comments>
		<pubDate>Thu, 14 Jul 2011 15:49:47 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[camus]]></category>
		<category><![CDATA[flipboard]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=959</guid>
		<description><![CDATA[While big-name players attempt to tackle the issue simply by bolting on extra features (Google Mobile Voice Search, Google Instant Preview for Mobile, etc.), the underlying problems remain unresolved because the ranking algorithm is the same as its desktop counterpart. Flipboard is so great because, in my opinion, it has found and defined the new three pillars of relevancy for mobile content consumption: freshness, social, and readability* - and they work wonderfully.]]></description>
			<content:encoded><![CDATA[<p>In Web Search, the ranking of results is primarily determined by their <strong>freshness</strong>, <strong>relevancy</strong> (in regard to a search query) and <strong>content quality</strong>. Freshness needs little explanation. Relevancy is an approximation of how much of a web site&#8217;s data has something to do with the user&#8217;s query. Content quality is an indication of how &#8220;good&#8221; the site&#8217;s information is, given factors like PageRank, spam scores and so on.</p>
<p>Once we cross into the mobile world, however, these three factors lose much of their usefulness. Text input on mobile devices is largely impractical and traditional web pages don&#8217;t render well, so media discovery and consumption on mobile devices is generally inferior to the same experience in print media like newspapers and magazines. While big-name players attempt to tackle the issue simply by bolting on extra features (Google Mobile Voice Search, Google Instant Preview for Mobile, etc.), the underlying problems remain unresolved because the ranking algorithm is the same as its desktop counterpart. Flipboard is so great because, in my opinion, it has found and defined the new three pillars of relevancy for mobile content consumption: <strong>freshness</strong>, <strong>social</strong>, and <strong>readability</strong>* &#8211; and they work wonderfully.</p>
<p><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/three_pillars.png" rel="lightbox[959]"><img class="alignnone size-full wp-image-969" title="three_pillars" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/three_pillars.png" alt="" width="509" height="409" /></a></p>
<p>With this in mind, I put some work into the server components of Cassius. From a simple script that turns a Tweet into a JSON feed, the pipeline now includes saving documents into a transitional store (MongoDB) and a series of quality measurement calculations. The extra processing means we won&#8217;t be able to serve the feed in real time, but I hope the results justify the cost.</p>
<p><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/camus_workflow2.png" rel="lightbox[959]"><img class="alignnone size-full wp-image-970" title="camus_workflow2" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/camus_workflow2.png" alt="" width="544" height="399" /></a></p>
<h4>How well does your article read?</h4>
<p>In Zite or Flipboard, it&#8217;s not uncommon to run into articles with summary texts that resemble gibberish (see below). The issue is often a result of raw HTML elements being incorrectly identified as meaningful content, and is very hard to avoid. I have seen attempts to solve this problem using NLP and machine learning classification methods, with varying degrees of success. Since those are beyond my capabilities, I opted to use some traditional methods to measure the quality of a piece of writing &#8211; by taking its readability metrics. From Wikipedia, readability evaluation refers to &#8220;<em>the ease in which text can be read and understood</em>&#8221;, and &#8220;<em>…various factors to measure readability have been used, such as speed of perception, perceptibility at a distance, perceptibility in peripheral vision, visibility, the reflex blink technique, rate of work, eye movements, and fatigue in reading…</em>&#8221;. Readability measurement tools are widely available and are embedded in word processors and email clients.</p>
<div id="attachment_961" class="wp-caption alignnone" style="width: 341px"><img class="size-full wp-image-961 " title="readability metrics in M$ Word" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/rdb6.jpg" alt="" width="331" height="316" /><p class="wp-caption-text">source: corporategeek.info</p></div>
<div id="attachment_971" class="wp-caption alignnone" style="width: 290px"><img class="size-full wp-image-971 " title="flipboard_fail" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/flipboard_fail.png" alt="" width="280" height="345" /><p class="wp-caption-text">results of bad scraping</p></div>
<p>In a nutshell, the tools apply different statistical formulas on a piece of English writing, and the resulting scores form an impression of its understandability. The formulas typically break text into syntactic components such as words and sentences and count their distribution or frequency in relation to the text being analyzed. The most common readability formulas and descriptions are given below:</p>
<p><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/Screen-Shot-2011-07-14-at-11.00.36-PM.png" rel="lightbox[959]"><img class="alignnone size-full wp-image-975" title="Different readability metrics" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/07/Screen-Shot-2011-07-14-at-11.00.36-PM.png" alt="" width="584" height="387" /></a></p>
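<p>As a rough illustration of how such formulas work, here is a minimal Python sketch of two of them, Flesch Reading Ease and Flesch-Kincaid Grade Level, using their standard published coefficients. The sentence splitter and the vowel-run syllable counter are crude assumptions made for this example, and the function names are hypothetical; real tools use more careful tokenization.</p>

```python
import re

def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def text_stats(text):
    """Return (sentence_count, word_count, syllable_count) for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    return len(sentences), len(words), syllables

def flesch_reading_ease(text):
    """Higher scores mean easier text; the scale runs roughly from 0 to 100."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(text):
    """Approximate US school grade needed to understand the text."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

sample = "The cat sat on the mat. It was a sunny day."
print(round(flesch_reading_ease(sample), 1))
print(round(flesch_kincaid_grade(sample), 1))
```

<p>Both formulas depend only on average sentence length and average syllables per word, which is why they are cheap enough to run on every scraped article.</p>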
<p>I find it more pleasing to read blog posts and articles on Flipboard/Zite that are about a page in length. Content that spans multiple pages is too demanding for a casual read, while short tweets or one-liners aren&#8217;t worth the two-click effort to expand and shrink them from the page (yes really). For simplicity, let&#8217;s take my reading habits as standard, and use the following thresholds for computation:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Flesch–Kincaid_readability_test" target="_blank">Flesch</a> &#8211; <strong>50</strong> (<em>Time</em> magazine has a score of about 52)</li>
<li><a title="Flesch-Kincaid" href="http://en.wikipedia.org/wiki/Flesch–Kincaid_readability_test" target="_blank">Flesch-Kincaid</a> &#8211; <strong>13</strong> (pre-college level)</li>
<li><a title="Gunning Fog" href="http://en.wikipedia.org/wiki/Gunning_fog_index" target="_blank">Gunning Fog</a> &#8211; <strong>12</strong> (texts for wide audience have fog index of less than 12)</li>
<li><a title="SMOG" href="http://en.wikipedia.org/wiki/SMOG_(Simple_Measure_Of_Gobbledygook)" target="_blank">SMOG</a> &#8211; <strong>13</strong> (pre-college level)</li>
<li><a title="Coleman Liau" href="http://en.wikipedia.org/wiki/Coleman–Liau_Index" target="_blank">Coleman Liau</a> &#8211; <strong>13</strong> (pre-college level)</li>
<li><a title="Automated Readability Index" href="http://en.wikipedia.org/wiki/Automated_Readability_Index" target="_blank">ARI</a> &#8211; <strong>13</strong> (pre-college level)</li>
</ul>
<div>The gist here takes the readability scores, evaluates their distances from the thresholds and combines them as a mean. Very straightforward.</div>
<script src="http://gist.github.com/1082659.js"></script>
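<p>For readers who cannot load the embedded gist, the idea can be sketched in a few lines of Python. This is not the gist itself: the function name, the dictionary keys and the choice to divide each distance by its threshold (so that the 0 to 100 Flesch scale does not swamp the grade-level metrics) are all assumptions made for illustration.</p>

```python
# Thresholds taken from the list above.
THRESHOLDS = {
    "flesch": 50.0,
    "flesch_kincaid": 13.0,
    "gunning_fog": 12.0,
    "smog": 13.0,
    "coleman_liau": 13.0,
    "ari": 13.0,
}

def readability_distance(scores, thresholds=THRESHOLDS):
    """Mean relative distance of each score from its threshold.

    0.0 means every metric sits exactly on its threshold; larger values
    mean the text is further from the target reading level.
    Normalizing by the threshold is an assumption of this sketch.
    """
    distances = [
        abs(scores[name] - target) / target
        for name, target in thresholds.items()
        if name in scores
    ]
    if not distances:
        raise ValueError("no known readability scores supplied")
    return sum(distances) / len(distances)

# Hypothetical scores for one article.
article = {"flesch": 55.0, "flesch_kincaid": 11.7, "gunning_fog": 13.2,
           "smog": 13.0, "coleman_liau": 14.3, "ari": 11.7}
print(round(readability_distance(article), 3))
```

<p>Articles could then be ranked by this single number, with the smallest distances being closest to the preferred reading level.</p>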
<p>In the next post we&#8217;ll continue to explore the three pillars, and look at some test results to see whether the additional aspect of readability would help us create a feed that is better optimized for the user&#8217;s final reading experience.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/07/14/the-three-pillars-of-relevancy-i/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Knitting a page</title>
		<link>http://www.corgitoergosum.net/2011/06/22/knitting-a-page/</link>
		<comments>http://www.corgitoergosum.net/2011/06/22/knitting-a-page/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 10:18:41 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[ipad]]></category>
		<category><![CDATA[linkedin]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=926</guid>
		<description><![CDATA[References on how to lay out news story articles are plentiful, yet the most useful one I came across was a paper published in 1977 titled "Computer Assisted Layout of Newspapers" from MIT.]]></description>
			<content:encoded><![CDATA[<p>When I set out to build the prototype, there were many things in the design I considered fundamental, chief among them a template system flexible enough that no re-installs or updates are necessary when a new page layout combination is desired.</p>
<p>References on the topic are plentiful, but surprisingly the most useful one I came across was a paper published in 1977 titled &#8220;Computer Assisted Layout of Newspapers&#8221; from MIT. You can find the full 184 pages <a href="http://dspace.mit.edu/handle/1721.1/1285" target="_new">here</a>. The paper is a gem to read and even goes into detail on how ad and picture layouts could be automatically assigned to a theoretical newspaper page. I shall definitely return to it for more inspiration, but so far I have based the design of the prototype on Chapter 6, <em>A Symbolic Graphics Language For News Layout</em>.</p>
<p>The diagram below, lifted from p. 84 of the paper, says it all. Pages on Flipboard largely employ a rows/columns layout combination, and the powerful template language described in the paper should be able to cover all variations effortlessly.</p>
<div id="attachment_928" class="wp-caption alignleft" style="width: 559px"><img src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/symbolic_layout_language.png" alt="" title="symbolic_layout_language" width="549" height="452" class="size-full wp-image-928" /><p class="wp-caption-text">a simple yet powerful layout language</p></div>
<p>Note that I cheated a little and defined my version of the template language in JSON, mainly for easier parsing in Objective-C. </p>
<p>Therefore,<br />
<code>P1 || (S1 = S2)</code> is represented with <code>{"columns": [{"type":"P1"}, {"rows":[{"type":"S1"}, {"type":"S2"}]}]}</code> in my app,</p>
<p>and</p>
<p><code>S3 || (S4=(S5 || S6 || S7))</code> becomes <code>{"columns":[{"type":"S3"}, {"rows":[{"type":"S4"}, {"columns":[{"type":"S5"}, {"type":"S6"}, {"type":"S7"}]}]}]}</code>.</p>
<p>With a structure like this, we can simply parse the JSON into multi-dimensional arrays (e.g. {&#8220;P1&#8221;, &#8220;{S1, S2}&#8221;}), then write classes to traverse the array and return suitable UIViews or collections of UIViews. Only two-level nesting is supported in the code right now.</p>
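<p>To make the mapping concrete, here is a small Python sketch (not the Objective-C code in the app) that parses a JSON template, flattens it into nested arrays, and renders it back into the paper&#8217;s notation, where <code>||</code> separates columns and <code>=</code> separates rows. The function names are hypothetical.</p>

```python
import json

def to_nested(node):
    """Turn a template node into a nested list such as ['P1', ['S1', 'S2']].

    Note that this form drops the rows/columns distinction.
    """
    if "type" in node:
        return node["type"]
    key = "columns" if "columns" in node else "rows"
    return [to_nested(child) for child in node[key]]

def to_symbolic(node, top=True):
    """Render a template node in the paper's notation: || for columns, = for rows."""
    if "type" in node:
        return node["type"]
    if "columns" in node:
        joined = " || ".join(to_symbolic(c, False) for c in node["columns"])
    else:
        joined = " = ".join(to_symbolic(c, False) for c in node["rows"])
    return joined if top else "(" + joined + ")"

template = json.loads(
    '{"columns": [{"type":"P1"}, {"rows":[{"type":"S1"}, {"type":"S2"}]}]}'
)
print(to_nested(template))    # ['P1', ['S1', 'S2']]
print(to_symbolic(template))  # P1 || (S1 = S2)
```

<p>Because both functions are recursive, they handle any nesting depth, unlike the two-level limit in the current app code.</p>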
<p>The UIView generation process itself is just as crude at present. While looping through the array, the type of each stored value is examined, and if it&#8217;s a definition like &#8220;P1&#8221; or &#8220;TIA&#8221;, a helper class creates the corresponding UIView, taking as arguments the article itself and attributes such as the size of the array, which are used for presentation purposes. All of this takes place in the <a href="https://github.com/kenshin03/cassius/blob/master/Cassius/PageLayoutManager.m#L142" target="_new">PageLayoutManager</a> class. A whole lot more work will go into these classes.</p>
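<p>The traversal can be sketched as a recursive function that assigns each leaf a rectangle. The Python below is a language-neutral illustration, not the PageLayoutManager code: it assumes that sibling cells split the available space equally, and the function name and the 768 by 1024 canvas are choices made for this example.</p>

```python
def layout(node, x, y, width, height):
    """Return a list of (type, x, y, width, height) tuples, one per leaf cell.

    Assumption: the children of a rows/columns node share the space equally.
    """
    if "type" in node:
        return [(node["type"], x, y, width, height)]
    frames = []
    if "columns" in node:
        children = node["columns"]
        cell = width / len(children)
        for i, child in enumerate(children):
            frames += layout(child, x + i * cell, y, cell, height)
    else:
        children = node["rows"]
        cell = height / len(children)
        for i, child in enumerate(children):
            frames += layout(child, x, y + i * cell, width, cell)
    return frames

page = {"columns": [{"type": "P1"}, {"rows": [{"type": "S1"}, {"type": "S2"}]}]}
for frame in layout(page, 0, 0, 768, 1024):
    print(frame)
```

<p>In the app, each tuple would correspond to one view created by the helper class, with the rectangle used as its frame.</p>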
<p>I&#8217;m hoping that more help from the server side will be used for both the template definition and article selection processes. Analysis of word count, images in the article, source authority, social signals and other relevancy factors should already have been taken into account by the time these articles and templates arrive at the client app.</p>
<p>Finally, here&#8217;s the template used for generating the pages shown in the first video. There are four pages altogether, with pages 1 and 4 being row-based and pages 2 and 3 column-based. These layout designs are quite similar to the ones used heavily on tweets-display pages on Flipboard.</p>
<p><code>{"pages":[<br />
{"rows":[{"type":"TIA"},{"columns":[{"type":"TIA"},{"type":"TIA"},{"type":"TIA"}]}]},<br />
{"columns":[{"type":"TIA"},{"rows":[{"type":"TIA"},{"type":"TIA"}]}]},<br />
{"columns":[{"type":"TIA"}]},<br />
{"rows":[{"type":"TIA"},{"type":"TIA"}]}<br />
]}</code></p>
<p>The rendering:</p>
<div id="attachment_938" class="wp-caption alignleft" style="width: 710px"><img src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius_prototype1_pages.jpg" alt="" title="cassius prototype 1 pages" width="700" height="230" class="size-full wp-image-938" /><p class="wp-caption-text">page 1 - row 1 is article, row 2 three columns of articles. <br/>page 2 - column 1 is article, column 2 is 2 rows of articles.<br/> page 3 - 1 column, 1 article.<br/> page 4 - 2 rows of articles</p></div>
<p><strong>Remote or Local?</strong></p>
<p>A colleague pointed out that the template definitions must be defined and stored locally in the client app, as the app shouldn&#8217;t need to fetch a new template from the server when the device changes orientation. I hadn&#8217;t thought about that. To me, it makes more sense to have the server pick templates that are better suited to the content being served. I&#8217;m not thinking about how to deal with landscape orientation at all yet.</p>
<p>Extended Reading:</p>
<ul>
<li><a href="http://dspace.mit.edu/handle/1721.1/1285" target="_new">Computer-Assisted Layout of Newspapers</a>. Reubtures et al.</li>
<li><a href="http://portal.acm.org/citation.cfm?id=380538" target="_new">Optimization of web newspaper layout in real time</a>. J. Gonzalez et al., Computer Networks 36.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/06/22/knitting-a-page/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cassius is on github</title>
		<link>http://www.corgitoergosum.net/2011/06/16/cassius-is-on-github/</link>
		<comments>http://www.corgitoergosum.net/2011/06/16/cassius-is-on-github/#comments</comments>
		<pubDate>Thu, 16 Jun 2011 10:43:16 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[ipad]]></category>
		<category><![CDATA[linkedin]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=917</guid>
		<description><![CDATA[Finally got round to putting together a decent enough client of Cassius on Github!]]></description>
			<content:encoded><![CDATA[<p>Finally got round to putting together a decent enough client! The code is pretty rough right now and the app crashes after a while, but at least it&#8217;s a start, innit?!</p>
<p>Repos:</p>
<ol>
<li><a href="https://github.com/kenshin03/camus">https://github.com/kenshin03/camus</a> (server side, Java)</li>
<li><a href="https://github.com/kenshin03/cassius">https://github.com/kenshin03/cassius</a> (iOS client side)</li>
</ol>
<p>The article pages were generated by a custom template that&#8217;s defined in JSON, and the image on the cover page is grabbed from Instagram&#8217;s API:</p>
<div style="width:425px" id="__ss_8325334"> <strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/kenshin03/cassius-20110616" title="Cassius 20110616">Cassius 20110616</a></strong> <iframe src="http://www.slideshare.net/slideshow/embed_code/8325334" width="425" height="355" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
<div style="padding:5px 0 12px"> View more <a href="http://www.slideshare.net/">videos</a> from <a href="http://www.slideshare.net/kenshin03">kenshin03</a> </div>
</div>
<p>The iconic page flip effects were lifted straight from <a href="https://github.com/mtabini/AFKPageFlipper">AFKPageFlipper</a>. Thanks again Marco!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/06/16/cassius-is-on-github/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Replicating Flipboard Part IV &#8211; Prelude</title>
		<link>http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/</link>
		<comments>http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/#comments</comments>
		<pubDate>Sat, 11 Jun 2011 17:13:41 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[ipad]]></category>
		<category><![CDATA[linkedin]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=845</guid>
		<description><![CDATA[What you see here is a cover page using a random image from Instagram transitioning into the first articles page showing stories from my Twitter feed. The layout was generated dynamically from a custom template defined in JSON.]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been forever since the last update. But amidst a trillion other things, this Flipboard study never strayed far from my mind (and the IDE).</p>
<p>More detailed posts on the progress and designs of the prototype will come this week, but right now, I can&#8217;t wait to share the very first screenshots of <del>Cloneboard</del> Cassius.</p>
<p>What you see here is a cover page using a random image from Instagram transitioning into the first articles page showing stories from my Twitter feed. The layout was generated dynamically from a custom template defined in JSON.</p>

<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius1/' title='cassius1'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius1-150x150.png" class="attachment-thumbnail" alt="cassius1" title="cassius1" /></a>
<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius2/' title='cassius2'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius2-150x150.png" class="attachment-thumbnail" alt="cassius2" title="cassius2" /></a>
<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius3/' title='cassius3'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius3-150x150.png" class="attachment-thumbnail" alt="cassius3" title="cassius3" /></a>
<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius4/' title='cassius4'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius4-150x150.png" class="attachment-thumbnail" alt="cassius4" title="cassius4" /></a>
<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius5/' title='cassius5'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius5-150x150.png" class="attachment-thumbnail" alt="cassius5" title="cassius5" /></a>
<a href='http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/cassius6/' title='cassius6'><img width="150" height="150" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/06/cassius6-150x150.png" class="attachment-thumbnail" alt="cassius6" title="cassius6" /></a>

<p>I&#8217;ll be putting all the code up on GitHub soon for better backup and version control.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/06/12/replicating-flipboard-part-iv-prelude/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Replicating Flipboard Part III – How Flipboard lays out content</title>
		<link>http://www.corgitoergosum.net/2011/03/06/replicating-flipboard-part-iii-%e2%80%93-how-flipboard-lays-out-content/</link>
		<comments>http://www.corgitoergosum.net/2011/03/06/replicating-flipboard-part-iii-%e2%80%93-how-flipboard-lays-out-content/#comments</comments>
		<pubDate>Sun, 06 Mar 2011 11:47:18 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[ipad]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[relevancy]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=754</guid>
		<description><![CDATA[Shifting the focus back to the iPad app, let's take a look at how Flipboard processes and lays out Facebook and Twitter feeds.]]></description>
			<content:encoded><![CDATA[<p>Shifting the focus back to the iPad app, let&#8217;s take a look at how Flipboard processes and lays out Facebook and Twitter feeds. Does it really employ any social signal based ranking?</p>
<p>Here&#8217;s a sample of some Twitter feeds I receive, shown in a browser and in Flipboard:</p>
<table border="0" width="90%">
<tbody>
<tr>
<td>
<p><div id="attachment_755" class="wp-caption alignleft" style="width: 210px"><img class="size-full wp-image-755" title="sample_twitter_feed" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/sample_twitter_feed.png" alt="" width="200" height="741" /><p class="wp-caption-text">Sample Twitter Feed</p></div></td>
<td>
<p><div id="attachment_756" class="wp-caption alignleft" style="width: 410px"><img class="size-full wp-image-756" title="flipboard_twitter_layout" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_twitter_layout.png" alt="" width="400" height="534" /><p class="wp-caption-text">Same group of Twitter feeds viewed in Flipboard</p></div></td>
</tr>
</tbody>
</table>
<p>Alright&#8230;what can be deduced from this? A few things become obvious when the same data is shown side by side:</p>
<ul>
<li>The ranking of the Twitter articles in Flipboard more or less retains the original sort-by-time order! I don&#8217;t think there is any clever hidden social ranking at play here.</li>
<li>Some tweets were dropped by Flipboard, presumably because they do not contain links. The dropped examples (#3, #11, #15, #17, #18) are simply Twitter conversations.</li>
<li>Tweets that link to sites with images (e.g. #7, #9) seem to be given higher display priority.</li>
</ul>
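<p>The filtering observed above can be sketched in a few lines. This is only a guess at the behaviour inferred from the screenshots, not Flipboard&#8217;s actual code; the <code>Tweet</code> record, its field names and the sample feed are all made up for illustration.</p>

```python
import re
from dataclasses import dataclass

# Hypothetical tweet record; the field names are illustrative only.
@dataclass
class Tweet:
    position: int            # position in the original time-ordered feed
    text: str
    has_image: bool = False  # whether the linked page carries an image

LINK_PATTERN = re.compile(r"https?://\S+")

def select_tweets(feed):
    """Keep only tweets that contain a link, preserving the feed order."""
    return [t for t in feed if LINK_PATTERN.search(t.text)]

# Made-up sample feed: one conversation tweet without a link.
feed = [
    Tweet(1, "Great read http://example.com/a"),
    Tweet(2, "@friend thanks for the tip!"),
    Tweet(3, "Photos from the trip http://example.com/b", has_image=True),
]

kept = select_tweets(feed)
print([t.position for t in kept])               # tweets that survive the filter
print([t.position for t in kept if t.has_image])  # candidates for a larger slot
```

<p>The time order is untouched; only link-less tweets fall out, and tweets with images are merely flagged as candidates for more prominent placement.</p>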
<p>What about Facebook? The typical Facebook feed is a lot more complex, with mixed content ranging from check-ins and uploaded images to Likes and notifications from whatever apps or games a user has added. The prospect of having to analyze so many different types of content is daunting enough, let alone the additional effort of re-ranking them and laying them out in an app.</p>
<p>As above, this is a snapshot of my Facebook News feed (names, faces and updates blurred out &#8211; don&#8217;t worry people!) in a browser and in Flipboard:</p>
<table border="0" width="90%">
<tbody>
<tr valign="bottom">
<td valign="bottom">
<p><div id="attachment_764" class="wp-caption alignleft" style="width: 210px"><img class="size-full wp-image-764" title="sample_facebook_feed" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/sample_facebook_feed.png" alt="" width="200" height="1911" /><p class="wp-caption-text">facebook news feed</p></div></td>
<td valign="bottom">
<p><div id="attachment_765" class="wp-caption alignleft" style="width: 410px"><img class="alignleft size-full wp-image-766" title="flipboard_facebook_layout" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_facebook_layout.png" alt="" width="400" height="534" /><p class="wp-caption-text">facebook in Flipboard</p></div></td>
</tr>
</tbody>
</table>
<p>Immediately we are struck by much more intriguing findings:</p>
<ul>
<li>Flipboard completely ditched the published time of the feed articles and laid them out entirely based on readability attributes (text length, image size, etc.).</li>
<li>Quite a few Facebook articles were dropped:
<ul>
<li>Article #5 &#8211; a Facebook Places-Checkin: Due to lack of images?</li>
<li>Article #6 &#8211; An Image uploaded from iPhone. Not sure why it wasn&#8217;t shown in Flipboard.</li>
<li>Articles #7, #13 &#8211; These were &#8220;xxx started using xxx app&#8221; messages, which contain no links or images and, frankly, nothing interesting.</li>
<li>Articles #20, #21 &#8211; These were messages from Groupon. As article #14 was also a Groupon article, I guess #20 and #21 were dropped because Flipboard avoids showing too many articles from the same source at once.</li>
</ul>
</li>
</ul>
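<p>The guess about Groupon (#14 kept, #20 and #21 dropped) amounts to a per-source cap. Here is a minimal sketch of such a rule; the cap of one article per source is my assumption, as is the function name, and Flipboard may well use a different limit or a different rule altogether.</p>

```python
from collections import Counter

def cap_per_source(articles, max_per_source=1):
    """Keep at most max_per_source articles from each source, in feed order.

    articles is a list of (article_number, source) pairs.
    """
    seen = Counter()
    kept = []
    for number, source in articles:
        if seen[source] >= max_per_source:
            continue  # this source has already used up its quota
        seen[source] += 1
        kept.append(number)
    return kept

# Article numbers follow the Facebook feed above; "a friend" is a placeholder source.
feed = [(14, "Groupon"), (16, "a friend"), (20, "Groupon"), (21, "Groupon")]
print(cap_per_source(feed))
```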
<p>And while looking at Facebook articles, we might as well take another quick look at whether Facebook Likes contribute to the layout or ranking at all. The verdict &#8211; NOPE. If social signals played a major role in ranking, then #12 and #19 would have higher placements, on page 1 or 2 at least.</p>
<ul>
<li>#1 &#8211; 1 reply</li>
<li>#2 &#8211; 1 Like</li>
<li>#3 &#8211; 0 Likes (but 82 youtube Likes)</li>
<li>#8 &#8211; 0 Likes</li>
<li>#4, #9 &#8211; 0 Likes</li>
<li>#11 &#8211; 4 Likes, 7 Comments</li>
<li>#12 &#8211; 1 Reply, 92 Likes</li>
<li>#14 &#8211; 0 Likes</li>
<li>#16 &#8211; 3 Comments</li>
<li>#19 &#8211; 34 Likes</li>
<li>#17 &#8211; 4 Likes</li>
<li>#18 &#8211; 2 Likes</li>
<li>#22 &#8211; 2 Likes</li>
</ul>
<p>From these observations, I suspect Flipboard follows workflows roughly composed of the steps below:</p>
<table border="0" width="90%">
<tbody>
<tr>
<td>
<p><div id="attachment_786" class="wp-caption alignleft" style="width: 334px"><img class="size-full wp-image-786" title="flipboard_blog_post_facebook_workflow" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_blog_post_facebook_workflow.png" alt="" width="324" height="458" /><p class="wp-caption-text">Likely processing workflow for Facebook feeds</p></div></td>
<td>
<p><div id="attachment_787" class="wp-caption alignleft" style="width: 334px"><img class="size-full wp-image-787" title="flipboard_blog_post_twitter_workflow" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_blog_post_twitter_workflow.png" alt="" width="324" height="458" /><p class="wp-caption-text">Likely processing workflow for Twitter feeds</p></div></td>
</tr>
</tbody>
</table>
<p>Initially, I found it hard to accept this interpretation of templates-over-content. It basically says there&#8217;s no magic, that <strong>Flipboard merely puts together a random collection of page templates and then proceeds to fill those templates with the next most suitable article from a content feed</strong>.</p>
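<p>To make the templates-over-content idea concrete, here is a rough sketch of what such a loop could look like. Everything in it is hypothetical: the template shapes, the slot names and the notion of &#8220;suitable&#8221; (an image slot needs an article with an image) are my own simplifications, not anything confirmed about Flipboard.</p>

```python
import random

# Hypothetical page templates: each one is a list of slot kinds.
# Every template contains a "text" slot, so each page places at least one article.
TEMPLATES = [
    ["image", "text", "text"],
    ["text", "image"],
    ["image", "image", "text", "text"],
]

def suits(article, slot):
    """An image slot needs an article with an image; a text slot takes anything."""
    return article["has_image"] if slot == "image" else True

def lay_out(articles, rng):
    """Pick templates at random and fill each slot with the next suitable article."""
    remaining = list(articles)
    pages = []
    while remaining:
        template = rng.choice(TEMPLATES)
        page = []
        for slot in template:
            match = next((a for a in remaining if suits(a, slot)), None)
            if match is not None:
                remaining.remove(match)
                page.append((slot, match["id"]))
        pages.append(page)
    return pages

# Made-up feed: every third article carries an image.
articles = [{"id": n, "has_image": n % 3 == 0} for n in range(1, 10)]
for page in lay_out(articles, random.Random(1)):
    print(page)
```

<p>Because the templates are drawn at random, feeding in the same articles with a different seed can give a different set of pages, even though nothing about the content has changed.</p>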
<p>To settle this suspicion, I switched on Flipboard again after a few more articles had emerged in my Facebook feed. Given a largely unchanged set of data, if Flipboard employed a content-centric ranking criterion, the layout of the pages should remain more or less the same. From the screenshots below, this was clearly not the case. Notice how differently the same data was laid out:</p>
<table border="0" width="80%">
<tbody>
<tr>
<td>
<p><div id="attachment_765" class="wp-caption alignleft" style="width: 410px"><img class="alignleft size-full wp-image-766" title="flipboard_facebook_layout" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_facebook_layout.png" alt="" width="400" height="534" /><p class="wp-caption-text">Original render in Flipboard</p></div></td>
<td>
<p><div id="attachment_793" class="wp-caption alignleft" style="width: 330px"><img class="size-full wp-image-793" title="flipboard_facebook_layout2" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/03/flipboard_facebook_layout2.png" alt="" width="320" height="512" /><p class="wp-caption-text">Second render of the same Facebook feed</p></div></td>
</tr>
</tbody>
</table>
<p>So that concludes the brief study on Flipboard&#8217;s layout algorithm. From this point on I shall start firing up the IDEs and get my hands dirty building that prototype. I&#8217;m pretty thrilled by the amount of attention and support this little project seems to be garnering &#8211; thanks and please stay tuned!</p>
<p><a href="http://www.corgitoergosum.net/2011/02/28/replicating-flipboard-part-ii-social-signals/comment-page-1/#comment-85">Previous Post</a><br />
<br/><br/><br />
Related:</p>
<ul>
<li><a href="http://twitter.com/#!/tbrant/status/19852837412413440">how are the username and password used/stored for Google Reader?</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/03/06/replicating-flipboard-part-iii-%e2%80%93-how-flipboard-lays-out-content/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Replicating Flipboard Part II &#8211; Social Signals</title>
		<link>http://www.corgitoergosum.net/2011/02/28/replicating-flipboard-part-ii-social-signals/</link>
		<comments>http://www.corgitoergosum.net/2011/02/28/replicating-flipboard-part-ii-social-signals/#comments</comments>
		<pubDate>Mon, 28 Feb 2011 10:32:22 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[scrappers]]></category>
		<category><![CDATA[tweets]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=725</guid>
		<description><![CDATA[The adoption of a brand new ranking paradigm – social strength, was the single most ground-breaking thinking that was in Flipboard’s design, not unlike how Google invented PageRank and changed the game on search...]]></description>
			<content:encoded><![CDATA[<div id="attachment_740" class="wp-caption alignnone" style="width: 510px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/5268732048_9c693385b2.jpg" rel="lightbox[725]"><img class="size-full wp-image-740 " title="5268732048_9c693385b2" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/5268732048_9c693385b2.jpg" alt="" width="500" height="322" /></a><p class="wp-caption-text">Image: http://www.flickr.com/photos/19marksdesign/</p></div>
<p>A quick update on the project.</p>
<p>Most people never stray far from Flipboard&#8217;s default sections (e.g. Tech, Gaming, Fashion), which feature collections of articles from sources hand-picked by the Flipboard editorial team. Pay closer attention, though, and you will notice that the layout of those articles in the Flipboard grids is not simply based on published time or source. Length of the headline, size of the embedded image, similarity of the topics &#8211; all these factors seem to come into consideration. Then of course the social strength (tweet count, retweet count, Facebook likes, etc.) of a particular article is one major relevancy factor too. In my opinion, the adoption of a brand new ranking paradigm, social strength, was the single most ground-breaking idea in Flipboard&#8217;s design, not unlike how Google invented PageRank and changed the game on search.</p>
<p>This post is not about Flipboard&#8217;s layout algorithm (I hope I can do one someday), but rather a quick detour into social signal scraping. The more social signal information we can gather about an article, the easier it is to rank it in terms of interestingness.</p>
<p>After a few hours of fiddling (I wasted countless moments not realizing Twitter had removed basic authentication support), the two pieces of information I was looking for became accessible.</p>
<h4>Twitter Shares</h4>
<p>This couldn&#8217;t be simpler. Just do a curl to the Count API:</p>
<pre>curl "http://urls.api.twitter.com/1/urls/count.json?url=web_site_url"</pre>
<pre>e.g. curl "http://urls.api.twitter.com/1/urls/count.json?url=http://edition.cnn.com/2011/WORLD/europe/02/10/index.html"</pre>
<p>which returns</p>
<pre>{"count":4,"url":"http://edition.cnn.com/2011/WORLD/europe/02/10/egypt.protests.london/index.html/"}</pre>
<p>There are some limitations, though. Most notably, it doesn&#8217;t follow redirects and treats different addresses of the same web page as totally separate:</p>
<pre># all have different results
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954"

curl "http://urls.api.twitter.com/1/urls/count.json?url=http://fb.me/R97GMpug"

curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954?somefakeparam"</pre>
<p>We could definitely use some pre-processing before calling the Count API.</p>
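<p>As a rough idea of what that pre-processing might look like, the sketch below canonicalizes a URL before it is handed to the Count API. The function name and the rules it applies are my own assumptions. Dropping the query string is lossy (some sites identify pages by query parameters), and short links such as fb.me would still need their redirects resolved separately, which this sketch does not attempt.</p>

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Lower-case scheme and host, and drop query string, fragment and trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))

# Two spellings of the same page end up as one canonical address.
print(normalize_url("http://www.facebook.com/BufSabres/posts/10150099820062954?somefakeparam"))
print(normalize_url("HTTP://WWW.Facebook.com/BufSabres/posts/10150099820062954/"))
```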
<h4>Facebook Likes</h4>
<p>Then there&#8217;s the beast of Facebook Likes. Posing as an innocent function for simply declaring your interest in any entity (a web page, a person, an event, etc.), Likes is a tour de force from Facebook&#8217;s arsenal for pushing forward their Social Graph vision. When a Like button is added to any web page, that page automatically becomes a living entity in Facebook&#8217;s Open Graph repository. In other words, if there&#8217;s a Like button on your web site, Facebook is able to analyze what type of entity your site represents (a blog, a movie, a sports team, etc.) and who expressed interest in it. Whether that ambition of categorizing the entire web and building social graphs around it is a step towards a semantic web or a potentially huge privacy exploit is for another discussion.</p>
<p>As of now, Facebook&#8217;s API does not have a convenient way of returning the Likes count, so as usual, one has to do it via scraping. The <a href="http://developers.facebook.com/docs/opengraph/" target="_blank">official documentation</a> points to the &#8220;<strong>Facebook URL Linter</strong>&#8221; for obtaining the Social Graph info stored by Facebook on any web site. The URL Linter page shows the unique Social Graph ID for a web site. We can grab that ID and then invoke the Graph API to obtain the Likes count.</p>
<p>e.g. The Matrix page on RottenTomatoes has a Social Graph ID of 119655798047100, and from &#8220;http://graph.facebook.com/?id=119655798047100&#8221; we know it&#8217;s been liked on Facebook <strong>571</strong> times.</p>
<pre>{
   "id": "119655798047100",
   "name": "The Matrix",
   "picture": "http://profile.ak.fbcdn.net/hprofile-ak-snc4/162036_119655798047100_2294421_s.jpg",
   "link": "http://www.rottentomatoes.com/m/matrix/",
   "category": "Movie",
   "website": "http://www.rottentomatoes.com/m/matrix/",
   "description": "Average Rating: 7.4/10 \t\t\tReviews Counted: 126 \t\t\tFresh: 109 | Rotten: 17",
   "likes": 571
}</pre>
<p>Except that, for reasons unknown, the Linter does not always return the Social Graph ID, for example for &#8220;http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.oscars.com&#8221;, &#8220;http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.apple.com%2F&#8221; and many others.</p>
<p>For that matter, it is perhaps easier for us to bypass the Graph API altogether and directly scrape the iframe that site owners use to add the Like buttons.</p>
<p>All that&#8217;s required is a curl followed by some ugly regex filtering code on the returned markup from <strong>line 22</strong>:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;div id=&quot;connect_widget_4d6b574b6fe272202491168&quot; class=&quot;connect_widget&quot;&gt;
&lt;table class=&quot;connect_widget_interactive_area&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;connect_widget_vertical_center connect_widget_button_cell&quot;&gt;
&lt;div class=&quot;connect_button_slider&quot;&gt;
&lt;div class=&quot;connect_button_container&quot;&gt;
&lt;a class=&quot;connect_widget_like_button clearfix like_button_no_like&quot;&gt;
&lt;span class=&quot;liketext&quot;&gt;Like&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/td&gt;
&lt;td class=&quot;connect_widget_vertical_center&quot;&gt;
&lt;div class=&quot;connect_confirmation_cell connect_confirmation_cell_no_like&quot;&gt;
&lt;div class=&quot;connect_widget_text_summary connect_text_wrapper&quot;&gt;
&lt;span class=&quot;connect_widget_facebook_favicon&quot;&gt; &lt;/span&gt;
&lt;span class=&quot;connect_widget_user_action connect_widget_text hidden_elem&quot;&gt;You like &lt;strong&gt;The Matrix&lt;/strong&gt;.&lt;span class=&quot;unlike_span hidden_elem&quot;&gt;
&lt;a class=&quot;mls connect_widget_unlike_link&quot;&gt;Unlike&lt;/a&gt;&lt;/span&gt;
&lt;span class=&quot;connect_widget_admin_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_admin_option&quot;&gt;Admin Page&lt;/a&gt;&lt;span class=&quot;connect_widget_insights_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_insights_link&quot;&gt;Insights&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;connect_widget_error_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_error_text&quot;&gt;Error&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;connect_widget_summary connect_widget_text&quot;&gt;&lt;span class=&quot;connect_widget_connected_text hidden_elem&quot;&gt;You and 611 others like this.&lt;/span&gt;
&lt;span class=&quot;connect_widget_not_connected_text&quot;&gt;611 likes.
&lt;a href=&quot;/campaign/landing.php?campaign_id=137675572948107&amp;partner_id&amp;placement=like_button&amp;extra_2=HK&quot; target=&quot;_blank&quot;&gt;Sign Up&lt;/a&gt; to see what your friends like.&lt;/span&gt;
&lt;span class=&quot;unlike_span hidden_elem&quot;&gt;
&lt;a class=&quot;mls connect_widget_unlike_link&quot;&gt;Unlike&lt;/a&gt;&lt;/span&gt;
&lt;span class=&quot;connect_widget_admin_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_admin_option&quot;&gt;Admin Page&lt;/a&gt;&lt;span class=&quot;connect_widget_insights_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_insights_link&quot;&gt;Insights&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;connect_widget_error_span hidden_elem&quot;&gt;
&lt;a class=&quot;connect_widget_error_text&quot;&gt;Error&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;div id=&quot;connect-widget-comment-box-markup&quot;&gt;
&lt;!--
&lt;div class=&quot;connect_widget_comment_box hidden_elem&quot;&gt;
&lt;div class=&quot;connect_widget_comment_box_upward_nub&quot;&gt;&lt;/div&gt;
&lt;div class=&quot;connect_widget_comment_area&quot;&gt;
&lt;table class=&quot;uiGrid&quot; cellspacing=&quot;0&quot; cellpadding=&quot;0&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;input type=&quot;text&quot; class=&quot;inputtext connect_widget_comment_textinput DOMControl_placeholder&quot; placeholder=&quot;Share it on Facebook with a comment...&quot; value=&quot;Share it on Facebook with a comment...&quot; title=&quot;Share it on Facebook with a comment...&quot; /&gt;&lt;/td&gt;
&lt;td&gt;&lt;label class=&quot;connect_widget_comment_button hidden_elem uiButton uiButtonConfirm&quot; for=&quot;u033146_1&quot;&gt;&lt;input value=&quot;Post&quot; type=&quot;submit&quot; id=&quot;u033146_1&quot; /&gt;&lt;/label&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
--&gt;&lt;/div&gt;
&lt;/div&gt;
</pre>
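<p>The &#8220;ugly regex filtering&#8221; could be as small as the sketch below, which pulls the number out of the &#8220;611 likes.&#8221; text shown to visitors who are not connected. The pattern and function name are mine, and the handling of thousands separators is an assumption, since I have not checked how the widget formats large counts.</p>

```python
import re

# Matches text such as "611 likes." and tolerates a comma-grouped number (assumed format).
LIKES_PATTERN = re.compile(r"(\d[\d,]*)\s+likes\.")

def extract_like_count(markup):
    """Return the like count found in the widget markup, or None if it is absent."""
    match = LIKES_PATTERN.search(markup)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))

# Text taken from the widget markup above, with the tags left out for brevity.
sample = "You and 611 others like this. 611 likes. Sign Up to see what your friends like."
print(extract_like_count(sample))
```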
<p>So here we are, able to extract Twitter Shares and Facebook Likes for any web site. To complete the test, I rounded up a handful of random web sites from Delicious.com and re-ranked them based on social strength. The result was quite clear in terms of showing what kind of web sites might be more appealing to the masses.</p>
<h4>Reranking with Social Strengths</h4>
<div id="attachment_733" class="wp-caption aligncenter" style="width: 661px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/sites.png" rel="lightbox[725]"><img class="size-full wp-image-733 " title="sites" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/sites.png" alt="" width="651" height="122" /></a><p class="wp-caption-text">Random sites from Delicious, with their social strength scores extracted and normalized</p></div>
<p>Before and After</p>
<div id="attachment_734" class="wp-caption alignleft" style="width: 275px"><img class="size-full wp-image-734    " title="nine_sites_preranked" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/nine_sites_preranked.png" alt="" width="265" height="515" /><p class="wp-caption-text">A bunch of latest sites from Delicious</p></div>
<div id="attachment_735" class="wp-caption alignright" style="width: 296px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/nine_sites_ranked.png" rel="lightbox[725]"><img class="size-full wp-image-735    " title="nine_sites_ranked" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/02/nine_sites_ranked.png" alt="" width="286" height="554" /></a><p class="wp-caption-text">Same group of sites, ranked by their Social Strength</p></div>
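<p>For reference, a re-ranking of this kind can be sketched as follows. The post does not spell out the scoring formula, so the min-max normalization and the equal weighting of tweets and likes here are assumptions, and the sites and counts in the sample are invented purely for illustration.</p>

```python
def normalize(values):
    """Min-max scale a list of counts to the range 0..1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def rank_by_social_strength(sites):
    """Sort sites by the mean of their normalized tweet and like counts."""
    tweets = normalize([s["tweets"] for s in sites])
    likes = normalize([s["likes"] for s in sites])
    scored = [(0.5 * t + 0.5 * l, s["url"]) for s, t, l in zip(sites, tweets, likes)]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [url for score, url in ranked]

# Invented sample data, not the Delicious sites shown in the screenshots.
sites = [
    {"url": "http://example.com/a", "tweets": 4, "likes": 10},
    {"url": "http://example.com/b", "tweets": 120, "likes": 571},
    {"url": "http://example.com/c", "tweets": 30, "likes": 0},
]
print(rank_by_social_strength(sites))
```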
<p>Previous:<br />
<a href="http://www.corgitoergosum.net/2011/01/17/replicating-flipboard-part-i-site-scraping/">Part 1: Site Scraping</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/02/28/replicating-flipboard-part-ii-social-signals/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Replicating Flipboard Part I – Site Scraping</title>
		<link>http://www.corgitoergosum.net/2011/01/17/replicating-flipboard-part-i-site-scraping/</link>
		<comments>http://www.corgitoergosum.net/2011/01/17/replicating-flipboard-part-i-site-scraping/#comments</comments>
		<pubDate>Mon, 17 Jan 2011 13:14:50 +0000</pubDate>
		<dc:creator>kenshin03</dc:creator>
				<category><![CDATA[cassius]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[flipboard]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[readability]]></category>
		<category><![CDATA[scrappers]]></category>

		<guid isPermaLink="false">http://www.corgitoergosum.net/?p=673</guid>
		<description><![CDATA[Having taken a long good look at the social magazine Flipboard, it was time to dig beneath the cover and contemplate what kind of technology lies behind its minimalistic interface. I began by studying the inconspicuous yet critically important Reader feature. Personally, if Flipboard was to pop open mobile Safari each time I clicked into [...]]]></description>
			<content:encoded><![CDATA[<p>Having <a href="http://www.corgitoergosum.net/2011/01/08/taking-the-flipboard-apart/" target="_blank">taken a long good look</a> at the social magazine Flipboard, it was time to dig beneath the cover and contemplate what kind of technology lies behind its minimalistic interface.</p>
<p>I began by studying the inconspicuous yet critically important <strong>Reader</strong> feature. Personally, if Flipboard were to pop open mobile Safari each time I clicked into an article, I would have deleted the app right away, so to me that is the number one feature. One of the wonders of Flipboard is how it manages to scrape relevant content from web sites in such an elegant and effortless manner.</p>
<div id="attachment_679" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-679 " title="flipup3" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/flipup3.jpg" alt="" width="500" height="333" /><p class="wp-caption-text">Expanded web content view on Flipboard. wpcdn.padgadget.com</p></div>
<p>Part of my job in recent years has had to do with web scraping and crawling, and precisely because I know how scrapers and crawlers work, the speed and precision with which Flipboard does it left me very impressed &#8211; there are apparently no machine learning models, no editorial selection, no reliance on metadata, no nonsense. Most of the time, you just throw any link at the app &#8211; Facebook, bit.ly, Flickr, YouTube videos, RSS &#8211; and it swallows it and churns out the core content in the most aesthetically pleasing format, like a humble neoclassical painter. What technical options out there are capable of such an achievement?</p>
<div id="attachment_674" class="wp-caption aligncenter" style="width: 245px"><a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/safari_reader_icon.png" rel="lightbox[673]"><img class="size-full wp-image-674" title="safari_reader_icon" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/safari_reader_icon.png" alt="" width="235" height="119" /></a><p class="wp-caption-text">Safari Reader</p></div>
<p>After days of testing and failing, I was prepared to throw in the towel, abandon the idea and go back to reading about iOS and cocos2d &#8211; until incidentally a little blue icon on the desktop screamed out at me. That was the &#8220;Reader&#8221; icon in <strong>Safari&#8217;s toolbar</strong>.</p>
<p>Introduced in Safari 5, the Reader is a feature that extracts the most prominent content of a web page and displays it in a clutter-free overlay. The Reader works best with pages that are text heavy, such as news articles and the like. Now that&#8217;s not unlike what Flipboard does.</p>
<div id="attachment_675" class="wp-caption aligncenter" style="width: 512px"><img class="size-large wp-image-675  " title="SafariReaderActive" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/SafariReaderActive-1024x922.png" alt="" width="502" height="451" /><p class="wp-caption-text">Content extracted by Safari Reader</p></div>
<p>Is there any chance that Flipboard employs a scraping technique similar to Apple&#8217;s? Would it take us a step closer if somehow we could crack Safari Reader&#8217;s secret? Clues were revealing themselves here and there. After a bit of googling, I landed on the site of the <a href="http://lab.arc90.com/experiments/readability/" target="_blank">Readability project</a> by a group called Arc90.</p>
<p>The core of the Readability project is a JavaScript file that parses the DOM tree of a document and extracts the section where the most relevant content lies. I threw a few URLs at Readability&#8217;s test page, and was pleasantly surprised by how comparable the results were with Safari&#8217;s Reader, even for more complicated pages (some sources say that Safari does ship with a version of Readability.js bundled in).</p>
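<p>To give a feel for the idea, here is a much-simplified heuristic in the same spirit: credit the text length of every paragraph to the div that contains it, then pick the div with the most paragraph text. This is not Readability&#8217;s actual scoring, which is considerably more elaborate; the class and function names and the sample page are made up.</p>

```python
from html.parser import HTMLParser

class ParagraphScorer(HTMLParser):
    """Credit the text length of each paragraph to its nearest enclosing div."""

    def __init__(self):
        super().__init__()
        self.div_stack = []       # ids of the divs that are currently open
        self.in_paragraph = False
        self.scores = {}          # div id -> total paragraph text length

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_stack.append(dict(attrs).get("id", "anonymous"))
        elif tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()
        elif tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.div_stack:
            container = self.div_stack[-1]
            self.scores[container] = self.scores.get(container, 0) + len(data.strip())

def main_container(html):
    """Return the id of the div holding the most paragraph text, or None."""
    scorer = ParagraphScorer()
    scorer.feed(html)
    return max(scorer.scores, key=scorer.scores.get) if scorer.scores else None

# Made-up page: short navigation and footer labels around one long story.
page = (
    '<div id="nav"><p>Home</p><p>World</p></div>'
    '<div id="story"><p>A long paragraph of article text goes here, '
    'well beyond the length of any navigation label.</p></div>'
    '<div id="footer"><p>Contact us</p></div>'
)
print(main_container(page))
```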
<p>Since the JS isn&#8217;t very viable for running in non-browser environments, I turned to the PHP port by <a href="http://www.keyvan.net/2010/08/php-readability/" target="_blank">Keyvan Minoukadeh</a>. With some extra cleaning logic thrown on top, I was able to get very good results with the sample news article pages.</p>
<p>Source: <a href="http://www.time.com/time/world/article/0,8599,2042733,00.html" target="_blank">Time article</a> versus <a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/times.htm">Readability PHP extracted result</a></p>
<div id="attachment_681" class="wp-caption aligncenter" style="width: 465px"><img class="size-full wp-image-681" title="original_site" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/original_site.png" alt="" width="455" height="286" /><p class="wp-caption-text">Original article on Time.com</p></div>
<div id="attachment_682" class="wp-caption alignleft" style="width: 810px"><img class="size-full wp-image-682" title="flipboard_compare_outputs" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/flipboard_compare_outputs.gif" alt="" width="800" height="350" /><p class="wp-caption-text">Left to right: Flipboard render, Safari 5 Reader, Readability php port</p></div>
<p>Source: <a href="http://www.cracked.com/article_18986_how-to-understand-baffling-ad-campaign-5Bcomic5D.html" target="_blank">A cartoon on cracked.com</a> versus <a href="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/cracked_scrapped.htm">cracked.com article scraped by the Readability PHP port</a></p>
<div id="attachment_687" class="wp-caption aligncenter" style="width: 510px"><img class="size-full wp-image-687" title="cracked_com_page" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/cracked_com_page.png" alt="" width="500" height="309" /><p class="wp-caption-text">A cartoon page on cracked.com</p></div>
<div id="attachment_688" class="wp-caption alignnone" style="width: 810px"><img class="size-full wp-image-688" title="cracked_render_results" src="http://www.corgitoergosum.net/wp/wp-content/uploads/2011/01/cracked_render_results.gif" alt="" width="800" height="350" /><p class="wp-caption-text">Clockwise: Safari 5 wouldn&#39;t show &quot;Reader&quot; mode, Readability php extracted gibberish JS code, Flipboard gave up crawling page; shows url instead</p></div>
<p>Surely the tool is not without limitations. With looser layouts or pages that aren&#8217;t text-centric, the results are often gibberish and unusable. For instance, Readability would throw its hands up in the air when given a <a href="http://www.flickr.com/photos/rapturedmind/sets/72157624669047152/with/5346402926/" target="_blank">Flickr photoset URL</a>.</p>
<p>Flipboard most probably employs several scraping techniques suited to different types of web pages. With lots more to study, I shall leave the scraping part at that and revisit this space later.</p>
<p>Continued:<br />
<a href="http://www.corgitoergosum.net/2011/02/28/replicating-flipboard-part-ii-social-signals/">Part II &#8211; Social Signals</a></p>
<p>&nbsp;</p>
<p><strong>Scraping and other issues:</strong></p>
<ul>
<li><a href="http://www.cinchcast.com/scobleizer/169405" target="_blank">How @flipboard does image cropping. Interview of CTO @avh</a></li>
<li><a href="http://gizmodo.com/5594176/is-flipboard-legal" target="_blank">Is Flipboard legal</a></li>
<li><a href="http://legalintangibles.com/?p=185" target="_blank">From a lawyer&#8217;s point of view</a></li>
<li><a href="http://scripting.com/stories/2010/07/22/aboutFlipboardAndReadingSu.html" target="_blank">About Flipboard and reading surfaces</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.corgitoergosum.net/2011/01/17/replicating-flipboard-part-i-site-scraping/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

