Replicating Flipboard Part IV – Prelude

It’s been forever since the last update. But amidst a trillion other things, this Flipboard study never strayed far from my mind (and the IDE).

More detailed posts on the progress and designs of the prototype will come this week, but right now, I can’t wait to share the very first screenshots of Cloneboard Cassius.

What you see here is a cover page using a random image from Instagram, transitioning into the first articles page showing stories from my Twitter feed. The layout was generated dynamically from a custom template defined in JSON.

I’ll be putting all the code up on GitHub soon for better backup and version control.


10 months of BashBash. In 30 seconds.

Got introduced to the brilliant SVN visualization tool Gource, and haven’t been able to stop playing around with it since!

Here’s a visualization of BashBash’s svn trunk over the course of nearly 10 months. The clip gloriously depicts who was responsible for the lion’s share of hard work and who was sitting on the fence most of the time. :)

Seeing the beams and flashes madly dashing by, one almost forgets how much sleep was lost and how painful each cycle of software development could really be.

Logstalgia is another cool product from the team that’s worth checking out.

Here are the commands used for generating the video. I used MacPorts to install gource on Snow Leopard.


svn log -r 1:HEAD --xml --verbose --quiet > bashbash.xml

gource -a 0.5 -s 0.1 bashbash.xml --hide dirnames,filenames -1024x768 --user-scale 2.0 --highlight-users -i 30 --max-file-lag 0.1 -i 0 --max-files 0 --key --output-ppm-stream - | ffmpeg -y -b 3000K -r 60 -f image2pipe -vcodec ppm -i - -vcodec libx264 -vpre hq -crf 28 -threads 0 bashbash.mp4

# add soundtrack
ffmpeg -i coldplay.mp3 -i bashbash.mp4 -vcodec libx264 -vpre hq -crf 28 -threads 0 bashbash-bgm.mp4


JQueryMobile & Prezi




Amidst the crazily squeezed schedule of the last few weeks, a couple of things fascinated me so much I couldn’t resist getting my hands dirty.


Putting the two together, this is a Prezi demo of a simple prototype I’ve built with jQuery Mobile. The technical stuff will be added to this post later. For the moment, just sit back and be dazzled by pretty pictures flying across the screen!



K is for…

…a pile of plastic awards?


Replicating Flipboard Part III – How Flipboard lays out content

Shifting the focus back to the iPad app, let’s take a look at how Flipboard processes and lays out Facebook and Twitter feeds. Does it really employ any social signal based ranking?

Here’s a sample of some Twitter feeds I receive, shown in a browser and in Flipboard:

Sample Twitter Feed

Same group of Twitter feeds viewed in Flipboard

Alright… what can be deduced from this? A few things become obvious when the same data is shown side by side:

  • The ranking of the Twitter articles in Flipboard more or less retains the original sort-by-time order! I don’t think there’s any clever hidden social ranking at play here.
  • Some tweets were dropped by Flipboard, presumably because they do not contain links. Examples such as #3, #11, #15, #17 and #18 are simply Twitter conversations.
  • Tweets that link to sites with images (e.g. #7, #9) seem to be given higher display priority.
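The three observations above can be sketched as a small filter. This is only a guess at the behaviour inferred from the screenshots, not Flipboard’s actual code; the Tweet fields and the select_tweets and pick_lead helpers are hypothetical names made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical tweet record; the field names are assumptions for illustration.
@dataclass
class Tweet:
    position: int            # position in the original time-sorted feed
    text: str
    has_image: bool = False  # whether the linked page carries a usable image

LINK_PATTERN = re.compile(r"https?://\S+")

def select_tweets(feed):
    """Drop tweets without links, keeping the feed's original time order."""
    return [t for t in feed if LINK_PATTERN.search(t.text)]

def pick_lead(tweets):
    """Pick the first tweet with an image for the most prominent slot, if any."""
    for t in tweets:
        if t.has_image:
            return t
    return tweets[0] if tweets else None
```

Calling select_tweets on a feed and then pick_lead on the result gives the list of displayable tweets and a candidate for the largest tile.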

What about Facebook? The typical Facebook feed is a lot more complex, with mixed content ranging from check-ins and uploaded images to Likes and notifications from whatever apps/games a user has added. The prospect of having to analyze so many different types of content is daunting enough in itself, let alone the additional effort of re-ranking and laying them out in an app.

As above, this is a snapshot of my Facebook News feed (names, faces and updates blurred out – don’t worry people!) in a browser and in Flipboard:

facebook news feed

facebook in Flipboard

Immediately we are struck by much more intriguing findings:

  • Flipboard completely ditched the published time of the feed articles and laid them out entirely based on readability attributes (text length, image size, etc.).
  • Quite a few Facebook articles were dropped:
    • Article #5 – a Facebook Places-Checkin: Due to lack of images?
    • Article #6 – An Image uploaded from iPhone. Not sure why it wasn’t shown in Flipboard.
    • Articles #7, #13 – These were “xxx started using xxx app” messages, which contain no links or images and frankly, nothing interesting.
    • Articles #20, #21 – These were messages from Groupon. As article #14 was also a Groupon article, I guess #20 and #21 were dropped as Flipboard prevents showing too many articles from the same source all at once.

And while looking at Facebook articles, we might as well take another quick look at whether Facebook Likes contribute to the layout or ranking at all. The verdict: NOPE. If social signals played a major role in ranking, then #12 and #19 should have had higher placements, on Page 1 or 2 at least.

  • #1 – 1 reply
  • #2 – 1 Like
  • #3 – 0 Likes (but 82 youtube Likes)
  • #8 – 0 Likes
  • #4, #9 – 0 Likes
  • #11 – 4 Likes, 7 Comments
  • #12 – 1 Reply, 92 Likes
  • #14 – 0 Likes
  • #16 – 3 Comments
  • #19 – 34 Likes
  • #17 – 4 Likes
  • #18 – 2 Likes
  • #22 – 2 Likes

From these observations, I suspect Flipboard runs workflows that roughly consist of the steps below:

Likely processing workflow for Facebook feeds

Likely processing workflow for Twitter feeds

Initially, I found it hard to accept this interpretation of templates-over-content. It basically says there’s no magic: Flipboard merely puts together a random collection of page templates, then proceeds to fill those templates with the next most suitable article from a content feed.
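To make the templates-over-content idea concrete, here is a minimal sketch of how such a fill could work. Everything in it is an assumption for illustration: the TEMPLATES list, the Article fields and the rule that a slot either wants an image or does not are all made up, and the real app surely considers more attributes (text length, image size and so on).

```python
import random
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    has_image: bool

# Each hypothetical template is a list of slots; True marks a slot that wants an image.
TEMPLATES = [
    [True, False, False],
    [False, True],
    [True, True, False, False],
]

def lay_out(articles, rng=random):
    """Fill randomly chosen templates with the next suitable article."""
    remaining = list(articles)
    pages = []
    while remaining:
        template = rng.choice(TEMPLATES)
        page = []
        for needs_image in template:
            if not remaining:
                break
            # Prefer the first remaining article that fits the slot,
            # otherwise fall back to whatever comes next in the feed.
            match = next((a for a in remaining if a.has_image == needs_image), remaining[0])
            remaining.remove(match)
            page.append(match)
        pages.append(page)
    return pages
```

Because the templates are picked at random, running lay_out twice on the same feed will usually produce different pages, while every article still appears exactly once.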

To settle the matter, I switched on Flipboard again after a few more articles had emerged in my Facebook feed. Given a largely unchanged set of data, if Flipboard employed a content-centric ranking criterion, the layout of the pages should remain more or less the same. From the screenshots below, this was clearly not the case. Notice how differently the same data was laid out:

Original render in Flipboard

Second render of the same Facebook feed

So that concludes the brief study on Flipboard’s layout algorithm. From this point on I shall start firing up the IDEs and getting my hands dirty on building that prototype. Pretty thrilled by the amount of attention and support this little project seems to be garnering – thanks and please stay tuned!


Replicating Flipboard Part II – Social Signals

Image: http://www.flickr.com/photos/19marksdesign/

A quick update on the project.

Most people never stray far from Flipboard’s default sections (e.g. Tech, Gaming, Fashion), which feature collections of articles from sources hand-picked by the Flipboard editorial team. Pay closer attention, though, and you will notice that the layout of those articles in the Flipboard grids is not simply based on published time or source. Length of the headline, size of the embedded image, similarity of the topics: all these factors seem to come into consideration. Then, of course, the social strength (tweet count, retweet count, Facebook likes, etc.) of a particular article is a major relevancy factor too. In my opinion, the adoption of a brand new ranking paradigm, social strength, was the single most ground-breaking idea in Flipboard’s design, not unlike how Google invented PageRank and changed the game in search.

This post is not about Flipboard’s layout algorithm (I hope I can do one someday), but rather a quick detour into scraping social signals. The more social signal information we can gather about an article, the easier it is to rank it in terms of interestingness.

After a few hours of fiddling (and countless wasted moments before I realized Twitter had removed basic authentication support), the two pieces of information I was looking for became accessible.

Twitter Shares

This couldn’t be simpler. Just do a curl to the Count API:

curl "http://urls.api.twitter.com/1/urls/count.json?url=web_site_url"
# e.g.
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://edition.cnn.com/2011/WORLD/europe/02/10/egypt.protests.london/index.html"

which returns

{"count":4,"url":"http://edition.cnn.com/2011/WORLD/europe/02/10/egypt.protests.london/index.html/"}
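For use inside a script rather than on the command line, the same call can be wrapped in two small helpers. This is a minimal sketch: the function names are made up, and percent-encoding the target URL is my own choice (the curl calls above pass it raw). The response format is taken from the sample above.

```python
import json
from urllib.parse import urlencode

COUNT_API = "http://urls.api.twitter.com/1/urls/count.json"

def count_request_url(page_url):
    """Build the Count API request URL, percent-encoding the target page URL."""
    return COUNT_API + "?" + urlencode({"url": page_url})

def parse_share_count(body):
    """Extract the share count from a Count API JSON response body."""
    return int(json.loads(body)["count"])
```

Fetching count_request_url(some_url) with any HTTP client and passing the response text to parse_share_count gives the number of shares as an integer.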

There are some limitations, though. Most notably, it doesn’t follow redirects and treats different addresses of the same web page as totally separate:

# all three return different results
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954"

curl "http://urls.api.twitter.com/1/urls/count.json?url=http://fb.me/R97GMpug"

curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954?somefakeparam"

We could definitely use some pre-processing before calling the Count API.
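One possible form of that pre-processing is sketched below. It only handles the cheap part (case, query string, fragment, trailing slash); resolving short links such as fb.me would additionally need an HTTP request that follows redirects, which is left out here. The normalize_url name and the choice of what to strip are assumptions, and dropping the query string will be wrong for sites that use it to identify the page.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Lower-case scheme and host, drop query string and fragment, trim a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))
```

With this, the address carrying ?somefakeparam collapses to the same string as the plain post URL, so both would hit the Count API with one consistent key.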

Facebook Likes

Then there’s the beast of Facebook Likes. Posing as an innocent function for simply declaring your interest in any entity (a web page, a person, an event, etc.), Likes is a tour de force from Facebook’s arsenal for pushing forward their Social Graph vision. When a Like button is added to any web page, that page automatically becomes a living entity in Facebook’s Open Graph repository. In other words, if there’s a Like button on your web site, Facebook is able to analyze what type of entity your site represents (a blog, a movie, a sports team, etc.) and who expressed interest in it. Whether that ambition of categorizing the entire web and building social graphs around it is a step towards a semantic web or a potentially huge exploitation of privacy is for another discussion.

As of now, Facebook’s API does not have a convenient way of returning the Likes count, so as usual, one has to do it via scraping. The official documentation points to the “Facebook URL Linter” for obtaining the Social Graph info stored by Facebook on any web site. The URL Linter page shows the unique Social Graph ID for a web site. We can grab that ID and then invoke the Graph API to obtain the Likes count.

e.g. The Matrix page on RottenTomatoes has a Social Graph ID of 119655798047100, and from “http://graph.facebook.com/?id=119655798047100” we know it’s been liked on Facebook 571 times.

{
   "id": "119655798047100",
   "name": "The Matrix",
   "picture": "http://profile.ak.fbcdn.net/hprofile-ak-snc4/162036_119655798047100_2294421_s.jpg",
   "link": "http://www.rottentomatoes.com/m/matrix/",
   "category": "Movie",
   "website": "http://www.rottentomatoes.com/m/matrix/",
   "description": "Average Rating: 7.4/10 \t\t\tReviews Counted: 126 \t\t\tFresh: 109 | Rotten: 17",
   "likes": 571
}

Except that, for reasons unknown, the Linter does not always return the Social Graph ID. Examples include “http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.oscars.com”, “http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.apple.com%2F” and many others.

For that matter, it is perhaps easier for us to bypass the Graph API altogether and directly scrape the iframe that site owners use to add the Like buttons.

All that’s required is a curl followed by some ugly regex filtering code on the returned markup, shown here from line 22 onwards:

<div id="connect_widget_4d6b574b6fe272202491168" class="connect_widget">
<table class="connect_widget_interactive_area">
<tbody>
<tr>
<td class="connect_widget_vertical_center connect_widget_button_cell">
<div class="connect_button_slider">
<div class="connect_button_container">
<a class="connect_widget_like_button clearfix like_button_no_like">
<span class="liketext">Like</span></a></div>
</div></td>
<td class="connect_widget_vertical_center">
<div class="connect_confirmation_cell connect_confirmation_cell_no_like">
<div class="connect_widget_text_summary connect_text_wrapper">
<span class="connect_widget_facebook_favicon"> </span>
<span class="connect_widget_user_action connect_widget_text hidden_elem">You like <strong>The Matrix</strong>.<span class="unlike_span hidden_elem">
<a class="mls connect_widget_unlike_link">Unlike</a></span>
<span class="connect_widget_admin_span hidden_elem">
<a class="connect_widget_admin_option">Admin Page</a><span class="connect_widget_insights_span hidden_elem">
<a class="connect_widget_insights_link">Insights</a></span></span>
<span class="connect_widget_error_span hidden_elem">
<a class="connect_widget_error_text">Error</a></span></span>
<span class="connect_widget_summary connect_widget_text"><span class="connect_widget_connected_text hidden_elem">You and 611 others like this.</span>
<span class="connect_widget_not_connected_text">611 likes.
<a href="/campaign/landing.php?campaign_id=137675572948107&partner_id&placement=like_button&extra_2=HK" target="_blank">Sign Up</a> to see what your friends like.</span>
<span class="unlike_span hidden_elem">
<a class="mls connect_widget_unlike_link">Unlike</a></span>
<span class="connect_widget_admin_span hidden_elem">
<a class="connect_widget_admin_option">Admin Page</a><span class="connect_widget_insights_span hidden_elem">
<a class="connect_widget_insights_link">Insights</a></span></span>
<span class="connect_widget_error_span hidden_elem">
<a class="connect_widget_error_text">Error</a></span></span></div>
</div></td>
</tr>
</tbody>
</table>
<div id="connect-widget-comment-box-markup">
<!--
<div class="connect_widget_comment_box hidden_elem">
<div class="connect_widget_comment_box_upward_nub"></div>
<div class="connect_widget_comment_area">
<table class="uiGrid" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><input type="text" class="inputtext connect_widget_comment_textinput DOMControl_placeholder" placeholder="Share it on Facebook with a comment..." value="Share it on Facebook with a comment..." title="Share it on Facebook with a comment..." /></td>
<td><label class="connect_widget_comment_button hidden_elem uiButton uiButtonConfirm" for="u033146_1"><input value="Post" type="submit" id="u033146_1" /></label></td>
</tr>
</tbody>
</table>
</div>
</div>
--></div>
</div>
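The “ugly regex filtering” could look something like the sketch below. It keys on the connect_widget_not_connected_text span seen in the markup above; the extract_like_count name is made up, and the handling of a singular “like” and of thousands separators is a guess, since the sample only shows “611 likes.”

```python
import re

# Matches the text shown to logged-out visitors, as in the sample: "611 likes."
# The optional "s" and the comma handling are assumptions, not observed output.
LIKES_PATTERN = re.compile(
    r'connect_widget_not_connected_text">\s*(\d[\d,]*)\s+likes?\b'
)

def extract_like_count(markup):
    """Return the Like count found in the widget markup, or None if absent."""
    match = LIKES_PATTERN.search(markup)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

Feeding it the widget markup above should yield 611; markup without the span gives None, which is a hint that Facebook has changed the widget and the pattern needs updating.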

So here we are, able to extract Twitter Shares and Facebook Likes for any web site. To complete the test, I rounded up a handful of random web sites from Delicious.com and re-ranked them based on social strength. The result was quite clear in terms of showing what kind of web sites might be more appealing to the masses.
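For illustration, one simple way to turn the two counts into a single score is sketched below. The scoring is an assumption, not necessarily what was used for the charts: each signal is min-max normalized to the 0..1 range and the two are averaged with equal weight. The function names are made up.

```python
def min_max(values):
    """Scale values to the 0..1 range; all-equal inputs map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_by_social_strength(sites):
    """Rank a non-empty list of (url, twitter_shares, facebook_likes) tuples.

    Returns (url, score) pairs sorted from strongest to weakest.
    """
    shares = min_max([s[1] for s in sites])
    likes = min_max([s[2] for s in sites])
    scored = [
        (site[0], (sh + lk) / 2)
        for site, sh, lk in zip(sites, shares, likes)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Equal weighting is the simplest choice; since Like counts tend to run much higher than share counts, normalizing each signal separately keeps one from drowning out the other.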

Reranking with Social Strengths

Random sites from Delicious, with their social strength scores extracted and normalized

Before and After

A bunch of latest sites from Delicious

Same group of sites, ranked by their Social Strength

Previous:
Part 1: Site Scraping
