Replicating Flipboard Part IV – Prelude
It’s been forever since the last update. But amidst a trillion other things, this Flipboard study never strayed far from my mind (and the IDE).
More detailed posts on the progress and designs of the prototype will come this week, but right now, I can’t wait to share the very first screenshots of Cloneboard Cassius.
What you see here is a cover page using a random image from Instagram, transitioning into the first articles page showing stories from my Twitter feed. The layout was generated dynamically from a custom template defined in JSON.
I’ll be putting all the code up on GitHub soon for better backup and version control.
Replicating Flipboard Part III – How Flipboard lays out content
Shifting the focus back to the iPad app, let’s take a look at how Flipboard processes and lays out Facebook and Twitter feeds. Does it really employ any social signal based ranking?
Here’s a sample of some Twitter feeds I receive, shown in a browser and in Flipboard:
Sample Twitter feed
Same group of Twitter feeds viewed in Flipboard
Alright… what can be deduced from this? A few things become obvious when the same data is shown side by side:
- The ranking of the Twitter articles in Flipboard more or less retains the original sort-by-time order! I don’t think there is any clever hidden social ranking at play here.
- Some Tweets were dropped by Flipboard, presumably because they do not contain links. Examples like (#3, #11, #15, #17, #18) are simply Twitter conversations.
- Tweets that have links to sites with Images (e.g. #7, #9) seemed to be given higher display priority.
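The three observations above can be sketched as a small filter. This is only a guess at the behaviour, not Flipboard's code: the `Tweet` fields and the `select_tweets` and `pick_featured` names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical minimal tweet record; field names are illustrative."""
    position: int            # 1-based position in the original timeline
    text: str
    has_link: bool = False
    has_image: bool = False  # the linked page contains a usable image


def select_tweets(timeline):
    """Drop link-less tweets and keep the original time order."""
    return [t for t in timeline if t.has_link]


def pick_featured(tweets):
    """Assumption: the first tweet with an image gets the most prominent slot."""
    for t in tweets:
        if t.has_image:
            return t
    return tweets[0] if tweets else None
```

Conversation-only tweets such as #3 or #11 would fall out in `select_tweets`, while something like #7 or #9 would be a candidate for `pick_featured`.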
What about Facebook? The typical Facebook feed is a lot more complex, with mixed content ranging from check-ins and uploaded images to Likes and notifications from whatever apps/games a user has added. The prospect of having to analyze so many different types of content is daunting enough in itself, let alone the additional effort of re-ranking them and laying them out in an app.
As above, this is a snapshot of my Facebook News feed (names, faces and updates blurred out – don’t worry people!) in a browser and in Flipboard:
Facebook news feed
Facebook in Flipboard
Immediately we are struck by much more intriguing findings:
- Flipboard completely ditched the published time of the feed articles and laid them out entirely based on readability attributes (text length, image size, etc.).
- Quite a few Facebook articles were dropped:
  - Article #5 – a Facebook Places check-in: due to lack of images?
  - Article #6 – an image uploaded from an iPhone. Not sure why it wasn’t shown in Flipboard.
  - Articles #7, #13 – these were “xxx started using xxx app” messages, which contain no links or images and, frankly, nothing interesting.
  - Articles #20, #21 – these were messages from Groupon. As article #14 was also a Groupon article, I guess #20 and #21 were dropped because Flipboard avoids showing too many articles from the same source all at once.
And while looking at Facebook articles, we might as well take another quick look at whether Facebook Likes contribute to the layout or ranking at all. The verdict – NOPE. If social signals played a major role in ranking, then #12 and #19 should have had higher placements, on Page 1 or 2 at least.
- #1 – 1 reply
- #2 – 1 Like
- #3 – 0 Likes (but 82 youtube Likes)
- #8 – 0 Likes
- #4, #9 – 0 Likes
- #11 – 4 Likes, 7 Comments
- #12 – 1 Reply, 92 Likes
- #14 – 0 Likes
- #16 – 3 Comments
- #19 – 34 Likes
- #17 – 4 Likes
- #18 – 2 Likes
- #22 – 2 Likes
From such observations, I suspect Flipboard runs workflows that roughly consist of the steps below:
Likely processing workflow for Facebook feeds
Likely processing workflow for Twitter feeds
Initially, I found it hard to accept this interpretation of templates-over-content. It basically says there’s no magic: Flipboard merely puts together a random collection of page templates, then proceeds to fill those templates with the next most suitable article from a content feed.
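A minimal sketch of that templates-over-content idea, assuming made-up slot types and a made-up one-article-per-source-per-page rule. None of the names here (`TEMPLATES`, `fits`, `fill_pages`) come from Flipboard; they only illustrate the suspected workflow.

```python
import random
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    source: str
    text_length: int
    has_image: bool = False


# Hypothetical page templates: each one is just a list of slot types.
TEMPLATES = [
    ["image_large", "text", "text"],
    ["text", "image_small", "text", "text"],
    ["image_large", "image_small"],
]


def fits(article, slot):
    """An image slot needs an article with an image; a text slot takes anything."""
    return article.has_image if slot.startswith("image") else True


def fill_pages(articles, rng=None, max_per_source=1):
    """Pick a random template per page and fill each slot with the next suitable article."""
    rng = rng or random.Random()
    queue = list(articles)
    pages = []
    while queue:
        template = rng.choice(TEMPLATES)
        page, used_sources = [], {}
        for slot in template:
            for i, art in enumerate(queue):
                if fits(art, slot) and used_sources.get(art.source, 0) < max_per_source:
                    page.append((slot, queue.pop(i)))
                    used_sources[art.source] = used_sources.get(art.source, 0) + 1
                    break
        if not page:
            # Nothing fit this template: give one article its own page
            # so the loop always makes progress.
            page.append(("text", queue.pop(0)))
        pages.append(page)
    return pages
```

Because the template is chosen at random for every page, two runs over the same feed can produce quite different layouts, which is the behaviour this interpretation predicts.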
To put this suspicion to the test, I switched on Flipboard again after a few more articles had emerged in my Facebook feed. Given a largely unchanged set of data, if Flipboard employed content-centric ranking criteria, the layout of the pages should remain more or less the same. From the screenshots below, this was clearly not the case. Notice how differently the same data was laid out:
Original render in Flipboard
Second render of the same Facebook feed
So that concludes the brief study on Flipboard’s layout algorithm. From this point on I shall start firing up the IDEs and getting my hands dirty on building that prototype. Pretty thrilled by the amount of attention and support this little project seems to be garnering – thanks and please stay tuned!
Replicating Flipboard Part II – Social Signals
A quick update on the project.
Most people never stray far from Flipboard’s default sections (e.g. Tech, Gaming, Fashion), which feature collections of articles from sources hand-picked by the Flipboard editorial team. Pay closer attention, though, and you will notice that the layout of those articles in the Flipboard grids is not simply based on published time or source. Length of the headline, size of the embedded image, similarity of the topics: all these factors seem to come into consideration. Then of course the social strength (tweet count, retweet count, Facebook Likes, etc.) of a particular article is one major relevancy factor too. In my opinion, the adoption of a brand new ranking paradigm – social strength – was the single most ground-breaking idea in Flipboard’s design, not unlike how Google invented PageRank and changed the game on search.
This post is not about Flipboard’s layout algorithm (I hope to do one someday), but rather a quick detour into social signal scraping. The more social signal information we can gather about an article, the easier it is to rank it in terms of interestingness.
After a few hours of fiddling (I wasted countless moments not realizing Twitter had removed basic authentication support), the two pieces of information I was looking for became accessible.
Twitter Shares
This couldn’t be simpler. Just do a curl to the Count API:
curl "http://urls.api.twitter.com/1/urls/count.json?url=web_site_url"
e.g. curl "http://urls.api.twitter.com/1/urls/count.json?url=http://edition.cnn.com/2011/WORLD/europe/02/10/index.html"
which returns
{"count":4,"url":"http://edition.cnn.com/2011/WORLD/europe/02/10/egypt.protests.london/index.html/"}
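For use from a script instead of the shell, the same call might look like the sketch below. The endpoint is the one quoted above; it was never an officially documented API and may no longer respond, so treat `fetch_count` as illustrative. The function names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as quoted in the post; availability is not guaranteed.
COUNT_ENDPOINT = "http://urls.api.twitter.com/1/urls/count.json"


def build_count_url(page_url):
    """Build the request URL with the target address properly escaped."""
    return COUNT_ENDPOINT + "?" + urlencode({"url": page_url})


def parse_count(body):
    """Return the share count from a Count API JSON response body."""
    return int(json.loads(body)["count"])


def fetch_count(page_url, timeout=10):
    """Call the endpoint and return the count (requires network access)."""
    with urlopen(build_count_url(page_url), timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))
```

Escaping the target URL matters once it carries its own query string, which the bare curl examples gloss over.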
There are some limitations, though. Most notably, it doesn’t follow redirects and treats different addresses of the same web page as totally separate:
# all of these return different results
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954"
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://fb.me/R97GMpug"
curl "http://urls.api.twitter.com/1/urls/count.json?url=http://www.facebook.com/BufSabres/posts/10150099820062954?somefakeparam"
We could definitely use some pre-processing before calling the Count API.
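One possible shape for that pre-processing, as a rough sketch: resolve short links first, then normalize the address. Dropping the query string by default is my own assumption and would be wrong for sites that identify pages by query parameters, hence the `keep_query` switch. Both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def normalize_url(url, keep_query=False):
    """Lower-case scheme and host, drop the fragment, optionally drop the
    query string, and trim trailing slashes from the path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    query = parts.query if keep_query else ""
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))


def resolve_redirects(url, timeout=10):
    """Follow redirects (e.g. short links) and return the final address.
    Requires network access."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.geturl()
```

With this, the `?somefakeparam` variant above collapses to the same address as the plain post URL before the count is requested.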
Facebook Likes
Then there’s the beast of Facebook Likes. Posing as an innocent function for simply declaring your interest in any entity (a web page, a person, an event, etc.), Likes is a tour de force from Facebook’s arsenal for pushing forward their Social Graph vision. When a Like button is added to any web page, that page automatically becomes a living entity in Facebook’s Open Graph repository. In other words, if there’s a Like button on your web site, Facebook is able to analyze what type of entity your site represents (a blog, a movie, a sports team, etc.) and who has expressed interest in it. Whether that ambition of categorizing the entire web and building social graphs around it is a step towards a semantic web or a potentially huge invasion of privacy is a matter for another discussion.
As of now, Facebook’s API does not have a convenient way of returning the Likes count, so as usual, one has to do it via scraping. The official documentation points to the “Facebook URL Linter” for obtaining the Social Graph info stored by Facebook on any web site. The URL Linter page shows the unique Social Graph ID for a web site. We can grab that ID and then invoke the Graph API to obtain the Likes count.
e.g. The Matrix page on RottenTomatoes has a Social Graph ID of 119655798047100, and from “http://graph.facebook.com/?id=119655798047100” we know it’s been liked on Facebook 571 times.
{
"id": "119655798047100",
"name": "The Matrix",
"picture": "http://profile.ak.fbcdn.net/hprofile-ak-snc4/162036_119655798047100_2294421_s.jpg",
"link": "http://www.rottentomatoes.com/m/matrix/",
"category": "Movie",
"website": "http://www.rottentomatoes.com/m/matrix/",
"description": "Average Rating: 7.4/10 \t\t\tReviews Counted: 126 \t\t\tFresh: 109 | Rotten: 17",
"likes": 571
}
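Pulling the count out of a response like the one above is a one-liner once the JSON is parsed. A small sketch, assuming the response keeps the shape shown here; the helper names are mine, and `graph_url` simply rebuilds the address quoted in the post.

```python
import json


def graph_url(graph_id):
    """Rebuild the Graph API address used in the example above."""
    return "http://graph.facebook.com/?id=" + str(graph_id)


def likes_from_graph(body):
    """Return the `likes` field of a Graph API JSON object, or None if absent."""
    likes = json.loads(body).get("likes")
    return int(likes) if likes is not None else None
```

Returning `None` instead of raising makes it easy to fall back to another source when an object carries no Likes count.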
Except that for reasons unknown, the Linter does not always return the Social Graph ID, like “http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.oscars.com”, “http://developers.facebook.com/tools/lint/?url=http%3A%2F%2Fwww.apple.com%2F” and many others.
For that matter, it is perhaps easier for us to bypass the Graph API altogether and directly scrape the iframe that site owners use to add the Like buttons.
All that’s required is a curl followed by some ugly regex filtering code on the returned markup from line 22:
<div id="connect_widget_4d6b574b6fe272202491168" class="connect_widget"> <table class="connect_widget_interactive_area"> <tbody> <tr> <td class="connect_widget_vertical_center connect_widget_button_cell"> <div class="connect_button_slider"> <div class="connect_button_container"> <a class="connect_widget_like_button clearfix like_button_no_like"> <span class="liketext">Like</span></a></div> </div></td> <td class="connect_widget_vertical_center"> <div class="connect_confirmation_cell connect_confirmation_cell_no_like"> <div class="connect_widget_text_summary connect_text_wrapper"> <span class="connect_widget_facebook_favicon"> </span> <span class="connect_widget_user_action connect_widget_text hidden_elem">You like <strong>The Matrix</strong>.<span class="unlike_span hidden_elem"> <a class="mls connect_widget_unlike_link">Unlike</a></span> <span class="connect_widget_admin_span hidden_elem"> <a class="connect_widget_admin_option">Admin Page</a><span class="connect_widget_insights_span hidden_elem"> <a class="connect_widget_insights_link">Insights</a></span></span> <span class="connect_widget_error_span hidden_elem"> <a class="connect_widget_error_text">Error</a></span></span> <span class="connect_widget_summary connect_widget_text"><span class="connect_widget_connected_text hidden_elem">You and 611 others like this.</span> <span class="connect_widget_not_connected_text">611 likes. 
<a href="/campaign/landing.php?campaign_id=137675572948107&partner_id&placement=like_button&extra_2=HK" target="_blank">Sign Up</a> to see what your friends like.</span> <span class="unlike_span hidden_elem"> <a class="mls connect_widget_unlike_link">Unlike</a></span> <span class="connect_widget_admin_span hidden_elem"> <a class="connect_widget_admin_option">Admin Page</a><span class="connect_widget_insights_span hidden_elem"> <a class="connect_widget_insights_link">Insights</a></span></span> <span class="connect_widget_error_span hidden_elem"> <a class="connect_widget_error_text">Error</a></span></span></div> </div></td> </tr> </tbody> </table> <div id="connect-widget-comment-box-markup"> <!-- <div class="connect_widget_comment_box hidden_elem"> <div class="connect_widget_comment_box_upward_nub"></div> <div class="connect_widget_comment_area"> <table class="uiGrid" cellspacing="0" cellpadding="0"> <tbody> <tr> <td><input type="text" class="inputtext connect_widget_comment_textinput DOMControl_placeholder" placeholder="Share it on Facebook with a comment..." value="Share it on Facebook with a comment..." title="Share it on Facebook with a comment..." /></td> <td><label class="connect_widget_comment_button hidden_elem uiButton uiButtonConfirm" for="u033146_1"><input value="Post" type="submit" id="u033146_1" /></label></td> </tr> </tbody> </table> </div> </div> --></div> </div>
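The "ugly regex" step could look roughly like this. It keys on the `connect_widget_not_connected_text` span seen in the markup above, so it is tied to that particular version of the widget and would break as soon as the markup changes. The function name is hypothetical.

```python
import re

# Matches e.g. `connect_widget_not_connected_text">611 likes.`
LIKES_RE = re.compile(r'connect_widget_not_connected_text">\s*([\d,]+)\s+likes?\b')


def likes_from_widget(markup):
    """Extract the Like count from the Like-button iframe markup, or None."""
    match = LIKES_RE.search(markup)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

For the Matrix markup above this picks out 611, the number shown to a visitor who is not logged in.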
So here we are, able to extract Twitter Shares and Facebook Likes for any web site. To complete the test, I rounded up a handful of random web sites from Delicious.com and re-ranked them based on social strength. The result was quite clear in terms of showing what kind of web sites might be more appealing to the masses.
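The re-ranking itself can be as simple as sorting by a combined score. The post does not say how the two signals were weighted, so the plain sum below is an assumption, as are the field names.

```python
def social_strength(tweets, likes):
    """Assumed scoring: Twitter shares and Facebook Likes count equally."""
    return tweets + likes


def rerank(sites):
    """Sort sites by descending social strength; ties keep their original order."""
    return sorted(
        sites,
        key=lambda s: social_strength(s["tweets"], s["likes"]),
        reverse=True,
    )
```

Swapping in a weighted sum or a log-scaled score only means changing `social_strength`.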
Reranking with Social Strengths
Before and After

A bunch of latest sites from Delicious
Previous:
Part 1: Site Scraping
Replicating Flipboard Part I – Site Scraping
Having taken a long good look at the social magazine Flipboard, it was time to dig beneath the cover and contemplate what kind of technology lies behind its minimalistic interface.
I began by studying the inconspicuous yet critically important Reader feature. Personally, if Flipboard were to pop open mobile Safari each time I tapped into an article, I would have deleted the app right away, so that to me is the number one feature. One of the wonders of Flipboard is how it manages to scrape relevant content from web sites in such an elegant and effortless manner.

Expanded web content view on Flipboard. wpcdn.padgadget.com
Part of my job in recent years has had to do with web scraping and crawling, and precisely because I know how scrapers and crawlers work, the speed and precision with which Flipboard does it left me very impressed – there are apparently no machine learning models, no editorial selection, no reliance on metadata, no nonsense. Most of the time, you just throw any link at the app – Facebook, bit.ly, Flickr, YouTube vids, RSS – and it swallows it and churns out the core content in the most aesthetically pleasing format, like a humble neoclassical painter. What technical options out there are capable of such an achievement?
After days of testing and failing, I was prepared to throw in the towel, abandon the idea and go back to reading about iOS and cocos2d – until incidentally a little blue icon on the desktop screamed out at me. That was the “Reader” icon in Safari’s toolbar.
Introduced in Safari 5, the Reader is a feature that extracts the most prominent content of a web page and displays it in a clutter-free overlay. The Reader works best with pages that are text-heavy, such as news articles and the like. Now that’s not unlike what Flipboard does.

Content extracted by Safari Reader
Is there any chance that Flipboard employs a scraping technique similar to Apple’s? Would it take us a step closer if somehow we could crack Safari Reader’s secret? Clues were revealing themselves here and there. After a bit of googling, I landed on the site of the Readability project by a group called Arc90.
The core of the Readability project is a JavaScript file that parses the DOM tree of a document and extracts the section where the most relevant content lies. I threw a few URLs at Readability’s test page, and was pleasantly surprised by how comparable the results were to Safari’s Reader, even for more complicated pages (some sources say that Safari does ship with a version of Readability.js bundled in).
Since the JS isn’t very practical to run in non-browser environments, I turned to the PHP port by Keyvan Minoukadeh. With some extra cleaning logic thrown on top, I was able to get very good results with the sample news article pages.
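To give a feel for the general idea, here is a toy heuristic in the same spirit: credit each paragraph's text to its enclosing container and keep the container with the most text. It is not Readability's actual algorithm, which uses considerably more signals, and the class and function names are my own.

```python
from html.parser import HTMLParser

# Elements treated as candidate content containers (an arbitrary choice).
CONTAINERS = {"div", "article", "section", "td", "body"}


class ParagraphScorer(HTMLParser):
    """Credit each <p>'s text to its nearest enclosing container element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # ids of currently open containers
        self.blocks = {}    # container id -> list of paragraph strings
        self.next_id = 0
        self.in_p = False
        self.buf = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            self.stack.append(self.next_id)
            self.blocks[self.next_id] = []
            self.next_id += 1
        elif tag == "p":
            self.in_p, self.buf = True, []

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            text = " ".join("".join(self.buf).split())
            if text and self.stack:
                self.blocks[self.stack[-1]].append(text)
            self.in_p = False
        elif tag in CONTAINERS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.in_p:
            self.buf.append(data)


def extract_main_text(html):
    """Return the paragraphs of the container holding the most paragraph text."""
    scorer = ParagraphScorer()
    scorer.feed(html)
    if not scorer.blocks:
        return ""
    best = max(scorer.blocks.values(), key=lambda ps: sum(len(p) for p in ps))
    return "\n\n".join(best)
```

A text-heavy news page has one container that clearly dominates, which is why this family of heuristics does well there and poorly on image-centric pages.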
Source: Times article versus readability php extracted result

original article on Times.com

Left to right: Flipboard render, Safari 5 Reader, Readability php port
Source: A cartoon on cracked.com versus cracked.com article scraped by readability php

A cartoon page on cracked.com

Clockwise: Safari 5 wouldn't show "Reader" mode, Readability php extracted gibberish JS code, Flipboard gave up crawling page; shows url instead
Surely the tool is not without limitations. With loosely structured pages or pages that aren’t text-centric, the results are often gibberish and unusable. For instance, Readability throws its hands up in the air when given a Flickr photoset URL.
Flipboard most probably employs several scraping techniques suited to different types of web pages. With lots more to study, I shall leave the scraping part at that and revisit this space later.
Continued:
Part II – Social Signals
Scraping and other issues:
- How @flipboard does image cropping. Interview of CTO @avh
- Is Flipboard legal
- From a lawyer’s point of view
- About Flipboard and reading surfaces
Tearing Flipboard apart
I’ve been checking out Flipboard ever since it got nominated as the iPad App of 2010. As an avid user of Google Reader and RSS (and recently Pulse), it was initially hard to see what the hype was about. Magazine style format? The future of media? Seriously?
Two weeks on, Flipboard became the primary reason for turning on my iPad. The smoothness of the browsing experience, accentuated by the attention to detail here and there, suddenly made the routine task of feed-reading much more enjoyable.
Without repeating what’s been said about the app (See this and this for good reviews), there are several attributes about Flipboard that I found particularly intriguing:
- Page flipping effect – simple yet immensely effective. This little effect single-handedly created the app’s magazine experience, to the extent that the contents themselves didn’t really matter. One can arguably say that due to this transition effect, Flipboard feels a lot more magazine-like than “real” e-magazine apps like Zinio.

image from blogs.chron.com
- Speed – it’s easy to tell that a great deal of caching is taking place behind the scenes. Images and site snippets appear in almost no time. Again, the page flipping effect has a lot to do with this. While the user’s focus is being hijacked by the slick transition, objects on the next page are busily being pre-rendered so that most of the page is ready for viewing when the transition is over. It’s impressive how one little trick serves the dual purpose of defining the app’s identity and boosting its perceived performance. Funnily enough, the door sequences in the original Resident Evil game come to mind.
- Fluid Grid Layout Engine – Flipboard arranges feed content into neatly organized grids to compose a virtual broadsheet newspaper. The interesting thing is that Flipboard attempts to choose the layout composition most suitable for the content at hand. For instance, an article containing a landscape image is placed in the topmost slot spanning the entire breadth of the screen, while a tweet linking to a Facebook album is given a full-page treatment.

image by netzkobold@flickr.com
- Readable Content – Flipboard extracts the most relevant section from the corresponding web site for display. Gone are ads, banners and all the other distractions that are most unwelcome on the iPad. The Flipboard reader is shown in-line, without annoying back-and-forth hopping between the app and mobile Safari.
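The two layout examples in the list above amount to a small rule table. A sketch of how such rules might be written down, with invented slot names and item fields; Flipboard's real rules are unknown.

```python
def choose_slot(item):
    """Pick a layout slot for a feed item.

    `item` is a dict with an optional 'kind' string and an optional
    'image_size' tuple of (width, height). All names are illustrative.
    """
    if item.get("kind") == "facebook_album":
        return "full_page"
    size = item.get("image_size")
    if size and size[0] > size[1]:
        # Landscape image: span the whole width at the top of the page.
        return "top_full_width"
    return "standard_cell"
```

Anything that matches no rule falls through to an ordinary grid cell.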
Of course that’s not all but these few features were enough to have me wooed. What’s important is that Flipboard offers us a glimpse into something that’s more than a sum of its parts – that in the future, information around us will be increasingly collected in real-time and ranked socially.
In the upcoming posts, I will be attempting to build a prototype that replicates Flipboard’s key functionalities using open source libraries and code snippets found on the web. I think it’ll be a fun and inspiring project that would help us understand better the kind of technology that lies under the hood of this amazing app. Stay tuned!
Extended reading/viewing:
- “Flipping”, zombies-style in Resident Evil :
- “The Inside Story: Flipboard’s Crazy Launch And Its Plan To Save Media”: http://www.businessinsider.com/flipboard-ceo-mike-mccue-2010-7.
Follow or Perish: The Real Lion in 10.7

As expected, the announcement of the Mac AppStore rapidly sparked a shitstorm of heated debates. Many view this as the company tightening its grip on the last mile of the computer user experience. Some say it’s a shameless regression towards 80s Microsoft-style oppression. With typically impeccable timing, Cupertino has offered little comfort by publishing its most stringent set of AppStore publishing guidelines ever (“Apps that exhibit bugs will be rejected!”).
Let’s temporarily cast aside the grandiose responsibility of accelerating computer technology and look at it from an ordinary user’s point of view. Since becoming a switcher over seven years ago (my first box was a G4 Mac Mini with 512MB RAM), I’ve purchased only a handful of third-party software packages (namely Delicious Library), opting to stay in “trial” heaven indefinitely for other apps (sorry, Cyberduck). The Mac’s pre-loaded bundle of useful software and the abundance of free/open source alternatives out there almost eliminate the need to buy anything else.
Pricing has been another factor. While incredibly helpful and well designed, does Screenflow’s $99 price point really stack up against iLife ($49), Parallels ($79), Snow Leopard ($29) or COD4 ($54)? Sure, developers have the right to charge whatever price they deem appropriate and frankly, $99 is little to ask for a piece of great software, given the amount of brains and effort behind its making. However, as is the case for music and films, software has become a commodity in the web era, and traditional marketing principles do not apply to this market; it plays by its own twisted rules. With the ubiquity of the AppStore, an unprecedented volume of downloads and a solid payment model, iOS developers have struck profitability even with a race-to-the-bottom effect in force, so there seems little reason why desktop users won’t embrace the Mac AppStore as they have the iOS AppStore.
Steam’s commercial success and Ubuntu’s rise have proven that downloading is the way to go for software distribution. All that remains is how delicate issues such as restrictions and rights of “control” are handled. Once Apple has that nailed and has enough developers convinced, the rest of the industry will have no choice but to imitate. Follow or perish. That’s what the lion in 10.7 really stands for.
Extended Reading:
- http://gizmodo.com/5668805/mac-os-x-lion-the-best-features
- http://gizmodo.com/5669210/what-wont-apple-allow-in-the-new-mac-app-store
- http://www.ubergizmo.com/15/archives/2010/04/steve_jobs_says_no_mac_app_store_in_os_x_107.html



















