SkyGrid and the Emergence of Flow-Based Search
GigaOm had a post today on a company called SkyGrid and its official company launch. As an investor, advisor, and beta-user of the platform, I thought I would chime in with my own self-serving post mostly because I wanted to talk about the advanced technology and architecture behind SkyGrid and why it makes the company such an interesting case study in the evolution of search technology.
Simply put, SkyGrid represents a massive and exciting departure from traditional search architectures and technologies. If I had to sum it up in a word, I would say that SkyGrid represents what I consider to be one of the first "flow based" search architectures, while traditional search engines are "crawl based" architectures.
Old Search: Crawl/Index/Query
While the technical departure was necessitated by the leading-edge demands of investment professionals, it was these needs, and traditional search's inability to meet them, that exposed some of the most glaring weaknesses of traditional search technology. Specifically, traditional search technology and architectures suffer from the following weaknesses:
- Crawl-based: Current search architectures collect information to index primarily by employing massive farms of "crawlers" that systematically crawl IP address spaces. The benefit of crawling is that it is exhaustive; the drawback is that it is time-consuming and expensive.
- One-off: Search platforms are designed around rapidly processing one-off queries. This makes search engines highly useful and adept at finding "the needle in the haystack" but very cumbersome to use in situations where one just wants to get new results to the same old query.
- Batch-based: PageRank and the other "secret sauce" algorithms behind most search engines today require a very expensive and complicated indexing process to be performed on "snapshots" of data. It can be days or even weeks before newly published content is crawled and properly indexed, meaning that most search engines fail to provide "real time" results for all but the most popular content sources (which they crawl very frequently).
- Unabridged: Search engines are exhaustive in that they return every URL that mentions a string. This is good if you are looking for a needle in the haystack, but bad if you are trying to search on a common term such as "Google" or "Microsoft". While ranking algorithms do a great job of ordering results according to likely relevancy, they don't filter down the number of results. Since most users don't go past the first page of results, this makes it quite easy to miss relevant information that for some reason doesn't rank in the top 10 results.
- Unstructured: Search engines typically present query results as a simple list without context or analytics, beyond, say, separating them by a simple criterion, such as text versus images. While some progress has been made in terms of trying to cluster results or help users filter them, by and large, users still just get an unprocessed, unanalyzed data dump when they do a search.
- Retrospective: Search today is focused on determining what has happened in the past: who wrote what, who said what, etc. However, this does little to help people figure out what will happen in the future.
Without giving away the farm, SkyGrid represents an exciting departure from the search technologies and architectures of the past. This change has been made possible by several factors including the widespread deployment and adoption of ping servers and RSS/ATOM feeds, dramatic improvement in several areas of artificial intelligence and unstructured data analytics, and new stream-based methods of database and query design.
SkyGrid Search: Flow/Filter/Analyze
When you put all of these technologies together, along with a laser-like focus on solving some of the unique high-end demands of investment professionals, you get a radical new search architecture and technology that not only solves some very pressing and pragmatic problems facing investors, but holds the potential to actually predict the pattern and influence of idea/meme propagation throughout the internet and from there into the financial markets and beyond.
Specifically, SkyGrid's search architecture differs from traditional search engines in that it is:
- Flow-based: SkyGrid treats the web as a giant pub-sub system or at least it does to the extent that the rapidly growing RSS/Ping server infrastructure does. It does not crawl the web, but rather the web "flows" to it.
- Persistent: SkyGrid persists queries over time so that incremental results are delivered with no additional action by the user. One can easily see how this would be valuable in the case of something like, oh say, a stock, which persists from day to day.
- Real-time: Rather than using batch-based indexing, SkyGrid uses a real-time stream-like query system that queries (and analyzes) new content as it flows into the system. This is particularly useful in situations, such as investing, where a few minutes or seconds can make a huge economic difference.
- Filtered: Rather than presenting results as a data-dump, SkyGrid uses advanced analytics in the form of entity extraction, meta-data analytics, and rules-based AI, to quickly analyze and append additional meta-data to incoming information. This enables users to easily filter data according to a number of criteria, which greatly lessens the chance of "data overload" and greatly improves the chance of "data discovery".
- Analytical: By applying highly advanced artificial intelligence, such as natural language processing, entity extraction, etc., SkyGrid is able to analyze and assess the actual content of a URL, thus enabling it to make determinations such as the sentiment (positive/negative) of information, its "velocity" and its "authority". This goes a step beyond simple meta-data filtering to creating real insights into the content.
- Predictive: SkyGrid's flow-based architecture and advanced analytics enable it to view the web as a living, breathing, changing entity. By observing the propagation of information over time and across downstream nodes, SkyGrid is in a position to not only assess the "authority" and "influence" of individual nodes, but it should ultimately be able to make reasonable predictions about which information will flow where on the web. By correlating this observed "flow" over time with observed movements in things such as, oh say, stock markets, company sales, etc., it can not only assess how sensitive real-world changes have historically been to changes on the web, but it should ultimately be able to predict, with reasonable accuracy, many of those changes. Yes, I said it: SkyGrid and its new search architecture may ultimately predict the future.
I realize that the last point is at the very least hyperbolic and at worst disingenuous, but as an early beta-user I can tell you first hand that once you see it in action and understand the architecture, predicting the future, in some very specific, limited, yet potentially highly valuable ways, is certainly not something beyond the realm of reason and indeed something that seems quite possible given the progress to date. That said, SkyGrid is still a beta platform and many features have yet to be implemented in part or in full, but the promise and potential is undeniably there.
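To make the flow/filter/analyze idea a bit more concrete, here is a toy Python sketch of what a persistent, flow-based query might look like. To be clear, this is my own illustration and not SkyGrid's actual code or API: the names (`FeedItem`, `PersistentQuery`, `FlowEngine`), the watch list, and the keyword-based "sentiment" are all assumptions made up for the example, and a real system would use proper entity extraction and sentiment models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Hypothetical watch list and word lists, invented for this sketch.
# Keyword matching stands in for real entity extraction and sentiment analysis.
KNOWN_ENTITIES = {"google", "microsoft", "yahoo"}
POSITIVE_WORDS = {"beats", "surges", "upgrade"}
NEGATIVE_WORDS = {"misses", "plunges", "downgrade"}


@dataclass
class FeedItem:
    """One piece of content pushed to the engine (e.g. from a feed or ping)."""
    url: str
    text: str
    entities: Set[str] = field(default_factory=set)
    sentiment: int = 0


def annotate(item: FeedItem) -> FeedItem:
    """Filter/analyze step: attach toy metadata to an incoming item."""
    words = {w.strip(".,!?\"'").lower() for w in item.text.split()}
    item.entities = words & KNOWN_ENTITIES
    item.sentiment = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    return item


@dataclass
class PersistentQuery:
    """A standing query: it stays registered and fires on every new match."""
    entity: str
    on_match: Callable[[FeedItem], None]

    def matches(self, item: FeedItem) -> bool:
        return self.entity in item.entities


class FlowEngine:
    """Content is pushed in; nothing is crawled and there is no batch index."""

    def __init__(self) -> None:
        self.queries: List[PersistentQuery] = []

    def subscribe(self, query: PersistentQuery) -> None:
        self.queries.append(query)

    def publish(self, item: FeedItem) -> int:
        """Annotate one incoming item and deliver it to every matching query."""
        annotated = annotate(item)
        delivered = 0
        for query in self.queries:
            if query.matches(annotated):
                query.on_match(annotated)
                delivered += 1
        return delivered


def demo() -> List[FeedItem]:
    """Register one standing query, then let three items flow in."""
    hits: List[FeedItem] = []
    engine = FlowEngine()
    engine.subscribe(PersistentQuery("yahoo", hits.append))
    engine.publish(FeedItem("http://example.com/1", "Yahoo surges on takeover bid"))
    engine.publish(FeedItem("http://example.com/2", "Google misses estimates"))
    engine.publish(FeedItem("http://example.com/3", "Local weather report"))
    return hits
```

The point of the sketch is the inversion of control: the user asks once, and each new item is matched against the stored queries as it arrives, rather than the user re-running a query against a periodically rebuilt index.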
Why won't SkyGrid simply be put out of business by the big players like so many other search-oriented start-ups? First and foremost because SkyGrid is delivering a premium product to a group of users that will pay significant sums for something that not only dramatically improves their daily productivity but holds out the promise of providing insightful, market-oriented analytics that they simply can't get elsewhere. Second, the existing search engines cannot compete effectively against SkyGrid because to do so would require a reengineering of their basic search architectures to address all of their shortcomings relative to SkyGrid. Moving from a traditional crawl/index/query architecture to a flow/filter/analyze one is a decidedly non-trivial undertaking, one that would require an entire re-architecture of their core services and thus one highly unlikely to be made.
Well then, does that mean that SkyGrid will put the "legacy" search engines out of business? Not at all. The current search engines are optimized to deal incredibly well with the vast majority of queries from the vast majority of users and they will likely continue to do so for some time. Next generation flow-based platforms such as SkyGrid are, by design, tackling a subset of the available queries, but arguably a very valuable subset. Indeed that's why SkyGrid can charge $500/seat/month for its services while the existing search engines must give away their services for free and make their money on advertising.
Now I can see a lot of people being skeptical after reading this about both my ability to impartially judge SkyGrid's next generation search technology as well as its market potential. To them I would say: just keep your eyes out for some announcements over the next month as I think they will conclusively demonstrate that a number of people far more knowledgeable and accomplished than I see the same potential.
Microsoft/Yahoo: A Bad Deal For Silicon Valley: Take II
Marc Andreessen has posted a very thoughtful rebuttal to my argument (as well as Fred's and a few others') that the Microsoft/Yahoo deal is potentially a bad thing for Silicon Valley. The funny thing is that I actually hadn't noticed the post yet in my feeds but it was brought to my attention by a number of people who were basically like "Oooo, you've been served!" and were wondering if I was going to challenge Marc to some kind of blog-off or something.
I hate to disappoint folks, but after reading Marc's post I actually agree with most of what he has to say, especially his overarching message to start-ups, which I took to be "Focus on building a great business and the exit will take care of itself". Marc also made a number of other good points around the M&A environment which, if I could sum them up, were basically "Hey, life will go on, other companies will try to take up the slack, it's not the end of the world." In particular I think his point that a combined Microsoft/Yahoo may prompt some second tier firms to increase their M&A is a good one. I also agree that over the long term, creating a big bureaucratic behemoth such as Microsoft/Yahoo is a good thing for start-ups because it means that start-ups will likely be able to dash ahead of the lumbering giant and secure fresh new areas of opportunity well before the folks at Microhoo even file their TPS reports and get out of their staff meetings.
That said, I still think that Microsoft's acquisition of Yahoo is a net negative from an M&A perspective. Yes, it's certainly not the end of the world, but on the whole and on the average it's never a positive thing to have an active, well-endowed acquirer removed from the mix. Yahoo may not have been buying 50 start-ups a year, but they were still one of the most active Internet acquirers not just in terms of deals, but also in terms of bids. Indeed the most important party in any deal is not the actual buyer but the second place bidder and Yahoo had seemed to make a career out of being the second place bidder lately. Finally, thanks to its huge market capitalization, massive traffic and strong (although not relative to Google) monetization platform, Yahoo is one of the few Internet acquirers who have the luxury of being able to easily drop $50-$100M on a "feature" without really thinking about it. I totally agree with Marc that if you are building a "feature" with the intent of getting acquired by Yahoo or whoever, you were likely doomed to failure a long time ago, but at the same time, the cynic in me has seen a lot of "features" get funded in the valley over the past two years, often under the assumption that if they get enough eyeballs one of the big three M&A fairies will swoop in and drop $100M just to "keep up with the Joneses".
So I agree that life will go on in the valley and there are some real positive non-M&A aspects of the deal for start-ups, but at the same time, I think net, net it's bad for the M&A environment. That may change over time as new companies emerge to take up the slack, but over the next 24 months things could be a bit rough because not only will you have Microsoft and Yahoo thoroughly distracted, but IAC is going to be a complete mess due to its dispute with Liberty and AOL appears consumed with consummating its death spiral within Time Warner. I am sure M&A bankers will do their best to keep the deals flowing, but if you have an Internet start-up, given the turmoil within the big acquirers and the rapidly deteriorating economic environment, as Marc suggests, you should definitely just keep your head down and focus on building a real business.
Microsoft/Yahoo: A Bad Deal For Silicon Valley
There's a ton of discussion today about Microsoft's unsolicited bid for Yahoo. Much of the discussion focuses on whether or not the deal is a good thing for Microsoft, Yahoo or Google's shareholders. While it's possible it could be a good or bad deal for one, the other, or all three, one thing is for sure: this is a bad deal for Silicon Valley start-ups and their VCs.
How could that be? Because by swallowing up Yahoo, Microsoft will be removing one of the biggest and most active acquirors of start-ups in Silicon Valley. The intense competition between Microsoft, Google, and Yahoo has arguably been one of the main factors helping drive up M&A activity and prices for internet related start-ups. It seems like every rumored acquisition over the past few years has had all three fighting in some way to win the deal.
Even though Yahoo has been wounded of late, it still had a market cap in the tens of billions of dollars, which allowed it to be a legitimate competitor for any deal under $1BN, and in fact Yahoo has been a pretty active player in that market, whether it's del.icio.us, Flickr, Rivals, etc.
If it's acquired by Microsoft, that will leave only two Internet media/search acquirors with the ability to easily do sub $1BN deals. What's more, while Microsoft has recently shown a willingness to do really big deals such as aQuantive and now Yahoo, it has traditionally been less willing to do smaller "tuck in" deals, deals that Yahoo has traditionally been much more active in. Indeed, Microsoft has traditionally been dismissive of these deals because they just don't move the needle for them and their engineering staffs still retain a relatively high degree of NIH ("not invented here") attitude.
Losing one of the Valley's most reliable "tuck in" acquirors and second place bidders is a net negative for the Valley. It will make M&A less competitive in general and will reduce the number of potential exits for "me too" start-ups to two instead of three. That's bad news for Internet content/search start-ups and their VC backers any way you look at it.