Is RFID Application Oriented Networking’s Killer App?
RFID is a widely hyped next-generation technology for wirelessly “scanning” items. AON, or Application Oriented Networking, is a widely hyped next-generation technology for managing middleware messages. Put the two technologies together and not only do you have a whole lot of hype, but you may also have a match made in technology heaven, one that ends up making both technologies better.
AON's Second Generation
I have written about AON, or what I prefer to call Message Aware Networking, quite a bit in the past. AON is at the core of a revolution in software architectures that is moving a significant amount of functionality out of applications and pushing it down into the network. The first generation of AON devices act as simple gatekeepers/routers and are limited to pre-processing messages that they then deliver to applications. Datapower, which was recently sold to IBM, was the leading player in this space. While the space is still quite young, already a 2nd generation of AON devices is under development. These devices will not simply serve as intermediaries, but they will actually become an integral part of an application and an embedded part of a business process. By embracing the concepts of distribution and virtualization, as well as standards such as BPEL, these 2nd generation devices will turn the network into an integrated part of a given application and in doing so should enable significant new functionalities while reducing costs and improving performance.
RFID: Almost Ready For Prime Time
RFID seems worlds away from AON, but RFID’s future may be more closely tied to AON than one might at first suspect. Much of the attention around RFID has focused on the core hardware technologies required to make RFID a reality. Not only are there very difficult RF-related physics problems that must be solved, but the costs of silicon-based RFID “tags” must be reduced in order to make RFID economically viable. Now however, with companies such as Impinj shipping cost effective RFID 2.0 tags in volume and with retailers like Wal-Mart mandating RFID adoption, it appears as if RFID is finally poised for rapid growth.
Unfortunately for RFID, if anyone ever does install the hardware, the first thing they are likely to do is crash all of their software. That’s because a fully instrumented RFID installation is bound to kick off thousands, even millions, of messages a day, and processing all of these messages is likely to bring even the most robust computer system to its knees. To solve these performance issues, companies will either have to invest a fortune in significantly upgrading their core logistics systems or figure out another way to get the job done.
That other way may be a new class of AON-based devices specifically focused on RFID. Such devices could not only locally process millions of RFID messages, thus greatly reducing the overall “load” generated by RFID systems, but they could also offer new capabilities, such as the ability to efficiently “multiplex” messages across an entire supply/demand chain. For example, AON-based RFID devices could use cached lists of expected shipments to identify in real time when specific shipments fail to arrive or arrive incomplete. These devices could also simultaneously notify all of the different supply chain members about missed shipments or out-of-stock events. While a retailer’s or distributor’s centralized computer systems could no doubt do all of this, constantly hitting those servers with updates on every single SKU from every single store, warehouse, truck, etc. would not only be wildly inefficient, it might also distract those systems from more important higher-level tasks. In this way, AON-based RFID devices provide a relatively cheap way to “offload” the processing of routine RFID-related messages to the network.
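The shipment-reconciliation idea above can be sketched in a few lines. The manifest and tag-read formats here are invented for illustration, not drawn from any actual AON product:

```python
from collections import defaultdict

def reconcile_shipment(expected_manifest, tag_reads):
    """Compare raw RFID tag reads at a dock door against a locally
    cached shipment manifest and return only the exceptions
    (shortages) worth forwarding to the central system."""
    observed = defaultdict(int)
    for read in tag_reads:                     # one event per tagged item
        observed[read["sku"]] += 1
    exceptions = {}
    for sku, qty in expected_manifest.items():
        if observed[sku] < qty:                # short, or missing entirely
            exceptions[sku] = {"expected": qty, "received": observed[sku]}
    return exceptions
```

The point is the ratio: thousands of raw read events stay local at the dock door, and only the small exceptions dictionary ever travels upstream to the central system.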
Startups To the Rescue
But don’t take my word for it, just look at the start-ups that are springing up to take advantage of this potential opportunity, the most interesting of which, to date, is a company called Omnitrol. Omnitrol has built a very intriguing RFID-focused AON-based device. Unlike many 1st generation AON players, which used Intel-based Linux “pizza boxes” as their hardware platform, Omnitrol has engineered a sophisticated hardware platform that uses Gigabit Ethernet to interconnect multiple RISC processors, allowing for greatly increased message-processing capabilities within a single box and the ability to seamlessly interconnect multiple boxes. Such an architecture not only means that Omnitrol should be able to process massive message loads, but it also offers a great example of how 2nd generation AON devices are likely to use much more sophisticated hardware platforms that look more like network devices than enterprise servers.
With Omnitrol’s AON-based RFID devices in place, retailers can keep a large portion of their RFID processing “local”, i.e. confined to the actual store or distribution center where it takes place. Only critical events and/or summaries of activities need be forwarded to the central system (which in most big retailers remains an IBM mainframe). The RFID devices have also been designed from the ground up to be compliant with the various EPCGlobal RFID related standards and have the ports needed to accommodate a variety of RFID readers. In this way, Omnitrol has customized both the hardware and software of its device so that it is RFID centric.
In addition to Omnitrol another start-up focused on this space is Reva Systems. Reva is already shipping a device that is focused on managing RFID networks and clearly has designs on providing more sophisticated “application aware” capabilities over time.
Despite their promise, the success of these RFID-focused AON start-ups heavily depends on how quickly large companies embrace and roll-out RFID. While recent signs seem to indicate that adoption should dramatically accelerate in the next year or two, this is still a major risk.
A Sea Change?
Despite this risk, both companies are great examples of how Application Oriented Networking is rapidly evolving. Perhaps the most interesting trend that these devices underscore is the ability to tailor an AON device, at both a hardware and software level, to a specific application. This tailoring raises the interesting question as to whether or not we will see an explosion of application-specific AON devices, each targeting a different enterprise application. Such an explosion may not only provide a whole new class of investment opportunities, but it may also significantly challenge the dominance of general purpose computing platforms in enterprise software. After all, if you move enterprise software into the network, why won’t the devices used to process that software look a lot more like network devices than enterprise servers? Of course, we are a long way off from such a change, but it is interesting to contemplate and once again underscores that Application Oriented Networking is one of the most important and potentially disruptive trends within the software industry.
RSS and Google Base: Google Feeds Off The Web
There has been a lot of talk about Google Base today on the web and much of the reaction appears to be either muted or negative. The lack of enthusiasm seems to be driven by the fact that the GUI is pretty rudimentary and doesn't provide any real-time positive feedback (as Fred points out). But people that are turned off by the GUI should take into account that Google Base wasn't designed primarily for humans; it was designed for computers. In fact, I think if people were computers their reaction would be closer to jumping for joy than scratching their heads.
What's perhaps most interesting about the Google Base design is that it appears to have been designed from the ground up with RSS and XML at its center. One need look no further than the detailed XML Schema and extensive RSS 2.0 specification to realize that Google intends to build the world's largest RSS "reader" which in turn will become the world's largest XML database.
To facilitate this, I suspect that Google will soon announce a program whereby people can register their "Base-compliant" RSS feeds with Google Base. Google will then poll these feeds regularly, just like any other RSS reader. Publishers can either create brand new Base-compliant feeds or, with a bit of XSLT/XML Schema work of their own, simply transform their existing content into a Base-compliant feed. Indeed, I wouldn't be surprised if there are several software programs available for download in a couple of months that do just that. Soon, every publisher on the planet will be able to have a highly automated, highly structured feed directly into Google Base.
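As a sketch of what producing such a feed programmatically might look like — note that the `g:` namespace URI and the `price` element name below are assumptions for illustration, not taken from Google's actual Base specification:

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for structured listing attributes
GNS = "http://base.google.com/ns/1.0"

def build_base_feed(listings):
    """Wrap a list of structured listings in an RSS 2.0 envelope,
    putting the structured attributes in a separate XML namespace
    so ordinary RSS readers can still consume the feed."""
    ET.register_namespace("g", GNS)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example listings feed"
    for data in listings:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = data["title"]
        ET.SubElement(item, "link").text = data["link"]
        # Structured, machine-readable attribute in the g: namespace
        ET.SubElement(item, f"{{{GNS}}}price").text = data["price"]
    return ET.tostring(rss, encoding="unicode")
```

The same structure is what an XSLT transform over a publisher's existing content would emit: a plain RSS 2.0 feed whose items carry extra namespaced, structured fields.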
Once the feed gets inside Google, the fun is just beginning. Most commentators have been underwhelmed by Google Base because they don't see the big deal of Google Base entries showing up as part of free-text search. What these commentators miss is that Google isn't gathering all this structured data just so it can regurgitate it piecemeal via unstructured queries; it is gathering all this data so that it can build the world's largest XML database. With the database assembled, Google will be able to deliver a rich, structured experience that, as Michael Parekh sagely points out, is similar to what directory structures do; however, because Google Base will in fact be a giant XML database, it will be far more powerful than a structured directory. Not only will Google Base users be able to browse similar listings in a structured fashion, but they will also ultimately be able to do highly detailed, highly accurate queries.
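To illustrate the difference between free-text search and querying a structured XML store, here is a toy query over a hypothetical listings document using the XPath subset in Python's standard library (the schema is invented for illustration):

```python
import xml.etree.ElementTree as ET

LISTINGS = """<listings>
  <listing type="housing"><city>Palo Alto</city><bedrooms>2</bedrooms><price>2400</price></listing>
  <listing type="housing"><city>Oakland</city><bedrooms>3</bedrooms><price>1900</price></listing>
  <listing type="job"><city>Palo Alto</city><title>Engineer</title></listing>
</listings>"""

def query(xml_text, listing_type, city, max_price):
    """A structured query no keyword search can express precisely:
    listings of a given type, in a given city, under a price cap."""
    root = ET.fromstring(xml_text)
    results = []
    for listing in root.findall(f"./listing[@type='{listing_type}']"):
        if listing.findtext("city") == city:
            if int(listing.findtext("price")) <= max_price:
                results.append(listing)
    return results
```

A keyword search for "Palo Alto housing" can only rank documents; the structured query returns exactly the listings that satisfy all three constraints.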
In addition, it should not be lost on people that once Google assimilates all of these disparate feeds, it can combine them and then republish them in whatever fashion it wishes. Google Base will thus become the automated engine behind a whole range of other Google extensions (GoogleBay, GoogleJobs, GoogleDate), and it will also enable individual users to subscribe to a wide range of highly specific and highly customized meta-feeds. "Featured listings" will likely replace or complement AdWords in this implementation, but the click-through model will remain.
As for RSS, Google Base represents a kind of confirmation. With Google's endorsement, RSS has now graduated from a rather obscure content syndication standard to the exalted status of the web's default standard for data integration. Google's endorsement should in turn push other competitors to adopt RSS as their data transport format of choice. This adoption will in turn force many infrastructure software vendors to enhance their products so that they can easily consume and produce RSS-based messages, which will further cement the standard. At its highest level, Google's adoption of RSS represents a further triumph of REST-based SOA architectures over the traditional RPC architectures being advanced by many software vendors. Once again, short and simple wins over long and complex.
In my next post I will talk about Google Base's impact on the "walled garden" listings sites. I'll give you a hint: it won't be pretty.
SOA Under The Radar: Recap
Last night I served on a panel of VCs at IBD's "Under the Radar: SOA Death Match". The event featured 4 companies with products that were either directly or indirectly focused on enabling Service Oriented Architectures (SOA). Each company presented for 6 minutes, then the panel of VCs asked 6 minutes of questions. At the end of the event, the VC panel picked a "best in show" and the audience picked their own "people's choice".
Perhaps what I found most interesting about the conference was that you could actually get 75 people into a room on a Tuesday evening to discuss Service Oriented Architectures. Sure, this is Silicon Valley and there are lots of tech geeks who are always up to discuss the latest and greatest technology trends, but I remember in 2001/2002 when the mere mention of XML, SOAP, etc. brought puzzled stares from many in Silicon Valley. I think it just shows that the whole concept of XML and SOA has reached mainstream acceptance, at least within technology circles, and really is destined to become an important and long-term part of the technology fabric.
In case you are interested, here's an overview of the 4 companies that presented:
Appistry: Appistry was a bit of a mismatch for the conference in that they are more of an application virtualization play than an SOA play. I actually like the application virtualization space quite a bit, although many of the big players have already made acquisitions in the space so the amount of opportunity remaining for start-ups is limited. That said, Appistry seemed to have a very solid product and several good reference customers. They were a bit of a sentimental favorite for me given that the CEO was a Wash U grad and they are located in Wash U's hometown of St. Louis (not exactly the tech start-up capital), but they clearly were at a disadvantage in the competition because SOA wasn't really their sweet spot. I suspect they knew this and were really just looking to get some valley exposure for their business/fund-raising efforts, so they should have gotten an award for entrepreneurial pluck.
Blue Titan: Blue Titan's main product is a web services management platform that enables companies to provision, secure and manage lots of different web services. Their main competitors are Amber Point and SOA Software (which was supposed to present at this conference but canceled at the last moment). Blue Titan's founder and CTO presented and he was probably the most engaging presenter of the evening. Conceptually I like the web services management space a lot. I actually funded a company in early 2001 to go after this space (Maaya), but I was *way* too early and I was lucky just to get my money back. These days it looks as though the space is finally getting some traction, but the sales process is complicated by the fundamental architecture issues that come along with embracing SOA, which means it's a technical sale that requires multiple sign-offs. In one of the more humorous outcomes of the evening, Blue Titan actually won the "people's choice" award but finished last in the VC panel's voting. I think we VCs were concerned with the difficult sales cycle that Blue Titan faces while the audience was more focused on the visionary nature of the product. Blue Titan's CTO took the difference in stride and said that the vote just proved his belief that potential customers appreciated his business far more than potential VC investors did.
Ipedo: Ipedo is focused on Enterprise Information Integration (EII) which I like to call data abstraction. They aren't really focused on SOA per se, but their technology is arguably critical to the enablement of SOAs. Ipedo competes primarily with other start-ups, most notably Composite Software and MetaMatrix. I like this space a lot and actually came very close to investing in the first round of Composite Software (which I still believe is the best company in the space) but wasn't able to get my partners over the goal line. I believe Ipedo started out as more of an XML-database play, but they quickly (and correctly) realized that a more generalized EII platform had more long term promise. One of the most interesting things the CEO mentioned in his presentation was that Ipedo had an office in Shanghai, that their Chinese operations were profitable on a stand-alone basis, and that they were seeing strong demand for their EII solutions in China. Given that EII solutions are just now being adopted by many US corporations I would not have suspected that there was demand in China, but I think it just goes to show how quickly the software market is developing over there. As it turns out, Ipedo ended up winning the VC panel award. I think this had to do with the fact that Ipedo seemed to be addressing a more pragmatic and immediate business need (data integration) than SOAs, so in some senses it really isn't fair as that's really comparing apples and oranges.
Reactivity: Reactivity is a Message Aware Networking company that sells a "software appliance" focused primarily on securing XML messages as they transit a company's network. I funded one of their direct competitors, Datapower, so I am very familiar with the space. XML appliances aren't theoretically required to build an SOA, but they provide a much more secure, reliable and manageable foundation for SOAs. Reactivity has traditionally been focused almost exclusively on the security side of the equation (many refer to their product as an XML firewall). To their credit, this has really turned out to be the near-term sweet spot of the market; however, I think Reactivity's early focus has allowed some of their competitors to pigeonhole them as security-focused only, which may hurt Reactivity as customers begin to look for broader XML message platforms. The big news in this space has been Cisco's recent announcement of its AON initiative, which I think will likely force other big networking and software players to seriously consider buying some of the start-ups in the space. I asked the CEO about Cisco and he gave a very honest, straightforward and mature response about Cisco's efforts, which was very refreshing to hear from a start-up CEO. Ultimately I think both Datapower and Reactivity will do well, as the space is growing quickly and is strategically important to a number of companies.
Super Services, Process Portals and the Road to Composite Applications
Publicly accessible web services seem to be proliferating like rabbits these days. Not only are high-profile early adopters such as Amazon.com, Ebay, Google and FedEx launching a plethora of new services, but an increasing number of more obscure firms are throwing their hats into the ring, offering everything from commodity futures prices to Bible quotes.
Theoretically, this large pool of publicly accessible web services should foster the creation of a new class of “super services”. Super services simply combine several different web services into one master service. They can be custom-designed to serve the needs of a specific company or be repackaged and offered to the public as yet another service. In fact, there are already some interesting examples of enterprising developers stringing together a few web services to create rudimentary websites which themselves could be exposed as super services, such as this "mashup" of Amazon/Google/Yahoo, this mixing of Flickr and the US Government's zip code database, and this combination of Google Maps and Craigslist.
Unfortunately, creating a true super service is much harder than these early examples might suggest. To create super services developers must not only link web services at a semantic and programmatic level but they must also find a way to successfully orchestrate a business process across these services in an orderly enough fashion that a basic level of performance and transactional integrity is maintained. Luckily, emerging business process orchestration technologies, most prominently BPEL, provide a standardized mechanism for creating the process logic underpinning super services. However, while adding BPEL to the mix has tremendous benefits it also makes the act of building super services even more complex and less accessible.
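The idea of a super service can be sketched in miniature. The three service callables below are hypothetical stand-ins for remote web-service calls; in practice each would be a SOAP or REST invocation, with sequencing, compensation, and error handling managed by an orchestration layer such as BPEL:

```python
def super_service_quote(isbn, book_api, shipping_api, fx_api):
    """Orchestrate three independent services into one 'super service':
    look up a book price, add a shipping rate, and convert the total
    into the buyer's currency. Each argument is a stand-in for a
    remote web-service call."""
    price_usd = book_api(isbn)          # e.g. a retailer's pricing service
    shipping_usd = shipping_api(isbn)   # e.g. a carrier's rate service
    total_usd = price_usd + shipping_usd
    return fx_api(total_usd, "EUR")     # e.g. a currency-conversion service
```

Even this trivial composition hints at the hard parts the paragraph above describes: the three services must agree semantically on what an ISBN and a price mean, and a real orchestration must decide what happens when the second call fails after the first has succeeded.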
In recognition of both the increasing number of web services and the increasing complexity of linking them together, a new crop of start-ups has emerged including such companies as eSigma, Bindingpoint, Xmethods, and Strike Iron. Initially these start-ups appear to have the rather mundane goal of creating directories of publicly available web services or even libraries of proprietary web services (such as Strike Iron and Xignite have done), but dig a bit deeper and you realize that their ambitions may extend much further.
Take eSigma for example. I had the opportunity to chat with its founder, Troy Haaland, the other day. As Troy explained, the simple portal-like interface of eSigma actually hides an increasingly complex infrastructure. Right now, at the core of this infrastructure is a fully functioning UDDI directory. All of the services you can browse via the portal are actually formally registered in the UDDI directory, making them programmatically discoverable. The goal is to link this directory core to a higher-level process management capability via a BPEL-based visual authoring/scripting platform. Not only would such a platform allow enterprising developers to easily create and, theoretically, re-sell their own super services, but more importantly it would allow enterprises to create composite applications that exist solely in the “cloud”. Such “cloud-based” composite applications could then be used as the backbone of inter-enterprise applications.
In this way, what appear at first to be simple directories may ultimately be transformed into Process Portals, or sites that not only centralize web services meta-data, but host a set of custom-designed super-services and composite applications as well as the visual authoring tools needed to create them.
The Road Ahead
While this is clearly a long term vision, there are indications that elements of this vision may be closer at hand than one might imagine. Within the enterprise, there are already a number of products, from companies such as Amberpoint, Blue Titan, and Digital Evolution vying to manage the low-level provisioning and performance of intra-enterprise web services. As the number of web services multiplies within an enterprise, a directory infrastructure is a logical next step (indeed some products have already taken this step) and some kind of orchestration layer will also clearly be necessary if enterprises want to foster re-usability and enable the creation of super services. In some ways then, the writing is on the wall: Process Portals are an inevitable result of the increasing number of web services. The key questions outstanding then are: 1. Will these portals first make their presence felt inside the enterprise as packaged applications or outside the firewall as publicly accessible Process Portals? 2. Will de novo start-ups be best positioned to own this space or will the pre-existing web services management products “grow” into this space? and 3. Just when exactly will this space generate enough revenue to make it interesting from an investment standpoint?
Software's Top 10 2005 Trends: #6 Inter-Enterprise Applications
Conventional wisdom in the VC world is that there are very few white spaces left to fund in the Enterprise Applications area. After all, you can only fund so many CRM and Supply Chain deals.
However, with the advent of web services and technologies such as BPEL (#7) and Composite Applications (#8), the foundation for a new class of inter-enterprise applications is now in place. These Inter-enterprise applications will automate business processes and work flows across enterprises that up until now have been largely manual. These new applications will not just facilitate real-time collaboration between businesses but they will continually manage specific business processes across enterprises. In essence these applications will merge the process oriented features of applications such as PLM and Supply chain with the real time collaboration features of products such as WebEx and Microsoft’s Office Communicator.
For example, processes like dispute resolution in the credit card industry, or clearing in the securities industry, or claims processing in the healthcare industry all involve lots of enterprises working together on a common business process. While large portions of these processes are automated within individual enterprises much of the process between enterprises remains manually managed (even though a lot of data may be exchanged electronically). Creating shared visibility across the entirety of a multi-enterprise business process will bring tremendous benefits in terms of reduced costs and improved productivity. Some interesting examples of companies trying to build such next generation applications include Webify, Viacore, and Rearden Commerce.
Some existing applications, such as PLM and supply chain, may be able to successfully expand their offerings to encompass such inter-enterprise functionality. Indeed if they don’t they are probably going to go out of business. However many of these applications will need to be created from scratch given the unique technology, security, and management challenges inherent in automation of inter-enterprise processes.
For a complete list of Software's Top 10 2005 trends click here.
Software's Top 10 2005 Trends: #7 BPEL
Business Process Execution Language for Web Services (or BPEL for short) is hot. BPEL is an XML-based standard for orchestrating business processes within and between enterprises. BPEL is hot because companies increasingly realize that in order to take advantage of their emerging web services they must find a way to integrate a number of services into a cohesive business process flow.
Last year saw some significant activity on the BPEL front with Oracle acquiring BPEL pioneer Collaxa and with other companies such as IBM and BEA announcing deep integration of BPEL into their core Application server platforms. There were even several open source BPEL servers announced such as ActiveBPEL and Twister. Now that the infrastructure is in place, this year should start to see some major BPEL deployments and greater use of BPEL to orchestrate business processes between enterprises.
As BPEL matures there’s a real possibility that more and more of its functionality will be embedded directly into the actual BPEL messages making it possible for trading partners to amend workflows “on the fly” and bringing greater transparency to inter-enterprise process flows in general. In this way a BPEL message becomes a self-contained executable business process thus turning a message into software.
For VCs, BPEL presents some interesting challenges and opportunities. In terms of challenges, standards-based BPEL servers threaten to overwhelm proprietary business-process EAI platforms. On the opportunity front, while the opportunity to fund core BPEL servers has passed, there may be opportunities to fund both high level and low level BPEL deals. At a high level, BPEL servers that are customized to meet the needs of a particular industry, such as insurance or manufacturing, may present interesting investment opportunities, while at a low level BPEL gateways or routers that quickly process and transform messages may also be attractive.
For a complete list of Software's Top 10 2005 trends click here.
The Message Is The Software
Back in the 1980s, Sun Microsystems famously coined the phrase “the network is the computer” in an effort to illustrate that distributed computing, enabled by networks, was destined to triumph over monolithic CPUs. A similar and perhaps more important revolution is now underway in the software world. In this revolution, monolithic compiled binaries are rapidly being replaced by fragments of distributed code held together by increasingly robust message systems. Far from simply relaying information, these message systems are rapidly evolving into “stateful” clouds. As these vast, intelligent clouds evolve, they are becoming, in many ways, the heart and soul of modern software.
Falling to Pieces
The fragmentation of software binaries is a well-worn trend that started with some of the early component models and accelerated dramatically thanks to the adoption of the J2EE and .Net component models. With the advent of Web Services and Service Oriented Architectures (SOAs), such fragmentation has accelerated once again. As software fragments and distributes, the role of messaging systems increases in importance, for it is these messaging systems that provide the virtual “glue” necessary to hold a distributed application together.
You’ve Come a Long Way Baby
Describing message systems as merely “glue” might have been appropriate back in the days of EDI, but today’s messaging systems are far more complex and capable than their predecessors. Increasingly these systems are not just intermediaries, but an integral and inextricable part of business processes.
The foundation of the modern message system is XML. This simple, yet powerful standard has leveraged its web-based heritage to become the de-facto foundation of almost every major message standard proposed and/or adopted in the last several years.
XML’s power lies not just in its accessibility and ubiquity but in its flexibility and extensibility. Despite these advanced capabilities, early implementations of XML-based message standards tended to simply replicate existing EDI/ANSI standards and thus treat messages as mere “data sherpas” limited to hauling structured data back and forth between applications. However, as the true power of XML has become apparent, next-generation XML-based message standards have begun to incorporate more advanced capabilities.
The Rise of “Stateful” Messages
Two of the most powerful types of next generation XML-based messages standards are “transgenic” and “stateful” standards. Transgenic XML standards encapsulate text-based code fragments within an XML message. These code fragments can then be uploaded into binaries during run time. For example, it’s possible to map XML elements to java objects and then upload those elements, via a parser, directly into a Java runtime environment.
This is at once both an incredibly powerful and an incredibly scary capability. It is powerful because it allows text-based messages to modify run-time code, making it possible to do “on-the-fly” updates of user interfaces, business logic, or what have you. It also allows business logic to “travel” with data payloads, which can ensure consistent execution (e.g. encapsulating the formula necessary to calculate a complex derivative within a message about that derivative). It is scary because it could potentially turn an innocuous-looking XML message into the mother of all Trojan horses by enabling hackers to attack and change the business logic of programs while they are still running.
Given the risks of transgenic XML, most next-generation XML standards are avoiding such capabilities and instead focusing on “stateful” standards. Stateful XML standards provide mechanisms for embedding/amending not just the state of a particular operation into a message but often the business logic necessary to complete that operation, and even the larger business process context of that operation. By embedding state and business logic within a message, these standards create a truly “decoupled” and asynchronous software environment in which the message truly becomes the central focus of a software system.
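A minimal sketch of the stateful-message idea, assuming an invented XML format in which the message carries both its process definition and its current state, so any node in the messaging cloud can advance it without consulting a central process engine:

```python
import xml.etree.ElementTree as ET

def advance(message_xml, handlers):
    """Process one step of a 'stateful' message: read the current step
    from the message itself, run the matching local handler, then amend
    the state element in place so the next node knows what to do.
    The message, not a central server, carries the process context."""
    doc = ET.fromstring(message_xml)
    steps = [s.text for s in doc.findall("./process/step")]
    state = doc.find("./state")
    current = state.get("step")
    handlers[current](doc)                 # execute local business logic
    idx = steps.index(current)
    if idx + 1 < len(steps):
        state.set("step", steps[idx + 1])  # hand off to the next step
    else:
        state.set("step", "done")
    return ET.tostring(doc, encoding="unicode")
```

Because the process definition and its state travel inside the message, the sender and receiver are fully decoupled: either side can be offline, replaced, or upgraded, and the message still knows where it is in the process.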
BPEL in the Vanguard
One emerging example of an XML-based “stateful” message standard is Business Process Execution Language (BPEL). At one level, BPEL is simply a standard that defines how business partners interact on a particular business process. However, BPEL can also be used as a “stateful” standard in which the business process is both defined and managed by the message itself. As the BPEL spec itself says:
“It is also possible to use BPEL4WS to define an executable business process. The logic and state of the process determine the nature and sequence of the Web Service interactions conducted at each business partner, and thus the interaction protocols. While a BPEL4WS process definition is not required to be complete from a private implementation point of view, the language effectively defines a portable execution format for business processes that rely exclusively on Web Service resources and XML data. Moreover, such processes execute and interact with their partners in a consistent way regardless of the supporting platform or programming model used by the implementation of the hosting environment.”
While BPEL is obviously still in the early stages of becoming a “stateful” standard, it’s not hard to imagine later versions of the standard explicitly amending messages “on the fly” with state, data, and process information, thus conferring on messages many of the same capabilities as compiled binaries.
The Intelligent Cloud
As messages begin to take on more of the capabilities and responsibilities traditionally assigned to compiled binaries, the supporting messaging infrastructure must necessarily become more secure and sophisticated. The combination of massive numbers of “stateful” messages with a sophisticated infrastructure effectively creates an intelligent messaging “cloud”. Inside this cloud, messages can be routed, modified, and secured with minimal endpoint interactions. Ultimately, interactions between messages and even the creation of new messages can be accomplished within this cloud all based on pre-defined “stateful” standards and without the need for pre-compiled business logic or processes.
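To make the idea of in-cloud processing concrete, here is a minimal Python sketch of a node that routes and enriches messages using only the state embedded in the message itself, with no call back to the sending application. The routing rules and field names are hypothetical.

```python
# A minimal sketch of an "intelligent cloud" node: it routes and modifies
# messages based solely on the state the message carries with it.
def route(message: dict) -> str:
    """Pick a destination from the message's own embedded state."""
    if message.get("status") == "failed":
        return "error-queue"
    destinations = {"pricing": "pricing-service",
                    "settlement": "settlement-service"}
    return destinations.get(message.get("step"), "default-queue")

def enrich(message: dict) -> dict:
    """Modify the message in flight, e.g. stamp it for auditing."""
    return {**message, "audited": True}

msg = {"step": "pricing", "status": "ok", "payload": 101.25}
print(route(msg), enrich(msg)["audited"])  # pricing-service True
```

A real intelligent cloud would of course add security, transformation, and message-to-message interaction on top of this, but the principle is the same: the endpoints are no longer consulted for every decision.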
Cloud of Opportunity
For venture investors, the emergence of this intelligent cloud and the migration towards messaging and away from compiled binaries offer a multitude of interesting investment opportunities. Clearly there will be increasing demand for intelligent message processing software and equipment. To that end, a nucleus of XML-aware networking equipment companies, such as Datapower and Reactivity, have already emerged, as have some standards-based message brokers (such as Collaxa, which was recently purchased by Oracle). New companies are likely to emerge focused on brokering messages associated with emerging “stateful” standards, and still others may find ways to acceptably secure and control “transgenic” messaging.
As these new companies emerge they will help cement the transition away from binary-centric software towards message-centric software and in doing so they will confirm what we can already see today: that the message is the software.
DIM: Hijacking IM for Data Transport
Move over, teenagers: the heaviest users of instant messaging are about to become computers themselves. In the beginning, IM communication was strictly a human-to-human affair. A few years ago companies started sending alerts (and increasingly spam) via IM, making it a computer-to-human affair. Now, with the advent of Data over Instant Messaging (DIM) technology, IM is rapidly becoming a computer-to-computer affair.
Why send data over IM? One reason is that IM infrastructures have solved a lot of tough technical problems such as firewall traversal, multi-protocol transformation, and real-time presence management. Sending messages over these networks allows applications to leverage the investments made to solve these tough problems. Another reason is that many companies already have IM “friendly” infrastructures which means that all the necessary firewall ports are open, the clients are already certified and installed, and operations infrastructure like logging, back-up, and even high-availability are already in place. Thus by using IM for computer-to-computer communication, developers are able to “hijack” all the valuable investment made in IM and use it for a purpose that its creators likely never intended.
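The basic DIM pattern is simple enough to sketch: serialize application data into the text body of an ordinary IM and let the IM network carry it. The Python below uses an in-memory stand-in for a real IM service (XMPP or a proprietary network); the class and its behavior are illustrative, not any vendor's API.

```python
import json

# A toy illustration of Data over IM (DIM): application data rides inside
# the text body of an ordinary IM message. The in-memory "network" below
# stands in for a real IM infrastructure.
class ToyIMNetwork:
    def __init__(self):
        self.online = set()
        self.inbox = {}

    def sign_on(self, user):
        self.online.add(user)
        self.inbox.setdefault(user, [])

    def send(self, to, text):
        if to not in self.online:   # presence check, a core IM capability
            raise RuntimeError(f"{to} is offline")
        self.inbox[to].append(text)

net = ToyIMNetwork()
net.sign_on("forecast-app")
# Send a spreadsheet cell update as an IM: the basic DIM pattern.
net.send("forecast-app", json.dumps({"cell": "B7", "value": 125000}))
update = json.loads(net.inbox["forecast-app"][0])
print(update["cell"], update["value"])  # B7 125000
```

Note that the data payload is just text to the IM network, which is exactly why DIM can "hijack" infrastructure built for humans: the network neither knows nor cares that the buddy on the other end is a spreadsheet.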
Of course, DIM-based communications have many of the same drawbacks that human-to-human IM has. Because IM is a real-time “fire and forget” system, DIM lacks many of the hard-core transaction capabilities that most Enterprise Application Integration (EAI) solutions incorporate. Thus you wouldn’t want to rely on DIM for mission-critical transaction management. In fact, a full-blown EAI system with rich workflow capabilities, rules-based message management, and semantic mapping capabilities is more capable and reliable than DIM for just about everything.
However, a full blown EAI system will also cost you millions and take at least 6 months to get up and running. With DIM, the infrastructure is already in place, so not only is the time to deploy radically accelerated, but the overall cost of the installation is also dramatically lower. In addition, because DIM is a relatively simple, lightweight technology it is comparatively easy to integrate into applications, especially desktop applications. DIM is just one of the low-end EAI technologies I have written about in the past that threaten to give the traditional “high-end” EAI vendors a run for their money.
To see a good example of DIM in action you need look no further than Castbridge’s Data Messenger product. Castbridge just released the 2.0 Beta of their product and it is chock-full of DIM goodies. The Castbridge product essentially allows other applications to instant message each other both inside and outside the firewall. Most customers use the technology to link desktop applications together (such as linking two Excel spreadsheets over the Internet), but the platform itself can be integrated into just about any application or database out there.
Castbridge’s customers are putting the technology to use in some very innovative ways. For example, the Singapore Police Department is using Castbridge’s DIM technology as a way to quickly and easily share security information during major events (trade shows, parades, etc.). In the past, each agency had its own systems for collecting and reporting information on any activity (e.g. “Man arrested for chewing gum at entrance”) during a major event. While each agency had a representative in the overall command center, the only way to share information was by yelling across the room to a colleague. With Castbridge, each agency simply enters their data into a standard Excel spreadsheet. The Castbridge technology sends instant messages to all the other spreadsheets as soon as new data is entered, effectively keeping everyone instantly up-to-date on the current security status and dramatically reducing the possibility for miscommunication. This problem is not unique: some F-16s recently almost shot down the Governor of Kentucky’s plane over Washington DC because the FAA controllers had no easy way of notifying the Homeland Security Department and NORAD about the plane, so it sounds like the US government could use Castbridge’s solution as well.
There are a myriad of other uses for DIM-like technology for everything from keeping sales forecasts up-to-date, to keeping inventory and financial information current. On Wall Street, where spreadsheets abound and real-time communication is paramount, use cases for this technology are rampant. Syndicate desks could create real-time distributed order books, while fixed income desks could give clients “live” lists of inventory and derivative traders could ensure that their pricing models instantaneously incorporate the latest data.
The strong potential for DIM on Wall Street is probably why one of the biggest vendors of traditional IM technology to Wall Street firms, IM Logic, recently announced its own DIM product, called IM Linkage, which is designed explicitly to help Wall Street firms leverage DIM.
As DIM starts to see wider adoption it will be interesting to see how the major IM networks respond. On the one hand, they probably won’t take kindly to the idea of computers “hijacking” their networks to send data around the world (hard to monetize that kind of traffic), but on the other hand they may see DIM as a new revenue source where they can possibly take a cut of license sales in return for certifying DIM apps on their networks.
However things evolve, you can be sure of one thing: DIM-based applications are here to stay and their impact will be felt by everyone from traditional EAI vendors to application owners, to IM networks. Let the data messaging games begin!
The Data Abstraction Layer: Software Architecture’s Great Frontier
Abstraction has meaningfully permeated almost every layer of modern software architecture except for the data layer. This lack of abstraction has led to a myriad of data-related problems, perhaps the most important of which is significant data duplication and inconsistency throughout most large enterprises. Companies have generally responded to these problems by building elaborate and expensive enterprise application integration (EAI) infrastructures to try to synchronize data throughout an enterprise and/or cleanse it of inconsistencies. But these infrastructures simply perpetuate the status quo and do nothing to address the root cause of all this confusion: the lack of true abstraction at the data layer. Fortunately, the status quo may soon be changing thanks to a new generation of technologies designed to create a persistent “data abstraction layer” that sits between databases and applications. This Data Abstraction Layer could greatly reduce the need for costly EAI infrastructures while significantly increasing the productivity and flexibility of application development.
Too Many Damn Databases
In an ideal world, companies would have just one master database. However, if you take a look inside any large company’s data center, you will quickly realize one thing: they have way too many damn databases. Why would companies have hundreds of databases when they know that having multiple databases is causing huge integration and consistency problems? Simply put, because they have hundreds of applications, and these applications have been programmed in a way that pre-ordains that each one must have its own separate database.
Why would application programmers pre-ordain that their applications must have dedicated databases? Because of the three S’s: speed, security, and schemas. Each of these factors drives the need for dedicated databases in its own way:
1. Speed: Performance is a critically important facet of almost every application. Programmers often spend countless hours optimizing their code to ensure proper performance. However, one of the biggest potential performance bottlenecks for many applications is the database. Given this, programmers often insist on their own dedicated database (and often their own dedicated hardware) to ensure that the database can be optimized, in terms of caching, connections, etc., for their particular application.
2. Security: Keeping data secure, even inside the firewall, has always been of paramount importance to data owners. In addition, new privacy regulations, such as HIPAA, have made it critically important for companies to protect data from being used in ways that violate customer privacy. When forced to choose between creating a new database and risking a potential security or privacy issue, most architects will simply take the safe path and create their own database. Such access control measures have the additional benefit of enhancing performance as they generally limit database load.
3. Schemas: The database schema is essentially the embodiment of an application’s data model. Poorly designed schemas can create major performance problems and can greatly limit the flexibility of an application to add features. As a result, most application architects spend a significant amount of time optimizing schemas for each particular application. With each schema heavily optimized for a particular application, it is often impossible for applications to share schemas, which in turn makes it logical to give each application its own database.
Taken together, the three S’s effectively guarantee that the utopian vision of a single master database for all applications will remain a fantasy for some time. The reality is that the three S’s (not to mention pragmatic realities such as mergers & acquisitions and internal politics) virtually guarantee that large companies will continue to have hundreds if not thousands of separate databases.
This situation appears to leave most companies in a terrible quandary: while they’d like to reduce the number of databases they have in order to reduce their problems with inconsistent and duplicative data, the three S’s basically dictate that this is next to impossible.
Master Database = Major Headache
Unwilling to accept such a fate, in the 1990s companies began to come up with “work arounds” to this problem. One of the most popular involved the establishment of “master databases” or databases “of record”. These uber databases typically contained some of the most commonly duplicated data, such as customer contact information. The idea was that these master databases would contain the sole “live” copy of this data. Every other database that had this information would simply subscribe to the master database. That way, if a record was updated in the master database, the updates would cascade down to all the subordinate databases. While not eliminating the duplication of data, master databases at least kept important data consistent.
The major drawback with this approach is that in order to ensure proper propagation of the updates it is usually necessary to install a complex EAI infrastructure as this infrastructure provides the publish & subscribe “bus” that links all of the master/servant databases together. However, in addition to being expensive and time consuming to install, EAI infrastructures must be constantly maintained because slight changes to schemas or access controls can often disrupt them.
Thus, many companies that turned to EAI to solve their data problems have unwittingly created an additional expensive albatross that they must spend significant amounts of time and money on just to maintain. The combination of these complex EAI infrastructures with the already fragmented database infrastructure has created what amounts to a Rube Goldberg-like IT architecture within many companies, one which is incredibly expensive to maintain, troubleshoot, and expand. With so many interconnections and inter-dependencies, companies often find themselves reluctant to innovate as new technologies or applications might threaten the very delicate balance they have established in their existing infrastructure.
So the good news is that by using EAI it is possible to eliminate some data consistency problems, but the bad news is that the use of EAI often results in a complex and expensive infrastructure that can even reduce overall IT innovation. EAI’s fundamental failing is that rather than offering a truly innovative solution to the data problem, it simply “paves the cow path” by trying to incrementally enhance the existing flawed infrastructure.
The Way Out: Abstraction
In recognition of this fundamental failure, a large number of start-ups have been working on new technologies that might better solve these problems. While these start-ups are pursuing a variety of different technologies, a common theme that binds them is their embracement of “abstraction” as the key to solving data consistency and duplication problems.
Abstraction is one of the most basic principles of information technology and it underpins many of the advances in programming languages and technical architectures that have occurred in the past 20 years. One particular area in which abstraction has been applied with great success is in the definition of interfaces between the “layers” of an architecture. For example, by defining a standardized protocol (HTTP) and a standardized language (HTML), it has been possible to abstract much of the presentation layer from the application layer. This abstraction allows programmers working on the presentation layer to be blissfully unaware of and uncoordinated with the programmers working on the application layer. Even within the network layer, technologies such as DNS or NAT rely on simple but highly effective implementations of the principle of abstraction to drive dramatic improvements in the network infrastructure.
Despite all of its benefits, abstraction has not yet seen wide use in the data layer. In what looks like the dark ages compared to the presentation layer, programmers must often “hard code” to specific database schemas, data stores, and even network locations. They must also often use database-specific access control mechanisms and tokens.
This medieval behavior is primarily due to one of the three S’s: speed. Generally speaking, the more abstract an architecture, the more processing cycles it requires. Given the premium that many architects place on database performance, they have been highly reluctant to employ any technologies which might compromise performance.
However, as Moore’s Law continues its steady advance, performance concerns are becoming less pronounced, and as a result architects are increasingly willing to consider “expensive” technologies such as abstraction, especially if they can help address data consistency and duplication problems.
The Many Faces of Abstraction
How exactly can abstraction solve these problems? It solves them by applying the principles of abstraction in several key areas including:
1. Security Abstraction: To preserve security and speed, database access has traditionally been carefully regulated. Database administrators typically “hard code” access control provisions by tying them to specific applications and/or users. Using abstraction, access control can be centralized and managed in-between the data layer and the application layer. This mediated access control frees programmers and database administrators from having to worry about coordinating with each other. It also provides for centralized management of data privacy and security issues.
2. Schema Abstraction: Rather than having programmers hard code to the schema of a specific database, abstraction technologies enable them to code to virtual schemas that sit between the application and database layers. These virtual schemas may map to multiple tables in multiple different databases, but the application programmer remains blissfully unaware of the details. Some virtual schemas also theoretically have the advantage of being infinitely extensible, thereby allowing programmers to easily modify their data model without having to redo their database schemas.
3. Query/Update Abstraction: Once security and schemas have been abstracted it is possible to bring abstraction down to the level of individual queries and updates. Today queries and updates must be directed at specific databases and they must often have knowledge of how that data is stored and indexed within each database. Using abstraction to pre-process queries as they pass from the application layer to the data layer, it is possible for applications to generate federated or composite queries/updates. While applications view these composite queries/updates as a single request, they may in fact require multiple operations in multiple databases. For example, a single query to retrieve a list of a customer’s last 10 purchases may be broken down into 3 separate queries: one to a customer database, one to an orders database and one to a shipping database.
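The federation pattern in the third item can be sketched with two in-memory SQLite databases standing in for separate physical databases. The schemas and the splitting logic here are invented for illustration; a real federation engine would do this with query planning and optimization.

```python
import sqlite3

# Two separate "physical" databases, each with its own schema.
cust_db = sqlite3.connect(":memory:")
cust_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
cust_db.execute("INSERT INTO customers VALUES (1, 'Acme Corp')")

order_db = sqlite3.connect(":memory:")
order_db.execute("CREATE TABLE orders (cust_id INTEGER, item TEXT)")
order_db.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "widget"), (1, "gadget")])

def purchases_for(name: str):
    """One composite query, federated across two databases: the caller
    never learns that a customer lookup and an order lookup happened."""
    row = cust_db.execute(
        "SELECT id FROM customers WHERE name = ?", (name,)).fetchone()
    if row is None:
        return []
    return [item for (item,) in order_db.execute(
        "SELECT item FROM orders WHERE cust_id = ?", (row[0],))]

print(purchases_for("Acme Corp"))
```

The application sees a single logical request; the abstraction layer decides which databases to touch and in what order, which is exactly the decoupling the article describes.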
The Data Abstraction Layer
With security, schemas and queries abstracted, what starts to develop is a true data abstraction layer. This layer sits between the data layer and the application layer and decouples them once and for all, freeing programmers from having to worry about the intimate details of databases and freeing database administrators from maintaining hundreds of bilateral relationships with individual applications.
With this layer fully in place, the need for complicated EAI infrastructures starts to decline dramatically. Rather than replicating master databases through elaborate “pumps” and “buses”, these master databases are simply allowed to stand on their own. Programmers creating new data models and schemas select existing data from a library of abstracted elements. Queries/updates are pre-processed at the data abstraction layer, which determines access privileges and then federates the request across the appropriate databases.
Data Servers: Infrastructure’s Next Big Thing?
With so much work to be done at the data abstraction layer, the potential for a whole new class of infrastructure software, called Data Servers, seems distinctly possible. Similar to the role application servers play in the application layer, Data Servers manage all of the abstractions and interfaces between the actual resources in the data layer and a set of generic APIs/standards for accessing them. In this way, the data servers virtually create the ever-elusive “single master database”. From the programmer’s perspective this database appears to have a unified access control and schema design, but it still allows the actual data layer to be highly optimized in terms of resource allocation, physical partitioning and maintenance.
The promise is that with data servers in place, there will be little if any rationale for replicating data across an organization as all databases can be accessed from all applications. By reducing the need for replication, data servers will not only reduce the need for expensive EAI infrastructures but they will reduce the actual duplication of data. Reducing the duplication of data will naturally lead to reduced problems with data consistency.
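A Data Server facade might look something like the following Python sketch, which centralizes access control and maps “virtual” tables to their backing stores. All names and the API shape here are speculative, since no such product exists in standard form.

```python
# A speculative sketch of a Data Server facade: applications see one
# "virtual database" with uniform access control, while the server maps
# each virtual table to a real backing store.
class DataServer:
    def __init__(self):
        self.catalog = {}   # virtual table -> (backend, real table)
        self.grants = set() # (app, virtual table) pairs allowed to read

    def register(self, vtable, backend, real_table):
        self.catalog[vtable] = (backend, real_table)

    def grant(self, app, vtable):
        self.grants.add((app, vtable))

    def resolve(self, app, vtable):
        """Check access centrally, then reveal where the data lives."""
        if (app, vtable) not in self.grants:
            raise PermissionError(f"{app} may not read {vtable}")
        return self.catalog[vtable]

ds = DataServer()
ds.register("customer", "crm-db", "CUST_MASTER")
ds.grant("billing-app", "customer")
print(ds.resolve("billing-app", "customer"))  # ('crm-db', 'CUST_MASTER')
```

The point of the sketch is the separation of concerns: the application asks for “customer”, and the decision of which physical database answers, and whether the application is even allowed to ask, lives entirely in the Data Server.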
Today however, the promise of data servers remains just that, a promise. There remain a number of very tough challenges to overcome before Data Servers truly can be the “one database to rule them all”. Just a couple of these challenges include:
1. Distributed Two-Phase Commits: Complex transactions are typically consummated via a “two-phase commit” process that ensures the ACID properties of a transaction are not compromised. While simply querying or reading a database does not typically require a two-phase commit, writing to one typically does. Data servers theoretically need to be able to break up a write into several smaller writes; in essence, they need to be able to distribute a transaction across multiple databases while still being able to ensure a two-phase commit. There is general agreement in the computer science world that, right now at least, it is almost impossible to consummate a distributed two-phase commit with absolute certainty. Some start-ups are developing “work arounds” that cheat by only guaranteeing a two-phase commit with one database and letting the others fend for themselves, but this kind of compromise will not be acceptable over the long term.
2. Semantic Schema Mapping: Having truly abstracted schemas that enable programmers to reuse existing data elements and map them into their new schemas sounds great in theory, but it is very difficult to pull off in the real world where two programmers can look at the same data and easily come up with totally different definitions for it. Past attempts at similar programming reuse and standardization, such as object libraries, have had very poor results. To ensure that data is not needlessly replicated, technology that incorporates semantic analysis as well as intelligent pattern recognition will be needed to ensure that programmers do not unwittingly create a second customer database simply because they were unaware that one already existed.
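The two-phase commit protocol referenced in the first challenge above can be sketched as follows. The participants here are simulated objects rather than real databases, which would expose prepare/commit through interfaces such as XA; the sketch only shows the voting logic.

```python
# A bare-bones sketch of two-phase commit across multiple participants.
class Participant:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit = name, can_commit
        self.state = "idle"

    def prepare(self):            # phase 1: each participant votes
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit):     # phase 2: commit or roll back everyone
        self.state = "committed" if commit else "rolled-back"

def two_phase_commit(participants):
    all_yes = all(p.prepare() for p in participants)
    for p in participants:
        p.finish(all_yes)
    return all_yes

dbs = [Participant("customers"), Participant("orders", can_commit=False)]
print(two_phase_commit(dbs), [p.state for p in dbs])
# False ['rolled-back', 'rolled-back']
```

Even this toy version hints at why the distributed case is so hard: a coordinator crash between the two phases, or a participant that votes yes and then disappears, leaves the system in limbo, which is the uncertainty the article refers to.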
Despite these potential problems and many others, the race to build the data abstraction layer is definitely “on”. Led by a fleet of nimble start-ups, companies are moving quickly to develop different pieces of the data abstraction layer. For example, a whole class of companies such as Composite Software, Metamatrix, and Pantero are trying to build query/update federation engines that enable federated reads and writes to databases. On the schema abstraction front, not many companies have made dramatic progress, but companies such as Contivo are trying to create meta-data management systems which ultimately seek to enable the semantic integration of data schemas, while XML database companies such as Ipedo and Mark Logic continue to push forward the concept of infinitely extensible schemas.
Ultimately, the creation of a true Data Server will require a mix of technologies from a variety of companies. The market opportunity for whichever company successfully assembles all of the different pieces is likely to be enormous though, perhaps equal to or larger than the application server market. This large market opportunity, combined with the continued data management pains of companies around the world, suggests that the vision of the universal Data Server may become a reality sooner than many people think, which will teach us once again never to underestimate the power of abstraction.
"Low-end" EAI Is Where The Action's At
While most of the attention in the Enterprise Application Integration (EAI) space has been focused on the development of high-end features such as business activity monitoring and business process management, some of the most interesting innovations are actually occurring at the low-end of the EAI market.
High-End For a Reason
Historically there has been no such thing as “low-end” EAI. EAI, by its very nature, is a complex, costly and technically demanding space that generally involves integrating high-value, high volume transactions systems. In such a demanding environment, failure is simply not an option. Thus, EAI software has typically been engineered, sold, and installed as a high-end product.
“High-end” is of course just another way of saying “very expensive” and EAI surely is that. The average EAI project supposedly costs $500,000 and that’s just to integrate two systems. Try to integrate multiple systems and you are soon easily talking about budgets in the millions of dollars.
Ideally, there would be a way to offer high-end EAI software at low prices, but unfortunately the economics simply don’t work. First off, the software engineering effort required to ensure a failsafe environment for high-volume transaction systems is non-trivial and therefore quite costly. Second, the infrastructure that vendors must build to sell, install, and service such high-end software is inherently expensive.
Thus, the very idea of low-end/low-priced EAI software was thought to be a pipedream and any vendor that was crazy enough to sell their software for $10,000 instead of $500,000 was thought to be on a fast path to going out of business.
A Volkswagen vs. A BMW
Despite the conventional wisdom that “low-end EAI software” is an uneconomic oxymoron, there are in fact an increasing number of start-ups quietly pursuing this space. These start-ups believe they will be successful not because they are trying to replicate high-end EAI at a lower price, but because they are creating a new “low-end” market by offering a different product to an entirely different, and potentially much larger, market.
To be specific, these low-end EAI vendors differ from their high-end compatriots in several important aspects:
1. Focused on Data Sharing vs. Transactions: High-end EAI vendors have traditionally been focused on building failsafe, ACID-compliant, transaction systems that can handle a corporation’s most important and sensitive data. Low-end vendors do not even attempt to manage transactions, they simply enable basic data sharing between applications without guarantees, roll-backs or any other fancy features. Such software is much less robust than high-end offerings, but it’s also much less complicated and therefore easier to build and support.
2. User vs. Developer Centric: High-end EAI products are generally designed to be manipulated and administered by developers. They have extensive APIs, scripting languages and even visual development environments. Low-end EAI vendors are designing their products to be used by end-users or, at worst, business analysts. By eliminating the need for skilled developers, the low-end software significantly reduces set-up and maintenance costs.
3. Hijacking vs. building: Most high-end EAI products come with their own extensive messaging infrastructures that have been painstakingly built by their developers. In contrast, low-end EAI vendors try to “hijack” or leverage existing infrastructures, such as the web or instant messaging, to support their products.
4. Indirect vs. Direct: Selling big expensive software is a difficult and complex task. That’s why high-end EAI firms have expensive direct sales forces that can spend 6-9 months closing the average deal. In contrast, the low-end firms are trying to build indirect sales models that can leverage other companies’ sales channels. They can use these channels because their products are less complex to sell and install and their prices are low enough to make their product an attractive “add-on” sale to other products.
EAI For The Rest of Us
At this point you might be saying to yourself “no customer is going to be crazy enough to trust its mission critical systems to a non-transactional EAI platform that is sold by a distributor and uses third-party infrastructure for key components”. You’re right. Using low-end software for traditional EAI tasks, such as linking payment systems together, would be extremely foolish.
However, these low-end systems aren’t designed to go after the traditional EAI market. They are designed to go after a much different market: the market for ad-hoc intra- and inter-enterprise data sharing.
Today, only a fraction of intra- and inter-enterprise data sharing takes place via EAI systems. Instead, most data sharing takes place via e-mail or fax machines, and the data involved is often stored in Microsoft Office documents or simple text files. A fairly typical example might be a sales forecasting exercise in which a Vice President of Sales e-mails out a spreadsheet to a group of Regional Directors and asks them to fill in their forecasts for the coming quarter. Each Director fills in a different spreadsheet and then e-mails it back to the VP, who has a business analyst open each spreadsheet and combine all of the results into one master spreadsheet.
The Vice President could spend $500,000 on high-end EAI software to build a system for the real time collection and updating of sales forecasts, but spending $500K to automate this task just isn’t worth it. However, it would be worth it to spend $10K or $20K as that would free up the business analyst’s time to actually do analysis and dramatically improve the speed and accuracy of the data collection effort. This is precisely the market that the low-end EAI vendors are targeting.
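The forecasting chore described above is easy to automate with even a trivial script, which is why a $10K-$20K product can credibly attack it. This Python sketch (region names and figures are made up) consolidates per-region CSV exports the way the business analyst would by hand:

```python
import csv, io

# Each regional sheet, exported as CSV. In practice these would arrive
# as files; inline strings keep the sketch self-contained.
REGIONAL_SHEETS = {
    "East": "product,forecast\nwidgets,120\ngadgets,80\n",
    "West": "product,forecast\nwidgets,95\ngadgets,110\n",
}

def consolidate(sheets: dict) -> dict:
    """Sum each product's forecast across all regional sheets."""
    totals = {}
    for text in sheets.values():
        for row in csv.DictReader(io.StringIO(text)):
            totals[row["product"]] = (
                totals.get(row["product"], 0) + int(row["forecast"]))
    return totals

print(consolidate(REGIONAL_SHEETS))  # {'widgets': 215, 'gadgets': 190}
```

The low-end EAI pitch is essentially this script wrapped in a product: live collection instead of e-mailed attachments, no rekeying, and no IT project required.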
Just how large is the market for this kind of low-end EAI software? It’s hard to tell exactly, but I challenge you to spend more than 5 minutes with a business executive talking about this software and not find at least a couple projects in their area that could make immediate use of it.
The beauty of these low-end EAI systems is that they make very basic EAI capabilities available for the most mundane applications and allow end-users to set up and tear down these ad-hoc integrations without IT’s involvement. It truly is EAI for the masses.
Despite the promise of low-end EAI, most of its vendors remain largely anonymous. One such vendor is an Australian firm called Webxcentric. They have built a low-end EAI system that allows end users to turn any Excel spreadsheet into a sophisticated data collection system. Using Webxcentric’s system, users can define data collection templates from within Excel using a simple wizard interface and then automatically e-mail those templates to end users who in turn fill in the templates via a web form. One of their customers is a large convenience store operator. The customer was having each of their store managers fax in a sales report at the end of each day. These faxes were then manually rekeyed into a SAP system (a surprisingly common practice). Using Webxcentric, the store managers simply updated a spreadsheet template and the results were then automatically fed into SAP.
Another low-end EAI vendor is CastBridge. CastBridge allows end users to publish and subscribe to data both inside and outside of their enterprise from within packaged applications, such as Microsoft Excel. I like CastBridge’s architecture so much that I made an investment in them last year. One of their early customers, a government in Asia, is using their software to link police stations and hospitals together to enable real time tracking of health and crime statistics. This is a project that they could have used high-end EAI software for, but they preferred the user-friendly, flexible, cost effective approach offered by low-end EAI.
In both cases, these low-end EAI vendors are not trying to displace existing high-end EAI installations, but to expand the overall EAI market by bringing automated data sharing to previously manual processes.
Low-end = Big Market
While in many ways these systems are highly inferior to high-end EAI software, they still get the job done to the customer’s satisfaction and they do so at a price point that is accessible to far more potential buyers.
By making basic EAI capabilities more accessible, the low-end vendors are dramatically expanding the overall EAI market size to encompass a wide range of manual data collection and dissemination processes that up until now were not cost effective to automate. This new market should provide both start-ups and incumbents with far more opportunities for growth than simply adding additional features on top of the high-end systems. Whoever thought the “low-end” could be such an interesting place to be?