Wednesday, May 19, 2010

Why is an in-memory database important to #SAP?

Very busy these days, but wanted to share some thoughts about in-memory databases and why they are so important to SAP - and the industry. All of this is just my opinion, and based only on my experience.

SAP is really a collection of business processes implemented in software. No shock there. In some cases, these processes are implemented in code. Yech. In other cases, these processes are implemented as simple or complex finite state machines dynamically driven by data or metadata; unfortunately, these state machines are written on different programming models, using sometimes very different (domain-specific) languages and technologies. Generally speaking, these state machines are integrated (or can be integrated through customization) in various ways that can be very interesting to enterprises, allowing for processes like "build to order" to function properly (which integrates "order" with "collection" with "dunning" with "schedule" with "manufacture" with "ship" with ...).

Because these processes are so diverse, and have been implemented over a thirty-year stretch, there is not a lot of consistency in their underlying technologies and programming models. With all this inconsistency, it is not straightforward to produce and consume events, which makes these processes hard to configure, integrate, and extend.

Event-based systems are not foreign to SAP, but (modern) event-based technology (at least within the applications) is. Anyone who has used ARIS to do the design of business processes for SAP implementations has probably used "event-driven process chains" (EPCs) to describe processes; implementing the event connections, however, has often/mostly been a process of writing custom code. There is no publish/subscribe bus for (most important) SAP events, thus there are no (substantial) event publishers or subscribers in the SAP application suite.

If SAP were to undertake a rewrite of its applications, it would have the opportunity to implement events in a consistent way in the applications. I am not certain whether this has been done with Business ByDesign ("ByD"), a project that rewrote some of SAP's applications, but very few mySAP Business Suite customers will be replacing their mySAP suites with ByD any time soon.

If, on the other hand, SAP were to undertake a rewrite of its application server tier to replace all calls to a database with calls to "HassoDB" (the in-memory database that got so much attention at SAPPHIRE this week), SAP would have the ability to simultaneously event-enable its product essentially with little to no additional effort. If HassoDB understands that an object is being stored, updated, or accessed, HassoDB could publish an event - and that event could be consumed by new applications that speed up integration between business processes, allow the insertion of new business processes, or that simply generate alerts for users.
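
A minimal sketch of that idea in Python, assuming a hypothetical in-process object store. Nothing here reflects SAP's actual design or any real "HassoDB" API; `EventingStore`, `subscribe`, and the event names "stored", "updated", and "accessed" are invented for illustration. Because every store, update, and access passes through one layer, that layer can publish an event as a side effect:

```python
from collections import defaultdict
from typing import Any, Callable

# Hypothetical handler signature: (event name, object key, object value).
Handler = Callable[[str, str, Any], None]


class EventingStore:
    """Toy in-memory object store that publishes an event on every operation."""

    def __init__(self) -> None:
        self._objects: dict[str, Any] = {}
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def _publish(self, event: str, key: str, value: Any) -> None:
        # Copy the list so a handler may subscribe others while we iterate.
        for handler in list(self._handlers[event]):
            handler(event, key, value)

    def put(self, key: str, value: Any) -> None:
        # The store knows whether this is a first write or an update.
        event = "updated" if key in self._objects else "stored"
        self._objects[key] = value
        self._publish(event, key, value)

    def get(self, key: str) -> Any:
        value = self._objects[key]
        self._publish("accessed", key, value)
        return value


store = EventingStore()
log: list[tuple[str, str]] = []
store.subscribe("stored", lambda event, key, value: log.append((event, key)))
store.subscribe("updated", lambda event, key, value: log.append((event, key)))

store.put("payment/4711", {"customer": "ACME", "amount": 100})
store.put("payment/4711", {"customer": "ACME", "amount": 120})
print(log)
```

The calling code only stores and reads objects; the subscribers are attached separately, which is what would let new consumers be added without changing the applications themselves.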



SAP even has a design for such a capability: SAP Live Enterprise from SAP's Imagineering team.

How could this capability be deployed?
Well, imagine that a salesperson gets an alert every time their customer makes a payment, is late with a payment, submits a complaint or service request, or places an order online. Or that a salesperson sets up an "auto-responder" for those events, thanking the customer or asking her for feedback as appropriate. Event-based capabilities would greatly speed up and improve service.

Another example could be in integrating business processes. Rather than hard-coding the "on-boarding" process for a new employee, there could be an event-driven integration. The hiring process could generate an event when an employee's starting date is set; other processes could subscribe to that event, and do the appropriate processing, including reserving an office, preparing the HR orientation, ordering a company credit card, requesting an entry badge, or assigning and configuring a computer. Whenever the on-boarding process changes, an administrator would not have to edit the process definition, take the application down, and restart it; instead, the administrator would just load a new action and subscribe it to the appropriate event.
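
The on-boarding example can be sketched as a small publish/subscribe bus in Python. This is only an illustration under stated assumptions: `EventBus`, the `employee.start_date_set` event name, and the action functions are hypothetical and not part of any SAP product. The hiring process publishes one event and never needs to know who consumes it.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical action signature: takes the event payload, returns a status line.
Action = Callable[[dict], str]


class EventBus:
    """Hypothetical publish/subscribe bus; actions can be added while running."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Action]] = defaultdict(list)

    def subscribe(self, event: str, action: Action) -> None:
        self._subscribers[event].append(action)

    def publish(self, event: str, payload: dict) -> list[str]:
        # Run every subscribed action and collect what each one reports.
        return [action(payload) for action in list(self._subscribers[event])]


def reserve_office(employee: dict) -> str:
    return f"office reserved for {employee['name']}"


def request_badge(employee: dict) -> str:
    return f"badge requested for {employee['name']}"


def order_credit_card(employee: dict) -> str:
    return f"credit card ordered for {employee['name']}"


bus = EventBus()
bus.subscribe("employee.start_date_set", reserve_office)
bus.subscribe("employee.start_date_set", request_badge)

new_hire = {"name": "Pat", "start_date": "2010-06-01"}
first_run = bus.publish("employee.start_date_set", new_hire)

# The process changes: a new action is subscribed without touching the publisher.
bus.subscribe("employee.start_date_set", order_credit_card)
second_run = bus.publish("employee.start_date_set", new_hire)
```

In this sketch, changing the process is a matter of calling `subscribe` with a new action; neither the publisher nor the existing actions are edited or restarted.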

Finally, how will customers license the in-memory database? Will they have to buy it or will it be included in maintenance? If it is the latter, and if the in-memory database can be made compatible with older versions of SAP, then this would be substantial justification for the maintenance fees SAP has charged customers over the years. Even if customers have to separately license and pay for this capability, if it is made compatible with all versions of mySAP under maintenance, that would be a huge benefit for customers. I'm looking forward to SAP bringing "HassoDB" to market. With the help of the database gurus at Sybase, I think it is possible that in-memory will finally come to "Big Data" enterprise problems.

6 comments:

  1. Nice post.

    "...a rewrite of its application server tier to replace all calls to a database with calls to "HassoDB" (...), SAP would have the ability to simultaneously event-enable its product essentially with little to no additional effort."

    Event firing in R/3 (or whatever it's called these days) probably won't be impacted by a rewrite to support another db. If memory serves me (and it often doesn't), the db-specific 'driver' (my word) sits between the db and the app layer.

  2. Kevin --

    What you describe is correct, if SAP implements the in-memory database as just another database through their normal database driver mechanism (like they will with Sybase, I'm sure).

    However, SAP isn't talking about just writing a database driver for the in-memory "HassoDB" database - they could do that any time, but it would offer few benefits. Why? Because you could get pretty much all the same benefits by just using any currently-supported relational database, and turning up the cache really high.

    SAP is talking about rearchitecting the applications to take away the database entirely. This would mean rewriting the applications to directly manage their own objects in memory. This rewrite would be a HUGE amount of work, and it isn't clear whether those skills still exist within SAP or would have to be recreated.

    In this new architecture, SAP's old data management layer would be eliminated. Presumably, SAP would use some common technology across applications, would be aware of the object being managed, and could be made event-aware. This would allow for pervasive eventing, consistently implemented across the applications. Just think what that would enable ...

    -- Dennis

  3. Also, the in-memory concept is based on (or at least was, in Hasso's published paper on the topic) a very different structural model than a typical RDBMS, using columnar/tuple storage vs traditional row-oriented table storage.

    I think that the ultimate database model will be some type of hybrid including Hasso's approach + graph database capabilities + scalable blob storage (since different kinds of media are now becoming part of the "permanent record" of many business transactions).

    It should also be noted that most existing RDBMSs are more than capable of firing mutation events, but the real value comes from meaningful semantic events (order status changed, new customer added), not database-level events. One challenge with SAP R/3 has always been that event-driven applications often required custom hooks/handlers. I hope that ByD and any future products recognize the value of exposing meaningful "events" in addition to "data" and "services". That's the approach we're taking in our product, but we had the luxury of a clean slate.

    The HassoDB concepts are cool and useful stuff in any case.

  4. "However, SAP isn't talking about just writing a database driver for the in-memory "HassoDB" database - they could do that any time, but it would offer few benefits. Why? Because you could get pretty much all the same benefits by just using any currently-supported relational database, and turning up the cache really high."

    No, you can't get the same results via caching on a relational database. If you could, Oracle and IBM would not have bought in-memory database providers. The benefits of the SAP in-memory database are more efficient storage via columnar structure and compression, but it's still a storage mechanism, a database of sorts (just not relational in structure). This is what they do today with the BW Accelerator. It is a storage mechanism used with SAP's data warehouse that has required near-zero changes to the application to implement.

    The difference in this new approach would be that the current BWA product is for read-only use, and would not adapt itself well to a transactional approach. However, what SAP describes is built on the same technology, and, from all indications, utilized the same way - as an add-on caching engine for the database.

    If what they were talking about required a complete rearchitecting of the application layer, as you suggest, then there is no way they could offer this add-on to ERP applications all the way back to 4.6C.

  5. Rick --

    Very perceptive. As I added in my comment above, the benefits don't come from plugging the in-memory database into the same system used by other databases. If SAP writes the in-memory database into its applications directly, the application objects will be passed to the database for CRUD operations. The context in which the CRUD operation is performed will be known (e.g., not only which object, but which process, process step, user, etc.). This context and content can be used to fire off very meaningful events, which would then be available for consumption as described above.

    While SAP does not have the luxury of a clean slate, it could manufacture a "clean slate" for the purposes of rewriting the data management layer of SAP to move to in-memory. And they do have the luxury of lots of resources and lots of customers ;-)

    Thanks!

    -- Dennis

  6. SAPDBA --

    Wow, the comments here are getting more and more technically complex. Thanks for a challenging set of topics to respond to!

    First:
    In-memory databases like TimesTen are not like the in-memory database proposed by Hasso. TimesTen is still a relational database, and guarantees transactions are written to disk every once in a while. Oracle bought TimesTen to learn more about in-memory, that is true, but Oracle is not a big player in the in-memory market or discussion, to my knowledge.

    Second:
    Without rewriting the object management layer of the applications, the best you could do with SAP on an in-memory database would be to replicate the current relational queries that access, store, update, and delete database rows. If an in-memory database were used for this purpose, the best it could do would be about what a relational database can do now with a lot of cache. For example, Oracle already supports compression in its relational product, and it is not clear that the types of queries issued today with SAP's data management layer would produce better compression or performance results if simply moved to HassoDB as compared to Oracle. If there is a paper on this topic, I'd like to read it!

    Third:
    There are many limitations to a columnar main-memory database when used in update-intensive applications. Many SAP applications are update-intensive. There are techniques that can be used to make a hybrid database combining a columnar approach for reading with a row-oriented approach for updates, using a synchronization method to move data from row to column, but that introduces latency between writing and reading, plus it requires a lot of CPU and memory to support the hybrid approach and all the processing between the two stores. See Stonebraker et al. for more detail: http://cs-www.cs.yale.edu/homes/dna/vldb.pdf . Another interesting paper to read is http://www.cs.toronto.edu/vldb04/protected/eProceedings/contents/pdf/IND2P2.PDF .

    Finally:
    You make a very good point about backwards compatibility with the approach I mention. I agree that it would be more work to do this back to 4.6C, although I think the additional work would be minor compared to the size of the effort to do this even just for ECC6.0, let alone for CRM, SRM, etc. HCM, FI/CO, etc. have changed very little in their data management layers in a very long time. At this point, however, all of this is speculation; SAP has made no commitment that this project will be done, and has been talking about this for the past 5 years or so with no progress reported. SAP tossing out a line that this will be done for SAP back to 4.6C is not yet a commitment about whether this will really be done, or how. Perhaps they will go the easy route and just build a database interface for HassoDB and not optimize the object management of the applications, but (as I wrote in the blog entry above), I think that would be a disappointment and a lost opportunity.

    Again, SAPDBA, thanks for some very thought-provoking comments - please let me know your thoughts on this response as well.
