I have been working on persistence technologies for 8 years. I entered the domain at my first company, where in 2000 we used WebObjects’ Enterprise Objects Framework (the ancestor of all ORM libraries) for our projects. It was a very good framework and was adequate for many persistence needs. We then began to develop our own web framework, including a persistence layer, to build a software product that would provide “fast application development for web applications” (a motto that has been repeated in many places over the years). After financial problems our project was cancelled, and in 2001 I moved to another company as Project Manager to develop new ERP software. I again took responsibility for developing a web framework for ERP applications, whose most important part is surely the persistence layer. We developed the framework and used it in our ERP applications. As our in-house ERP software was developed and brought into use on many servers, our web framework matured and evolved. This is how I became involved in persistence technologies.
During that time, the EJB era ended and ORM frameworks popped up. Many web applications emerged on the promise of fast or simple web application development. As our own development continued over the years, I watched the emerging frameworks and studied their documentation in books and articles. Let me share my thoughts about persistence solutions in the Java world. Despite all the criticism I state here, I appreciate the work and effort behind all these solutions. I know that it is very hard to bring such solutions into widespread use and to support them.
It is a fact that we need a persistence solution. Most applications are database-driven and need to access tables and modify data. Although there are many questions about persistence frameworks, we need them badly when developing applications. Otherwise SQL code scatters across the application, making it ugly and unmanageable. Persistence frameworks build a database layer and separate concerns. Java persistence solutions are divided into two branches. One is the Java standard solutions: JDBC, EJB, JDO. The other is ORM (Object-Relational Mapping) frameworks: Hibernate, TopLink, Enterprise Objects Framework, iBATIS etc.
Let me ask the basic question that everybody asks at some point: “Isn’t there a simpler and more powerful solution to the persistence problem?” I have read a lot of news about this issue; the latest was Sun’s attempt to simplify EJB. In fact, this is why new frameworks will never stop appearing. The following are just some samples:
Instead of comparing every solution, I am going to describe the most common pitfalls:
1- “Object-Relational Impedance Mismatch” was not solved appropriately:
This is the main problem. Everybody knows it, but it still has not been solved completely. The mismatch persists heavily even in simplified, lightweight ORM frameworks. Is this inevitable? No, we can solve it, but I think our attitude is wrong. We are trying to carry the relational world into the object world, which will never work. Object and database technologies are different and should be left separate. For example, SQL is a very dynamic language that you cannot fully express with objects. What we see instead are QUERY, JOIN and WHERE objects in ORM frameworks, re-implementing SQL in the object world. We made this mistake in our first ORM framework but corrected it in the second one. The key insight is that neither an object-only nor a relational-only solution is right. We need a balanced mixture of both technologies, using objects where objects fit and SQL where SQL fits. Abandoning SQL is the major problem of ORM frameworks. Hiding the SQL operations of persistent objects is problematic. This is why the source code of your favorite framework’s PersistenceManager class is the complexity champion.
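To make the “balanced mixture” idea concrete, here is a minimal sketch of what I mean, not code from any real framework. The class TableQuery and its methods are hypothetical names invented for this illustration: the object knows the table and its columns, while the selection criterion stays plain SQL with '?' placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: the object describes the table, plain SQL describes the selection. */
class TableQuery {
    private final String table;
    private final List<String> columns;
    private String where = "";
    private final List<Object> params = new ArrayList<>();

    TableQuery(String table, String... columns) {
        this.table = table;
        this.columns = Arrays.asList(columns);
    }

    /** The WHERE clause is ordinary SQL; values are kept apart so they can be bound later. */
    TableQuery where(String sqlFragment, Object... values) {
        this.where = sqlFragment;
        this.params.clear();
        this.params.addAll(Arrays.asList(values));
        return this;
    }

    /** Builds the statement text; no JOIN or WHERE objects are needed. */
    String toSql() {
        String sql = "SELECT " + String.join(", ", columns) + " FROM " + table;
        return where.isEmpty() ? sql : sql + " WHERE " + where;
    }

    /** Values to bind to the '?' placeholders, in order. */
    List<Object> parameters() {
        return params;
    }
}
```

In a real persistence layer the text from toSql() would presumably be handed to JDBC’s Connection.prepareStatement and the parameters bound with PreparedStatement.setObject, which also keeps user input out of the SQL string itself.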
2- Frameworks didn’t suit application architecture needs (Web or Desktop):
Your persistent object model should be compatible with your application architecture: the application event system, GUI components, application flow and navigation. I wonder how many rich applications will succeed with current persistence frameworks. I am still skeptical about using those frameworks in complex applications. Maybe it can be done, but can such applications be maintained at minimal cost? Do transaction boundaries, transaction rollbacks, data validations and lazy loading mechanisms run without problems? Over the years I keep seeing the same demo: a simple data list and simple command buttons to add, update and delete records. (Where are the grids, detail grids, tabs, detail tabs and so on?) Those are not the applications we are trying to develop; we are expected to develop much more complex ones.
3- XML usage should be immediately abandoned:
SQL, mappings and descriptions are held in XML files. This is the most painful mistake made over the years. When annotations were added to Java, XML files were eventually abandoned. Why do we need to add another touch-point for developers, decreasing development speed? Couldn’t we have added this information to the persistent classes without waiting for annotation support? Instead we edit hundreds of XML files without any real benefit. I remember the documentation of one persistence solution saying that transaction boundaries can be determined flexibly via XML files. Who needs flexible transaction boundary changes? I don’t. Who needs flexible mapping? I don’t. Could you change your fragile application without touching any code? Impossible. Field changes are supposedly flexible, but does anyone remember a field name or type change that did not change application code?
4- Database Metadata was not utilized:
Before going to the database, persistent classes should validate their data. This is not the case in many persistence frameworks. We developed generic validation rules that are executed every time the user enters data: type, range, format, length and security (e.g. SQL injection) checks. Any specific validation rule can be added to persistent objects, similar to constraint rules on database tables. There are many other uses for metadata. In some persistence solutions, there is no API to access the metadata of persistent objects other than the database metadata.
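A minimal sketch of metadata-driven validation might look like the following. It is not our framework’s code; ColumnMeta and its fields are hypothetical names, and only null, type and length checks are shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical column metadata, mirroring what the table definition already states. */
class ColumnMeta {
    final String name;
    final Class<?> type;
    final int maxLength;    // applies to String values; 0 means "not checked"
    final boolean nullable;

    ColumnMeta(String name, Class<?> type, int maxLength, boolean nullable) {
        this.name = name;
        this.type = type;
        this.maxLength = maxLength;
        this.nullable = nullable;
    }

    /** Returns the violations found; an empty list means the value may go to the database. */
    List<String> validate(Object value) {
        List<String> errors = new ArrayList<>();
        if (value == null) {
            if (!nullable) {
                errors.add(name + " must not be null");
            }
            return errors;
        }
        if (!type.isInstance(value)) {
            errors.add(name + " expects " + type.getSimpleName());
            return errors;
        }
        if (maxLength > 0 && value instanceof String
                && ((String) value).length() > maxLength) {
            errors.add(name + " exceeds length " + maxLength);
        }
        return errors;
    }
}
```

The metadata itself would not have to be typed by hand. One option, assuming a JDBC connection is available, is to read column sizes and nullability once from java.sql.DatabaseMetaData.getColumns and build the ColumnMeta instances from that.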
5- Object Query Languages are a wrong path:
For the sake of removing all SQL statements from objects, OQLs were invented: EJB-QL, HQL etc. Many classes were written to handle these new query languages. Why do we invent a new language when we already have SQL? I think the reason is that mapping a SQL result set confuses the ORM object caches, so the results had to be made the same as ordinary objects. So let’s reinvent everything for OQL: syntax, access plans, index optimization etc. We don’t need this. If we need some varying criteria, a basic Criteria API is enough for that.
6- SQL visibility is important:
SQL is here and will be here in the future. There have been attempts to remove SQL completely from persistence solutions (the so-called transparent persistence frameworks). SQL should play a much bigger role than it does today. If SQL is being removed for the sake of database independence, its standards (ANSI SQL-92) should already be enough for that. If you want to use custom SQL, your trouble begins. Using SQL should be made simpler in persistence frameworks. It is very hard at the moment.
7- Multiple data source dilemma:
One of the major architectural problems of persistence frameworks is the multiple data source architecture. This is why we still discuss the shortcomings of persistence frameworks today. The small possibility of using a different data source has added an enormous burden to persistence frameworks. What are the data sources other than a database? File, network, EIS, XML, ESB, JMS etc. I cannot answer the question of why we do not leave this part of the problem to other software. I think it is a rare case and should be removed from persistence solutions entirely. This will not endanger their generality.
8- Why are persistent objects so weak (POJO)?
Years ago we were asking the opposite question about EJB: “Why are EJB objects so heavy?” Today the question has been reversed. I think this is mainly caused by the aim of hiding generated code. I think there should be a middle way. Persistent objects could take on much more responsibility, such as validation responsibility, value object responsibility etc.
9- Poor Transaction Management:
It is questionable how visible the objects enlisted in a transaction are. When reading a piece of code, it is very hard to understand the transaction boundaries if they reside in an XML file that you forget to read (declarative transaction demarcation). We use automatic transaction management for mapped child objects, which frees the programmer from starting and committing transactions for them. Is object state handled correctly (according to application needs) on commit or rollback? Is the transactional memory footprint manageable? Transactional memory is a very big problem when thousands of objects participate in a transaction. Is the isolation level adjustable according to application needs?
10- Heavy Object Caches:
Object caches are subject to debate. Aren’t database buffer pools enough for caching? The garbage collector struggles when memory is loaded with many objects. Why do we reinvent the database cache system with object caches? Yes, it is sometimes useful to cache objects during bulk object creation, but if caching is part of the system, isn’t it the responsibility of the database system?
11- Wrong Locking Methods:
Using database locks for application locking needs is another pitfall. Since some processes span many screens and steps, locking records across such long processes leads to performance problems. Optimistic and pessimistic locks are still a missing feature. We also implemented optimistic locking transparently, so that it does not require an explicit API call.
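One common way to do optimistic locking is a version column that is checked in the UPDATE statement itself. The sketch below illustrates that technique with an in-memory stand-in for a table; it is an assumption for illustration and not necessarily how our framework does it. VersionedTable and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table with a VERSION column (hypothetical, for illustration). */
class VersionedTable {
    private static final class Row {
        String value;
        int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Integer, Row> rows = new HashMap<>();

    void insert(int id, String value) {
        rows.put(id, new Row(value, 0));
    }

    int versionOf(int id) {
        return rows.get(id).version;
    }

    String valueOf(int id) {
        return rows.get(id).value;
    }

    /**
     * Mimics: UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?
     * Returns the number of rows changed, as a JDBC executeUpdate() call would.
     */
    int update(int id, String newValue, int expectedVersion) {
        Row row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return 0; // another user changed the row since it was read
        }
        row.value = newValue;
        row.version++;
        return 1;
    }
}
```

No database lock is held while the user works through the screens. The conflict is detected only at save time, when an update count of 0 tells the application that the record was changed by someone else. A persistence layer can hide the version check inside its save operation, so the programmer never calls a locking API.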
12- Persistence API could be simpler:
I do not like the APIs used for persistence. JPA is a candidate to solve this problem. To do simple database operations we need to write a lot of boilerplate code: Sessions, UnitOfWork, Context and so on. Add your UI objects to this list, along with DAOs, DTOs, value objects and so on. Why do we not give this responsibility to the persistent objects? I really cannot figure out why the transaction mechanism is not used for the session architecture. I think this is why many scripting and domain-specific languages (DSLs) are emerging for the sake of simplicity. Saving a record could be as simple as this:
DBEmployee dbEmployee = new DBEmployee();
dbEmployee.dbSave(); // inserts record here
13- Bytecode manipulation problems:
We know from JSP technology that runtime code generation and compilation are very expensive. When we add persistent objects to this bandwagon, the stability of our software system is at stake. Changing an object after its declaration, at runtime or at build time, may cause many problems: performance, testability, stability etc. Many of us have already experienced problems with the JIT compiler, which has a similar mechanism.
14- Unnecessary OID (object id) columns in tables whereas PK columns are enough:
This is a behavior of some persistence frameworks that your DBA will never agree to. Many frameworks cannot handle composite (multi-column) PKs and FKs. This is a very big handicap.
15- Inheritance has no meaningful usage:
The wrong motivation goes like this: “In the object world, inheritance is a great structure, so we must use it in the relational world.” You don’t need inheritance in the relational world. We did not implement inheritance in our ORM solution and we do not even need it. Traversing your persistent object one level down is not a hard job. Your tables may even already have some repeated, redundant columns.
16- Primary and foreign keys are held in accompanying new classes causing many unnecessary classes:
Considering the 1000 tables in our ERP system, applications have many persistent objects. If we had created PK and FK classes, it would have been a total mess. We used simple strings to solve the problem. In some frameworks I see PKs and FKs held in their own classes.
17- Missing cluster support:
Many leading application servers support clustering, but using a cluster with persistent objects is not easy. First of all, sequence generation and the locking system must support clustering. I am sure that writing clusterable persistent objects would not be easy. Not every application needs a cluster, but as your application grows, scalability and availability concerns increase and a cluster becomes necessary.
18- Missing database events:
Like triggers in databases, we sometimes need support for database events such as row inserted, updated or deleted. This is especially required if you want to add BPM or workflow support to your application.
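As a rough sketch of what such event support could look like on the application side, the persistence layer can notify registered listeners after each successful statement. All names here (RowEvent, RowListener, RowEventBus) are hypothetical and do not belong to any existing framework.

```java
import java.util.ArrayList;
import java.util.List;

/** The kinds of row-level events a workflow engine might want to hear about. */
enum RowEvent { INSERTED, UPDATED, DELETED }

/** Callback implemented by interested components, e.g. a workflow starter. */
interface RowListener {
    void onEvent(RowEvent event, String table, Object primaryKey);
}

/** Hypothetical hub that the persistence layer calls after a statement succeeds. */
class RowEventBus {
    private final List<RowListener> listeners = new ArrayList<>();

    void register(RowListener listener) {
        listeners.add(listener);
    }

    /** Notifies every registered listener, in registration order. */
    void fire(RowEvent event, String table, Object primaryKey) {
        for (RowListener listener : listeners) {
            listener.onEvent(event, table, primaryKey);
        }
    }
}
```

A workflow module could then register a listener for a table such as ORDERS and start an approval process whenever an INSERTED event arrives, without the order entry code knowing anything about workflows.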
19- Missing authorization:
Row-level authorization has only recently been added to the feature lists of ORM frameworks. Database connection users can also be separated by user role for security.
20- Missing dynamic rules:
Some validations or calculations change over time and cannot be expressed with static code or a parameter. Persistent objects should support dynamic rule engines or similar mechanisms.
21- Missing object modeling tools for code generation:
Most ORM frameworks have mapping design tools. Reverse engineering from database tables is a common feature. What is missing is the other direction: we should be able to design objects and generate persistent objects from these design models.
22- Missing audit trail:
In ERP systems, an audit trail is a very important requirement. For this reason, every table includes columns like insert time, inserting user, last edit time, last editing user etc. There should be ways to fill these columns, or other helper tables, transparently, without programmers having to append them to SQL statements.
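A minimal sketch of filling audit columns transparently could look like the following. The column names (INSERT_TIME, INSERT_USER, LAST_EDIT_TIME, LAST_EDIT_USER) and the AuditStamper class are assumptions made up for this example; the idea is that the persistence layer passes every row through such a helper before building the statement.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that adds audit columns so that programmers never have to. */
class AuditStamper {
    private final String currentUser;
    private final Clock clock; // injected so that tests can use a fixed time

    AuditStamper(String currentUser, Clock clock) {
        this.currentUser = currentUser;
        this.clock = clock;
    }

    /** Returns a copy of the row with all four audit columns set. */
    Map<String, Object> forInsert(Map<String, Object> values) {
        Map<String, Object> row = new LinkedHashMap<>(values);
        Instant now = clock.instant();
        row.put("INSERT_TIME", now);
        row.put("INSERT_USER", currentUser);
        row.put("LAST_EDIT_TIME", now);
        row.put("LAST_EDIT_USER", currentUser);
        return row;
    }

    /** Returns a copy of the row with only the last-edit columns set. */
    Map<String, Object> forUpdate(Map<String, Object> values) {
        Map<String, Object> row = new LinkedHashMap<>(values);
        row.put("LAST_EDIT_TIME", clock.instant());
        row.put("LAST_EDIT_USER", currentUser);
        return row;
    }
}
```

Because the input map is copied, application code keeps working with its own field values only, and the audit columns appear solely in the row that goes to the database.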
23- Missing persistent object debugging utilities:
The state of a persistent object (insert, update, delete, identical) should be visible. At runtime, persistent objects sometimes need to be inspected to read the last SQL executed or the field values. Instead of attaching profiling software to a production system, some debugging information should be provided by the framework itself.
This is something like a wish list and may be of use to persistence framework developers. Building a persistence framework is very tempting and looks very easy, but as you progress it requires a lot of resources. To justify your development investment, evaluate those resources carefully, keeping in mind the risk that your attempt may not produce a result. Your initial framework will be put into use, and then changing and evolving it will be a real nightmare if you were not familiar with the concepts and followed the wrong path.