The ORM persistence engine provides an object-relational mapping framework. It forms an object cache that communicates with a relational database via JDBC and is written in pure Java. It consists of multiple sub-systems that perform tasks like loading and parsing of deployment descriptors, providing meta-information about persistent objects, generating and executing SQL statements or switching between different transactional and non-transactional states of an object.
The internal structure of the ORM engine can also be seen as a layered architecture. The picture below shows the functional layers with some terms that are important in such layers:
The layers and helper sub-systems are described in more detail later in this document.
Persistent types are represented by ORM beans. An ORM bean consists of 4 files:
The ORM engine provides the abstract base classes for ORM Objects, ORM Object Factories and ORM Object Keys. The application developer must derive persistent object classes from these abstract superclasses. The necessary Java code and the deployment descriptor can be generated from a model using a code generator. In the past, UML was used as the modeling language; currently, models are defined in EDL.
The factory classes control the lifecycle of persistent objects. They allow the creation, finding and deletion of persistent objects. Conceptually, they can be compared with Entity Homes (see EJB spec).
An ORM object represents a persistent identity. It can have multiple states, depending on the current transaction of the caller. Since the identity never changes for an object, ORM objects can be shared between multiple threads. Their lifecycle is not limited by the current transaction, for example. This means that it is legal to reference an ORM object permanently in the application, e.g. as an instance variable in a singleton manager.
For a given primary key, there exists at ALL times at most ONE instance of the associated ORM object in Java memory. This is guaranteed by the internal implementation of the ORM engine. If the instance is modified, its internal state changes, but the object itself remains the same.
The primary keys of persistent objects are represented by separate Java classes. A primary key may consist of multiple primitive attributes (compound key). Primary key classes form a similar inheritance tree as the ORM bean classes.
Primary key objects for abstract superclass beans are not abstract and can be instantiated. They can be used at the superclass factory to find objects (which will be the non-abstract leaf beans). For this, the equals method of the key classes is implemented so that it returns the following results:

    class ExtensibleObjectKey extends ORMObjectKey {...}
    class BasketKey extends ExtensibleObjectKey {...}
    class OrderKey extends ExtensibleObjectKey {...}
    class UserKey extends ORMObjectKey {...}

    new BasketKey("ABC").equals(new ExtensibleObjectKey("ABC")); // true
    new BasketKey("ABC").equals(new BasketKey("ABC"));           // true
    new BasketKey("ABC").equals(new OrderKey("ABC"));            // true
    new BasketKey("ABC").equals(new UserKey("ABC"));             // false

This implies that there may be only ONE instance with the primary key “ABC” across all existing subclasses, e.g. there may be no Order with the key “ABC” if there is already a Basket with “ABC” as key.
ORM objects are held in a single cache (which is just an ordinary hashtable). The key of the hashtable is the primary key of the ORM object. Each value is a reference object that holds the actual ORM object.
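The comparison results shown above can be reproduced with a rule of the following shape: two keys are equal when their values match and both key classes descend from the same top-level key class. The sketch below is one possible way to implement such a rule, not the engine's actual code. `SketchObjectKey` is a hypothetical stand-in for `ORMObjectKey`, and a single `String` stands in for the key attributes.

```java
import java.util.Objects;

// Hypothetical stand-in for the engine's abstract key base class.
abstract class SketchObjectKey {
    private final String value;

    SketchObjectKey(String value) {
        this.value = Objects.requireNonNull(value);
    }

    // The top-most key class below the abstract base identifies the inheritance tree.
    private Class<?> rootKeyClass() {
        Class<?> c = getClass();
        while (c.getSuperclass() != SketchObjectKey.class) {
            c = c.getSuperclass();
        }
        return c;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchObjectKey)) return false;
        SketchObjectKey other = (SketchObjectKey) o;
        // Equal if both keys belong to the same tree and carry the same value.
        return rootKeyClass() == other.rootKeyClass() && value.equals(other.value);
    }

    @Override
    public int hashCode() {
        // Must not depend on the concrete subclass, or equal keys would hash differently.
        return Objects.hash(rootKeyClass(), value);
    }
}

class ExtensibleObjectKey extends SketchObjectKey {
    ExtensibleObjectKey(String value) { super(value); }
}

class BasketKey extends ExtensibleObjectKey {
    BasketKey(String value) { super(value); }
}

class OrderKey extends ExtensibleObjectKey {
    OrderKey(String value) { super(value); }
}

class UserKey extends SketchObjectKey {
    UserKey(String value) { super(value); }
}
```

Because the hash code is derived from the root key class rather than the concrete class, a superclass key such as `new ExtensibleObjectKey("ABC")` lands in the same hashtable bucket as `new BasketKey("ABC")`, which is what a lookup through a superclass factory would need.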
There are 3 reference types that can be used to hold an ORM object in the cache:
By default, all objects are referenced softly. The whole cache is exclusively controlled by the garbage collector. There is no way to limit the cache size to a given number of entries, for example.
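A cache of this kind can be approximated with `java.lang.ref.SoftReference` values in a concurrent map. The following is a minimal sketch under that assumption, not the engine's implementation; `SoftIdentityCache` and `getOrLoad` are hypothetical names. It shows how a lookup either returns the one live instance for a key or loads a new one after the garbage collector has cleared the old reference.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical identity cache: at most one live instance per key, held softly.
final class SoftIdentityCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    V getOrLoad(K key, Function<K, V> loader) {
        // Holds a strong reference to the result so it cannot be collected
        // between the map update and the return statement.
        Object[] result = new Object[1];
        entries.compute(key, (k, ref) -> {
            V cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                result[0] = cached;
                return ref;                     // entry is still alive, keep it
            }
            V loaded = loader.apply(k);         // entry missing or cleared by the GC
            result[0] = loaded;
            return new SoftReference<V>(loaded);
        });
        @SuppressWarnings("unchecked")
        V value = (V) result[0];
        return value;
    }
}
```

Since the map itself never evicts entries, the number of live objects is bounded only by what the garbage collector decides to keep, which matches the behavior described above.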
Each ORM object has multiple attributes that represent the values of the object. An ORM object typically has at least the following attributes:
ORM objects can have relations with each other. There are 3 types of cardinalities for such relations:
Relations can be bidirectional or unidirectional. Bidirectional relations have an associated relation in the opposite direction. The available cardinalities for bidirectional relations depend on each other. The following combinations are supported:
ORM Object 1 | ORM Object 2 |
---|---|
0..1 | 0..1 |
0..1 | 1 |
0..1 | 0..n |
1 | 0..1 |
1 | 0..n |
0..n | 0..1 |
0..n | 1 |
The combinations 1 – 1 and 0..n – 0..n are not allowed. However, 0..n - 0..n relations can be implemented using an additional assignment bean, like:
Bidirectional relations are always managed relations. This means, if one side of the relation changes, the opposite relation changes too.
Example: Let's assume that there are two ORM beans A and B, which have a bidirectional 0..1 - 0..1 relation with each other. Calling "a.setB(b)" changes both directions A --> B and B --> A, so a call to "b.getA()" would return "a". An additional "b.setA(a)" is not needed.
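The managed behavior from this example can be sketched with two plain Java classes. This is an in-memory illustration only, with no persistence or transaction handling, and it is not how the engine implements relations internally. Each setter updates its own side and then brings the opposite side in line, with guards that stop the mutual calls from recursing forever.

```java
// In-memory sketch of a managed bidirectional 0..1 - 0..1 relation.
final class A {
    private B b;

    B getB() { return b; }

    void setB(B newB) {
        if (b == newB) return;                   // nothing to do, also ends recursion
        B old = b;
        b = newB;
        // Detach the previous partner only if it still points back to this object.
        if (old != null && old.getA() == this) old.setA(null);
        // Attach the new partner unless it already points back to this object.
        if (newB != null && newB.getA() != this) newB.setA(this);
    }
}

final class B {
    private A a;

    A getA() { return a; }

    void setA(A newA) {
        if (a == newA) return;
        A old = a;
        a = newA;
        if (old != null && old.getB() == this) old.setB(null);
        if (newA != null && newA.getB() != this) newA.setB(this);
    }
}
```

With these classes, `a.setB(b)` is enough for `b.getA()` to return `a`, and assigning `b` to a second `A` instance also clears the relation on the first one.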
Unidirectional relations are always unmanaged, as there is no opposite direction.
Relations are generally fetched lazily from the database, i.e. on first access.
Because 0..n relations can potentially contain a large number of objects, a special handling is implemented in the ORM engine for updating such relations.
Relations are represented in memory in two different ways:
Relations are fetched completely when they are read for the first time. For 0..n relations, the caller has to obtain an iterator over the relation to trigger the complete fetch.
An ORM object has 2 types of internal states:
A transactional state is created when:
If there is already a shared state in memory, the transactional state is created by cloning the shared state, i.e. all loaded attribute and relation states are cloned. For 0..n relations, the relation state is not cloned, but a new delta relation state is created.
Attributes that are not in memory yet (lazy), will not be cloned. They will be bound to the transactional state when they are accessed and loaded from the database.
After a transaction has been committed, new shared states for the changed objects are created by cloning the transactional states. They replace the old shared states.
Shared states of attributes and relations can be referenced by strong, soft, or weak references, or not referenced at all (none), as specified in the deployment descriptor.
Transactional states are always referenced using strong references. This guarantees that no values (which have possibly been modified) are garbage collected before the transaction is committed. The reference types are converted when a shared state is cloned into a transactional state, or when a transactional state is cloned to become a shared state.
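The cloning between shared and transactional state can be pictured as a copy-on-write scheme. The sketch below is a simplified model under several assumptions: attributes are kept in a plain map, the transactional state is bound to the calling thread through a `ThreadLocal`, and lazy attributes, delta relation states and reference types are left out. `SketchOrmObject` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of one object with a shared state and per-thread transactional states.
final class SketchOrmObject {
    // Shared state: visible to all threads, replaced as a whole on commit.
    private volatile Map<String, Object> shared = new HashMap<>();
    // Transactional state: private to the thread that modifies the object.
    private final ThreadLocal<Map<String, Object>> transactional = new ThreadLocal<>();

    Object get(String attribute) {
        Map<String, Object> tx = transactional.get();
        return (tx != null ? tx : shared).get(attribute);
    }

    void set(String attribute, Object value) {
        Map<String, Object> tx = transactional.get();
        if (tx == null) {
            tx = new HashMap<>(shared);          // clone the shared state on first write
            transactional.set(tx);
        }
        tx.put(attribute, value);
    }

    void commit() {
        Map<String, Object> tx = transactional.get();
        if (tx != null) {
            shared = new HashMap<>(tx);          // clone becomes the new shared state
            transactional.remove();
        }
    }

    void rollback() {
        transactional.remove();                  // the transactional state is thrown away
    }
}
```

Until `commit` is called, other threads keep reading the old shared state, which is why uncommitted changes are invisible outside the modifying thread.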
The OCA of the parent object is included if a lazy attribute or relation must be fetched during an active transaction. This ensures that the object has not been modified in the meantime and that the partial data is still valid. If the OCA has been modified, no object will be found.
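The effect of including the OCA in a statement can be simulated without a database. The sketch below assumes that the OCA behaves like a version counter that is incremented on every update; that detail is an assumption made for illustration. `OcaTableSketch` is a hypothetical in-memory stand-in for a table, and the SQL in the comments only indicates which kind of statement each method mimics.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical in-memory table with an OCA column modeled as a counter.
final class OcaTableSketch {
    private static final class Row {
        long oca;
        String description;

        Row(long oca, String description) {
            this.oca = oca;
            this.description = description;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    void insert(String uuid, String description) {
        rows.put(uuid, new Row(0, Objects.requireNonNull(description)));
    }

    // Mimics: SELECT description FROM t WHERE uuid=? AND oca=?
    Optional<String> fetchLazy(String uuid, long expectedOca) {
        Row row = rows.get(uuid);
        if (row == null || row.oca != expectedOca) {
            return Optional.empty();             // modified in the meantime: nothing found
        }
        return Optional.of(row.description);
    }

    // Mimics: UPDATE t SET description=?, oca=oca+1 WHERE uuid=? AND oca=?
    int update(String uuid, long expectedOca, String description) {
        Row row = rows.get(uuid);
        if (row == null || row.oca != expectedOca) {
            return 0;                            // no row affected
        }
        row.description = Objects.requireNonNull(description);
        row.oca++;
        return 1;
    }
}
```

A caller that still holds the old OCA value gets an empty result or an update count of 0 and can treat that as a sign that its partial data is stale.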
SQL queries that find objects in the database can be executed by using the provided methods in the ORM object factory classes. The following kinds of queries are supported:
All query methods are supported on super-class factories, too. In this case, the query is executed on each existing concrete sub-class table separately, and the result sets are combined into one collection. Note that it is not possible to find objects of different types within one single query.
Queries always go down to the database; they cannot be executed in memory only.
It is strongly recommended to use bind variables in all WHERE conditions, in order to keep the load on the database low (e.g. to reduce statement parsing) and to prevent SQL injection attacks.
Queries always read their result sets lazily, as they are expected to work on potentially huge amounts of data. If the application reads the query result completely (e.g. by iterating through the returned collection), the underlying database cursor is closed implicitly when the end of the iterator is reached. If the application does not read the query result completely, the underlying cursor must be closed explicitly. To do this, the application calls an “endRequest” method, which releases all database resources (connections, cursors) associated with the current thread.
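The implicit and explicit closing of a cursor can be illustrated with a small iterator wrapper. This is a sketch of the pattern only; `ClosingIterator` is a hypothetical class, and a `Runnable` stands in for whatever releases the real database cursor.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical lazy result iterator that releases its cursor exactly once.
final class ClosingIterator<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> cursor;
    private final Runnable releaseCursor;
    private boolean closed;

    ClosingIterator(Iterator<T> cursor, Runnable releaseCursor) {
        this.cursor = cursor;
        this.releaseCursor = releaseCursor;
    }

    @Override
    public boolean hasNext() {
        if (closed) return false;
        if (cursor.hasNext()) return true;
        close();                                 // end reached: close implicitly
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return cursor.next();
    }

    @Override
    public void close() {
        if (!closed) {                           // safe to call more than once
            closed = true;
            releaseCursor.run();
        }
    }
}
```

A caller that stops iterating early has to call `close()` itself, which corresponds to the explicit clean-up step described above.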
The ORM object factory classes support the execution of arbitrary WHERE conditions in order to find objects of a certain type.
The WHERE condition can optionally contain placeholders for additional SQL parameters, which are identified by ‘?’ symbols. The application developer must not pass in the WHERE keyword itself; only the condition must be provided. The condition may contain additional SQL clauses such as ORDER BY.

    // leads to: SELECT uuid, name, … FROM FOO WHERE name=?
    Collection<Foo> foos = fooFactory.getObjectsBySQLWhere("name=?", new Object[]{"bla"});

The ORM object factory classes support the execution of arbitrary JOIN queries in order to find objects of a certain type.
In contrast to getObjectsBySQLWhere queries, which only operate on the table of a single object, the getObjectsBySQLJoin queries operate on multiple tables that have a dependency on each other. This requires a naming schema in order to identify the columns of each involved table.
The “local” table (i.e. the table of the persistent object whose factory is used to execute the query) is always identified by the “this” alias. The aliases of other related tables can be chosen by the application developer.
The WHERE condition can optionally contain placeholders for additional SQL parameters, which are identified by ‘?’ symbols. The FROM and WHERE keywords themselves must not be passed in by the application developer.

    // leads to: SELECT this.uuid, this.name, … FROM Bar this, Foo f WHERE this.uuid=f.barID and this.name=?
    Collection<Bar> bars = barFactory.getObjectsBySQLJoin("Foo f", "this.uuid=f.barID and this.name=?", new Object[]{"bla"});

The “this” alias is bound to the “BAR” table, because the “BarFactory” was used to call the method and objects of type “Bar” are expected in the result set. The application developer must declare any additional tables that are important for the join condition, e.g. “Foo f”.
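How the pieces of such a join statement fit together can be shown with a small string builder. This is only a sketch of the assembly described above; `JoinQuerySketch` and `buildJoinSelect` are hypothetical names, and the real engine takes table and column names from its mappings rather than from parameters.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that assembles a join SELECT around the "this" alias.
final class JoinQuerySketch {
    static String buildJoinSelect(String localTable, List<String> localColumns,
                                  String joinTables, String condition) {
        // Only columns of the local table are selected, each qualified with "this".
        String columns = localColumns.stream()
                .map(column -> "this." + column)
                .collect(Collectors.joining(", "));
        // FROM and WHERE are added here, so callers pass only tables and the condition.
        return "SELECT " + columns
                + " FROM " + localTable + " this, " + joinTables
                + " WHERE " + condition;
    }
}
```

Calling `buildJoinSelect("Bar", List.of("uuid", "name"), "Foo f", "this.uuid=f.barID and this.name=?")` yields a statement of the same form as the one in the comment above, with `?` left in place for a bind variable.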
The JDBC manager manages the JDBC connections that are currently used by the application. Each thread will get its own exclusive connection. When the thread is finished, it should call “closeConnection” to release the connection again.
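Binding one connection to each thread is commonly done with a `ThreadLocal`. The sketch below illustrates that pattern under simplifying assumptions: `ThreadBoundConnections` is a hypothetical class, a `Supplier` stands in for the datasource, and any `AutoCloseable` stands in for a JDBC connection so that the example runs without a database.

```java
import java.util.function.Supplier;

// Hypothetical manager that hands each thread its own exclusive connection.
final class ThreadBoundConnections<C extends AutoCloseable> {
    private final Supplier<C> dataSource;
    private final ThreadLocal<C> current = new ThreadLocal<>();

    ThreadBoundConnections(Supplier<C> dataSource) {
        this.dataSource = dataSource;
    }

    // Returns the calling thread's connection, obtaining one on first use.
    C getConnection() {
        C connection = current.get();
        if (connection == null) {
            connection = dataSource.get();
            current.set(connection);
        }
        return connection;
    }

    // Releases the calling thread's connection, if it has one.
    void closeConnection() {
        C connection = current.get();
        if (connection == null) return;
        current.remove();
        try {
            connection.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not close connection", e);
        }
    }
}
```

If a thread never calls `closeConnection`, its connection stays bound to it, which is why releasing the connection at the end of the work is important with pooled threads.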
The JDBC manager uses a configurable datasource in order to obtain connections. For example, the datasource may be an OracleDataSource which is configured in the ORM engine directly, or it may be a datasource from an application server into which the ORM engine was deployed. Which datasource must be used can be configured in the ORM configuration.
Connection pooling, failover etc. must be handled by the datasource (e.g. the Oracle JDBC driver).
The ORM engine performs a number of type conversions for the attributes of persistent objects. The general concept is as follows:
When a modeling language like EDL is used in conjunction with a code generator, there is another mapping from model types to the implementation (e.g. Java / JDBC) types. This mapping is done on top of the ORM type mapping and has nothing to do with the ORM engine implementation. For more details and a list of compound types that are supported by the code generator, refer to the EDL documentation.
The table lists the currently supported standard conversions that are built into ORM:
Java Type | JDBC Type | Oracle Type |
---|---|---|
boolean | BOOLEAN | NUMBER(1) |
byte | TINYINT | NUMBER(4) |
byte[] | BLOB | BLOB |
char | CHAR | VARCHAR2(3 CHAR) |
char[] | VARCHAR(<length>) | VARCHAR2(<length> CHAR) |
short | SMALLINT | NUMBER(6) |
int | INTEGER | NUMBER(11) |
long | BIGINT | NUMBER(21) |
float | FLOAT | FLOAT(63) |
double | DOUBLE | FLOAT(126) |
java.io.Serializable | BLOB | BLOB |
java.lang.Byte | TINYINT | NUMBER(4) |
java.lang.Character | CHAR | VARCHAR2(3 CHAR) |
java.lang.Short | SMALLINT | NUMBER(6) |
java.lang.Integer | INTEGER | NUMBER(11) |
java.lang.Long | BIGINT | NUMBER(21) |
java.lang.Float | FLOAT | FLOAT(63) |
java.lang.Double | DOUBLE | FLOAT(126) |
java.lang.String | VARCHAR(<length>) | VARCHAR2(<length> CHAR) |
java.math.BigDecimal | DECIMAL | NUMBER(38, 6) |
java.sql.Blob | BLOB | BLOB |
java.sql.Clob | CLOB | CLOB |
java.util.Date | TIMESTAMP | DATE |
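The Java-to-JDBC column of this table can be expressed as a lookup from a Java class to a `java.sql.Types` constant. The sketch below simply transcribes the table; `TypeMappingSketch` and `jdbcTypeFor` are hypothetical names, and the lookup matches exact classes only, which is a simplification.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup that transcribes the Java Type -> JDBC Type columns.
final class TypeMappingSketch {
    private static final Map<Class<?>, Integer> JDBC_TYPES = new HashMap<>();

    static {
        JDBC_TYPES.put(boolean.class, Types.BOOLEAN);
        JDBC_TYPES.put(byte.class, Types.TINYINT);
        JDBC_TYPES.put(byte[].class, Types.BLOB);
        JDBC_TYPES.put(char.class, Types.CHAR);
        JDBC_TYPES.put(char[].class, Types.VARCHAR);
        JDBC_TYPES.put(short.class, Types.SMALLINT);
        JDBC_TYPES.put(int.class, Types.INTEGER);
        JDBC_TYPES.put(long.class, Types.BIGINT);
        JDBC_TYPES.put(float.class, Types.FLOAT);
        JDBC_TYPES.put(double.class, Types.DOUBLE);
        JDBC_TYPES.put(java.io.Serializable.class, Types.BLOB);
        JDBC_TYPES.put(Byte.class, Types.TINYINT);
        JDBC_TYPES.put(Character.class, Types.CHAR);
        JDBC_TYPES.put(Short.class, Types.SMALLINT);
        JDBC_TYPES.put(Integer.class, Types.INTEGER);
        JDBC_TYPES.put(Long.class, Types.BIGINT);
        JDBC_TYPES.put(Float.class, Types.FLOAT);
        JDBC_TYPES.put(Double.class, Types.DOUBLE);
        JDBC_TYPES.put(String.class, Types.VARCHAR);
        JDBC_TYPES.put(java.math.BigDecimal.class, Types.DECIMAL);
        JDBC_TYPES.put(java.sql.Blob.class, Types.BLOB);
        JDBC_TYPES.put(java.sql.Clob.class, Types.CLOB);
        JDBC_TYPES.put(java.util.Date.class, Types.TIMESTAMP);
    }

    // Returns the java.sql.Types constant for a Java type listed in the table.
    static int jdbcTypeFor(Class<?> javaType) {
        Integer jdbcType = JDBC_TYPES.get(javaType);
        if (jdbcType == null) {
            throw new IllegalArgumentException("No standard conversion for " + javaType.getName());
        }
        return jdbcType;
    }
}
```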
ORM transactions (like connections and states) are associated with the current thread. The current transaction can be obtained from the transaction manager. Multiple calls from the same thread will always return the same transaction object, so it can be reused for multiple transaction cycles.
The thread binding has several implications: It is not possible to pass an ORM object with uncommitted changes from one thread to another, because the other thread will have a different transactional context. This means, it would possibly see completely different values for the attributes and relations.
All changes to an ORM object require an active transaction. So in order to create, modify or remove an ORM object, a transaction must be started. This can be achieved by obtaining the transaction for the current thread from the transaction manager and calling its begin method.
The transaction contains a list of all objects that have been touched. If the store method is called, all such changes are written to the database (but not committed yet). This can be used to ensure that a find query will find objects that have been created in the running transaction.
A flush is quite similar to a store, but additionally the transactional state of the object in memory will be dropped. This operation should only be used in rare cases where large amounts of data are written through ORM to the database and memory consumption is otherwise too high, like for import processes.
On commit of the transaction, the following actions will be done:
If the ORM object has an OCA attribute, it will be used in the generated SQL statements in order to ensure transaction consistency. The OCA is also used when writing back committed changes as shared state in the ORM object.
If no OCA is available, the shared state will be dropped from memory.
The transactional states will simply be thrown away. Nothing is written to the database. The current JDBC connection will be rolled back.
It is possible to perform additional actions during transaction operations by registering a transaction listener. Listeners can be registered either at the transaction manager, in which case they are called for ANY transaction, or at a single transaction, in which case they are called on lifecycle events of that transaction only. The listener methods are invoked on:
The description manager provides meta-information about the persistent ORM objects that have been registered at the ORM engine. The descriptions are initialized from the ORM deployment descriptor files during startup.
The mapping manager provides meta-information about how persistent ORM objects map to database tables. The mappings are also initialized from the ORM deployment descriptors at startup. They are used to generate the SQL statements.
There is one mapping for each concrete subclass. Abstract superclasses are not mapped to tables, thus they don’t have a mapping. Therefore, the attributes and relations of the abstract superclasses will be mapped in the concrete subclasses to their respective tables.
The deployment manager parses the XML deployment descriptors and initializes the description and mapping objects. The deployment descriptor contains a complete description of the persistent Java classes, their attributes, relations and their mapping to database tables. The application developer is responsible for registering all deployment descriptor files that are available in the system when initializing the ORM engine. The deployment manager uses a SAX parser and some specialized content handlers for efficiently parsing the files. It also performs some verification in order to ensure the validity and completeness of the deployed ORM files.
The monitoring manager allows the monitoring of certain runtime aspects of the ORM engine. In particular, it can be used to retrieve statistical information about the number of cache hits, cache misses, database hits, database misses etc. for each kind of persistent object, or the number of transactions and active JDBC connections. The information can be used to optimize the application.
The synchronization manager synchronizes the ORM object caches that live in different Java VMs. After a transaction has been committed, it propagates all changes to ORM objects to the listening parties.
By default, the changes are propagated as UDP multicast messages. Only the primary key objects and the type of change (object creation, modification, deletion) are submitted, but not the full state. The receiving ORM engine simply removes the shared state of the affected ORM object from the cache. Therefore, such objects will be read from the database again on next access.
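The receiving side of this scheme amounts to cache invalidation. The sketch below models only that step, without any network transport; `ChangeType`, `ChangeMessage` and `InvalidatingReceiver` are hypothetical names, and a `String` stands in for the primary key object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical kinds of change carried by a synchronization message.
enum ChangeType { CREATED, MODIFIED, DELETED }

// Hypothetical message: only the key and the kind of change, never the full state.
final class ChangeMessage {
    final String primaryKey;
    final ChangeType type;

    ChangeMessage(String primaryKey, ChangeType type) {
        this.primaryKey = primaryKey;
        this.type = type;
    }
}

// Hypothetical receiver that drops the shared state of affected objects.
final class InvalidatingReceiver {
    private final Map<String, Object> sharedStates = new ConcurrentHashMap<>();

    void cacheSharedState(String primaryKey, Object state) {
        sharedStates.put(primaryKey, state);
    }

    Object sharedState(String primaryKey) {
        return sharedStates.get(primaryKey);     // null means: reload from the database
    }

    void onMessage(ChangeMessage message) {
        // Whatever the kind of change, the local copy can no longer be trusted.
        sharedStates.remove(message.primaryKey);
    }
}
```

Sending only keys keeps the messages small, at the cost of one extra database read per affected object on each receiving VM.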
The ORM Engine class is the central entry point for the application developers. It provides methods to initialize the whole system and gives access to the individual sub-system managers.
The engine can be used to deploy the ORM beans by registering the available deployment descriptors at the deployment manager.
A typical flow of using the ORM engine would look like this:

    ORMEngine engine = ORMEngine.newInstance(props);
    engine.init();
    // deploy the ORM deployment descriptors
    ...
    engine.start();
    // use the engine to lookup ORM Object factories and work with them
    ...
    engine.stop();

The ORM engine can be configured with properties that must be passed at its instantiation.
The ORM engine uses the Apache Commons Logging framework as its logging API. This framework serves as an abstract API, which can be implemented by arbitrary “adapter” classes to support specific logging framework implementations. For example, there may be an adapter that uses the Apache Log4J logging framework to handle the log messages.
The ORM engine uses the "sql" logging category for logging the generated SQL statements and transaction boundaries, using DEBUG as log level.