
Concept - ORM Layer

Overview

The ORM persistence engine provides an object-relational mapping framework. It forms an object cache that communicates with a relational database via JDBC and is written in pure Java. It consists of multiple sub-systems that perform tasks like loading and parsing of deployment descriptors, providing meta-information about persistent objects, generating and executing SQL statements or switching between different transactional and non-transactional states of an object.

The internal structure of the ORM engine can also be seen as a layered architecture. The picture below shows the functional layers with some terms that are important in such layers:

Application Layer: consists of the application objects of the business application, e.g. the actual persistent objects that must be stored in the database, their factory objects and their primary key objects
Object Layer: handles all aspects of persistent objects on a global object-level, e.g. caching of objects, switching between object states depending on the current transaction, creation of new persistent objects
State Layer: handles all aspects on state-level, e.g. updating an object state in the database, getting the values of an attribute from its attribute state, determining the relation state for a relation. The state layer uses meta-information about the persistent objects, which are provided by a description manager.
Query Layer: handles all aspects on query-level, e.g. construction of SQL statements in order to find, create, update, delete persistent object states, attribute states or relation states, execution of multiple SQL queries for finding the states of all existing instances of the sub-classes of an abstract super-class etc. The query layer uses meta-information about the mapping from persistent objects to the database, which are provided by a mapping manager.
JDBC Layer: handles all aspects on JDBC-level, e.g. type-safe setting and getting values from JDBC prepared statements or JDBC result sets, management of JDBC connections
Transaction Manager: controls which object state is currently associated with the current thread, controls commit and rollback actions in the cache and the database

The layers and helper sub-systems are described in more detail later in this document.

Features of the ORM Engine

Object-Relational Mapping Features

inheritance between persistent objects with abstract superclasses
mapping of concrete leaf classes to tables
find by primary key on superclasses or leaf classes
find by alternate key on superclasses or leaf classes
find by attribute on superclasses or leaf classes
find by SQL where condition on superclasses or leaf classes
find by SQL join condition on superclasses or leaf classes
bidirectional managed relations between subclasses and / or superclasses
unidirectional unmanaged relations between subclasses and / or superclasses
simple or compound primary keys
simple or compound alternate keys
simple or compound foreign keys for relations
all primitive Java types for attributes
complex Java types (String, Date, Blob, Clob)
custom types for attributes
relation cardinalities 0..1, 1, 0..n
support for required not-null attributes
registration of custom object listeners at objects
modification time tracking

Caching Features

caching of persistent objects by primary key
caching of persistent objects by alternate key
caching of relations between persistent objects
configurable cache reference types for objects (strong, soft, weak, none)
configurable cache reference types for relations (strong, soft, weak, none)
configurable cache reference types for attributes (strong, soft, weak, none)
global cache clearing
cache clearing by type
cache clearing / refreshing for single objects
lazy fetch of relations on access
lazy fetch of attributes on access
instant fetch of attributes with object
cache synchronization between multiple ORM engines on object modifications
cache monitoring (cache size, number of objects of a certain type)
cache size control via Java VM garbage collector

Transaction Features

optimistic transactions with optimistic control attribute (OCA)
explicit locking of single objects ("select for update")
optional synchronization with JTA transactions for integration with Application Server transactions
begin, commit, rollback, store, flush of transactions
registration of custom transaction listeners at transaction
transaction monitoring

JDBC Features

creation of SQL statements
support for Oracle JDBC driver (thin, OCI)
support for Oracle 8i, 9i, 10g, 11g
support for bind variables
connection monitoring

Logging Features

logging of SQL statements that are sent to the database along with the values of their bind variables
logging of transaction boundaries
support for Apache commons-logging API

Runtime Environment

pure Java
platform independent, runnable in any Java application on any operating system
multiple instances connected with different databases possible

ORM Engine Sub-Systems

Object Layer

Persistent types are represented by ORM beans. An ORM bean consists of 4 files:

the actual ORM bean class, which represents a persistent object
the ORM factory class, which controls the lifecycle of the ORM object. The factory is quite similar to an EJB home.
the ORM key class, which identifies a persistent object
the ORM deployment descriptor, which describes the ORM object and its mapping to the database

The ORM engine provides the abstract base classes for ORM Objects, ORM Object Factories and ORM Object Keys. Application developers must derive their persistent classes from these abstract superclasses. The necessary Java code and the deployment descriptor can be generated from a model using a code generator. In the past, UML was used as the modeling language; currently, models are defined in EDL.

ORM Object Factories

The factory classes control the lifecycle of persistent objects. They allow the creation, finding and deletion of persistent objects. Conceptually, they can be compared with Entity Homes (see EJB spec).

ORM Objects

An ORM object represents a persistent identity. It can have multiple states, depending on the current transaction of the caller. Since the identity never changes for an object, ORM objects can be shared between multiple threads. They do not have a lifecycle that is limited by the current transaction, for example. This means, it is legal to reference an ORM object permanently in the application, e.g. as an instance variable in a singleton manager.
For a given primary key, there exists at ALL times at most ONE instance of the associated ORM object in Java memory. This is guaranteed by the internal implementation of the ORM engine. If the instance is modified, its internal state changes, but the object itself remains the same.

ORM Object Keys

The primary keys of persistent objects are represented by separate Java classes. A primary key may consist of multiple primitive attributes (compound key). Primary key classes form a similar inheritance tree as the ORM bean classes.

Primary key objects for abstract superclass beans are not abstract and can be instantiated. They can be used at the superclass factory to find objects (which will be the non-abstract leaf beans). For this, the equals method of the key classes is implemented in such a way that it returns the following results:

class ExtensibleObjectKey extends ORMObjectKey {...}
class BasketKey extends ExtensibleObjectKey {...}
class OrderKey extends ExtensibleObjectKey {...}
class UserKey extends ORMObjectKey {...}

new BasketKey("ABC").equals(new ExtensibleObjectKey("ABC")); // true
new BasketKey("ABC").equals(new BasketKey("ABC")); // true
new BasketKey("ABC").equals(new OrderKey("ABC")); // true
new BasketKey("ABC").equals(new UserKey("ABC")); // false

This implies that there may be only ONE instance with the primary key “ABC” across all existing subclasses, i.e. there may be no Order with the key “ABC” if there is already a Basket with “ABC” as key.
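One way to obtain the equality results listed above is to compare keys by the topmost key class below the common base class plus the key value. The following is a minimal sketch under that assumption; the classes are simplified stand-ins with a single String value, and the actual ORM implementation may work differently.

```java
import java.util.Objects;

// Simplified stand-in for the engine's base key class (assumption: one String value).
abstract class ORMObjectKey {
    private final String value;

    protected ORMObjectKey(String value) {
        this.value = Objects.requireNonNull(value);
    }

    // The topmost class directly below ORMObjectKey identifies the inheritance tree.
    private Class<?> rootKeyClass() {
        Class<?> c = getClass();
        while (c.getSuperclass() != ORMObjectKey.class) {
            c = c.getSuperclass();
        }
        return c;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ORMObjectKey)) return false;
        ORMObjectKey that = (ORMObjectKey) other;
        // Equal if both keys belong to the same tree and carry the same value.
        return rootKeyClass() == that.rootKeyClass() && value.equals(that.value);
    }

    @Override
    public int hashCode() {
        // Must not depend on the concrete leaf class, or equal keys would hash differently.
        return Objects.hash(rootKeyClass(), value);
    }
}

class ExtensibleObjectKey extends ORMObjectKey {
    ExtensibleObjectKey(String value) { super(value); }
}

class BasketKey extends ExtensibleObjectKey {
    BasketKey(String value) { super(value); }
}

class OrderKey extends ExtensibleObjectKey {
    OrderKey(String value) { super(value); }
}

class UserKey extends ORMObjectKey {
    UserKey(String value) { super(value); }
}
```

Because the hash code ignores the leaf class, a lookup in a hash-based cache with an ExtensibleObjectKey can hit an entry stored under a BasketKey.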

Object Cache

ORM objects are held in a single cache (which is just an ordinary hash table). The key of the hash table is the primary key of the ORM object. Each value is a reference object that holds the actual ORM object.

There are 3 reference types that can be used to hold an ORM object in the cache:

strong object reference: holds the object directly in the cache, using a strong Java reference. The object will never be garbage collected.
soft object reference: holds the object in the cache using a soft Java reference. The object may be garbage collected if the VM is low on memory and nobody else references the object.
weak object reference: holds the object in the cache using a weak Java reference. The object may be garbage collected as soon as nobody else references it, regardless of how much memory is available. Weakly referenced objects are therefore gc’ed before softly referenced objects.

By default, all objects are referenced softly. The whole cache is exclusively controlled by the garbage collector. There is no way to limit the cache size to a given number of entries, for example.
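The cache structure described above can be sketched with the standard java.lang.ref classes. ReferenceCache and CacheReferenceType are hypothetical names chosen for this illustration; they are not the engine's own classes.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical names; only java.lang.ref and java.util are real APIs here.
enum CacheReferenceType { STRONG, SOFT, WEAK }

final class ReferenceCache<K, V> {
    private final CacheReferenceType type;
    // Each entry is a holder that returns the object, or null once it was collected.
    private final Map<K, Supplier<V>> entries = new ConcurrentHashMap<>();

    ReferenceCache(CacheReferenceType type) {
        this.type = type;
    }

    void put(K key, V value) {
        entries.put(key, wrap(value));
    }

    V get(K key) {
        Supplier<V> holder = entries.get(key);
        if (holder == null) return null;
        V value = holder.get();
        if (value == null) {
            // The referent was garbage collected; drop the stale entry.
            entries.remove(key, holder);
        }
        return value;
    }

    int size() {
        return entries.size();
    }

    private Supplier<V> wrap(V value) {
        switch (type) {
            case STRONG:
                return () -> value; // the lambda keeps the object strongly reachable
            case SOFT:
                SoftReference<V> soft = new SoftReference<>(value);
                return soft::get;
            default:
                WeakReference<V> weak = new WeakReference<>(value);
                return weak::get;
        }
    }
}
```

A cache miss and a collected referent look the same to the caller: both return null, and the object would then be loaded from the database again.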

Object Attributes

Each ORM object has multiple attributes that represent the values of the object. An ORM object typically has at least the following attributes:

one or more primary key attributes: there MUST be at least one primary key attribute. It is used to identify the object and cannot be changed after the object has been created.
optimistic control attribute (OCA): optional, but recommended. Used for optimistic transaction handling.

Object Relations

ORM objects can have relations with each other. There are 3 types of cardinalities for such relations:

0..1: the relation references 0 or 1 other object
0..n: the relation references 0 or any number of other objects
1: the relation always references exactly 1 other object. The referenced object must be passed when the ORM object is created.

Relations can be bidirectional or unidirectional. Bidirectional relations have an associated relation in the opposite direction. The available cardinalities for bidirectional relations depend on each other. The following combinations are supported:

ORM Object 1	ORM Object 2
0..1	0..1
0..1	1
0..1	0..n
1	0..1
1	0..n
0..n	0..1
0..n	1

The combinations 1 - 1 and 0..n - 0..n are not allowed. However, 0..n - 0..n relations can be implemented using an additional assignment bean.

Bidirectional relations are always managed relations. This means, if one side of the relation changes, the opposite relation changes too.

Example: Let's assume that there are two ORM beans A and B, which have a bidirectional 0..1 - 0..1 relation with each other. Calling "a.setB(b)" changes both directions A --> B and B --> A, so a call to "b.getA()" would return "a". An additional "b.setA(a)" is not needed.

Unidirectional relations are always unmanaged, as there is no opposite direction.
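The managed behavior of a bidirectional 0..1 - 0..1 relation can be illustrated with two plain Java classes. BeanA and BeanB are hypothetical and use simple fields; generated ORM beans would work on relation states instead, so this is only a sketch of the idea.

```java
// Hypothetical beans with a managed bidirectional 0..1 - 0..1 relation.
final class BeanA {
    private BeanB b;

    BeanB getB() {
        return b;
    }

    void setB(BeanB newB) {
        if (b == newB) return;              // nothing to do, also stops the recursion
        BeanB old = b;
        b = newB;
        if (old != null && old.getA() == this) {
            old.setA(null);                 // detach the previous partner
        }
        if (newB != null && newB.getA() != this) {
            newB.setA(this);                // update the opposite direction
        }
    }
}

final class BeanB {
    private BeanA a;

    BeanA getA() {
        return a;
    }

    void setA(BeanA newA) {
        if (a == newA) return;
        BeanA old = a;
        a = newA;
        if (old != null && old.getB() == this) {
            old.setB(null);
        }
        if (newA != null && newA.getB() != this) {
            newA.setB(this);
        }
    }
}
```

Each setter updates its own side first and then the opposite side only if it is still out of date, which keeps the mutual calls from looping.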

Relation Fetching

Relations are generally fetched lazily from the database, i.e. on first access.
Because 0..n relations can potentially contain a large number of objects, a special handling is implemented in the ORM engine for updating such relations.

Relations are represented in memory in two different ways:

complete: 0..1 and 1 relations always have a “complete” state in memory when they are read or modified; 0..n relations MAY have a complete state in memory. Complete means that all referenced objects are loaded from the database.
delta: When modifying a 0..n relation that is not complete in memory yet, only a delta representing the modification will be held. It is not necessary to read a complete 0..n relation into memory in order to modify it.

Relations are fetched completely when they are read for the first time. For 0..n relations, the caller has to obtain an iterator over the relation to trigger the complete relation fetching.
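The idea of a delta representation can be sketched as two sets of pending additions and removals that are applied once the complete relation is available. DeltaRelation is a hypothetical class for illustration only; how the engine stores deltas internally is not specified here.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical delta for a 0..n relation that has not been loaded completely.
final class DeltaRelation<T> {
    private final Set<T> added = new LinkedHashSet<>();
    private final Set<T> removed = new LinkedHashSet<>();

    // Records an addition without reading the existing relation members.
    void add(T element) {
        removed.remove(element);
        added.add(element);
    }

    // Records a removal without reading the existing relation members.
    void remove(T element) {
        added.remove(element);
        removed.add(element);
    }

    boolean isEmpty() {
        return added.isEmpty() && removed.isEmpty();
    }

    // Merges the delta into the members loaded from the database.
    Set<T> applyTo(Collection<T> loadedMembers) {
        Set<T> result = new LinkedHashSet<>(loadedMembers);
        result.removeAll(removed);
        result.addAll(added);
        return result;
    }
}
```

Only the last operation per element is kept, so an add followed by a remove of the same element leaves a pending removal and vice versa.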

State Layer

An ORM object has 2 types of internal states:

shared state: There may be one shared state, which represents the current committed and publicly visible state in the database. It can be used for read-operations only.
transactional state: Each thread may see its own transactional state of an ORM object. The transactional states are used to modify objects. They represent the modifications that have been done by the current thread. The modifications are done in an isolated environment and are not visible to other threads until the transaction is committed.

Creating a Transactional State

A transactional state is created when:

a new object is created.
there is a write access to an attribute or a relation of an object.
the object is locked pessimistically (by calling its lock or tryLock method)

If there is already a shared state in memory, the transactional state is created by cloning the shared state, i.e. all loaded attribute and relation states are cloned. For 0..n relations, the relation state is not cloned, but a new delta relation state is created.
Attributes that are not in memory yet (lazy) will not be cloned. They will be bound to the transactional state when they are accessed and loaded from the database.

After a transaction has been committed, the new shared states for the changed objects are created by cloning the transactional states. They replace the old shared states.
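The interplay of shared and transactional states can be sketched with a ThreadLocal. In this simplified illustration a state is just a map from attribute name to value; StatefulObject and its methods are hypothetical and leave out lazy attributes, relations and the database entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical object with one shared state and one transactional state per thread.
final class StatefulObject {
    // Committed, publicly visible state; replaced as a whole on commit, never mutated.
    private volatile Map<String, Object> shared = new HashMap<>();
    // Uncommitted state of the calling thread, created on first write access.
    private final ThreadLocal<Map<String, Object>> transactional = new ThreadLocal<>();

    Object get(String attribute) {
        Map<String, Object> tx = transactional.get();
        return (tx != null ? tx : shared).get(attribute);
    }

    void set(String attribute, Object value) {
        Map<String, Object> tx = transactional.get();
        if (tx == null) {
            tx = new HashMap<>(shared);   // clone the shared state on first write
            transactional.set(tx);
        }
        tx.put(attribute, value);
    }

    void commit() {
        Map<String, Object> tx = transactional.get();
        if (tx != null) {
            shared = new HashMap<>(tx);   // the clone becomes the new shared state
            transactional.remove();
        }
    }

    void rollback() {
        transactional.remove();           // simply throw the changes away
    }
}
```

A thread that has written to the object reads its own uncommitted values, while every other thread keeps reading the shared state until the commit.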

State References

Shared states of attributes and relations can be referenced by strong, soft or weak references, or not referenced at all (none), as specified in the deployment descriptor.

Transactional states are always referenced using strong references. This guarantees that no values (which have possibly been modified) are garbage collected before the transaction is committed. The reference types are converted when a shared state is cloned into a transactional state, or when a transactional state is cloned to become a shared state.

Lazy Fetch of Attributes

The OCA of the parent object is included in the query if a lazy attribute or relation must be fetched during an active transaction. This ensures that the object has not been modified in the meantime and that the partial data is still valid. If the OCA has been modified, no object will be found.

Query Layer

SQL queries that find objects in the database can be executed by using the provided methods in the ORM object factory classes. The following kinds of queries are supported:

find all instances of a certain type
find all instances of a certain type that match an SQL WHERE condition
find all instances of a certain type that match an SQL JOIN condition

All query methods are supported on super-class factories, too. In this case, the query will be executed on each existing concrete sub-class table separately, and the results will be combined into one collection. Note that it is not possible to find objects of different types within one single query.

Queries always go down to the database; they cannot be executed in memory only.

It is strongly recommended to use bind variables in all WHERE conditions, in order to keep the load on the database low (e.g. to reduce statement parsing) and to prevent SQL injection attacks.

Query Result Fetching

Queries always read their result sets lazily, as they are expected to work on potentially huge amounts of data. If the query result is completely read by the application (e.g. by iterating through the returned collection), the underlying database cursor is closed implicitly when the end of the iterator is reached. If the query result is not completely read by the application, the underlying cursor must be closed explicitly. This is done by the application calling an “endRequest” method, which releases all database resources (connections, cursors) associated with the current thread.
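The implicit and explicit closing of a cursor can be illustrated with an iterator that releases its resource when it is exhausted. CursorIterator is a hypothetical class; the wrapped Iterator stands in for a JDBC result set, and the Runnable stands in for whatever releases the cursor.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical lazy result iterator that releases its cursor at the end of the data.
final class CursorIterator<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> rows;        // stands in for a JDBC result set
    private final Runnable releaseCursor;  // stands in for closing the database cursor
    private boolean closed;

    CursorIterator(Iterator<T> rows, Runnable releaseCursor) {
        this.rows = rows;
        this.releaseCursor = releaseCursor;
    }

    @Override
    public boolean hasNext() {
        if (closed) return false;
        if (rows.hasNext()) return true;
        close();                           // end reached: close implicitly
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return rows.next();
    }

    @Override
    public void close() {
        if (!closed) {                     // idempotent, safe to call more than once
            closed = true;
            releaseCursor.run();
        }
    }
}
```

If the application stops iterating early, the cursor stays open until close is called, which corresponds to the explicit cleanup described above.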

Get Objects By SQL Where Queries

The ORM object factory classes support the execution of arbitrary WHERE conditions in order to find objects of a certain type.
The WHERE condition can optionally contain placeholders for additional SQL parameters, which are identified by ‘?’ symbols. The application developer provides only the condition itself; the WHERE keyword must not be passed in. The condition may contain arbitrary SQL expressions like ORDER BY.

Example

// leads to: SELECT uuid, name, … FROM FOO WHERE name=?
Collection<Foo> foos = fooFactory.getObjectsBySQLWhere("name=?", new Object[]{"bla"});

Get Objects By SQL Join Queries

The ORM object factory classes support the execution of arbitrary JOIN queries in order to find objects of a certain type.
In contrast to getObjectsBySQLWhere queries, which only operate on the table of a single object, the getObjectsBySQLJoin queries operate on multiple tables that have a dependency on each other. This requires a naming scheme in order to identify the columns of each involved table.
The “local” table (i.e. the table of the persistent object whose factory is used to execute the query) is always identified by the “this” alias. The aliases of other related tables can be chosen by the application developer.
The WHERE condition can optionally contain placeholders for additional SQL parameters, which are identified by ‘?’ symbols. The FROM and WHERE keywords must not be passed in by the application developer.

Example

// leads to: SELECT this.uuid, this.name, … FROM Bar this, Foo f WHERE this.uuid=f.barID and this.name=?
Collection<Bar> bars = barFactory.getObjectsBySQLJoin("Foo f", "this.uuid=f.barID and this.name=?", new Object[]{"bla"});

The “this” alias is bound to the “BAR” table, because the “BarFactory” was used to call the method and objects of type “Bar” are expected in the result set. The application developer must declare any additional tables that are important for the join condition, e.g. “Foo f”.
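How the fragments passed to getObjectsBySQLWhere and getObjectsBySQLJoin end up in a complete statement can be sketched with simple string assembly. SelectSketch, its methods and the column lists are assumptions made for illustration; the real statements are generated from the mapping information and contain all mapped columns.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that mimics the shape of the generated SELECT statements.
final class SelectSketch {
    private SelectSketch() {
    }

    // The caller passes only the condition, never the WHERE keyword.
    static String byWhere(String table, List<String> columns, String condition) {
        return "SELECT " + String.join(", ", columns)
                + " FROM " + table
                + " WHERE " + condition;
    }

    // The local table is always aliased as "this"; further tables come from the caller.
    static String byJoin(String table, List<String> columns, String joinTables, String condition) {
        String qualified = columns.stream()
                .map(column -> "this." + column)
                .collect(Collectors.joining(", "));
        return "SELECT " + qualified
                + " FROM " + table + " this, " + joinTables
                + " WHERE " + condition;
    }
}
```

The parameter values are not part of the statement text. They would be bound to the ‘?’ placeholders of a prepared statement in the order given in the parameter array.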

JDBC Layer

Connection Management

The JDBC manager manages the JDBC connections that are currently used by the application. Each thread will get its own exclusive connection. When the thread is finished, it should call “closeConnection” to release the connection again.
The JDBC manager uses a configurable datasource in order to obtain connections. For example, the datasource may be an OracleDataSource which is configured in the ORM engine directly, or it may be a datasource from an application server into which the ORM engine was deployed. Which datasource must be used can be configured in the ORM configuration.
Connection pooling, failover etc. must be handled by the datasource (e.g. the Oracle JDBC driver).
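The thread-exclusive connection handling can be sketched with a ThreadLocal. ThreadBoundResource is a hypothetical, generic stand-in for the JDBC manager: in a real setup the supplier would typically call DataSource.getConnection() and the release action would close the java.sql.Connection.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical holder that gives each thread its own exclusive resource.
final class ThreadBoundResource<R> {
    private final Supplier<R> open;      // e.g. obtains a connection from the datasource
    private final Consumer<R> release;   // e.g. returns the connection to the datasource
    private final ThreadLocal<R> current = new ThreadLocal<>();

    ThreadBoundResource(Supplier<R> open, Consumer<R> release) {
        this.open = open;
        this.release = release;
    }

    // Returns the resource of the calling thread, opening it on first use.
    R get() {
        R resource = current.get();
        if (resource == null) {
            resource = open.get();
            current.set(resource);
        }
        return resource;
    }

    // Releases the resource of the calling thread, if there is one.
    void close() {
        R resource = current.get();
        if (resource != null) {
            current.remove();
            release.accept(resource);
        }
    }
}
```

Threads that come from a pool are reused, so a thread that never calls close would keep its connection bound indefinitely. This is why the release call at the end of the work matters.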

Type Mapping

The ORM engine performs a number of type conversions for the attributes of persistent objects. The general concept is as follows:

ORM attributes have a Java-type and a JDBC type. The types are declared in the ORM deployment descriptors.
The JDBC type is internally mapped to a database-specific type. Currently, ORM only supports Oracle, so only the translation from JDBC types to Oracle types is supported.
The exact Oracle type is only needed if the database tables are created via ORM. It is not used otherwise, so mostly ORM doesn't care as long as the JDBC driver is able to read / write the columns.
Special attribute handlers convert JDBC types into Java types and vice versa. A handler is registered for each supported Java type.
It is possible to register custom attribute handlers to support other conversions.
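The handler concept can be sketched as a small conversion interface with a registry keyed by Java type. AttributeHandler, HandlerRegistry and the two handlers below are hypothetical names and signatures; the engine's real handler API is not shown in this document and probably works directly on prepared statements and result sets.

```java
import java.sql.Timestamp;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical conversion contract between a Java type and its JDBC-level value.
interface AttributeHandler<J> {
    Object toJdbc(J javaValue);

    J toJava(Object jdbcValue);
}

// java.util.Date <-> TIMESTAMP, following the conversion table.
final class DateHandler implements AttributeHandler<Date> {
    @Override
    public Object toJdbc(Date value) {
        return value == null ? null : new Timestamp(value.getTime());
    }

    @Override
    public Date toJava(Object value) {
        return value == null ? null : new Date(((Timestamp) value).getTime());
    }
}

// Example of a custom handler (an assumption): java.util.Locale stored as VARCHAR.
final class LocaleHandler implements AttributeHandler<Locale> {
    @Override
    public Object toJdbc(Locale value) {
        return value == null ? null : value.toLanguageTag();
    }

    @Override
    public Locale toJava(Object value) {
        return value == null ? null : Locale.forLanguageTag((String) value);
    }
}

// Hypothetical registry: one handler per Java type.
final class HandlerRegistry {
    private final Map<Class<?>, AttributeHandler<?>> handlers = new HashMap<>();

    <J> void register(Class<J> javaType, AttributeHandler<J> handler) {
        handlers.put(javaType, handler);
    }

    @SuppressWarnings("unchecked")
    <J> AttributeHandler<J> lookup(Class<J> javaType) {
        // Safe because register only stores a handler under its own Java type.
        return (AttributeHandler<J>) handlers.get(javaType);
    }
}
```

Registering a handler for a new Java type is all that is needed to support an additional conversion in this sketch; the built-in types would be registered the same way at startup.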

When a modeling language like EDL is used in conjunction with a code generator, there is another mapping from model types to the implementation (e.g. Java / JDBC) types. This mapping is done on top of the ORM type mapping and has nothing to do with the ORM engine implementation. For more details and a list of compound types that are supported by the code generator, refer to the EDL documentation.

The table lists the currently supported standard conversions that are built into ORM:

Java Type	JDBC Type	Oracle Type
boolean	BOOLEAN	NUMBER(1)
byte	TINYINT	NUMBER(4)
byte[]	BLOB	BLOB
char	CHAR	VARCHAR2(3 CHAR)
char[]	VARCHAR(<length>)	VARCHAR2(<length> CHAR)
short	SMALLINT	NUMBER(6)
int	INTEGER	NUMBER(11)
long	BIGINT	NUMBER(21)
float	FLOAT	FLOAT(63)
double	DOUBLE	FLOAT(126)
java.io.Serializable	BLOB	BLOB
java.lang.Byte	TINYINT	NUMBER(4)
java.lang.Character	CHAR	VARCHAR2(3 CHAR)
java.lang.Short	SMALLINT	NUMBER(6)
java.lang.Integer	INTEGER	NUMBER(11)
java.lang.Long	BIGINT	NUMBER(21)
java.lang.Float	FLOAT	FLOAT(63)
java.lang.Double	DOUBLE	FLOAT(126)
java.lang.String	VARCHAR(<length>)	VARCHAR2(<length> CHAR)
java.math.BigDecimal	DECIMAL	NUMBER(38, 6)
java.sql.Blob	BLOB	BLOB
java.sql.Clob	CLOB	CLOB
java.util.Date	TIMESTAMP	DATE

Transaction Manager

ORM transactions (like connections and states) are associated with the current thread. The current transaction can be obtained from the transaction manager. Multiple calls from the same thread will always return the same transaction object, so it can be reused for multiple transaction cycles.

The thread binding has several implications: It is not possible to pass an ORM object with uncommitted changes from one thread to another, because the other thread will have a different transactional context. This means, it would possibly see completely different values for the attributes and relations.

Transaction Begin

All changes to an ORM object require an active transaction. So in order to create, modify or remove an ORM object, a transaction must be started. This can be achieved by obtaining the transaction for the current thread from the transaction manager and calling its begin method.

Transaction Store

The transaction contains a list of all objects that have been touched. If the store method is called, all such changes will be written down to the database (but not committed yet). This can be used to ensure that a find query will find objects that have been created in the running transaction.

Transaction Flush

A flush is quite similar to a store, but additionally the transactional state of the object in memory will be dropped. This operation should only be used in rare cases where large amounts of data are written through ORM to the database and memory consumption is otherwise too high, like for import processes.

Transaction Commit

On commit of the transaction, the following actions will be done:

Each modified object writes its new state to the database, i.e. an insert, update or delete statement will be executed (similar to store).
The database transaction is committed.
All changes are written back from the transactional states to the shared states.
The transactional states are dropped.

If the ORM object has an OCA attribute, it will be used in the generated SQL statements in order to ensure transaction consistency. The OCA is also used when writing back committed changes as shared state in the ORM object.

If no OCA is available, the shared state will be dropped from memory.
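The role of the OCA in an update statement can be illustrated with a small in-memory simulation. OptimisticTable is hypothetical, and the sketch assumes that the OCA is a counter that is incremented on every successful update; the document does not state how the engine actually computes new OCA values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory table that mimics an OCA-guarded UPDATE statement.
final class OptimisticTable {
    private static final class Row {
        final String name;
        final long oca;

        Row(String name, long oca) {
            this.name = name;
            this.oca = oca;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    void insert(String uuid, String name) {
        rows.put(uuid, new Row(name, 0L));
    }

    long ocaOf(String uuid) {
        return rows.get(uuid).oca;
    }

    String nameOf(String uuid) {
        return rows.get(uuid).name;
    }

    // Mimics: UPDATE t SET name=?, oca=oca+1 WHERE uuid=? AND oca=?
    // Returns the number of affected rows, like a JDBC executeUpdate call.
    int update(String uuid, long expectedOca, String newName) {
        Row row = rows.get(uuid);
        if (row == null || row.oca != expectedOca) {
            return 0;   // somebody else changed or deleted the row in the meantime
        }
        rows.put(uuid, new Row(newName, expectedOca + 1));
        return 1;
    }
}
```

An update count of 0 tells the writer that its transactional state was based on outdated data, so the transaction cannot be committed consistently.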

Transaction Rollback

The transactional states will simply be thrown away. Nothing is written to the database. The current JDBC connection will be rolled back.

Transaction Listeners

It is possible to perform additional actions during transaction operations by registering a transaction listener. Listeners can be registered either at the transaction manager, in which case they are called for ANY transaction, or at a single transaction, in which case they are called on lifecycle events of that transaction only. The listener methods are invoked on:

transaction beginning
transaction begun
transaction preparing
transaction prepared
transaction flushing
transaction flushed
transaction committing
transaction committed
transaction rolling back
transaction rolled back
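A listener mechanism of this kind can be sketched with an interface that has one callback per lifecycle event. TxListener and SketchTransaction are hypothetical names, the sketch covers only a subset of the events listed above, and it shows registration at a single transaction only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface; default methods let implementers pick events.
interface TxListener {
    default void transactionBeginning() {}
    default void transactionBegun() {}
    default void transactionCommitting() {}
    default void transactionCommitted() {}
    default void transactionRollingBack() {}
    default void transactionRolledBack() {}
}

// Hypothetical transaction that notifies its listeners before and after each step.
final class SketchTransaction {
    private final List<TxListener> listeners = new ArrayList<>();
    private boolean active;

    void addListener(TxListener listener) {
        listeners.add(listener);
    }

    void begin() {
        if (active) throw new IllegalStateException("transaction already active");
        listeners.forEach(TxListener::transactionBeginning);
        active = true;
        listeners.forEach(TxListener::transactionBegun);
    }

    void commit() {
        requireActive();
        listeners.forEach(TxListener::transactionCommitting);
        active = false;   // the real engine would write and commit the changes here
        listeners.forEach(TxListener::transactionCommitted);
    }

    void rollback() {
        requireActive();
        listeners.forEach(TxListener::transactionRollingBack);
        active = false;   // the real engine would drop the transactional states here
        listeners.forEach(TxListener::transactionRolledBack);
    }

    private void requireActive() {
        if (!active) throw new IllegalStateException("no active transaction");
    }
}
```

The "before" callbacks run while the transaction is still in its previous state, so a listener could still veto or prepare work there; the "after" callbacks see the completed step.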

Description Manager

The description manager provides meta-information about the persistent ORM objects that have been registered at the ORM engine. The descriptions are initialized from the ORM deployment descriptor files during startup.

Mapping Manager

The mapping manager provides meta-information about how persistent ORM objects map to database tables. The mappings are also initialized from the ORM deployment descriptors at startup. They are used to generate the SQL statements.

There is one mapping for each concrete subclass. Abstract superclasses are not mapped to tables, thus they don’t have a mapping. Therefore, the attributes and relations of the abstract superclasses will be mapped in the concrete subclasses to their respective tables.

Deployment Manager

The deployment manager parses the XML deployment descriptors and initializes the description and mapping objects. The deployment descriptor contains a complete description of the persistent Java classes, their attributes, relations and their mapping to database tables. The application developer is responsible for registering all deployment descriptor files that are available in the system when initializing the ORM engine. The deployment manager uses a SAX parser and some specialized content handlers for efficiently parsing the files. It also does some verification in order to ensure the validity and completeness of the deployed ORM files.

Monitoring Manager

The monitoring manager allows the monitoring of certain runtime aspects of the ORM engine. In particular, it can be used to retrieve statistical information about the number of cache hits, cache misses, database hits, database misses etc. for each kind of persistent object, or the number of transactions and active JDBC connections. The information can be used to optimize the application.

Synchronization Manager

The synchronization manager synchronizes the ORM object caches that live in different Java VMs. It propagates all changes to ORM objects to the listening parties after a transaction has been committed.
By default, the changes are propagated as UDP multicast messages. Only the primary key objects and the type of change (object creation, modification, deletion) are submitted, but not the full state. The receiving ORM engine simply removes the shared state of the affected ORM object from the cache. Therefore, such objects will be read from the database again on next access.
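The receiving side of the synchronization can be sketched as a simple invalidation: the notice carries only the key and the kind of change, and the receiver drops its shared state. ChangeNotice, ChangeType and SharedStateCache are hypothetical names, keys are simplified to strings, and the UDP transport is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical kinds of change that are propagated after a commit.
enum ChangeType { CREATED, MODIFIED, DELETED }

// Hypothetical message: only the key and the type of change, never the full state.
final class ChangeNotice {
    final String primaryKey;
    final ChangeType type;

    ChangeNotice(String primaryKey, ChangeType type) {
        this.primaryKey = primaryKey;
        this.type = type;
    }
}

// Hypothetical receiving cache that invalidates instead of replicating.
final class SharedStateCache {
    private final Map<String, Object> sharedStates = new ConcurrentHashMap<>();

    void putSharedState(String primaryKey, Object state) {
        sharedStates.put(primaryKey, state);
    }

    Object sharedState(String primaryKey) {
        return sharedStates.get(primaryKey);
    }

    // Drops the shared state so that the next access reads from the database again.
    void onRemoteChange(ChangeNotice notice) {
        sharedStates.remove(notice.primaryKey);
    }
}
```

Sending only keys keeps the messages small, at the cost of one additional database read per invalidated object on the receiving side.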

Examples

Sequence for Primary Key Lookup

Sequence for Updating an Object

Using the ORM Engine

Application Initialization

The ORM Engine class is the central entry point for the application developers. It provides methods to initialize the whole system and gives access to the individual sub-system managers.
The engine can be used to deploy the ORM beans by registering the available deployment descriptors at the deployment manager.

A typical flow of using the ORM engine would look like this:

ORMEngine engine = ORMEngine.newInstance(props);
engine.init();

// deploy the ORM deployment descriptors
...

engine.start();

// use the engine to lookup ORM Object factories and work with them
...

engine.stop();

The ORM engine can be configured with properties that must be passed at its instantiation.

Logging

The ORM engine uses the Apache Commons Logging framework as logging API. This framework serves as an abstract API, which can be implemented by arbitrary “adapter” classes to support specific logging framework implementations. For example, there may be an adapter that uses the Apache Log4J logging framework to handle the log messages.

The ORM engine uses the "sql" logging category for logging the generated SQL statements and transaction boundaries, using DEBUG as log level.

