In general, INTERSHOP 7's Data Replication refers to the process of first updating data in a source system and then synchronizing that data with a target system. The replication mechanism makes it possible to develop and maintain content in the background (i.e., in a source system that is offline for the public) without disturbing the target system that is online.
INTERSHOP 7 provides two fundamental ways to update live system data in a data replication environment: Mass Data Replication, which is intended for high volumes of data, and Business Object Replication, which is intended for fast updates of selected data. Both methods use the same communication channels, but differ in the way they collect data in the source system and inject it into the target system.
Note
Term | Description |
---|---|
Staging | Refers to a framework providing basic functionality to transfer data in terms of database or filesystem data from a source system to a target cluster. |
Data Replication | Data replication is a process to transfer large amounts of data from a source cluster to a target cluster. As a typical scenario, one could first update any storefront data (like product data) and other settings in an editing system and then transfer this information to a live system. This mechanism allows to develop and maintain large content in the background without significant disturbances to the production system. The mechanism for transferring individual business objects in an ad-hoc manner is called object replication (developer and administrator perspective) or publishing (shop manager perspective). |
Editing system | In a data replication environment, the editing system is a dedicated Intershop 7 installation used to prepare or update the storefront data in the background without disturbing the operation of the live system. The term emphasizes the purpose of the system in the data replication environment as seen by a Data Replication Manager. |
Source system | Describes an INTERSHOP 7 system used to import and test new data, which is then transferred to another INTERSHOP 7 system using Data Replication. Thus, it is often used as a synonym for editing system in a data replication environment. |
Offline system | Often used as a synonym for a source system in a data replication environment. |
Live system | In a data replication environment, the live system is a dedicated Intershop 7 installation that serves the live storefront and receives the data that has been prepared in the editing system. The term emphasizes the purpose of the system in the data replication environment as seen by a Data Replication Manager. |
Target system | Describes an INTERSHOP 7 system which is intended to be the receiver of data transferred from another INTERSHOP 7 system (the source system) using Data Replication. Thus, it is often used as a synonym for live system in a data replication environment. |
Online system | Often used as a synonym for a target system in a data replication environment. |
Target system vs. Target cluster | A target system refers to an INTERSHOP 7 cluster, which is the receiver (the target) of a data replication process. As seen from a Data Replication perspective, a target system owns one web server address and one database schema, though it may consist of multiple web and app server(s). |
INTERSHOP 7's data replication mechanism is based on three different frameworks: the staging, JDBC, and locking frameworks; see figure below.
While the complete replication mechanism provides an all-encompassing, business-process-centered handling of data synchronization, staging provides the fundamental data transport mechanism and thus represents the technical perspective.
Figure: Mass Data Replication: Involved Frameworks
The staging framework provides the fundamental main entities and processes to identify and access the content affected by data replication, to model the assignment of content to replication processes, and to initiate and manage process execution.
The data replication mechanism does not replace the staging framework. It extends the staging framework in order to facilitate the management and execution of staging processes.
Note
Although the term staging has often been used as a synonym for replication, it is in fact only one INTERSHOP 7 component involved in a Mass Data Replication process.
The JDBC framework and SQL are used to initiate data transfer between database instances or schemata.
The locking framework prevents different processes within an INTERSHOP 7 cluster (such as import processes, jobs, or data replication processes) from accessing the same resources at the same time, e.g., database tables or file system content. Therefore, each process has to impose a virtual lock on any resource it is going to access in order to ensure no other process can concurrently modify the resource.
Basically, the data replication mechanism of INTERSHOP 7 relates two kinds of systems: source systems and target systems.
To provide Multi Data Center support, target systems, though they can be situated in different locations, are encapsulated in (logical) target clusters. The same applies to source systems and (logical) source clusters.
Note
All target systems of one target cluster are allowed to be active at the same time, while only one source system is allowed to be active (up and running) at any time.
For an easier understanding, the following figure shows a simplified view with only one editing and one target system; Multi Data Center functionality is described later in a separate paragraph in more detail.
Figure: Simplified Basic Architecture
One target system includes one or more application servers, Web servers, Web adapters and a target database account. In fact, the number of application servers, Web servers and Web adapters is irrelevant to the data replication mechanism; it must simply be sufficient to process incoming requests properly.
One source system also includes one or more application servers, Web servers, Web adapters and a source database account. Again, the number of application servers, Web servers and Web adapters is irrelevant to the data replication mechanism. Typically, the sizing requirements for a source system are lower, as the source system does not have to process online requests.
All target systems of a target cluster have to use the identical clusterID, i.e., the content of the file share/system/config/cluster/cluster.id needs to be identical. All editing systems of the corresponding editing cluster also have to use an identical clusterID, which, however, must differ from that of the target cluster.
A source cluster can be connected to multiple target clusters. However, each data replication process is directed at exactly one target cluster. It is not possible to update multiple target clusters from a source system in one data replication process. Nevertheless, all target systems belonging to the target cluster selected for a replication process are updated with the same replication process.
Mass data replication is based on the following fundamental paradigms:
Basic mechanism:
From a user's perspective, data replication is separated into two main stages: first defining data replication tasks, and afterwards executing these tasks as data replication processes. Both stages are managed in the editing system and are described in more detail below.
Corresponding to these two stages, two basic user roles for data replication can be distinguished: Data Replication Manager and System Administrator.
Data Replication Managers operate within the back office of a particular business unit (i.e., enterprise or channel). They do not need any technical knowledge of data replication. They create replication tasks and assign them to the System Administrator for execution. For example, the Data Replication Manager could be an editor who maintains product and catalog information of a consumer channel of the source system. The editor then creates the task to replicate the data to the consumer channel of the target system.
The System Administrator acts as Data Replication Manager of the system unit (central e-selling administration, i.e., Operations back office). System Administrators oversee data replication across the whole system from a technical perspective. Their duties encompass receiving the replication tasks from the Data Replication Managers of the individual business units, combining them into data replication processes for execution, assigning the appropriate target cluster, and starting the replication processes.
Additionally, the System Administrator can trigger the rollback of publication processes if necessary and monitors the progress of the replication processes.
Each business unit (channel, enterprise/sales partner) contains an access privilege Data Replication Manager, which is connected with the permission SLD_MANAGE_DATA_REPLICATION. The Data Replication Tasks module of INTERSHOP 7's back office becomes accessible if the user inherits the access privilege Data Replication Manager for the particular business unit.
The System Administrator holds the same permission. However, in contrast to the business unit context, the functionality of the Data Replication Tasks module is limited to processing submitted tasks, which are then handled by means of the additional module Data Replication Processes.
Data replication tasks determine the content to be replicated. They are defined by the responsible Data Replication Managers individually for each channel in the sales organization or partner back office. For example, the Data Replication Manager of the channel “PrimeTechSpecials” can define data replication tasks for this particular channel, using the consumer channel management plug-in in the sales organization back office.
For each data replication task the Data Replication Manager has to define:
Once defined, data replication tasks are submitted to the System Administrator for execution.
A data replication group identifies the content to be replicated from a business object's point of view. Thus, the replication group can encapsulate various content types (file content, database content) that are needed to replicate the selected business object. For example, the data replication group "Organization" includes the organization profile, the departments, the users and roles, and all preferences defined for an organization.
Each replication group refers to a certain content domain.
To execute data replication tasks, the System Administrator defines data replication processes in the central administration front end.
For each data replication process, the System Administrator defines:
For each replication process, a data replication type is set by the System Administrator. From a business point of view, the data replication type determines whether new data is transferred and published in one single process or in separate processes. After a replication process that included a successful publication, an additional one-step-back undo process can be run.
The following replication types are available:
Data Publishing
This process publishes data that have already been transferred to the target cluster. The process triggers all necessary table and directory switches as well as concomitant database commits to persist the changes (publication and cache refresh).
Note
Data publishing can only be executed on the results of a process of type Data Transfer that was executed immediately before.
Undo
An Undo process rolls back a data replication process of type Data Publishing or Data Transfer & Publishing which has been completed successfully. Undo restores the target cluster state prior to executing the data replication task that is rolled back.
Note
Undo does not support undoing processes of type Data Transfer. Also, Undo can only roll back the most recent data replication process.
A complete data replication process consists of the following main phases, as described in the figure below:
Figure: Phases of a replication process
Publication
The final step of the data replication process is to publish the replicated content, for example by performing a switch between live and shadow tables (full replication of database content) or between active and inactive directories (replication of file system content). As a result, any new or changed data is available to online users, and deleted data no longer appears in the storefront.
Note
The publication phase is not executed if any of the preceding steps ended with an error.
The process details for the individual phases differ depending on the content type to be replicated and the staging processor used to execute the replication process.
When preparing a replication process, the System Administrator has to set a data replication type. From a technical point of view, the data replication type determines which replication phases are actually performed for the respective data replication tasks (see the replication types described above).
In the active source system, each data replication process creates a ReplicationProcess object, a StagingProcess object and at least one additional StagingProcess object (one for each target system of the target cluster assigned to the replication process), all of which are tightly integrated with the locking framework, as shown in the figure below.
Figure: Internal structure of a Mass Data Replication process
Both ReplicationProcess and StagingProcess are wrapper classes which extend functionality provided by the Process class of the locking framework.
Note
The locking framework provides the necessary persistent objects. For example, the wrapper class ReplicationProcess contains the persistent object Process. Replication-specific information of the ReplicationProcess is mapped onto custom attributes of the Process object.
A ReplicationProcess consists of ReplicationTask objects and is created and started by the System Administrator.
A ReplicationTask is created by the Data Replication Managers of the respective business unit. The Data Replication Manager defines the content of a ReplicationTask. A ReplicationTask consists of at least one ReplicationTaskAssignment.
A ReplicationTaskAssignment references exactly one StagingGroup and one Domain, thus embodying a ReplicationGroup.
ReplicationGroups can be selected by the Data Replication Managers in the back office of their business unit.
Figure: Mass Data Replication process model
A staging process consists of several components describing the content affected by this process.
Figure: Staging process model
Each StagingProcess has a type. The types which a StagingProcess can assume correspond to the data replication types which the System Administrator can set for each replication process in the back office (see Replication Types described in the section before):
Each StagingProcess references one or more StagingProcessComponents. A StagingProcessComponent references exactly one domain and one StagingGroup.
The staging framework uses resource definitions of the locking framework to lock affected resources (e.g., tables) whenever a data replication process is executed. Thus, the staging mechanism prevents the respective resources from being changed by other processes (e.g., jobs, imports) while a replication process is underway.
The entity model describes the content components to be transferred by replication processes, making use of fundamental concepts of the staging framework such as StagingGroup, StagingTable and StagingDirectory.
Data replication groups identify the content to be transferred between source and target system from a business point of view, e.g., catalogs, channels or product prices. Replication groups are configured via an XML configuration file, replication.xml, located in <IS.INSTANCE.SHARE>/system/config/cluster.
Replication groups can be conceived of as staging group-to-domain assignments. Hence, replication groups relate logical data containers (domains) with physical data containers (staging groups, bundling database tables or staging directories).
There is no persistent object representing a data replication group. Replication groups are used at pipeline layer (see below) and at template layer (to visualize the organization of replication processes).
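The concrete schema of replication.xml is not reproduced here. As a rough, hypothetical sketch (all element and attribute names below are illustrative assumptions, not the shipped schema), a replication group definition relating a business-level group name to its physical staging groups might look like this:

```xml
<!-- Hypothetical sketch; element and attribute names are illustrative only. -->
<replication-groups>
  <replication-group id="Catalogs">
    <!-- A replication group bundles the staging groups (physical data
         containers) needed to replicate one business object. -->
    <staging-group name="CatalogData"/>
    <staging-group name="CatalogImages"/>
  </replication-group>
</replication-groups>
```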
The staging group-to-domain assignment takes place when assigning a data replication group to a replication task. Responsible for the staging group-to-domain assignments are the pipelines ProcessReplicationGroupAssignment[channelType]. These pipelines are channel type-specific. They perform the following actions:
Start
The pipeline's start node is called, handing over the replication group ID as defined in replication.xml. The pipeline analyzes the given ID and, depending on it, selects a jump node which targets a specific sub-pipeline to handle the staging group assignment for this replication group.
Note
For the staging groups to be assigned successfully, all referenced domains must exist at the time the replication group is assigned to the replication task. For example, a catalog to be replicated has to be created before you add the replication group Catalogs to a replication task.
A staging group consists of several staging entities of the same type and contains the configuration determining how these entities are replicated (the staging processor).
A staging entity describes an atomic data container for a certain type of content: database tables, materialized database views, or file system content. Accordingly, the following types of staging entities have to be distinguished:
Figure: Staging Group and Staging Entities
The staging entity StagingTable represents a database table. A StagingTable can be domain-specific or not.
A simple staging table (being not domain-specific) has to fulfill the following requirements:
Tables containing domain-specific content additionally have to fulfill the following requirement:
Tables that are writable in the storefront (and hence will be replicated using a delta replication mechanism) additionally have to fulfill the following requirement:
The table contains a column storing the modification time of the respective table row. This column has to be named LASTMODIFIED, must be of type DATE, and needs to be updated on each change of the corresponding row. A mechanism is provided which sets the current date in case of changes.
Note
If the column does not exist, it is not possible to track changes.
When creating custom persistent objects using INTERSHOP Studio, the column is generated automatically when setting property ModificationTimeTracking for the respective class to true.
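The change-tracking idea can be sketched with SQLite standing in for the actual Oracle schema (the table and column names besides LASTMODIFIED are illustrative assumptions): a trigger keeps LASTMODIFIED current, and a delta run selects only the rows modified since the previous run.

```python
# Illustrative sketch only: SQLite stands in for the Oracle schema.
# A trigger refreshes LASTMODIFIED on every row change so a delta run
# can select only rows changed since the previous run. The '+1 second'
# offset merely makes the change visible in this demo despite the
# one-second resolution of datetime('now').
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE product (
    uuid TEXT PRIMARY KEY,
    name TEXT,
    lastmodified TEXT NOT NULL DEFAULT (datetime('now')))""")
db.execute("""CREATE TRIGGER product_lastmodified
    AFTER UPDATE ON product
    BEGIN
        UPDATE product SET lastmodified = datetime('now', '+1 second')
        WHERE uuid = NEW.uuid;
    END""")

db.execute("INSERT INTO product (uuid, name) VALUES ('p1', 'Camera')")
last_sync = db.execute("SELECT max(lastmodified) FROM product").fetchone()[0]

# A later change touches the row; the trigger refreshes LASTMODIFIED.
db.execute("UPDATE product SET name = 'Camera Pro' WHERE uuid = 'p1'")
changed = db.execute(
    "SELECT uuid FROM product WHERE lastmodified > ?", (last_sync,)
).fetchall()
print(changed)  # [('p1',)]
```

A row whose LASTMODIFIED value is older than the last synchronization point is simply skipped by the delta run.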
The staging entity StagingDirectory represents a directory containing file system content to be replicated. The staging directories reside in numbered subdirectories of each site directory. The entire content within these directories can be replicated. The directory tree may look as pictured below:
Figure: Staging Directories in INTERSHOP 7
Note
Data replication can include unit directories in <IS.INSTANCE.SHARE>/sites/<site>/<.active>/units, where <.active> references the currently active directory (1 or 2). Note furthermore that unit directories in <IS.INSTANCE.SHARE>/sites/<site>/units cannot be replicated, since they do not contain any staging-relevant content.
The .active file, located in the site directory, contains the number of the directory currently used by the application server (either 1 or 2), i.e., it defines the active directory. The other numbered directory stores the changed or new files. Upon publication, the content of the .active file is altered to point to the new active directory. The look-up mechanism of the application server reads this information and uses the specified directory.
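The pointer-switch mechanism can be sketched as follows. This is an illustrative Python model of the .active file, not Intershop code; the file and directory names follow the description above.

```python
# Illustrative sketch of the published-directory switch: ".active" holds
# "1" or "2"; new content is written to the inactive (shadow) directory
# and publication flips the pointer.
import os
import tempfile

site = tempfile.mkdtemp()
for n in ("1", "2"):
    os.makedirs(os.path.join(site, n))
active_file = os.path.join(site, ".active")
with open(active_file, "w") as f:
    f.write("1")

def shadow_dir(site):
    with open(os.path.join(site, ".active")) as f:
        active = int(f.read().strip())
    return os.path.join(site, str(3 - active))

def publish(site):
    # Flip the pointer by atomically replacing the .active file.
    shadow = os.path.basename(shadow_dir(site))
    tmp = os.path.join(site, ".active.tmp")
    with open(tmp, "w") as f:
        f.write(shadow)
    os.replace(tmp, os.path.join(site, ".active"))

# Replication writes into the shadow directory, then publishes:
with open(os.path.join(shadow_dir(site), "template.isml"), "w") as f:
    f.write("new content")
publish(site)
print(open(active_file).read())  # prints "2"
```

Because only the pointer changes at publication time, the switch is effectively instantaneous, and the previous directory remains available for an undo.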
The staging entity StagingMView, together with the MViewStagingProcessor, is used to update materialized views whose original tables were affected by replication processes. The new content of materialized views is published using database synonyms.
MViews are refreshed in the background during the replication process.
Staging processors provide the core methods for the replication of different content types, such as database content or file system content.
Staging processor decorators provide additional functionality to extend the functionality of staging processors. The decorators perform tasks before or after a state has changed during a data replication process (cf. Replication Process Phases above).
Every StagingGroup is associated with a StagingProcessorConfig. The StagingProcessorConfig determines which staging processors have to be used to replicate the content represented by the staging group, i.e., it defines in which way data is replicated. Each staging group is assigned exactly one staging processor, which may or may not be extended by one or more staging processor decorators. As a result, a StagingProcessorConfig consists of exactly one StagingProcessor and zero or more decorators.
Figure: StagingProcessorConfig
According to the content types, there exist different staging processors and decorators, implementing various methods to replicate data from editing to target systems. A detailed description of the available standard processors and decorators is given in the next section.
All staging processors are derived from the class BasicStagingProcessor. This class provides the signatures of several hook methods called by the pipelets of the staging process pipelines. The following figure depicts the class hierarchy of the standard staging processors. All processor classes and all but two of the decorator classes are provided by the core cartridge; the RefreshSearchIndexesDecorator is implemented in bc_search, while the ShippingRuleEngineStagingProcessorDecorator comes with bc_shipping.
Figure: Staging Processor Model: Class Hierarchy
For each data replication phase (Preparation, Synchronization, Replication, Publication, Refresh Caches; see Data Replication Phases above), a staging processor provides the following hook functionality:
The staging processor classes provide specific implementations of these hook methods, depending on the type of content and replication mechanism.
The staging processor objects are created by a factory. The factory uses the default constructor of each processor object for initialization.
File system content is handled by sub-classes of FileSystemStagingProcessor providing functionality (hooks) for the publication phase of a staging process (switching directories of site content).
INTERSHOP 7 includes the SimpleFileSystemStagingProcessor as default implementation class for the FileSystemStagingProcessor. This processor first creates binary index files in the source system, keeping information on the staging directories in <IS.INSTANCE.SHARE>/sites/<site>/<.active>. The index files are stored in <IS.INSTANCE.SHARE>/dist/staging.
Then the same procedure is executed in the target system. Afterwards, the target system downloads the binary index files from the source system and checks them for changed file system content by comparing them with its own index files. The target system then downloads the changed files directly into the shadow directory of the target system.
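The comparison step can be sketched as follows; a Python dict stands in for the binary index files, and the per-file entry format of (size, modification time) is an assumption made for illustration.

```python
# Illustrative sketch of the index-compare step: both systems build an
# index per file; the target diffs the two indexes to determine which
# files to download and which obsolete files to delete. Real index files
# are binary; a dict of (size, mtime) entries serves as a stand-in.
def diff_indexes(source_index, target_index):
    to_download = [p for p, meta in source_index.items()
                   if target_index.get(p) != meta]
    to_delete = [p for p in target_index if p not in source_index]
    return to_download, to_delete

source = {"a.isml": (120, 1700000000), "b.jpg": (5000, 1700000100)}
target = {"a.isml": (118, 1699990000), "c.css": (300, 1600000000)}
down, dele = diff_indexes(source, target)
print(down, dele)  # ['a.isml', 'b.jpg'] ['c.css']
```

Only the files in the download list are transferred into the shadow directory, which keeps the transfer volume proportional to the actual changes.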
There is another implementation of the FileSystemStagingProcessor, the DRPIndexFileSystemStagingProcessor. Instead of binary index files, it uses a DRP index (an XML representation of file system content) on target and source directories to detect changes of file content. Apart from that, the procedure is basically equivalent to that of the SimpleFileSystemStagingProcessor. For performance and resource reasons (memory usage), it is recommended to use the SimpleFileSystemStagingProcessor for new projects.
Note
The DRPIndexFileSystemStagingProcessor uses a modified DRP index mechanism. In contrast to the standard mechanism, the created index file used for file comparison contains rounded time stamps and the size of each file instead of a checksum, reducing the time necessary to build the DRP index file.
File replication based on FileSystemStagingProcessor involves the following phases:
The target system downloads the generated zip archives from the source system. The zip files are extracted into the shadow directory. Files to be deleted are removed in the shadow directory.
SimpleFileSystemStagingProcessor:
Generation of a binary index of the target system’s file content. The two indexes are compared. Changed files are downloaded to the target system, obsolete files are deleted.
Figure: Replication of file system content
The base class for all staging processors handling database content is the abstract class DatabaseStagingProcessor. It provides methods for transaction, database connection and statement handling. Furthermore, it collects all affected persistent objects being involved in the current replication process.
Database staging processors come in two basic types: full replication processors and delta (partial) replication processors. Both mechanisms are described below.
In case of full replication, database content is transferred from the source to the target system regardless of changes. Full replication is used for most types of database content, except for tables that are writable in the target system (such as promotion codes on a live system).
Performance tests have shown that in most cases it is faster to delete all data from a table and re-fill it completely, including the changed data, than to update only the changed table entries.
Full data replication is available for global (not domain-specific) and for domain-specific data. Global means that the data to be replicated is not selected based on a DOMAINID column. Domain-specific means that the data is selected by a DOMAINID column.
The full replication mechanism relies on the following basic database objects: for each table to be replicated, there are two tables with the suffixes $1 and $2 appended to the original table name (one used as the live, i.e., currently active, table and the other one as a shadow table) and an additional database synonym with the original table name pointing to the current live table; see also the figure below. The Java functionality accesses the database table via the synonym.
Full replication involves the following steps:
Figure: Replication of database content using the full replication mechanism
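Under the assumption that a database view can stand in for the Oracle synonym, the table-switch idea can be sketched with SQLite (table and column names are illustrative):

```python
# Illustrative sketch: SQLite views stand in for Oracle synonyms, and
# PRODUCT$1/PRODUCT$2 play the live/shadow table pair. The shadow table
# is emptied, refilled with the transferred content, and publication
# re-points the "synonym" at it.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute('CREATE TABLE "PRODUCT$1" (uuid TEXT, name TEXT)')
db.execute('CREATE TABLE "PRODUCT$2" (uuid TEXT, name TEXT)')
db.execute("""INSERT INTO "PRODUCT$1" VALUES ('p1', 'Old name')""")
db.execute('CREATE VIEW PRODUCT AS SELECT * FROM "PRODUCT$1"')

def replicate_full(db, incoming_rows):
    # Determine which table the "synonym" currently points to.
    live_sql = db.execute(
        "SELECT sql FROM sqlite_master WHERE type='view' AND name='PRODUCT'"
    ).fetchone()[0]
    shadow = "PRODUCT$2" if "PRODUCT$1" in live_sql else "PRODUCT$1"
    db.execute(f'DELETE FROM "{shadow}"')  # TRUNCATE analog
    db.executemany(f'INSERT INTO "{shadow}" VALUES (?, ?)', incoming_rows)
    # Publication: switch the "synonym" to the freshly filled table.
    db.execute("DROP VIEW PRODUCT")
    db.execute(f'CREATE VIEW PRODUCT AS SELECT * FROM "{shadow}"')
    db.commit()

replicate_full(db, [("p1", "New name"), ("p2", "Added")])
print(db.execute("SELECT uuid, name FROM PRODUCT ORDER BY uuid").fetchall())
# [('p1', 'New name'), ('p2', 'Added')]
```

Because readers always access the synonym, the switch publishes all new content at once, and the previous live table remains untouched for a possible undo.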
In case of delta replication, only content which has actually changed is transferred from the source to the target system. The replicated content is directly inserted into the live (i.e., active) tables of the target system and published by committing the respective database transaction.
Note
Delta replication is used for database data which is writable in the target system. It is needed whenever data is created or changed independently and concurrently not only in the source system, but also in the target system. An example is promotion codes, which are created in the editing system and changed (redeemed) in the target system.
All delta staging processors are derived from the abstract class TransactionalStagingProcessor, which itself is derived from DatabaseStagingProcessor. The TransactionalStagingProcessor provides a method to enable the deletion triggers needed to track the deletion of table rows.
Delta replication comprises the following steps:
In case of any error, the active table is completely copied into the shadow table. This is necessary for the data replication type Undo.
These data records are detected by means of the column LASTMODIFIED. The replication is carried out in one large transaction.
Publication
The large transaction is committed.
Note
Synonyms of the tables in the target system are not changed in delta replications.
Figure: Replication of database content using the delta replication mechanism
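The delta mechanism can be sketched with SQLite (illustrative only; the real implementation runs inside the staging framework against Oracle, and the promotion-code table is an assumed example):

```python
# Illustrative sketch of delta replication: only rows changed since the
# last run are transferred, written directly into the target's live
# table inside one transaction; the final commit is the publication.
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("""CREATE TABLE promocode (
        code TEXT PRIMARY KEY, value INTEGER, lastmodified TEXT)""")

source.executemany("INSERT INTO promocode VALUES (?, ?, ?)", [
    ("SAVE10", 10, "2024-01-01"),
    ("SAVE20", 20, "2024-03-01"),   # changed after the last run
])
target.execute("INSERT INTO promocode VALUES ('SAVE10', 10, '2024-01-01')")

last_run = "2024-02-01"
changed = source.execute(
    "SELECT code, value, lastmodified FROM promocode WHERE lastmodified > ?",
    (last_run,)).fetchall()

# One transaction: transfer and publication happen together.
with target:  # commits on success, rolls back on error
    target.executemany(
        "INSERT OR REPLACE INTO promocode VALUES (?, ?, ?)", changed)

print(target.execute("SELECT code FROM promocode ORDER BY code").fetchall())
# [('SAVE10',), ('SAVE20',)]
```

Note that, unlike in full replication, no table switch takes place: the commit of the single transaction makes the changes visible, which is why the synonyms in the target system remain unchanged.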
INTERSHOP 7 includes the following database staging processor classes:
Full replication
Delta replication
MergeDomainSpecificStagingProcessor
This staging processor is used to replicate database content residing in database tables that are changed in the source as well as in the target system. Consequently, the replication occurs in one large transaction. It uses the MERGE SQL statement to transfer the new and updated content, and uses deletion tracking with deletion triggers to detect rows removed in the editing system. Table rows which do not exist in the editing system are deleted from the live system.
Note
The MERGE statement has a restriction: It does not work on tables having a column with a context index. Thus, only tables with normal indexes are supported.
Note
Due to the restriction of one large transaction, the publication phase cannot be started separately. Furthermore, the undo process is not supported, which saves the time otherwise needed to back up the old content in the live system.
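The processor's behavior can be approximated with SQLite, where an upsert (INSERT ... ON CONFLICT) emulates Oracle's MERGE and a DELETE removes rows missing from the editing system (table and column names are illustrative assumptions):

```python
# Illustrative sketch of the MergeDomainSpecificStagingProcessor idea:
# an upsert emulates Oracle's MERGE (insert new rows, update existing
# ones), and rows that no longer exist in the editing system are
# deleted from the live table, all in one transaction.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE live (uuid TEXT PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO live VALUES (?, ?)",
               [("p1", "Old"), ("p2", "Gone in editing system")])

editing_rows = [("p1", "Updated"), ("p3", "New")]

with db:  # one transaction; the commit publishes everything at once
    # MERGE analog: insert new rows, update existing ones.
    db.executemany("""INSERT INTO live (uuid, name) VALUES (?, ?)
                      ON CONFLICT(uuid) DO UPDATE SET name = excluded.name""",
                   editing_rows)
    # Rows missing in the editing system are removed from the live table.
    uuids = [u for u, _ in editing_rows]
    db.execute(f"""DELETE FROM live
                   WHERE uuid NOT IN ({','.join('?' * len(uuids))})""", uuids)

print(db.execute("SELECT uuid, name FROM live ORDER BY uuid").fetchall())
# [('p1', 'Updated'), ('p3', 'New')]
```

The real processor detects the deletions via deletion triggers rather than a NOT IN comparison; the NOT IN clause here is only a compact way to show the effect.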
AppendDomainSpecificStagingProcessor
This staging processor is used to replicate only new content of domain-specific tables. Old content of the live system is never overwritten. Removed rows in the editing system will never be deleted in the live system.
Note
This processor replicates its contents in the publication phase in order to support separated Replication and Publication modes.
DeleteAppendDomainSpecificStagingProcessor
This staging processor is used to replicate new and deleted content of domain-specific tables. Existing old content of the live system is never overwritten. Removed rows in the editing system will be deleted in the live system.
Note
This processor replicates its contents in the publication phase in order to support separated Replication and Publication modes.
Processor | Deleting Data | Inserting Data | Undoing Replication |
---|---|---|---|
FullStagingProcessor | TRUNCATE TABLE {0} REUSE STORAGE | INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <live_synonym_name> src
INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <source_table_in_editing_system> src
INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <live_synonym_name> src
INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <source_table_in_editing_system> src | SQL statement to save content that should not be undone: same as inserting data. |
FullDomainSpecificStagingProcessor | TRUNCATE TABLE {0} REUSE STORAGE | INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <live_synonym_name> src WHERE <column_name_of_DOMAINID> NOT IN (select stagingdomainid from stagingprocesscomponent where stagingprocessid = <current_stagingprocess_id> and staginggroupid = <current_staginggroupid>)
INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (select stagingdomainid from stagingprocesscomponent where stagingprocessid = <current_stagingprocess_id> and staginggroupid = <current_staginggroupid>)
INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <live_synonym_name> src WHERE <column_name_of_DOMAINID> NOT IN (select stagingdomainid from stagingprocesscomponent where stagingprocessid = <current_stagingprocess_id> and staginggroupid = <current_staginggroupid>
INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(dst, <nn>) */ * FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (select stagingdomainid from stagingprocesscomponent where stagingprocessid = <current_stagingprocess_id> and staginggroupid = <current_staginggroupid>) | INSERT INTO <shadow_table_name> SELECT * FROM <live_synonym_name> WHERE <column_name_of_DOMAINID> = <domainID> |
MViewStagingProcessor | ddl.drop_materialized_view(<mview_name>); | SELECT query FROM user_mviews WHERE mview_name=<mview_name> UNION ALL SELECT query FROM user_synonyms s JOIN user_mviews v ON (s.table_name=v.mview_name) WHERE synonym_name=<mview_name> | same as inserting data |
AppendDomainSpecificStagingProcessor | none | INSERT INTO <live_table_name> SELECT * FROM <source_table_in_editing_system> src WHERE NOT EXISTS (SELECT * FROM <live_table_name> dst WHERE src.<primary_key>=dst.<primary_key> AND (<column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>)) | transactional |
DeleteAppendDomainSpecificStagingProcessor | DELETE FROM <live_table_name> dst WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <live_table_name> dst WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>) MINUS SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>)) | ||
INSERT INTO <live_table_name> SELECT * FROM <source_table_in_editing_system> src WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>) MINUS SELECT <primary_keys_of_table> FROM <live_table_name> dst WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>)) | transactional | ||
IncrementalDomainSpecificStagingProcessor | DELETE FROM <live_table_name> WHERE ((<column_name_of_DOMAINID>=<domainid_of_current_component>) AND (<primary_keys_of_table>) NOT IN (SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> WHERE <column_name_of_DOMAINID>=<domainid_of_current_component>)) | ||
MERGE INTO <live_table_name> dst USING (SELECT s.* FROM <source_table_in_editing_system> s LEFT OUTER JOIN <live_table_name> d ON (s.<comparsion_key_of_table>=d.<comparsion_key_of_table>) WHERE <column_name_of_DOMAINID>=<domainid_of_current_component> AND (d.lastmodified IS NULL OR d.lastmodified<s.lastmodified)) src ON (s.<primary_key_of_table>=d.<primary_key_of_table>) WHEN MATCHED THEN UPDATE SET dst.<assigned_column_names>=src.<assigned_column_names> WHEN NOT MATCHED THEN INSERT (<column_names>) VALUES src.<column_names>) | transactional | ||
MergeDomainSpecificStagingProcessor | DELETE FROM <live_table_name> WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <source_deletion_table_in_editing_system> WHERE (<column_name_of_DOMAINID>=<domainid_of_current_component>) | ||
MERGE INTO <live_table_name> dst USING (SELECT s.* FROM <source_table_in_editing_system> s LEFT OUTER JOIN <live_table_name> d ON (s.<comparsion_key_of_table>=d.<comparsion_key_of_table>) WHERE (<column_name_of_DOMAINID>=<domainid_of_current_component>) AND (d.lastmodified IS NULL OR d.lastmodified<s.lastmodified)) src ON (s.<comparsion_key_of_table>=d.<comparsion_key_of_table>) WHEN MATCHED THEN UPDATE SET dst.<assigned_column_names>=src.<assigned_column_names> WHEN NOT MATCHED THEN INSERT (<column_names>) VALUES (src.<column_names>) | transactional | ||
DELETE FROM <source_deletion_table_in_editing_system> WHERE (<primary_keys_of_table>)) IN (SELECT <primary_keys_of_table> FROM <live_deletion_table>) | transactional |
Staging processor decorators add special functionality to a staging processor. All staging processor decorators are derived from the abstract class StagingProcessorDecorator, which itself is derived from BasicStagingProcessor. It is possible to use more than one staging processor decorator for a staging processor.
Like the staging processors themselves, the staging processor decorators are specific to the content type (file system content or database content).
The base class for all staging processor decorators handling file system content is the abstract class StagingProcessorDecorator.
Staging processor decorators for file system content add functionality that extends the pure transportation of files provided by the FileSystemStagingProcessor classes. This can include a reload of replicated files in the target system(s).
INTERSHOP 7 provides the following file system staging processor decorator classes:
The base class for all staging processor decorators handling database content is the abstract class DatabaseStagingProcessorDecorator, which itself is derived from StagingProcessorDecorator.
Database staging processor decorators should be used to handle table statistics, indexes, or constraints. They can also provide the possibility to execute additional database queries before or after a staging is done.
INTERSHOP 7 provides the following database staging processor decorator classes:
staging.live.enable_foreignkeys property in staging.properties (onPostReplicationHook for constraints in then-shadow tables, onPostPublicationHook for foreign keys referencing then-live tables).
UnusableIndexesStagingProcessorDecorator
The UnusableIndexesStagingProcessorDecorator is provided by cartridge core and sets all indexes of shadow tables that are assigned to the staging processor referenced by this decorator to unusable ( onPreSynchronizationHook).
Note
This decorator requires the RebuildIndexesStagingProcessorDecorator described above.
ExecuteQueryDecorator
The ExecuteQueryDecorator is provided by cartridge core. It is based on the FullStagingProcessor, which switches the $1 and $2 tables in the publication phase. In addition, it uses the INTERSHOP 7 Query Framework to execute query files on each staging hook to perform the replication.
The staging queries to be executed have to follow the syntax requirements of the Query Framework and have to reside in the directory queries/staging. By convention, they have to be named following the schema <tablename>_<hookname>, with hookname being:
"on[Pre|Post|Error][Preparation|Synchronization|Replication|Publication|RefreshCache]Hook.query"
E.g.: PRODUCT_onErrorReplicationHook.query.
Staging processors are configured in the global staging configuration file staging.properties, which is located in <IS.INSTANCE.SHARE>/system/config/cluster.
Each staging processor configuration entry consists of
For a detailed description see the section Replication Configuration.
As already stated before, a mass data replication is initiated by the editing system, which informs each assigned target system about a new replication process. Each target system then pulls the announced data from the editing system.
Communication between the application servers of source and target system(s) is based on a web service (SOAP) and HTTP. The direction of the command communication flow is from source to target system: the source system sends SOAP requests to the Web server of the target system(s), which then forwards these requests to an application server belonging to the server group configured to handle Replication.
File system content in data replication is retrieved by the target system via HTTP from the source system.
To allow the database content to be replicated from the editing system to the target system(s), an additional communication channel connects the database schema of the source system with the database schema(ta) of the involved target system(s). Here, two basic replication scenarios can be distinguished:
Note
Local data replication can have significant performance advantages over remote data replication. Use local data replication whenever possible.
In case of remote database replication, the connection is enabled by means of a database link from the target to the source system.
In case target and source system use database schemata in the same database instance (local database replication), the source system can grant access to certain tables to the target system.
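These two scenarios can be set up roughly as follows (a hedged sketch: the link name ISEDITING is the documented default, while schema names, password, and TNS alias are placeholders; staging.grant_live_user_privs is the stored procedure referenced in the configuration section below):

```sql
-- Remote database replication: a database link created in the live (target)
-- schema that points to the editing (source) schema.
-- Credentials and TNS alias are illustrative placeholders.
CREATE DATABASE LINK ISEDITING
    CONNECT TO <editing_user> IDENTIFIED BY <password>
    USING '<editing_db_tns_alias>';

-- Local database replication (both schemata in one instance): the editing
-- user grants the live user access to the required objects.
-- Executed while logged in as the editing user:
exec staging.grant_live_user_privs('<NAME_OF_LIVE_USER>')
```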
A special identification mechanism prevents the target system from performing data replication tasks triggered by other systems than the source system.
After getting a SOAP call from the naming service in the target system, the following steps are performed in order to uniquely identify the source system:
If the StagingMgr finds the UUID, it accepts the call, as the source system is now unambiguously identified. After processing the call, the target system's StagingMgr deletes the UUID from the STAGINGIDENTIFICATION table in the source system database, again using the database link or the direct access to the source database schema.
If the UUID is not found, the StagingMgr denies the access and throws an IdentificationException.
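Conceptually, the identification check performed by the target's StagingMgr corresponds to SQL along these lines (an illustrative sketch only; the column name and exact statements are assumptions, while the STAGINGIDENTIFICATION table and the database link are taken from this document):

```sql
-- Look up the UUID announced by the caller in the source system's
-- STAGINGIDENTIFICATION table (here via the database link ISEDITING).
-- No match: the call is denied with an IdentificationException.
SELECT uuid
  FROM stagingidentification@ISEDITING
 WHERE uuid = :announced_uuid;

-- After the call has been processed, the identification entry is removed.
DELETE FROM stagingidentification@ISEDITING
 WHERE uuid = :announced_uuid;
```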
Before database content or file system directories can be replicated, some preparations are required by the staging processors in order to create the environment. Preparing the environment for data replication is the task of the preparer classes StagingGroupPreparer, StagingEnvironmentPreparer and related preparer classes (e.g., StagingTablePreparer), which are executed during DBInit.
Note
The Staging framework depends on the identical structure of the tables to be replicated. Moreover, it depends on identical UUIDs of all domains in the database and of all staging configurations.
Note
There is no automatic process which initially copies the database content from the editing to the target system(s). Instead, it is a task of the installation and deployment process to equalize the databases.
The easiest and highly recommended way to ensure this is to execute a DBInit in the editing system, then to export the database with ant export (in the editing system) and to import the resulting database dump into the target system(s) using ant import.
Another way would be to use DBMigrate on both editing and target systems. In this case, all relevant UUIDs would need to be predefined.
The following sub sections describe the default preparers provided with INTERSHOP 7. Please refer to the according JavaDoc and configuration examples in the respective Cookbook for more detailed information on configuring these preparers.
The StagingGroupPreparer class is the first preparer class called when preparing the database for data replication. It prepares all staging groups, staging tables, staging materialized views and staging directories and stores their configuration data in the corresponding STAGINGGROUP, STAGINGTABLE, STAGINGMVIEW and STAGINGDIRECTORY tables. Prepared staging groups can then be used by the pipeline ProcessReplicationGroupAssignment when assigning data replication groups to data replication tasks (see Assignment of Staging Groups to Replication Groups).
Note
The StagingGroupPreparer has to be executed before the StagingEnvironmentPreparer.
To prepare staging groups, the StagingGroupPreparer usually uses the property files StagingGroup.properties (staging group preparation) and StagingGroupInformation.properties (staging-processor-to-staging-group assignment), which are part of the sub-package dbinit.data.staging (included in the dbinit.jar) of each cartridge.
The Staging framework uses the Locking framework to ensure exclusive access to affected resources (i.e., database tables, files) during a replication process. Thus, it prevents inconsistent data caused by jobs, imports, etc. running in parallel.
Staging resource assignments are usually defined in ResourceAssignments.properties. They map a staging group (the key) to one or more resource definitions of the locking framework (the value).
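Since the file maps a staging group name (key) to locking framework resource definitions (value), an entry could look like the following sketch (group and resource names are purely illustrative, not taken from an actual cartridge):

```properties
# Hypothetical example: replicating the staging group "Products"
# locks two resources of the locking framework.
Products = resource.Products, resource.ProductAttributes
```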
The StagingEnvironmentPreparer creates the environment (such as special database tables or views) which is necessary to replicate database tables.
To do so, the StagingEnvironmentPreparer retrieves the StagingTablePreparer associated with each staging processor. The retrieved staging table preparer actually creates the necessary database structures. Since staging processors may impose different requirements on their database environment, each staging processor invokes its own StagingTablePreparer.
Figure: StagingEnvironmentPreparer and StagingTablePreparer
The following figure shows the environment which the StagingTablePreparer creates for database tables (foobar and foobar_AV in the sample below) that are replicated via full replication (see Full Replication Processors).
The resulting database structure created by the preparer is shown here:
Figure: Database Environment: Full Replication
The created database objects and their purpose are:
Object | Purpose |
---|---|
TABLE | Database tables contain the actual data (live table foobar$1, shadow table foobar$2). |
SYNONYM | Table data are accessed by the Java application servers via synonyms (synonym foobar). |
VIEW | Views provide access for the staging process to the table content in its according domain context, even if the accessed object itself does not have a domain ID. For example, the view foobar_AV$S joins the synonyms foobar_AV and foobar to get the domain ID from the table foobar$1. |
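The live/shadow mechanism behind these objects can be sketched as follows (illustrative DDL only; the actual statements are generated by the preparer and the column lists are omitted):

```sql
-- Two physical tables: one holds the live content, one serves as shadow
-- table that is filled during replication.
CREATE TABLE foobar$1 (uuid VARCHAR2(28) /* , ... */);
CREATE TABLE foobar$2 (uuid VARCHAR2(28) /* , ... */);

-- The application servers always access the data through the synonym.
CREATE SYNONYM foobar FOR foobar$1;

-- During publication the synonym is switched, so the freshly replicated
-- shadow table becomes the new live table.
CREATE OR REPLACE SYNONYM foobar FOR foobar$2;
```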
For tables replicated via delta replication, a more complex environment is required, due to the change tracking mechanism used. Changes are tracked in each staging table using a time frame defined by the last successful staging process and the current time. Inserts and updates are detected by the values in the LASTMODIFIED column in each staging table.
Note
Each persistent object is responsible for setting the LASTMODIFIED column on inserts and updates. If the persistent object is generated using jGen, this functionality is created automatically.
Deletions are tracked using a deletion trigger and a special deletion table. The deletion trigger and deletion table for each $1 and $2 table are created by the DeletionTrackingStagingTablePreparer (see figure below). The deletion table stores the primary key, the domain identifier and the deletion time of each deleted row.
The deletion trigger establishes the deletion tracking mechanism by copying the primary key and domain identifier from the source table into the deletion table and setting the LASTMODIFIED column to the current database date.
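Based on this description, the deletion tracking can be pictured as the following sketch (hypothetical table layout and trigger; the real objects are created by the DeletionTrackingStagingTablePreparer and may differ in naming and detail):

```sql
-- Hypothetical deletion table for foobar$1.
CREATE TABLE foobar$1D
(
    uuid         VARCHAR2(28), -- primary key of the deleted row
    domainid     VARCHAR2(28), -- domain identifier of the deleted row
    lastmodified DATE          -- deletion time
);

-- The deletion trigger copies primary key and domain identifier of every
-- deleted row and records the current database date in LASTMODIFIED.
CREATE OR REPLACE TRIGGER foobar$1DT
AFTER DELETE ON foobar$1
FOR EACH ROW
BEGIN
    INSERT INTO foobar$1D (uuid, domainid, lastmodified)
    VALUES (:OLD.uuid, :OLD.domainid, SYSDATE);
END;
```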
To prepare tables for delta replication, the preparer:
In this way, the preparer creates a structure as shown in the next figure:
Figure: Database Environment: Delta Replication
AddStagingGroupsInformationPreparer
This preparer adds or updates the staging processor configuration, the (optionally) assigned domain, and the localized staging group information (display name, description) of staging groups that already exist in the database.
Note
This preparer does not add new staging groups.
Note
When changing a database StagingProcessor, it is required to first remove the staging environment from all staging entities of the staging group (using DeleteStagingEntitiesEnvironmentPreparer) and to re-create the staging environment for the new staging processor (using MigrateStagingEnvironment).
AddStagingGroupsPreparer
This preparer is used to add new staging groups AND the corresponding staging entities of these staging groups.
Note
This preparer does not update existing staging groups.
UpdateStagingGroupsPreparer
This preparer is used to update the attributes (group configuration) AND re-creates the corresponding staging entities of staging groups belonging to the current cartridge.
Note
This preparer does not allow adding new staging groups.
RemoveStagingGroupsWithEntitiesPreparer
This preparer is used to remove given staging groups AND all their assigned staging entities. Additionally, the replication task assignments and the staging group resource assignments of the respective staging groups are removed.
Note
This preparer does NOT remove the staging environment (i.e., the $1, $2, $S etc. objects) from the staging tables to be removed. Call the DeleteStagingEntitiesEnvironmentPreparer beforehand to strip the staging environment from the staging entities to be removed.
DeleteStagingEntitiesEnvironmentPreparer
This preparer is used to remove the staging environment from staging entities (staging tables).
DeleteStagingEntitiesPreparer
This preparer is used to delete existing staging entities from staging groups.
Note
This preparer does not remove staging groups, even if the staging group would become empty.
Note
This preparer does NOT remove the staging environment (i.e., the $1, $2, $S etc. objects) from the staging tables to be removed. Call the DeleteStagingEntitiesEnvironmentPreparer beforehand to strip the staging environment from the staging entities to be removed.
UpdateStagingEntitiesPreparer
This preparer is used to add or update staging entities of a single staging group.
Note
This preparer allows neither adding new staging groups nor removing staging entities.
AddResourceAssignmentsPreparer
This preparer is used to add additional resource assignments to staging groups.
MigrateStagingEnvironment
This preparer is used to migrate the staging environment of the current cartridge. It is normally used after staging groups or staging entities have been changed or added.
(Mass) Data Replication uses several configuration files:
All of these files reside in share/system/config/cluster. A closer description of these files is given below.
This file is used in both source (editing) and target (live) systems. Configurable settings are:
Property | Default (Development) | Type | Range | Live | Description |
---|---|---|---|---|---|
General Settings: | |||||
staging.system.type | ESL 6.x: editing; | String | editing | Defines the type of staging system.
| |
staging.system.name | host (Editing System) | String | The name of staging system. Arbitrary names are supported. | ||
staging.statement. | inactive, begin gather_table_ | String | This SQL statement is used by decorator Note In releases higher than ESL6.4, the AnalyzeTablesDecorator uses the general configuration from database.properties. | ||
staging.prepareOnDBInit | false | Boolean | Defines if the staging environment should be prepared during DBInit (true) or during first staging process (false) | ||
staging. | false | Boolean | In case of the database being filled with the same database dump in source and target system, you can set this property to true. It avoids the call of stored procedure | ||
staging.WebServerURL | inactive, empty | URL | The web server URL being used by staging processes (optional). In the live system the property configures the URL of SOAP staging service. In the editing system it configures the web server from which the files should be downloaded. If no value is set, the standard web server URL configured in the appserver.properties is used. | ||
Database Communication Settings: | |||||
staging.database.connection.factory | com.intershop.beehive. | String | Defines the database connection factory to be used during staging process.
| ||
Parallelism Section: | These properties should be set in live and editing system. They are used for configuring the parallelism behavior during a staging process. | ||||
staging.process. | 2 | Integer | The number of staging processors executed in parallel. | ||
staging.process. | 3 | Integer | The number of entities replicated in parallel per staging processor. Note Currently only the FullDomainSpecificStagingProcessor supports this setting. | ||
staging.process. | 1 | Integer | The number of parallel threads within database performing a SQL statement. Note See PARALLEL in hints in Oracle (it works only with an Oracle Enterprise Edition). | ||
staging.process. | 1000000 | Integer | The minimum number of rows a table must have to replicate its content with parallel SQL hints configured in the property above. | ||
Timeout section: | These properties should be set in the live system. If a timeout is reached, the staging process continues its execution. A corresponding error is logged in the error log file. Warning In case a timeout is reached, the page cache may contain inconsistent data. FIX: Restart all application servers that did not respond and remove the page cache. | ||||
staging.locking. | 1200 | Integer | The maximum time the staging process waits for resources (in s). | ||
staging.timeout. | 600 | Integer | Defines the maximum time the staging process waits for each application server refreshing the cache of persistence layer (in s). | ||
staging.timeout. | 600 | Integer | Defines the maximum time the staging process waits for each application server switching their directories (in s). | ||
staging.timeout. | 7200 | Integer | The maximum time the staging process waits for a new state during a staging process (in s). Note This property is also required in the source system. | ||
staging.timeout. | 10800 | Integer | Defines the maximum time the initial staging process waits for the initial database replication. | ||
Live System Configuration section: | These properties need to be set in the target system. | ||||
staging.dblink.name | inactive, ISEDITING | String | This parameter defines the name of the database link from target (live) to source (editing) database. Please use only this OR | ||
staging.editing.schema. | inactive, empty | String | Defines the name of the editing schema. If this property is set, the staging process will directly access the editing schema instead of using the database link. This property must be set in the live system only. The editing and the live schema have to be located in the same database instance. Important Note If this property is set, the live user has to be granted object privileges on certain objects of the editing system. Staging will fail if you do not do this properly. If you are unsure, simply leave the property unset. To grant the privileges, log in to the database as the editing user and execute the stored procedure staging.grant_live_user_privs('NAME_OF_LIVE_USER') An example: exec staging.grant_live_user_privs('INTERSHOP_LIVE0') | ||
staging.live.servergroup | BOS | String | Defines which server group should be used for staging. | ||
Staging Processor Configuration Section: | This section contains the configuration of the staging processors. These settings express the assignment of the staging processor name as defined in the Staging-Processor-To_Staging-Group assignment (StagingGroupInformation.properties), which only represents a processor name (like an alias) to an implementing staging processor Java class together with assigned staging processor decorator(s)' Java class(es). Warning If these properties contain invalid entries, staging can result in data corruption! Please make sure you have understood the documentation before changing these settings! | ||||
Syntax: | |||||
staging.processor.<StagingProcessorName>.className = <implementingClassInclusiveJavaPackage> | |||||
Example: FullDomainSpecificStagingProcessor | Configuration of the database staging processor that transfers domain-specifc data (Products, Discounts, etc.). | ||||
staging.processor.FullDomainSpecificStagingProcessor.className = com.intershop.beehive.core.capi.staging.process.FullDomainSpecificStagingProcessor | |||||
Staging index/constraint performance section: | These settings should be set in the target system. They work with
| ||||
staging.process.unusableIndex.rowCountLimit[.TableName] | 0 | Integer | Set the 'global' or 'table' specific limit (table row count) to enable the unusable index processing. Note Write table names in UPPERCASE letters.
| ||
Examples: | (default value): staging.process.unusableIndex.rowCountLimit = 0
staging.process.unusableIndex.rowCountLimit = 100000
staging.process.unusableIndex.rowCountLimit.PRODUCT = 200000 | ||||
staging.process.unusableIndex.rebuildParallelism | 1 | Integer | Set the number of parallel threads within database performing a SQL unusable index rebuild statement. | ||
staging.process.disableConstraint.rowCountLimit[.TableName] | see Examples below | Integer resp. String | Set the 'global' or 'table' specific limit (table row count) to enable the disable constraint processing. Note Write table names in UPPERCASE letters.
| ||
Examples: | (default value): staging.process.disableConstraint.rowCountLimit = ${staging.process.unusableIndex.rowCountLimit}
staging.process.disableConstraint.rowCountLimit.PRODUCT = ${staging.process.unusableIndex.rowCountLimit.PRODUCT} | ||||
staging.process.disableConstraint.enableParallelism | see Description column | Integer resp. String | Set the number of parallel threads within database performing a SQL disable constraint statement. | ||
staging.contextIndexCreationMode | sync | String | sync | Defines the behavior of the staging process depending on the creation of context indexes. Note Write table names in UPPERCASE letters.
| |
Examples: | (default value): staging.contextIndexCreationMode=sync
staging.contextIndexCreationMode.PRODUCT=async | ||||
Staging Processor Configuration Section, older version: | This section contains the configuration of the staging processors as it was valid in ESL 6.5. These settings are possibly not up to date. Since they are subject to release-specific changes, this old information will probably be removed soon. | ||||
staging.processor. | c.i.b.c.c.s.p.Full | String | Configuration of the database staging processor that transfers domain-specific data (Products, Discounts, etc.). The processor replicates only the content of the selected domains during a batch process. The processor class is used to stage tables containing domain-specific content. | ||
staging.processor. | c.i.b.c.c.s.p.Analyze | String | See previous description. This decorator is used to analyze tables of the editing and live system during the staging process. In the editing system tables are analyzed on preparation hook, in live the system on replication hook. | ||
staging.processor. | c.i.b.c.c.s.p.Disable | String | See previous description. This decorator is used to disable all constraints on shadow tables of live system before the synchronization starts. After replication the constraints will be enabled. | ||
staging.processor. | c.i.c.m.c.s.Remove | String | See previous description. This decorator is used to mark the catalog domains as deleted that have been removed by the replication process. | ||
staging.processor. | c.i.b.c.c.s.p.Execute | String | This staging processor is based on full staging processor switching $1 and $2 tables on publication phase. Further, it calls query files on each staging hook to perform the replication. | ||
staging.processor. | c.i.b.c.c.s.p.Full | String | Configuration of the database staging processor transferring system content like regional settings, permissions, roles, etc. This processor is used to perform staging processes for tables containing system-wide content. | ||
staging.processor. | c.i.b.c.c.s.p.Analyze | String | This decorator is used to analyze tables of editing and live system during staging process. In editing system tables are analyzed on preparation hook, in live system on replication hook. | ||
staging.processor. | c.i.b.c.c.s.p.Disable | String | This decorator is used to disable all constraints on shadow tables of live system before the synchronization starts. After replication the constraints will be enabled. | ||
staging.processor. | c.i.b.c.c.s.p.Merge | String | Configuration of the database staging processor transferring domain-specific content that may be written in the storefront of the live system (like Users). This staging processor is used to replicate database content residing in tables that are changed in the source as well as in the target system. Because of this, the replication occurs in one huge transaction. It uses the SQL MERGE statement to transfer new and updated content and uses deletion tracking with a deletion trigger to propagate rows removed in the editing system. The MERGE statement has a restriction: it does not work on tables having a column with a context index. So, only tables with normal indexes are supported. | ||
staging.processor. | c.i.b.c.c.s.p.Disable | String | This decorator is used to disable all constraints on shadow tables of live system before the synchronization starts. After replication the constraints will be enabled. | ||
staging.processor. | c.i.b.c.c.s.p.Append | String | Configuration of the database staging processor transferring domain-specific content, that is only appended to live system content. Old content is neither replicated, deleted nor changed. | ||
staging.processor. | c.i.b.c.c.s.p.Merge | String | Configuration of the database staging processor transferring domain-specific content that may be written in the storefront of the live system (like Users) and have a lot of rows in the live system. The Undo process is not supported. | ||
staging.processor. | c.i.b.c.c.s.p.Simple | String | Configuration of the file system staging processor transferring simple files (gifs,...). | ||
staging.processor. | c.i.b.c.c.s.p.Simple | String | Configuration of the file system staging processor transferring localization files. It is based on file system staging processor, too. | ||
staging.processor. | c.i.b.c.c.s.p.Refresh | String | The decorator reloads the localization files in the live system after the localization files have been replicated. | ||
staging.processor. | c.i.b.c.c.s.p.Simple | String | Configuration of the file system staging processor transferring search indexes. It is based on file system staging processor, too. | ||
staging.processor. | c.i.c.f.c.r.Refresh | String | The decorator refreshes the search indexes on each application server in the live system. | ||
staging.processor. | c.i.b.c.c.s.p. | String | Configuration of the mview staging processor refreshing materialized views referencing affected tables. | ||
staging.processor. | c.i.b.c.c.s.p.FullFast | String | Configuration of the database staging processor transferring rules. This processor uses direct path SQL statements, improving performance during replication of huge amounts of data. Further, during replication the indexes are not maintained; after the replication has finished, all indexes affected by the replication are rebuilt. Furthermore, replicated rules are reloaded in the target system. This staging processor operates in the same way as Note In case of a database crash, the data inserted by this staging processor is not recoverable, since only direct load DML is used. | ||
staging.processor. | c.i.b.c.c.s.p.Analyze | String | This decorator is used to analyze tables of the editing and live system during the staging process. In the editing system tables are analyzed on preparation hook, in live system on replication hook. | ||
staging.processor. | c.i.b.c.c.s.p.Disable | String | This decorator is used to disable all constraints on shadow tables of the live system before the synchronization starts. After replication the constraints will be enabled. | ||
staging.processor. | c.i.b.c.c.s.p.Execute | String | This staging processor is based on full staging processor switching $1 and $2 tables on publication phase. Further, it calls query files on each staging hook to perform the replication. | ||
staging.processor. | c.i.c.s.c.s.Shipping | String | This decorator is used to reload the shipping rules, after the rules of cartridge | ||
staging.objects.chunksize | inactive, 15 | Integer | Business Object Replication: If the user plans to replicate a lot of objects (e.g., 10000 products), these objects are sent in several loops of 15 objects each; the cache refresh is started after all objects have been sent and merged. Note Remember that Business Object Replication is only meant for emergency updates of a few objects. If you want to replicate a lot of data, use the Mass Data Tasks menu. |
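The chunking behavior described for staging.objects.chunksize can be sketched as follows. This is a minimal illustration (not Intershop code, class and method names are made up for this example): a large list of objects is split into batches of the configured chunk size, and the cache refresh would happen only once, after all batches have been transferred.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedReplication {

    // Split a list of objects into batches of at most chunkSize elements.
    // With 10000 products and the default chunk size of 15, this yields 667 batches;
    // the last batch carries the remainder.
    public static <T> List<List<T>> chunk(List<T> objects, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < objects.size(); i += chunkSize) {
            chunks.add(objects.subList(i, Math.min(i + chunkSize, objects.size())));
        }
        return chunks;
    }
}
```

For example, 40 objects with a chunk size of 15 are sent in three loops of 15, 15, and 10 objects.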
Instead of declaring the staging processors and decorators inside staging.properties, they are now defined in code via Guice modules. Customers who use customized staging properties can still do so; the content of the staging properties is then used instead of what is defined in the Guice modules.
If staging.properties contains a registration for a processor, all code bindings (for the processor and its decorators) of this processor are ignored. That means the system does not support a mixed scenario of code and properties for a specific processor. It is, however, possible to declare one processor in Guice (code) and another in properties.
The complete deactivation of the Guice bindings for decorators allows a project to override standard decorators, as in previous versions. If a project only needs to add processors or decorators, it is recommended to register these via code. It is possible to overwrite a single processor and its decorators with an entry in the staging properties.
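The precedence rule described above can be sketched as follows. This is an illustrative model, not Intershop code (class and method names are invented, and decorator handling is omitted for brevity): an entry in staging.properties completely shadows the code binding for the same processor name, while names registered only in code keep their Guice binding.

```java
import java.util.HashMap;
import java.util.Map;

public class ProcessorRegistry {

    // processor name -> implementation, as bound via Guice modules (code)
    private final Map<String, String> codeBindings = new HashMap<>();
    // processor name -> implementation, as registered in staging.properties
    private final Map<String, String> propertyBindings = new HashMap<>();

    public void bindInCode(String name, String impl) {
        codeBindings.put(name, impl);
    }

    public void bindInProperties(String name, String impl) {
        propertyBindings.put(name, impl);
    }

    // A properties entry shadows the code binding for the same name;
    // otherwise the code (Guice) binding is used.
    public String resolve(String name) {
        if (propertyBindings.containsKey(name)) {
            return propertyBindings.get(name);
        }
        return codeBindings.get(name);
    }
}
```

This mirrors the rule that one processor may come from properties and another from code, but never both for the same processor name.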
public class CoreStagingModule extends AbstractModule
{
    @Override
    protected void configure()
    {
        MapBinder<String, BasicStagingProcessor> processorBinder =
            MapBinder.newMapBinder(binder(), String.class, BasicStagingProcessor.class);
        MapBinder<String, StagingDecoratorFactory> decoratorBinder =
            MapBinder.newMapBinder(binder(), String.class, StagingDecoratorFactory.class)
                .permitDuplicates();

        /*
         * FullDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "FullDomainSpecificStagingProcessor",
            FullDomainSpecificStagingProcessor.class,
            AnalyzeTablesDecoratorFactory.class,
            DisableConstraintsDecoratorFactory.class,
            ExecuteQueryDecoratorFactory.class);

        /*
         * FullStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "FullStagingProcessor",
            FullStagingProcessor.class,
            AnalyzeTablesDecoratorFactory.class,
            DisableConstraintsDecoratorFactory.class);

        /*
         * DeltaDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "DeltaDomainSpecificStagingProcessor",
            MergeDomainSpecificStagingProcessor.class,
            DisableConstraintsDecoratorFactory.class);

        /*
         * AppendDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "AppendDomainSpecificStagingProcessor",
            AppendDomainSpecificStagingProcessor.class,
            ExecuteQueryDecoratorFactory.class);

        /*
         * DeleteAppendDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "DeleteAppendDomainSpecificStagingProcessor",
            DeleteAppendDomainSpecificStagingProcessor.class,
            ExecuteQueryDecoratorFactory.class);

        /*
         * MergeDomainSpecificStagingProcessor
         */
        processorBinder.addBinding("MergeDomainSpecificStagingProcessor")
            .to(MergeDomainSpecificStagingProcessor.class);

        /*
         * MergeDomainSpecificAndQueryStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "MergeDomainSpecificAndQueryStagingProcessor",
            MergeDomainSpecificStagingProcessor.class,
            ExecuteQueryDecoratorFactory.class);

        /*
         * TemplateStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "TemplateStagingProcessor",
            SimpleFileSystemStagingProcessor.class,
            CompileTemplatesDecoratorFactory.class);

        /*
         * LocalizationStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "LocalizationStagingProcessor",
            FullDomainSpecificStagingProcessor.class,
            RefreshLocalizationsDecoratorFactory.class);

        /*
         * PipelineStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "PipelineStagingProcessor",
            SimpleFileSystemStagingProcessor.class,
            RefreshPipelinesDecoratorFactory.class);

        /*
         * MViewStagingProcessor
         */
        processorBinder.addBinding("MViewStagingProcessor")
            .to(MViewStagingProcessor.class);
    }

    private void addBinding(final MapBinder<String, BasicStagingProcessor> processorBinder,
            final MapBinder<String, StagingDecoratorFactory> decoratorBinder, final String name,
            final Class<? extends BasicStagingProcessor> processor,
            final Class<? extends StagingDecoratorFactory>... decorators)
    {
        processorBinder.addBinding(name).to(processor);
        for (Class<? extends StagingDecoratorFactory> decorator : decorators)
        {
            decoratorBinder.addBinding(name).to(decorator);
        }
    }
}
public class BcShippingStagingModule extends AbstractModule
{
    /*
     * This module configures the RulesStagingProcessor.
     *
     * Configuration of the database staging processor transferring rules. This processor uses
     * direct path SQL statements, improving performance when replicating huge amounts of data.
     * Further, the indexes are not maintained during replication; after the replication has
     * finished, all indexes affected by the replication are rebuilt. Furthermore, replicated
     * rules are reloaded in the target system.
     */
    @Override
    public void configure()
    {
        MapBinder<String, BasicStagingProcessor> processorBinder =
            MapBinder.newMapBinder(binder(), String.class, BasicStagingProcessor.class);
        processorBinder.addBinding("RulesStagingProcessor")
            .to(FullDomainSpecificStagingProcessor.class);

        MapBinder<String, StagingDecoratorFactory> stagingBinder =
            MapBinder.newMapBinder(binder(), String.class, StagingDecoratorFactory.class)
                .permitDuplicates();
        stagingBinder.addBinding("RulesStagingProcessor").to(AnalyzeTablesDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor").to(DisableConstraintsDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor").to(ExecuteQueryDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor").to(ShippingRuleEngineStagingProcessorDecoratorFactory.class);
    }
}
Since a decorator needs a processor to decorate, each decorator has a factory that takes the processor as a parameter. In the binding, it is not the decorator but the factory that is bound.
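The factory pattern behind this can be sketched as follows. The interface and class names below are simplified stand-ins for illustration, not the actual Intershop API: because a decorator wraps a concrete processor instance that only exists at staging time, the map binder registers a factory, and the framework later hands the processor to the factory to obtain the decorated processor.

```java
public class DecoratorFactorySketch {

    // Stand-in for a staging processor: transforms some content.
    interface StagingProcessor {
        String process(String content);
    }

    // The factory receives the processor to decorate as a parameter;
    // this is what gets bound in the Guice MapBinder.
    interface StagingDecoratorFactory {
        StagingProcessor decorate(StagingProcessor inner);
    }

    static class SimpleProcessor implements StagingProcessor {
        public String process(String content) {
            return "staged:" + content;
        }
    }

    // Example decorator factory: wraps the processor with extra work
    // (here, symbolically, logging around the inner call).
    static class LoggingDecoratorFactory implements StagingDecoratorFactory {
        public StagingProcessor decorate(final StagingProcessor inner) {
            return content -> "log(" + inner.process(content) + ")";
        }
    }

    public static String run(String content) {
        StagingProcessor p = new LoggingDecoratorFactory().decorate(new SimpleProcessor());
        return p.process(content);
    }
}
```

Calling `run("rules")` yields `log(staged:rules)`: the factory produced a decorator around the simple processor.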
In replication-clusters.xml, the communication parameters used for replication are specified. These settings define the communication infrastructure for both mass data replication and business object replication (fast publishing, e.g., of products).
replication-clusters.xml resides in editing (source) system(s).
INTERSHOP 7 supports system setups that can be spatially distributed over multiple data centers, each data center keeping its own database, webservers and appservers with (among others) their own database users and web-URLs. From a physical and IT technical point of view, the INTERSHOP 7 systems in all data centers are different systems, but from a business point of view, they may form a logical unit.
For data replication this means that the target of one replication process might not be a single INTERSHOP 7 system, but several INTERSHOP 7 systems residing in multiple data centers. Therefore, the concept of data replication with one target system as the recipient of replication data was extended to replication target clusters.
A replication target cluster represents the recipient of replication data from a business point of view. Logically one recipient, technically it consists of one or more replication target systems, while a target system represents one (technical) INTERSHOP 7 cluster with its own web URL and database user.
A Data Replication Manager will now select one replication target cluster as the target of a data replication process, and under the surface the replication mechanism will have to transfer the data to every target system belonging to the selected target cluster.
Accordingly, the data replication configuration now needs to provide information about the replication clusters that the respective source system may update. Moreover, it has to keep the information as to which target systems belong to each target cluster, and how these target systems can be reached.
The replication-clusters.xml (to be found in <IS_SHARE>/system/config/cluster) defines the communication parameters for both mass data replication and business object replication. It is required in the source (editing) system of a data replication environment.
The XML file structure is defined in replication.xsd.
Some example configurations are shown in the Cookbook - Mass Data Replication - Administration.
The root element of the file is replication-configuration, defining the XSD schema location and containing one target-clusters list.

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">
    <target-clusters>
        ....
    </target-clusters>
</replication-configuration>
The target-clusters list contains one or more target-cluster elements, each identified by an id attribute.

<target-clusters>
    <target-cluster id="Cluster1">
        ..
    </target-cluster>
    ..
    ..
    <target-cluster id="ClusterN">
        ..
    </target-cluster>
</target-clusters>
A target cluster is identified by its id. A target cluster configuration contains one target-systems list.

<target-cluster id="Cluster42">
    <target-systems>
        ...
    </target-systems>
</target-cluster>
The target-systems list contains one or more target-system elements, each identified by an id attribute and holding an active attribute.

<target-systems>
    <target-system id="TargetSystem1" active="true">
        ..
    </target-system>
    ..
    ..
    <target-system id="TargetSystemN" active="false">
        ..
    </target-system>
</target-systems>
A target system configuration consists of an id attribute, an active attribute, and a set of connection parameters. The active attribute can be "true" or "false"; it defines whether a target system configuration is used for data replication or not. If the target system uses URL mapping, then the complete URL to the SOAP servlet (including the server group to be used in the target system) has to be given, according to the settings for intershop.urlmapping.urlPrefix and intershop.urlmapping.servlet.webadapter in appserver.properties of the target server.
Note
<target-system id="TargetSystem_with_URLMapping" active="true">
    <webserver-url>http://ts3.mydomain.com:80/INTERSHOP/servlett/BOS/SOAP</webserver-url>
    ..
</target-system>

<target-system id="TargetSystem_without_URLMapping" active="true">
    <webserver-url>http://ts2.mydomain.com:80</webserver-url>
    <target-server-group>STG</target-server-group>
    ..
</target-system>

<target-system id="TargetSystem" active="true">
    ..
    <source-server-group>BOS</source-server-group>
    ..
</target-system>

<target-system id="TargetSystem_using_DBLink" active="true">
    ..
    <source-database-link>ISEDITING.world</source-database-link>
</target-system>

<target-system id="TargetSystem_using_DBLink" active="true">
    ..
    <target-database-user>INTERSHOP_LIVE</target-database-user>
</target-system>
Note
It is possible to use database access via database link from one, and direct database access from another target system within one target cluster.
The following example shows some basic configuration examples of replication-clusters.xml.
<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">
    <target-clusters>
        <target-cluster id="Cluster42">
            <target-systems>
                <target-system id="TargetSystem1" active="true">
                    <webserver-url>http://ts1.mydomain.com:80</webserver-url>
                    <source-server-group>BOS</source-server-group>
                    <target-server-group>BOS</target-server-group>
                    <source-database-link>ISEDITING.world</source-database-link>
                </target-system>
                <target-system id="TargetSystem2" active="false">
                    <webserver-url>http://ts2.mydomain.com:80</webserver-url>
                    <source-server-group>BOS</source-server-group>
                    <target-server-group>STG</target-server-group>
                    <target-database-user>INTERSHOP_LIVE</target-database-user>
                </target-system>
                <target-system id="TargetSystem_with_URLMapping" active="true">
                    <webserver-url>http://ts3.mydomain.com:80/INTERSHOP/servlett/BOS/SOAP</webserver-url>
                    <source-server-group>WFS</source-server-group>
                    <source-database-link>ISEDITING.world</source-database-link>
                </target-system>
            </target-systems>
        </target-cluster>
    </target-clusters>
</replication-configuration>
Explanations:

- "Cluster42": The file contains one cluster definition for the cluster named "Cluster42", which involves three target systems: "TargetSystem1", "TargetSystem2" and "TargetSystem_with_URLMapping".
- "TargetSystem1": Uses server group BOS in both the source and the target system. Uses the database link ISEDITING.world, which has to be defined in the target database schema to point to the source schema.
- "TargetSystem2": Uses server group BOS in the source and server group STG in the target system. Uses direct database access via the target database user INTERSHOP_LIVE; the system will grant access in the source schema to INTERSHOP_LIVE.
- "TargetSystem_with_URLMapping": The webserver-url contains the complete URL to the SOAP servlet, according to the settings for intershop.urlmapping.urlPrefix and intershop.urlmapping.servlet.webadapter in appserver.properties of the target server, where the URL prefix /INTERSHOP and the URL mapping servlet /servlett (sic!) is used. Uses server group BOS in the target system (encoded in the URL) and server group WFS in the source system. Uses the database link ISEDITING.world, which has to be defined in the target database schema to point to the editing schema.

Together with replication-clusters.xml, replication.xml holds the configuration for the data replication functionality of INTERSHOP 7. While replication-clusters.xml defines the communication channels both for mass data replication and for business object replication (fast publishing, e.g., of products), replication.xml is only used by mass data replication.
In replication.xml, the replication groups and their descriptions, which are useable in the back office, are defined. Additionally, in this file mass data replication processes can be predefined.
The XML file structure is defined in replication.xsd.
replication.xml is required in editing (source) system(s).
The file replication.xml consists of three parts: the replication group definitions (groups), the replication process definitions (processes), and the replication task definitions (tasks).
While the groups section is mandatory, the processes and tasks definitions are optional.
The following schema shows the basic structure of replication.xml.
<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">

    <!-- In this (mandatory) section all replication groups are defined
         that are shown in the Data Replication Manager's back office. -->
    <groups>
        ...
    </groups>

    <!-- In this (optional) section all replication processes are specified
         that can be replicated by the job 'Regular Replication Process' in SLDSystem
         (manually, i.e., a single time, or on a regular, i.e., recurring basis). -->
    <processes>
        ...
    </processes>

    <!-- This (optional) section contains all replication tasks that can be reused
         by several replication processes. Each referenced replication task is created
         at the beginning of the replication process in the according enterprise or channel. -->
    <tasks>
        ...
    </tasks>
</replication-configuration>
The following schema shows an excerpt of replication.xml dealing with the replication group definition.
A replication group definition consists of: a group id, the assigned business unit types, and, per locale, a localized name and description as shown in the Data Replication Manager's back office.
The example below depicts the replication group "Search Indexes" with configurations for locales "en_US" and "de_DE".
<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration ...>
    ...
    <!-- In this (mandatory) section all replication groups are defined
         that are shown in the Data Replication Manager's back office. -->
    <groups>
        <group id="SEARCH_INDEXES">
            <business-unit-types>20 30 52</business-unit-types>
            <locale id="en_US">
                <name>Search Indexes</name>
                <description>Search indexes and their configuration, search query definitions
                    (predefined product filters), and search redirects.
                    Note: The objects group that is indexed, e.g. PRODUCTS and PAGELETS,
                    must be added to avoid inconsistencies.
                </description>
            </locale>
            <locale id="de_DE">
                <name>Suchindizes</name>
                <description>Suchindizes und Indexkonfiguration, vordefinierte Suchanfragen und Such-Redirects.
                    Achtung: Die Replikationsgruppe, die indizierte Objekte enthält
                    (z.B. PRODUCTS, CATALOG oder PAGELETS), muss ebenfalls repliziert werden.
                </description>
            </locale>
        </group>
        <group id="...">
            ...
        </group>
    </groups>
    ...
</replication-configuration>
The following schema shows an excerpt of replication.xml dealing with the (mass data) replication process definition. These process definitions can be read by the job Regular Replication Process in domain SLDSystem to create automated replication processes.
Note
If no predefined replication processes are needed, remove the "processes" section from replication.xml or comment it out.
A replication process definition consists of: a process id, the process type, a description, the id of the target cluster to replicate to, and references to the replication tasks to be executed. The job Regular Replication Process identifies the process definition to execute via its job attribute ReplicationProcessID.

Note

Since a job refers to exactly one process id via ReplicationProcessID, it is necessary to create (copy) an own job for each replication process to be executed by job.

The example below depicts the definition of a replication process "nightly" of type "ReplicationPublication" with attached replication tasks "PrimeTechProducts" and "PrimeTechSpecialsProducts".
<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration ...>
    ...
    <!-- In this (optional) section all replication processes are specified
         that can be replicated by the job 'Regular Replication Process' in SLDSystem
         (manually, i.e., a single time, or on a regular, i.e., recurring basis). -->
    <processes>
        ...
        <process id="nightly">
            <type>ReplicationPublication</type>
            <description>This process is started every night.</description>
            <target-cluster-id>Cluster42</target-cluster-id>
            <task ref="PrimeTechProducts"/>
            <task ref="PrimeTechSpecialsProducts"/>
        </process>
        ...
    </processes>
    ...
</replication-configuration>
The following schema shows an excerpt of replication.xml dealing with the (mass data) replication task definition. These task definitions are referenced by job Regular Replication Process in domain SLDSystem when creating automated replication processes.
Note
If no predefined replication processes / tasks are needed, remove the "processes" and "tasks" sections from replication.xml or comment them out.
A replication task definition consists of: a task id, the organization (enterprise), optionally a channel, a description, and references to the replication groups to be replicated.
Note
The channel name is case-sensitive and needs to be written like in table DOMAININFORMATION.
The example below depicts the definition of two replication tasks, "PrimeTechSpecialsProducts" and "PrimeTechProducts".
<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration ...>
    ...
    <!-- This (optional) section contains all replication tasks that can be reused
         by several replication processes. Each referenced replication task is created
         at the beginning of the replication process in the according enterprise or channel. -->
    <tasks>
        ...
        <task id="PrimeTechSpecialsProducts">
            <organization>PrimeTech</organization>
            <channel>PrimeTechSpecials</channel>
            <description>Replicates all products of channel PrimeTechSpecials</description>
            <group ref="CATALOGS"/>
            <group ref="PRODUCTS"/>
        </task>
        <task id="PrimeTechProducts">
            <organization>PrimeTech</organization>
            <description>Replicates all products of channel PrimeTechSpecials</description>
            <group ref="PRODUCTS"/>
        </task>
        ...
    </tasks>
    ...
</replication-configuration>
For customization aspects, see the according information provided in the Cookbook - Mass Data Replication - Customization and Adaption.
A data replication system can be configured to serve as both source and target system. Hence, it is possible to set up data replication chains in which content is transferred consecutively across multiple systems (e.g., system A replicates to system B, and then system B replicates to system C).
As a business case example, setting up a data replication chain may be required for test or acceptance systems where data or design changes are tested or approved before they go live.
Note
It is NOT supported to create a replication ring, i.e., a setup in which the last target system in a replication chain serves as a source system to replicate data back to the original source system (e.g., system A to system B, then system B to system C, and then system C to system A).
The following figure depicts a data replication chain with 3 stages in a simplified form. For easier understanding it shows only target systems instead of target clusters, since only one system in a target cluster can act as an editing system for the next stage in a replication chain.
Figure: Mass Data Replication: Simplified schema of a data replication chain.
When setting up data replication chains, take care of the following topics:
Replication processes are intended to be atomic operations, i.e., they are counted as successfully finished only if they complete entirely without errors.
Therefore, whenever and wherever an error occurs during a replication process, the whole replication / staging process is aborted and marked as failed.
By default, the status and errors of replication and staging processes are written to the PROCESS table. The status of a replication process is displayed in the back office (SLDSystem | Data Replication | Replication Processes | process detail page).
Additionally, errors in replication and staging processes are tracked in error*.log files in share/system/log. Status information and errors within the staging framework are tracked in staging*.log files in the same directory.
Out of the box, there is no additional error notification implemented for data replication, but there is a standard mechanism to call a custom pipeline at particular stages of a staging process, which can be used to implement and call a custom notification pipeline at the end of a staging process (StagingProcessCustomization -> OnPreCompletition).
All information related to a staging process (pipelines TriggerStagingProcess, TriggerPublicationProcess, TriggerUndoProcess in editing, and StagingProcess in target system) is written to staging*.log files in share/system/log (not restricted to errors, but contains errors, too).
Errors occurring in the replication process (pipeline TriggerReplicationProcess) are tracked in error*.log files.
Replication processes, staging processes, and their staging sub-processes store their process states in the PROCESS table. In case of a failed or blocked replication, it may be helpful to check the respective process states in the database. Since PROCESS rows contain a LASTMODIFIED column, the most recent replication or staging process can easily be determined by ordering the rows by LASTMODIFIED.
The following table gives an overview of the occurring process states. To check the most recent process for its process state, execute the following SQL command (replace <ProcessName> with the respective name from the table below):
select state from PROCESS where name='<ProcessName>' order by LASTMODIFIED desc;
Meaning of table columns:
Column | Meaning |
---|---|
Process type | The process type in which a process state occurs (replication, staging, or staging sub-process). |
Process name | Name of the process type as it occurs in the PROCESS table. |
Process state | Name of the process state as it occurs in the PROCESS table. |
State type | Defines if the described state is set in the middle of a running staging process (type process) or as the final state of a process (type final). |
System | Shows where the described state occurs (source, or target system). |
Process type | Process name | Process state | State type | System | Description |
---|---|---|---|---|---|
Replication process | 'ReplicationProcess' | WAITING | process | source | The replication process is prepared but the execution time is not yet reached. |
CANCELED | final | source | The replication process was canceled in the back office. | ||
RUNNING | process | source | The replication process is underway. | ||
COMPLETED | final | source | The replication process has successfully finished. | ||
FAILED | final | source | The replication process has finished due to errors. | ||
Staging process | 'StagingProcess' | ErrorInternal | final | source | Any severe failure when calling the source system's staging pipeline. |
ErrorExecutingEditingStagingPipeline | final | source | Staging pipeline in editing system cannot be executed. | ||
ErrorNonStagedDomains | final | source | Some replication content references at least one domain that is neither part of the current replication process nor exists in at least one target system. | ||
ErrorNonStagedParentSites | final | source | Some replication content belongs to at least one unit whose parent site is neither part of the current replication process nor exists in at least one target system. | ||
ErrorConnectToEditingDB | final | source | The source system cannot create the staging identification token. | ||
ErrorConnectLiveSystem | final | source | The source system's staging web service cannot connect to a target system. | ||
ErrorCreatingLiveStagingProcess | final | source | Failure when copying the staging process to (at least) one target system. | ||
ErrorAcquiringLiveLocks | final | source | Failure when acquiring the locks for staging resources in (at least) one target system. | ||
ErrorAcquiringEditingLocks | final | source | Failure / timeout when acquiring the locks for staging resources in source system. | ||
ErrorInitializingStagingProcessors | final | source | Failure when checking the assignments of staging processors for all staging groups. | ||
ErrorStagingProcessModeNotSupported | final | source | (At least) one staging processor does not support the current replication process type, i.e., the staging process mode. | ||
StartingPreparation | process | source | The preparation phase is starting. | ||
PreparationSuccessfullyFinished | process | source | The preparation phase finished successfully. | ||
ErrorPreparation | final | source | The preparation phase finished with an error. | ||
FatalErrorPreparation | final | source | Fatal error during error handling in preparation phase. | ||
ErrorCallingLivePipeline | final | source | An error occurred while the source system called the staging pipeline in a target system. | ||
StartingSynchronization | process | target | The synchronization phase is starting. | ||
SynchronizationSuccessfullyFinished | process | target | The synchronization phase finished successfully. | ||
ErrorSynchronization | final | target * | The synchronization phase finished with an error. | ||
FatalErrorSynchronization | final | target * | Fatal error during error handling in synchronization phase. | ||
StartingReplication | process | target | The replication phase is starting. | ||
ReplicationSuccessfullyFinished | process | target | The replication phase finished successfully. | ||
ErrorReplication | final | target * | The replication phase finished with an error. | ||
FatalErrorReplication | final | target * | Fatal error during error handling in replication phase. | ||
ReplicationProcessCompleted | final | target * | A staging process of type Replication has successfully finished. | ||
StartPublication | process | source * | This is a state used to get the target systems in sync before the publication phase can start. | ||
StartingPublication | process | target | The publication phase is starting. | ||
PublicationSuccessfullyFinished | process | target | The publication phase finished successfully. | ||
ErrorPublication | final | target * | The publication phase finished with an error. | ||
FatalErrorPublication | final | target * | Fatal error during error handling in publication phase. | ||
StartRefreshCache | process | source * | This is a state used to get the target systems in sync before the refresh_cache phase can start. | ||
StartingRefreshCache | process | target | The refresh_cache phase is starting. | ||
RefreshCacheSuccessfullyFinished | process | target | The refresh_cache phase finished successfully. | ||
ErrorRefreshCache | final | target * | The refresh_cache phase finished with an error. | ||
FatalErrorRefreshCache | final | target * | Fatal error during error handling in refresh_cache phase. | ||
StagingProcessCompleted | final | target * | A staging process of type ReplicationPublication or of type Publication has successfully finished. | ||
ErrorDeterminingUndoContent | final | source | An error occurred while determining the Undo content. | ||
StartingSaveNoneUndoContent | process | target | The sub-step SaveNoneUndoContent of Undo phase is starting. | ||
SaveNoneUndoContentSuccessfullyFinished | process | target | The sub-step SaveNoneUndoContent of Undo phase finished successfully. | ||
ErrorSaveNoneUndoContent | final | target * | The sub-step SaveNoneUndoContent of Undo phase finished with an error. | ||
FatalErrorSaveNoneUndoContent | final | target * | Fatal error during error handling in sub-step SaveNoneUndoContent of Undo phase. | ||
StartingRestoreUndoContent | process | target | The sub-step RestoreUndoContent of Undo phase is starting. | ||
RestoreUndoContentSuccessfullyFinished | process | target | The sub-step RestoreUndoContent of Undo phase finished successfully. | ||
ErrorRestoreUndoContent | final | target * | The sub-step RestoreUndoContent of Undo phase finished with an error. | ||
FatalErrorRestoreUndoContent | final | target * | Fatal error during error handling in sub-step RestoreUndoContent of Undo phase. | ||
StagingUndoCompleted | final | target * | A staging process of type UnDo has successfully finished. | ||
ErrorUndoStaging | final | target * | A staging process of type UnDo has finished with error(s). | ||
ErrorInternalInLiveSystem | final | target * | Any severe failure when calling the target system's staging pipeline. | ||
ErrorEditingStagingProcessKilled | final | source | At its start-up time INTERSHOP 7 checks the PROCESS table for staging processes with any non-final state (such a process would be broken due to shutdown or crash of the appservers). If so, this process is set to ErrorEditingStagingProcessKilled in source system. | ||
ErrorLiveStagingProcessKilled | final | target * | At its start-up time INTERSHOP 7 checks the PROCESS table for a staging process with any non-final state (this process would be broken due to shutdown or crash of the appservers). If so, this process is set to ErrorLiveStagingProcessKilled in target system. |
There is a job*.log file in share/system/log, but it normally only states whether a pipeline was executed successfully (i.e., without technical failure) or not. For replication processes started by jobs, more diagnostic information can be found in error*.log (replication level) and staging*.log (staging level) or - if enabled - in the debug*.log files.
In case an error occurs during a replication process, both the editing and the live system(s) keep the data that were active before the now-broken replication process. In this sense, no data recovery is needed if a replication process threw an error.
However, there is a situation where manual intervention might be needed: If the INTERSHOP 7 application server that executes the replication process in a target system crashes at the very moment it performs the synonym switches, it might be that synonyms point to the newly filled tables while this information is not yet written to the database table STAGINGTABLE, which is used as an administration table for staging.
If such a situation occurs, open a SQL prompt as the target system's database user and execute the procedure staging.restore_synonyms.
exec staging.restore_synonyms
Database connection fault (wrong DB link configuration or broken connection), or database access forbidden from target to editing database schema
The target system's database user needs access to the source system's database schema to transfer database content.
Check as the according operating system user in the target system (e.g., isas1) whether you can connect to the target system's database schema using SQL*Plus and the credentials defined in orm.properties. Check whether you can access source system data, e.g., by:
select count(1) from product@<source_dblink_name>;
or
select count(1) from <source_schema_name>.product;
The Data Replication functionality is capable of supporting multiple data centers.
The basic concepts of multiple data center support in INTERSHOP 7 are described in detail in separate articles. In short, they assume the following conditions:
Regarding data replication environments, multi data center support means additionally:
As described before, INTERSHOP 7 introduces the concept of target clusters, which allows updating multiple target systems (possibly in different data centers) quasi in parallel with a single data replication process (both mass data and business object replications). Thus, all target systems of a target cluster are updated with the same database and file system data.
For data replication, the only configuration file needed to support multi-data-center usage is replication-clusters.xml in <IS_HOME>/share/system/config/cluster. Its content and syntax were described before.
To ease the setup and deployment of distributed data replication configurations, the data center name as defined by IS_DATA_CENTER in intershop.properties can be added as a prefix to replication-clusters.xml. This way, the data center-specific replication-clusters.xml files can be distributed independently to all source systems.
When looking up replication-clusters.xml, INTERSHOP 7 first checks whether IS_DATA_CENTER is set and, if so, whether a data center-specific <IS_DATA_CENTER>_replication-clusters.xml exists. If present, the system uses it. If not found, the system falls back to the default name replication-clusters.xml.
Example: Assuming the name of a data center is "DC_THX1138", the variable IS_DATA_CENTER in intershop.properties would be set to:
IS_DATA_CENTER=DC_THX1138
A data center-specific replication-clusters.xml would then be looked up as DC_THX1138_replication-clusters.xml.
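The lookup order described above can be sketched as a small shell function. This is a minimal illustration only: the function name and the CONFIG_DIR default are assumptions for this example; only the file-naming convention follows the documentation.

```shell
#!/bin/sh
# Illustrative sketch of the replication-clusters.xml lookup order.
# CONFIG_DIR and the function name are assumptions for this example;
# the file-naming convention follows the documentation above.
CONFIG_DIR="${IS_HOME:-/opt/intershop}/share/system/config/cluster"

resolve_replication_config() {
  # Prefer the data center-specific file if IS_DATA_CENTER is set
  # and the prefixed file exists ...
  if [ -n "$IS_DATA_CENTER" ] \
     && [ -f "$CONFIG_DIR/${IS_DATA_CENTER}_replication-clusters.xml" ]; then
    echo "$CONFIG_DIR/${IS_DATA_CENTER}_replication-clusters.xml"
  else
    # ... otherwise fall back to the default file name.
    echo "$CONFIG_DIR/replication-clusters.xml"
  fi
}
```

With IS_DATA_CENTER=DC_THX1138 and a file DC_THX1138_replication-clusters.xml present in the configuration directory, the function would return the data center-specific file; in all other cases it would return the default replication-clusters.xml.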
Basically, the data replication mechanism requires the source and target systems to have the same database structure (tables, indexes, ...) and the same base content (system domain, root site, ...).
Mass Data Replication supports transferring new organizations and channels created in the editing system to the target system. However, the content (catalogs, products, ...) of an organization or channel can only be transferred if the respective structure (domain hierarchy, i.e., the organization or channel itself) has been replicated before or is involved in the same replication process. Hence, after creating a new organization or channel in the editing system, it is recommended to first create a replication task in the organization/enterprise (e.g., PrimeTech) with the replication groups Organization and Channels/MasterRepository and have it replicated to the target system. Subsequently, replicate the other data such as catalogs, products, etc.
For example, assume a partner organization Miller working in the partner channel Reseller Channel of the sales organization PrimeTech. If the organization Miller wants to replicate master repository data, then Miller, Reseller Channel, and PrimeTech must also exist on the target system.
Note
It is not possible to replicate content into the repository of a different organization or of an organization working in a different channel.
Note
Do not manually create an organization or channel in the target system that already exists in the editing system if you want to use data replication to update its data with data from the editing system. Instead, use data replication to transfer the organization/channel from the editing to the target system! Although the displayed name of an organization/channel may be the same in the editing and the target system, their DOMAINIDs will differ unless transferred by mass data replication. Since data replication depends on the DOMAINID, it will treat the two organizations or channels as different ones, i.e., data replication will not work between them.
If a channel uses catalogs that are shared from a superior organization (enterprise or partner organization), changes to catalog data in the superior organization must be replicated before or together with the (derived) data of the channel catalog to be available in the target system (replication group Catalogs, categories and product types in both the superior organization and the channel). If only the channel catalog is replicated, changes in the organization's catalog are not replicated automatically and will be missing in the target system.
If a channel uses products that are shared from a superior organization (i.e., the master repository in the enterprise or partner organization), changes to product data in the master repository must be replicated before or together with the (derived) data of the channel products to be available in the target system (replication group Products in both the superior organization and the channel). If only the channel products are replicated, changes in the master repository are not replicated automatically with mass data replication and will be missing in the target system.
Both in the organization (i.e., the master repository) and in the channel, product prices are replicated implicitly with the replication group Products, but can also be replicated explicitly with the replication group Product Prices.
As with products, channel product prices that are shared from a superior organization also require the replication of the organization's product prices if the prices were changed in the organization.
Image definitions (types, views, sets, and the relations between them) exist only at organization level and are referenced from there in the organization and in channels. Therefore, changes to image definitions need to be replicated in the organization (replication group Image Definitions).
Image references use image definitions, which are maintained only in the organization. So, when image definitions have changed, they have to be replicated at organization level.
Image references themselves can be considered references to product pictures. Like products, they can exist at organization and channel level.
Image references for products in an organization (i.e., in the master repository) have to be replicated in the organization (replication group Image References); image references of channel products have to be replicated within the channels.
Localization data can be maintained at organization and at channel level. The data is stored separately in a corresponding localization repository for each level. The localization functionality of INTERSHOP 7 uses a lookup mechanism that first searches the current channel's localization repository and then the localization repository of the superior organization(s). So, if localization data is modified at both organization and channel level, the localization data has to be replicated in both the organization and the channel (replication group Localization Data).
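The channel-first lookup order can be illustrated with a short shell sketch. The *.properties file names used here are hypothetical stand-ins; the real localization repositories are database-backed. Only the fallback order mirrors the mechanism described above.

```shell
#!/bin/sh
# Illustrative sketch of the localization lookup order: the channel's
# repository is searched first, then the superior organization's.
# The *.properties files are hypothetical stand-ins for the
# database-backed localization repositories.
lookup_localization() {
  key="$1"; shift
  for repo in "$@"; do            # repositories in lookup order
    # Take the first value found for the key in this repository.
    value=$(sed -n "s/^${key}=//p" "$repo" 2>/dev/null | head -n 1)
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  done
  return 1                        # key not localized anywhere
}
```

A call such as `lookup_localization product.title channel.properties organization.properties` would return the channel's value if present and otherwise fall back to the organization's value, just as a channel override shadows the organization-level localization.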