Persistence overview


Introduction:

This document provides an overview of the persistence implementation in SANDev, which serves as an updated reference and framework for Structs and Nodes Development (SAND). Persistence in SAND is transactionally safe, queryable, permanent storage of application information which can be implemented over any persistence technology. The default implementation in SANDev uses JDBC.

A typical persistent struct declaration looks like this:

    /**
     * My typical sample persistent struct declaration.
     *
     * @sand.structmessage persist
     * @sand.verbforms update query collection
     */
    public class MyDataStruct {
        ...
    }

See the documentation of the basic @tags for more details on these and other generator tags. For the example declaration, the generator creates the corresponding messages (such as the MyDataQuery, MyDataCollection, and MyDataUpdate messages used in the sections that follow) along with supporting persistency code. This document describes persistency from the perspectives of the application, implementation, deployment, and testing.

Using persistent data within your application:

A SAND application works with persistent data through communication with a DataManagerNode. To retrieve persistent instances, an application calls the DataManager with a query and gets back a collection. An update is used to add, modify, or delete an instance. To update several instances within a single transaction, an AggregateUpdate message is used.

Queries and Collections:

To run a query within the MyAppNode business logic:

  1. create a new MyDataQuery instance (a SandQueryMessage)
  2. setMatchInfo, setMaxReturn, setOrderBy as appropriate
  3. use the autogenerated callMyDataQuery method defined in MyAppNodeBase which returns a MyDataCollection (a SandCollectionMessage)
  4. verify getSandTransmitStatus in the returned message is SandTransmitMessage.STATUS_NORMAL
  5. process the collection as required by the application
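
The query steps can be sketched in Java roughly as follows. The real MyDataQuery, MyDataCollection, and callMyDataQuery are generated by SANDev; the classes below are simplified, hypothetical stand-ins so that the sketch compiles on its own. The setter argument types, the match string syntax, and the value of STATUS_NORMAL are assumptions, not the generated API.

```java
import java.util.ArrayList;
import java.util.List;

class QuerySketch {
    // Assumed value; stands in for SandTransmitMessage.STATUS_NORMAL.
    static final int STATUS_NORMAL = 0;

    // Hypothetical stand-in for the generated MyDataQuery (a SandQueryMessage).
    static class MyDataQuery {
        String matchInfo;
        int maxReturn;
        String orderBy;
        void setMatchInfo(String m) { matchInfo = m; }
        void setMaxReturn(int n) { maxReturn = n; }
        void setOrderBy(String o) { orderBy = o; }
    }

    // Hypothetical stand-in for the generated MyDataCollection (a SandCollectionMessage).
    static class MyDataCollection {
        int status = STATUS_NORMAL;
        List<String> items = new ArrayList<>();
        int getSandTransmitStatus() { return status; }
    }

    // Stand-in for the autogenerated callMyDataQuery in MyAppNodeBase.
    // It fabricates maxReturn placeholder instances instead of calling a DataManager.
    static MyDataCollection callMyDataQuery(MyDataQuery q) {
        MyDataCollection c = new MyDataCollection();
        for (int i = 0; i < q.maxReturn; i++) {
            c.items.add("instance-" + i);
        }
        return c;
    }

    static List<String> fetch(int max) {
        MyDataQuery query = new MyDataQuery();             // step 1
        query.setMatchInfo("name = 'example'");            // step 2 (match syntax is assumed)
        query.setMaxReturn(max);
        query.setOrderBy("name");
        MyDataCollection result = callMyDataQuery(query);  // step 3
        if (result.getSandTransmitStatus() != STATUS_NORMAL) {  // step 4
            throw new IllegalStateException("query was not delivered normally");
        }
        return result.items;                               // step 5
    }

    public static void main(String[] args) {
        System.out.println(fetch(3));
    }
}
```

Checking the transmit status before touching the collection keeps delivery failures separate from the case of a query that legitimately matches nothing.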

Updates:

To perform an update within the MyAppNode business logic:

  1. create a new MyDataUpdate instance (a SandUpdateMessage)
  2. set the action to be ACTION_ADD, ACTION_UPDATE, or ACTION_DELETE as appropriate
  3. set the MyData instance. The persistence fields in the instance (accessors are defined in SandPersistMessage) should be left alone.
  4. use the autogenerated callMyDataUpdate method defined in MyAppNodeBase which returns a MyDataUpdate with the updated instance information.
  5. verify getSandTransmitStatus in the returned message is SandTransmitMessage.STATUS_NORMAL
  6. process the updated MyData instance as required by the application
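
The update steps can be sketched the same way. MyData, MyDataUpdate, and callMyDataUpdate below are hypothetical stand-ins for the generated classes. The constant values, the accessor names setAction and setMyData, and the way the simulated data manager fills in uniqueID and revisionNumber are assumptions made only for illustration.

```java
class UpdateSketch {
    // Assumed values; the real constants are defined by SANDev.
    static final int STATUS_NORMAL = 0;
    static final int ACTION_ADD = 1, ACTION_UPDATE = 2, ACTION_DELETE = 3;

    // Hypothetical stand-in for the generated MyData message.
    static class MyData {
        long uniqueID;       // persistence field: left alone by the application
        int revisionNumber;  // persistence field: left alone by the application
        String name;         // application field
    }

    // Hypothetical stand-in for the generated MyDataUpdate (a SandUpdateMessage).
    static class MyDataUpdate {
        int action;
        MyData instance;
        int status = STATUS_NORMAL;
        void setAction(int a) { action = a; }
        void setMyData(MyData d) { instance = d; }
        MyData getMyData() { return instance; }
        int getSandTransmitStatus() { return status; }
    }

    // Stand-in for the autogenerated callMyDataUpdate in MyAppNodeBase.
    // Simulates a data manager that assigns the persistence fields (assumed behavior).
    static MyDataUpdate callMyDataUpdate(MyDataUpdate request) {
        MyData stored = new MyData();
        stored.name = request.instance.name;
        stored.uniqueID = (request.action == ACTION_ADD) ? 1001L : request.instance.uniqueID;
        stored.revisionNumber = request.instance.revisionNumber + 1;
        MyDataUpdate reply = new MyDataUpdate();
        reply.setAction(request.action);
        reply.setMyData(stored);
        return reply;
    }

    static MyData add(String name) {
        MyDataUpdate update = new MyDataUpdate();        // step 1
        update.setAction(ACTION_ADD);                    // step 2
        MyData data = new MyData();                      // step 3: set application fields only
        data.name = name;
        update.setMyData(data);
        MyDataUpdate reply = callMyDataUpdate(update);   // step 4
        if (reply.getSandTransmitStatus() != STATUS_NORMAL) {  // step 5
            throw new IllegalStateException("update was not delivered normally");
        }
        return reply.getMyData();                        // step 6
    }

    public static void main(String[] args) {
        MyData saved = add("example");
        System.out.println(saved.uniqueID + " rev " + saved.revisionNumber);
    }
}
```

The application continues with the instance from the reply rather than the one it sent, because only the returned instance carries the updated persistence fields.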

If, instead of a single instance update, we need to modify several instances within a single transaction, we use an AggregateUpdate message instead.


Caching persistent data within your application:

Once your application logic has retrieved persistent information from the DataManager, you can avoid future retrieval overhead by saving the instance in an IDCache. If the information never changes, then that's all that is necessary. If the information frequently changes, then caching is generally not recommended.

For information that changes infrequently, you can track updates via CacheAction messages, through a CacheManager configured with your deployment. To track changes, you must first declare your application node to work with CacheAction messages. For example, MyAppNodeDecl might declare:

  * @sand.call 
  *     org.sandev.basics.sandmessages.CacheAction
  *     org.sandev.basics.sandmessages.CacheAction
  *     cacheActionRegistration
  * @sand.subscribe
  *     org.sandev.basics.sandmessages.CacheAction
  *     cacheActionSource

These javadoc tags are defined in org.sandev.generator.tags.

[Figure: example deployment configuration showing the data flow for the registration and change notifications.]

To cache an instance:

  1. put the instance into the IDCache
  2. create a CacheAction action==REGISTER with the uniqueID and messageClass of the instance.
  3. callCacheActionRegistration with the registration action
  4. override onDelivery(CacheAction msg) to remove the instance from the cache. Removing the instance automatically handles the full range of change cases.

This "lazy evaluation" or "fetch on demand" strategy is generally a good approach, although some cases (e.g. a product index) might require proactive retrieval or "pre-fetch" in response to changes or dumps. When removing an instance from the IDCache, remember to UNREGISTER that instance from the CacheManager.
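
The fetch-on-demand strategy can be sketched as below. This is not the SANDev IDCache or CacheManager API: the map and set are hypothetical stand-ins for the cache and for the CacheManager registrations, and onChangeNotification plays the role of the overridden onDelivery(CacheAction msg).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class CacheSketch {
    // Stand-in for the IDCache: uniqueID -> cached instance.
    final Map<Long, String> idCache = new HashMap<>();
    // Stand-in for the registrations held by the CacheManager.
    final Set<Long> registered = new HashSet<>();

    // Steps 1-3: cache the instance and register interest in changes to it.
    void cache(long uniqueID, String instance) {
        idCache.put(uniqueID, instance);
        registered.add(uniqueID);      // corresponds to sending a REGISTER CacheAction
    }

    // Step 4: on a change notification, drop the instance and unregister it.
    void onChangeNotification(long uniqueID) {
        idCache.remove(uniqueID);
        registered.remove(uniqueID);   // corresponds to sending an UNREGISTER CacheAction
    }

    // Fetch on demand: reload through the loader only when the cache has no entry.
    String get(long uniqueID, Function<Long, String> loader) {
        String hit = idCache.get(uniqueID);
        if (hit != null) {
            return hit;
        }
        String fresh = loader.apply(uniqueID);
        cache(uniqueID, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        CacheSketch c = new CacheSketch();
        c.cache(7L, "old value");
        c.onChangeNotification(7L);
        System.out.println(c.get(7L, id -> "reloaded value"));
    }
}
```

Because a notification only removes the entry, the cost of reloading is paid by the next reader and only if the instance is actually needed again.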


Modeling persistent data:

When a struct is declared persistent (through the "persist" @sand.structmessage flag), the following fields are added to the generated SandPersistMessage:

A uniqueID is assigned to an instance for its lifetime. At runtime, the application can safely assume that a uniqueID value always refers to a single object instance (in most cases this assumption can be carried through to the persistent storage, although the values may be remapped by the Persister). The revisionNumber is used to ensure that all updates reference the latest data (see the SandUpdateMessage for details). Under normal circumstances an application only has access to ACTIVE instances; DELETED and ARCHIVED records may move offline if storage requirements dictate. All fields are required by persistence processing except for lastModifiedReason.
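
One common way a revision number can guard against stale updates is an optimistic check at update time. The sketch below illustrates that general technique only; the source does not describe how the SANDev Persister performs the check, and every class and method here is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class RevisionSketch {
    // Hypothetical persistent item carrying the two persistence fields discussed above.
    static class Item {
        final long uniqueID;
        final int revisionNumber;
        final String value;
        Item(long uniqueID, int revisionNumber, String value) {
            this.uniqueID = uniqueID;
            this.revisionNumber = revisionNumber;
            this.value = value;
        }
    }

    private final Map<Long, Item> store = new HashMap<>();

    // Stores a new item at revision 1 (the starting revision is an assumption).
    Item add(long uniqueID, String value) {
        Item item = new Item(uniqueID, 1, value);
        store.put(uniqueID, item);
        return item;
    }

    // Accepts the update only if it was based on the latest stored revision.
    Item update(Item edited) {
        Item current = store.get(edited.uniqueID);
        if (current == null || current.revisionNumber != edited.revisionNumber) {
            throw new IllegalStateException("update is based on stale data");
        }
        Item next = new Item(edited.uniqueID, edited.revisionNumber + 1, edited.value);
        store.put(next.uniqueID, next);
        return next;
    }

    public static void main(String[] args) {
        RevisionSketch s = new RevisionSketch();
        Item first = s.add(1L, "a");
        Item second = s.update(new Item(1L, first.revisionNumber, "b"));
        System.out.println("revision is now " + second.revisionNumber);
    }
}
```

A second writer still holding the first revision is rejected and must re-read the instance before trying again.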

A persistent struct may only contain:

Additional types are handled through field tag metadata:

For any persistent struct declaration, the generated SandPersistMessage contains methods to automatically resolve references (or arrays of references) into instances (or arrays of instances). For example, org.sandev.TaskHeap.sandmessages.Plan has a method generated in response to the reference array declared in org.sandev.TaskHeap.structs.PlanStruct. These generated methods can be used to easily shift from a reference to an instance when needed by the application, while avoiding unnecessary data retrieval and memory use. When combined with appropriate caching, references support "lazy loading", "pre-fetch" and hybrid models for trees.

Using reference arrays to create a tree is typical for categorical classifications. It is also an example of a "closely coupled" association between instances, where one instance literally holds an array of references. This is in contrast to a "loosely coupled" association, where the association is computed via query. For example, if an OrderStruct contains a reference to a CustomerStruct, then the application can find all the orders placed by a customer by querying for orders with the customer uniqueID. Which form of reference coupling is appropriate depends on the application requirements.
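
The two forms of coupling can be contrasted in a short sketch. Category and Order are hypothetical plain classes, and the in-memory map and list stand in for persistent storage; none of this is generated SANDev code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CouplingSketch {
    // Closely coupled: the parent holds an array of child references (uniqueIDs).
    static class Category {
        final long uniqueID;
        final long[] childIDs;
        Category(long uniqueID, long[] childIDs) {
            this.uniqueID = uniqueID;
            this.childIDs = childIDs;
        }
    }

    // Loosely coupled: the order only records the uniqueID of its customer.
    static class Order {
        final long uniqueID;
        final long customerID;
        Order(long uniqueID, long customerID) {
            this.uniqueID = uniqueID;
            this.customerID = customerID;
        }
    }

    final Map<Long, Category> categories = new HashMap<>();
    final List<Order> orders = new ArrayList<>();

    // Resolve the reference array into instances only when the caller asks for them.
    List<Category> resolveChildren(Category parent) {
        List<Category> children = new ArrayList<>();
        for (long id : parent.childIDs) {
            children.add(categories.get(id));
        }
        return children;
    }

    // Compute the association by query: all orders carrying the customer uniqueID.
    List<Order> ordersFor(long customerID) {
        List<Order> matches = new ArrayList<>();
        for (Order o : orders) {
            if (o.customerID == customerID) {
                matches.add(o);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        CouplingSketch s = new CouplingSketch();
        s.categories.put(2L, new Category(2L, new long[0]));
        Category root = new Category(1L, new long[] {2L});
        s.orders.add(new Order(10L, 500L));
        System.out.println(s.resolveChildren(root).size() + " child, "
                + s.ordersFor(500L).size() + " order");
    }
}
```

In the closely coupled form the parent must be updated whenever a child is added. In the loosely coupled form nothing on the customer changes, and each lookup costs a query.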


Storage mapping example: default SQL

By default, the JDBCPersister that comes with SANDev maps persistent struct definitions as follows:

Other Persister implementations may work differently. With the exception of JDBCPersister.java, the files in org.sandev.tools.JDBC are autogenerated, and are configured in the build to accept ${StructMapper} as an extra parameter. If this property is set to the fully qualified classname of a class implementing the org.sandev.generator.StructMapper interface, then the table names, field types, field names, and uniqueID management will be changed accordingly.

Struct remapping is limited to situations where all fields can be represented. When working with a legacy database containing equivalent fields (or one to which fields can be added), it may be possible to reverse map the legacy database into a struct representation. While this may not result in an optimum object model, it does provide a basis for application logic. Working with a struct model that is significantly divergent from the persistence model is generally not recommended.

Where a legacy database model is not adequate to support application logic, options include:

  1. Using mapping technology to provide a different logical database view.
  2. Writing a custom (hand-built) persistence layer.
  3. Using a translation node.

With a translation node, the application logic primarily works with application objects which are not declared as persistent, but have all the verb forms declared. These application objects are sent to the AppPersistMgr, which translates them into their equivalent persistent objects and calls through to the LegacyDataMgr. The translation node is a standard "adaptor" pattern for interfacing with existing technology.
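
The translation node can be sketched as a plain adaptor. AppPersistMgr and LegacyDataMgr are named in the text above, but their interfaces here are hypothetical, as are the AppCustomer and LegacyCustomer classes and the name-splitting rule used for the translation.

```java
import java.util.ArrayList;
import java.util.List;

class TranslationSketch {
    // Application object: not declared persistent.
    static class AppCustomer {
        String fullName;
    }

    // Persistent object shaped to match the legacy schema.
    static class LegacyCustomer {
        String firstName;
        String lastName;
    }

    // Hypothetical interface of the node that owns the legacy storage.
    interface LegacyDataMgr {
        void store(LegacyCustomer customer);
    }

    // The translation node: accepts application objects and calls through with persistent ones.
    static class AppPersistMgr {
        private final LegacyDataMgr legacy;

        AppPersistMgr(LegacyDataMgr legacy) {
            this.legacy = legacy;
        }

        void update(AppCustomer app) {
            legacy.store(toLegacy(app));
        }

        // Splits the full name at the first space (an illustrative rule only).
        static LegacyCustomer toLegacy(AppCustomer app) {
            LegacyCustomer c = new LegacyCustomer();
            int space = app.fullName.indexOf(' ');
            c.firstName = space < 0 ? app.fullName : app.fullName.substring(0, space);
            c.lastName = space < 0 ? "" : app.fullName.substring(space + 1);
            return c;
        }
    }

    public static void main(String[] args) {
        List<LegacyCustomer> saved = new ArrayList<>();
        AppCustomer customer = new AppCustomer();
        customer.fullName = "Ada Lovelace";
        new AppPersistMgr(saved::add).update(customer);
        System.out.println(saved.get(0).firstName + " / " + saved.get(0).lastName);
    }
}
```

Keeping the mapping in one node means the application logic never sees the legacy shape, and a later schema change touches only the translation code.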


Working with data over time:

While an application is running, data is managed by application logic. Data that must exist prior to startup is defined in the deployment Configuration, which is loaded and checked by the Persister at initialization time. Configuration.initialData is used in testing, and is one of the few times where all field values (including the persistent fields) can be specified explicitly. It is important to use values in accordance with any UniqueIDManager implementation assumptions.

A Persister will not access any instances with recordStatus ARCHIVED or DELETED for a normal query, so these instances can be moved to auxiliary (such as read-only) storage and/or offline. A mix of storage media can also be utilized depending on the age of the data.

Versioning of data structures (to support changes in application logic) requires that the underlying database structures also be changed. In some cases, the new field can simply be added, with default values used for earlier versions of the data. In other cases both the new and the old versions of the struct will need to coexist, with translation of old data occurring on demand. In most other cases, the data will need to be forward migrated through a data transformation process.

Each node is typically declared in its own deployment, with the ConverterNode aware of both persistency models. An alternative is to create a custom messaging serializer to convert the old data, in which case the data migration system would be correspondingly reduced.

Data migration can be a significant operation and can potentially result in errors even with autogenerated persistency code on both ends. If possible, it is recommended that the migrated data be verified against the original data through a reverse mapping after the migration completes.
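
Verification through reverse mapping amounts to a round-trip check: migrate each original forward, map the result back, and compare with the original. The sketch below assumes a hypothetical migration that splits a single name field into two; OldPerson, NewPerson, and both mapping functions are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MigrationCheckSketch {
    // Hypothetical old-model struct with a single name field.
    static class OldPerson {
        final String name;
        OldPerson(String name) { this.name = name; }
    }

    // Hypothetical new-model struct with the name split in two.
    static class NewPerson {
        final String first;
        final String last;
        NewPerson(String first, String last) { this.first = first; this.last = last; }
    }

    // Forward migration: split at the first space.
    static NewPerson forward(OldPerson o) {
        int space = o.name.indexOf(' ');
        if (space < 0) {
            return new NewPerson(o.name, "");
        }
        return new NewPerson(o.name.substring(0, space), o.name.substring(space + 1));
    }

    // Reverse mapping: rebuild the old form from the migrated data.
    static OldPerson reverse(NewPerson n) {
        return new OldPerson(n.last.isEmpty() ? n.first : n.first + " " + n.last);
    }

    // Returns the original names that do not survive the round trip.
    static List<String> verify(List<OldPerson> originals) {
        List<String> mismatches = new ArrayList<>();
        for (OldPerson o : originals) {
            OldPerson back = reverse(forward(o));
            if (!back.name.equals(o.name)) {
                mismatches.add(o.name);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<OldPerson> data = Arrays.asList(
                new OldPerson("Ada Lovelace"), new OldPerson("Plato"), new OldPerson("Ada "));
        System.out.println("mismatches: " + verify(data).size());
    }
}
```

A name with a trailing space does not survive the round trip here, which is the kind of silent data change this check is meant to surface.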










© 2002-2003 SAND Services Inc.
All Rights Reserved.