DDMS

User Guide for Distributed Data Management System (DDMS)

To provide a functional description of the DDMS.

Jan van der Breggen (jan.vanderbreggen@rigpa.org)


Introduction

The aim of this document is to give a functional description of the DDMS. It is part of a set of documents that together provide a full description of the DDMS and of the functionality that needs to be changed or newly developed in future (insert links to the other documents). This set of documents describes the system several times, each time going a bit deeper and providing more detail. It is up to you as the reader to decide what level of detail is useful. For future changes and new developments, in other words outstanding tasks, the most practical and clear overview is given in the Todo List (todolist.html).

As for this document: the first section describes the DDMS as much as possible in layman's language but inevitably a bit of jargon will have to be introduced. After this there is a section that describes the DDMS in more detail, and inevitably also with a bit more jargon.

DDMS General Description

Sharing data between databases

The Distributed Data Management System (hereafter referred to as DDMS) provides functionality for sharing data between databases and applications that are located in different places. To make clear how this sharing of data works we'll use the example of a retreat registration system. When using a database application to manage retreat registration, it can be helpful to have access to more information about the people who register than just the details they provide when registering. For example, it might be useful to know which retreats in other countries a person has attended. Likewise, the retreat team might want to get an idea of how many retreatants have completed which practices, or consult any other information that is stored in another database, located in another part of the world.

The DDMS is designed to provide exactly this kind of functionality for business applications of any kind. In the case of Rigpa it will for example allow all the different countries to have a common way of sharing information about students, financial sponsors etc.


All this can be done without changing your own local database

In that case, you might wonder, do I need to change all the student numbers in my local database so that for each person they match the student number in the international database? If that were the case, it would be a serious disadvantage: you might have lots of information in your own local database that refers to the local student numbers, and it would become unusable if all student numbers were changed. Luckily, the DDMS has been designed in such a way that it allows local databases to keep their own student numbers. It simply assigns a second number to each student, which is the International Student Number. All that is needed to make this system work is a relatively small effort to match up the local and international student numbers. After this has been done the DDMS will take care of downloading information for you from the international database(s) and uploading to the international database(s) any information that you decide to share.
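The idea of keeping the local student number and adding an International Student Number next to it can be sketched as a simple two-way lookup. This is only an illustration in Python, not DDMS code (the DDMS itself is written in PHP); the class name `StudentNumberMap`, its methods, and the example numbers are all assumptions made for this sketch.

```python
from typing import Dict, Optional


class StudentNumberMap:
    """Hypothetical two-way lookup between local and international student numbers."""

    def __init__(self) -> None:
        self._local_to_intl: Dict[int, int] = {}
        self._intl_to_local: Dict[int, int] = {}

    def match(self, local_no: int, international_no: int) -> None:
        """Record that a local and an international number identify the same person."""
        if local_no in self._local_to_intl or international_no in self._intl_to_local:
            raise ValueError("student number is already matched")
        self._local_to_intl[local_no] = international_no
        self._intl_to_local[international_no] = local_no

    def to_international(self, local_no: int) -> Optional[int]:
        """Return the international number, or None if not matched yet."""
        return self._local_to_intl.get(local_no)

    def to_local(self, international_no: int) -> Optional[int]:
        """Return the local number, or None if not matched yet."""
        return self._intl_to_local.get(international_no)


# Made-up numbers: local student 17 is matched to international student 900042.
numbers = StudentNumberMap()
numbers.match(local_no=17, international_no=900042)
print(numbers.to_international(17))
print(numbers.to_local(900042))
print(numbers.to_international(18))  # student 18 has not been matched
```

Because the local number is never replaced, everything in the local database that refers to student 17 keeps working after the match is made.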


Privacy - Big Brother is not watching you

Yes, that's right: you are fully in control of which data you make available to the international database(s) and which data you keep private. To explain this further we have to become a bit more technical in the way we talk about the DDMS. The DDMS consists of two parts: a Client and a Server. In our example of a retreat registration system, your retreat registration system is the Client and the database that you want to extract information from is the Server. The Client, that is, your registration system, is the one that takes the initiative: through its configuration settings it decides exactly which data it wishes to share, and whether it only wants to download information from other databases or also wants to provide some of its own information to them. The Client is the 'active' part of the system. The Server simply waits until a Client asks it to do something: either to give it some information from the database it is connected to, or to write some information that is presented to it into that database. In its current state of development, however, the DDMS is not yet capable of writing information back from the Client (for example our retreat registration system) to the Server (for example the Care and Admin database or the International Retreat Registrations database). The vision is for this write-back functionality to be added soon.


Main functions of the DDMS

OK. So what exactly can the DDMS do in its current state? When a certain part of the information in a database is marked as linked data, then for that set of data the DDMS:

  • Gets all the data from the database that the local data is linked to and stores it in your local database.

  • Refreshes all or part of the linked data on a regular basis to ensure that any updates in the linked database are also made in your local copy of the data. During this 'refresh' action it retrieves any new information that was added to the linked database since the last update, it deletes from your local copy any information that was deleted from the linked database, and it updates information that was updated. All this happens automatically, without the users of your business application (such as the retreat registration system used as an example above) having to do anything.
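The three parts of the 'refresh' action (insert what is new, delete what was deleted, update what changed) can be illustrated with a small Python sketch. It is not the DDMS implementation: it assumes, purely for illustration, that both sides are available as in-memory dictionaries keyed by the primary key of the linked database, and it compares complete snapshots rather than only the changes since the last update. The function name `refresh` and the sample records are hypothetical.

```python
from typing import Dict

Record = Dict[str, str]


def refresh(local: Dict[int, Record], remote: Dict[int, Record]) -> Dict[str, int]:
    """Bring the local copy in line with the linked data.

    Both dictionaries are keyed by the primary key used in the linked database.
    `local` is changed in place; the return value counts what was done.
    """
    counts = {"inserted": 0, "deleted": 0, "updated": 0}

    # Delete local records that no longer exist in the linked database.
    for key in list(local):
        if key not in remote:
            del local[key]
            counts["deleted"] += 1

    # Insert records that are new, and update records whose content changed.
    for key, row in remote.items():
        if key not in local:
            local[key] = dict(row)
            counts["inserted"] += 1
        elif local[key] != row:
            local[key] = dict(row)
            counts["updated"] += 1

    return counts


local_copy = {1: {"name": "Ann"}, 2: {"name": "Bob"}, 3: {"name": "Cy"}}
linked_db = {1: {"name": "Ann"}, 2: {"name": "Robert"}, 4: {"name": "Dee"}}
result = refresh(local_copy, linked_db)
print(result)  # one record inserted, one deleted, one updated
```

After the call, `local_copy` holds the same records as `linked_db`, which is the state the business application then works with.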

Another important feature of the DDMS is that your local system can continue to work when there is no network connection. This is because your application has its own copy of the data that it is linked to in the remote database.


DDMS Future Developments

During the next phase of systems development of the DDMS we will create the functionality for writing back data that was newly created on the Client side back to the Server Side, and we will add a system of authentication to make the system more secure.


A More Detailed Description of the DDMS

After the above general description of the DDMS I will now describe it in more precise detail. This is still a rather high level description of the system. The real detailed stuff will be described in the separate documentation for the Client and Server sides of the system.

As was already said above, the DDMS is a system that consists of a Client and a Server. Both of these are collections of program code written in PHP. The Client is meant to be integrated into business applications that make use of a MySQL database. Within this database a subset of the data can be linked to an external database. The DDMS Client manages these data links on behalf of the business application. Some important features of the way the DDMS works are (items in bold indicate functionality that has not yet been implemented):

  • It is fully independent of any specific database structure. This means it can link any two MySQL databases to each other.

  • The Client does not connect directly to the remote database. Instead it sends its data requests to the Server Program which handles them on behalf of the Client. This is an important feature of the system for various reasons.

  • The business application will have its own working copy of any data it has a link to in a remote database. Thanks to this, once a business application works with linked data through the DDMS it will still be able to function fully in case the data link cannot be established.

  • Linked data can be of two kinds

    • Data that 'belongs' to the external database, that is the server side. In the rest of this documentation, the database where a particular data item belongs is referred to as its 'Database of Record' (abbreviated as DoR). In other words, for this type of linked data the server side is the DoR.

    • Data that 'belongs' to the local database, that is the client side. For this data the Client side is the DoR.

  • For linked data that has the Server side as the DoR, the DDMS offers the following ways to create datalinks:

    • 'select only' - in this case the client just wants to have access to data from an external database but not write new data back to it.

    • 'select and insert' - in this case, in addition to merely retrieving the data from the external database, the hosting business application will also create new data which the Client program will send back to the Server for insertion into the external database. In the case of 'select and insert' data, each data record on the client side will have its own local primary key value and in addition also maintain the primary key value it has on the Server side.

  • An advantage of assigning local primary key values to linked data items is that it allows the local database to maintain its referential integrity with other local databases. For example, if through the DDMS the set of data containing name, address and contact details for people is linked to an external database, assigning a local primary key in addition to the remote primary key means that this set of data can keep any relationships with other local data it already had before the data link was established.

  • Any linked data that has the Client side as the DoR will on a regular basis be sent to the Server for insertion into the remote database.

  • For data that has the Server side as the DoR, the Client side of the DDMS will from time to time run a 'refresh data' event. During this event newly inserted records will be imported from the DoR and inserted locally, records that have been deleted from the DoR will be deleted locally, and a selected fragment of the data will be updated to reflect changes in the DoR. How exactly this 'refresh event' works will be described in the DDMS Client documentation.

  • The 'refresh event' is configurable in the context of the hosting business application. To give an example: in the context of a retreat registration application it will be possible to run a refresh event only on linked data that is referred to by retreat registrations for current retreats.

  • The timing of the 'refresh' event is also configurable. The business application can schedule its 'refresh' data needs flexibly.

  • The Client side is capable of handling multiple server connections. In other words, linkages with more than one external database can be established.

  • Related to this, but significantly different in nature: a business application can be both Client and Server. For example, a retreat registration application can be the Client when it comes to Student Contact Details information, but the Server when it comes to synchronising data with a related online registration website that has its own local cache of registration data.

  • The Client side is the 'active' side and the Server side is the 'passive' side. In other words: unless the Client initiates an interaction, there will be no exchange of data of any kind taking place

  • In terms of which data is shared, the rule is that the DoR is in charge. In other words: for data where the server side is the DoR, the server side decides which data it makes available for Clients to link to. When the Client side is the DoR it is the Client that decides which data it will from time to time send to the Server for storage.

  • The system is secure: before requests are handled by the Server an authentication process has to be successfully completed.

  • The DDMS allows the Client side to establish a datalink to a vertical fragment of the data on the Server side. This means that it can select a subset of fields in a given data table that is managed by the server, and does not have to link to all fields. An example makes this really easy to understand: if you are only interested in the name and contact details information of students, but the server side database has in addition to this much more information available that you are not interested in, such as place of birth, or passport number, you have a choice to only establish a link to the name and contact details without receiving all the other information you are not interested in.

  • The DDMS does not allow the Client side to establish a datalink to horizontal data fragments of the data on the server side. Example: it is not possible to link to only the students from a particular country, ignoring all the students from other countries. If a need for horizontal fragmentation arises this functionality might be developed in future.

  • Currently the DDMS Client is only capable of linking an entire local data table to one table on the server side (or to a vertical fragment of one table on the server side, as described above).

  • When it comes to refreshing local data the DDMS does allow for horizontal data fragmentation. In other words: it is possible to only refresh a small subset of records of linked tables.
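Several of the features listed above (the two datalink modes, linking to a vertical fragment, and refreshing only a horizontal subset of records) can be pictured together as one description of a datalink. The Python sketch below is an illustration only: the `DataLink` class, its attribute names, the `on_current_retreat` flag and the sample rows are assumptions made for this example and do not reflect the actual DDMS Client configuration, which is covered in the DDMS Client documentation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


@dataclass
class DataLink:
    """Hypothetical description of one datalink; all names are illustrative."""
    local_table: str
    remote_table: str
    fields: List[str]            # vertical fragment: the linked subset of fields
    mode: str = "select only"    # or "select and insert"
    refresh_filter: Optional[Callable[[Row], bool]] = None  # horizontal subset, refresh only

    def __post_init__(self) -> None:
        if self.mode not in ("select only", "select and insert"):
            raise ValueError(f"unknown datalink mode: {self.mode!r}")
        if not self.fields:
            raise ValueError("a datalink needs at least one field")

    def project(self, remote_row: Row) -> Row:
        """Keep only the linked fields of a server-side row."""
        return {name: remote_row[name] for name in self.fields}

    def rows_to_refresh(self, local_rows: List[Row]) -> List[Row]:
        """Select the local rows that the next refresh should cover."""
        if self.refresh_filter is None:
            return list(local_rows)
        return [row for row in local_rows if self.refresh_filter(row)]


link = DataLink(
    local_table="student",
    remote_table="student",
    fields=["name", "email"],
    refresh_filter=lambda row: bool(row.get("on_current_retreat")),
)

remote_row = {"name": "Ann", "email": "ann@example.org",
              "passport_no": "X123", "place_of_birth": "Paris"}
print(link.project(remote_row))  # passport number and place of birth are left out

local_rows = [{"name": "Ann", "on_current_retreat": True},
              {"name": "Bob", "on_current_retreat": False}]
print(len(link.rows_to_refresh(local_rows)))
```

Note that the filter only narrows which already linked records are refreshed; in line with the list above, the link itself still covers the whole table.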

Data Communication between DDMS Client and Server

Currently the data exchange between client and server takes place in the following two ways:

  • A String containing database records in the following format: fields are delimited by tabs (\t) and records by newline characters (\n)

  • A String representing a serialized object or array

Which of these two applies is specified in the documentation of the various methods the DDMS client can invoke on the server side.
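The first of these formats, with tab-delimited fields and newline-delimited records, can be sketched as follows. This Python example is an illustration rather than the DDMS code. The source does not say how tabs or newlines inside field values are handled, or whether the last record ends with a newline, so the sketch makes two assumptions: values containing either character are rejected, and a single trailing newline is tolerated when decoding. The function names are hypothetical.

```python
from typing import List

FIELD_SEP = "\t"
RECORD_SEP = "\n"


def encode_records(records: List[List[str]]) -> str:
    """Join fields with tabs and records with newlines."""
    for record in records:
        for value in record:
            # Assumption: no escaping scheme exists, so such values are refused.
            if FIELD_SEP in value or RECORD_SEP in value:
                raise ValueError("field values must not contain tabs or newlines")
    return RECORD_SEP.join(FIELD_SEP.join(record) for record in records)


def decode_records(payload: str) -> List[List[str]]:
    """Split a payload back into records and fields."""
    # Assumption: one trailing newline, if present, does not start a new record.
    if payload.endswith(RECORD_SEP):
        payload = payload[:-1]
    if payload == "":
        return []
    return [line.split(FIELD_SEP) for line in payload.split(RECORD_SEP)]


records = [["1", "Ann", "ann@example.org"], ["2", "Bob", ""]]
payload = encode_records(records)
print(repr(payload))
print(decode_records(payload) == records)
```

An empty field is simply two separators in a row, which is why the second record above ends in a tab.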

The data comms format will be a subject for revision during future development phases of the DDMS. It is envisioned that the DDMS Server could provide Web Service style functions to other applications besides the DDMS Client, and in that case a more standardized form of data comms, such as XML, will be considered.



Documentation generated on Mon, 18 May 2009 11:21:26 +0200 by phpDocumentor 1.4.1