Sunday, December 22, 2013

We looked at indexing and monitoring in a data warehouse. We now look at the interfaces to many technologies. The data warehouse recevies data from the operational systems and ODS and passes it to the data marts, DSS applications and alternate storage. The passing of data through the ware house should be smooth and easy. This is an important criterion for the data warehouse and note that this does not mean online passage of data which is not supported. The tests to see whether the data is passing smoothly through the DBMS is to see whether the passage requires changes to the format or if it can be passed to multi-dimensional processing or whether the data can be change data capture or if the context is lost in translation.
When the data is placed, the developer would like to improve efficiency of access and update with the placement of data at the physical block or page level.
Another requirement is to manage the data in parallel. This will improve performance.
Another technological requirement is to have a solid metadata control. And yet another technological requirement is to have a rich language interface to the data warehouse.
Parallel storage lets the volume grow arbitrarily. The entire issue of parallel storage and management of data is complex. The performance improves with the inverse of number of physical devices over which the data is physically distributed.
The use of metadata becomes all the more important with the heuristic iterative development cycle of the data warehouse. The metadata must always represent the warehouse with uptodate and accurate information.  Metadata contains such things as :
table structures
table attributes
source data
mapping from source data to warehouse
data model
extract logging
common routines for access of data
definitions or descriptions of data
relationships to one unit of data or the other.
As with data, every technology in the business environment has its own metadata. Report writers, ODS environments and ETL all have their own metadata.
Similarly the language interface could tremendously improve the usage of data. The requirements from a language interface include the ability to access data a set at a time, a record at a time, use one or more indexes, have  a SQL interface, perform update, delete and insert etc.
Note that the use of CLR and the SQL interfaces satisfy these conditions and were developed to that end.

No comments:

Post a Comment