Sunday, February 24, 2013

Integrated access to multiple data sources

Integrated access to multiple data sources
Large organizations typically have several databases and users may want to access data from more than one source. For example, an organization can have one data store for product catalog also called master data and another for billing and payments and yet another for reporting. These databases may contain some common information, determining the exact relationship between tables in different databases can get tough. For example, prices in one database might be dollars per dozen item and in another might be dollars per item. This is therefore typically avoided using XML DTDs which offer the promise that such semantic mismatches can be avoided if all parties conform to a single standard DTD. However, there are many legacy databases and most domains may not yet have an agreed-upon DTD. Semantic mismatches can be resolved and hidden from users by using relational views over the tables from the two databases. Defining a collection of views to give users a uniform presentation of relevant data from multiple databases is called semantic integration.  The task of defining these views for semantic integration can be challenging when there is little or no documentation for the existing databases. 
If the underlying databases are managed by different DBMS, some kind of middleware may be used to evaluate queries over integrating views, retrieving data over query execution time. Alternatively, the integrating views can be materialized and stored in a data warehouse. Queries can be run over the warehoused data instead of the source DBMS at run-time.

No comments:

Post a Comment