Operational data store
A subject-oriented system that is optimized for looking up one or two records at a time for decision making. An operational data store is a hybrid form of data warehouse that contains timely, current, integrated information. The data typically is of a higher level granularity than the transaction. You can use an operational data store for clerical, day-to-day decision making. This data can serve as the common source of data for data warehouses.
Wednesday, July 23, 2008
Data Mart
What is Data mart?
A subset of data warehouse that is stored in a smaller database and that is oriented toward a specific purpose or data subject rather than for enterprise-wide strategic planning. A data mart can contain operational data, summarized data, spatial data, or metadata. Typically, you use a dimensional data model to build a data mart.
The Star Schema
A good dimensional data warehouse has a simple database structure. Technically, a simple structure means faster queries. In a dimensional data warehouse, the implementation relational database has two types of tables, the fact table and the dimension table. The fact table consists of the facts (or measures) of the business. Dimension tables contain descriptions for querying the database.
The tables in a data warehouse are related such that the schema looks like a star, hence the term star schema

A star consists of a fact table surrounded by two or more dimension tables. A one-star structure has one fact table only. A multi-star structure has multiple fact tables, one for each star
dimension tables may be shared among multiple fact tables.
In Data warehouse the star structure, the snowflake structure could also be used However, it is more difficult to model than the star structure. As well, the snowflake structure is not that easy to understand and implement and the performance of its queries is slower than that of the star structure. These drawbacks make the snowflake structure unsuitable for dimensional data warehousing.
The Dimension, Fact table used term to suffix the table _dim, _fact.
Every Dimension table has one or more primary key which is reference to fact table. so based on this we can build query to retrieve the date what we required.
Above figure..
The customer_dim (customer dimensional table) is reference the Sales_order_fact table so that we can trace customer information. By using the Product_id in sales_order_fact table, we can retrieve the product information in production_dim table. By joining the sales_order_fact table and the date_dim table on the date_id column, you’ll get the order date. And by joining the fact table and the order_dim table on the order_id column, you obtain the order number.
The tables in a data warehouse are related such that the schema looks like a star, hence the term star schema
A star consists of a fact table surrounded by two or more dimension tables. A one-star structure has one fact table only. A multi-star structure has multiple fact tables, one for each star
dimension tables may be shared among multiple fact tables.
In Data warehouse the star structure, the snowflake structure could also be used However, it is more difficult to model than the star structure. As well, the snowflake structure is not that easy to understand and implement and the performance of its queries is slower than that of the star structure. These drawbacks make the snowflake structure unsuitable for dimensional data warehousing.
The Dimension, Fact table used term to suffix the table _dim, _fact.
Every Dimension table has one or more primary key which is reference to fact table. so based on this we can build query to retrieve the date what we required.
Above figure..
The customer_dim (customer dimensional table) is reference the Sales_order_fact table so that we can trace customer information. By using the Product_id in sales_order_fact table, we can retrieve the product information in production_dim table. By joining the sales_order_fact table and the date_dim table on the date_id column, you’ll get the order date. And by joining the fact table and the order_dim table on the order_id column, you obtain the order number.
Resolving Loops in a Universe
Define what a Loop is
Detect and recognize loops in Universe
Using Aliases and Contexts to resolve the loops
Identify when in the design process you resolve loops
Be able to use shortcut joins and know when to use them as alternative for resolving loops.
Be able to resolve recursive table structure loops.
…
A loop exists when the joins between tables form a continuous path
Three automated routines can be used to resolve most loop situations in the universe structure window.
1. Must be first Cardinality Detection
2. Alias Detection routine
3. Context Detection routine.
Click the Detect cardinalities or you can set manually also.
What is an Alias?
• An Alias is an exact duplicate of the original table with a new name. The data in the table is exactly the same.
• The Alias is used only to resolve the loop in the structure of the universe. There is no impact on the schema of the database
– Easy to define
– Easy to maintain
– Easy to use
• A loop with a single lookup table should be resolved by an alias
• A lookup table can be identified by its cardinality
• A lookup table only has the ‘one’ end of joins attached to it

Designer routines detect loops and candidates for aliases.
Break the loop by creating an alias of the lookup table for each side of the loop.
Some DBs require a separate alias for both sides of the loop.
After that you can detect loop once again,
Alias detection routine
You can use automatic loop detection also,
Click the Detect loops button, it will be identifies loops are exist or not
You can get the message in Loop detect window and it suggest to candidate contexts or aliases
If it is structure multi star situation it not possible using alias.

Normalized dimensions are not amenable to loop resolution using Aliases
• If you try to resolve the loop using Aliases:
And also we get duplicate of objects

This will be effective sales universe Rentals universe
In this case we can use to resolve with contexts.

after result look like this way.
Detecting and create context.

Double click the context in the list mode window
- give it context name
- if want write any description to update description this will be user
Detect and recognize loops in Universe
Using Aliases and Contexts to resolve the loops
Identify when in the design process you resolve loops
Be able to use shortcut joins and know when to use them as alternative for resolving loops.
Be able to resolve recursive table structure loops.
…
A loop exists when the joins between tables form a continuous path
Three automated routines can be used to resolve most loop situations in the universe structure window.
1. Must be first Cardinality Detection
2. Alias Detection routine
3. Context Detection routine.
Click the Detect cardinalities or you can set manually also.
What is an Alias?
• An Alias is an exact duplicate of the original table with a new name. The data in the table is exactly the same.
• The Alias is used only to resolve the loop in the structure of the universe. There is no impact on the schema of the database
– Easy to define
– Easy to maintain
– Easy to use
• A loop with a single lookup table should be resolved by an alias
• A lookup table can be identified by its cardinality
• A lookup table only has the ‘one’ end of joins attached to it
Designer routines detect loops and candidates for aliases.
Break the loop by creating an alias of the lookup table for each side of the loop.
Some DBs require a separate alias for both sides of the loop.
After that you can detect loop once again,
Alias detection routine
You can use automatic loop detection also,
Click the Detect loops button, it will be identifies loops are exist or not
You can get the message in Loop detect window and it suggest to candidate contexts or aliases
If it is structure multi star situation it not possible using alias.
Normalized dimensions are not amenable to loop resolution using Aliases
• If you try to resolve the loop using Aliases:
And also we get duplicate of objects
This will be effective sales universe Rentals universe
In this case we can use to resolve with contexts.
after result look like this way.
Detecting and create context.
Double click the context in the list mode window
- give it context name
- if want write any description to update description this will be user
Sunday, July 20, 2008
Data warehouse Concepts - Day 2 i
What is datawarehouse?
A data warehouse is built on relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine
Characteristics of a data warehouse
Subject Oriented
Integrated
Nonvolatile
Time Variant
Different Types of Data
Metadata
Dimension data
Fact Data
Aggregated Data
Data warehouse Architectures
Dimension Modeling
Identify a Business Process
Identify Grain
Identify elements which describe the process (Dimensions)
Identify elements which measure process(facts)10
Data warehouse Concepts - Day 2
dimension table:
A dimensional table is a collection of hierarchies and categories along which the user can drill down and drill up. it contains only the textual attributes.
LookUp table
A LookUp table is the one which is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it just updates the table by allowing only new records or updated records based on the lookup condition.
Aggregate table
Aggregate table contains the summary of existing warehouse data which is grouped to certain levels of dimensions.Retrieving the required data from the actual table, which have millions of records will take more time and also affects the server performance.To avoid this we can aggregate the table to certain required level and can use it.This tables reduces the load in the database server and increases the performance of the query and can retrieve the result very fastly
What is ODS?
1. ODS means Operational Data Store.
2. A collection of operation or bases data that is extracted from operation databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to assure summarized and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as there operation databases.
A dimensional table is a collection of hierarchies and categories along which the user can drill down and drill up. it contains only the textual attributes.
LookUp table
A LookUp table is the one which is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it just updates the table by allowing only new records or updated records based on the lookup condition.
Aggregate table
Aggregate table contains the summary of existing warehouse data which is grouped to certain levels of dimensions.Retrieving the required data from the actual table, which have millions of records will take more time and also affects the server performance.To avoid this we can aggregate the table to certain required level and can use it.This tables reduces the load in the database server and increases the performance of the query and can retrieve the result very fastly
What is ODS?
1. ODS means Operational Data Store.
2. A collection of operation or bases data that is extracted from operation databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to assure summarized and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as there operation databases.
Saturday, July 19, 2008
Data warehouse Concepts - Day 1
Data warehouse :
A Data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources". Another definition for data warehouse is: " A data warehouse is a logical collection of information gathered from many different operational databases used to create business intelligence that supports business analysis activities and decision-making tasks, primarily, a record of an enterprise's past transactional and operational information, stored in a database designed to favour efficient data analysis and reporting (especially OLAP)". Generally, data warehousing is not meant for current "live" data, although 'virtual' or 'point-to-point' data warehouses can access operational data. A 'real' data warehouse is generally preferred to a virtual DW because stored data has been validated and is set up to provide reliable results to common types of queries used in a business
Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated....This makes it much easier and more efficient to run queries over data that originally came from different sources.
Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently than traditional relational databases
Data Warehousing is concerned with the activities involved with creating a DataRepository where you store all the facts you know about some large enterprise or activity, and then analyse the data in different ways,to look for patterns on the data. Typically this is done by large retail organisations, trying to identify patterns in buying behaviour so they can sell more stuff to the poor unsuspecting customer.
A Data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources". Another definition for data warehouse is: " A data warehouse is a logical collection of information gathered from many different operational databases used to create business intelligence that supports business analysis activities and decision-making tasks, primarily, a record of an enterprise's past transactional and operational information, stored in a database designed to favour efficient data analysis and reporting (especially OLAP)". Generally, data warehousing is not meant for current "live" data, although 'virtual' or 'point-to-point' data warehouses can access operational data. A 'real' data warehouse is generally preferred to a virtual DW because stored data has been validated and is set up to provide reliable results to common types of queries used in a business
Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated....This makes it much easier and more efficient to run queries over data that originally came from different sources.
Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently than traditional relational databases
Data Warehousing is concerned with the activities involved with creating a DataRepository where you store all the facts you know about some large enterprise or activity, and then analyse the data in different ways,to look for patterns on the data. Typically this is done by large retail organisations, trying to identify patterns in buying behaviour so they can sell more stuff to the poor unsuspecting customer.
Subscribe to:
Posts (Atom)