Tuesday, November 11, 2008

Oracle Learning - 13

Oracle/PLSQL: Set Transaction

There are three forms of the SET TRANSACTION statement. These are:
1. SET TRANSACTION READ ONLY;
2. SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
3. SET TRANSACTION USE ROLLBACK SEGMENT name;
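As a rough illustration of the first form, the sketch below runs two queries inside a read-only transaction so that both see the same snapshot of the data. The suppliers and orders tables are borrowed from other examples in these notes and are assumptions here. SET TRANSACTION has to be the first statement of a transaction, which is why a COMMIT comes first.

```sql
-- End any transaction already in progress, since SET TRANSACTION
-- must be the first statement of a new transaction.
COMMIT;

SET TRANSACTION READ ONLY;

-- Both queries see the data as of the start of the transaction.
SELECT COUNT(*) FROM suppliers;
SELECT COUNT(*) FROM orders;

-- End the read-only transaction.
COMMIT;
```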

Oracle/PLSQL: Lock Table

The syntax for a Lock table is:
LOCK TABLE tables IN lock_mode MODE [NOWAIT];
Tables is a comma-delimited list of tables.
Lock_mode is one of:
ROW SHARE
ROW EXCLUSIVE
SHARE UPDATE
SHARE
SHARE ROW EXCLUSIVE
EXCLUSIVE
NoWait specifies that the database should not wait for a lock to be released.
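For example, the statements below are a minimal sketch of the syntax in use. The suppliers and orders tables are assumed to exist; they are not part of the syntax itself.

```sql
-- Lock one table exclusively. With NOWAIT, the statement fails
-- immediately (ORA-00054) if another session already holds a
-- conflicting lock, instead of waiting for it.
LOCK TABLE suppliers IN EXCLUSIVE MODE NOWAIT;

-- Lock two tables in the same mode, waiting if necessary.
LOCK TABLE suppliers, orders IN SHARE MODE;

-- Table locks are released when the transaction ends.
COMMIT;
```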
Oracle/PLSQL Topics: Cursors

A cursor is a mechanism by which you can assign a name to a "select statement" and manipulate the information within that SQL statement.
A cursor is a SELECT statement that is defined within the declaration section of your PLSQL code. We'll take a look at three different syntaxes for cursors.
Cursor without parameters (simplest)
The basic syntax for a cursor without parameters is:
CURSOR cursor_name
IS
SELECT_statement;

For example, you could define a cursor called c1 as below.
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;
The result set of this cursor is all course_numbers whose course_name matches the variable called name_in.

Below is a function that uses this cursor.
CREATE OR REPLACE Function FindCourse
( name_in IN varchar2 )
RETURN number
IS
cnumber number;
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;

BEGIN
open c1;
fetch c1 into cnumber;

if c1%notfound then
cnumber := 9999;
end if;

close c1;
RETURN cnumber;
END;

Cursor with parameters
The basic syntax for a cursor with parameters is:
CURSOR cursor_name (parameter_list)
IS
SELECT_statement;

For example, you could define a cursor called c2 as below.
CURSOR c2 (subject_id_in IN varchar2)
IS
SELECT course_number
from courses_tbl
where subject_id = subject_id_in;
The result set of this cursor is all course_numbers whose subject_id matches the subject_id passed to the cursor via the parameter.
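The value for the parameter is supplied when the cursor is opened. The anonymous block below is a small sketch of that step; the subject_id value 'MATH' is made up for illustration, and courses_tbl is assumed to have the columns used in the example above.

```sql
DECLARE
   cnumber courses_tbl.course_number%TYPE;

   CURSOR c2 (subject_id_in IN varchar2)
   IS
      SELECT course_number
      FROM courses_tbl
      WHERE subject_id = subject_id_in;
BEGIN
   -- The argument is passed in the OPEN statement.
   OPEN c2('MATH');        -- 'MATH' is a hypothetical subject_id
   FETCH c2 INTO cnumber;  -- first matching course_number, if any
   CLOSE c2;
END;
/
```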

Cursor with return clause
The basic syntax for a cursor with a return clause is:
CURSOR cursor_name
RETURN field%ROWTYPE
IS
SELECT_statement;
For example, you could define a cursor called c3 as below.
CURSOR c3
RETURN courses_tbl%ROWTYPE
IS
SELECT *
from courses_tbl
where subject = 'Mathematics';
The result set of this cursor is all columns from courses_tbl where the subject is Mathematics.

Oracle/PLSQL: OPEN Statement

Once you've declared your cursor, the next step is to open the cursor.
The basic syntax to OPEN the cursor is:
OPEN cursor_name;

For example, you could open a cursor called c1 with the following command:
OPEN c1;

Below is a function that demonstrates how to use the OPEN statement:
CREATE OR REPLACE Function FindCourse
( name_in IN varchar2 )
RETURN number
IS
cnumber number;
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;

BEGIN
open c1;
fetch c1 into cnumber;

if c1%notfound then
cnumber := 9999;
end if;

close c1;
RETURN cnumber;
END;

Oracle/PLSQL: FETCH Statement

The purpose of using a cursor, in most cases, is to retrieve the rows from your cursor so that some type of operation can be performed on the data. After declaring and opening your cursor, the next step is to FETCH the rows from your cursor.
The basic syntax for a FETCH statement is:
FETCH cursor_name INTO variable_list;

For example, you could have a cursor defined as:
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;
The command that would be used to fetch the data from this cursor is:
FETCH c1 into cnumber;
This would fetch the first course_number into the variable called cnumber.
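Each FETCH returns one row, so retrieving every row means fetching in a loop until the cursor reports that nothing is left. The block below is a sketch of that pattern; the course name 'Algebra' is a made-up value, and the output is only visible if server output is enabled in your client (SET SERVEROUTPUT ON in SQL*Plus).

```sql
DECLARE
   name_in varchar2(50) := 'Algebra';   -- hypothetical course name
   cnumber number;

   CURSOR c1
   IS
      SELECT course_number
      FROM courses_tbl
      WHERE course_name = name_in;
BEGIN
   OPEN c1;
   LOOP
      FETCH c1 INTO cnumber;
      -- %NOTFOUND becomes TRUE when the last FETCH returned no row.
      EXIT WHEN c1%NOTFOUND;
      DBMS_OUTPUT.PUT_LINE(cnumber);
   END LOOP;
   CLOSE c1;
END;
/
```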

Below is a function that demonstrates how to use the FETCH statement.
CREATE OR REPLACE Function FindCourse
( name_in IN varchar2 )
RETURN number
IS
cnumber number;
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;

BEGIN
open c1;
fetch c1 into cnumber;

if c1%notfound then
cnumber := 9999;
end if;

close c1;
RETURN cnumber;
END;

Oracle/PLSQL: CLOSE Statement

The final step of working with cursors is to close the cursor once you have finished using it.
The basic syntax to CLOSE the cursor is:
CLOSE cursor_name;

For example, you could close a cursor called c1 with the following command:
CLOSE c1;

Below is a function that demonstrates how to use the CLOSE statement:
CREATE OR REPLACE Function FindCourse
( name_in IN varchar2 )
RETURN number
IS
cnumber number;
CURSOR c1
IS
SELECT course_number
from courses_tbl
where course_name = name_in;

BEGIN
open c1;
fetch c1 into cnumber;

if c1%notfound then
cnumber := 9999;
end if;

close c1;
RETURN cnumber;
END;

Oracle Learning - 12

Oracle/PLSQL: Sequences (Autonumber)

In Oracle, you can create an autonumber field by using sequences. A sequence is an object in Oracle that is used to generate a number sequence. This can be useful when you need to create a unique number to act as a primary key.
The syntax for a sequence is:
CREATE SEQUENCE sequence_name
MINVALUE value
MAXVALUE value
START WITH value
INCREMENT BY value
CACHE value;
For example:
CREATE SEQUENCE supplier_seq
MINVALUE 1
MAXVALUE 999999999999999999999999999
START WITH 1
INCREMENT BY 1
CACHE 20;
This would create a sequence object called supplier_seq. The first sequence number that it would use is 1 and each subsequent number would increment by 1 (i.e., 2, 3, 4, ...). It will cache up to 20 values for performance.
If you omit the MAXVALUE option, your sequence will automatically default to:
MAXVALUE 999999999999999999999999999
So you can simplify your CREATE SEQUENCE command as follows:
CREATE SEQUENCE supplier_seq
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 20;
Now that you've created a sequence object to simulate an autonumber field, we'll cover how to retrieve a value from this sequence object. To retrieve the next value in the sequence order, you need to use nextval.
For example:
supplier_seq.nextval
This would retrieve the next value from supplier_seq. The nextval statement needs to be used in an SQL statement. For example:
INSERT INTO suppliers
(supplier_id, supplier_name)
VALUES
(supplier_seq.nextval, 'Kraft Foods');
This insert statement would insert a new record into the suppliers table. The supplier_id field would be assigned the next number from the supplier_seq sequence. The supplier_name field would be set to Kraft Foods.
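If you only want to look at the values a sequence produces, you can select them from the dual table. The sketch below also uses currval, which is not covered above: it returns the value most recently generated by nextval in the current session.

```sql
-- Generate the next value of the sequence and return it.
SELECT supplier_seq.nextval FROM dual;

-- Return the value generated by the last nextval call in this session.
-- This is only valid after nextval has been called at least once
-- in the session.
SELECT supplier_seq.currval FROM dual;
```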

Frequently Asked Questions

One common question about sequences is:
Question: While creating a sequence, what do the cache and nocache options mean? For example, you could create a sequence with a cache of 20 as follows:
CREATE SEQUENCE supplier_seq
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 20;

Or you could create the same sequence with the nocache option:
CREATE SEQUENCE supplier_seq
MINVALUE 1
START WITH 1
INCREMENT BY 1
NOCACHE;

Answer: With respect to a sequence, the cache option specifies how many sequence values will be stored in memory for faster access.
The downside of creating a sequence with a cache is that if a system failure occurs, all cached sequence values that have not been used will be "lost". This results in a "gap" in the assigned sequence values. When the system comes back up, Oracle will cache new numbers from where it left off in the sequence, ignoring the so-called "lost" sequence values.
Note: To recover the lost sequence values, you can always execute an ALTER SEQUENCE command to reset the counter to the correct value.
Nocache means that none of the sequence values are stored in memory. This option may sacrifice some performance; however, you should not encounter gaps caused by lost cached values. Gaps can still occur for other reasons, such as a rolled-back transaction, because a sequence number is not returned to the sequence once it has been generated.


Question: How do we set the LASTVALUE value in an Oracle Sequence?
Answer: You can change the LASTVALUE for an Oracle sequence, by executing an ALTER SEQUENCE command.
For example, suppose the last value used by the Oracle sequence was 100 and you would like to reset the sequence to serve 225 as the next value. You would execute the following commands:
alter sequence seq_name
increment by 124;
select seq_name.nextval from dual;
alter sequence seq_name
increment by 1;
Now, the next value to be served by the sequence will be 225.

Oracle/PLSQL: Commit

The syntax for the COMMIT statement is:
COMMIT [WORK] [COMMENT text];
The COMMIT statement commits all changes made in the current transaction. Once a commit is issued, other users will be able to see your changes.
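As a small sketch, the statements below insert a row and then make it permanent. The suppliers table and the supplier_seq sequence are the ones used in the sequence examples and are assumed to exist.

```sql
INSERT INTO suppliers
(supplier_id, supplier_name)
VALUES
(supplier_seq.nextval, 'Kraft Foods');

-- Up to this point, only the current session can see the new row.
COMMIT;
-- The row is now permanent and visible to other sessions.
```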

Oracle Learning - 11

Oracle/PLSQL: FOR Loop

The syntax for the FOR Loop is:
FOR loop_counter IN [REVERSE] lowest_number..highest_number
LOOP
{.statements.}
END LOOP;
You would use a FOR Loop when you want to execute the loop body a fixed number of times.

Let's take a look at an example.
FOR Lcntr IN 1..20
LOOP
LCalc := Lcntr * 31;
END LOOP;
This example will loop 20 times. The counter will start at 1 and end at 20.

The FOR Loop can also loop in reverse. For example:
FOR Lcntr IN REVERSE 1..15
LOOP
LCalc := Lcntr * 31;
END LOOP;
This example will loop 15 times. The counter will start at 15 and end at 1. (loops backwards)
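To watch the counter change, you can wrap the loop in an anonymous block and print each value. This is only a sketch: the range is shortened to 1..3 to keep the output small, and the output appears only if server output is enabled in your client (SET SERVEROUTPUT ON in SQL*Plus).

```sql
DECLARE
   LCalc number;
BEGIN
   FOR Lcntr IN REVERSE 1..3
   LOOP
      LCalc := Lcntr * 31;
      DBMS_OUTPUT.PUT_LINE(Lcntr || ' -> ' || LCalc);
   END LOOP;
END;
/
-- Prints:
-- 3 -> 93
-- 2 -> 62
-- 1 -> 31
```

Note that the loop counter is declared implicitly by the FOR statement, so Lcntr does not appear in the DECLARE section.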

Oracle/PLSQL: CURSOR FOR Loop

The syntax for the CURSOR FOR Loop is:
FOR record_index in cursor_name
LOOP
{.statements.}
END LOOP;
You would use a CURSOR FOR Loop when you want to fetch and process every record in a cursor. The CURSOR FOR Loop will terminate when all of the records in the cursor have been fetched.
Here is an example of a function that uses a CURSOR FOR Loop:
CREATE OR REPLACE Function TotalIncome
( name_in IN varchar2 )
RETURN number
IS
total_val number(6);

cursor c1 is
select monthly_income
from employees
where name = name_in;

BEGIN
total_val := 0;
FOR employee_rec in c1
LOOP
total_val := total_val + employee_rec.monthly_income;
END LOOP;

RETURN total_val;
END;
In this example, we've created a cursor called c1. The CURSOR FOR Loop will terminate after all records have been fetched from the cursor c1.
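A CURSOR FOR Loop can also take the SELECT statement directly, without a separately declared cursor. The block below is a sketch of that variant; the employee name 'Smith' is made up, and the employees table is assumed to look like the one in the function above.

```sql
DECLARE
   total_val number := 0;
BEGIN
   -- The query is written inside the FOR statement, so no CURSOR
   -- declaration, OPEN, FETCH, or CLOSE is needed.
   FOR employee_rec IN (SELECT monthly_income
                        FROM employees
                        WHERE name = 'Smith')   -- hypothetical name
   LOOP
      total_val := total_val + employee_rec.monthly_income;
   END LOOP;

   DBMS_OUTPUT.PUT_LINE('Total: ' || total_val);
END;
/
```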
Oracle/PLSQL: While Loop

The syntax for the WHILE Loop is:
WHILE condition
LOOP
{.statements.}
END LOOP;
You would use a WHILE Loop when you are not sure how many times you will execute the loop body. Since the WHILE condition is evaluated before entering the loop, it is possible that the loop body may not execute even once.
Let's take a look at an example:
WHILE monthly_value <= 4000
LOOP
monthly_value := daily_value * 31;
END LOOP;
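In this example, the WHILE Loop would terminate once the monthly_value exceeded 4000. Note that daily_value is not changed inside the loop body, so real code would also need to update it there; otherwise the loop either ends after at most one pass or never ends.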
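A variant that is sure to finish might look like the sketch below, which raises daily_value on every pass. The starting value of 100 and the step of 10 are arbitrary choices for illustration.

```sql
DECLARE
   daily_value   number := 100;   -- arbitrary starting value
   monthly_value number := 0;
BEGIN
   WHILE monthly_value <= 4000
   LOOP
      daily_value   := daily_value + 10;
      monthly_value := daily_value * 31;
   END LOOP;

   -- Passes: 110*31 = 3410, 120*31 = 3720, 130*31 = 4030 (loop ends).
   DBMS_OUTPUT.PUT_LINE(daily_value || ' -> ' || monthly_value);
END;
/
-- Prints: 130 -> 4030
```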

Oracle/PLSQL: Repeat Until Loop

Oracle doesn't have a Repeat Until loop, but you can emulate one. The syntax for emulating a REPEAT UNTIL Loop is:
LOOP
{.statements.}
EXIT WHEN boolean_condition;
END LOOP;
You would use an emulated REPEAT UNTIL Loop when you do not know how many times you want the loop body to execute. The REPEAT UNTIL Loop would terminate when a certain condition was met.
Let's take a look at an example:
LOOP
monthly_value := daily_value * 31;
EXIT WHEN monthly_value > 4000;
END LOOP;
In this example, the LOOP would repeat until the monthly_value exceeded 4000.

Oracle/PLSQL: Exit Statement

The syntax for the EXIT statement is:
EXIT [WHEN boolean_condition];
The EXIT statement is most commonly used to terminate LOOP statements.
Let's take a look at an example:
LOOP
monthly_value := daily_value * 31;
EXIT WHEN monthly_value > 4000;
END LOOP;
In this example, the LOOP would terminate when the monthly_value exceeded 4000.

Oracle Learning - 10

Oracle/PLSQL: Case Statement

Starting in Oracle 9i, you can use the case statement within an SQL statement. It has the functionality of an IF-THEN-ELSE statement.
The syntax for the case statement is:
CASE expression
WHEN condition_1 THEN result_1
WHEN condition_2 THEN result_2
...
WHEN condition_n THEN result_n
ELSE result END
expression is the value that you are comparing to the list of conditions. (ie: condition_1, condition_2, ... condition_n)
condition_1 to condition_n must all be the same datatype. Conditions are evaluated in the order listed. Once a condition is found to be true, the case statement will return the result and not evaluate the conditions any further.
result_1 to result_n must all be the same datatype. This is the value returned once a condition is found to be true.

Note:
If no condition is found to be true, then the case statement will return the value in the ELSE clause.
If the ELSE clause is omitted and no condition is found to be true, then the case statement will return NULL.
You can have up to 255 comparisons in a case statement. Each WHEN ... THEN clause is considered 2 comparisons.

For Example:
You could use the case statement in an SQL statement as follows:
select table_name,
CASE owner
WHEN 'SYS' THEN 'The owner is SYS'
WHEN 'SYSTEM' THEN 'The owner is SYSTEM'
ELSE 'The owner is another value' END
from all_tables;

The above case statement is equivalent to the following IF-THEN-ELSE statement:
IF owner = 'SYS' THEN
result := 'The owner is SYS';
ELSIF owner = 'SYSTEM' THEN
result := 'The owner is SYSTEM';
ELSE
result := 'The owner is another value';
END IF;

The case statement will compare each owner value, one by one.

One thing to note is that the ELSE clause within the case statement is optional. You could have omitted it. Let's take a look at the SQL statement above with the ELSE clause omitted.
Your SQL statement would look as follows:
select table_name,
CASE owner
WHEN 'SYS' THEN 'The owner is SYS'
WHEN 'SYSTEM' THEN 'The owner is SYSTEM' END
from all_tables;
With the ELSE clause omitted, if no condition was found to be true, the case statement would return NULL.
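The syntax shown above compares one expression against a list of values. Oracle also accepts a second form, often called a searched case, where each WHEN carries its own full condition. It is not covered in the text above; the query below is a sketch of the same example written that way, with owner_note as a made-up column alias.

```sql
select table_name,
       CASE
          WHEN owner = 'SYS' THEN 'The owner is SYS'
          WHEN owner = 'SYSTEM' THEN 'The owner is SYSTEM'
          ELSE 'The owner is another value'
       END AS owner_note
from all_tables;
```

This form is useful when the conditions are not simple equality tests on a single column, for example range checks.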

Oracle/PLSQL: GOTO Statement

The GOTO statement causes the code to branch to the label named in the GOTO statement.
For example:
GOTO label_name;

Then later in the code, you would place your label and the code associated with that label.
<<label_name>>
{statements}
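In PL/SQL a label is written between double angle brackets and must be followed by an executable statement. The block below is a small sketch that jumps out of a loop to a label; the label name done is made up, and the output appears only if server output is enabled (SET SERVEROUTPUT ON in SQL*Plus).

```sql
BEGIN
   FOR i IN 1..5
   LOOP
      IF i = 3 THEN
         GOTO done;      -- leave the loop and continue at the label
      END IF;
      DBMS_OUTPUT.PUT_LINE(i);
   END LOOP;

   <<done>>
   DBMS_OUTPUT.PUT_LINE('Finished');
END;
/
-- Prints:
-- 1
-- 2
-- Finished
```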

Oracle/PLSQL: Loop Statement

The syntax for the LOOP statement is:
LOOP
{.statements.}
END LOOP;
You would use a LOOP statement when you are not sure how many times you want the loop body to execute and you want the loop body to execute at least once.
The LOOP statement is terminated when it encounters either an EXIT statement or when it encounters an EXIT WHEN statement that evaluated to TRUE.
Let's take a look at an example:
LOOP
monthly_value := daily_value * 31;
EXIT WHEN monthly_value > 4000;
END LOOP;
In this example, the LOOP would terminate when the monthly_value exceeded 4000.

Oracle Learning - 9

Oracle/PLSQL: IS NULL

In many other languages, you test for a null value with the = null syntax. In PLSQL, however, a comparison with = NULL never evaluates to TRUE, so to check whether a value is null you must use the "IS NULL" syntax.
For example,
IF Lvalue IS NULL then
{...statements...}
END IF;
If Lvalue contains a null value, the "IF" expression will evaluate to TRUE.

You can also use "IS NULL" in an SQL statement. For example:
select * from suppliers
where supplier_name IS NULL;
This will return all records from the suppliers table where the supplier_name contains a null value.
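The difference between = NULL and IS NULL is easy to see in a short block. This is only a sketch: Lvalue is declared without a value, so it starts out null, and the output appears only if server output is enabled (SET SERVEROUTPUT ON in SQL*Plus).

```sql
DECLARE
   Lvalue varchar2(10);   -- never assigned, so it is null
BEGIN
   IF Lvalue = NULL THEN
      -- A comparison with NULL yields NULL, not TRUE,
      -- so this branch is never taken.
      DBMS_OUTPUT.PUT_LINE('matched with = NULL');
   ELSIF Lvalue IS NULL THEN
      DBMS_OUTPUT.PUT_LINE('matched with IS NULL');
   END IF;
END;
/
-- Prints: matched with IS NULL
```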

Oracle/PLSQL: IS NOT NULL

In other languages, a not null value is found using the != null syntax. However in PLSQL to check if a value is not null, you must use the "IS NOT NULL" syntax.
For example,
IF Lvalue IS NOT NULL then
{...statements...}
END IF;
If Lvalue does not contain a null value, the "IF" expression will evaluate to TRUE.

You can also use "IS NOT NULL" in an SQL statement. For example:
select * from suppliers
where supplier_name IS NOT NULL;
This will return all records from the suppliers table where the supplier_name does not contain a null value.

Oracle/PLSQL: IF-THEN-ELSE Statement

There are three different syntaxes for these types of statements.
Syntax #1: IF-THEN
IF condition THEN
{...statements...}
END IF;

Syntax #2: IF-THEN-ELSE
IF condition THEN
{...statements...}
ELSE
{...statements...}
END IF;

Syntax #3: IF-THEN-ELSIF
IF condition THEN
{...statements...}
ELSIF condition THEN
{...statements...}
ELSE
{...statements...}
END IF;

Here is an example of a function that uses the IF-THEN-ELSE statement:
CREATE OR REPLACE Function IncomeLevel
( name_in IN varchar2 )
RETURN varchar2
IS
monthly_value number(6);
ILevel varchar2(20);
cursor c1 is
select monthly_income
from employees
where name = name_in;
BEGIN
open c1;
fetch c1 into monthly_value;
close c1;
IF monthly_value <= 4000 THEN
ILevel := 'Low Income';
ELSIF monthly_value > 4000 and monthly_value <= 7000 THEN
ILevel := 'Avg Income';
ELSIF monthly_value > 7000 and monthly_value <= 15000 THEN
ILevel := 'Moderate Income';
ELSE
ILevel := 'High Income';
END IF;
RETURN ILevel;
END;
In this example, we've created a function called IncomeLevel. It has one parameter called name_in and it returns a varchar2. The function will return the income level based on the employee's name.
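One detail worth checking in a function like this is what happens when no employee matches. The FETCH then leaves monthly_value null, every comparison against it evaluates to null rather than TRUE, and control falls through to the ELSE branch. The block below is a sketch of one way to guard against that; the 'Unknown' and 'Higher Income' labels are made up for illustration.

```sql
DECLARE
   monthly_value number(6);   -- left null, as after a FETCH that found no row
   ILevel        varchar2(20);
BEGIN
   IF monthly_value IS NULL THEN
      ILevel := 'Unknown';            -- hypothetical label for "no data"
   ELSIF monthly_value <= 4000 THEN
      ILevel := 'Low Income';
   ELSE
      ILevel := 'Higher Income';      -- stands in for the remaining levels
   END IF;

   DBMS_OUTPUT.PUT_LINE(ILevel);
END;
/
-- Prints: Unknown
```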

Oracle Learning - 8

SQL: CREATE Table

The basic syntax for a CREATE TABLE is:
CREATE TABLE table_name
(column1 datatype null/not null,
column2 datatype null/not null,
...
);
Each column must have a datatype. The column should either be defined as "null" or "not null" and if this value is left blank, the database assumes "null" as the default.

For example:

CREATE TABLE supplier
( supplier_id numeric(10) not null,
supplier_name varchar2(50) not null,
contact_name varchar2(50)
);

SQL: CREATE Table from another table

You can also create a table from an existing table by copying the existing table's columns.
It is important to note that when creating a table in this way, the new table will be populated with the records from the existing table (based on the SELECT Statement).

Syntax #1 - Copying all columns from another table
The basic syntax is:
CREATE TABLE new_table
AS (SELECT * FROM old_table);

For example:
CREATE TABLE suppliers
AS (SELECT *
FROM companies
WHERE id > 1000);
This would create a new table called suppliers that included all columns from the companies table.
If there were records in the companies table, then the new suppliers table would also contain the records selected by the SELECT statement.

Syntax #2 - Copying selected columns from another table
The basic syntax is:
CREATE TABLE new_table
AS (SELECT column_1, column2, ... column_n FROM old_table);

For example:
CREATE TABLE suppliers
AS (SELECT id, address, city, state, zip
FROM companies
WHERE id > 1000);
This would create a new table called suppliers, but the new table would only include the specified columns from the companies table.
Again, if there were records in the companies table, then the new suppliers table would also contain the records selected by the SELECT statement.

Syntax #3 - Copying selected columns from multiple tables
The basic syntax is:
CREATE TABLE new_table
AS (SELECT column_1, column2, ... column_n
FROM old_table_1, old_table_2, ... old_table_n);


For example:
CREATE TABLE suppliers
AS (SELECT companies.id, companies.address, categories.cat_type
FROM companies, categories
WHERE companies.id = categories.id
AND companies.id > 1000);
This would create a new table called suppliers based on columns from both the companies and categories tables.
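Because the new table receives whatever rows the SELECT returns, a query that returns no rows copies only the column definitions. The statement below is a sketch of that idea; the table name suppliers_empty is made up, and the condition 1 = 2 is simply a predicate that is never true.

```sql
-- Copies the columns of companies but none of its rows.
CREATE TABLE suppliers_empty
AS (SELECT *
    FROM companies
    WHERE 1 = 2);
```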

SQL: ALTER Table

The ALTER TABLE command allows you to add, modify, or drop a column from an existing table.

Adding column(s) to a table
Syntax #1
To add a column to an existing table, the ALTER TABLE syntax is:
ALTER TABLE table_name
ADD column_name column-definition;
For example:
ALTER TABLE supplier
ADD supplier_name varchar2(50);
This will add a column called supplier_name to the supplier table.

Syntax #2
To add multiple columns to an existing table, the ALTER TABLE syntax is:

ALTER TABLE table_name
ADD ( column_1 column-definition,
column_2 column-definition,
...
column_n column-definition );
For example:

ALTER TABLE supplier
ADD ( supplier_name varchar2(50),
city varchar2(45) );
This will add two columns (supplier_name and city) to the supplier table.

Modifying column(s) in a table
Syntax #1
To modify a column in an existing table, the ALTER TABLE syntax is:
ALTER TABLE table_name
MODIFY column_name column_type;
For example:
ALTER TABLE supplier
MODIFY supplier_name varchar2(100) not null;
This will modify the column called supplier_name to be a data type of varchar2(100) and force the column to not allow null values.

Syntax #2
To modify multiple columns in an existing table, the ALTER TABLE syntax is:

ALTER TABLE table_name
MODIFY ( column_1 column_type,
column_2 column_type,
...
column_n column_type );
For example:

ALTER TABLE supplier
MODIFY ( supplier_name varchar2(100) not null,
city varchar2(75) );
This will modify both the supplier_name and city columns.

Drop column(s) in a table
Syntax #1
To drop a column in an existing table, the ALTER TABLE syntax is:
ALTER TABLE table_name
DROP COLUMN column_name;
For example:
ALTER TABLE supplier
DROP COLUMN supplier_name;
This will drop the column called supplier_name from the table called supplier.

Rename column(s) in a table
(NEW in Oracle 9i Release 2)
Syntax #1
Starting in Oracle 9i Release 2, you can now rename a column.
To rename a column in an existing table, the ALTER TABLE syntax is:
ALTER TABLE table_name
RENAME COLUMN old_name to new_name;
For example:
ALTER TABLE supplier
RENAME COLUMN supplier_name to sname;
This will rename the column called supplier_name to sname.

SQL: DROP Table

The basic syntax for a DROP TABLE is:
DROP TABLE table_name;

For example:
DROP TABLE supplier;
This would drop the table called supplier.


SQL: Global Temporary tables

Global temporary tables are distinct within SQL sessions.
The basic syntax is:
CREATE GLOBAL TEMPORARY TABLE table_name ( ...);

For example:

CREATE GLOBAL TEMPORARY TABLE supplier
( supplier_id numeric(10) not null,
supplier_name varchar2(50) not null,
contact_name varchar2(50)
);
This would create a global temporary table called supplier.
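In Oracle, the rows of a global temporary table are removed at the end of each transaction unless you say otherwise. The ON COMMIT clause, which is not shown in the syntax above, controls this. The statement below is a sketch with a made-up table name, supplier_work.

```sql
CREATE GLOBAL TEMPORARY TABLE supplier_work
( supplier_id numeric(10) not null,
  supplier_name varchar2(50) not null
)
-- Keep the rows until the session ends.
-- The default, ON COMMIT DELETE ROWS, empties the table at each commit.
ON COMMIT PRESERVE ROWS;
```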

SQL: Local Temporary tables

Local temporary tables are distinct within modules and embedded SQL programs within SQL sessions.
The basic syntax is:
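DECLARE LOCAL TEMPORARY TABLE table_name ( ...);
Note: This is the form defined by the SQL standard. Oracle itself does not support the DECLARE LOCAL TEMPORARY TABLE statement.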

SQL: VIEWS

A view is, in essence, a virtual table. It does not physically exist. Rather, it is created by a query joining one or more tables.
The syntax for a VIEW is:
CREATE VIEW view_name AS
SELECT columns
FROM table
WHERE predicates;


For example:
CREATE VIEW sup_orders AS
SELECT supplier.supplier_id, orders.quantity, orders.price
FROM supplier, orders
WHERE supplier.supplier_id = orders.supplier_id
and supplier.supplier_name = 'IBM';
This would create a virtual table based on the result set of the select statement. You can now query the view as follows:
SELECT *
FROM sup_orders;

Frequently Asked Questions

Question: Can you update the data in a view?
Answer: A view is created by joining one or more tables. When you update record(s) in a view, it updates the records in the underlying tables that make up the view.
So, yes, you can update the data in a view providing you have the proper privileges to the underlying tables.
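As a sketch, the statements below create a view over the single supplier table and then update a row through it. The view name ibm_suppliers and the contact name are made up, and the supplier_id value is the one used in the INSERT example elsewhere in these notes. Views over a single table like this one are the simplest case; views built on joins or aggregates come with more restrictions.

```sql
CREATE VIEW ibm_suppliers AS
SELECT supplier_id, supplier_name, contact_name
FROM supplier
WHERE supplier_name = 'IBM';

-- The change is applied to the underlying supplier table.
UPDATE ibm_suppliers
SET contact_name = 'J. Smith'      -- hypothetical value
WHERE supplier_id = 24553;
```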

Oracle Learning - 7

SQL: INTERSECT Query

The INTERSECT query allows you to return the results of 2 or more "select" queries. However, it only returns the rows selected by all queries. If a record exists in one query and not in the other, it will be omitted from the INTERSECT results.
Each SQL statement within the INTERSECT query must have the same number of fields in the result sets with similar data types.
The syntax for an INTERSECT query is:
select field1, field2, ... field_n
from tables
INTERSECT
select field1, field2, ... field_n
from tables;
from tables;

Example #1
The following is an example of an INTERSECT query:
select supplier_id
from suppliers
INTERSECT
select supplier_id
from orders;
In this example, if a supplier_id appeared in both the suppliers and orders table, it would appear in your result set.

Example #2 - With ORDER BY Clause
The following is an INTERSECT query that uses an ORDER BY clause:
select supplier_id, supplier_name
from suppliers
where supplier_id > 2000
INTERSECT
select company_id, company_name
from companies
where company_id > 1000
ORDER BY 2;
Since the column names are different between the two "select" statements, it is more advantageous to reference the columns in the ORDER BY clause by their position in the result set. In this example, we've sorted the results by supplier_name / company_name in ascending order, as denoted by the "ORDER BY 2".
The supplier_name / company_name fields are in position #2 in the result set.
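To try INTERSECT without creating any tables, you can build two small result sets from the dual table. This is only a sketch; the names a, b, and id are made up for illustration.

```sql
WITH a AS (SELECT 1 AS id FROM dual UNION ALL
           SELECT 2 FROM dual UNION ALL
           SELECT 3 FROM dual),
     b AS (SELECT 2 AS id FROM dual UNION ALL
           SELECT 3 FROM dual UNION ALL
           SELECT 4 FROM dual)
SELECT id FROM a
INTERSECT
SELECT id FROM b;
-- Returns the ids 2 and 3, the only values present in both sets.
```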

SQL: MINUS Query

The MINUS query returns all rows in the first query that are not returned in the second query.
Each SQL statement within the MINUS query must have the same number of fields in the result sets with similar data types.
The syntax for a MINUS query is:
select field1, field2, ... field_n
from tables
MINUS
select field1, field2, ... field_n
from tables;

Example #1
The following is an example of a MINUS query:
select supplier_id
from suppliers
MINUS
select supplier_id
from orders;
In this example, the SQL would return all supplier_id values that are in the suppliers table and not in the orders table. What this means is that if a supplier_id value existed in the suppliers table and also existed in the orders table, the supplier_id value would not appear in this result set.

Example #2 - With ORDER BY Clause
The following is a MINUS query that uses an ORDER BY clause:
select supplier_id, supplier_name
from suppliers
where supplier_id > 2000
MINUS
select company_id, company_name
from companies
where company_id > 1000
ORDER BY 2;
Since the column names are different between the two "select" statements, it is more advantageous to reference the columns in the ORDER BY clause by their position in the result set. In this example, we've sorted the results by supplier_name / company_name in ascending order, as denoted by the "ORDER BY 2".
The supplier_name / company_name fields are in position #2 in the result set.
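A quick way to see what MINUS does is to run it on two small result sets built from the dual table. This is only a sketch; the names a, b, and id are made up for illustration.

```sql
WITH a AS (SELECT 1 AS id FROM dual UNION ALL
           SELECT 2 FROM dual UNION ALL
           SELECT 3 FROM dual),
     b AS (SELECT 2 AS id FROM dual UNION ALL
           SELECT 3 FROM dual UNION ALL
           SELECT 4 FROM dual)
SELECT id FROM a
MINUS
SELECT id FROM b;
-- Returns the id 1, the only value in a that is not in b.
```

Swapping the two queries changes the result: b MINUS a would return 4 instead.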
SQL: UPDATE Statement

The UPDATE statement allows you to update a single record or multiple records in a table.
The syntax for the UPDATE statement is:
UPDATE table
SET column = expression
WHERE predicates;

Example #1 - Simple example
Let's take a look at a very simple example.
UPDATE supplier
SET name = 'HP'
WHERE name = 'IBM';
This statement would update all supplier names in the supplier table from IBM to HP.

Example #2 - More complex example
You can also perform more complicated updates.
You may wish to update records in one table based on values in another table. Since you can't list more than one table in the UPDATE statement, you can use the EXISTS clause.
For example:

UPDATE supplier
SET supplier_name = ( SELECT customers.name
FROM customers
WHERE customers.customer_id = supplier.supplier_id)
WHERE EXISTS
( SELECT customers.name
FROM customers
WHERE customers.customer_id = supplier.supplier_id);
Whenever a supplier_id matched a customer_id value, the supplier_name would be overwritten to the customer name from the customers table.

SQL: INSERT Statement

The INSERT statement allows you to insert a single record or multiple records into a table.
The syntax for the INSERT statement is:
INSERT INTO table
(column-1, column-2, ... column-n)
VALUES
(value-1, value-2, ... value-n);

Example #1 - Simple example
Let's take a look at a very simple example.
INSERT INTO supplier
(supplier_id, supplier_name)
VALUES
(24553, 'IBM');
This would result in one record being inserted into the supplier table. This new record would have a supplier_id of 24553 and a supplier_name of IBM.

Example #2 - More complex example
You can also perform more complicated inserts using sub-selects.
For example:
INSERT INTO supplier
(supplier_id, supplier_name)
SELECT account_no, name
FROM customers
WHERE city = 'Newark';
By placing a "select" in the insert statement, you can perform multiple inserts quickly.
With this type of insert, you may wish to check for the number of rows being inserted. You can determine the number of rows that will be inserted by running the following SQL statement before performing the insert.
SELECT count(*)
FROM customers
WHERE city = 'Newark';

Frequently Asked Questions

Question: I am setting up a database with clients. I know that you use the "insert" statement to insert information in the database, but how do I make sure that I do not enter the same client information again?
Answer: You can make sure that you do not insert duplicate information by using the EXISTS condition.
For example, if you had a table named clients with a primary key of client_id, you could use the following statement:
INSERT INTO clients
(client_id, client_name, client_type)
SELECT supplier_id, supplier_name, 'advertising'
FROM suppliers
WHERE not exists (select * from clients
where clients.client_id = suppliers.supplier_id);
This statement inserts multiple records with a subselect.
If you wanted to insert a single record, you could use the following statement:
INSERT INTO clients
(client_id, client_name, client_type)
SELECT 10345, 'IBM', 'advertising'
FROM dual
WHERE not exists (select * from clients
where clients.client_id = 10345);
The use of the dual table allows you to enter your values in a select statement, even though the values are not currently stored in a table.




SQL: DELETE Statement

The DELETE statement allows you to delete a single record or multiple records from a table.
The syntax for the DELETE statement is:
DELETE FROM table
WHERE predicates;

Example #1 - Simple example
Let's take a look at a simple example:
DELETE FROM supplier
WHERE supplier_name = 'IBM';
This would delete all records from the supplier table where the supplier_name is IBM.
You may wish to check for the number of rows that will be deleted. You can determine the number of rows that will be deleted by running the following SQL statement before performing the delete.
SELECT count(*)
FROM supplier
WHERE supplier_name = 'IBM';

Example #2 - More complex example
You can also perform more complicated deletes.
You may wish to delete records in one table based on values in another table. Since you can't list more than one table in the FROM clause when you are performing a delete, you can use the EXISTS clause.
For example:
DELETE FROM supplier
WHERE EXISTS
( select customer.customer_name
from customer
where customer.customer_id = supplier.supplier_id
and customer.customer_name = 'IBM' );
This would delete all records in the supplier table where there is a record in the customer table whose name is IBM, and the customer_id is the same as the supplier_id.
If you wish to determine the number of rows that will be deleted, you can run the following SQL statement before performing the delete.
SELECT count(*) FROM supplier
WHERE EXISTS
( select customer.customer_name
from customer
where customer.customer_id = supplier.supplier_id
and customer.customer_name = 'IBM' );

Frequently Asked Questions

Question: How would I write an SQL statement to delete all records in TableA whose data in field1 & field2 DO NOT match the data in fieldx & fieldz of TableB?
Answer: You could try something like this:
DELETE FROM TableA
WHERE NOT EXISTS
( select *
from TableB
where TableA.field1 = TableB.fieldx
and TableA.field2 = TableB.fieldz );

Oracle Learning - 6

SQL: UNION Query

The UNION query allows you to combine the result sets of 2 or more "select" queries. It removes duplicate rows between the various "select" statements.
Each SQL statement within the UNION query must have the same number of fields in the result sets with similar data types.
The syntax for a UNION query is:
select field1, field2, ... field_n
from tables
UNION
select field1, field2, ... field_n
from tables;

Example #1
The following is an example of a UNION query:
select supplier_id
from suppliers
UNION
select supplier_id
from orders;
In this example, if a supplier_id appeared in both the suppliers and orders table, it would appear once in your result set. The UNION removes duplicates.

Example #2 - With ORDER BY Clause
The following is a UNION query that uses an ORDER BY clause:
select supplier_id, supplier_name
from suppliers
where supplier_id > 2000
UNION
select company_id, company_name
from companies
where company_id > 1000
ORDER BY 2;
Since the column names are different between the two "select" statements, it is more advantageous to reference the columns in the ORDER BY clause by their position in the result set. In this example, we've sorted the results by supplier_name / company_name in ascending order, as denoted by the "ORDER BY 2".
The supplier_name / company_name fields are in position #2 in the result set.
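To see the positional ORDER BY at work, here is a small sketch that assumes Python's built-in sqlite3 module in place of Oracle and uses invented rows. The column names differ between the two halves of the UNION, yet ORDER BY 2 sorts the combined result by the second column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE companies (company_id INTEGER, company_name TEXT);
    INSERT INTO suppliers VALUES (3000, 'Microsoft'), (1500, 'Dell');
    INSERT INTO companies VALUES (2000, 'Apple'), (500, 'Oracle');
""")

rows = conn.execute("""
    SELECT supplier_id, supplier_name
    FROM suppliers
    WHERE supplier_id > 2000
    UNION
    SELECT company_id, company_name
    FROM companies
    WHERE company_id > 1000
    ORDER BY 2
""").fetchall()

print(rows)  # [(2000, 'Apple'), (3000, 'Microsoft')]
```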

Frequently Asked Questions

Question: I need to compare two dates and return the count of a field based on the date values. For example, I have a date field in a table called last_updated_date. I have to check if trunc(last_updated_date) >= trunc(sysdate-13).
Answer: Since you are using the COUNT function which is an aggregate function, we'd recommend using a UNION query. For example, you could try the following:
SELECT a.code as Code, a.name as Name, count(b.Ncode)
FROM cdmaster a, nmmaster b
WHERE a.code = b.code
and a.status = 1
and b.status = 1
and b.Ncode <> 'a10'
and trunc(last_updated_date) <= trunc(sysdate-13)
group by a.code, a.name
UNION
SELECT a.code as Code, a.name as Name, count(b.Ncode)
FROM cdmaster a, nmmaster b
WHERE a.code = b.code
and a.status = 1
and b.status = 1
and b.Ncode <> 'a10'
and trunc(last_updated_date) > trunc(sysdate-13)
group by a.code, a.name;
The UNION query allows you to perform a COUNT based on one set of criteria:
trunc(last_updated_date) <= trunc(sysdate-13)
as well as a COUNT based on another set of criteria:
trunc(last_updated_date) > trunc(sysdate-13)

SQL: UNION ALL Query

The UNION ALL query allows you to combine the result sets of 2 or more "select" queries. It returns all rows (even if the row exists in more than one of the "select" statements).
Each SQL statement within the UNION ALL query must have the same number of fields in the result sets with similar data types.
The syntax for a UNION ALL query is:
select field1, field2, ... field_n
from tables
UNION ALL
select field1, field2, ... field_n
from tables;

Example #1
The following is an example of a UNION ALL query:
select supplier_id
from suppliers
UNION ALL
select supplier_id
from orders;
If a supplier_id appeared in both the suppliers and orders table, it would appear multiple times in your result set. The UNION ALL does not remove duplicates.
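The difference between UNION and UNION ALL is easy to check side by side. This is a sketch only, assuming Python's built-in sqlite3 module as a substitute for Oracle, with sample rows chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (10000), (10001), (10002);
    INSERT INTO orders VALUES (500125, 10000), (500126, 10001);
""")

query = "SELECT supplier_id FROM suppliers {op} SELECT supplier_id FROM orders"

union_ids = sorted(r[0] for r in conn.execute(query.format(op="UNION")))
union_all_ids = sorted(r[0] for r in conn.execute(query.format(op="UNION ALL")))

print(union_ids)      # each supplier_id once
print(union_all_ids)  # 10000 and 10001 appear twice
```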

Example #2 - With ORDER BY Clause
The following is a UNION ALL query that uses an ORDER BY clause:
select supplier_id, supplier_name
from suppliers
where supplier_id > 2000
UNION ALL
select company_id, company_name
from companies
where company_id > 1000
ORDER BY 2;
Since the column names are different between the two "select" statements, it is more advantageous to reference the columns in the ORDER BY clause by their position in the result set. In this example, we've sorted the results by supplier_name / company_name in ascending order, as denoted by the "ORDER BY 2".
The supplier_name / company_name fields are in position #2 in the result set.

Oracle Learning - 4

SQL: GROUP BY Clause

The GROUP BY clause can be used in a SELECT statement to collect data across multiple records and group the results by one or more columns.
The syntax for the GROUP BY clause is:
SELECT column1, column2, ... column_n, aggregate_function (expression)
FROM tables
WHERE predicates
GROUP BY column1, column2, ... column_n;
aggregate_function can be a function such as SUM, COUNT, MIN, or MAX.

Example using the SUM function
For example, you could also use the SUM function to return the name of the department and the total sales (in the associated department).
SELECT department, SUM(sales) as "Total sales"
FROM order_details
GROUP BY department;
Because you have listed one column in your SELECT statement that is not encapsulated in the SUM function, you must use a GROUP BY clause. The department field must, therefore, be listed in the GROUP BY section.
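Here is a small runnable version of the SUM example, assuming Python's built-in sqlite3 module instead of Oracle and a few invented order_details rows. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_details (department TEXT, sales INTEGER);
    INSERT INTO order_details VALUES
        ('Hardware', 100), ('Hardware', 250), ('Software', 75);
""")

totals = conn.execute("""
    SELECT department, SUM(sales) AS "Total sales"
    FROM order_details
    GROUP BY department
    ORDER BY department
""").fetchall()

print(totals)  # [('Hardware', 350), ('Software', 75)]
```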

Example using the COUNT function
For example, you could use the COUNT function to return the name of the department and the number of employees (in the associated department) that make over $25,000 / year.
SELECT department, COUNT(*) as "Number of employees"
FROM employees
WHERE salary > 25000
GROUP BY department;

Example using the MIN function
For example, you could also use the MIN function to return the name of each department and the minimum salary in the department.
SELECT department, MIN(salary) as "Lowest salary"
FROM employees
GROUP BY department;

Example using the MAX function
For example, you could also use the MAX function to return the name of each department and the maximum salary in the department.
SELECT department, MAX(salary) as "Highest salary"
FROM employees
GROUP BY department;
SQL: ORDER BY Clause

The ORDER BY clause allows you to sort the records in your result set. The ORDER BY clause can only be used in SELECT statements.
The syntax for the ORDER BY clause is:
SELECT columns
FROM tables
WHERE predicates
ORDER BY column ASC/DESC;
The ORDER BY clause sorts the result set based on the columns specified. If the ASC or DESC value is omitted, the system assumes ascending order.
ASC indicates ascending order. (default)
DESC indicates descending order.

Example #1
SELECT supplier_city
FROM supplier
WHERE supplier_name = 'IBM'
ORDER BY supplier_city;
This would return all records sorted by the supplier_city field in ascending order.

Example #2
SELECT supplier_city
FROM supplier
WHERE supplier_name = 'IBM'
ORDER BY supplier_city DESC;
This would return all records sorted by the supplier_city field in descending order.

Example #3
You can also sort by relative position in the result set, where the first field in the result set is 1. The next field is 2, and so on.
SELECT supplier_city
FROM supplier
WHERE supplier_name = 'IBM'
ORDER BY 1 DESC;
This would return all records sorted by the supplier_city field in descending order, since the supplier_city field is in position #1 in the result set.

Example #4
SELECT supplier_city, supplier_state
FROM supplier
WHERE supplier_name = 'IBM'
ORDER BY supplier_city DESC, supplier_state ASC;
This would return all records sorted by the supplier_city field in descending order, with a secondary sort by supplier_state in ascending order.
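The two-level sort can be sketched as follows, assuming Python's built-in sqlite3 module as a stand-in for Oracle; the cities and states are invented so that the secondary sort has something to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_name TEXT, supplier_city TEXT, supplier_state TEXT);
    INSERT INTO supplier VALUES
        ('IBM', 'Springfield', 'Ohio'),
        ('IBM', 'Albany', 'New York'),
        ('IBM', 'Springfield', 'Illinois'),
        ('Dell', 'Austin', 'Texas');
""")

rows = conn.execute("""
    SELECT supplier_city, supplier_state
    FROM supplier
    WHERE supplier_name = 'IBM'
    ORDER BY supplier_city DESC, supplier_state ASC
""").fetchall()

# Cities run Z to A; within the same city, states run A to Z.
print(rows)
```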

Oracle Learning - 5

SQL: Joins

A join is used to combine rows from multiple tables. A join is performed whenever two or more tables are listed in the FROM clause of an SQL statement.
There are different kinds of joins. Let's take a look at a few examples.

Inner Join (simple join)
Chances are, you've already written an SQL statement that uses an inner join. It is the most common type of join. Inner joins return all rows from multiple tables where the join condition is met.
For example,
SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date
FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id;
This SQL statement would return all rows from the suppliers and orders tables where there is a matching supplier_id value in both the suppliers and orders tables.

Let's look at some data to explain how inner joins work:
We have a table called suppliers with two fields (supplier_id and supplier_name).
It contains the following data:

supplier_id supplier_name
10000 IBM
10001 Hewlett Packard
10002 Microsoft
10003 Nvidia

We have another table called orders with three fields (order_id, supplier_id, and order_date).
It contains the following data:

order_id supplier_id order_date
500125 10000 2003/05/12
500126 10001 2003/05/13

If we ran the SQL statement below:
SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date
FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id;

Our result set would look like this:

supplier_id supplier_name order_date
10000 IBM 2003/05/12
10001 Hewlett Packard 2003/05/13
The rows for Microsoft and Nvidia from the suppliers table would be omitted, since the supplier_id values 10002 and 10003 do not exist in both tables.
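The sample data above can be loaded and queried directly. This is a minimal sketch that assumes Python's built-in sqlite3 module rather than Oracle; the join syntax is the same comma-and-WHERE form used in the example, with an ORDER BY added so the row order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER, order_date TEXT);
    INSERT INTO suppliers VALUES
        (10000, 'IBM'), (10001, 'Hewlett Packard'),
        (10002, 'Microsoft'), (10003, 'Nvidia');
    INSERT INTO orders VALUES
        (500125, 10000, '2003/05/12'), (500126, 10001, '2003/05/13');
""")

inner_rows = conn.execute("""
    SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date
    FROM suppliers, orders
    WHERE suppliers.supplier_id = orders.supplier_id
    ORDER BY suppliers.supplier_id
""").fetchall()

for row in inner_rows:
    print(row)  # only IBM and Hewlett Packard have matching orders
```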

Outer Join
Another type of join is called an outer join. This type of join returns all rows from one table and only those rows from a secondary table where the joined fields are equal (join condition is met).
For example,
select suppliers.supplier_id, suppliers.supplier_name, orders.order_date
from suppliers, orders
where suppliers.supplier_id = orders.supplier_id(+);
This SQL statement would return all rows from the suppliers table and only those rows from the orders table where the joined fields are equal.
The (+) after the orders.supplier_id field indicates that, if a supplier_id value in the suppliers table does not exist in the orders table, all fields in the orders table will display as NULL in the result set.
The above SQL statement could also be written as follows:
select suppliers.supplier_id, suppliers.supplier_name, orders.order_date
from suppliers, orders
where orders.supplier_id(+) = suppliers.supplier_id;

Let's look at some data to explain how outer joins work:
We have a table called suppliers with two fields (supplier_id and supplier_name).
It contains the following data:

supplier_id supplier_name
10000 IBM
10001 Hewlett Packard
10002 Microsoft
10003 Nvidia

We have a second table called orders with three fields (order_id, supplier_id, and order_date).
It contains the following data:

order_id supplier_id order_date
500125 10000 2003/05/12
500126 10001 2003/05/13

If we ran the SQL statement below:
select suppliers.supplier_id, suppliers.supplier_name, orders.order_date
from suppliers, orders
where suppliers.supplier_id = orders.supplier_id(+);

Our result set would look like this:

supplier_id supplier_name order_date
10000 IBM 2003/05/12
10001 Hewlett Packard 2003/05/13
10002 Microsoft NULL
10003 Nvidia NULL
The rows for Microsoft and Nvidia would be included because an outer join was used. However, you will notice that the order_date field for those records contains a NULL value.
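The outer join result can be reproduced with the same sample data. The sketch below assumes Python's built-in sqlite3 module in place of Oracle. The (+) operator is Oracle-specific and SQLite does not accept it, so the query is written with the ANSI LEFT OUTER JOIN syntax instead, which Oracle also accepts; Python shows the SQL NULL values as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER, order_date TEXT);
    INSERT INTO suppliers VALUES
        (10000, 'IBM'), (10001, 'Hewlett Packard'),
        (10002, 'Microsoft'), (10003, 'Nvidia');
    INSERT INTO orders VALUES
        (500125, 10000, '2003/05/12'), (500126, 10001, '2003/05/13');
""")

# ANSI form of: where suppliers.supplier_id = orders.supplier_id(+)
outer_rows = conn.execute("""
    SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date
    FROM suppliers
    LEFT OUTER JOIN orders
      ON suppliers.supplier_id = orders.supplier_id
    ORDER BY suppliers.supplier_id
""").fetchall()

for row in outer_rows:
    print(row)  # Microsoft and Nvidia appear with order_date None
```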

Oracle Learning - 3

SQL: BETWEEN Condition

The BETWEEN condition allows you to retrieve values within a range.
The syntax for the BETWEEN condition is:
SELECT columns
FROM tables
WHERE column1 between value1 and value2;
This SQL statement will return the records where column1 is within the range of value1 and value2 (inclusive). The BETWEEN function can be used in any valid SQL statement - select, insert, update, or delete.

Example #1 - Numbers
The following is an SQL statement that uses the BETWEEN function:
SELECT *
FROM suppliers
WHERE supplier_id between 5000 AND 5010;
This would return all rows where the supplier_id is between 5000 and 5010, inclusive. It is equivalent to the following SQL statement:
SELECT *
FROM suppliers
WHERE supplier_id >= 5000
AND supplier_id <= 5010;

Example #2 - Dates
You can also use the BETWEEN function with dates.
SELECT *
FROM orders
WHERE order_date between to_date ('2003/01/01', 'yyyy/mm/dd')
AND to_date ('2003/12/31', 'yyyy/mm/dd');
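This SQL statement would return all orders where the order_date is between Jan 1, 2003 and Dec 31, 2003 (inclusive). Because an Oracle DATE also stores a time of day and to_date without a time portion yields midnight, an order_date later than 00:00:00 on Dec 31, 2003 falls outside this range.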
It would be equivalent to the following SQL statement:
SELECT *
FROM orders
WHERE order_date >= to_date('2003/01/01', 'yyyy/mm/dd')
AND order_date <= to_date('2003/12/31','yyyy/mm/dd');

Example #3 - NOT BETWEEN
The BETWEEN function can also be combined with the NOT operator.
For example,
SELECT *
FROM suppliers
WHERE supplier_id not between 5000 and 5500;
This would be equivalent to the following SQL:
SELECT *
FROM suppliers
WHERE supplier_id < 5000
OR supplier_id > 5500;
In this example, the result set would exclude all supplier_id values between the range of 5000 and 5500 (inclusive).
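The equivalences above can be checked mechanically. This is a sketch assuming Python's built-in sqlite3 module as a stand-in for Oracle, with supplier_id values picked to sit on and just outside both ends of the range; the helper ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_id INTEGER)")
conn.executemany("INSERT INTO suppliers VALUES (?)",
                 [(4999,), (5000,), (5250,), (5500,), (5501,)])

def ids(where):
    # Hypothetical helper: run the query with the given WHERE text.
    sql = "SELECT supplier_id FROM suppliers WHERE " + where + " ORDER BY supplier_id"
    return [r[0] for r in conn.execute(sql)]

inside = ids("supplier_id BETWEEN 5000 AND 5500")
inside_cmp = ids("supplier_id >= 5000 AND supplier_id <= 5500")
outside = ids("supplier_id NOT BETWEEN 5000 AND 5500")
outside_cmp = ids("supplier_id < 5000 OR supplier_id > 5500")

print(inside, outside)  # [5000, 5250, 5500] [4999, 5501]
```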
SQL: EXISTS Condition

The EXISTS condition is considered "to be met" if the subquery returns at least one row.
The syntax for the EXISTS condition is:
SELECT columns
FROM tables
WHERE EXISTS ( subquery );
The EXISTS condition can be used in any valid SQL statement - select, insert, update, or delete.

Example #1
Let's take a look at a simple example. The following is an SQL statement that uses the EXISTS condition:
SELECT *
FROM suppliers
WHERE EXISTS
(select *
from orders
where suppliers.supplier_id = orders.supplier_id);
This select statement will return all records from the suppliers table where there is at least one record in the orders table with the same supplier_id.

Example #2 - NOT EXISTS
The EXISTS condition can also be combined with the NOT operator.
For example,
SELECT *
FROM suppliers
WHERE not exists (select * from orders Where suppliers.supplier_id = orders.supplier_id);
This will return all records from the suppliers table where there are no records in the orders table for the given supplier_id.
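EXISTS and NOT EXISTS split the suppliers table into two complementary groups. The sketch below assumes Python's built-in sqlite3 module instead of Oracle and uses made-up rows; note that a supplier with two orders is still returned only once, because EXISTS tests for the presence of a row rather than joining to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER);
    INSERT INTO suppliers VALUES
        (10000, 'IBM'), (10001, 'Hewlett Packard'),
        (10002, 'Microsoft'), (10003, 'Nvidia');
    INSERT INTO orders VALUES (1, 10000), (2, 10000), (3, 10001);
""")

template = """
    SELECT supplier_id
    FROM suppliers
    WHERE {test}
      (SELECT * FROM orders WHERE suppliers.supplier_id = orders.supplier_id)
    ORDER BY supplier_id
"""

with_orders = [r[0] for r in conn.execute(template.format(test="EXISTS"))]
without_orders = [r[0] for r in conn.execute(template.format(test="NOT EXISTS"))]

print(with_orders)     # [10000, 10001]
print(without_orders)  # [10002, 10003]
```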

Example #3 - DELETE Statement
The following is an example of a delete statement that utilizes the EXISTS condition:
DELETE FROM suppliers
WHERE EXISTS
(select *
from orders
where suppliers.supplier_id = orders.supplier_id);

Example #4 - UPDATE Statement
The following is an example of an update statement that utilizes the EXISTS condition:

UPDATE supplier
SET supplier_name = ( SELECT customers.name
FROM customers
WHERE customers.customer_id = supplier.supplier_id)
WHERE EXISTS
( SELECT customers.name
FROM customers
WHERE customers.customer_id = supplier.supplier_id);

Example #5 - INSERT Statement
The following is an example of an insert statement that utilizes the EXISTS condition:
INSERT INTO supplier
(supplier_id, supplier_name)
SELECT account_no, name
FROM suppliers
WHERE exists (select * from orders Where suppliers.supplier_id = orders.supplier_id);

Oracle Learning - 2

SQL: Combining the "AND" and "OR" Conditions

The AND and OR conditions can be combined in a single SQL statement. They can be used in any valid SQL statement - select, insert, update, or delete.
When combining these conditions, it is important to use brackets so that the database knows what order to evaluate each condition.


Example #1
The first example that we'll take a look at combines the AND and OR conditions.
SELECT *
FROM supplier
WHERE (city = 'New York' and name = 'IBM')
or (city = 'Newark');
This would return all suppliers that reside in New York and whose name is IBM, as well as all suppliers that reside in Newark. The brackets determine what order the AND and OR conditions are evaluated in.

Example #2
The next example takes a look at a more complex statement.
For example:
SELECT supplier_id
FROM supplier
WHERE (name = 'IBM')
or (name = 'Hewlett Packard' and city = 'Atlantic City')
or (name = 'Gateway' and status = 'Active' and city = 'Burma');
This SQL statement would return all supplier_id values where the supplier's name is IBM, or the name is Hewlett Packard and the city is Atlantic City, or the name is Gateway, the status is Active, and the city is Burma.
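Why the brackets matter can be shown with a tiny experiment. This is a sketch assuming Python's built-in sqlite3 module in place of Oracle, with invented rows and a hypothetical helper named rows. In SQL, AND is evaluated before OR when there are no brackets, so the first two conditions below agree, while moving the brackets changes the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (name TEXT, city TEXT);
    INSERT INTO supplier VALUES
        ('IBM', 'New York'), ('IBM', 'Newark'),
        ('Dell', 'Newark'), ('Dell', 'New York');
""")

def rows(where):
    # Hypothetical helper: run the query with the given WHERE text.
    sql = "SELECT name, city FROM supplier WHERE " + where + " ORDER BY name, city"
    return conn.execute(sql).fetchall()

bracketed = rows("(city = 'New York' AND name = 'IBM') OR (city = 'Newark')")
unbracketed = rows("city = 'New York' AND name = 'IBM' OR city = 'Newark'")
regrouped = rows("city = 'New York' AND (name = 'IBM' OR city = 'Newark')")

print(bracketed)
print(regrouped)
```

Even when the default precedence happens to give the intended result, writing the brackets spells out the intent for the next reader.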
SQL: LIKE Condition

The LIKE condition allows you to use wildcards in the where clause of an SQL statement. This allows you to perform pattern matching. The LIKE condition can be used in any valid SQL statement - select, insert, update, or delete.
The patterns that you can choose from are:
% allows you to match any string of any length (including zero length)
_ allows you to match on a single character

Examples using % wildcard
The first example that we'll take a look at involves using % in the where clause of a select statement. We are going to try to find all of the suppliers whose name begins with 'Hew'.
SELECT * FROM supplier
WHERE supplier_name like 'Hew%';


You can also use the wildcard multiple times within the same string. For example,
SELECT * FROM supplier
WHERE supplier_name like '%bob%';
In this example, we are looking for all suppliers whose name contains the characters 'bob'.

You could also use the LIKE condition to find suppliers whose name does not start with 'T'. For example,
SELECT * FROM supplier
WHERE supplier_name not like 'T%';
By placing the not keyword in front of the LIKE condition, you are able to retrieve all suppliers whose name does not start with 'T'.

Examples using _ wildcard
Next, let's explain how the _ wildcard works. Remember that the _ is looking for only one character.
For example,
SELECT * FROM supplier
WHERE supplier_name like 'Sm_th';
This SQL statement would return all suppliers whose name is 5 characters long, where the first two characters are 'Sm' and the last two characters are 'th'. For example, it could return suppliers whose name is 'Smith', 'Smyth', 'Smath', 'Smeth', etc.

Here is another example,
SELECT * FROM supplier
WHERE account_number like '12317_';
You might find that you are looking for an account number, but you only have 5 of the 6 digits. If account numbers contain only digits, the example above could retrieve up to 10 records (where the missing value could equal anything from 0 to 9); note that _ matches any single character, not just a digit. For example, it could return suppliers whose account numbers are:
123170
123171
123172
123173
123174
123175
123176
123177
123178
123179.
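The behaviour of the _ wildcard can be tried out as below. This is a sketch assuming Python's built-in sqlite3 module as a stand-in for Oracle; one difference worth knowing is that SQLite's LIKE ignores ASCII case by default while Oracle's is case-sensitive, which does not affect these patterns. The rows and the helper match are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (supplier_name TEXT, account_number TEXT)")
conn.executemany("INSERT INTO supplier VALUES (?, ?)", [
    ("Smith", "123170"),
    ("Smyth", "123179"),
    ("Smooth", "12317X"),   # six characters, the last one is not a digit
    ("Smth", "1231700"),    # seven characters, too long for '12317_'
])

def match(column, pattern):
    # column is one of the two fixed column names above; pattern is a bound parameter.
    sql = f"SELECT {column} FROM supplier WHERE {column} LIKE ? ORDER BY {column}"
    return [r[0] for r in conn.execute(sql, (pattern,))]

names = match("supplier_name", "Sm_th")
accounts = match("account_number", "12317_")

print(names)     # ['Smith', 'Smyth']
print(accounts)  # ['123170', '123179', '12317X']
```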


Examples using Escape Characters
Next, in Oracle, let's say you wanted to search for a % or a _ character in a LIKE condition. You can do this using an Escape character.
Please note that you can define an escape character as a single character (length of 1) ONLY.
For example,
SELECT * FROM supplier
WHERE supplier_name LIKE '!%' escape '!';
This SQL statement identifies the ! character as an escape character. This statement will return all suppliers whose name is %.

Here is another more complicated example:
SELECT * FROM supplier
WHERE supplier_name LIKE 'H%!%' escape '!';
This example returns all suppliers whose name starts with H and ends in %. For example, it would return a value such as 'Hello%'.

You can also use the Escape character with the _ character. For example,
SELECT * FROM supplier
WHERE supplier_name LIKE 'H%!_' escape '!';
This example returns all suppliers whose name starts with H and ends in _. For example, it would return a value such as 'Hello_'.
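The three escape examples can be run against a handful of test names. The sketch assumes Python's built-in sqlite3 module instead of Oracle (SQLite's LIKE also supports the ESCAPE clause); the supplier names and the helper match are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (supplier_name TEXT)")
conn.executemany("INSERT INTO supplier VALUES (?)",
                 [("%",), ("Hello%",), ("Hello_",), ("Hello",), ("100%",)])

def match(pattern):
    # '!' is declared as the escape character, as in the examples above.
    sql = ("SELECT supplier_name FROM supplier "
           "WHERE supplier_name LIKE ? ESCAPE '!' ORDER BY supplier_name")
    return [r[0] for r in conn.execute(sql, (pattern,))]

only_percent = match("!%")     # the name is exactly %
h_percent = match("H%!%")      # starts with H, ends in a literal %
h_underscore = match("H%!_")   # starts with H, ends in a literal _

print(only_percent, h_percent, h_underscore)
```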
SQL: "IN" Function

The IN function helps reduce the need to use multiple OR conditions.
The syntax for the IN function is:
SELECT columns
FROM tables
WHERE column1 in (value1, value2, .... value_n);
This SQL statement will return the records where column1 is value1, value2..., or value_n. The IN function can be used in any valid SQL statement - select, insert, update, or delete.


Example #1
The following is an SQL statement that uses the IN function:
SELECT *
FROM supplier
WHERE supplier_name in ( 'IBM', 'Hewlett Packard', 'Microsoft');
This would return all rows where the supplier_name is either IBM, Hewlett Packard, or Microsoft. Because the * is used in the select, all fields from the supplier table would appear in the result set.
It is equivalent to the following statement:
SELECT *
FROM supplier
WHERE supplier_name = 'IBM'
OR supplier_name = 'Hewlett Packard'
OR supplier_name = 'Microsoft';
As you can see, using the IN function makes the statement shorter and easier to read.
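That the IN list and the chain of OR conditions select the same rows can be confirmed with a quick check. This is a sketch assuming Python's built-in sqlite3 module as a stand-in for Oracle; the rows and the helper ids are illustrative, and a NOT IN query is included for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (supplier_id INTEGER, supplier_name TEXT)")
conn.executemany("INSERT INTO supplier VALUES (?, ?)", [
    (1, "IBM"), (2, "Hewlett Packard"), (3, "Microsoft"), (4, "Nvidia"),
])

def ids(where):
    # Hypothetical helper: run the query with the given WHERE text.
    sql = "SELECT supplier_id FROM supplier WHERE " + where + " ORDER BY supplier_id"
    return [r[0] for r in conn.execute(sql)]

with_in = ids("supplier_name IN ('IBM', 'Hewlett Packard', 'Microsoft')")
with_or = ids("supplier_name = 'IBM' "
              "OR supplier_name = 'Hewlett Packard' "
              "OR supplier_name = 'Microsoft'")
not_in = ids("supplier_name NOT IN ('IBM', 'Hewlett Packard', 'Microsoft')")

print(with_in, with_or, not_in)  # [1, 2, 3] [1, 2, 3] [4]
```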

Example #2
You can also use the IN function with numeric values.
SELECT *
FROM orders
WHERE order_id in (10000, 10001, 10003, 10005);
This SQL statement would return all orders where the order_id is either 10000, 10001, 10003, or 10005.
It is equivalent to the following statement:
SELECT *
FROM orders
WHERE order_id = 10000
OR order_id = 10001
OR order_id = 10003
OR order_id = 10005;

Example #3 - "NOT IN"
The IN function can also be combined with the NOT operator.
For example,
SELECT *
FROM supplier
WHERE supplier_name not in ( 'IBM', 'Hewlett Packard', 'Microsoft');
This would return all rows where the supplier_name is not IBM, Hewlett Packard, or Microsoft. Sometimes, it is more efficient to list the values that you do not want, as opposed to the values that you do want.

Oracle Learning -1

SQL: SELECT Statement

The SELECT statement allows you to retrieve records from one or more tables in your database.
The syntax for the SELECT statement is:
SELECT columns
FROM tables
WHERE predicates;

Example #1
Let's take a look at how to select all fields from a table.
SELECT *
FROM supplier
WHERE city = 'Newark';
In our example, we've used * to signify that we wish to view all fields from the supplier table where the supplier resides in Newark.

Example #2
You can also choose to select individual fields as opposed to all fields in the table.
For example:
SELECT name, city, state
FROM supplier
WHERE supplier_id > 1000;
This select statement would return all name, city, and state values from the supplier table where the supplier_id value is greater than 1000.

Example #3
You can also use the select statement to retrieve fields from multiple tables.
SELECT orders.order_id, supplier.name
FROM supplier, orders
WHERE supplier.supplier_id = orders.supplier_id;
The result set would display the order_id and supplier name fields where the supplier_id value existed in both the supplier and orders tables.



SQL: DISTINCT Clause

The DISTINCT clause allows you to remove duplicates from the result set. The DISTINCT clause can only be used with select statements.
The syntax for the DISTINCT clause is:
SELECT DISTINCT columns
FROM tables
WHERE predicates;

Example #1
Let's take a look at a very simple example.
SELECT DISTINCT city
FROM supplier;
This SQL statement would return all unique cities from the supplier table.

Example #2
The DISTINCT clause can be used with more than one field.
For example:
SELECT DISTINCT city, state
FROM supplier;
This select statement would return each unique city and state combination. In this case, the distinct applies to each field listed after the DISTINCT keyword.
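The difference between DISTINCT on one field and on two fields shows up as soon as the same city name occurs in more than one state. The following sketch assumes Python's built-in sqlite3 module in place of Oracle, with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (city TEXT, state TEXT)")
conn.executemany("INSERT INTO supplier VALUES (?, ?)", [
    ("Newark", "New Jersey"),
    ("Newark", "Delaware"),
    ("Newark", "New Jersey"),   # exact duplicate of the first row
    ("Albany", "New York"),
])

cities = [r[0] for r in conn.execute(
    "SELECT DISTINCT city FROM supplier ORDER BY city")]
pairs = conn.execute(
    "SELECT DISTINCT city, state FROM supplier ORDER BY city, state").fetchall()

print(cities)  # ['Albany', 'Newark']
print(pairs)   # Newark is listed once per state
```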

SQL: COUNT Function

The COUNT function returns the number of rows in a query.
The syntax for the COUNT function is:
SELECT COUNT(expression)
FROM tables
WHERE predicates;



Simple Example
For example, you might wish to know how many employees have a salary that is above $25,000 / year.
SELECT COUNT(*) as "Number of employees"
FROM employees
WHERE salary > 25000;
In this example, we've aliased the count(*) field as "Number of employees". As a result, "Number of employees" will display as the field name when the result set is returned.

Example using DISTINCT
You can use the DISTINCT clause within the COUNT function.
For example, the SQL statement below returns the number of unique departments where at least one employee makes over $25,000 / year.
SELECT COUNT(DISTINCT department) as "Unique departments"
FROM employees
WHERE salary > 25000;
Again, the count(DISTINCT department) field is aliased as "Unique departments". This is the field name that will display in the result set.

Example using GROUP BY
In some cases, you will be required to use a GROUP BY clause with the COUNT function.
For example, you could use the COUNT function to return the name of the department and the number of employees (in the associated department) that make over $25,000 / year.
SELECT department, COUNT(*) as "Number of employees"
FROM employees
WHERE salary > 25000
GROUP BY department;
Because you have listed one column in your SELECT statement that is not encapsulated in the COUNT function, you must use a GROUP BY clause. The department field must, therefore, be listed in the GROUP BY section.

TIP: Performance Tuning
COUNT(*) and COUNT(1) both count every row that meets your criteria, so they return the same result. (COUNT(column) is different: it skips rows where that column is NULL.) Some developers write COUNT(1) on the assumption that the database engine then does not have to fetch back the data fields. Whether this helps depends on the database and version; current Oracle optimizers are generally reported to treat COUNT(*) and COUNT(1) alike, so compare execution plans before relying on it.

For example, based on the example above, the following syntax returns the same counts:
SELECT department, COUNT(1) as "Number of employees"
FROM employees
WHERE salary > 25000
GROUP BY department;
Here the COUNT function evaluates the numeric value of 1 for each record that meets your criteria instead of referring to the fields of the employees table.
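What the different COUNT forms actually return can be compared directly. This is a sketch assuming Python's built-in sqlite3 module as a stand-in for Oracle, using an invented employees table in which one employee has no department; it shows results only and says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [
    ("Sales", 30000), ("Sales", 40000), ("IT", 50000),
    (None, 60000),     # department is NULL
    ("IT", 20000),     # filtered out by the salary condition
])

count_star, count_one, count_dept, count_distinct = conn.execute("""
    SELECT COUNT(*), COUNT(1), COUNT(department), COUNT(DISTINCT department)
    FROM employees
    WHERE salary > 25000
""").fetchone()

# COUNT(*) and COUNT(1) count rows; COUNT(department) skips the NULL department.
print(count_star, count_one, count_dept, count_distinct)  # 4 4 3 2
```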
SQL: WHERE Clause

The WHERE clause allows you to filter the results from an SQL statement - select, insert, update, or delete statement.
It is difficult to explain the basic syntax for the WHERE clause, so instead, we'll take a look at some examples.

Example #1
SELECT *
FROM supplier
WHERE supplier_name = 'IBM';
In this first example, we've used the WHERE clause to filter our results from the supplier table. The SQL statement above would return all rows from the supplier table where the supplier_name is IBM. Because the * is used in the select, all fields from the supplier table would appear in the result set.

Example #2
SELECT supplier_id
FROM supplier
WHERE supplier_name = 'IBM'
or supplier_city = 'Newark';
We can define a WHERE clause with multiple conditions. This SQL statement would return all supplier_id values where the supplier_name is IBM or the supplier_city is Newark.

Example #3
SELECT supplier.supplier_name, orders.order_id
FROM supplier, orders
WHERE supplier.supplier_id = orders.supplier_id
and supplier.supplier_city = 'Atlantic City';
We can also use the WHERE clause to join multiple tables together in a single SQL statement. This SQL statement would return all supplier names and order_ids where there is a matching record in the supplier and orders tables based on supplier_id, and where the supplier_city is Atlantic City.

SQL: "AND" Condition

The AND condition allows you to create an SQL statement based on 2 or more conditions being met. It can be used in any valid SQL statement - select, insert, update, or delete.
The syntax for the AND condition is:
SELECT columns
FROM tables
WHERE column1 = 'value1'
and column2 = 'value2';
The AND condition requires that each condition must be met for the record to be included in the result set. In this case, column1 has to equal 'value1' and column2 has to equal 'value2'.

Example #1
The first example that we'll take a look at involves a very simple example using the AND condition.
SELECT *
FROM supplier
WHERE city = 'New York'
and type = 'PC Manufacturer';
This would return all suppliers that reside in New York and are PC Manufacturers. Because the * is used in the select, all fields from the supplier table would appear in the result set.

Example #2
Our next example demonstrates how the AND condition can be used to "join" multiple tables in an SQL statement.
SELECT orders.order_id, supplier.supplier_name
FROM supplier, orders
WHERE supplier.supplier_id = orders.supplier_id
and supplier.supplier_name = 'IBM';
This would return all rows where the supplier_name is IBM, with the supplier and orders tables joined on supplier_id. You will notice that all of the fields are prefixed with the table names (ie: orders.order_id). This is required to eliminate any ambiguity as to which field is being referenced, as the same field name can exist in both the supplier and orders tables.
In this case, the result set would only display the order_id and supplier_name fields (as listed in the first part of the select statement).
SQL: "OR" Condition

The OR condition allows you to create an SQL statement where records are returned when any one of the conditions is met. It can be used in any valid SQL statement - select, insert, update, or delete.

The syntax for the OR condition is:
SELECT columns
FROM tables
WHERE column1 = 'value1'
or column2 = 'value2';
The OR condition requires that any one of the conditions must be met for the record to be included in the result set. In this case, column1 has to equal 'value1' OR column2 has to equal 'value2'.

Example #1
The first example that we'll take a look at involves a very simple example using the OR condition.
SELECT *
FROM supplier
WHERE city = 'New York'
or city = 'Newark';
This would return all suppliers that reside in either New York or Newark. Because the * is used in the select, all fields from the supplier table would appear in the result set.

Example #2
The next example takes a look at three conditions. If any of these conditions is met, the record will be included in the result set.
For example:
SELECT supplier_id
FROM supplier
WHERE name = 'IBM'
or name = 'Hewlett Packard'
or name = 'Gateway';
This SQL statement would return all supplier_id values where the supplier's name is either IBM, Hewlett Packard or Gateway.

Tuesday, November 4, 2008

Business Objects FAQs

1. What is the difference between thin client & thick client?
Thin client is a browser-based version, whereas thick client is a desktop-based version. The thick client offers more functions and formatting options.

Desktop Intelligence is the full client. It has a 2-tier architecture, whereas Web-I has a 3-tier architecture with the Enterprise server in between.

Desktop-I and Web-I differ in some syntax.
E.g: [] in Web-I, <> in Deski.

Also, scheduling can be done directly in Web-I (XI R2), whereas we need additional software to schedule Deski reports.

You can view the Deski reports in Web-I, but not Web-I reports in Deski.

But we can schedule the Deski reports via Web-I.

WebI: It has a 3-tier architecture and is also known as the thin client. In BO XI R2, the merge option and edit SQL are available. Reports can be scheduled directly and generated not only as corporate documents but also in other formats (Excel, PDF, and so on). The hide option is not available.

Deski: It has a 2-tier architecture and is also known as the full client. Here the hide, edit SQL, and rank options are available. Desktop Intelligence reports are dynamic. Desktop Intelligence is a Windows-based tool and needs installation on every PC, whereas WebI is a web-based tool and can be accessed anywhere through Internet Explorer.

Crystal reports are static and pagewise.

InfoView: WebI is a part of InfoView. It generates Java-based reports when you open the reports through InfoView.

2. In BOXIR2 we have following tools.
Import Wizard
Report Conversion Tool
Repository Migration Wizard

3. Multi pass SQL:
Multipass: Breaking one large SQL statement into multiple SQL statements. If you are using a star schema with two or more fact tables and you enable this feature, BO will automatically generate two or more SQL statements (i.e. one for each fact table object used in the report). The results will then be synchronised in the report.

4. What are isolated joins in Check Integrity?
An isolated join is a join that is not included in any of your contexts, which is why you get that error.

Solution:
First, find all the joins that you left out of every context, then add each of them to whichever context you think appropriate.

5. Migration Process:
To migrate from BO 6.5 to XI R2:

a) Open the migration wizard.
b) Select your source location (here, give your BO 6.5 doc).
c) Click Next.
d) Select your destination location (your BO XI R2 environment).
e) Select the users (admin or specific users).
f) Click Next.
g) Click OK.

6. Difference between break and section?

A break removes duplicates, but a section cannot do the same.
A break displays the data within the cell content, whereas a section appears outside the grid.
When you do an arithmetic operation on a break, say sum or count, you can see the result for each individual block and for all the blocks at the bottom.

In a section, the operation is performed only on the individual block.

7. What is master-detail report?
A master-detail report allows us to display the result section-wise. It splits large blocks of data into sections. It minimizes the repeating values. We can have subtotals also.

It displays the data section-wise. If you have the following in a report, e.g. Country, Store, Sales, you can change it into a master-detail report country-wise by dragging Country and dropping it as a section when the cursor shows the text 'Drop here to create a section'. You can then see the data country-wise.

8. How big is your team?

In my environment there are 3000+ business end users, 4 report developers, two universe designers, and one administrator.

9. Is it possible to create reports without a universe?
Yes, it is possible to create reports without a universe by using personal data files and free-hand SQL. This is possible only in Deski reports, not in WebI.

11. How to solve #multivalue, #syntax, #error? I want the complete solution process in practical terms.
#Multivalue: this error occurs in 3 ways:
1) #multivalue in aggregation
2) #multivalue in a break's header or footer
3) #multivalue at section level.

1) This error occurs when the output context does not include the input context.
Ex: a report has the Year and City dimensions and the Revenue measure.
= In
The above condition will to run the query getting revenue column #multivalue error occurs.

solution: cilck the formulabar in view menu
select the error containg cell, edit the formula to write below condition.
= In(,) In
The above formula will run correct data will appear in the report.
Note: the above condition by default it will take agg "sum" function.

#SYNTAX:
This error occurs when a variable used in the formula no longer exists.

Ex:- *
Running the above formula produces this error.

Solution: Click Edit Data Provider --> add the new object that is needed --> select the error cell --> edit the formula --> click OK.

#ERROR:
This error occurs when a variable used in the formula is incorrect.

Solution: Go to the Data menu --> click Variables --> select the cell containing the error --> copy the formula from the Edit menu --> paste it into a new cell --> go to Formula Bar in the View menu --> take the first cell containing the error --> edit the formula --> repeat the above steps.

12. What is the difference between a variable and a formula?
Whenever we execute a formula, the result is stored in a variable.

13. What is meant by the incompatible objects error at report level?

When the contexts are not properly defined, we get the error "incompatible combination of objects".

14. In a report I want to fetch data from 2 data providers. Which condition must be satisfied to link the 2 data providers?
Ex: Q1 has columns A, B, C and Q2 has columns X, Y, Z. The requirement is to get all the columns from those 2 queries at report level, like A, B, C, X, Y, Z in a single report.

In BO XI R2 it is possible. If you have base and derived universes, I think your requirement can be met by using the "combined query" option and selecting the "union" operator. Otherwise, go for WebI and select the "MERGE" option.

Otherwise, your requirement is not possible, because your column names do not match.

In Deski, select "Data Manager" --> click the "Link to" option.

15. What is meant by the ForEach and ForAll functions? In which cases do we use them in BO?

ForEach adds objects to the context.
ForAll removes objects from the context.
We use ForAll for summary purposes and ForEach for detail purposes.
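
As a rough analogy only (plain Python, not BusinessObjects formula syntax), the effect of adding or removing a dimension from a calculation context can be sketched as grouping the same rows by more or fewer columns. The rows, the dimension names, and the helper `sum_revenue` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical report rows: (year, city, revenue).
ROWS = [
    (2007, "Austin", 100),
    (2007, "Boston", 50),
    (2008, "Austin", 70),
    (2008, "Boston", 30),
]

def sum_revenue(rows, dims):
    """Sum revenue grouped by the named dimensions ('year' and/or 'city')."""
    index = {"year": 0, "city": 1}
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[index[d]] for d in dims)
        totals[key] += row[2]
    return dict(totals)

# Removing City from a (Year, City) context, in the spirit of ForAll:
# one summary value per year.
per_year = sum_revenue(ROWS, ["year"])

# Adding City to a Year-only context, in the spirit of ForEach:
# one detail value per (year, city) pair.
per_year_city = sum_revenue(ROWS, ["year", "city"])
```

Fewer dimensions in the context give a more summarized number, more dimensions give a more detailed one, which matches the summary versus detail distinction in the answer above.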

16. How do we improve the performance of a report and a universe?
By creating the summary tables and using aggregate awareness.
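
The idea behind using summary tables can be sketched in a few lines of Python. This is only an illustration of the selection logic, under the assumption that a query should be answered from the smallest table that still contains every requested dimension; the table names, dimensions, and row counts are hypothetical, and this is not how the universe itself is configured.

```python
# Hypothetical tables: name -> (dimensions available, approximate row count).
TABLES = {
    "sales_fact": ({"year", "month", "day", "store", "product"}, 10_000_000),
    "sales_by_month_store": ({"year", "month", "store"}, 50_000),
    "sales_by_year": ({"year"}, 10),
}

def pick_table(requested_dims, tables=TABLES):
    """Return the name of the smallest table that covers every requested dimension."""
    requested = set(requested_dims)
    candidates = [
        (rows, name)
        for name, (dims, rows) in tables.items()
        if requested.issubset(dims)
    ]
    if not candidates:
        raise ValueError("no table covers " + ", ".join(sorted(requested)))
    # Tuples sort by row count first, so min() picks the most aggregated table.
    return min(candidates)[1]
```

A yearly total is read from the 10-row summary table, while a query by product has to fall back to the detailed fact table.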

17. In XI R2, how do we send reports to end users?
You can send reports to any user via the scheduling options for a report. The report will then run as per the scheduled options and when successful, it will send a copy to the user's email address or inbox (in BO), depending on the options selected.

18. Which is the best way to resolve loops: a context or an alias?
If the schema has only one lookup table, you can create an alias. When there are multiple lookup tables, you can create contexts.
In most cases contexts are used to resolve loops, because the loops are more complex and the schema contains 2 or more fact tables in your universe design.
If there is more than one fact table, use a context; if there is only one fact table, use an alias.

19. How do we link 2 universes?
The linking can be done at reporter level by linking of data providers. We can link the dimensions and measures of two different universes with 2 different connections by linking the data providers built upon them.

20. What types of joins does a universe support?
Inner join, left outer join, right outer join, and full outer join.

Monday, November 3, 2008

Desktop Intelligence vs. WebIntelligence XI R2

Entering Deski/Webi:
For Deski:
Wizard: Universe vs. Other Data Source
4 wizard options (cell, table, crosstab,chart)
Many Microsoft formatting toolbars
For Webi:
Universes (Or OLAP) Only
No personal data files (Excel, XML, etc)
No real wizard
Limited Microsoft formatting toolbars
Interactive Mode: Can Enter By accident

Query Panel:
For Deski: Data Tab
When editing query, does add new objects to the report
Radio button for display of classes and objects or predefined conditions
Button For: Save & Close/View/Run/Cancel
View Button for look at data and other functions
Add Query From Report Manager Window
Right Click in white area in Data Section
Insert New Data Wizard pops up
Report Manager: Click radio button to sort by data provider
Edit only 1 query at a time
User Objects can be created
View SQL

For Webi: Data Tab
When editing existing query, does NOT add in to the report
Edit Query/Edit Report Icon
Properties tab for queries
Predefined conditions integrated together with classes and objects
Run Query Button on top (Only 1 option)
Can selectively run only 1 instead of all queries (Refresh too)
No View Button
No statistics/view data options
Can hide the Query Filter Box
Add Query Button (To open up another query panel)
Creates a Query Tab in Query Window
Has mini speed menu for those Tabs
Report Manager: Click down arrow to sort by query
Can click on query tab to edit directly (jump around)
No regular templates option
No User Objects capability
View SQL now available
Scope of Analysis Option (Click On/Off)
Appears on bottom of query panel (Below Query Filters Box)
Creating Query Filters (Conditions) more convenient: List of Operators and some Operand settings displayed within Query Filter-Builder.
No ‘Show List of prompts’ choice in Query Filters.
(Properties?) Tab next to Data Tab has box for changing retrieval record limit or retrieval time.

Report Manager:
For Deski:
Slice & Dice Panel
Format Templates
No drag and drop templates
Microsoft Formatting Toolbars
No Report Filter Window
Drilling: Must Grab All dimensions down path, or use scope of analysis

For Webi:
No Slice & Dice Panel
“Templates” Option (Drag and Drop)
No Format Templates
No Query on Query/Subquery Calc
No Grouping (Clip Icon)
No hide Objects
No Count All
No Fold option
Dragging/Dropping within Report Window very easy.
Can drag objects directly from Results Object window to Query Filters
No personal LOVs
Limited Microsoft Formatting Toolbars
Right Click on Edge of Report: Turn To Option
4 Report Options + 1 Full Chart Options as well
Report Filter Window Option (Appears on top of display)
To Remove Calcs: Drag Off or Structure Mode or Right Click/Remove Row or Column
Custom Sorts: But less sorting options
Breaks: Less Property Options
Appear on left side via properties tab (Must drill down)
Ranking: But less property options
Properties Tab on Left:
Have to click on option to see pull down’s
Contexts now different
Prompting options far more powerful and easy to use
Formulas/Variables:
Includes most Deski functions now
IF is a Function (Not a command): Like Excel
Display Format: More Difficult
Tabs on Left: Data/Functions/Operators
Formula on Right/Bottom
Name/Definition on Right/Top
Operators list remains fixed
Subquery Done Via Toolbar Option (Not in conditions)
Linking Multiple Data Providers: Merge Dimensions
New Toolbar Option
Easy to Use Menu
Drilling: Will Drill via New Query to lower level
Snapshot more limited

Tuesday, September 16, 2008

UNDERSTANDING ENTITIES - CHAPTER 2

What is an Entity ?

An entity is a physical representation of a logical grouping of data. Entities can be tangible, real things, such as a PERSON or ICE CREAM, or intangible concepts, such as a COST CENTER or MARKET. Entities do not represent single things. Instead, they represent collections of instances that contain the information of interest for all instances or occurrences. For example, a PERSON entity represents instances of things of type Person. Gabriel De Angelies, R.J golcher, Jessica Corter, and Venessa Westley are examples of specific instances of PERSON. A specific instance of an entity is represented by a row and is identified by a primary key.

An entity has the following characteristics:
• It has a name and description.
• It represents a class, rather than a single instance of a concept.
• It has the ability to uniquely identify each specific instance.
• It contains a logical grouping of attributes representing the information of interest to the enterprise.


Formal Entity Definitions

The following list contains entity definitions from some of the most influential leaders in data modeling. Notice the similarities:
• Chen (1976): “A thing which can be distinctly identified.”
• Date (1986): “Any distinguishable object that is to be represented in the database.”
• Finklestein (1989): “A data entity represents some ‘thing’ that is to be stored for later reference. The term entity refers to the logical representation of data.”

Defining Entity Types

Within the independent and dependent entities are entity types:
• Core entities--These are sometimes called primary or prime entities. They represent the important objects about which the enterprise is interested in keeping data.
• Code/reference/classification entities--These entities contain rows that define the set of values, or domain, for an attribute.
• Associative entities--These entities are used to resolve many-to-many relationships.
• Subtype entities--These entities come in two types, exclusive and inclusive.

Core Entity

Core entities are the most important objects about which an enterprise is interested in keeping data. They are often referred to as prime, principal, or primary entities. Because these entities are so important, it is likely that they are used elsewhere in the enterprise. Take the time to look for similar entities because there are many opportunities for the reuse of core entities. Core entities should be modeled consistently throughout the enterprise. Good modelers consider this an essential best practice.


Note the straight corners of the independent entities, STORE and ICE CREAM and the rounded corners of the dependent entity STORE ICE CREAM.

A core entity can be an independent entity or a dependent entity. Figure 2.1 provides examples of core entities for an enterprise that sells ice cream. ICE CREAM represents the base products sold by the enterprise. STORE is an example of a distribution channel, or the vehicle through which a product is sold.

Consider that the enterprise is doing well and has decided to add another STORE. The model requires no change to support the addition of a new instance of STORE. It is simply another row added to the STORE entity. The same applies to ICE CREAM.

Notice the core entities ICE CREAM and STORE. Although the example may seem straightforward, it illustrates a powerful concept regarding the modeling of core entities.

Understanding how to model core entities as scalable and extensible containers of information requires the modeler to think about the entity as an abstract concept and to model the information independently of the way it is used today. In this example, model ICE CREAM completely outside the context of STORE and vice versa. So, if the enterprise decides to sell ICE CREAM using an additional channel, such as the Internet or door-to-door, the new channel can be added without disturbing other entities.

Code Entity

Code entities are always independent entities. They are often referred to as reference, classification, or type entities, depending on the methodology. The unique instances represented by code entities define the domain of values for attributes present in other entities. You might be tempted to use a single attribute in a code table. It is a best practice to include at least three attributes in a code entity: an identifier, a name (sometimes called a short name), and a description.

In Figure 2.2, TOPPING is an independent entity; note the sharp corners. TOPPING is also a code or classification entity. The instances (or rows) of TOPPING define the list of toppings available.





Figure 2.2
Code entities allow an enterprise to define a set of values for consistent use throughout the enterprise. The instances of a code entity define a domain of values for use elsewhere in the model.

Code entities usually contain a limited number of attributes. I have seen instances where these entities contain only a single attribute. I prefer to model code entities with an artificial identifier. Using an artificial identifier, along with a name and description, allows new kinds of TOPPING to be added as instances (rows) in the entity. Note that TOPPING contains three attributes.

I often refer to code entities as corporate business objects. The name, corporate business objects, indicates that the entities are defined and shared at a corporate level, not by a single application, system, or business unit. These entities are often shared by many databases to allow consistent roll-up reporting or trending analysis.
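
As a minimal sketch of a code entity, the following uses Python's built-in sqlite3 module to create TOPPING with the three attributes recommended above. The column names and the SUNDAE table that draws its values from TOPPING are assumptions made up for this illustration; they do not come from the figure.

```python
import sqlite3

def create_topping_model():
    """Build an in-memory database with a TOPPING code entity."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        CREATE TABLE topping (
            topping_id          INTEGER PRIMARY KEY,  -- artificial identifier
            topping_name        TEXT NOT NULL UNIQUE,
            topping_description TEXT NOT NULL
        );
        -- Hypothetical entity whose topping attribute is limited to TOPPING rows.
        CREATE TABLE sundae (
            sundae_id  INTEGER PRIMARY KEY,
            topping_id INTEGER NOT NULL REFERENCES topping (topping_id)
        );
    """)
    return conn

def add_topping(conn, name, description):
    """Add a new kind of topping as a row; the model itself does not change."""
    cur = conn.execute(
        "INSERT INTO topping (topping_name, topping_description) VALUES (?, ?)",
        (name, description),
    )
    return cur.lastrowid

def add_sundae(conn, topping_id):
    """Add a sundae; fails if topping_id is outside the domain defined by TOPPING."""
    cur = conn.execute("INSERT INTO sundae (topping_id) VALUES (?)", (topping_id,))
    return cur.lastrowid
```

Adding a topping is a plain insert, and an attempt to use a topping identifier that is not in TOPPING raises `sqlite3.IntegrityError`, which is the sense in which the rows of a code entity define a domain of values.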

Associative entity

Associative entities are entities that contain the primary key from two or more other entities. Associative entities are always dependent entities. They are used to resolve many-to-many relationships between other entities. Many-to-many relationships are those in which many instances of one entity are related to many instances of another. Associative entities allow us to model the intersection between the instances of the two entities, thereby allowing each instance in the associative entity to be unique.

Note

Many-to-many relationships cannot be implemented in a physical database. ERwin will automatically create an associative entity to resolve a many-to-many relationship when the model is changed from logical to physical mode.

Figure 2.1 uses an associative entity to resolve a many-to-many relationship between STORE and ICE CREAM. The addition of an associative entity allows the same ICE CREAM to be sold in many instances of STORE, while not requiring every STORE to sell the same ICE CREAM. The associative entity STORE ICE CREAM resolves the fact that an instance of STORE sells many instances of ICE CREAM and an instance of ICE CREAM is sold by many instances of STORE.
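
The STORE, ICE CREAM, and STORE ICE CREAM entities could be sketched as tables like this, using Python's built-in sqlite3 module. The column names and the helper `stores_selling` are assumptions for illustration; only the three entity names come from the text.

```python
import sqlite3

def create_store_model():
    """Build an in-memory database with two core entities and one associative entity."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE store (
            store_id   INTEGER PRIMARY KEY,
            store_name TEXT NOT NULL
        );
        CREATE TABLE ice_cream (
            ice_cream_id   INTEGER PRIMARY KEY,
            ice_cream_name TEXT NOT NULL
        );
        -- Associative entity: its primary key is made of the two parent keys.
        CREATE TABLE store_ice_cream (
            store_id     INTEGER NOT NULL REFERENCES store (store_id),
            ice_cream_id INTEGER NOT NULL REFERENCES ice_cream (ice_cream_id),
            PRIMARY KEY (store_id, ice_cream_id)
        );
    """)
    return conn

def stores_selling(conn, ice_cream_id):
    """Return the names of all stores that sell the given ice cream."""
    rows = conn.execute(
        "SELECT s.store_name FROM store s "
        "JOIN store_ice_cream sic ON sic.store_id = s.store_id "
        "WHERE sic.ice_cream_id = ? ORDER BY s.store_name",
        (ice_cream_id,),
    )
    return [name for (name,) in rows]
```

Because the primary key is the pair (store_id, ice_cream_id), each store and ice cream combination can appear only once, while one ice cream can still be linked to many stores and one store to many ice creams.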


Subtype Entity

Subtype entities are always dependent entities. You should use subtype entities when it makes sense to keep different sets of attributes for the instances of an entity. Finklestein refers to subtype entities as secondary entities. Subtype entities almost always have one or more “sibling” entities. The subtype entity siblings are related to a parent entity through a special relationship that is either exclusive or inclusive.

Note

Subtype sibling entities that have an exclusive relationship to the parent entity indicate that only one sibling has an instance for each instance of the parent entity. Exclusive subtypes represent an “is a” relationship.
Subtype sibling entities that have an inclusive relationship to the parent entity indicate that more than one sibling can have an instance for each instance of the parent entity.

Figure 2.3 shows the CONTAINER entity and the subtype entities CONE and CUP. The ice cream store apparently does not sell ice cream in bulk, only single servings. Note that an instance of CONTAINER must be either a CONE or a CUP. A CONTAINER cannot be both a CONE and a CUP. This is an exclusive subtype.

In Figure 2.3, the PERSON entity has two subtypes, EMPLOYEE and CUSTOMER. Note that an exclusive subtype would not allow a single instance of PERSON to contain facts common to both an EMPLOYEE and a CUSTOMER. A VENDOR can also be a CUSTOMER. These are examples of inclusive subtypes.






Figure 2.3
Two examples of subtype entities, PERSON and CONTAINER. Both use ERwin IE notation to represent exclusive and inclusive subtypes. The (X) in the subtype symbol of CONTAINER indicates exclusive. The absence of the (X) in the subtype symbol indicates inclusive.
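
One common way to enforce the "cannot be both" half of an exclusive subtype in a relational database is a type discriminator that each subtype table must match. The sketch below uses Python's built-in sqlite3 module; the discriminator column `container_type` and the attributes `cone_style` and `cup_size` are assumptions invented for this example, and the sketch does not force every CONTAINER to have a subtype row.

```python
import sqlite3

def create_container_model():
    """Build CONTAINER with exclusive subtypes CONE and CUP in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE container (
            container_id   INTEGER PRIMARY KEY,
            container_type TEXT NOT NULL CHECK (container_type IN ('CONE', 'CUP')),
            UNIQUE (container_id, container_type)
        );
        -- Each subtype row must point at a parent row of its own type.
        CREATE TABLE cone (
            container_id   INTEGER PRIMARY KEY,
            container_type TEXT NOT NULL DEFAULT 'CONE' CHECK (container_type = 'CONE'),
            cone_style     TEXT NOT NULL,
            FOREIGN KEY (container_id, container_type)
                REFERENCES container (container_id, container_type)
        );
        CREATE TABLE cup (
            container_id   INTEGER PRIMARY KEY,
            container_type TEXT NOT NULL DEFAULT 'CUP' CHECK (container_type = 'CUP'),
            cup_size       TEXT NOT NULL,
            FOREIGN KEY (container_id, container_type)
                REFERENCES container (container_id, container_type)
        );
    """)
    return conn
```

A container recorded as a CONE can receive a row in `cone`, but inserting a row for the same container into `cup` fails with `sqlite3.IntegrityError`, because no parent row with type CUP exists. An inclusive subtype would simply drop the discriminator and let several subtype tables reference the same parent.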


Structure Entity

Sometimes, instances of the same entity are related. In his 1992 book Strategic Systems Development, Clive Finklestein proposes the use of a structure entity to represent relationships between instances of an entity. Relationships between instances of an entity are called recursive relationships. Recursive relationships are a logical concept, a concept sometimes difficult for users to grasp.

Figure 2.4 shows the addition of a structure entity that allows a relationship between instances of EMPLOYEE. The diagram shows that the EMPLOYEE subtype of the PERSON entity has two subtypes, SERVER and MANAGER. The EMPLOYEE STRUCTURE entity represents the relationship between instances of EMPLOYEE.




Figure 2.4
The structure entity illustrates Clive Finklestein’s resolution for recursive relationships.
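
A structure entity can be sketched as a table whose key holds two references to the same parent entity. The example below uses Python's built-in sqlite3 module; the column names `manager_id` and `subordinate_id` and the helper `reports_of` are assumptions for illustration, since the text names only the EMPLOYEE and EMPLOYEE STRUCTURE entities.

```python
import sqlite3

def create_employee_model():
    """Build EMPLOYEE and an EMPLOYEE STRUCTURE entity in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE employee (
            employee_id   INTEGER PRIMARY KEY,
            employee_name TEXT NOT NULL
        );
        -- Structure entity: both key columns refer back to EMPLOYEE.
        CREATE TABLE employee_structure (
            manager_id     INTEGER NOT NULL REFERENCES employee (employee_id),
            subordinate_id INTEGER NOT NULL REFERENCES employee (employee_id),
            PRIMARY KEY (manager_id, subordinate_id),
            CHECK (manager_id != subordinate_id)
        );
    """)
    return conn

def reports_of(conn, manager_id):
    """Return the names of the employees related to the given manager."""
    rows = conn.execute(
        "SELECT e.employee_name FROM employee_structure es "
        "JOIN employee e ON e.employee_id = es.subordinate_id "
        "WHERE es.manager_id = ? ORDER BY e.employee_name",
        (manager_id,),
    )
    return [name for (name,) in rows]
```

Keeping the relationship in its own table means that EMPLOYEE itself needs no self-referencing column, and new relationships between employees are added as rows.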

Naming Entities

The name assigned to an entity should be indicative of the instances of the entity. The name should be understood and accepted across the enterprise. When selecting a name, keep an enterprise view and take care to use a name that reflects how the data is used throughout the entire enterprise, not just a single area. Use names that are meaningful to the user community and domain experts.

I hope you have a set of naming conventions that were developed for use in the enterprise, or an enterprise data model, to guide you. Using naming conventions ensures that names are constructed consistently across the enterprise, regardless of who constructs the name. The following sections provide a starter set of naming conventions and give examples of good and bad names.

Entity Naming Conventions

Naming conventions might not seem important if you work in a small organization with a small set of users. However, in a large organization with many development teams and many users, naming conventions greatly facilitate communication and data sharing. As a best practice, you should develop and maintain naming conventions in a central location and then document and publish them for the whole enterprise.

I include some pointers for beginning a good set of naming conventions, just in case your organization has not yet developed one:

• An entity name should be as descriptive as necessary. Use single-word names only when the name is a widely accepted concept. Consider using noun phrases.

• An entity name should be a singular noun or noun phrase. Use PERSON instead of PERSONS or PEOPLE, or CONTAINER instead of CONTAINERS.

• An entity name should be unique. Using the same entity name to contain different data, or a different entity name to contain the same data, is needlessly confusing to developers and users alike.

• An entity name should be indicative of the data that will be contained for each instance.

• An entity name should not contain special characters (such as !, @, #, $, %, ^, &, *, and so on) or show possession (PERSON’S ICE CREAM).

• An entity name should not include acronyms or abbreviations unless they are part of the accepted naming conventions.

I encourage modelers to use good naming conventions if they are available and, if they are not, to develop them following these guidelines.
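
Some of these guidelines can be checked mechanically. The following is a small sketch in Python that covers only three of them (no special characters, no possession, unique names); the function name `check_entity_name` and the wording of its messages are made up for this example, and rules such as "singular noun" or "descriptive" still need a human reviewer.

```python
import re

# Anything other than letters, digits, spaces, and apostrophes counts as special.
SPECIAL = re.compile("[^A-Za-z0-9 '\u2019]")
# A straight or curly apostrophe is treated as a sign of possession.
POSSESSIVE = re.compile("['\u2019]")

def check_entity_name(name, existing_names=()):
    """Return a list of guideline violations for a proposed entity name."""
    problems = []
    if SPECIAL.search(name):
        problems.append("contains special characters")
    if POSSESSIVE.search(name):
        problems.append("shows possession")
    if name.upper() in {n.upper() for n in existing_names}:
        problems.append("is not unique")
    return problems
```

For instance, `check_entity_name("ICE CREAM")` returns an empty list, while `check_entity_name("PERSON'S ICE CREAM")` reports that the name shows possession.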

Sunday, September 14, 2008

DATA MODELING CONCEPTS CHAPTER 1



CHAPTER 1-DATA MODELING CONCEPTS



The Role of Data Modeling



Data modeling tasks provide the most benefit when performed early in the development lifecycle. The model provides information critical to understanding the scope of a project for iterative development phases. Beginning the implementation phase without a clear understanding of the data requirements might cause your project to incur costly overruns or end up on the scrap heap.



An Introduction to Project Development



Many publications discuss project development, and this text does not cover this subject in detail. I included this section to assist modelers in understanding the role of data modeling in project development and to provide an understanding of when modeling should occur.



Most companies follow a methodology that outlines the development lifecycle selected to guide the development process. To some degree, most adhere to the same order of high-level concepts:



1. Problem definition

2. Requirements analysis

3. Conceptual design

4. Detail design

5. Implementation

6. Testing



This development method is generally referred to as the waterfall method. As you can see in Figure 1.1, each phase is completed before moving to the next, creating a “waterfall” effect.





Figure 1.1

The waterfall method of project development. Note that the results of each phase cascade into the next.



Many projects are developed using iterations or phases. An iterative development approach decreases risk by breaking the project into discrete manageable phases. Each phase includes analysis, detail design, implementation, and testing. Subsequent phases build upon and leverage the functionality of the preceding phase. However, within each phase, the waterfall method applies.



As with most engineering projects, you create a data model by following a set of steps.

1. Problem and scope definition

2. Requirements gathering

3. Analysis

4. Logical data model creation

5. Physical data model creation

6. Database creation



Figure 1.2 illustrates how each step provides input for the next.







Figure 1.2

Logical data model creation can occur prior to selecting a database platform (Oracle, DB2, Sybase, and so on). ERwin can provide support for specific physical properties if the physical data model is produced after the database platform is selected.



Problem and Scope Definition



Begin logical data modeling by defining the problem. This step is sometimes referred to as writing a mission or scoping statement. The problem definition can be a simple paragraph or it can be a complex document that outlines a series of business objectives. The problem definition defines the scope, or boundary, of the data model, much the way a survey defines property boundaries.



Gathering Information Requirements



Most industry experts agree that the most critical task in a development project is an accurate and complete definition of the requirements. In fact, an incomplete or inaccurate understanding of requirements can cause expensive re-work and significant delay.



Gathering information requirements is the act of discovering and documenting the information necessary to identify and define the entities, attributes, and business rules for the logical model. There are two well-recognized methods for gathering requirements: facilitated sessions and interviews. Most development methodologies recommend facilitated sessions. The sections that follow provide high-level guidelines for gathering information requirements using facilitated sessions. A later exercise demonstrates how to use the information gathered to create a data model using ERwin.



Analysis



You must analyze and research the data requirements and business rules to produce a complete logical model. Analysis tasks should provide accurate and complete definitions for all entities, attributes, and relationships. Metadata, data about the data, is collected and documented during the analysis phase.



The analysis can be performed by the modeler or by a business analyst. Either the modeler or the business analyst works with users to document how users intend to use the data. These tasks drive out the corporate business objects needed to support the information requirements. Corporate business objects are also called code, reference, or classification data structures. This is also the opportunity to document code values that will be used. You should carefully document any derived data, data that is created by manipulating or combining one or more other data elements, and data elements used in the derivation.



Logical Data Model



A logical data model is a visual representation of data structures, data attributes, and business rules. The logical model represents data in a way that can be easily understood by business users. The logical model design should be independent of platform or implementation language requirements or how the data will be used.



The modeler uses the data requirements and the results of analysis to produce the logical data model. The modeler also resolves the logical model to third normal form and validates against the enterprise data model, if available. Later sections provide a description of a complete logical model, resolving a logical model to third normal form, an overview of an enterprise model, and provide some tips on validating a logical model against an enterprise model.



After you compare the logical model and enterprise data model and make any necessary changes, it is important to review the model for accuracy and completeness. The best practice includes a peer review as well as a review with the business partners and development team.





Entities



Entities represent the things about which the enterprise is interested in keeping data. An entity can be a tangible object such as a person or a book, but it can also be conceptual such as a cost center or business unit. Entities are nouns and are expressed in singular form, CUSTOMER as opposed to CUSTOMERS, for clarity and consistency.



You should describe an entity using factual particulars that make it uniquely identifiable. Each instance of an entity must be separate and clearly identifiable from all other instances of that entity. For example, a data model to store information about customers must have a way of distinguishing one customer from another.



Figure 1.3 provides some examples of entities.









Figure 1.3

Here are examples of using ERwin to display entities in their simplest form.





Attributes



Attributes represent the data the enterprise is interested in keeping about objects. Attributes are nouns that describe the characteristics of entities.







Relationships



Relationships represent the associations between the objects about which the enterprise is interested in keeping data. A relationship is expressed as a verb or verb phrase that describes the association. Figure 1.5 provides some examples using ERwin’s Information Engineering (IE) notation to represent relationships.



Normalization



Normalization is the act of moving attributes to appropriate entities to satisfy the normal forms. Normalization is usually presented as a set of complex statements that make it seem a complicated concept. Actually, normalization is quite straightforward: “One fact in one place,” as stated by C.J. Date in his 1999 book An Introduction to Database Systems. Normalizing data means you design the data structures in such a way as to remove redundancy and limit unrelated structures.



Five normal forms are widely accepted in the industry. The forms are simply named first normal form, second normal form, third normal form, fourth normal form, and fifth normal form. In practice, many logical models are only resolved to third normal form.



Formal Definitions of Normal Forms



The following normal form definitions might seem intimidating; just consider them formulas for achieving normalization. Normal forms are based on relational algebra and should be interpreted as mathematical functions.



Business Normal Forms



In his 1992 book, Strategic Systems Development, Clive Finklestein takes a different approach to normalization. He defines business normal forms in terms of the resolution to those forms. Many modelers, myself included, find this business approach more intuitive and practical.



First business normal form (1BNF) removes repeating groups to another entity. This entity takes its name, and primary (compound) key attributes, from the original entity and from the repeating group.
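
A small worked example may make 1BNF concrete. The sketch below, in plain Python, starts from hypothetical STORE records that carry a repeating group of flavors and moves that group to a new STORE FLAVOR entity keyed by the store identifier plus the flavor. The field names and the function `remove_repeating_group` are assumptions for illustration.

```python
def remove_repeating_group(stores):
    """Split records with a repeating 'flavors' group into STORE and STORE FLAVOR rows."""
    store_rows = []
    store_flavor_rows = []
    for store in stores:
        # The original entity keeps its key and its non-repeating attributes.
        store_rows.append(
            {"store_id": store["store_id"], "store_name": store["store_name"]}
        )
        # The new entity takes its compound key from the original entity
        # (store_id) and from the repeating group (flavor).
        for flavor in store["flavors"]:
            store_flavor_rows.append(
                {"store_id": store["store_id"], "flavor": flavor}
            )
    return store_rows, store_flavor_rows

# Hypothetical unnormalized input.
UNNORMALIZED = [
    {"store_id": 1, "store_name": "Downtown", "flavors": ["VANILLA", "MINT"]},
    {"store_id": 2, "store_name": "Uptown", "flavors": ["VANILLA"]},
]

STORE, STORE_FLAVOR = remove_repeating_group(UNNORMALIZED)
```

After the split, a store can offer any number of flavors without changing the structure of STORE.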



Second business normal form (2BNF) removes attributes that are partially dependent on the primary key to another entity. The primary (compound) key of this entity is the primary key of the entity in which it originally resided, together with all additional keys on which the attribute is wholly dependent.
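
As an illustration of a partial dependency, assume hypothetical STORE ICE CREAM rows keyed by (store_id, ice_cream_id) that also carry ice_cream_name. The name depends only on ice_cream_id, which is just part of the key, so the plain Python sketch below moves it to an ICE CREAM entity. All field names and the function `remove_partial_dependency` are made up for this example.

```python
def remove_partial_dependency(rows):
    """Move ice_cream_name, which depends only on ice_cream_id, to its own entity."""
    ice_cream = {}        # keyed by ice_cream_id, so each name is stored once
    store_ice_cream = []  # keeps only attributes that depend on the whole key
    for row in rows:
        ice_cream[row["ice_cream_id"]] = {
            "ice_cream_id": row["ice_cream_id"],
            "ice_cream_name": row["ice_cream_name"],
        }
        store_ice_cream.append(
            {
                "store_id": row["store_id"],
                "ice_cream_id": row["ice_cream_id"],
                "price": row["price"],
            }
        )
    return list(ice_cream.values()), store_ice_cream

# Hypothetical input: the name VANILLA is repeated for every store that sells it.
ROWS = [
    {"store_id": 1, "ice_cream_id": 10, "ice_cream_name": "VANILLA", "price": 3},
    {"store_id": 2, "ice_cream_id": 10, "ice_cream_name": "VANILLA", "price": 4},
    {"store_id": 1, "ice_cream_id": 11, "ice_cream_name": "MINT", "price": 3},
]

ICE_CREAM, STORE_ICE_CREAM = remove_partial_dependency(ROWS)
```

The price stays in STORE ICE CREAM because it depends on both the store and the ice cream, while each ice cream name now lives in exactly one row.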



Third business normal form (3BNF) removes attributes that are not dependent at all on the primary key to another entity where they are wholly dependent on the primary key of that entity.



Fourth business normal form (4BNF) removes attributes that are dependent on the values of the primary key or that are optional to a secondary entity where they wholly depend on the value of the primary key or where they must (it is mandatory) exist in that entity.



Fifth business normal form (5BNF) exists as a structure entity if recursive or other associations exist between occurrences of secondary entities or if recursive associations exist between occurrences of their principal entity.



A Complete Logical Data Model



A complete logical model should be in third business normal form and include all entities, attributes, and relationships required to support the data requirements and the business rules associated with the data.