Wednesday, March 21, 2012

Oracle Datatype Summary


Code Datatype Description
1
VARCHAR2(size [BYTE | CHAR])
Variable-length character string having maximum length size bytes or characters. Maximum size is 4000 bytes or characters, and minimum is 1 byte or 1 character. You must specify size for VARCHAR2.
BYTE indicates that the column will have byte length semantics. CHAR indicates that the column will have character semantics.
2
NVARCHAR2(size)
Variable-length Unicode character string having maximum length size characters. The number of bytes can be up to two times size for AL16UTF16 encoding and three times size for UTF8 encoding. Maximum size is determined by the national character set definition, with an upper limit of 4000 bytes. You must specify size for NVARCHAR2.
3
NUMBER [ (p [, s]) ]
Number having precision p and scale s. The precision p can range from 1 to 38. The scale s can range from -84 to 127. Both precision and scale are in decimal digits. A NUMBER value requires from 1 to 22 bytes.
4
FLOAT [(p)]
A subtype of the NUMBER datatype having precision p. A FLOAT value is represented internally as NUMBER. The precision p can range from 1 to 126 binary digits. A FLOAT value requires from 1 to 22 bytes.
5
LONG
Character data of variable length up to 2 gigabytes, or 2^31 - 1 bytes. Provided for backward compatibility.
6
DATE
Valid date range from January 1, 4712 BC, to December 31, 9999 AD. The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The size is fixed at 7 bytes. This datatype contains the datetime fields YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. It does not have fractional seconds or a time zone.
7
BINARY_FLOAT
32-bit floating point number. This datatype requires 5 bytes, including the length byte.
8
BINARY_DOUBLE
64-bit floating point number. This datatype requires 9 bytes, including the length byte.
9
TIMESTAMP [(fractional_seconds_precision)]
Year, month, and day values of date, as well as hour, minute, and second values of time, where fractional_seconds_precision is the number of digits in the fractional part of the SECOND datetime field. Accepted values of fractional_seconds_precision are 0 to 9. The default is 6. The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The size varies from 7 to 11 bytes, depending on the precision. This datatype contains the datetime fields YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. It contains fractional seconds but does not have a time zone.
10
TIMESTAMP [(fractional_seconds_precision)] WITH TIME ZONE
All values of TIMESTAMP as well as time zone displacement value, where fractional_seconds_precision is the number of digits in the fractional part of the SECOND datetime field. Accepted values are 0 to 9. The default is 6. The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The size is fixed at 13 bytes. This datatype contains the datetime fields YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, TIMEZONE_HOUR, and TIMEZONE_MINUTE. It has fractional seconds and an explicit time zone.
11
TIMESTAMP [(fractional_seconds_precision)] WITH LOCAL TIME ZONE
All values of TIMESTAMP WITH TIME ZONE, with the following exceptions:
  • Data is normalized to the database time zone when it is stored in the database.
  • When the data is retrieved, users see the data in the session time zone.
The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The size varies from 7 to 11 bytes, depending on the precision.
12
INTERVAL YEAR [(year_precision)] TO MONTH
Stores a period of time in years and months, where year_precision is the number of digits in the YEAR datetime field. Accepted values are 0 to 9. The default is 2. The size is fixed at 5 bytes.
13
INTERVAL DAY [(day_precision)] TO SECOND [(fractional_seconds_precision)]
Stores a period of time in days, hours, minutes, and seconds, where
  • day_precision is the maximum number of digits in the DAY datetime field. Accepted values are 0 to 9. The default is 2.
  • fractional_seconds_precision is the number of digits in the fractional part of the SECOND field. Accepted values are 0 to 9. The default is 6.
The size is fixed at 11 bytes.
14
RAW(size)
Raw binary data of length size bytes. Maximum size is 2000 bytes. You must specify size for a RAW value.
15
LONG RAW
Raw binary data of variable length up to 2 gigabytes.
16
ROWID
Base 64 string representing the unique address of a row in its table. This datatype is primarily for values returned by the ROWID pseudocolumn.
17
UROWID [(size)]
Base 64 string representing the logical address of a row of an index-organized table. The optional size is the size of a column of type UROWID. The maximum size and default is 4000 bytes.
18
CHAR [(size [BYTE | CHAR])]
Fixed-length character data of length size bytes or characters. Maximum size is 2000 bytes or characters. Default and minimum size is 1 byte.
BYTE and CHAR have the same semantics as for VARCHAR2.
19
NCHAR[(size)]
Fixed-length character data of length size characters. The number of bytes can be up to two times size for AL16UTF16 encoding and three times size for UTF8 encoding. Maximum size is determined by the national character set definition, with an upper limit of 2000 bytes. Default and minimum size is 1 character.
20
CLOB
A character large object containing single-byte or multibyte characters. Both fixed-width and variable-width character sets are supported, both using the database character set. Maximum size is (4 gigabytes - 1) * (database block size).
21
NCLOB
A character large object containing Unicode characters. Both fixed-width and variable-width character sets are supported, both using the database national character set. Maximum size is (4 gigabytes - 1) * (database block size). Stores national character set data.
22
BLOB
A binary large object. Maximum size is (4 gigabytes - 1) * (database block size).
23
BFILE
Contains a locator to a large binary file stored outside the database. Enables byte stream I/O access to external LOBs residing on the database server. Maximum size is 4 gigabytes.
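As a quick illustration of how several of these datatypes are combined in practice, here is a minimal, hypothetical table definition (the table and column names are made up for this example):

CREATE TABLE emp_documents (
  doc_id        NUMBER(10)      NOT NULL,            -- fixed-precision number
  title         VARCHAR2(200 CHAR),                  -- character (not byte) length semantics
  created_on    DATE DEFAULT SYSDATE,                -- date and time to the second
  last_modified TIMESTAMP(6) WITH LOCAL TIME ZONE,   -- normalized to the database time zone
  retention     INTERVAL YEAR(2) TO MONTH,           -- a period in years and months
  checksum      RAW(20),                             -- raw binary data
  body          CLOB,                                -- large character content
  scanned_copy  BLOB                                 -- large binary content
);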

Wednesday, October 20, 2010

when an update statement is executed in oracle database

In short,
1. the modified data (redo) is written to the online redo log files (and archived if ARCHIVELOG mode is enabled), and 2. the control file is updated to indicate the most recent checkpoint.


in detail 
1. The data set for the update is taken from the DB CACHE.

2. The ‘Pre Change’ data is written to the UNDO CACHE.

3. The ‘Post Update’ data is written to the REDO CACHE as a delta (a change record that contains just enough information to reproduce the change). The ‘Pre Change’ data from the UNDO CACHE is also written to the REDO CACHE (alternatively known as the REDO LOG BUFFER).

4. After a COMMIT is issued the LOG WRITER Process (LGWR) writes the REDO CACHE to the REDO LOG FILES.

5. If ARCHIVELOG mode is enabled then the ARCHIVER Process (ARCH) writes a copy of the filled REDO LOG FILES to one or more alternative destinations (up to 10) before the REDO LOGS are reused.

6. The CHECKPOINT Process (CKPT) runs regularly and ensures that changed data (dirty blocks) is written from the DB CACHE back to the datafiles (by the Database Writer, DBWR). In addition, this process updates the CONTROL FILES. A small SQL sketch of this flow follows.
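Below is a minimal sketch, assuming a table named EMP exists in the current schema; the queries use the standard V$DATABASE and V$LOG views to observe the checkpoint and redo log state described above.

UPDATE emp SET sal = sal * 1.1 WHERE deptno = 10;   -- changed blocks stay in the DB CACHE; undo and redo are generated in memory
COMMIT;                                             -- LGWR flushes the REDO LOG BUFFER to the online redo log files

-- observe the state described in the steps above
SELECT log_mode, checkpoint_change# FROM v$database;   -- ARCHIVELOG mode and the most recent checkpoint SCN
SELECT group#, status, archived FROM v$log;            -- online redo log groups and whether they have been archived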

Tuesday, October 19, 2010

Recreate control file (1z0-042 solution in detail -ques 1)


alter database backup controlfile to ['filename' | trace]

This command comes in two versions. One backs up the control file in a binary format, while the other backs it up in a human-readable form. It is required if the database is running in ARCHIVELOG mode and a structural change was made to the database.
Backing it up in a binary format:
alter database backup controlfile to '/some/arbitrary/path';
alter database backup controlfile to '/some/arbitrary/path' reuse;
Backing it up in a human readable format:
alter database backup controlfile to trace;
alter database backup controlfile to trace as '/some/arbitrary/path';
alter database backup controlfile to trace as '/some/arbitrary/path' reuse;

2. Remove the control files (control01.ctl, control02.ctl, control03.ctl) from the location C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\ for testing purposes, so that they can be re-created.

3. Run the script below (with the instance started in NOMOUNT mode) to re-create the lost control files:

CREATE CONTROLFILE REUSE DATABASE "assad" RESETLOGS FORCE LOGGING ARCHIVELOG
  MAXLOGFILES 16
  MAXLOGMEMBERS 3
  MAXDATAFILES 100
  MAXINSTANCES 8
  MAXLOGHISTORY 292
LOGFILE
GROUP 1 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\redo01.log'  SIZE 5M,
GROUP 2 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\redo02.log'  SIZE 5M,
GROUP 3 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\redo03.log'  SIZE 5M
-- STANDBY LOGFILE
DATAFILE
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\SYSTEM01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\UNDOTBS01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\SYSAUX01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\USERS01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\EXAMPLE01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\EMP',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\ASSAD\EMPDATA'
CHARACTER SET WE8ISO8859P1
;
Control file created.

SQL> alter database open resetlogs;

Database altered.

and this is the end of your test.




Tuesday, October 12, 2010

Steps towards a data warehouse

The following are the typical steps involved in a data warehousing project cycle.
  • ETL
    •   Task Description
      • The ETL (Extraction, Transformation, Loading) process typically takes the longest to develop, and this can easily take up to 50% of the data warehouse implementation cycle or longer. The reason for this is that it takes time to get the source data, understand the necessary columns, understand the business rules, and understand the logical and physical data models.
    • Time Requirement
      • 1 - 6 weeks.
    • Deliverables
      • Data Mapping Document
      • ETL Script / ETL Package in the ETL tool (a minimal SQL sketch of such a script follows this list)

  • OLAP Cube Design
    •   Task Description
    • Time Requirement
      • 1 - 2 weeks.
    • Deliverables
      • Documentation specifying the OLAP cube dimensions and measures.
      • Actual OLAP cube / report.

  • Front End Development
    •   Task Description
      • Regardless of the strength of the OLAP engine and the integrity of the data, if the users cannot visualize the reports, the data warehouse brings zero value to them. Hence front end development is an important part of a data warehousing initiative. The most important thing is that the reports should be delivered over the web, so the only thing the user needs is a standard browser.
    • Time Requirement
      • 1 - 4 weeks.
    • Deliverables
      • Front End Deployment Documentation
  • Report Development
    •   Task Description
      • Report specification typically comes directly from the requirements phase. To the end user, the only direct touchpoint he or she has with the data warehousing system is the reports they see. So report development, although not as time consuming as some of the other steps such as ETL and data modeling, nevertheless plays a very important role in determining the success of the data warehousing project.
    • Time Requirement
      • 1 - 2 weeks.
    • Deliverables
      • Report Specification Documentation.
      • Reports set up in the front end / reports delivered to user's preferred channel.
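As a rough idea of what the ETL Script deliverable can look like when written directly in SQL, here is a minimal sketch; the staging table stg_customers, the dimension table dim_customer and the mapping rules are all hypothetical.

-- hypothetical mapping: stg_customers (source extract) -> dim_customer (warehouse dimension)
INSERT INTO dim_customer (customer_key, customer_name, city, country)
SELECT s.customer_id,
       INITCAP(TRIM(s.cust_name)),     -- cleansing rule taken from the data mapping document
       NVL(s.city, 'UNKNOWN'),
       UPPER(s.country_code)
FROM   stg_customers s;
COMMIT;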

Tuesday, October 5, 2010

simply Data Warehouse - in detail

Extract, transform, load

Extract, transform, and load (ETL) is a process in database usage and especially in data warehousing that involves:
  • Extracting data from outside sources
  • Transforming it to fit operational needs (which can include quality levels)
  • Loading it into the end target (database or data warehouse)

Extract

The first part of an ETL process involves extracting the data from the source systems. Most data warehousing projects consolidate data from different source systems. Each separate system may also use a different data organization/format. Common data source formats are relational databases and flat files. In general, the goal of the extraction phase is to convert the data into a single format which is appropriate for transformation processing.
An intrinsic part of the extraction involves parsing the extracted data to check whether it meets an expected pattern or structure. If not, the data may be rejected entirely or in part.
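For example, a flat-file source can be exposed to Oracle as an external table, so the extract becomes a simple query; the directory object, file name and columns below are hypothetical.

-- hypothetical external table over a comma-separated feed file
CREATE TABLE ext_sales_feed (
  sale_id    NUMBER,
  sale_date  VARCHAR2(10),        -- converted to DATE during the transform step
  amount     NUMBER(12,2)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY etl_dir       -- directory object pointing at the file drop location
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('sales_feed.csv')
)
REJECT LIMIT UNLIMITED;           -- rows that do not match the expected structure are rejected, as noted above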

Transform

The transform stage applies a series of rules or functions to the extracted data from the source to derive the data for loading into the end target. Some data sources will require very little or even no manipulation of data. In other cases, one or more of the following transformation types may be required to meet the business and technical needs of the target database: selecting, sorting, joining, aggregation, etc.
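A sketch of what such transformations can look like in SQL, assuming hypothetical staging tables stg_sales and stg_products:

-- selecting, filtering, joining and aggregating staged data before it is loaded
SELECT p.product_category,
       TRUNC(s.sale_date, 'MM')  AS sale_month,
       SUM(s.amount)             AS total_amount
FROM   stg_sales    s
JOIN   stg_products p ON p.product_id = s.product_id
WHERE  s.amount > 0                                     -- simple business rule: ignore zero and negative amounts
GROUP  BY p.product_category, TRUNC(s.sale_date, 'MM')
ORDER  BY sale_month, p.product_category;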

Load

The load phase loads the data into the end target, usually the data warehouse (DW). Depending on the requirements of the organization, this process varies widely. Some data warehouses may overwrite existing information with cumulative information; updating the extracted data is frequently done on a daily, weekly or monthly basis.
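A minimal sketch of a load step, assuming a hypothetical cleansed staging table stg_sales_clean and a hypothetical target fact table fact_sales; a direct-path insert appends the new rows without touching existing data.

-- direct-path (append) insert into the target fact table, run once per load cycle
INSERT /*+ APPEND */ INTO fact_sales (date_key, product_key, store_key, quantity, sales_amount)
SELECT s.date_key, s.product_key, s.store_key, s.quantity, s.sales_amount
FROM   stg_sales_clean s;       -- output of the transform step
COMMIT;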

Real-life ETL cycle

The typical real-life ETL cycle consists of the following execution steps:
  1. Cycle initiation
  2. Build reference data
  3. Extract (from sources)
  4. Validate
  5. Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
  6. Stage (load into staging tables, if used)
  7. Audit reports (for example, on compliance with business rules. Also, in case of failure, helps to diagnose/repair)
  8. Publish (to target tables)
  9. Archive
  10. Clean up
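One way to orchestrate the cycle above is a small PL/SQL driver; the package etl_pkg and every procedure named here are hypothetical placeholders for real ETL code or an ETL tool.

BEGIN
  etl_pkg.init_cycle;              -- 1. cycle initiation
  etl_pkg.build_reference_data;    -- 2. build reference data
  etl_pkg.extract_sources;         -- 3. extract from sources
  etl_pkg.validate_staging;        -- 4. validate
  etl_pkg.transform;               -- 5. clean, apply business rules, aggregate
  etl_pkg.load_staging_tables;     -- 6. stage
  etl_pkg.run_audit_reports;       -- 7. audit reports
  etl_pkg.publish_to_targets;      -- 8. publish
  etl_pkg.archive_feed_files;      -- 9. archive
  etl_pkg.cleanup;                 -- 10. clean up
EXCEPTION
  WHEN OTHERS THEN
    etl_pkg.log_failure(SQLERRM);  -- the audit trail helps diagnose and repair a failed cycle
    RAISE;
END;
/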




Monday, October 4, 2010

simply Data Warehouse

A data warehouse is a place where data is stored for archival, analysis and security purposes. Usually a data warehouse is either a single computer or many computers (servers) tied together to create one giant computer system.
The data can consist of raw data or formatted data. It can cover various topics, including an organization's sales, salaries, operational data, summaries of data including reports, copies of data, human resource data, inventory data, and external data used for simulations and analysis.
Besides being a storehouse for large amounts of data, a data warehouse must have systems in place that make it easy to access the data and use it in day-to-day operations. A data warehouse is sometimes said to be a major role player in a decision support system (DSS). DSS is a technique used by organizations to come up with facts, trends or relationships that can help them make effective decisions or create effective strategies to accomplish their organizational goals.

Data Warehousing Models

There are many different models of data warehouses. Online Transaction Processing (OLTP), which is one data warehouse model, is built for speed and ease of use. Another type of model is Online Analytical Processing (OLAP), which is more difficult to use and adds an extra step of analysis within the data. It usually requires more steps, which slows the process down, and it requires much more data in order to analyze certain queries.
In addition to this model, one of the more common data warehouse models is a data warehouse that is subject oriented, time variant, non volatile and integrated. Subject oriented means that data is linked together and organized by relationships. Time variant means that any data that is changed in the data warehouse can be tracked; usually all changes to data are stamped with a date and time and with a before and after value, so that changes can be shown over a period of time. Non volatile means that the data is never deleted or erased, which is a great way to protect your most crucial data; because the data is retained, you can continue to use it in later analysis. Finally, the data is integrated, which means that a data warehouse uses organization-wide data instead of data from just one department.
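One common way the time variant and non volatile properties show up physically is a dimension table that keeps every version of a row with effective dates instead of updating it in place; the table and values below are a hypothetical sketch.

-- each change inserts a new version; nothing is deleted or overwritten
CREATE TABLE dim_customer_hist (
  customer_id   NUMBER         NOT NULL,
  customer_name VARCHAR2(100),
  credit_limit  NUMBER(12,2),
  valid_from    DATE           NOT NULL,   -- when this version became current
  valid_to      DATE                       -- NULL means this is the current version
);

-- the version that was current on a given date, showing the change history over time
SELECT *
FROM   dim_customer_hist
WHERE  customer_id = 42
AND    DATE '2010-01-01' >= valid_from
AND    (valid_to IS NULL OR DATE '2010-01-01' < valid_to);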
Besides the term data warehouse, a term that is frequently used is data mart. Data marts are smaller and less integrated data stores. They might hold just a database of human resources records, or sales data for just one division.

Types of Data Warehouses

With improvements in technology, as well as innovations in using data warehousing techniques, data warehouses have evolved from Offline Operational Databases to include Real Time and Integrated data warehouses.
Offline Operational Data Warehouses are data warehouses where data is usually copied and pasted from real time data networks into an offline system where it can be used. This is usually the simplest and least technical type of data warehouse.
Offline Data Warehouses are data warehouses that are updated regularly (daily, weekly or monthly); the data is then stored in an integrated structure where others can access it and perform reporting.
Real Time Data Warehouses are data warehouses that are updated moment by moment with the influx of new data. For instance, a Real Time Data Warehouse might incorporate data from a Point of Sale system and be updated with each sale that is made.
Integrated Data Warehouses are data warehouses that other systems can access for their operational needs. Some Integrated Data Warehouses are used by other data warehouses, which access them to process reports as well as to look up current data.

oracle data warehousing


Implementation of Dimensional Model

There are two options for implementing the dimensional model, which determine how it is physically represented in the database.
  •   relational implementation - implemented in the database with ordinary tables and foreign keys
  •   multidimensional implementation - requires a special feature in the database that allows cubes to be defined directly as objects in the database
Relational implementation (star schema)
The ERD of a relational data warehouse is shaped somewhat like a star, and thus the term star schema is used to refer to this kind of implementation.



The main table in the middle is referred to as the fact table because it holds the facts, or measures. This represents the cube. The tables surrounding the fact table are known as dimension tables. These are the dimensions of our cube.
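A minimal, hypothetical star schema expressed as tables and foreign keys (the relational implementation described above); all table and column names are made up for the example.

CREATE TABLE dim_date    (date_key    NUMBER PRIMARY KEY, day_date DATE, month_no NUMBER, year_no NUMBER);
CREATE TABLE dim_product (product_key NUMBER PRIMARY KEY, product_name VARCHAR2(100), category VARCHAR2(50));
CREATE TABLE dim_store   (store_key   NUMBER PRIMARY KEY, store_name VARCHAR2(100), region VARCHAR2(50));

-- the fact table in the middle of the star holds the measures
CREATE TABLE fact_sales (
  date_key     NUMBER NOT NULL REFERENCES dim_date,      -- foreign keys point out to the dimensions
  product_key  NUMBER NOT NULL REFERENCES dim_product,
  store_key    NUMBER NOT NULL REFERENCES dim_store,
  quantity     NUMBER,
  sales_amount NUMBER(12,2)
);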

Oracle Warehouse - Multidimensional Implementation (OLAP)

A multidimensional implementation or OLAP (online analytic or analytical processing) requires a database with special features that allow it to store cubes as actual objects in the database, and not just tables that are used to represent a cube and dimensions. It also provides advanced calculation and analytic content built into the database to facilitate advanced analytic querying. Oracle's Essbase product is one such database and was originally developed by Hyperion. Oracle recently acquired Hyperion, and is now promoting Essbase as a tool for custom analytics and enterprise performance management applications. The Oracle Database Enterprise Edition has an additional licensable option, called OLAP, that embeds a full-featured OLAP server directly in an Oracle database.




Oracle Warehouse - Designing the Cube

A dimensional model is naturally able to handle the concept of different levels of data by modeling a hierarchy within a dimension. The time/date dimension is an easy example of using various levels: add up the daily totals to get the totals for the month, and add up 12 monthly totals to get the yearly sales.
The time/date dimension just needs to store a value to indicate the day, month, and year to be able to provide a view of the data at each of those levels. Combining the various levels together then defines a hierarchy.
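Assuming the hypothetical dim_date and fact_sales tables sketched earlier, the day, month and year levels of the hierarchy can be rolled up in a single query:

-- totals at the day, month and year levels in one pass
SELECT d.year_no, d.month_no, d.day_date,
       SUM(f.sales_amount) AS total_sales
FROM   fact_sales f
JOIN   dim_date   d ON d.date_key = f.date_key
GROUP  BY ROLLUP (d.year_no, d.month_no, d.day_date);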


