25
STAATLICH ANERKANNTE FACHHOCHSCHULE Author I: Dip.-Inf. (FH) Johannes Hoppe Author II: M.Sc. Johannes Hofmeister Author III: Prof. Dr. Dieter Homeister Date: 18.03.2011 STUDIEREN UND DURCHSTARTEN.

DMDW Lesson 03 - Data Warehouse Theory

Embed Size (px)

DESCRIPTION

I'm planning to give you a detailed introduction to the concepts of the data warehouse world. We will also see why data mining and data warehouses are closely connected to each other.

Citation preview

Page 1: DMDW Lesson 03 - Data Warehouse Theory

STAATLICHANERKANNTEFACHHOCHSCHULE

Author I: Dip.-Inf. (FH) Johannes HoppeAuthor II: M.Sc. Johannes HofmeisterAuthor III: Prof. Dr. Dieter HomeisterDate: 18.03.2011

STUDIERENUND DURCHSTARTEN.

Page 2: DMDW Lesson 03 - Data Warehouse Theory

STAATLICHANERKANNTEFACHHOCHSCHULE

Data Warehouse

Author I: Dip.-Inf. (FH) Johannes HoppeAuthor II: M.Sc. Johannes HofmeisterAuthor III: Prof. Dr. Dieter Homeister Date: 18.03.2011

Page 3: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

01

3

Page 4: DMDW Lesson 03 - Data Warehouse Theory

Definition DW

4

“A data warehouse is a single source for key, corporate information needed to enable business decisions .”

Dieter Homeister (his DM Script)

Page 5: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

OLTP and DSS Defined

› An application that updates is called an on-line transaction processing (OLTP) application

› An application that issues queries to the readonly database is called a decision support system (DSS)

5

Page 6: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Stovepipe vs. Integration

› When systems stand by themselves they are often referred to as stovepipes

› Systems that easily share data are called well integrated systems

6

Page 7: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Problems with Stovepipe Architecture (1/2)

› Problems› Users who wish to access data must query several different DSS

to find it › Data may have fundamental conflicts between DSS › a department code table in one DSS may differ in another DSS › a measurement may be stored in meters in one DSS and yards in

another

7

Page 8: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Problems with Stovepipe Architecture (2/2)

› Solution:› Use a data warehouse, where data is integrated from the

several different stovepipe systems › Data warehouse is really sharing-lite -- you don’t have to

co-ordinate as much when applications are built and you still reap the benefits of data sharing

8

Page 9: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Data Warehouse Solution

› A data warehouse is an attempt to integrate separate DSS so that users can query one place to find the answers to their questions

› A data warehouse has the key, corporate data in the organization

› A data warehouse tracks historical data

9

Page 10: DMDW Lesson 03 - Data Warehouse Theory

Selling the Data Warehouse

02

10

Page 11: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Selling the Data Warehouse (1/2)

› A data warehouse project will fail without corporate sponsorship

› Preferably, the project should be sponsored by the CEO › The CEO must be sold on the value to the business to improve

competitive advantage by deploying a data warehouse

11

Page 12: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Selling the Data Warehouse (2/2)

› If an active, corporate sponsor does not exist, data sources will be very difficult to identify

› Only add data to the warehouse that will answer key, corporate questions asked by the corporate sponsor. Otherwise, you will have a data dump

12

Page 13: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Building a Useful Data Warehouse

You really need: › strong executive sponsorship › good knowledge of the data › sound software engineering › stability from source systems › users who want a success › A 75 percent failure rate is often cited

13

Page 14: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Enterprise Information System

› An EIS (Enterprise Information System) allows users to query data in a data warehouse

› Now users can access key, corporate data in the data warehouse

14

Page 15: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Users of an Enterprise Information System (1/3)

› multiple EIS (or different graphical interfaces) are needed to satisfy different types of users

General users› want a tool that provides detailed data, but is easy to use › Want access to the data warehouse to do routine tasks

such as Find me Joes phone number, etc. › Simple application, not focused on large reports

15

Page 16: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Users of an Enterprise Information System (2/3)

Executives› Want a high-level, summary data (and a simple tool)› Must be easy to use, users want to click a few buttons and

get data they want › Results must be graphs › Users should be able to drill-down into key areas.

16

Page 17: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Users of an Enterprise Information System (3/3)

Analysts› want a flexible, more detailed tool › Often very knowledgeable about the data › Willing to do more work to learn about the data › Sometimes even learn SQL to issue their own ad-hoc

queries

17

Page 18: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Need for Data Warehouses

› Data warehouses provide a single place to store key corporate data › users can go one place to find this key data using an enterprise

information system (EIS) › also a place to store and access historical data

› Users measure performance goals for their company over a period of time› Company statistics are available › Data not stored in the same place is difficult to locate and compare, easily lost › Single query can be used to access key data

18

Page 19: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Security & Data Warehouses

› Building a data warehouse does increase security risk because key, corporate information are all in one place

› Risk reduction: database system components can be used to protect the data warehouse. These include › Views › Access control › Security Administration › Encryption › Audit (logging of all accesses)

19

Page 20: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Moving Data into the Data Warehouse

› Moving data from source OLTP systems to the data warehouse is one of the hardest tasks in data warehousing

› Updates to the data warehouse are performed periodically › weekly , nightly, monthly …

› Occasionally, real-time data is needed in a data warehouse, but this is not very common

see the document about ETL, too!

20

Page 21: DMDW Lesson 03 - Data Warehouse Theory

Data Warehouse

Data Mart

› A data mart is a subset of the data warehouse that may make it simpler for users to access key corporate data

› Sometimes, users only need a piece of data from the data warehouse

› The data mart is typically fed from the data warehouse

21

Page 22: DMDW Lesson 03 - Data Warehouse Theory

References

Data Warehouse Books and References

Ralph Kimball, Margy Rossl: The Data Warehouse Toolkit, 2nd Ed., John Wiley & Sons 2002 (Lists of pitfalls, very detailed for several applications like CRM, HR, Insurances)

W. H. Inmon: Building the Data Warehouse, 3rd Ed., John Wiley & Sons 2002 (DW design, migration, techical details)

Claudia Imhoff, Nicholas Galemmo, Jonathan G. Geiger: Mastering Data Warehouse Design, John Wiley & Sons 2003 (Technical and business view, design, optimization)

Donald K. Burleson, W. H. Inmon, Joseph Hudicka: The Data Warehouse eBusiness DBA Handbook, BMC Software and DBAzine/Rampant Techpress 2003 (Available as eBook, technical details, eBusiness, focus on Oracle, DB/DW administration, tools)

22

Page 23: DMDW Lesson 03 - Data Warehouse Theory

References

Data Warehouse Books and References

Maria Sueli Almeida, Missao Ishikawa, Joerg Reinschmidt, Torsten Roeber: Getting Started with DataWarehouse and Business Intelligence, www.redbooks.ibm.com, 1999 (eBook from IBM, focus on DB2, very technical)

Mark W. Humphries, Michael W. Hawkins, Michelle C. Dy: Data Warehousing, Pearson Education, 1998 (Very technical, incl. project mgmt., architecture, hardware and parallel computing)

Chris Todman: Designing a Data Warehouse, Prentice Hall 2000 (Introduction, not very detailed)

23

Page 24: DMDW Lesson 03 - Data Warehouse Theory

THANK YOUFOR YOUR ATTENTION

24

Page 25: DMDW Lesson 03 - Data Warehouse Theory

References

Data Warehouse Books and References

David Grossman, Ophir Frieder: Introduction to Data Warehouse, Illinois Institute of Technology 2005

Dr. Andreas Geppert, Credit Suisse: Data Warehousing - Data-Warehouse-Entwurf, 2006, http://arvo.ifi.unizh.ch/dbtg/Classes/DWH/Slides/dwh-04-sl.pdf (p31: Explaination of star/snowflake/galaxy scheme, in German)

Carmela R. Balassiano: Data Warehouse Design Feb. 2007, http://academic.brooklyn.cuny.edu/cis/cbalassiano/CIS717-2%20course%20documents/week2/Data%20Warehouse%20primer.ppt (p12, p18: Explaination of star/snowflake/galaxy scheme, in English)

25