mysql多维数据仓库指南--第二篇第8章(2)
时间:2008-04-22 来源:jiangdm
实验数据
为了实验定期装载,你需要准备客户,产品和销售订单实验数据。每种数据的来源将在下面的小节中讨论。
关于Customer.csv 文件
customer.csv文件有以下几个变化:
n 编号6的客户的街道编号现在是7777 Ritter Rd(它原来是7000 Ritter Rd.)。
n 编号7的客户名称现在是Distinguished Agencies(它原来是Distinguished Partners)。
n 新增了一个客户是第8个客户。
customer.csv文件内容如下:
CUSTOMER NO, CUSTOMER NAME,STREET ADDRESS, ZIP CODE,CITY,STATE
1, Really Large Customers, 7500 Louise Dr., 17050, Mechanicsburg, PA
2, Small Stores, 2500 Woodland St., 17055, Pittsburgh, PA
3, Medium Retailers, 1111 Ritter Rd., 17055, Pittsburgh, PA
4, Good Companies, 9500 Scott St., 17050, Mechanicsburg, PA
5, Wonderful Shops, 3333 Rossmoyne Rd., 17050, Mechanicsburg, PA
6, Extremely Loyal Clients, 7777 Ritter Rd., 17055, Pittsburgh, PA
7, Distinguished Agencies, 9999 Scott St., 17050, Mechanicsburg, PA
8, Subsidiaries, 10000 Wetline Blvd., 17055, Pittsburgh, PA
关于product.txt文件
Product.txt文件发生的变化是:
n 产品3的名称现在是Flat Pan(它原来是LCD Panel)。
n 新增一个产品为第4个产品。
修改后的product.txt文件:
PRODUCT CODE,PRODUCT NAME,PRODUCT GROUP
1 Hard Disk Drive Storage
2 Floppy Drive Storage
3 Flat Panel Monitor
4 Keyboard Peripheral
销售订单交易数据
最后一个你需要准备的测试数据是销售订单。假设你在2007-03-01(你第一次执行定期装载的日期)开始数据仓库运作。列表8-2中的脚本新增16个订单日期是2007-03-01的销售订单。
列表8-2:新增销售订单
/*****************************************************************/
/* */
/* sales_order_regular.sql */
/* */
/*****************************************************************/
USE source;
INSERT INTO sales_order VALUES
(22, 1, 1, '2007–03–01', '2007–03–01', 1000)
, (23, 2, 2, '2007–03–01', '2007–03–01', 2000)
, (24, 3, 3, '2007–03–01', '2007–03–01', 3000)
, (25, 4, 4, '2007–03–01', '2007–03–01', 4000)
, (26, 5, 2, '2007–03–01', '2007–03–01', 1000)
, (27, 6, 2, '2007–03–01', '2007–03–01', 3000)
, (28, 7, 3, '2007–03–01', '2007–03–01', 5000)
, (29, 8, 4, '2007–03–01', '2007–03–01', 7000)
, (30, 1, 1, '2007–03–01', '2007–03–01', 1000)
, (31, 2, 2, '2007–03–01', '2007–03–01', 2000)
, (32, 3, 3, '2007–03–01', '2007–03–01', 4000)
, (33, 4, 4, '2007–03–01', '2007–03–01', 6000)
, (34, 5, 1, '2007–03–01', '2007–03–01', 2500)
, (35, 6, 2, '2007–03–01', '2007–03–01', 5000)
, (36, 7, 3, '2007–03–01', '2007–03–01', 7500)
, (37, 8, 4, '2007–03–01', '2007–03–01', 1000)
;
/* end of script */
用下面命令运行sales_order_regular.sql脚本:
mysql> \. c:\mysql\scripts\sales_order_regular.sql
你的控制台将出现如下显示:
Database changed
Query OK, 16 rows affected (0.08 sec)
Records: 16 Duplicates: 0 Warnings: 0
sales_order表现在有37条记录。
运行定期装载脚本
在你运行列表 8-1所示的dw_regular.sql 脚本之前,你需要设置mysql数据库时间为2007-03-01。然后运行该脚本:
mysql> \. c:\mysql\scripts\dw_regular.sql
你将在你的控制台上看到
Database changed
Query OK, 7 rows affected (0.12 sec)
Query OK, 8 rows affected (0.05 sec)
Records: 8 Deleted: 0 Skipped: 0 Warnings: 0
Query OK, 1 row affected (0.06 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Query OK, 1 row affected (0.07 sec)
Records: 1 Duplicates: 0 Warnings: 0
Query OK, 2 rows affected (0.06 sec)
Rows matched: 2 Changed: 2 Warnings: 0
Query OK, 1 row affected (0.06 sec)
Records: 1 Duplicates: 0 Warnings: 0
Query OK, 3 rows affected (0.05 sec)
Query OK, 4 rows affected (0.08 sec)
Records: 4 Deleted: 0 Skipped: 0 Warnings: 0
Query OK, 1 row affected (0.05 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Query OK, 1 row affected (0.06 sec)
Records: 1 Duplicates: 0 Warnings: 0
Query OK, 1 row affected (0.06 sec)
Records: 1 Duplicates: 0 Warnings: 0
Query OK, 16 rows affected (0.07 sec)
Records: 16 Duplicates: 0 Warnings: 0
Query OK, 16 rows affected (0.11 sec)
Records: 16 Duplicates: 0 Warnings: 0
确定定期装载成功
为了确认你的定期装载执行成功,你需要查询维度和事实表。查询customer_dim用如下SQL语句:
mysql> select * from customer_dim \G
结果如下:
The result is as follows.
*************************** 1. row ***************
customer_sk: 1
customer_number: 1
customer_name: Really Large Customers
customer_street_address: 7500 Louise Dr.
customer_zip_code: 17050
customer_city: Mechanicsburg
customer_state: PA
effective date: 2005–03–01
expiry_date: 9999–12–31
*************************** 2. row **************
customer_sk: 2
customer_number: 2
customer_name: Small Stores
customer_street_address: 2500 Woodland St.
customer_zip_code: 17055
customer_city: Pittsburgh
customer_state: PA
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 3. row ***************************
customer_sk: 3
customer_number: 3
customer_name: Medium Retailers
customer_street_address: 1111 Ritter Rd.
customer_zip_code: 17055
customer_city: Pittsburgh
customer state: PA
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 4. row ***************************
customer_sk: 4
customer_number: 4
customer_name: Good Companies
customer_street_address: 9500 Scott St.
customer_zip_code: 17050
customer_city: Mechanicsburg
customer_state: PA
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 5. row ***************************
customer_sk: 5
customer_number: 5
customer_name: Wonderful Shops
customer_street_address: 3333 Rossmoyne Rd.
customer_zip_code: 17050
customer_city: Mechanicsburg
customer_state: PA
effective date: 2005–03–01
expiry_date: 9999–12–31
*************************** 6. row ***********************************
customer_sk: 6
customer_number: 6
customer_name: Extremely Loyal Clients
customer_street_address: 7070 Ritter Rd.
customer_zip_code: 17055
customer_city: Pittsburgh
customer_state: PA
effective_date: 2005–03–01
expiry_date: 2007–02–28
*************************** 7. row ***************************
customer_sk: 7
customer_number: 7
customer_name: Distinguished Agencies
customer_street_address: 9999 Scott St.
customer_zip_code: 17050
customer_city: Mechanicsburg
customer_state: PA
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 8. row ***************************
customer_sk: 8
customer_number: 6
customer_name: Extremely Loyal Clients
customer_street_address: 7777 Ritter Rd.
customer_zip_code: 17055
customer_city: Pittsburgh
customer_state: PA
effective_date: 2007–03–01
expiry_date: 9999–12–31
*************************** 9. row ***************************
customer_sk: 9
customer_number: 8
customer_name: Subsidiaries
customer_street_address: 10000 Wetline Blvd.
customer_zip_code: 17055
customer_city: Pittsburgh
customer_state: PA
effective_date: 2007–03–01
expiry_date: 9999–12–31
9 rows in set (0.00 sec)
查询结果说明:
n SCD2被作用于客户6的街道地址
n SCD1被作用于所有客户6记录的名称上。
n SCD1被作用于客户7的名称上。
n 新增了客户记录:客户8。
查询product_dim表用如下sql语句
mysql> select * from product_dim \G
结果如下:
*************************** 1. row ***************************
product_sk: 1
product_code: 1
product_name: Hard Disk Drive
product_category: Storage
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 2. row ***************************
product_sk: 2
product_code: 2
product_name: Floppy Drive
product_category: Storage
effective_date: 2005–03–01
expiry_date: 9999–12–31
*************************** 3. row ***************************
product_sk: 3
product_code: 3
product_name: LCD Panel
product_category: Monitor
effective_date: 2005–03–01
expiry_date: 2007–02–28
*************************** 4. row ***************************
product_sk: 4
product_code: 3
product_name: Flat Panel
product_category: Monitor
effective date: 2007–03–01
expiry_date: 9999–12–31
*************************** 5. row **************************
product_sk: 5
product_code: 4
product_name: Keyboard
product_category: Peripheral
effective_date: 2007–03–01
expiry_date: 9999–12–31
5 rows in set (0.01 sec)
查询product_dim表的结果说明:
n SCD2被应用到产品3的名称
n 新增新的产品:产品4
现在,查询order_dim表:
mysql> select * from order_dim;
这里是结果,你将有35条订单记录,其中19条是第7章“初始化装载”载入的,16条是本章载入的。
+----------+--------------+----------------+-------------+
| order_sk | order_number | effective_date | expiry_date |
+----------+--------------+----------------+-------------+
| 1 | 3 | 2005–03–01 | 9999–12–31 |
| 2 | 4 | 2005–04–15 | 9999–12–31 |
| 3 | 5 | 2005–05–20 | 9999–12–31 |
| 4 | 6 | 2005–07–30 | 9999–12–31 |
| 5 | 7 | 2005–09–01 | 9999–12–31 |
| 6 | 8 | 2005–11–10 | 9999–12–31 |
| 7 | 9 | 2006–01–05 | 9999–12–31 |
| 8 | 10 | 2006–02–10 | 9999–12–31 |
| 9 | 11 | 2006–03–15 | 9999–12–31 |
| 10 | 12 | 2006–04–20 | 9999–12–31 |
| 11 | 13 | 2006–05–30 | 9999–12–31 |
| 12 | 14 | 2006–06–01 | 9999–12–31 |
| 13 | 15 | 2006–07–15 | 9999–12–31 |
| 14 | 16 | 2006–08–30 | 9999–12–31 |
| 15 | 17 | 2006–09–05 | 9999–12–31 |
| 16 | 18 | 2006–10–05 | 9999–12–31 |
| 17 | 19 | 2007–01–10 | 9999–12–31 |
| 18 | 20 | 2007–02–20 | 9999–12–31 |
| 19 | 21 | 2007–02–28 | 9999–12–31 |
| 20 | 22 | 2007–03–01 | 9999–12–31 |
| 21 | 23 | 2007–03–01 | 9999–12–31 |
| 22 | 24 | 2007–03–01 | 9999–12–31 |
| 23 | 25 | 2007–03–01 | 9999–12–31 |
| 24 | 26 | 2007–03–01 | 9999–12–31 |
| 25 | 27 | 2007–03–01 | 9999–12–31 |
| 26 | 28 | 2007–03–01 | 9999–12–31 |
| 27 | 29 | 2007–03–01 | 9999–12–31 |
| 28 | 30 | 2007–03–01 | 9999–12–31 |
| 29 | 31 | 2007–03–01 | 9999–12–31 |
| 30 | 32 | 2007–03–01 | 9999–12–31 |
| 31 | 33 | 2007–03–01 | 9999–12–31 |
| 32 | 34 | 2007–03–01 | 9999–12–31 |
| 33 | 35 | 2007–03–01 | 9999–12–31 |
| 34 | 36 | 2007–03–01 | 9999–12–31 |
| 35 | 37 | 2007–03–01 | 9999–12–31 |
+----------+--------------+----------------+-------------+
35 rows in set (0.00 sec)
销售订单事实表
现在你可以查询sales_order_fact表
mysql> select * from sales_order_fact;
这里是结果显示:
+----------+-------------+------------+---------------+--------------+
| order_sk | customer_sk | product_sk | order_date_sk | order_amount |
+----------+-------------+------------+---------------+--------------+
| 1 | 3 | 3 | 1 | 4000.00 |
| 2 | 4 | 1 | 46 | 4000.00 |
| 3 | 5 | 2 | 81 | 6000.00 |
| 4 | 6 | 3 | 152 | 6000.00 |
| 5 | 7 | 1 | 185 | 8000.00 |
| 6 | 1 | 2 | 255 | 8000.00 |
| 7 | 2 | 3 | 311 | 1000.00 |
| 8 | 3 | 1 | 347 | 1000.00 |
| 9 | 4 | 2 | 380 | 2000.00 |
| 10 | 5 | 3 | 416 | 2500.00 |
| 11 | 6 | 1 | 456 | 3000.00 |
| 12 | 7 | 2 | 458 | 3500.00 |
| 13 | 1 | 3 | 502 | 4000.00 |
| 14 | 2 | 1 | 548 | 4500.00 |
| 15 | 3 | 2 | 554 | 1000.00 |
| 16 | 4 | 3 | 584 | 1000.00 |
| 17 | 5 | 1 | 681 | 4000.00 |
| 18 | 6 | 2 | 722 | 4000.00 |
| 19 | 7 | 3 | 730 | 4000.00 |
| 20 | 1 | 1 | 731 | 1000.00 |
| 21 | 2 | 2 | 731 | 2000.00 |
| 22 | 3 | 4 | 731 | 3000.00 |
| 23 | 4 | 5 | 731 | 4000.00 |
| 24 | 5 | 2 | 731 | 1000.00 |
| 25 | 8 | 2 | 731 | 3000.00 |
| 26 | 7 | 4 | 731 | 5000.00 |
| 27 | 9 | 5 | 731 | 7000.00 |
| 28 | 1 | 1 | 731 | 1000.00 |
| 29 | 2 | 2 | 731 | 2000.00 |
| 30 | 3 | 4 | 731 | 4000.00 |
| 31 | 4 | 5 | 731 | 6000.00 |
| 32 | 5 | 1 | 731 | 2500.00 |
| 33 | 8 | 2 | 731 | 5000.00 |
| 34 | 7 | 4 | 731 | 7500.00 |
| 35 | 9 | 5 | 731 | 1000.00 |
+----------+-------------+------------+---------------+--------------+
35 rows ir n set (0.00 sec :)
结果说明:
n 开始于2007-03-01的16条订单被加入。
n 以订单日期为日期基准,有效的产品和客户被准确的获得。
u 代理键值为4代表产品3而不是代理键值为3的那个记录。
u 代理键值为8代表客户6,而不是代理键值为6的那个记录。
小结
本章你学习本实验了一个每日定期装载过程,在其获取模式中,全部数据源和变化数据捕获的抽取方式,即SCD1和SCD2被应用。在下一章,你将学习如何部署这个定期装载以每日为周期在window平台上运行。