
Help needed: data warehouse and data mining problem set 1. How should I solve it?

2012-03-08 
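Sorry, the questions are in English. If the points offered are too few, I can add more! Experts, please help. Thanks!
10 points per question!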

I. Data warehouse design
(1) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
(2) Draw a snowflake schema diagram for the Big_University data warehouse, which consists of four dimensions (student, course, semester, and instructor) and two measures (count and avg_grade). At the lowest conceptual level, avg_grade is the actual grade of a student; at higher conceptual levels, avg_grade is the average grade for the given student, course, semester, and instructor.
(3) Starting with the base cuboid (student, course, semester, instructor), what specific OLAP operations should be performed in order to list the average grade of each student who took a "CS" course (e.g., roll up from "semester" to "year")?
(4) If each dimension contains 5 levels (including all), e.g., student < major < status < university < all, how many cuboids does this data cube contain (including the base cuboid and the apex cuboid)?
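A minimal sketch for checking part (4), assuming the usual counting argument: a cuboid is fixed by choosing one level per dimension (with "all" counted as a level), so the total is the product of the per-dimension level counts. The dictionary name and the function `count_cuboids` are hypothetical helpers, not part of the problem statement.

```python
from math import prod

# Levels per dimension, counting the virtual top level "all" (from the problem text).
LEVELS_INCLUDING_ALL = {
    "student": 5,
    "course": 5,
    "semester": 5,
    "instructor": 5,
}


def count_cuboids(levels_including_all):
    """Each cuboid picks exactly one level per dimension, so multiply the counts."""
    return prod(levels_including_all.values())


total_cuboids = count_cuboids(LEVELS_INCLUDING_ALL)
print(total_cuboids)  # 5 * 5 * 5 * 5
```

Under this assumption the base cuboid (lowest level everywhere) and the apex cuboid ("all" everywhere) are both included in the count.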

II. Data cube computation
Suppose a base cuboid has 3 dimensions (A, B, C), with the following numbers of cells: |A| = 1,000,000, |B| = 100, and |C| = 1,000. Suppose each dimension is partitioned evenly into 10 portions for chunking.
(1) Assuming each dimension has only one level, draw the complete lattice of the cube.
(2) If each cube cell stores one measure of 4 bytes, what is the total size of the computed cube if the cube is dense?
(3) If the cube is very sparse, describe an effective multidimensional array structure for storing the sparse cube.
(4) State the order for computing the chunks in the cube that requires the least amount of space, and compute the total amount of main memory required for computing the 2-D planes.
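A rough sketch of the arithmetic behind parts (2) and (4), not an official solution. It assumes the dense cube stores every cuboid of the lattice (ABC, AB, AC, BC, A, B, C, and the apex), and that part (4) follows the standard multiway array aggregation argument: scan the smallest dimension fastest, then keep in memory the whole smallest 2-D plane, one row of the plane shared by the smallest and largest dimensions, and one chunk of the remaining plane. The function names are hypothetical.

```python
from itertools import combinations
from math import prod

CARD = {"A": 1_000_000, "B": 100, "C": 1_000}  # cardinalities from the problem
BYTES_PER_CELL = 4                              # one 4-byte measure per cell
PARTS = 10                                      # partitions per dimension


def dense_cube_cells(card):
    """Sum the cell counts of every cuboid in the lattice (apex counts as 1)."""
    names = list(card)
    total = 0
    for k in range(len(names) + 1):
        for group in combinations(names, k):
            total += prod(card[d] for d in group)  # empty product is 1
    return total


def min_plane_memory_cells(card, parts):
    """Cells held in memory for the three 2-D planes (3-D case only).

    Assumption: dimensions are scanned in ascending order of cardinality.
    """
    small, mid, large = sorted(card, key=card.get)
    chunk = {d: card[d] // parts for d in card}
    whole_plane = card[small] * card[mid]    # plane aggregated over the largest dim
    one_row = card[small] * chunk[large]     # small x large plane, one chunk-row
    one_chunk = chunk[mid] * chunk[large]    # mid x large plane, one chunk
    return [small, mid, large], whole_plane + one_row + one_chunk


cube_bytes = dense_cube_cells(CARD) * BYTES_PER_CELL
scan_order, plane_cells = min_plane_memory_cells(CARD, PARTS)
plane_bytes = plane_cells * BYTES_PER_CELL
print(cube_bytes, scan_order, plane_bytes)
```

The lattice sum equals (|A| + 1)(|B| + 1)(|C| + 1), which is a quick way to cross-check the first function by hand.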

III. Mining association rules
Suppose we have the following transactional data.
    TID     Items_bought
    T100    {K, A, D, B}
    T200    {D, A, C, E, B}
    T300    {C, A, B, E}
    T400    {B, A, D}
Assume that the minimum support and minimum confidence thresholds are 60% and 80%, respectively.
(1) Find the set of frequent itemsets using the Apriori algorithm and the FP-tree, respectively. Show the derivation of Ck and Lk for each iteration k of the Apriori algorithm, and show the "conditional pattern base, conditional FP-tree, frequent patterns" for each item in the FP-tree, as shown in Table 6-1 of the textbook.
(2) Generate strong association rules from the frequent itemsets found above (with support and confidence).
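To check hand-derived answers for this problem, here is a small Apriori sketch over the four transactions above. It covers only the Apriori half of part (1) and the rule generation of part (2); it does not build an FP-tree. The function names (`apriori`, `strong_rules`, `support_count`) are hypothetical and the code is an illustration, not the textbook's reference implementation.

```python
from itertools import combinations

TRANSACTIONS = {
    "T100": {"K", "A", "D", "B"},
    "T200": {"D", "A", "C", "E", "B"},
    "T300": {"C", "A", "B", "E"},
    "T400": {"B", "A", "D"},
}
MIN_SUPPORT = 0.6      # 60% of 4 transactions, i.e. at least 3 of them
MIN_CONFIDENCE = 0.8   # 80%


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori(transactions, min_support):
    """Return {frequent itemset: support count}, built level by level."""
    n = len(transactions)
    items = sorted(set().union(*transactions.values()))
    candidates = [frozenset([i]) for i in items]  # C1
    frequent = {}
    k = 1
    while candidates:
        counts = {c: support_count(c, transactions) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}  # Lk
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def strong_rules(frequent, n, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = count / frequent[lhs]  # lhs is frequent by closure
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, count / n, confidence))
    return rules


frequent = apriori(TRANSACTIONS, MIN_SUPPORT)
rules = strong_rules(frequent, len(TRANSACTIONS), MIN_CONFIDENCE)
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "=>", sorted(rhs), f"support={sup:.0%}", f"confidence={conf:.0%}")
```

Printing `counts` and `level` inside the loop would give the Ck and Lk tables that the question asks to be shown for each iteration.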

[Solution]
Enumerate three classes of schemas that are popularly used for modeling data warehouses.
That is: list the three classes of modeling schemas commonly used in data warehouses.
[Solution]
(1) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
A:
star schema, snowflake schema
I can't remember the rest.
I suggest reading the Oracle documentation; the Data Warehousing Guide covers all of them.

[Solution]
Here to learn. Keep it up!
