
5. HBase advanced topics: table design schema

2013-10-23 

Study notes and summary below.

Part 1: Table attributes

| attr | default | usage/principle | use case | note |
| Bloom filter | disabled | costs some memory to improve lookup time (TBD) | tables with huge range scans | takes one of 'row', 'row-col', or none |
| Column families | - | - | - | the name must be a printable string, since it is used as the directory name under the region directory |
| Maximum file size | 10G in 0.94.2 | - | - | in fact the maximum store size, i.e. the property "hbase.hregion.max.filesize" set in hbase-site.xml |
| Read-only | false | - | like firmware, kept safe from changes, i.e. a 'dead' table that never changes | - |
| Memstore flush size | 128m in 0.94.2 | same effect as the property 'hbase.hregion.memstore.flush.size' in hbase-site.xml | - | 1. this value determines how often store files are generated; 2. following from 1, it affects the HLog replay time when a region server goes down |
| Deferred log flush | false | if true, edits are synced to the log periodically, at the interval set by 'hbase.regionserver.optionallogflushinterval'; if false, each edit is synced immediately | - | true may cause data loss, since the cached edits stay in memory until they are synced to the file system |
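The note on memstore flush size (it sets how often store files are generated) can be made concrete with a little arithmetic. The sketch below is only a rough model, not HBase code: the function name `flush_stats` and the 2 MB/s write rate are assumptions made for illustration, and the model ignores global memstore pressure, multiple column families, and compaction.

```python
def flush_stats(flush_size_mb, write_rate_mb_per_s):
    """Return (seconds between flushes, store files produced per hour).

    Simplified model: one memstore filling at a constant write rate
    and flushing as soon as it reaches the configured flush size.
    """
    if flush_size_mb <= 0 or write_rate_mb_per_s <= 0:
        raise ValueError("flush size and write rate must be positive")
    seconds_per_flush = flush_size_mb / write_rate_mb_per_s
    files_per_hour = 3600 / seconds_per_flush
    return seconds_per_flush, files_per_hour


# Hypothetical write rate of 2 MB/s into a single memstore.
for size_mb in (64, 128, 256):
    secs, files = flush_stats(size_mb, write_rate_mb_per_s=2.0)
    print(f"flush size {size_mb} MB: one flush every {secs:.0f} s, "
          f"about {files:.1f} store files per hour")
```

Under this model a larger flush size means fewer store files, but also more unflushed edits sitting in memory at any moment, which is why it lengthens HLog replay after a region server failure.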


Part 2: Column family attributes

| attr | default | usage/principle | use case | note |
| In-memory | false | caches some blocks of a small family in memory to speed up queries | analogous to a secondary index table; for small tables | no guarantee of when or how many blocks are cached |
| Bloom filter | - | - | - | see Part 1 |
| Replication scope | 0 (disabled) | syncs local cluster data with remote clusters (TBD) | load balancing by distributing requests across clusters | - |
| Maximum versions | 3 | controls how many versions (changes) are kept in storage | use 1 in general; if you only need to check against the previous version, 2 is a good choice | interacts with 'Time-to-live' |
| Compression | none | compresses this family if a codec is specified: SNAPPY, LZO, GZ, ... | - | be completely clear about your requirements first, then pick the corresponding codec |
| Block size | 64k | a store file is split into blocks, so smaller blocks give faster random reads; use bigger blocks for sequential reads (TBD) | - | - |
| Block cache | true | when rows are read from HBase, this determines whether they are written back to the cache to speed up later access | use 'true' if clients repeatedly access the same rows; use 'false' for whole-table scans or for systems with fewer reads than writes | - |
| Time-to-live | max.int (in seconds) | how long a cell value is kept in storage | if this is a 'recycled' (i.e. rolling) system, use an appropriate value to bound the data size | interacts with 'Maximum versions': both attributes together control which data versions are retained |
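To illustrate how 'Maximum versions' and 'Time-to-live' interact, here is a minimal sketch of the retention rule as described above: a version stays only if it has not expired and is among the newest N. This is a simplified model rather than HBase's implementation; the function name `surviving_versions` and the sample timestamps are made up for the example.

```python
def surviving_versions(timestamps, now, max_versions, ttl_seconds):
    """Return the timestamps (in seconds) of the versions still kept, newest first.

    Simplified model: drop versions older than the TTL first,
    then keep at most max_versions of the remaining ones.
    """
    unexpired = [ts for ts in timestamps if now - ts <= ttl_seconds]
    return sorted(unexpired, reverse=True)[:max_versions]


now = 1000
writes = [100, 400, 700, 900, 990]  # hypothetical write times of one cell

# With a very long TTL, only the version limit applies.
print(surviving_versions(writes, now, max_versions=3, ttl_seconds=10**9))

# With a short TTL, fewer than max_versions versions may remain.
print(surviving_versions(writes, now, max_versions=3, ttl_seconds=200))
```

In the second call only two versions fall inside the 200-second window, so the TTL rather than the version limit decides what is kept.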

Ref:

HBase: The Definitive Guide
