HBase regions merge

HBase distributes data across multiple servers, called Region Servers.

Each Region Server hosts one or more regions, and a table's data is spread across those regions; HBase decides which Region Server serves which region(s).

The number of regions can be set when the table is created:

[hbase@gw vagrant]$ kinit -kt /etc/security/keytabs/hbase.headless.keytab hbase
[hbase@gw vagrant]$ hbase shell
hbase(main):001:0> create 'table2', 'columnfamily1', {NUMREGIONS => 5, SPLITALGO => 'HexStringSplit'}

We had previously determined that 5 regions would be appropriate, given the number of Region Servers and the desired region size. Two basic split algorithms are supplied, HexStringSplit and UniformSplit (but you can add your own).
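To give an idea of what HexStringSplit produces, here is a minimal sketch in plain Ruby (not HBase code) that cuts an 8-digit hexadecimal key space into equal slices. The function name hex_split_points and the default width of 8 are assumptions made for this illustration; the real algorithm lives in HBase's RegionSplitter and may differ in details.

```ruby
# Approximate the boundaries HexStringSplit would pick: evenly spaced
# points between 00000000 and ffffffff (the width is an assumption).
def hex_split_points(num_regions, width = 8)
  max_key = 16**width - 1
  step = max_key / num_regions # integer division
  (1...num_regions).map { |i| (step * i).to_s(16).rjust(width, '0') }
end

puts hex_split_points(5).inspect
# → ["33333333", "66666666", "99999999", "cccccccc"]
```

Five regions need only four boundaries, since the first region starts at the empty key and the last one ends at the empty key.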

You can also provide your own split points (N split points produce N+1 regions, so NUMREGIONS is not needed here):

hbase(main):001:0> create 'table2', 'columnfamily1', {SPLITS => ['a', 'b', 'c']}
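As a rough illustration of how explicit split points carve up the key space, the plain-Ruby sketch below maps a row key to the index of the key range it would fall into. The helper region_index_for is hypothetical, and it assumes simple ASCII keys compared byte by byte, with each split point being the inclusive start of the next range.

```ruby
# Return the 0-based index of the key range that a row key falls into.
# split_points must be sorted; each one starts the next range (inclusive).
def region_index_for(row_key, split_points)
  split_points.count { |split| split <= row_key }
end

splits = %w[a b c]
puts splits.size + 1                    # → 4 (three split points, four ranges)
puts region_index_for('1', splits)      # → 0
puts region_index_for('apple', splits)  # → 1
puts region_index_for('zebra', splits)  # → 3
```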

table2 has now been created with our 5 regions. Let's look at it in the HBase web UI:

We do have our 5 regions, and we can see how the keys are distributed. The region names follow the pattern table_name,start_key,timestamp.ENCODED_REGIONNAME.
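Since merge_region expects the encoded name, it can help to pull it out of a full region name. The sketch below is plain Ruby and assumes the usual layout table,start_key,timestamp.encoded_name. (with a trailing dot). The function parse_region_name and the sample name, including its timestamp and hash, are made up for the illustration.

```ruby
# Split a full region name into its parts. Assumes the layout
# "table,start_key,timestamp.encoded_name." and a table name without commas.
def parse_region_name(name)
  body = name.chomp('.')                        # drop the trailing dot
  rest, _, encoded = body.rpartition('.')       # encoded name is the last field
  table, remainder = rest.split(',', 2)
  start_key, _, region_id = remainder.rpartition(',')
  { table: table, start_key: start_key, region_id: region_id, encoded_name: encoded }
end

# Hypothetical region name, for illustration only.
sample = 'table2,33333333,1500000000000.0123456789abcdef0123456789abcdef.'
puts parse_region_name(sample)[:encoded_name]
# → 0123456789abcdef0123456789abcdef
```

The first region of a table has an empty start key, so its name contains two consecutive commas; the sketch returns an empty string for start_key in that case.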

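Now, if we want to merge regions, we can use the merge_region command in the HBase shell, passing the encoded names of the regions.
By default the regions have to be adjacent (passing true as an extra argument forces the merge of non-adjacent regions).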

hbase(main):010:0> merge_region '234a12e83e203f2e3158c39e1da6b6e7', '89dd2d5a88e1b2b9787e3254b85b91d3'
0 row(s) in 0.0140 seconds


Notice that the resulting region gets a new ENCODED_REGIONNAME.
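The adjacency requirement can be sketched in plain Ruby: two regions are adjacent when the end key of one is the start key of the other. The Region struct, the adjacent? helper and the region labels below are hypothetical and only model the idea; they are not part of the HBase API.

```ruby
# Hypothetical in-memory description of a region (not an HBase class).
Region = Struct.new(:encoded_name, :start_key, :end_key)

# Two regions are adjacent when the lower one ends where the upper one starts.
# An empty end key marks the last region of the table, which has no successor.
def adjacent?(a, b)
  first, second = [a, b].sort_by(&:start_key)
  !first.end_key.empty? && first.end_key == second.start_key
end

r1 = Region.new('region-1', '', '33333333')
r2 = Region.new('region-2', '33333333', '66666666')
r3 = Region.new('region-3', '66666666', '99999999')

puts adjacent?(r1, r2)  # → true
puts adjacent?(r2, r1)  # → true (argument order does not matter)
puts adjacent?(r1, r3)  # → false
```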

hbase(main):012:0> merge_region 'bfad503057fca37bd60b5a83109f7dc6','e37d7ab5513e06268459c76d5e7335e4'
0 row(s) in 0.0040 seconds

Finally, let's merge all the remaining regions!

hbase(main):013:0> merge_region '0f5fc22bf0beacbf83c1ad562324c778','af6d7af861f577ba456cff88bf5e5e38','3f1e029afd907bc62f5e5fb8b6e1b5cf'
0 row(s) in 0.0290 seconds

We can then see that only one region remains:



For the record, you can create a pre-split HBase table if you know the distribution of your keys: either by passing SPLITS, or by providing a SPLITS_FILE that contains the split points, one per line (so number of lines = number of regions - 1).
Be aware of the order: putting SPLITS_FILE before {…} won’t work.
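The line-count rule can be checked with a small plain-Ruby sketch that writes split points to a file, one per line, and derives the number of regions to expect. The helper names and the temporary file path are assumptions made for the illustration.

```ruby
require 'tmpdir'

# Write one split point per line, as expected for a splits file.
def write_splits_file(path, split_points)
  File.write(path, split_points.join("\n") + "\n")
end

# Number of regions = number of non-empty lines + 1.
def expected_region_count(path)
  File.readlines(path, chomp: true).reject(&:empty?).size + 1
end

# Hypothetical location, chosen only for this example.
path = File.join(Dir.tmpdir, 'splits_example.txt')
write_splits_file(path, %w[a b c])
puts expected_region_count(path)  # → 4
```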

[hbase@gw vagrant]$ printf 'a\nb\nc\n' > /tmp/splits.txt
[hbase@gw vagrant]$ kinit -kt /etc/security/keytabs/hbase.headless.keytab hbase
[hbase@gw vagrant]$ hbase shell
hbase(main):011:0> create 'test_split', { NAME=> 'cf', VERSIONS => 1, TTL => 69200 }, SPLITS_FILE => '/tmp/splits.txt'

And the result: