mapreduce - HBase MapReduce: optimal number of reducers when using TableReducer
We are using MapReduce to write data to HBase. Since we have some formatting to do, we implemented our own reducer by extending TableReducer. This custom reducer behaves differently in the production and development environments. In production we are getting the following error:
Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 659 actions: RegionTooBusyException: 659 times,
From this, I understood that flushing is not being done properly. However, the same job works fine in the dev environment.
Along with the above, I feel that the configured number of reducers might affect how data is sent to the region servers.
We are using a salt to span row keys across the region servers. As of now, the salt is 20m and the number of region servers is 60. Should the salt be chosen equal to the number of region servers so that records are spread evenly? If not, how do we identify the optimal value for the number of reducers while loading data into HBase?
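To make the salting question concrete, here is a minimal sketch of the kind of scheme I mean. It assumes the salt is a bucket number derived from a hash of the row key and prepended to it. The class name `SaltedRowKey`, the `|` separator, the three-digit prefix, and the bucket count of 60 are all hypothetical choices for illustration, not our actual code or anything HBase provides.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: prefixes each row key with a salt bucket so that
// sequential keys are spread over a fixed number of buckets.
final class SaltedRowKey {

    // Maps a row key to a bucket in [0, buckets) using a stable hash of its bytes.
    static int bucketFor(String rowKey, int buckets) {
        int hash = Arrays.hashCode(rowKey.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, buckets); // floorMod keeps the result non-negative
    }

    // Builds the salted key: zero-padded bucket, separator, original key.
    static String salt(String rowKey, int buckets) {
        return String.format("%03d|%s", bucketFor(rowKey, buckets), rowKey);
    }

    public static void main(String[] args) {
        int buckets = 60; // assumption: one bucket per region server, for illustration only
        int[] counts = new int[buckets];
        for (int i = 0; i < 60_000; i++) {
            counts[bucketFor("row-" + i, buckets)]++;
        }
        // Print one salted key and the spread of keys over the buckets.
        System.out.println(salt("row-42", buckets));
        System.out.println("min per bucket = " + Arrays.stream(counts).min().getAsInt());
        System.out.println("max per bucket = " + Arrays.stream(counts).max().getAsInt());
    }
}
```

Running `main` prints how evenly the sample keys fall into the buckets, which is a quick way to compare different bucket counts before loading real data.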
Also, in general, what is the maximum number of connections allowed on the client side to interact with HBase? Here we are using the API provided for MapReduce, but in general, when we handle the client connection to HBase ourselves, the maximum number of client connections can play an important role. Thanks in advance for the help.
The HBase MapReduce API decides the number of reducers itself, setting it equal to the region server count. The code base confirms this. So the problem was that, when we wrote our MapReduce job, we were giving the number of reducers a value that is different from the default. It looks like the default value should work well here, unless you have a specific requirement.