hadoop - Hive pause and resume task


My question

I am new to the Hive and Hadoop environment. I want to pause and resume a Hive job running on Hadoop.

What I have tried

I would like ideas related to this. I was thinking I might save the state of the mappers and reducers, if that is feasible.

But I do not know how to keep track of the mappers and reducers. I have found interfaces and classes in Hadoop, such as JobID and JobClient, that can help in keeping track of them. I have also read about workflow-style approaches, but how to trace each task is not clear to me.

This is practically impossible

If I am not mistaken, a Hive job (or a Hadoop MapReduce job, for that matter) can be waiting, running, or finished (either successfully or failed).

There is no way to pause a Hive job and continue it later. There is no 'debug shortcut' of the kind some languages offer to pause processing in the middle of a step, and I have not seen breakpoints either.

But here is how you can get close

1. Split the job

This is the practical (though limited) approach.

Rather than making one Hive script, make two and run only the first one. The first one does part of the steps, or operates on part of the data, which gives you the 'pause'. Resuming is then just running the complementary second script.

(If you want, you can use a scheduler to start the first one and then, some time later or after a trigger, start the second one, but start simple.)
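As a rough illustration of the split approach, here is a minimal Python sketch. It assumes the `hive` command-line client is on the PATH and that the job has already been split into two scripts. The script names, the marker file, and the helper functions are all hypothetical, not part of Hive. Each call runs exactly one stage and records progress in the marker file, so the gap between two calls is the 'pause'.

```python
import subprocess
from pathlib import Path


def run_hive_script(script, runner=subprocess.run):
    """Run one Hive script to completion using `hive -f <file>`."""
    # check=True raises CalledProcessError if Hive exits with a non-zero status,
    # so the marker file is only updated after a successful run.
    runner(["hive", "-f", str(script)], check=True)


def run_in_two_stages(first, second, marker, runner=subprocess.run):
    """Run `first` on the first call and `second` on the next call.

    Progress is stored in the (hypothetical) marker file `marker`.
    Returns "first", "second", or "done" depending on what this call did.
    """
    marker = Path(marker)
    state = marker.read_text().strip() if marker.exists() else ""

    if state == "":
        # Nothing has run yet: execute the first half, then stop ("pause").
        run_hive_script(first, runner)
        marker.write_text("first-done")
        return "first"

    if state == "first-done":
        # The first half finished earlier: "resume" with the second half.
        run_hive_script(second, runner)
        marker.write_text("all-done")
        return "second"

    # Both halves have already run; nothing left to do.
    return "done"
```

Calling `run_in_two_stages("part1.hql", "part2.hql", "job.state")` once runs the first script. Calling it again later, by hand or from a scheduler, runs the second. The `runner` parameter exists only so the command can be replaced in a dry run or a test.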

2. Freeze the complete environment

This is not practical for most intents and purposes, but it may be possible and useful for resource purposes.

You can perhaps freeze the whole cluster; this should be possible halfway through a SELECT or so, if you want to go in depth.

How to do this (and how to investigate the state of the system) is not a question about Hive, but about the whole OS of the nodes. If you have only one node, I suppose it is as straightforward as putting it in a virtual machine.

