Pentaho's "Hadoop File Input" (Spoon) always displays an error when trying to read a file from HDFS


I am new to Pentaho and Spoon, and I am trying to process a file from a local Hadoop node with the "Hadoop File Input" item in Spoon (Pentaho). The problem is that every URI I have tried so far seems to be incorrect. I don't know how to connect to HDFS from Pentaho.

To make it clear, the correct URI is:

hdfs://localhost:9001/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv

I know it's the correct one because I tested it via the command line and it works:

hdfs dfs -ls hdfs://localhost:9001/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv  

So, setting the Environment field to "static", here are some of the URIs I have tried in Spoon:

  • hdfs://localhost:9001/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv
  • hdfs://localhost:8020/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv
  • hdfs://localhost:9001
  • hdfs://localhost:9001/user/data/prueba_concepto/
  • hdfs://localhost:9001/user/data/prueba_concepto
  • hdfs:///

I also tried the solution Garci García gives here: Pentaho Hadoop File Input, setting the port to 8020 and using the following URI:

  • hdfs://catalin:@localhost:8020/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv

And then I changed the port to 9001 and tried the same technique:

  • hdfs://catalin:@localhost:9001/user/data/prueba_concepto/listadoproductos_2017_02_13-15_59_con_id.csv

But still nothing worked for me. Every time I press the "Mostrar fichero(s)..." button (Show file(s)), an error pops up saying that the file cannot be found.

I have added a screenshot of the "Hadoop File Input" dialog here.

Thank you.

Okay, I solved this.

I had to add a new Hadoop cluster: in the "View" tab -> right click on Hadoop cluster -> New

There I had to input my HDFS Hadoop configuration:

  • storage: hdfs
  • hostname: localhost
  • port: 9001 (by default 8020)
  • username: catalin
  • password: (no password)
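
For reference, the hostname and port in that dialog are the same pieces that make up the URI that worked on the command line. Here is a small Python sketch that pulls them apart with the standard urllib.parse module. It is my own illustration, not part of Spoon, and split_hdfs_uri is a made-up helper name:

```python
from urllib.parse import urlsplit


def split_hdfs_uri(uri):
    """Break an hdfs:// URI into the pieces the cluster dialog asks for.

    Hypothetical helper for illustration only; Spoon does not ship it.
    """
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "username": parts.username,  # None when the URI has no user part
        "hostname": parts.hostname,  # None for a URI such as hdfs:///
        "port": parts.port,          # None when no port is given
        "path": parts.path,
    }


# The URI that worked with "hdfs dfs -ls" in the question.
uri = (
    "hdfs://localhost:9001/user/data/prueba_concepto/"
    "listadoproductos_2017_02_13-15_59_con_id.csv"
)
fields = split_hdfs_uri(uri)
print(fields["hostname"], fields["port"])
```

The point is only that the authority part (host and port) goes into the cluster definition, while the path is what you browse to afterwards.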

After that, if you hit the "Test" button, some of the tests will fail. I solved the second one by copying all the configuration properties I had in my local Hadoop configuration file ($LOCAL_HADOOP_HOME/etc/hadoop/core-site.xml) into Spoon's Hadoop configuration file:

data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp25/core-site.xml
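
To check that both core-site.xml files point at the same NameNode, something like the following can read the default filesystem out of the XML. This is just a sketch: default_fs is a hypothetical helper, the sample XML is made up to match my setup, and it assumes the usual Hadoop layout of property elements with name and value children. fs.defaultFS is the current key and fs.default.name is its deprecated alias.

```python
import xml.etree.ElementTree as ET


def default_fs(core_site_xml):
    """Return the default filesystem URI from core-site.xml text, or None.

    Hypothetical helper; assumes <configuration><property><name/><value/>
    entries as in a standard Hadoop configuration file.
    """
    root = ET.fromstring(core_site_xml)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        found[name] = value
    # Prefer the current key, fall back to the deprecated one.
    return found.get("fs.defaultFS") or found.get("fs.default.name")


# Made-up sample matching the host and port used in this post.
sample = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9001</value>
  </property>
</configuration>"""
print(default_fs(sample))
```

Running it on the contents of the local file and of the copy under hdp25 should give the same value.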

After that, I had to modify data-integration/plugins/pentaho-big-data-plugin/plugin.properties and set the property "active.hadoop.configuration" to hdp25:

active.hadoop.configuration=hdp25
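
I edited the file by hand, but if you want to script it, a minimal sketch could look like this. set_active_configuration is a hypothetical helper of mine, and the demo runs against a throwaway temp file with a made-up starting value, not the real plugin.properties:

```python
import tempfile
from pathlib import Path

KEY = "active.hadoop.configuration"


def set_active_configuration(properties_path, shim):
    """Set active.hadoop.configuration in a .properties file (hypothetical helper)."""
    path = Path(properties_path)
    lines = path.read_text().splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only touch a real "key=value" line; commented-out lines do not match.
        if line.split("=", 1)[0].strip() == KEY:
            lines[i] = f"{KEY}={shim}"
            replaced = True
    if not replaced:
        # Key was absent, so append it.
        lines.append(f"{KEY}={shim}")
    path.write_text("\n".join(lines) + "\n")


# Demo on a temporary file; "some-other-shim" is a placeholder value.
demo_dir = tempfile.mkdtemp()
demo_file = Path(demo_dir) / "plugin.properties"
demo_file.write_text("# demo file\nactive.hadoop.configuration=some-other-shim\nother.key=1\n")
set_active_configuration(demo_file, "hdp25")
result = demo_file.read_text()
print(result)
```

The value has to match the name of a folder under hadoop-configurations, which in my case is hdp25.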

Restart Spoon and you're good to go.

