hadoop - How can I add external Python libraries into HDFS?


Is there a way to add external libraries, like this one, to HDFS? It seems PySpark needs external libraries to be available in a shared folder on HDFS. But since I am using a shell script that runs a PySpark script which uses external libraries, it fails when importing them.

See the post here about the ImportError.

You can add external libraries with the --py-files option. You can provide either a .py file or a .zip archive.

For example, using spark-submit:

spark-submit --master yarn --py-files ./hdfs.zip myjob.py 

Check the corresponding documentation: Submitting Applications.
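A common reason the import still fails is the layout of the .zip: the package folder has to sit at the root of the archive so that Python can import it straight from the zip. Below is a minimal sketch of building such an archive with the standard library and checking locally that it imports. The package name mylib, the archive name deps.zip, and the helper build_py_files_zip are hypothetical and only for illustration; this is not taken from the Spark documentation.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_py_files_zip(package_dir, zip_path):
    """Zip a package so that the package folder sits at the archive root."""
    parent = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Store paths relative to the parent, e.g. mylib/__init__.py
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path


# Hypothetical package, created in a temp directory just for this demo.
work = tempfile.mkdtemp()
pkg = os.path.join(work, "mylib")
os.makedirs(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("def greet():\n    return 'hello from mylib'\n")

archive = build_py_files_zip(pkg, os.path.join(work, "deps.zip"))

# Local sanity check: put the zip on sys.path and import the package from it.
sys.path.insert(0, archive)
mylib = importlib.import_module("mylib")
print(mylib.greet())
```

If the local import works, the same archive can be passed to spark-submit in the way shown above, for instance --py-files deps.zip in place of ./hdfs.zip.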

