ArrayIndexOutOfBoundsException when iterating through a DataFrame in Spark SQL


I have a data set called people.json:

{"name":"michael"} {"name":"andy", "age":30} {"name":"justin", "age":19} 

The following code gives me an ArrayIndexOutOfBoundsException:

    import org.apache.spark.sql.SparkSession

    val sparkSession = SparkSession.builder
      .master("local")
      .appName("my-spark-app")
      .config("spark.some.config.option", "config-value")
      .getOrCreate()

    // needed for the .toDF() conversion below
    import sparkSession.implicits._

    // assumed definition; the original post uses Person without showing it
    case class Person(name: String, age: Int)

    val peopleDF = sparkSession.sparkContext
      .textFile("c:/users/desktop/spark/people.json")
      .map(_.split(","))
      .map(attributes => Person(attributes(0), attributes(1).trim.toInt))
      .toDF()

    peopleDF.createOrReplaceTempView("person")

    val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
    teenagersDF.show()

It looks like it is trying to work through an empty DataFrame. Can anyone tell me why it is empty?

Your code fails because you are parsing the JSON by hand: the first record, {"name":"michael"}, contains no comma, so split(",") returns a one-element array and attributes(1) is out of bounds. When you have a valid JSON file, you should use sqlContext to read the JSON file into a DataFrame instead.
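A minimal sketch of the failure, using just the first record from your file:

    // the first line of people.json has no comma, so split(",") yields one element
    val attributes = """{"name":"michael"}""".split(",")
    println(attributes.length)   // 1
    // attributes(1)             // throws java.lang.ArrayIndexOutOfBoundsException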

    import org.apache.spark.sql.SparkSession

    val sparkSession = SparkSession.builder
      .master("local")
      .appName("my-spark-app")
      .config("spark.some.config.option", "config-value")
      .getOrCreate()

    // the JSON reader parses one JSON object per line and infers the schema
    // (sparkSession.read.json is the equivalent, newer entry point)
    val peopleDF = sparkSession.sqlContext.read.json("c:/users/desktop/spark/people.json")

    peopleDF.createOrReplaceTempView("person")

    val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
    teenagersDF.show()
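With the sample data above, the final show() should print something close to this (age is inferred as a nullable long, so michael's age comes out as null):

    +-------+----+
    |   name| age|
    +-------+----+
    |michael|null|
    |   andy|  30|
    | justin|  19|
    +-------+----+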

This should work.
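As an aside, since the variable is named teenagersDF, you probably also want a WHERE clause; a minimal sketch, not part of the original code:

    // hypothetical refinement: actually restrict the result to teenagers
    val teenagersDF = sparkSession.sql(
      "SELECT name, age FROM person WHERE age BETWEEN 13 AND 19")
    teenagersDF.show()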

