ArrayIndexOutOfBoundsException when iterating through a DataFrame in Spark SQL
I have a dataset called people.json:
{"name":"michael"} {"name":"andy", "age":30} {"name":"justin", "age":19}
The following code gives me an ArrayIndexOutOfBoundsException.
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder
  .master("local")
  .appName("my-spark-app")
  .config("spark.some.config.option", "config-value")
  .getOrCreate()

import sparkSession.implicits._  // needed for toDF()

// assumes a case class Person(name: String, age: Int) is defined in scope
val peopleDF = sparkSession.sparkContext
  .textFile("c:/users/desktop/spark/people.json")
  .map(_.split(","))
  .map(attributes => Person(attributes(0), attributes(1).trim.toInt))
  .toDF()

peopleDF.createOrReplaceTempView("person")

val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
teenagersDF.show()
It looks like it is trying to work through an empty DataFrame. Can anyone tell me why it is empty?
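For context, the exception most likely comes from the split(",") step rather than from an empty DataFrame: the first JSON line contains no comma, so the resulting array has only one element and attributes(1) is out of bounds. A minimal sketch illustrating this, using the first line of the people.json sample above:

val line = """{"name":"michael"}"""   // first line of people.json, no comma
val attributes = line.split(",")      // Array with a single element
println(attributes.length)            // prints 1
// attributes(1) would throw ArrayIndexOutOfBoundsException here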
When you have a valid JSON file, you should use sqlContext to read the JSON file into a DataFrame:
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder
  .master("local")
  .appName("my-spark-app")
  .config("spark.some.config.option", "config-value")
  .getOrCreate()

// Read the JSON file directly into a DataFrame; the schema is inferred
val peopleDF = sparkSession.sqlContext.read.json("c:/users/desktop/spark/people.json")

peopleDF.createOrReplaceTempView("person")

val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
teenagersDF.show()
This should work.
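If the goal is actually to select teenagers, as the variable name teenagersDF suggests, a WHERE clause can be added to the query once the temp view is registered. A sketch, assuming the same "person" view and sample data as above:

val teenagersDF = sparkSession.sql("SELECT name, age FROM person WHERE age BETWEEN 13 AND 19")
teenagersDF.show()
// expected to print something like:
// +------+---+
// |  name|age|
// +------+---+
// |justin| 19|
// +------+---+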