ArrayIndexOutOfBoundsException when iterating through a DataFrame in Spark SQL
I have a data set called people.json:

{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}

The following code gives me an ArrayIndexOutOfBoundsException.
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder
  .master("local")
  .appName("my-spark-app")
  .config("spark.some.config.option", "config-value")
  .getOrCreate()

val peopleDF = sparkSession.sparkContext
  .textFile("C:/Users/Desktop/spark/people.json")
  .map(_.split(","))
  .map(attributes => Person(attributes(0), attributes(1).trim.toInt))
  .toDF()

peopleDF.createOrReplaceTempView("person")

val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
teenagersDF.show()

It looks like it is trying to work through an empty DataFrame. Can anyone tell me why it is empty?
The file is JSON, not comma-separated text, so splitting each line on "," does not produce the two fields your Person constructor expects. When you have a valid JSON file, you should use the sqlContext to read the JSON file into a DataFrame.
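To see where the exception comes from, look at what split(",") returns for the first record. A minimal sketch, assuming the file contents shown above (this snippet is illustrative, not from the original post):

// The first record contains no comma, so split(",") returns a
// single-element array and attributes(1) is out of bounds.
val line = """{"name":"Michael"}"""
val attributes = line.split(",")
println(attributes.length)  // 1
// attributes(1)            // throws java.lang.ArrayIndexOutOfBoundsException: 1

// Even lines that do contain a comma would fail later: for
// {"name":"Andy", "age":30} the second piece is " \"age\":30}",
// and calling .trim.toInt on it throws a NumberFormatException.

Reading the file through Spark's JSON reader avoids the manual parsing entirely: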
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder
  .master("local")
  .appName("my-spark-app")
  .config("spark.some.config.option", "config-value")
  .getOrCreate()

val peopleDF = sparkSession.sqlContext.read.json("C:/Users/Desktop/spark/people.json")
peopleDF.createOrReplaceTempView("person")

val teenagersDF = sparkSession.sql("SELECT name, age FROM person")
teenagersDF.show()

This should work.
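As a side note, on Spark 2.x the reader is available directly on the SparkSession, so the hop through sqlContext is unnecessary. A minimal sketch; the WHERE clause is an assumption based on the teenagersDF variable name, since the original query selects everyone:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local")
  .appName("my-spark-app")
  .getOrCreate()

// read.json expects one JSON object per line, which matches people.json above.
// The schema (name: string, age: long) is inferred from the data.
val peopleDF = spark.read.json("C:/Users/Desktop/spark/people.json")
peopleDF.createOrReplaceTempView("person")

// Records without an age (Michael) come back with age = null and are
// excluded by the range predicate. The 13-19 range is an assumption
// based on the teenagersDF variable name.
val teenagersDF = spark.sql("SELECT name, age FROM person WHERE age BETWEEN 13 AND 19")
teenagersDF.show()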