Transform data in Spark Scala from columns to rows
This question already has answers here: Transpose column to row with Spark
Input dataset:

customerid customername sun mon tue
1          abc          0   12  10
2          def          10  0   0
Required output dataset:

customerid customername day value
1          abc          sun 0
1          abc          mon 12
1          abc          tue 10
2          def          sun 10
2          def          mon 0
2          def          tue 0
Please note that there are 82 "sun mon tue"-style day columns in the actual dataset!
Assuming the input dataset is generated using a case class as

case class Infos(customerid: Int, customername: String, sun: Int, mon: Int, tue: Int)
For testing purposes, the dataset can be created as

import sqlContext.implicits._

val ds = Seq(
  Infos(1, "abc", 0, 12, 10),
  Infos(2, "def", 10, 0, 0)
).toDS
which should give the input dataset
+----------+------------+---+---+---+
|customerid|customername|sun|mon|tue|
+----------+------------+---+---+---+
|1         |abc         |0  |12 |10 |
|2         |def         |10 |0  |0  |
+----------+------------+---+---+---+
Getting the final required dataset requires creating a case class as

case class FinalInfos(customerid: Int, customername: String, day: String, value: Int)
The final required dataset can then be achieved by doing the following

val names = ds.schema.fieldNames

ds.flatMap(row => Array(
  FinalInfos(row.customerid, row.customername, names(2), row.sun),
  FinalInfos(row.customerid, row.customername, names(3), row.mon),
  FinalInfos(row.customerid, row.customername, names(4), row.tue)
))
which should give the dataset
+----------+------------+---+-----+
|customerid|customername|day|value|
+----------+------------+---+-----+
|1         |abc         |sun|0    |
|1         |abc         |mon|12   |
|1         |abc         |tue|10   |
|2         |def         |sun|10   |
|2         |def         |mon|0    |
|2         |def         |tue|0    |
+----------+------------+---+-----+
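Since the question mentions 82 day columns, hardcoding every column inside flatMap does not scale. Below is a minimal sketch of a more generic alternative, assuming Spark 2.x (where the SQL stack function is available), the same ds and FinalInfos defined above, and that every day column is an Int:

// Every column after customerid and customername is treated as a day column
val dayCols = ds.columns.drop(2)

// Builds "stack(3, 'sun', sun, 'mon', mon, 'tue', tue) as (day, value)"
val stackExpr = dayCols
  .map(c => s"'$c', $c")
  .mkString(s"stack(${dayCols.length}, ", ", ", ") as (day, value)")

val melted = ds.selectExpr("customerid", "customername", stackExpr)

// Optionally map back to the typed Dataset
val typed = melted.as[FinalInfos]
typed.show(false)

Because the stack expression is built from ds.columns, it adapts automatically to however many day columns the dataset actually has.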