Transform data in Spark Scala from columns to rows
This question already has answers here:
- Transpose column to row with Spark (5 answers)
Input dataset:

+----------+------------+---+---+---+
|customerid|customername|sun|mon|tue|
+----------+------------+---+---+---+
|1         |abc         |0  |12 |10 |
|2         |def         |10 |0  |0  |
+----------+------------+---+---+---+

Required output dataset:

+----------+------------+---+-----+
|customerid|customername|day|value|
+----------+------------+---+-----+
|1         |abc         |sun|0    |
|1         |abc         |mon|12   |
|1         |abc         |tue|10   |
|2         |def         |sun|10   |
|2         |def         |mon|0    |
|2         |def         |tue|0    |
+----------+------------+---+-----+

Please note: the number of "sun mon tue"-style columns is 82 in the real dataset!
Assuming the input dataset is generated using a case class such as

case class Infos(customerid: Int, customername: String, sun: Int, mon: Int, tue: Int)

For testing purposes, the dataset can be created as
import sqlContext.implicits._

val ds = Seq(
  Infos(1, "abc", 0, 12, 10),
  Infos(2, "def", 10, 0, 0)
).toDS

which should give the input dataset
+----------+------------+---+---+---+
|customerid|customername|sun|mon|tue|
+----------+------------+---+---+---+
|1         |abc         |0  |12 |10 |
|2         |def         |10 |0  |0  |
+----------+------------+---+---+---+

Getting the final required dataset requires creating a case class as
case class FinalInfos(customerid: Int, customername: String, day: String, value: Int)

The final required dataset can then be achieved by doing the following
val names = ds.schema.fieldNames

ds.flatMap(row => Array(
  FinalInfos(row.customerid, row.customername, names(2), row.sun),
  FinalInfos(row.customerid, row.customername, names(3), row.mon),
  FinalInfos(row.customerid, row.customername, names(4), row.tue)))

which should give the dataset
+----------+------------+---+-----+
|customerid|customername|day|value|
+----------+------------+---+-----+
|1         |abc         |sun|0    |
|1         |abc         |mon|12   |
|1         |abc         |tue|10   |
|2         |def         |sun|10   |
|2         |def         |mon|0    |
|2         |def         |tue|0    |
+----------+------------+---+-----+
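Since the question notes there are 82 such day columns, hard-coding each field inside the flatMap does not scale well. As a rough sketch (not part of the original answer, and assuming the same schema as above with the first two columns being the key columns), the untyped DataFrame API together with the SQL stack function can unpivot all non-key columns at once:

// Generic unpivot sketch (assumption: key columns come first, all remaining columns are Int values).
// stack(n, 'name1', col1, ..., 'nameN', colN) expands n (name, value) pairs into n rows.
val df = ds.toDF()
val dayCols = df.columns.drop(2)
val stackArgs = dayCols.map(c => s"'$c', $c").mkString(", ")

val melted = df.selectExpr(
  "customerid",
  "customername",
  s"stack(${dayCols.length}, $stackArgs) as (day, value)")

melted.show(false)

This should produce the same day/value layout as the flatMap version, but works for any number of value columns without listing them by hand.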