PySpark: save RDD of JSON strings to MongoDB
I want to move a Spark DataFrame to MongoDB, and I want to use the MongoDB connection.
I came up with an RDD of JSON strings (which should be straightforward to insert into MongoDB using pymongo, for example).
myrdd.take(1)
'{\n "language": "french",\n "id": 358539,\n "title": "effet tetris",\n "topics": [\n {\n "topic": "video_games",\n ...
However, I'm stuck at this point: converting the RDD to a proper DataFrame is not an option, since I have repeated and nested fields.
Converting the RDD to a DataFrame with one column (the JSON string) and saving that to MongoDB produces a different structure: since the DataFrame column has a name, I end up with this in MongoDB:
{ "_id" : ... , "colname" : "myjsonstring" }
which is not what I want.
Any suggestions?
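For reference, the pymongo route mentioned above could look roughly like the sketch below: each JSON string is parsed into a dict and each partition is written with `insert_many`, so nested and repeated fields land in MongoDB as-is instead of under a single string column. The function names, the connection URI, and the database and collection names are all placeholders made up for illustration, not anything from an actual setup.

```python
import json


def parse_documents(json_strings):
    """Turn an iterable of JSON strings into a list of dicts (one per document)."""
    return [json.loads(s) for s in json_strings]


def insert_partition(json_strings, collection):
    """Insert one partition's documents into any object exposing insert_many()."""
    docs = parse_documents(json_strings)
    if docs:  # pymongo's insert_many raises on an empty list
        collection.insert_many(docs)
    return len(docs)


def save_partition(json_strings, uri="mongodb://localhost:27017",
                   db_name="mydb", coll_name="mycoll"):
    """Open a client on the executor; URI and names are placeholder assumptions."""
    from pymongo import MongoClient  # imported lazily so it resolves on the executors
    client = MongoClient(uri)
    try:
        insert_partition(json_strings, client[db_name][coll_name])
    finally:
        client.close()


# Usage on the RDD from the question (needs a Spark job and a reachable MongoDB):
# myrdd.foreachPartition(save_partition)
```

Opening one client per partition rather than per record keeps the connection count low, and it assumes pymongo is installed on every executor.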