PySpark: save RDD of JSON strings to MongoDB


I want to move a Spark DataFrame to MongoDB, and I want to use a MongoDB connection.

I came up with an RDD of JSON strings (which should be immediate to insert into MongoDB using pymongo, for example).

myrdd.take(1)
'{\n  "language": "french",\n  "id": 358539,\n  "title": "effet tetris",\n  "topics": [\n    {\n      "topic": "video_games",\n ...

However, I'm stuck at this point: converting the RDD to a proper DataFrame is not an option, since I have repeated and nested fields.

Converting the RDD to a DataFrame with one column (the JSON string) and saving it to Mongo would produce a different structure, since the DataFrame would have a named column, and I would end up with this in MongoDB:

{ "_id" : ... , "colname" : "myjsonstring" }

which is not what I want.

Any suggestions?
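For reference, the pymongo route mentioned above might look roughly like the sketch below. It parses each JSON string into a dict and inserts the dicts per partition, so nested and repeated fields are stored as real sub-documents rather than as one string column. The function names, the connection URI, and the database and collection names are placeholders chosen for illustration, not taken from the question; it also assumes pymongo is installed on every executor.

```python
import json


def parse_partition(json_lines):
    """Parse an iterator of JSON strings into a list of dicts."""
    return [json.loads(line) for line in json_lines]


def save_partition(json_lines,
                   uri="mongodb://localhost:27017",  # assumed URI
                   db_name="mydb",                   # hypothetical database name
                   coll_name="mycoll"):              # hypothetical collection name
    """Insert one partition of JSON strings into MongoDB."""
    # Imported here so the import happens on the executor, not the driver.
    from pymongo import MongoClient

    docs = parse_partition(json_lines)
    if not docs:
        # insert_many rejects an empty list, so skip empty partitions.
        return

    # One client per partition: clients cannot be pickled and shipped
    # from the driver, and one per record would be wasteful.
    client = MongoClient(uri)
    try:
        client[db_name][coll_name].insert_many(docs)
    finally:
        client.close()


# Usage on the RDD from the question (needs a running MongoDB):
# myrdd.foreachPartition(save_partition)
```

This loads a whole partition into memory before inserting. For very large partitions, inserting in fixed-size batches would be the safer variant.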

