python - Pickling and unpickling in different modules


I know this has been covered in a number of other questions (for example, "Unable to load files using pickle and multiple modules"), but I can't see how the solutions apply to my situation.

This is my project structure (as minimal as possible):

    classify-updater/
    ├── main.py
    └── updater
        ├── __init__.py
        └── updater.py
    classify
    └── main.py

In classify-updater/main.py:

    import sys
    from sklearn.feature_extraction.text import CountVectorizer
    from updater.updater import Updater

    def main(argv):
        vectorizer = CountVectorizer(stop_words='english')
        updater = Updater(vectorizer)
        updater.update()

    if __name__ == "__main__":
        main(sys.argv)

In classify-updater/updater/updater.py:

    import dill

    class Updater:

        def __init__(self, vectorizer):
            # The lambda is defined here, inside the updater.updater module
            vectorizer.preprocessor = lambda doc: doc.text.encode('ascii', 'ignore')
            self.vectorizer = vectorizer

        def update(self):
            pickled_vectorizer = dill.dumps(self.vectorizer)
            # Save to Google Cloud Storage

In classify/main.py:

    import dill
    import sys

    def main(argv):
        # Load from Google Cloud Storage
        vectorizer = dill.loads(vectorizer_blob)

    if __name__ == "__main__":
        main(sys.argv)

This results in an ImportError:

    Traceback (most recent call last):
      File "classify.py", line 102, in <module>
        app.main(sys.argv)
      File "classify.py", line 50, in main
        vectorizer = self.fetch_vectorizer()
      File "classify.py", line 86, in fetch_vectorizer
        vectorizer = dill.loads(vectorizer_blob.download_as_string())
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 299, in loads
        return load(file)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 288, in load
        obj = pik.load()
      File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 864, in load
        dispatch[key](self)
      File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1096, in load_global
        klass = self.find_class(module, name)
      File "/usr/local/lib/python2.7/site-packages/dill/dill.py", line 445, in find_class
        return StockUnpickler.find_class(self, module, name)
      File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1130, in find_class
        __import__(module)
    ImportError: No module named updater.updater

It has been explained elsewhere that pickle needs the class definition in order to load an object, but I can't see where the reference to the updater module comes from, since I'm only pickling an instance of the vectorizer.
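One way to find out where such a reference comes from is to list the globals a pickle will import when it is loaded. The sketch below uses the standard library's pickletools for this. The helper name referenced_globals is made up for illustration, a Fraction stands in for the real blob, and only the GLOBAL opcode of protocols 0 to 2 (what Python 2.7 writes) is handled; newer protocols use a different opcode and would need more work.

```python
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(blob):
    """Return the (module, name) pairs a protocol 0-2 pickle imports on load."""
    found = []
    for opcode, arg, _position in pickletools.genops(blob):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
    return found


# Fraction is only a convenient standard-library stand-in for the real blob.
blob = pickle.dumps(Fraction(1, 2), protocol=2)
references = referenced_globals(blob)
print(references)
```

Run against the vectorizer blob from the question, a helper like this should list updater.updater among the referenced modules.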

I've simplified the example heavily. The two packages sit quite far apart in our codebase, so importing one module from the other might not be feasible. Is there a way to work around this?

The issue here is the lambda (anonymous function).

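It is possible to pickle a self-contained object such as the vectorizer. However, the preprocessing function used in the example is scoped to the Updater class, so the pickle carries a reference to updater.updater, and that module is required in order to unpickle it.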
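The failure can be reproduced with the standard library alone. The hand-written pickle stream in this sketch contains nothing except a reference to updater.updater.Updater, so loading it fails at the import step just as in the traceback. It assumes no module named updater is importable where it runs.

```python
import pickle

# Protocol 0 stream: GLOBAL opcode ("c"), module name, attribute name, STOP (".").
# It holds only a reference to updater.updater.Updater, not the class itself.
blob = b"cupdater.updater\nUpdater\n."

try:
    pickle.loads(blob)
    outcome = "loaded"
except ImportError as exc:  # ModuleNotFoundError on Python 3 is a subclass
    outcome = "ImportError: {}".format(exc)

print(outcome)
```

The unpickler never gets as far as looking at the object's data: it stops as soon as the referenced module cannot be imported.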

Rather than attaching a preprocessor function, preprocess the data yourself and pass the result in when you fit the vectorizer. This removes the need for the Updater class when unpickling.
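A minimal sketch of that preprocessing step, written for Python 3 and using only the standard library. The Document type and the sample texts are hypothetical stand-ins for the objects in the question, which expose their content through a .text attribute, and to_ascii is an illustrative name for the logic the lambda performed.

```python
from collections import namedtuple

# Hypothetical stand-in for the document objects in the question.
Document = namedtuple("Document", ["text"])


def to_ascii(text):
    """Drop non-ASCII characters, as the lambda preprocessor did."""
    return text.encode("ascii", "ignore").decode("ascii")


documents = [Document(u"caf\u00e9 menu"), Document(u"na\u00efve Bayes")]

# Clean the raw text up front instead of inside the vectorizer.
cleaned = [to_ascii(doc.text) for doc in documents]
print(cleaned)
```

Because cleaned is a list of plain strings, it can be passed to an unmodified CountVectorizer (for example vectorizer.fit(cleaned)), and the fitted vectorizer then holds no function defined in the updater package. The same to_ascii step has to be applied to any text before it is passed to the loaded vectorizer on the classify side.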

