indexing - Creating a .p file from an InvertedIndex
I am trying to construct a Python search engine that pulls queried tweets out of the Twitter API. I cannot use tweepy. I have constructed a class twitterwrapper and a class engine, and I have created (pickled) a corpus for the query term "raptor", no problem there. I have also created a class InvertedIndex, but I cannot seem to construct an index of lemmatized words and write it from my Jupyter notebook to a pickle file (.p), which I must do in order to use the twittir.py interface:
    t = twittir.engine("raptor.p", "index (to be named).p")
    results = t.query("raptor")
    for result in results:
        print(result)
So, my question is: how do I make the index (using the inverted index) and save it from the Jupyter notebook as a .p file? Of course, I am a bit of a Python beginner.
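    import nltk

    # Methods of the InvertedIndex class
    def build_index(self, corpus):
        index = {}  # local variable; it is never used or stored on self
        # Create the lemmatizer and stop-word list on first use
        if self.lemmatizer is None:
            self.lemmatizer = nltk.WordNetLemmatizer()
        if self.stop_words is None:
            self.stop_words = [self.wordtotoken(word)
                               for word in nltk.corpus.stopwords.words('english')]
        # Add every document in the corpus under a running document id
        for doc in corpus:
            self.add_document(doc['text'], self.ndocs)
            self.ndocs = self.ndocs + 1

    def __getitem__(self, index):
        # Look up the first normalized token of the query
        return self.index[self.normalize_document(index)[0]]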
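For reference, here is a minimal sketch of the two steps being asked about: building a term-to-document mapping and writing it to a .p file with the standard pickle module. The names normalize, build_index, save_index and load_index, the sample corpus, and the file name raptor_index.p are all hypothetical. The regex tokenizer is only a stand-in for the lemmatizer and stop-word filtering, and the format that twittir.engine actually expects for its index file is not known from the post.

```python
import os
import pickle
import re
import tempfile


def normalize(text):
    # Stand-in normalization: lowercase and keep alphabetic runs.
    # Lemmatization and stop-word removal would go here instead.
    return re.findall(r"[a-z]+", text.lower())


def build_index(corpus):
    # Map each term to the sorted list of ids of documents containing it.
    postings = {}
    for doc_id, doc in enumerate(corpus):
        for term in normalize(doc["text"]):
            postings.setdefault(term, set()).add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def save_index(index, path):
    # Pickle files must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(index, f)


def load_index(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical two-document corpus with the same 'text' key as in the post.
corpus = [
    {"text": "Raptor spotted near the lake"},
    {"text": "The lake is calm"},
]
index = build_index(corpus)

# Written to the temp directory here; in a notebook any writable path works.
path = os.path.join(tempfile.gettempdir(), "raptor_index.p")
save_index(index, path)
restored = load_index(path)
print(restored["raptor"], restored["lake"])
```

The sketch pickles a plain dict rather than a class instance. Unpickling an instance requires its class to be importable in the loading program, whereas a dict of built-in types loads anywhere. If the interface expects an InvertedIndex object instead, the same pickle.dump call can be given the object itself.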