CSV Silently Not Reading All Lines on Python on Windows
I'm trying to read the lines of a TSV file into a list. However, the TSV reader is terminating early and not reading the whole file. I know this because the data is only
1/6 of the length of the whole file. No errors are thrown when this happens.
When I manually inspect the line it terminates on (corresponding to the length of data), those lines have tons of Unicode symbols. I thought I could just catch a UnicodeDecodeError, but instead of throwing an error, it quits out of reading the whole file entirely. I imagine it's hitting something that's triggering an end-of-file??
What's throwing me for a loop: this error only occurs when I'm using Python 2.7 on Windows Server 2012. The file reads 100% on Unix implementations of Python 2.7 using both code snippets below. I'm running this inside Anaconda on both.
Here's what I've tried, and neither works:
    import csv

    data = []
    with open('data.tsv', 'r') as infile:
        # Strip NUL bytes from each line before handing it to the reader
        csvreader = csv.reader((x.replace('\0', '') for x in infile),
                               delimiter='\t', quoting=csv.QUOTE_NONE)
        data = list(csvreader)
I also tried reading it line by line...
    data = []
    with open('data.tsv', 'r') as infile:
        for line in infile:
            try:
                d = line.split('\t')
                q = d[0].decode('utf-8')  # where the unicode symbols are located
                data.append(d)
            except UnicodeDecodeError:
                continue
Thanks in advance!
As per the general suggestion in the documentation:
"If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference."
So open the file with:
    with open('data.csv', 'rb') as infile:
        csvreader = csv.reader(infile, delimiter='\t', quoting=csv.QUOTE_NONE)
        data = list(csvreader)
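One likely reason the 'b' flag matters here, though the question does not confirm it, is that the file contains a Ctrl-Z byte (0x1A), which text mode on Windows treats as end-of-file. A minimal sketch for checking that assumption is to scan the file in binary mode and report where such bytes occur. The helper name `find_ctrl_z` is hypothetical, not part of any library.

```python
def find_ctrl_z(path, chunk_size=1 << 16):
    """Return the byte offsets of every 0x1A (Ctrl-Z) byte in the file."""
    offsets = []
    position = 0  # offset of the current chunk within the file
    # Binary mode: no newline translation and no Ctrl-Z end-of-file handling.
    with open(path, 'rb') as infile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            start = chunk.find(b'\x1a')
            while start != -1:
                offsets.append(position + start)
                start = chunk.find(b'\x1a', start + 1)
            position += len(chunk)
    return offsets
```

If `find_ctrl_z('data.tsv')` returns a non-empty list and the first offset falls roughly where the text-mode read stopped, that would support the explanation. An empty list would mean something else is going on.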
Also, you have to decode the strings yourself if you have Unicode data, or use unicodecsv
as a drop-in replacement so you don't have to worry about it.
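If you go the manual route, here is a rough sketch of decoding each cell after the rows have been read as byte strings. It assumes the file is UTF-8 (as in the question's own snippet) and that replacing undecodable bytes is acceptable. `decode_row` is a hypothetical helper name, not something from the csv module.

```python
def decode_row(row, encoding='utf-8', errors='replace'):
    """Decode every byte-string cell in a row to text; leave other cells untouched."""
    return [cell.decode(encoding, errors) if isinstance(cell, bytes) else cell
            for cell in row]


# Example row as Python 2's csv.reader would yield it from a file opened with 'rb'
raw_row = [b'caf\xc3\xa9', b'42']
decoded_row = decode_row(raw_row)
```

With `errors='replace'`, a bad byte becomes U+FFFD instead of raising UnicodeDecodeError, so no rows get dropped. Pass `errors='strict'` if you would rather find out about bad data.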