Python subprocess call to xpdf's pdftotext not working with encoding


I am trying to run pdftotext using Python's subprocess module.

import subprocess

pdf = r"path\to\file.pdf"
txt = r"path\to\out.txt"
pdftotext = r"path\to\pdftotext.exe"

cmd = [pdftotext, pdf, txt, '-enc utf-8']
response = subprocess.check_output(cmd,
                                   shell=True,
                                   stderr=subprocess.STDOUT)

Traceback:

CalledProcessError: Command '['path\\to\\pdftotext.exe', 'path\\to\\file.pdf', 'path\\to\\out.txt', '-enc utf-8']' returned non-zero exit status 99

When I remove the last argument, '-enc utf-8', from cmd, it works OK in Python.

When I run pdftotext pdf txt -enc utf-8 in cmd, it also works OK.

What am I missing?

Thanks.

subprocess has complicated rules for handling commands. From the docs:

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.

More details are explained in the answer here.
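To see why the list form in the question fails, it may help to look at how a list is turned into a single command line. The sketch below assumes Windows, where subprocess joins the list using the rules implemented by subprocess.list2cmdline (an undocumented helper in the standard library). The paths are the same placeholders used in the question.

```python
import subprocess

# Placeholder paths, copied from the question.
pdftotext = r"path\to\pdftotext.exe"
pdf = r"path\to\file.pdf"
txt = r"path\to\out.txt"

# One list element containing a space: it gets quoted, so the program
# receives "-enc utf-8" as a single argument.
joined_bad = subprocess.list2cmdline([pdftotext, pdf, txt, "-enc utf-8"])

# Flag and value as separate elements: the program receives two arguments.
joined_ok = subprocess.list2cmdline([pdftotext, pdf, txt, "-enc", "utf-8"])

print(joined_bad)
print(joined_ok)
```

The first line printed ends with "-enc utf-8" wrapped in double quotes, while the second ends with a bare -enc utf-8. That is consistent with pdftotext rejecting the quoted form as one unknown option.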

So, as the docs explain, you should convert your command to a string:

cmd = r"""{} "{}" "{}" -enc utf-8""".format('pdftotext', pdf, txt)  

Now, call subprocess as:

subprocess.call(cmd, shell=True, stderr=subprocess.STDOUT)
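An alternative to the string form, not taken from the answer above, is to keep the list, drop shell=True, and pass the flag and its value as separate elements. This is a minimal sketch: build_pdftotext_cmd and run_pdftotext are hypothetical helper names, and the argument order simply mirrors the command in the question.

```python
import subprocess


def build_pdftotext_cmd(pdftotext, pdf, txt, encoding="utf-8"):
    """Return an argument list with -enc and its value as separate items."""
    return [pdftotext, pdf, txt, "-enc", encoding]


def run_pdftotext(pdftotext, pdf, txt, encoding="utf-8"):
    """Run pdftotext without a shell and return its combined output as bytes.

    Raises subprocess.CalledProcessError if the exit status is non-zero.
    """
    cmd = build_pdftotext_cmd(pdftotext, pdf, txt, encoding)
    return subprocess.check_output(cmd, stderr=subprocess.STDOUT)
```

With the placeholders from the question this would be called as run_pdftotext(pdftotext, pdf, txt). Because no shell is involved, no manual quoting of the paths is needed.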
