2017-03-03

PyPDF2: Stream has ended unexpectedly

I have a Python script that uses PyPDF2 to reverse the order of the pages in a PDF:

from PyPDF2 import PdfFileWriter, PdfFileReader 

output = PdfFileWriter() 
rpage = [] 
name = input("What's the file called?") 

filename = name.split('.', 1) 

input1 = PdfFileReader(open(name,'rb'), strict = False) 

pages = list(range(1,input1.getNumPages() + 1)) 

for i in range(0, (input1.getNumPages())): 
    rpage.append(pages[input1.getNumPages() - i -1]) 
for i in rpage: 
    output.addPage(input1.getPage(i-1)) 

outputpath = filename[0] + '-reversed.pdf' 

outputStream = open(outputpath, "wb") 
output.write(outputStream) 

The script works as intended until it tries to write the output stream, at which point it fails with this error:

PdfReadWarning: Invalid stream (index 59) within object 108 0: Stream has ended unexpectedly [pdf.py:1573] 
Traceback (most recent call last): 
    File "D:\Documents\Google Drive\Programming\Python\PDF Scripts\reverse pdf.py", line 22, in <module> 
    output.write(outputStream)
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, data[i]) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences 
    newobj = data.pdf.getObject(data) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject 
    retval = readObject(self.stream, self) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 66, in readObject 
    return DictionaryObject.readFromStream(stream, pdf) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 611, in readFromStream 
    data["__streamdata__"] = stream.read(length) 
TypeError: integer argument expected, got 'NullObject' 

The code does create a PDF file, but it is 0 KB in size and therefore unreadable. I also tested an example script that merges three PDFs; it produced another empty file and this error:

PdfReadWarning: Invalid stream (index 59) within object 108 0: Stream has ended unexpectedly [pdf.py:1573] 
Traceback (most recent call last): 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 1567, in _getObjectFromStream 
    obj = readObject(streamData, self) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 98, in readObject 
    return NumberObject.readFromStream(stream) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 269, in readFromStream 
    num = utils.readUntilRegex(stream, NumberObject.NumberPattern) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\utils.py", line 134, in readUntilRegex 
    raise PdfStreamError("Stream has ended unexpectedly") 
PyPDF2.utils.PdfStreamError: Stream has ended unexpectedly 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "D:\Documents\Google Drive\Programming\Python\PDF Scripts\untitled1.py", line 27, in <module> 
    merger.write(output) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\merger.py", line 230, in write 
    self.output.write(fileobj) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 482, in write 
    self._sweepIndirectReferences(externalReferenceMap, self._root) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, data[i]) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences 
    self._sweepIndirectReferences(externMap, realdata) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences 
    value = self._sweepIndirectReferences(externMap, value) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences 
    newobj = data.pdf.getObject(data) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject 
    retval = readObject(self.stream, self) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 66, in readObject 
    return DictionaryObject.readFromStream(stream, pdf) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\generic.py", line 609, in readFromStream 
    length = pdf.getObject(length) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 1593, in getObject 
    retval = self._getObjectFromStream(indirectReference) 
    File "C:\Users\Charles\Anaconda3\lib\site-packages\PyPDF2\pdf.py", line 1576, in _getObjectFromStream 
    raise utils.PdfReadError("Can't read object stream: %s"%e) 
PyPDF2.utils.PdfReadError: Can't read object stream: Stream has ended unexpectedly 

The previous error is also raised when I use this script to split a PDF into its individual pages:

from PyPDF2 import PdfFileWriter, PdfFileReader 
infile = PdfFileReader(open('test.pdf', 'rb')) 

for i in range(infile.getNumPages()): 
    p = infile.getPage(i) 
    outfile = PdfFileWriter() 
    outfile.addPage(p) 
    with open('page-%02d.pdf' % i, 'wb') as f: 
        outfile.write(f)

The code above produces (n-1) readable PDFs, but the nth PDF is an empty file. Any idea how I can fix this?

Answer

0

Your script counts through the pages in several different places, and the purpose of that isn't clear to me. I believe the way you count backwards is the source of your error.

I took your script and first adapted it to Python 2.7 (since that's what I'm currently running), then simplified it to walk backwards through your source file once and build the reversed file.

from PyPDF2 import PdfFileWriter, PdfFileReader 

output = PdfFileWriter() 
# rpage = [] removed because it's not needed anymore 
name = raw_input("What's the file called? ") #Changed for the 2.7 environment 

filename = name[:-4] #Simplified, since we know where the piece we want is. 

input1 = PdfFileReader(name)
#Simplified, because I couldn't figure out why it was complex.
#No mode string: given a path, PdfFileReader opens it in binary mode itself.

for i in range(input1.getNumPages(),0,-1): 
    #getNumPages counts like a human and gives the total number of pages 
    #This counts backwards, so no need to count forward and use that to 
    #reverse the numbers. 
    output.addPage(input1.getPage(i-1)) 
    #getPage counts like a computer and needs to finish with page 0. 

outputpath = filename + '-reversed.pdf' 

outputStream = open(outputpath, "wb") 
output.write(outputStream) 
outputStream.close() #Closes the file and stream once you're done. 
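
To make the index arithmetic explicit, here is a minimal sketch of the backwards count on its own, without PyPDF2. The helper name reversed_page_indices is hypothetical and only for illustration; it is not part of the library.

```python
def reversed_page_indices(num_pages):
    """Return zero-based page indices from the last page to the first."""
    # range(start, stop, step): start at the last index, stop before -1.
    return list(range(num_pages - 1, -1, -1))


# The loop in the script above, range(n, 0, -1) followed by getPage(i - 1),
# visits the same indices.
n = 4
from_script = [i - 1 for i in range(n, 0, -1)]
print(reversed_page_indices(n))
print(from_script == reversed_page_indices(n))
```

For a four-page file both forms give the indices 3, 2, 1, 0, so getPage is never asked for a page that does not exist.
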
+0

I ran this program, changed it back to Python 3 by replacing "raw_input" with "input", and got this error: http://pastebin.com/PgwQvCyQ

+0

On your reader object, try this: 'input1 = PdfFileReader(name, strict=False)'. According to the website, there is a possible bug in the reader.

0

If all you want is to reverse the pages for printing, and you don't care about preserving internal links and annotations, pdfrw might be better suited to the task than PyPDF2:

from pdfrw import PdfWriter, PdfReader 

iname = input("What's the file called? ") 
oname = iname.rsplit('.', 1)[0] + '-reversed.pdf' 

output = PdfWriter() 
output.addpages(reversed(PdfReader(iname).pages)) 
output.write(oname) 
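
A side note on the output name: rsplit('.', 1) cuts at the last dot anywhere in the string, so a path such as scans.v2/report (no extension, but a dot in the folder name) would be cut inside the folder name. A minimal sketch of an alternative using the standard library's os.path.splitext follows; reversed_name is a hypothetical helper, not part of pdfrw.

```python
import os.path


def reversed_name(path):
    """Build the output path by putting '-reversed' before the extension."""
    # splitext only looks for a dot in the final path component.
    stem, _ext = os.path.splitext(path)
    return stem + '-reversed.pdf'


print(reversed_name('report.pdf'))
print(reversed_name('scans.v2/report'))
```

For ordinary names like report.pdf, both approaches give the same result.
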

Disclaimer: I am the primary author of pdfrw.
