Hi,
I use Aspose PdfKit 4.2 for Java.
I run this simple code:
InputStream input = new FileInputStream("/home/mburger/tmp/mfm/207_2008_1_T4_2.pdf");
OutputStream output = new FileOutputStream("/home/mburger/tmp/mfm/xxxx.pdf");
PdfFileEditor editor = new PdfFileEditor();
editor.extract(input, 1, 1, output);
output.flush();
output.close();
input.close();
The output file ALWAYS has the same size as the input file. If my document has 10 pages and is 5 MB, and I extract only the first page, the resulting file contains the desired page but is still 5 MB!!! If I extract 2 pages, the file has the same size as the one extracted with one page!
Can you help me?
Same thing if I try splitToPages …
Every single page has the same size as the WHOLE PDF document:
PdfFileEditor editor = new PdfFileEditor();
int i = 0;
for (ByteArrayOutputStream splitToPages : editor.splitToPages("/home/mburger/tmp/mfm/207_2008_1_T4_2.pdf")) {
    i++;
    System.out.println(i);
    OutputStream outputStream = new FileOutputStream("/home/mburger/tmp/mfm/x" + i + ".pdf");
    outputStream.write(splitToPages.toByteArray());
    outputStream.flush();
    outputStream.close();
}
Hi Michael,
Same for editor.splitFromFirst!
I opened a new thread because I wasn’t able to set this thread to private!!
See here:
http://www.aspose.com/community/forums/514596/extract-file-from-pdf-private/showthread.aspx#514596
Incredible!
On Version 4.4 it doesn’t work at all!!!
Extract creates only an empty file!
That’s not the first time that nothing works after moving AsposePdfKit from 4.2 to 4.4 …
I love AsposeWords!!!
But the Aspose Pdf Kit is very, very unstable, and basic functions like working with attachments don’t work …
I think I know the problem,
inserted images are not saved on the PAGE but somewhere else in the PDF file (header or something like that, as attachments).
So extracting a page … will also extract all included images.
Is there a way to tell the PDF file to delete anything it doesn’t use? Or to delete images in the file?
thx
Michael
OK, I found many ways to extract the first page with the image on the first page!
But no solution works, there are many errors … !!!
For example, I can still use PdfFileEditor to extract the first page and delete the images with PdfContentEditor!
BUT!
pdfContentEditor.deleteImages(1, new int[] {2,3,4,5,6});
doesn’t delete the 2nd, 3rd, 4th, 5th and 6th image! Because your 2nd image is my 1st image!!! But in AsposeWords I’m inserting the images sequentially … and when I open the PDF file converted by AsposeWords I see my first image on the first page!
Then, when deleting with deleteImages, my first image is the 2nd in your function!!!
one other bug …
I can do this:
pdfContentEditor.deleteImages(1, new int[] {2, 3});
but this raises an exception:
pdfContentEditor.deleteImages(1, new int[] {2});
pdfContentEditor.deleteImages(1, new int[] {3});
Then there are other exceptions when using your libs … I don’t have the time to explain them all …
I think the way I can resolve my problem is using
PdfExtractor …
If I do this
// Working solution
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(fileName);
extractor.extractImage();
extractor.getNextImage("/home/mburger/tmp/mfm/image1.pdf");
extractor.close();
// Working solution
It creates a PDF file with the extracted image, and it seems to always be the right image (the first one, not a random one).
But the problem I have … the created PDF file is bigger than the extracted image … but I need an A4 PDF file … so is there a way to create a new A4 PDF file with the extracted image?
I can’t find classes like com.aspose.pdf.Document anymore!
thx
Michael
OMG
// Working solution
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(fileName);
extractor.extractImage();
extractor.getNextImage("/home/mburger/tmp/mfm/image1.pdf");
extractor.close();
// Working solution
On version 4.2, with my demo file, I get the first page.
On version 4.4, with my demo file, I get THE LAST PAGE!!!
It is random!
Hi Michael,
Thanks for contacting support.
To split the PDF file into single-page documents, please follow the instructions in “Split PDF File to Individual Pages”.
michael.burger@siag.it:
Same for editor.splitFromFirst!
michael.burger@siag.it: Incredible!
On Version 4.4 it doesn’t work at all!!! Extract creates only an empty file!
That’s not the first time that nothing works after moving AsposePdfKit from 4.2 to 4.4 …
I love AsposeWords!!! But the Aspose Pdf Kit is very, very unstable, and basic functions like working with attachments don’t work …
Hi Michael,
Aspose.Pdf.Kit for Java has been discontinued as a separate product, and all its classes and enumerations are now present under the com.aspose.pdf.facades package of the autoported Aspose.Pdf for Java. We recommend trying the latest release, Aspose.Pdf for Java 4.4.0; if you still face the same issue, please share some details with a code snippet. We apologize for the inconvenience.
Now, concerning your point about attachments: I used the following code snippet to add an attachment to a PDF file and, as per my observations, the resultant file is generated properly.
[Java]
// open the source document
com.aspose.pdf.Document pdfDocument1 = new com.aspose.pdf.Document("c:/pdftest/source.PDF");
// set up the file to be added as an attachment
com.aspose.pdf.FileSpecification fileSpecification =
    new com.aspose.pdf.FileSpecification("c:/pdftest/Formatted_Test1.pdf", "Sample PDF file");
// add the attachment to the document's attachment collection
pdfDocument1.getEmbeddedFiles().add(fileSpecification);
// save the updated document containing the attachment
pdfDocument1.save("c:/pdftest/Attachment_output.pdf");
michael.burger@siag.it: I think I know the problem: inserted images are not saved on the PAGE but somewhere else in the PDF file (header or something like that, as attachments). So extracting a page will also extract all included images.
Is there a way to tell the PDF file to delete anything it doesn’t use? Or to delete images in the file?
Hi Michael,
Aspose.Pdf for Java can optimize the size of a PDF file but, unfortunately, it does not currently support removing unused objects from the PDF document. I have logged this feature request as PDFNEWJAVA-33908 in our issue tracking system. We will look further into the details and keep you updated on the status. Please be patient and spare us a little time. We are sorry for this inconvenience.
[Java]
// open first document
com.aspose.pdf.Document pdfDocument1 = new com.aspose.pdf.Document("c:/pdftest/demo.pdf");
// optimize the PDF file
pdfDocument1.optimizeResources();
// save updated document
pdfDocument1.save("c:/pdftest/Optimized.pdf");
michael.burger@siag.it:
pdfContentEditor.deleteImages(1, new int[] {2,3,4,5,6});
doesn’t delete the 2nd, 3rd, 4th, 5th and 6th image! Because your 2nd image is my 1st image!!! But in AsposeWords I’m inserting the images sequentially … and when I open the PDF file converted by AsposeWords I see my first image on the first page!
Then, when deleting with deleteImages, my first image is the 2nd in your function!!!
One other bug …
I can do this:
pdfContentEditor.deleteImages(1, new int[] {2, 3});
but this raises an exception:
pdfContentEditor.deleteImages(1, new int[] {2});
pdfContentEditor.deleteImages(1, new int[] {3});
Hi,
Thanks for sharing the details.
I tested the scenario with Aspose.Pdf for Java 4.4.0, using the following code snippet with demo.pdf, and I could not reproduce any problem. The test ran in Eclipse Juno on Windows 7 (x64) with JDK 1.7.
[Java]
com.aspose.pdf.facades.PdfContentEditor editor = new com.aspose.pdf.facades.PdfContentEditor();
editor.bindPdf("c:/pdftest/demo.pdf");
// editor.deleteImage(1, new int[]{1,2,});
editor.deleteImage(1, new int[] { 2 });
editor.deleteImage(1, new int[] { 3 });
editor.save("c:/pdftest/ImagesRemoved.pdf");
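One possible explanation for the difference between a single combined call and two separate calls, offered here only as an assumption and not confirmed anywhere in this thread: if image indices on a page are renumbered after each deletion, the second call no longer addresses the image it was meant for, or addresses a position that no longer exists. The sketch below models this with a plain Java list and no Aspose classes. The names ImageIndexShift, deleteTogether and deleteOne are hypothetical and do not describe how PdfContentEditor works internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: a page's images as a 1-based list. Not Aspose code.
final class ImageIndexShift {

    // Removes several 1-based positions in one pass; all positions refer to the original list.
    static List<String> deleteTogether(List<String> images, int... positions) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < images.size(); i++) {
            final int position = i + 1;
            boolean selected = Arrays.stream(positions).anyMatch(p -> p == position);
            if (!selected) {
                result.add(images.get(i));
            }
        }
        return result;
    }

    // Removes a single 1-based position; every later entry moves down by one.
    static List<String> deleteOne(List<String> images, int position) {
        List<String> result = new ArrayList<>(images);
        result.remove(position - 1); // IndexOutOfBoundsException if the position is past the end
        return result;
    }

    public static void main(String[] args) {
        List<String> images = Arrays.asList("img1", "img2", "img3");
        System.out.println("combined call: " + deleteTogether(images, 2, 3));
        try {
            System.out.println("separate calls: " + deleteOne(deleteOne(images, 2), 3));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("separate calls: position 3 no longer exists after the first deletion");
        }
    }
}
```

If renumbering were the cause, deleting from the highest index downwards, or passing all indices in one call, would avoid the shift.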
michael.burger@siag.it:
I think the way I can resolve my problem is using PdfExtractor …
If I do this
// Working solution
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(fileName);
extractor.extractImage();
extractor.getNextImage("/home/mburger/tmp/mfm/image1.pdf");
extractor.close();
// Working solution
It creates a PDF file with the extracted image, and it seems to always be the right image (the first one, not a random one).
But the problem I have … the created PDF file is bigger than the extracted image … but I need an A4 PDF file … so is there a way to create a new A4 PDF file with the extracted image?
In order to set the page size, please try using the following code snippet.
[Java]
// Instantiate PageEditor object
com.aspose.pdf.facades.PdfPageEditor page_editor = new com.aspose.pdf.facades.PdfPageEditor();
// Bind the source PDF file
page_editor.bindPdf("c:/pdftest/ImagesRemoved.pdf");
// Set the page size as A4
page_editor.setPageSize(com.aspose.pdf.facades.PageSize.getA4()); // alternatively: new com.aspose.pdf.facades.PageSize(…)
// Save updated document
page_editor.save("c:/pdftest/A4PageSize.pdf");
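Whether setPageSize also rescales the existing content is not stated in this thread. If the extracted image page has to be scaled by hand, the arithmetic could look like the sketch below. It uses only the ISO 216 A4 dimensions (210 mm x 297 mm, about 595 x 842 points at 72 points per inch); the class A4Fit and its method fitScale are hypothetical helpers, not part of any Aspose API.

```java
// Plain arithmetic sketch, no Aspose classes involved.
final class A4Fit {

    // A4 is 210 mm x 297 mm; 1 inch = 25.4 mm = 72 PDF points.
    static final double A4_WIDTH_PT = 210 / 25.4 * 72;
    static final double A4_HEIGHT_PT = 297 / 25.4 * 72;

    // Largest uniform scale factor at which a width x height box still fits inside A4 portrait.
    static double fitScale(double widthPt, double heightPt) {
        if (widthPt <= 0 || heightPt <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        return Math.min(A4_WIDTH_PT / widthPt, A4_HEIGHT_PT / heightPt);
    }

    public static void main(String[] args) {
        // Hypothetical page that is twice as wide as A4 and exactly as tall.
        double scale = fitScale(2 * A4_WIDTH_PT, A4_HEIGHT_PT);
        System.out.printf("A4 is %.1f x %.1f pt, scale factor %.2f%n", A4_WIDTH_PT, A4_HEIGHT_PT, scale);
    }
}
```

Taking the smaller of the two ratios keeps the aspect ratio intact, so the image is never cropped or stretched.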
michael.burger@siag.it:
I can’t find classes like com.aspose.pdf.Document anymore
The Document class was introduced in API release 4.0.0. Please try the latest release, Aspose.Pdf for Java 4.4.0, and if you still face any problem, please feel free to contact us.
michael.burger@siag.it:
// Working solution
PdfExtractor extractor = new PdfExtractor();
extractor.bindPdf(fileName);
extractor.extractImage();
extractor.getNextImage("/home/mburger/tmp/mfm/image1.pdf");
extractor.close();
// Working solution
On version 4.2, with my demo file, I get the first page.
On version 4.4, with my demo file, I get THE LAST PAGE!!! It is random!
Hi Michael,
The com.aspose.pdf.facades.PdfExtractor class extracts text, images and attachments from a PDF document. If you need to get a particular page from a PDF file, please try the following code snippet.
[Java]
// open the source document
com.aspose.pdf.Document pdfDocument1 = new com.aspose.pdf.Document("c:/pdftest/demo.pdf");
// get the page at a particular index of the page collection
com.aspose.pdf.Page pdfPage = pdfDocument1.getPages().get_Item(6);
// create a new Document object
com.aspose.pdf.Document newDocument = new com.aspose.pdf.Document();
// add the page to the pages collection of the new document
newDocument.getPages().add(pdfPage);
// save the newly generated PDF file
newDocument.save("c:/pdftest/page_" + pdfPage.getNumber() + ".pdf");
I found a workaround (a combination of specific versions of Words + PDF); please see:
Re: Extract file from PDF Private - #13 by michael.burgersiag.i - Free Support Forum - aspose.com
Hi Michael,
Thanks for your patience.
I am pleased to share that removing unused objects from a PDF file is now supported; the feature will be included in the next release, Aspose.Pdf for Java 4.6.0 (planned for March 2014). To accomplish this requirement, please try the following code snippet.
[Java]
com.aspose.pdf.Document doc = new com.aspose.pdf.Document("source.pdf");
com.aspose.pdf.Document.OptimizationOptions opt = new com.aspose.pdf.Document.OptimizationOptions();
opt.setRemoveUnusedObjects(true);
doc.optimizeResources(opt);
doc.save("optimized.pdf");
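To check whether the optimization actually made the output smaller, the two file sizes can be compared with the standard library alone. This is a minimal sketch; the class name SizeCheck and the method reductionPercent are hypothetical, and the file paths are passed as command-line arguments rather than taken from this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Standard-library helper for comparing an original file with its optimized copy.
final class SizeCheck {

    // Size reduction as a percentage of the original size (negative if the file grew).
    static double reductionPercent(long originalBytes, long optimizedBytes) {
        if (originalBytes <= 0) {
            throw new IllegalArgumentException("original size must be positive");
        }
        return 100.0 * (originalBytes - optimizedBytes) / originalBytes;
    }

    // Same calculation, reading the sizes from disk.
    static double reductionPercent(Path original, Path optimized) throws IOException {
        return reductionPercent(Files.size(original), Files.size(optimized));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SizeCheck <original.pdf> <optimized.pdf>");
            return;
        }
        double percent = reductionPercent(Paths.get(args[0]), Paths.get(args[1]));
        System.out.printf("size reduced by %.1f%%%n", percent);
    }
}
```

A result close to zero for a single extracted page would suggest that unused resources are still being carried along.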
Hi Michael,
Thanks for your patience. We have investigated the issue reported earlier further and, as per our observations, the pages of the document use shared resources. That’s why all resources are included in the resultant files. To decrease the size, please use the optimizeResources() method.
[Java]
String myDir = "D:\\";
Document pdfDocument1 = new Document(myDir + "36197.pdf");
// loop through the first four pages (use pdfDocument1.getPages().size() to cover all pages)
for (int pdfPage = 1; pdfPage <= 4; pdfPage++) {
    // create a new Document object
    Document newDocument = new Document();
    // copy the page at the current index of the page collection into the new document
    newDocument.getPages().add(pdfDocument1.getPages().get_Item(pdfPage));
    // optimize the newly created document
    Document.OptimizationOptions opt = new Document.OptimizationOptions();
    opt.setRemoveUnusedObjects(true);
    opt.setRemoveUnusedStreams(true);
    newDocument.optimizeResources(opt); // comment this line out to compare the file sizes
    // save the newly generated PDF file
    newDocument.save(myDir + pdfPage + "_test1.pdf");
}
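The effect of shared resources on file size can be illustrated with a toy model that involves no Aspose classes. It assumes, as described above, that every extracted page carries the whole shared resource set unless unused entries are removed. The resource names and sizes below are made up for illustration, and SharedResourceModel and totalSize are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of a PDF whose pages all point at one shared resource dictionary.
final class SharedResourceModel {

    // Total byte size of the resources whose names appear in `kept`.
    static long totalSize(Map<String, Long> resourceSizes, Set<String> kept) {
        long total = 0;
        for (Map.Entry<String, Long> entry : resourceSizes.entrySet()) {
            if (kept.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up image resources shared by all pages of the source document.
        Map<String, Long> shared = new LinkedHashMap<>();
        shared.put("Im1", 500_000L);
        shared.put("Im2", 2_000_000L);
        shared.put("Im3", 2_500_000L);

        // Assume page 1 only draws Im1.
        Set<String> usedByPage1 = new HashSet<>(Arrays.asList("Im1"));

        long withoutPruning = totalSize(shared, shared.keySet());
        long withPruning = totalSize(shared, usedByPage1);
        System.out.println("page 1 with all shared resources: " + withoutPruning + " bytes");
        System.out.println("page 1 with unused resources removed: " + withPruning + " bytes");
    }
}
```

In this model the unpruned single page is as large as the whole resource set, which matches the behaviour reported at the start of the thread.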
The issues you have found earlier (filed as PDFNEWJAVA-33908) have been fixed in Aspose.Pdf for Java 4.6.0.