How can I read a PDF's internal creation / modification date in Windows PowerShell?

Question

MostlyHarmless 2014-06-06 at 15:09

PDF files seem to have a separate set of file properties which contain (among other things) a creation date and a modification date (see the screenshot here: http://ventajamarketing.com/writingblog/wp-content/uploads/2012/02/Acrobat-Document-Properties1-300x297.png).

These dates can obviously differ from the creation and modification dates shown in the file system (Windows Explorer).

How can I access the date information inside the PDF file and read it on Windows 7 using Windows PowerShell (or possibly another method)?

2 answers

Fazer87 2014-06-06 at 15:22

First of all, you need a .NET library that can read the document properties, as these are not native shell properties: http://sourceforge.net/projects/itextsharp/

Next, load the library and open the PDF with a reader object, for example:

# load iTextSharp.dll
[System.Reflection.Assembly]::LoadFrom("C:\users\testuser\desktop\itextsharp.dll")
$raf = New-Object iTextSharp.text.pdf.RandomAccessFileOrArray("C:\users\testuser\desktop\bitcoin.pdf")

# load pdf properties
$reader = New-Object iTextSharp.text.pdf.PdfReader($raf, $Nothing)
$reader
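If you go the library route, the dates in the document information dictionary are typically raw PDF date strings such as `D:20140305150302+01'00'` rather than .NET dates. Here is a minimal sketch of a converter in plain PowerShell. The function name `ConvertFrom-PdfDate` is made up for illustration, and treating a missing time zone as UTC is an assumption of this sketch.

```powershell
# Hypothetical helper (not part of iTextSharp): converts a PDF date string
# such as "D:20140305150302+01'00'" into a [DateTimeOffset].
function ConvertFrom-PdfDate {
    param([Parameter(Mandatory = $true)][string]$Value)

    # D:YYYYMMDDHHmmSS followed by Z or +/-HH'mm'; everything after the year is optional.
    # ('' inside a single-quoted string is one literal apostrophe.)
    $pattern = '^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?(?:([+\-Z])(?:(\d{2})''?(\d{2})?''?)?)?$'
    if (-not ($Value.Trim() -match $pattern)) {
        throw "Not a recognizable PDF date string: $Value"
    }

    # Fields that are absent fall back to month/day = 1 and time = 00:00:00.
    $year   = [int]$Matches[1]
    $month  = if ($Matches[2]) { [int]$Matches[2] } else { 1 }
    $day    = if ($Matches[3]) { [int]$Matches[3] } else { 1 }
    $hour   = if ($Matches[4]) { [int]$Matches[4] } else { 0 }
    $minute = if ($Matches[5]) { [int]$Matches[5] } else { 0 }
    $second = if ($Matches[6]) { [int]$Matches[6] } else { 0 }

    # Assumption: no offset (or "Z") is treated as UTC.
    $offset = [TimeSpan]::Zero
    if ($Matches[7] -and $Matches[7] -ne 'Z') {
        $offsetHours   = if ($Matches[8]) { [int]$Matches[8] } else { 0 }
        $offsetMinutes = if ($Matches[9]) { [int]$Matches[9] } else { 0 }
        $offset = New-Object TimeSpan -ArgumentList $offsetHours, $offsetMinutes, 0
        if ($Matches[7] -eq '-') { $offset = $offset.Negate() }
    }

    New-Object DateTimeOffset -ArgumentList $year, $month, $day, $hour, $minute, $second, $offset
}
```

Assuming `$reader` from the snippet above exposes the information dictionary as `$reader.Info` (iTextSharp 5.x does, as far as I know), `ConvertFrom-PdfDate $reader.Info['CreationDate']` and `ConvertFrom-PdfDate $reader.Info['ModDate']` would give the two dates.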

You can also do this without an additional library, since XMP metadata may be embedded and accessible as text. Julian Knight 9 years ago

Accepted Answer · 2014-06-06 15:21:15

You can read a PDF file as though it were text (at least in newer versions of the format). You will find an embedded XML section that uses the Adobe XMP schema; it contains the metadata you need.

Here is an example:

%PDF-1.5
%âãÏÓ
2 0 obj
<< /AcroForm 4 0 R /Lang (en-GB) /MarkInfo << /Marked true >> /Metadata 5 0 R /Pages 6 0 R /StructTreeRoot 7 0 R /Type /Catalog >>
endobj
5 0 obj
<< /Length 2971 /Subtype /XML /Type /Metadata >>
stream
<?xpacket begin="ï»¿" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.1.2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/">
      <xmp:CreateDate>2014-03-05T15:03:02+01:00</xmp:CreateDate>
      <xmp:ModifyDate>2014-05-30T11:58:02+01:00</xmp:ModifyDate>
      <xmp:MetadataDate>2014-03-05T14:03:46Z</xmp:MetadataDate>
    </rdf:Description>
    <rdf:Description rdf:about="" xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
      <xmpMM:DocumentID>uuid:8b5fe011-ed77-4298-aa84-d1eda797b9ff</xmpMM:DocumentID>
      <xmpMM:InstanceID>uuid:88074e0b-42f7-4268-bc89-0162e417c9ad</xmpMM:InstanceID>
    </rdf:Description>
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:format>application/pdf</dc:format>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
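Since the packet is ordinary XML, one option is to cut it out of the file text and let the XML parser find the dates. The sketch below does that; the function name `Get-XmpDate` is made up for illustration, and it assumes the packet is stored uncompressed (as in the example above) and that the dates are full timestamps that `DateTimeOffset` can parse. Looking the elements up by namespace URI rather than by prefix means it should work whether the file writes `xmp:` or the older `xap:` prefix.

```powershell
# Hypothetical helper: extracts the XMP packet from PDF text and returns its dates.
function Get-XmpDate {
    param([Parameter(Mandatory = $true)][string]$Text)

    # Take the first <x:xmpmeta> ... </x:xmpmeta> element; (?s) lets "." span line breaks.
    $m = [regex]::Match($Text, '(?s)<x:xmpmeta\b.*?</x:xmpmeta>')
    if (-not $m.Success) { throw 'No uncompressed XMP packet found.' }

    $xml = [xml]$m.Value
    $ns = New-Object System.Xml.XmlNamespaceManager -ArgumentList $xml.NameTable
    $ns.AddNamespace('xmp', 'http://ns.adobe.com/xap/1.0/')

    $culture = [Globalization.CultureInfo]::InvariantCulture
    $result = @{}
    foreach ($name in 'CreateDate', 'ModifyDate') {
        # XMP allows a property to be written as an element or as an attribute.
        $node = $xml.SelectSingleNode("//xmp:$name | //@xmp:$name", $ns)
        $value = $null
        if ($null -ne $node) { $value = [DateTimeOffset]::Parse($node.InnerText, $culture) }
        $result[$name] = $value
    }

    # One object with a CreateDate and a ModifyDate property ($null when absent).
    New-Object PSObject -Property $result
}
```

To feed it a real file, something like `Get-XmpDate ([IO.File]::ReadAllText((Resolve-Path .\filename.pdf).Path, [Text.Encoding]::GetEncoding(28591)))` should do; reading as Latin-1 keeps the binary parts of the PDF from tripping up the decoder.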

The following example will retrieve the create date:

$a = Select-String "CreateDate\>(.*)\<" .\filename.pdf

Which returns something like:

filename.pdf:20: <xap:CreateDate>2009-11-03T10:54:29Z</xap:CreateDate>
filename.pdf:12921: <xap:CreateDate>2009-11-03T10:54:29Z</xap:CreateDate>

Getting to the exact data:

$a.Matches.Groups[1].Value

Which returns:

2009-11-03T10:54:29Z
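The same `Select-String` approach can be stretched to pick up the modification date as well. The following is only a sketch: the function name `Get-PdfXmpDateMatch` is made up, and the pattern assumes the dates are written as elements with either the `xmp:` or the older `xap:` prefix, as in the two samples above.

```powershell
# Hypothetical helper: lists every XMP CreateDate / ModifyDate element found in a file.
function Get-PdfXmpDateMatch {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Group 1 is the property name, group 2 the date text between the tags.
    $pattern = '<(?:xmp|xap):(CreateDate|ModifyDate)>([^<]+)<'

    Select-String -Path $Path -Pattern $pattern -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object {
            New-Object PSObject -Property @{
                Property = $_.Groups[1].Value
                Value    = $_.Groups[2].Value
            }
        }
}
```

`Get-PdfXmpDateMatch .\filename.pdf` then emits one object per hit, so a file that embeds the packet twice (like the one above) yields each date twice; piping through `Select-Object Property, Value -Unique` is one way to collapse the repeats.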
