PDA

View Full Version : [SLEEPER:] VBA Code - Convert PDF to Excel using Word



MHamid
10-02-2019, 09:55 PM
Hello,

I am using the code below in Excel to open a PDF File in Word and extract the data as text only into Excel. The code works perfectly fine as it is, however, I want to loop through all pdf files within the folder instead of hardcoding the filename in pathAndFileName variable. Can someone assist with this piece of the code?

My goal is to be able to:



Open each PDF file in Word
Extract its data as text into the Excel “Test”
Run two excel macros to parse the data that Ineed into another sheet within the same workbook set in the code
Close Word without saving the file
Clear both Word and PDF
Clear the data in Excel “Test” worksheet
Move PDF File to another folder
Repeat steps 1-6 for the next PDF File



Can anyone assist in tweaking this code to do what I need?


Sub pdf_To_Excel_Word()
'Macro opens PDF Files as an editable Word Document
'Copies the contents of the Word document
'Pastes the Clipboard contents into Excel
'Declare Variables
Dim myWorksheet As Worksheet
Dim wordApp AsWord.Application
Dim myWshShell As wshShell
Dim pathAndFileName As String
Dim registryKey As String
Dim wordVersion As String
'Set Variables
Set myWorksheet =ActiveWorkbook.Worksheets("Test")
Set wordApp = NewWord.Application
Set myWshShell = New wshShell
pathAndFileName ="C:\Users\mh15601\Desktop\Projects\Audit Plan\2019\Testing\PDF Files\DONOT DELETE\A171065-ICG-TTS-Cash Management _
and Trade-Citibank NA-IndonesiaAudit Report.pdf"
wordVersion = wordApp.Version
registryKey ="HKCU\SOFTWARE\Microsoft\Office\" & wordVersion &"\Word\Options\"
'Open and Copy PDF Files
myWshShell.RegWriteregistryKey & "DisableConvertPdfWarning", 1,"REG_DWORD"
wordApp.documents.Open Filename:=pathAndFileName, confirmconversions:=False
myWshShell.RegWriteregistryKey & "DisableConvertPdfWarning", 0,"REG_DWORD"
'Copy Data from Word
wordApp.ActiveDocument.Content.Copy
'Excel
With myWorksheet
.Range("A1").Select
.PasteSpecialFormat:="Text"
End With
'Close Word
wordApp.QuitSaveChanges:=wDoNotSaveChanges
'Clear Word and PDF
Set wordApp = Nothing
Set myWshShell = Nothing
End Sub

Thank you

gmayor
10-03-2019, 03:02 AM
The following should paste each pdf document in strPath to a new worksheet.
I'll let you decode what you want to do with the PDFs


Sub pdf_To_Excel_Word()
'Macro opens PDF Files as a editable Word Documenta
'Copies the contents of the Word documents
'Pastes the Clipboard contents into Excel
'Declare Variables
Dim myWorksheet As Worksheet
Dim wordApp As Object
Dim myWshShell As Object
Dim strPath As String
Dim oDoc As Object
Dim strFile As String
Dim registryKey As String
Dim wordVersion As String
'Set Variables
On Error Resume Next
Set wordApp = GetObject(, "Word.Application")
If Err Then
Set wordApp = CreateObject("Word.Application")
End If
On Error GoTo 0
Set myWshShell = CreateObject("WScript.Shell")
strPath = "C:\Users\mh15601\Desktop\Projects\Audit Plan\2019\Testing\PDF Files\DONOT DELETE\"
wordVersion = wordApp.Version
registryKey = "HKCU\SOFTWARE\Microsoft\Office\" & wordVersion & "\Word\Options\"
'Open and Copy PDF Files
myWshShell.RegWrite registryKey & "DisableConvertPdfWarning", 1, "REG_DWORD"
strFile = Dir$(strPath & "*.pdf")
While strFile <> ""
Set oDoc = wordApp.Documents.Open(FileName:=strPath & strFile, confirmconversions:=False)
oDoc.Content.Copy
'Excel
Set myWorksheet = ActiveWorkbook.Worksheets.Add
With myWorksheet
.Range("A1").Select
.PasteSpecial Format:="Text"
End With
oDoc.Close SaveChanges:=0
strFile = Dir$()
Wend
'Close Word
wordApp.Quit SaveChanges:=0
myWshShell.RegWrite registryKey & "DisableConvertPdfWarning", 0, "REG_DWORD"
'Clear Word and PDF
Set wordApp = Nothing
Set myWshShell = Nothing
End Sub

MHamid
10-03-2019, 03:42 PM
Hello,

I tested your code and its not doing anything at all.
There is no error or anything happening when I run it.