I have a word document with html tags that I need converted to formatted text. For example I would like <strong>Hello</strong>
to display as Hello instead.
I've never used VBA before, but I've been trying to piece something together that would allow me to copy the html text from a specific table cell in Word, use IE to display the formatted version of that text, copy the formatted text from IE, and then paste it back into that same Word table cell. I think I've been able to figure out some of the code, but I don't think I'm referring to the table cells correctly. Can anyone help? This is what I have so far:
Dim Ie As Object
Set Ie = CreateObject("InternetExplorer.Application")
With Ie
.Visible = False
.Navigate "about:blank"
.Document.body.InnerHTML = ActiveDocument.Tables(1).Cell(2, 2)
.Document.execCommand "SelectAll"
.Document.execCommand "Copy"
ActiveDocument.Paste Destination = ActiveDocument.Tables(1).Cell(2, 2)
.Quit
End With
End Sub
You need two different appraoches for the two uses of .cell(2,2).
To get text from the cell you need to amend the first line to read
.Document.body.InnerHTML = ActiveDocument.Tables(1).Cell(2, 2).range.text
In the second case your terminology is incorrect. It should read
ActiveDocument.Tables(1).Cell(2, 2).range.paste
You can get help on individual keywords/properties quite easily. In the VBA IDE just place your cursor on a keyword/property and press F1. You will be taken to the MS help page for the keyword/property. Sometimes you will have an additional selection step when there is more than one alternative.
You should also be aware that the property .cell(row,column) is prone to failure as it relies on their being no merged cells in the Table. A more robust approach is to use the .cells(index) property.
It may be that you can take an alternative appraoch and use a wildcard search to find the tag and then replace the portion you need whilst applying a suitable linked style (You won't be able to use paragraph styles because you will be trying to format only part of a paragraph and character styles don't seem to work with find/replace).
An example of such code that removes HTML tags and formats the remaining text is below
Option Explicit
Sub replaceHTML_WithFormattedText()
' a comma seperated list of HTML tags
Const myTagsList As String = "strong,small,b,i,em"
' a list of linked styles chosen or designed for each tag
' Paragraph styles cannot be used as we are replacing only part of a paragraph
' Character styles just don't seem to work
' The linked styles below were just chosen from the default Word styles as an example
Const myStylesList As String = "Heading 1,Heading 9,Comment Subject,Intense Quote,Message Header"
' <, > and / are special characters therefore need escaping with '\' to get the actual character
Const myFindTag As String = "(\<Tag\>)(*)(\<\/Tag\>)"
Const myReplaceStr As String = "\2"
Dim myTagsHTML() As String
Dim myTagsStyles() As String
Dim myIndex As Long
myTagsHTML = Split(myTagsList, ",")
myTagsStyles = Split(myStylesList, ",")
If UBound(myTagsHTML) <> UBound(myTagsStyles) Then
MsgBox "Different number of tags and Styles", vbOKOnly
Exit Sub
End If
For myIndex = 0 To UBound(myTagsHTML)
With ActiveDocument.StoryRanges(wdMainTextStory).Find
.ClearFormatting
.Format = True
.Text = Replace(myFindTag, "Tag", Trim(myTagsHTML(myIndex)))
.MatchWildcards = True
.Replacement.Text = myReplaceStr
.Replacement.Style = myTagsStyles(myIndex)
.Execute Replace:=wdReplaceAll
End With
Next
End Sub