Description

wodHtmlParser Class


Object Model




Remarks

wodHtmlParser is the main parsing engine. It loads a document from disk (or uses in-memory data assigned to the Body property) and creates a collection of wodHtmlEntity objects. Each entity holds its Type, its Text, and possibly a few attributes.

Parsing is recursive: each entity may contain its own child entities. All entities are listed in the wodHtmlParser's Parts property, and each is also accessible through its parent entity.
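The relationship between the flat list and the parent/child tree can be illustrated with a minimal sketch. It uses Python's standard html.parser module rather than the component itself, so it is not how wodHtmlParser is implemented. The names Entity, TreeBuilder, parts, children and VOID_TAGS are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Void elements never get a closing tag, so they cannot hold children.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


@dataclass
class Entity:
    """Hypothetical stand-in for a wodHtmlEntity; not the component's real class."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Builds a tree of entities plus a flat list of every entity in document order."""

    def __init__(self):
        super().__init__()
        self.root = Entity("ROOT")
        self.parts = []            # flat list, similar in spirit to the Parts property
        self._stack = [self.root]  # currently open entities; the last one is the parent

    def _add(self, entity):
        # Every entity is registered twice: under its parent and in the flat list.
        self._stack[-1].children.append(entity)
        self.parts.append(entity)

    def handle_starttag(self, tag, attrs):
        entity = Entity(tag.upper())
        self._add(entity)
        if tag not in VOID_TAGS:
            self._stack.append(entity)

    def handle_endtag(self, tag):
        # Close the nearest open entity of this type, if there is one.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].type == tag.upper():
                del self._stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self._add(Entity("TEXT", data.strip()))


builder = TreeBuilder()
builder.feed('<p>Hello <b>world</b><img src="image.jpg"></p>')
builder.close()

flat_types = [e.type for e in builder.parts]   # every entity, regardless of depth
p = builder.root.children[0]
child_types = [c.type for c in p.children]     # only the direct children of <p>
print(flat_types)
print(child_types)
```

In this sketch the text "world" appears in the flat list and is also reachable by walking from the P entity to its B child, which mirrors the two access paths described above.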

For example, when wodHtmlParser encounters a tag like this:

<IMG src="image.jpg" border=0>

it creates a new wodHtmlEntity object, sets its type to IMG, creates two attributes (src and border), sets the start and end positions, and tries to extract readable text from it.
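As a rough illustration of the information gathered for such a tag, the following sketch records the type, attributes, and start and end offsets using Python's standard html.parser module. It does not call wodHtmlParser, and the class name TagInspector and the dictionary keys are assumptions made for this example. Text extraction is not modeled, and offsets are only correct for single-line input.

```python
from html.parser import HTMLParser


class TagInspector(HTMLParser):
    """Collects one record per start tag: type, attributes, start and end offsets."""

    def __init__(self):
        super().__init__()
        self.entities = []

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()  # the tag exactly as written in the source
        start = self.getpos()[1]        # column offset; absolute only for one-line input
        self.entities.append({
            "type": tag.upper(),
            "attributes": dict(attrs),  # unquoted values such as border=0 arrive as strings
            "start": start,
            "end": start + len(raw),    # exclusive end offset
        })


source = 'See <IMG src="image.jpg" border=0> here'
inspector = TagInspector()
inspector.feed(source)
inspector.close()

img = inspector.entities[0]
print(img["type"], img["attributes"])
print(source[img["start"]:img["end"]])
```

Slicing the source with the recorded offsets returns the original tag text, which is a simple way to check that the positions are consistent.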

 




Members
Methods
About     Displays the About box.
Load      Loads and parses a file.
Unload    Unloads the file from memory.

Properties
Body      Read-write property   Holds the body of the file being parsed.
Filename  Read-write property   Holds the filename of the file being parsed.
Parts     Read-only property    Holds a reference to the collection of all parts.
Size      Read-only property    Holds the size of the data in memory.
Version   Read-write property   Holds the component version number.