Description

wodHtmlEntity Class


Object Model





Remarks

Entity is the main object used in wodHtmlParser. Once a document is parsed, the parser creates many IwodHtmlEntity objects, one for each tag found in the document. A typical example:

<p>
<img src="/HttpDLX/HttpDLX.gif" border=0>
<b>HttpDLX released - HTTP client with SSL, Proxy, NTLM, uploads, ....</b>
<BR>
Why don't you take a closer look <a href="/index.asp?showform=HttpDLX">here</a>

This fragment would create five entities: P, IMG, B, BR and A. Each of them exposes the properties listed under Members, which let you get information about the entity, such as its attributes (href="http://www.weonlydo....." is one such attribute) and the text extracted from the entity (HttpDLX released..... is one such text).
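
The count of five entities can be checked independently of wodHtmlParser. The following is a minimal sketch that uses Python's standard html.parser module, not the wodHtmlParser API, to list every start tag in the sample together with its attributes. The class name TagLister and the variable names are hypothetical and chosen only for this illustration.

```python
from html.parser import HTMLParser

# The sample fragment from the text, reproduced verbatim.
SAMPLE = """<p>
<img src="/HttpDLX/HttpDLX.gif" border=0>
<b>HttpDLX released - HTTP client with SSL, Proxy, NTLM, uploads, ....</b>
<BR>
Why don't you take a closer look <a href="/index.asp?showform=HttpDLX">here</a>
"""


class TagLister(HTMLParser):
    """Hypothetical helper: collects (tag name, attributes) per start tag."""

    def __init__(self):
        super().__init__()
        self.entities = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names; upper-case them to match the text.
        self.entities.append((tag.upper(), dict(attrs)))


lister = TagLister()
lister.feed(SAMPLE)
lister.close()

for name, attributes in lister.entities:
    print(name, attributes)
```

Each printed line corresponds to one entity in the sense described above: a tag name plus a collection of its attributes.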

During parsing, nested entities are also reported as separate entities. For example, a TABLE entity would contain several TR and TD entities, which can be regarded as 'child' entities and are accessible through the Parts property of the parent entity.
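
The parent and child relationship can be pictured as a tree. The sketch below is an illustration only: it builds such a tree with Python's standard html.parser module and walks it recursively. The Node class, its parts list (standing in for the Parts property), TreeBuilder and walk are all hypothetical names, and nothing here is the wodHtmlParser API.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    """Hypothetical stand-in for an entity; `parts` plays the role of Parts."""
    type: str
    parts: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects from properly nested tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("ROOT")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag.upper())
        self._stack[-1].parts.append(node)  # attach to the current parent
        self._stack.append(node)            # later tags nest under this node

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray end tags are ignored.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].type == tag.upper():
                del self._stack[i:]
                break


def walk(node, depth=0):
    """Yield (depth, type) for a node and all of its descendants."""
    yield depth, node.type
    for child in node.parts:
        yield from walk(child, depth + 1)


builder = TreeBuilder()
builder.feed("<table><tr><td>a</td><td>b</td></tr><tr><td>c</td></tr></table>")
builder.close()

table = builder.root.parts[0]
for depth, name in walk(table):
    print("  " * depth + name)
```

The sketch assumes every tag is explicitly closed. Tags without an end tag, such as BR or IMG, would need extra handling that is left out here for brevity.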




Members
Properties
Attributes   Read-only property   Holds reference to collection of entity attributes.
End          Read-only property   Holds end position of the entity in the data.
Index        Read-only property   Holds entity index in main collection.
Parts        Read-only property   Holds reference to collection of entity child parts.
Raw          Read-only property   Holds raw entity data.
Start        Read-only property   Holds start position of the entity in the data.
Text         Read-only property   Holds text extracted from the entity.
Type         Read-only property   Holds entity type.
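
To make the relationship between Index, Type, Start, End and Raw more concrete, here is a minimal sketch of a read-only record with the same field names. It is a model for illustration, not the wodHtmlParser API. The EntityRecord class and the scan function are hypothetical. The sketch also assumes zero-based offsets with an exclusive end position and covers only the opening tag, none of which the source states for the real component.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors the read-only nature of the properties
class EntityRecord:
    index: int  # position in the main collection
    type: str   # tag name, e.g. "IMG"
    start: int  # offset of the first character of the tag (assumed zero-based)
    end: int    # offset one past the last character (assumed exclusive)
    raw: str    # the tag exactly as it appears in the data


def scan(data):
    """Return one EntityRecord per opening tag found in `data`."""
    records = []
    # Matches opening tags only; end tags such as </a> start with "</".
    for match in re.finditer(r"<([A-Za-z][A-Za-z0-9]*)[^>]*>", data):
        records.append(EntityRecord(
            index=len(records),
            type=match.group(1).upper(),
            start=match.start(),
            end=match.end(),
            raw=match.group(0),
        ))
    return records


data = '<p><a href="/index.asp?showform=HttpDLX">here</a></p>'
records = scan(data)
for record in records:
    print(record.index, record.type, record.start, record.end, record.raw)
```

Under these assumptions, slicing the data from start to end always reproduces raw, which is one plausible way to read the Start, End and Raw descriptions in the table.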