public class XPathFilter extends ParseFilter
Modifier and Type | Field and Description |
---|---|
protected Map<String,List<com.digitalpebble.stormcrawler.parse.filter.XPathFilter.LabelledExpression>> | expressions |
Constructor and Description |
---|
XPathFilter() |
Modifier and Type | Method and Description |
---|---|
void | configure(Map stormConf, com.fasterxml.jackson.databind.JsonNode filterParams): Called when this filter is being initialized |
void | filter(String URL, byte[] content, DocumentFragment doc, ParseResult parse): Called when parsing a specific page |
boolean | needsDOM(): Specifies whether this filter requires a DOM representation of the document |
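
The summary above says only that the filter keeps a map of labelled expressions and is handed a DocumentFragment when a page is parsed. As a rough illustration of that idea, the sketch below compiles XPath strings grouped by label and evaluates them against a fragment using only the JDK's javax.xml.xpath API. It is not the StormCrawler implementation: the class LabelledXPathSketch, its methods compile, extract and toFragment, and the returned map of label to values are all hypothetical stand-ins, and the real filter writes into a ParseResult rather than returning a map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in, NOT the StormCrawler class: a JDK-only illustration.
class LabelledXPathSketch {

    // Compile each label's XPath strings once (assumed to be the kind of work
    // a configure() step would do with its filter parameters).
    static Map<String, List<XPathExpression>> compile(Map<String, List<String>> raw) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, List<XPathExpression>> compiled = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : raw.entrySet()) {
            List<XPathExpression> list = new ArrayList<>();
            for (String expression : entry.getValue()) {
                try {
                    list.add(xpath.compile(expression));
                } catch (XPathExpressionException e) {
                    throw new IllegalArgumentException("Bad XPath: " + expression, e);
                }
            }
            compiled.put(entry.getKey(), list);
        }
        return compiled;
    }

    // Evaluate every expression against the DOM and collect the trimmed text
    // of each matching node under its label. Labels with no match are omitted.
    static Map<String, List<String>> extract(
            Map<String, List<XPathExpression>> expressions, DocumentFragment doc) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, List<XPathExpression>> entry : expressions.entrySet()) {
            List<String> values = new ArrayList<>();
            for (XPathExpression expression : entry.getValue()) {
                try {
                    NodeList nodes = (NodeList) expression.evaluate(doc, XPathConstants.NODESET);
                    for (int i = 0; i < nodes.getLength(); i++) {
                        values.add(nodes.item(i).getTextContent().trim());
                    }
                } catch (XPathExpressionException e) {
                    throw new IllegalStateException("XPath evaluation failed", e);
                }
            }
            if (!values.isEmpty()) {
                metadata.put(entry.getKey(), values);
            }
        }
        return metadata;
    }

    // Build a DocumentFragment from well-formed markup, for the demo only.
    static DocumentFragment toFragment(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            DocumentFragment fragment = document.createDocumentFragment();
            fragment.appendChild(document.getDocumentElement());
            return fragment;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> raw = new LinkedHashMap<>();
        raw.put("title", Arrays.asList("//title"));
        raw.put("description", Arrays.asList("//meta[@name='description']/@content"));
        DocumentFragment doc = toFragment(
                "<html><head><title> Example page </title>"
                + "<meta name=\"description\" content=\"A short summary\"/></head>"
                + "<body><h1>Hello</h1></body></html>");
        System.out.println(extract(compile(raw), doc));
    }
}
```

Compiling the expressions once and reusing them for every page keeps the per-page work down to evaluation, which is presumably why the class holds them in a field rather than rebuilding them inside filter.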
public void filter(String URL, byte[] content, DocumentFragment doc, ParseResult parse)
Description copied from class ParseFilter: Called when parsing a specific page
Declared in: filter in class ParseFilter
Parameters:
URL - the URL of the page being parsed
content - the content being parsed
doc - the DOM tree resulting from parsing the content, or null if ParseFilter.needsDOM() returns false
parse - the metadata to be updated with the result of the parsing

public void configure(Map stormConf, com.fasterxml.jackson.databind.JsonNode filterParams)
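Description copied from class ParseFilter: Called when this filter is being initialized
Declared in: configure in class ParseFilter
Parameters:
stormConf - the Storm configuration used for the parsing bolt
filterParams - the filter-specific configuration; never null

public boolean needsDOM()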
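Description copied from class ParseFilter: Specifies whether this filter requires a DOM representation of the document
Declared in: needsDOM in class ParseFilter
Returns:
true if this needs a DOM representation of the document, false otherwise.

Copyright © 2018 DigitalPebble Ltd. All rights reserved.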