spaCy Rule Extractor
class etk.extractors.spacy_rule_extractor.Pattern(d: Dict, nlp)
Bases: object
Represents the constraints on a single token.
For each token, the user can specify constraints. Some attributes are built-in spaCy attributes, which can be used directly with rule-based matching (https://spacy.io/usage/linguistic-features#section-rule-based-matching). Others are custom attributes, which require further filtering after the matches are returned.
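The two-stage flow described above (match on built-in attributes first, then filter on custom ones) can be sketched in plain Python. This is only an illustration of the idea, not ETK's or spaCy's implementation: the whitespace tokenizer, the dict-based tokens, and the helper names `match_builtin` and `filter_custom` are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical token representation: a plain dict of attributes.
Token = Dict[str, Any]
Span = Tuple[int, int]


def tokenize(text: str) -> List[Token]:
    """Whitespace tokenizer standing in for a real spaCy pipeline."""
    return [
        {"text": w, "lower": w.lower(), "is_digit": w.isdigit()}
        for w in text.split()
    ]


def match_builtin(tokens: List[Token], pattern: List[Dict[str, Any]]) -> List[Span]:
    """Stage 1: find (start, end) spans where each token satisfies its constraint dict."""
    n = len(pattern)
    spans = []
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(
            all(tok.get(key) == value for key, value in spec.items())
            for tok, spec in zip(window, pattern)
        ):
            spans.append((start, start + n))
    return spans


def filter_custom(
    tokens: List[Token],
    spans: List[Span],
    checks: List[Optional[Callable[[Token], bool]]],
) -> List[Span]:
    """Stage 2: keep spans whose tokens pass the per-position custom checks.

    A check of None means that position has no custom constraint.
    """
    kept = []
    for start, end in spans:
        window = tokens[start:end]
        if all(check is None or check(tok) for tok, check in zip(window, checks)):
            kept.append((start, end))
    return kept


tokens = tokenize("call 5551234 or call 42 today")

# Built-in style constraints: the word "call" followed by a digit token.
pattern = [{"lower": "call"}, {"is_digit": True}]
candidates = match_builtin(tokens, pattern)

# Custom constraint applied afterwards: the number must have at least 7 characters.
checks = [None, lambda tok: len(tok["text"]) >= 7]
matches = filter_custom(tokens, candidates, checks)

print(candidates)  # [(0, 2), (3, 5)]
print(matches)     # [(0, 2)]
```

Keeping the custom checks in a separate pass means the first stage can rely solely on attributes the matcher already understands, while constraints the matcher does not support are applied only to the candidate spans.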
class etk.extractors.spacy_rule_extractor.Rule(d: Dict, nlp)
Bases: object
Represents a single matching rule. Each rule contains multiple patterns.
class etk.extractors.spacy_rule_extractor.SpacyRuleExtractor(nlp, rules: Dict, extractor_name: str)
Bases: etk.extractor.Extractor
Description
This extractor takes a spaCy rule as a reference and extracts the substring of the input text that matches that rule.
Examples
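    import json
    from etk.extractors.spacy_rule_extractor import SpacyRuleExtractor
    with open('path_to_spacy_rules.json', 'r') as f:
        rules = json.load(f)
    sample_rules = rules["test_SpacyRuleExtractor_word_1"]
    # nlp and text are assumed to be defined already; extractor_name is required by the signature (the value here is an arbitrary label)
    spacy_rule_extractor = SpacyRuleExtractor(nlp=nlp, rules=sample_rules, extractor_name="spacy_rule_extractor")
    spacy_rule_extractor.extract(text=text)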