Spacy Rule Extractor
- class etk.extractors.spacy_rule_extractor.Pattern(d: Dict, nlp)
Bases: object
Class Pattern represents a single token.
For each token, the user can specify constraints. Some attributes are spaCy built-in attributes, which can be used with rule-based matching (see https://spacy.io/usage/linguistic-features#section-rule-based-matching). Others are custom attributes, which require further filtering after the matches are found.
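The two-stage idea described above (match on built-in attributes first, then filter the matches with custom constraints) can be illustrated with a small dependency-free sketch. This is not ETK's or spaCy's implementation: the helper names `builtin_ok`, `custom_ok`, and `find_matches`, and the custom keys `prefix` and `min_length`, are hypothetical and chosen only for illustration. The keys `LOWER` and `IS_DIGIT` mirror the names of spaCy token attributes but are evaluated here with plain string methods.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical per-token constraint: upper-case keys mimic attributes a
# matcher can test directly; lower-case keys are custom checks applied
# only after a candidate match has been found.
TokenPattern = Dict[str, Any]


def builtin_ok(token: str, spec: TokenPattern) -> bool:
    """Stage 1: checks comparable to built-in matcher attributes."""
    if "LOWER" in spec and token.lower() != spec["LOWER"]:
        return False
    if "IS_DIGIT" in spec and token.isdigit() != spec["IS_DIGIT"]:
        return False
    return True


def custom_ok(token: str, spec: TokenPattern) -> bool:
    """Stage 2: custom constraints (a required prefix and a minimum length)."""
    prefix = spec.get("prefix", "")
    min_length = spec.get("min_length", 0)
    return token.startswith(prefix) and len(token) >= min_length


def find_matches(tokens: List[str],
                 token_patterns: List[TokenPattern]) -> List[Tuple[int, int]]:
    """Return (start, end) token spans that pass both stages."""
    n = len(token_patterns)
    # Stage 1: candidate spans that satisfy the built-in style attributes.
    candidates = [
        (i, i + n)
        for i in range(len(tokens) - n + 1)
        if all(builtin_ok(tokens[i + k], token_patterns[k]) for k in range(n))
    ]
    # Stage 2: keep only candidates that also satisfy the custom constraints.
    return [
        (start, end)
        for (start, end) in candidates
        if all(custom_ok(tokens[start + k], token_patterns[k]) for k in range(n))
    ]


tokens = "call 555 or call 12 today".split()
token_patterns = [{"LOWER": "call"}, {"IS_DIGIT": True, "min_length": 3}]
matches = find_matches(tokens, token_patterns)
print(matches)
```

Both "call 555" and "call 12" pass the first stage, but only the first survives the custom minimum-length filter. That is the reason custom attributes need a separate filtering step after matching.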
- class etk.extractors.spacy_rule_extractor.Rule(d: Dict, nlp)
Bases: object
Class Rule represents a single matching rule. A rule can contain many patterns.
- class etk.extractors.spacy_rule_extractor.SpacyRuleExtractor(nlp, rules: Dict, extractor_name: str)
Bases: etk.extractor.Extractor
Description
This extractor takes a spaCy rule as reference and extracts the substrings that match it.
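Examples
    import json

    # nlp is a loaded spaCy pipeline and text is the input string (both defined elsewhere).
    with open('path_to_spacy_rules.json', "r") as f:
        rules = json.load(f)
    sample_rules = rules["test_SpacyRuleExtractor_word_1"]
    # extractor_name is required by the constructor signature; the value here is an arbitrary label.
    spacy_rule_extractor = SpacyRuleExtractor(nlp=nlp, rules=sample_rules, extractor_name="spacy_rule_extractor")
    spacy_rule_extractor.extract(text=text)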
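To make "extracts the substring which matches" concrete, the sketch below shows one way a matched token span could be mapped back to a substring of the original text with character offsets. It is only an illustration under stated assumptions: a whitespace tokenizer stands in for spaCy's tokenizer, and `tokenize_with_offsets` and `span_to_substring` are hypothetical helpers, not part of ETK or spaCy.

```python
import re
from typing import List, Tuple

# (token text, start character offset, end character offset)
Token = Tuple[str, int, int]


def tokenize_with_offsets(text: str) -> List[Token]:
    """Split on whitespace, keeping each token's character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def span_to_substring(text: str, tokens: List[Token],
                      start: int, end: int) -> Tuple[str, int, int]:
    """Map the token span [start, end) back to the substring it covers."""
    char_start = tokens[start][1]
    char_end = tokens[end - 1][2]
    return text[char_start:char_end], char_start, char_end


text = "Please call 555 tomorrow"
tokens = tokenize_with_offsets(text)
# Assume a rule matched tokens 1 and 2 ("call", "555").
value, char_start, char_end = span_to_substring(text, tokens, 1, 3)
print(value, char_start, char_end)
```

Slicing the original text by character offsets, rather than re-joining tokens, preserves the exact spacing of the input.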