How to use the serpextract.serpextract.SearchEngineParser class in serpextract

To help you get started, here are a few serpextract examples drawn from common usage in public projects.


Parsely / serpextract / serpextract / serpextract.py
}

        for rule in rule_group:
            if any(url for url in rule['urls'] if '{}' in url):
                rule['urls'] = _expand_country_codes(rule['urls'])
            for i, domain in enumerate(rule['urls']):
                if i == 0:
                    defaults['extractor'] = rule['params']
                    if 'backlink' in rule:
                        defaults['link_macro'] = rule['backlink']
                    if 'charsets' in rule:
                        defaults['charsets'] = rule['charsets']
                    if 'hiddenkeyword' in rule:
                        defaults['hiddenkeyword'] = rule['hiddenkeyword']

                _engines[domain] = SearchEngineParser(engine_name,
                                                      defaults['extractor'],
                                                      defaults['link_macro'],
                                                      defaults['charsets'],
                                                      defaults['hiddenkeyword'])

    return _engines
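
The excerpt above registers one parser per URL pattern, and each rule overrides only the settings it declares. A minimal self-contained sketch of that pattern follows. `Parser`, `build_registry`, the starting defaults, and the sample rule data are hypothetical stand-ins for illustration, not serpextract's actual classes or data, and country-code expansion is left out.

```python
from collections import namedtuple

# Hypothetical stand-in for serpextract's SearchEngineParser; the field names
# mirror the five constructor arguments shown in the excerpt above.
Parser = namedtuple(
    "Parser", "engine_name extractor link_macro charsets hiddenkeyword"
)


def build_registry(engines):
    """Map every URL pattern to a parser built from its rule's settings."""
    registry = {}
    for engine_name, rule_group in engines.items():
        # Assumed starting values; the real defaults are cut off in the excerpt.
        defaults = {
            "extractor": None,
            "link_macro": None,
            "charsets": ["utf-8"],
            "hiddenkeyword": None,
        }
        for rule in rule_group:
            # Each rule overrides only the settings it declares; the rest
            # carry over from earlier rules of the same engine.
            defaults["extractor"] = rule["params"]
            for key, target in (("backlink", "link_macro"),
                                ("charsets", "charsets"),
                                ("hiddenkeyword", "hiddenkeyword")):
                if key in rule:
                    defaults[target] = rule[key]
            for domain in rule["urls"]:
                registry[domain] = Parser(engine_name, **defaults)
    return registry


# Made-up sample data for illustration only.
sample = {
    "ExampleSearch": [
        {"urls": ["search.example.com", "example.org"],
         "params": ["q"], "backlink": "/search?q={k}"},
        {"urls": ["m.example.com"], "params": ["query"]},
    ]
}
registry = build_registry(sample)
```

In this sketch the second rule declares no `backlink`, so the parser for `m.example.com` reuses the link macro set by the first rule while taking its own `params`.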
Parsely / serpextract / serpextract / serpextract.py
def add_custom_parser(match_rule, parser):
    """
    Add a custom search engine parser to the cached ``_engines`` list.

    :param match_rule: A match rule which is used by :func:`get_parser` to look
                       up a parser for a given domain/path.
    :type match_rule:  ``unicode``

    :param parser:     A custom parser.
    :type parser:      :class:`SearchEngineParser`
    """
    assert isinstance(match_rule, text_type)
    assert isinstance(parser, SearchEngineParser)

    global _engines
    _get_search_engines()  # Ensure that the default engine list is loaded

    _engines[match_rule] = parser