Testing SearxNG Search Integration in GPT Academic

This test script exercises the SearxNG search integration in the GPT Academic project. It sends a search request to a SearxNG instance and converts the JSON response into a list of title, content, and link entries, for both general and scientific queries.

Test Coverage Overview

The test suite covers SearxNG search API integration with emphasis on query parameter handling and response processing.

Key functionality includes:
  • General and scientific category search requests
  • Custom engine selection
  • Response parsing and error handling
  • Proxy configuration support
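
The category and engine handling listed above can be sketched as a small pure function. This is only an illustration: build_search_params is a hypothetical helper that does not exist in the test file, and the fallback to Bing when no engine is given is an assumption based on the script's default.

```python
def build_search_params(query, categories="general", engines=None, language="zh"):
    """Build SearxNG query parameters (hypothetical helper, not in the test file)."""
    params = {"q": query, "format": "json", "language": language}
    if categories == "general":
        # Assumption: fall back to Bing when the caller selects no engine.
        params["engines"] = engines if engines is not None else "bing"
    elif categories == "science":
        # Science searches select a category instead of individual engines.
        params["categories"] = "science"
    else:
        raise ValueError(f"unsupported search category: {categories!r}")
    return params


general = build_search_params("vr environment", engines="bing,google")
science = build_search_params("vr environment", categories="science")
print(general)
print(science)
```

Keeping parameter construction separate from the HTTP call makes this part checkable without a running SearxNG instance.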

Implementation Analysis

The script talks directly to SearxNG’s search endpoint and handles both successful responses and HTTP error responses.

Technical patterns include:
  • HTTP POST request handling with custom headers
  • JSON response parsing and transformation
  • Status code handling (200 for success, 429 for rate limiting, any other code reported as a failure)
  • Path validation and environment setup
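
The status code handling and JSON transformation described above might look like the following when separated from the network call. This is a minimal sketch: parse_search_response is a hypothetical name, the sample payload is invented for illustration, and the function takes an already-decoded JSON body rather than a requests response object.

```python
def parse_search_response(status_code, payload, body_text=""):
    """Map a decoded SearxNG response to result dicts (hypothetical helper)."""
    if status_code == 429:
        # 429 means the instance is rate limiting the caller.
        raise ValueError("SearxNG is rate limiting requests; try again later.")
    if status_code != 200:
        raise ValueError(f"search failed with status {status_code}: {body_text}")
    return [
        {
            "title": result.get("title", ""),      # optional field
            "content": result.get("content", ""),  # optional field
            "link": result["url"],                 # required field
        }
        for result in payload["results"]
    ]


# Invented sample payload, shaped like the fields the test script reads.
sample = {
    "results": [
        {"title": "VR survey", "content": "An overview.",
         "url": "https://example.org/vr", "engines": ["arxiv"]},
        {"url": "https://example.org/untitled"},
    ]
}
items = parse_search_response(200, sample)
print(items)
```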

Technical Details

Testing infrastructure includes:
  • Python requests library for HTTP operations
  • Local SearxNG instance (localhost:50001)
  • Custom user agent and language headers
  • Path validation utilities
  • JSON response formatting
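
The path validation step amounts to resolving the repository root from the location of the test file. A minimal sketch of that calculation, assuming the test file sits one directory below the root (repo_root_for is a hypothetical name, and the path passed in is only an example):

```python
import os


def repo_root_for(test_file):
    """Return the absolute path of the directory one level above test_file's folder."""
    return os.path.abspath(os.path.join(os.path.dirname(test_file), ".."))


# Example path only; it does not need to exist for the calculation to work.
root = repo_root_for(os.path.join("gpt_academic", "tests", "test_searxng.py"))
print(root)
```

The test script goes one step further and changes the working directory to this root and appends it to sys.path, so that imports such as toolbox resolve.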

Best Practices Demonstrated

The test implementation showcases robust error handling and input validation practices.

Notable features:
  • Error messages that include the HTTP status code and response body
  • Validation of the categories argument, with a ValueError for unsupported values
  • Modular function design
  • Clear separation of configuration and logic

binary-husky/gpt_academic

tests/test_searxng.py

            
def validate_path():
    """Switch to the repository root and make it importable."""
    import os, sys
    # This file lives in tests/, so the repository root is one level up.
    root_dir_assume = os.path.abspath(os.path.dirname(__file__) + "/..")
    os.chdir(root_dir_assume)
    sys.path.append(root_dir_assume)
validate_path()  # validate path so you can run from base directory

from toolbox import get_conf
import requests

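def searxng_request(query, proxies, categories='general', searxng_url=None, engines=None):
    # Use the caller's URL when given; otherwise fall back to the local test instance.
    url = searxng_url if searxng_url is not None else 'http://localhost:50001/'

    # Default to Bing when no engine is selected; otherwise pass the caller's choice through.
    # (Previously `engine` was undefined whenever `engines` was supplied.)
    engine = 'bing,' if engines is None else engines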
    if categories == 'general':
        params = {
            'q': query,         # search query
            'format': 'json',   # return results as JSON
            'language': 'zh',   # search language
            'engines': engine,
        }
    elif categories == 'science':
        params = {
            'q': query,         # search query
            'format': 'json',   # return results as JSON
            'language': 'zh',   # search language
            'categories': 'science'
        }
    else:
        raise ValueError('Unsupported search category')
    headers = {
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
        'X-Forwarded-For': '112.112.112.112',
        'X-Real-IP': '112.112.112.112'
    }
    results = []
    response = requests.post(url, params=params, headers=headers, proxies=proxies, timeout=30)
    if response.status_code == 200:
        json_result = response.json()
        for result in json_result['results']:
            item = {
                "title": result.get("title", ""),
                "content": result.get("content", ""),
                "link": result["url"],
            }
            print(result['engines'])
            results.append(item)
        return results
    else:
        if response.status_code == 429:
            raise ValueError("SearxNG (online search service) has too many users right now; please try again later.")
        else:
            raise ValueError("Online search failed, status code: " + str(response.status_code) + '\t' + response.content.decode('utf-8'))
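if __name__ == "__main__":
    # Manual smoke test: requires a SearxNG instance listening on localhost:50001.
    res = searxng_request("vr environment", None, categories='science', searxng_url=None, engines=None)
    print(res)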