A local-first system that discovers, downloads, and filters research papers from open-access repositories, using Qwen3:8B served by Ollama for relevance analysis. Paper text is analysed on your machine and is never sent to an external inference service.
A deterministic automated workflow from research query to filtered, organised, relevance-scored paper library — running entirely on local infrastructure.
User submits a research topic or keyword query to initiate paper discovery across all sources.
Simultaneous API queries to arXiv, DOAJ, PubMed Central, and PLOS ONE for comprehensive discovery.
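The fan-out to several sources can be sketched with a thread pool. This is only an illustration: the `search_all` helper and the per-source adapter functions are hypothetical names, and the real request format for each repository is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List

# Hypothetical adapter signature: takes a query string, returns paper records.
SourceFn = Callable[[str], List[dict]]


def search_all(query: str, sources: Dict[str, SourceFn],
               timeout: float = 30.0) -> Dict[str, List[dict]]:
    """Run every source adapter in its own thread and group results by source name."""
    results: Dict[str, List[dict]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        for future in as_completed(futures, timeout=timeout):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing source should not abort the others.
                print(f"{name} failed: {exc}")
                results[name] = []
    return results
```

Passing the adapters in as a dictionary keeps each repository's request logic separate, so a source can be added or disabled without touching the fan-out code.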
PDFs are retrieved with Requests and their text is extracted with PyPDF2, capped at 5 pages and 5000 characters per document.
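A minimal sketch of the extraction limits, assuming the caps mean "the first 5 pages" and "5000 characters in total". The function names are hypothetical; `PdfReader`, `reader.pages`, and `page.extract_text()` are PyPDF2's own API.

```python
from typing import Iterable, Optional

MAX_PAGES = 5     # pages read per document (value from the text)
MAX_CHARS = 5000  # characters kept per document (value from the text)


def clip_text(page_texts: Iterable[Optional[str]],
              max_pages: int = MAX_PAGES,
              max_chars: int = MAX_CHARS) -> str:
    """Join the first max_pages page strings and cut the result to max_chars."""
    kept = []
    for index, text in enumerate(page_texts):
        if index >= max_pages:
            break
        kept.append(text or "")  # pages with no extractable text become empty
    return "\n".join(kept)[:max_chars]


def extract_pdf_text(path: str) -> str:
    """Read a PDF with PyPDF2 and return the clipped text."""
    from PyPDF2 import PdfReader  # lazy import: clip_text works without PyPDF2
    reader = PdfReader(path)
    return clip_text(page.extract_text() for page in reader.pages)
```

Because `clip_text` consumes a generator and stops after `max_pages`, later pages of a long PDF are never parsed.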
Ollama Qwen3:8B analyses each paper's extracted text and scores it against the original query intent.
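The relevance check could look roughly like the sketch below. The endpoint is Ollama's default local `/api/generate` route; the prompt wording, the JSON verdict shape (`relevant` plus `reason`), and the function names are assumptions made for illustration, not the project's actual prompt or schema.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen3:8b"
UNPARSEABLE = {"relevant": False, "reason": "unparseable model reply"}


def build_prompt(query: str, text: str) -> str:
    # Hypothetical prompt wording.
    return (
        f"Research query: {query}\n\nPaper excerpt:\n{text}\n\n"
        'Reply with JSON only: {"relevant": true or false, "reason": "<one sentence>"}'
    )


def parse_verdict(raw: str) -> dict:
    """Parse the span between the first '{' and last '}' after any </think> block."""
    raw = raw.split("</think>")[-1]  # drop reasoning text if the model emitted it
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return dict(UNPARSEABLE)
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return dict(UNPARSEABLE)
    return {"relevant": bool(data.get("relevant")), "reason": str(data.get("reason", ""))}


def judge(query: str, text: str) -> dict:
    """Send one non-streaming generate request to the local Ollama server."""
    payload = json.dumps(
        {"model": MODEL, "prompt": build_prompt(query, text), "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        reply = json.load(response)
    return parse_verdict(reply.get("response", ""))
```

Treating an unparseable reply as a rejection with an explicit reason keeps malformed model output visible in the log instead of silently accepting the paper.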
Relevant papers saved to query-named folders; rejected papers logged with reasons for transparency.
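One way to route papers after scoring is sketched below. The folder-naming rule, the `rejected.jsonl` log file, and the verdict dictionary are hypothetical choices for illustration; the project's actual naming convention and log format are not specified here.

```python
import json
import re
from pathlib import Path


def folder_name(text: str) -> str:
    """Turn free text into a filesystem-safe lowercase name."""
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    return slug or "untitled"


def file_paper(root: Path, query: str, title: str,
               pdf_bytes: bytes, verdict: dict) -> Path:
    """Save an accepted PDF, or append a rejected one to the log. Returns the path written."""
    target = root / folder_name(query)
    target.mkdir(parents=True, exist_ok=True)
    if verdict["relevant"]:
        path = target / f"{folder_name(title)}.pdf"
        path.write_bytes(pdf_bytes)
        return path
    log = target / "rejected.jsonl"  # hypothetical log file name
    with log.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"title": title, "reason": verdict["reason"]}) + "\n")
    return log
```

An append-only JSON Lines log keeps one rejection per line, so it can be reviewed or filtered later without loading the whole file.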
Chosen for local-first, privacy-preserving operation. Inference and file handling run on-device, with no API keys and no cloud LLM dependency; network access is limited to the open-access repositories being searched.
Primary language driving the full pipeline, from API calls to PDF processing and file organisation.
Language model used for relevance classification and content analysis, deployed locally with no external inference calls.
Four open-access academic APIs providing comprehensive, legally accessible paper discovery across scientific domains.
Calibrated defaults for balanced throughput, LLM context quality, and API rate-limit compliance.
Six capabilities that distinguish aiRPD from generic paper scrapers and cloud-dependent research tools.
Qwen3:8B intelligently filters papers based on semantic relevance to your query — not just keyword matching — saving time and storage.
Searches arXiv, DOAJ, PubMed Central, and PLOS ONE simultaneously for the broadest possible open-access coverage.
All analysis runs locally. Search queries go only to the open-access repositories themselves, and no document text or metadata is sent to an external inference service at any stage of the pipeline.
Adjustable page extraction limits, character context windows, and processing delays to match your hardware and throughput requirements.
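The adjustable limits could be grouped in a small settings object, as in this sketch. `PipelineConfig` and its field names are hypothetical; the page and character defaults come from the text, while the delay default is an assumed placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    max_pages: int = 5          # pages read per PDF (value from the text)
    max_chars: int = 5000       # characters passed to the model (value from the text)
    delay_seconds: float = 1.0  # pause between requests; assumed default

    def __post_init__(self) -> None:
        # Reject settings that would produce empty input or a negative sleep.
        if self.max_pages < 1 or self.max_chars < 1:
            raise ValueError("max_pages and max_chars must be positive")
        if self.delay_seconds < 0:
            raise ValueError("delay_seconds cannot be negative")
```

On constrained hardware, a smaller context such as `PipelineConfig(max_pages=3, max_chars=2000)` trades some relevance accuracy for faster inference.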
Every filtered paper is logged with the AI-generated reason for rejection, providing a full audit trail for transparency and review.
Relevant papers are automatically sorted into query-named folders with clear naming conventions for easy retrieval and citation.
Four open-access APIs covering preprints, peer-reviewed journals, biomedical literature, and multidisciplinary publications.