NodeRefine Docs
NodeRefine is a performance layer for RAG systems. It transforms bloated vector databases into lean, high-precision knowledge assets through three progressive refinement stages: semantic de-duplication, contextual pruning, and topology re-linking.
This documentation covers SDK installation, API endpoints, core concepts, and industry-specific configuration templates.
Quickstart
Get NodeRefine running against your vector database in under 5 minutes.
1. Install the SDK
2. Initialize the Client
3. Run Your First Refinement
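The three steps above can be sketched end to end. Everything here is illustrative: the package name, client class, and method signatures are assumptions, since the SDK is distributed privately during the beta.

```python
# 1. Install the SDK (assumed package name):
#    pip install noderefine
from noderefine import NodeRefine

# 2. Initialize the client. nr_test_* keys enforce dry_run,
#    so this is safe to run against a real collection.
client = NodeRefine(api_key="nr_test_...")

# 3. Run your first refinement.
job = client.refine(
    collection="my-docs",
    strategies=["dedup", "prune", "relink"],
    dry_run=True,  # preview changes without applying them
)
print(job.id, job.status)
```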
Authentication
NodeRefine is currently in private beta. API keys are only issued to whitelisted users. Request access to receive your credentials.
All API requests require a Bearer token. Once your account is approved, you can generate API keys from the Lab console under Settings → API Keys.
API keys follow two conventions:
- `nr_live_*` — Production keys. All refinement operations are permanent.
- `nr_test_*` — Sandbox keys. Refinements are simulated (`dry_run` enforced).
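A minimal sketch of what every request must carry. The key value is a placeholder; real keys come from the Lab console as described above.

```python
API_KEY = "nr_live_example"  # placeholder; generate real keys in Settings -> API Keys

# Every API request must present the key as a Bearer token.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```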
Python SDK
The official Python SDK provides a high-level interface to all NodeRefine capabilities. Requires Python 3.9+.
Installation
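Assuming the package is published under the product name (unconfirmed during the private beta):

```shell
pip install noderefine
```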
Basic Usage
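A sketch of a typical synchronous workflow. The client class, `refine`, and `wait` are illustrative assumptions, not a confirmed API surface.

```python
from noderefine import NodeRefine  # hypothetical import path

client = NodeRefine(api_key="nr_test_...")

# Dedup and prune a collection, preview only.
job = client.refine(
    collection="support-kb",
    strategies=["dedup", "prune"],
    dry_run=True,
)

# Block until the job completes, then inspect the result.
result = client.wait(job.id)
print(result.summary)
```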
Async Usage
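The same workflow under asyncio, assuming a hypothetical async client that supports use as an async context manager:

```python
import asyncio
from noderefine import AsyncNodeRefine  # hypothetical async client

async def main() -> None:
    async with AsyncNodeRefine(api_key="nr_test_...") as client:
        job = await client.refine(
            collection="support-kb",
            strategies=["relink"],
        )
        result = await client.wait(job.id)
        print(result.summary)

asyncio.run(main())
```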
Rust SDK
The Rust SDK is optimized for high-throughput, low-latency refinement pipelines in production environments.
Installation
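Assuming the crate shares the product name (unconfirmed during the private beta):

```shell
cargo add noderefine
```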
Basic Usage
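A sketch of the equivalent workflow in Rust. The crate, types, and method names are assumptions; an async runtime such as Tokio is assumed given the SDK's high-throughput focus.

```rust
// Hypothetical API surface; verify names against the shipped crate.
use noderefine::{Client, RefineRequest, Strategy};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new(std::env::var("NODEREFINE_API_KEY")?);

    let job = client
        .refine(RefineRequest {
            collection: "support-kb".into(),
            strategies: vec![Strategy::Dedup, Strategy::Prune],
            dry_run: true,
            ..Default::default()
        })
        .await?;

    println!("job {} -> {:?}", job.id, job.status);
    Ok(())
}
```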
Semantic De-duplication
The first stage of the NodeRefine pipeline identifies chunks that are worded differently but carry overlapping semantic meaning. Unlike naive hash-based dedup, NodeRefine uses cross-encoder models to compute pairwise semantic similarity.
How It Works
- Candidate Selection — Fast bi-encoder pre-filtering narrows the comparison space from O(n²) to O(n·k) by selecting only the top-k nearest neighbors per node.
- Cross-Encoder Scoring — Each candidate pair is scored by a high-precision cross-encoder. Pairs above the threshold (default: 0.88) are marked as duplicates.
- Merge Resolution — The node with the highest aggregate retrieval score is promoted. Metadata from the demoted node is absorbed, preserving context breadth.
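The three steps can be sketched with toy vectors. This is a self-contained illustration, not NodeRefine's implementation: plain cosine similarity stands in for both the bi-encoder pre-filter and the cross-encoder scorer, and the node records are invented.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

nodes = {
    "a": {"vec": [1.0, 0.0], "retrieval_score": 0.9, "meta": {"src": "faq.md"}},
    "b": {"vec": [0.99, 0.05], "retrieval_score": 0.4, "meta": {"src": "kb.html", "lang": "en"}},
    "c": {"vec": [0.0, 1.0], "retrieval_score": 0.7, "meta": {"src": "guide.md"}},
}

# 1. Candidate selection: only the top-k nearest neighbours per node (k=1 here),
#    shrinking the pairwise comparison space from O(n^2) to O(n*k).
k = 1
candidates = set()
for nid, node in nodes.items():
    ranked = sorted(
        (o for o in nodes if o != nid),
        key=lambda o: cosine(node["vec"], nodes[o]["vec"]),
        reverse=True,
    )
    candidates.update(tuple(sorted((nid, o))) for o in ranked[:k])

# 2. Scoring: candidate pairs above the threshold are duplicates.
THRESHOLD = 0.88
duplicates = [
    (x, y) for x, y in candidates
    if cosine(nodes[x]["vec"], nodes[y]["vec"]) >= THRESHOLD
]

# 3. Merge resolution: promote the higher-scoring node and absorb the
#    demoted node's metadata without overwriting existing keys.
for x, y in duplicates:
    if x not in nodes or y not in nodes:
        continue  # already merged away in an earlier pair
    keep, drop = (x, y) if nodes[x]["retrieval_score"] >= nodes[y]["retrieval_score"] else (y, x)
    for key, val in nodes[drop]["meta"].items():
        nodes[keep]["meta"].setdefault(key, val)
    del nodes[drop]
```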
Configuration
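A possible shape for the strategy overrides passed via the `config` parameter of `/v1/refine`. Only the 0.88 default comes from the text above; the key names are illustrative assumptions.

```json
{
  "dedup": {
    "similarity_threshold": 0.88,
    "candidate_top_k": 20,
    "merge_strategy": "highest_retrieval_score"
  }
}
```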
Contextual Pruning
The second stage removes noise from individual chunks. This includes outdated metadata, formatting artifacts, conversion debris (from PDF/HTML extraction), and low-information filler text.
What Gets Pruned
- Dead metadata — Timestamps, file paths, page numbers that add no retrieval value.
- Format artifacts — HTML tags, markdown escapes, OCR errors from document conversion.
- Semantic filler — Repeated boilerplate, disclaimers, headers/footers that appear across multiple chunks.
- Orphan nodes — Chunks with fewer than N retrievals over a defined period and below a similarity floor.
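The first three categories can be sketched as a single cleanup pass over one chunk. This is a self-contained illustration: the regexes and boilerplate list are invented, not NodeRefine's heuristics, and orphan-node detection (a retrieval-statistics question) is out of scope here.

```python
import re

BOILERPLATE = {"confidential - do not distribute"}  # filler seen across chunks

def prune_chunk(text: str) -> str:
    text = re.sub(r"<[^>]+>", "", text)           # format artifacts: HTML tags
    text = re.sub(r"\\([*_#\[\]])", r"\1", text)  # format artifacts: markdown escapes
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.lower() in BOILERPLATE:       # semantic filler
            continue
        if re.fullmatch(r"page \d+ of \d+", stripped.lower()):  # dead metadata
            continue
        lines.append(stripped)
    return "\n".join(lines)

raw = "<p>Refunds take \\*5\\* days.</p>\nPage 3 of 10\nCONFIDENTIAL - DO NOT DISTRIBUTE"
print(prune_chunk(raw))
```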
Configuration
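A possible `config` fragment for the pruning stage. All key names are illustrative assumptions; the orphan settings mirror the retrieval-count, time-window, and similarity-floor criteria described above.

```json
{
  "prune": {
    "strip_format_artifacts": true,
    "boilerplate_min_occurrences": 3,
    "orphan_min_retrievals": 2,
    "orphan_window_days": 30,
    "orphan_similarity_floor": 0.2
  }
}
```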
Topology Re-linking
The third and most powerful stage builds logical dependency edges between semantically adjacent nodes. This transforms a flat vector store into a traversable knowledge graph.
Benefits
- Context enrichment — When an LLM retrieves node A, it also receives the most logically coupled nodes B and C, providing richer context without additional queries.
- Multi-hop reasoning — Edge traversal enables the LLM to follow logical chains across documents, dramatically improving answers to complex questions.
- Reduced hallucination — By providing structurally connected evidence, the LLM has less incentive to fabricate connections.
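The enrichment and multi-hop behaviours can be sketched with a toy edge map. The graph, function name, and traversal policy are illustrative, not NodeRefine's internals.

```python
# Logical dependency edges produced by re-linking (invented example).
edges = {
    "A": ["B", "C"],  # A's most logically coupled nodes
    "B": ["D"],
    "C": [],
    "D": [],
}

def retrieve_with_context(node_id: str, hops: int = 1) -> list[str]:
    """Return node_id plus every node reachable within `hops` edge traversals."""
    seen, frontier = [node_id], [node_id]
    for _ in range(hops):
        frontier = [n for f in frontier for n in edges.get(f, []) if n not in seen]
        seen.extend(frontier)
    return seen

# One hop enriches A with B and C at no extra query cost;
# two hops follow the logical chain onward to D for multi-hop reasoning.
```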
Configuration
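A possible `config` fragment for the re-linking stage; all key names are illustrative assumptions.

```json
{
  "relink": {
    "max_edges_per_node": 4,
    "edge_similarity_floor": 0.75,
    "max_traversal_hops": 2
  }
}
```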
API Reference
POST /v1/refine
Trigger a refinement job on a vector collection.
| Parameter | Type | Required | Description |
|---|---|---|---|
| collection | string | Yes | Target vector collection name |
| strategies | string[] | Yes | Array of strategies: "dedup", "prune", "relink" |
| dry_run | boolean | No | Preview changes without applying (default: false) |
| config | object | No | Strategy-specific configuration overrides |
| webhook_url | string | No | URL to receive completion callback |
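A request body built from the parameter table above. The base URL is a placeholder, since the beta endpoint host is not documented here.

```python
import json

payload = {
    "collection": "support-kb",
    "strategies": ["dedup", "prune", "relink"],
    "dry_run": True,
    "config": {"dedup": {"similarity_threshold": 0.9}},
}

body = json.dumps(payload)
# requests.post("https://<your-api-host>/v1/refine",
#               headers={"Authorization": "Bearer nr_test_..."},
#               data=body)
```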
POST /v1/query
Run a retrieval query with optional before/after comparison.
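A possible request body. The `compare` flag name is an assumption inferred from the before/after comparison described above, not a confirmed parameter.

```python
payload = {
    "collection": "support-kb",
    "query": "How do refunds work?",
    "compare": True,  # return results from both pre- and post-refinement indexes
}
```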
GET /v1/status/:job_id
Check the status of a refinement job.
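An illustrative response shape and completion check. The field names and status values are assumptions based on the endpoints above, not a confirmed schema.

```python
sample_response = {
    "job_id": "job_123",
    "status": "completed",  # assumed lifecycle: queued | running | completed | failed
    "strategies": ["dedup", "prune"],
    "stats": {"nodes_before": 1000, "nodes_after": 840},
}

def is_done(resp: dict) -> bool:
    """A job is terminal once it has completed or failed."""
    return resp["status"] in ("completed", "failed")
```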
Industry Templates
Pre-configured refinement templates for high-accuracy industries. Each template includes optimized thresholds, compliance guardrails, and domain-specific heuristics.
Legal
Law firms and legal tech platforms deal with massive document corpora where precision is non-negotiable. The Legal template enforces strict version arbitration and preserves all jurisdictional metadata.
Key features: Citation graph preservation, jurisdiction-aware dedup, ruling recency weighting, privilege-tag immunity.
Healthcare
Healthcare RAG systems require HIPAA-compliant refinement. The Medical template never prunes PHI-tagged nodes and maintains audit trails for every refinement action.
Key features: PHI auto-detection and protection, evidence-level weighting, ICD/CPT code preservation, HIPAA audit logging.
Engineering
Technical documentation, API specs, and code repositories benefit from version-aware refinement that respects semver and deprecation patterns.
Key features: Semver-aware conflict resolution, code block integrity, module-based re-linking, deprecation demotion.
LangChain Integration
Drop NodeRefine into your existing LangChain pipeline as a retriever wrapper.
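A sketch of such a wrapper. The LangChain side uses the real `BaseRetriever` interface from `langchain-core`; the NodeRefine client and its `query` method are hypothetical assumptions.

```python
from typing import Any, List
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class NodeRefineRetriever(BaseRetriever):
    """Queries a NodeRefine-refined collection and returns LangChain Documents."""

    client: Any        # hypothetical NodeRefine client instance
    collection: str

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        hits = self.client.query(collection=self.collection, text=query)
        return [Document(page_content=h.text, metadata=h.metadata) for h in hits]
```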
LlamaIndex Integration
NodeRefine plugs into LlamaIndex as a node post-processor.
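A sketch of such a post-processor. The base class comes from the real `llama-index-core` package; the NodeRefine client and its `demoted_ids` method are hypothetical assumptions.

```python
from typing import Any, List, Optional
from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle

class NodeRefinePostprocessor(BaseNodePostprocessor):
    """Drops retrieved nodes that NodeRefine has marked as demoted duplicates."""

    client: Any        # hypothetical NodeRefine client instance
    collection: str

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
        demoted = set(self.client.demoted_ids(collection=self.collection))
        return [n for n in nodes if n.node.node_id not in demoted]
```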
Need Access or Help?
NodeRefine is invite-only during private beta. Request access to get your API credentials, or reach out if you need support.