
NodeRefine Docs

NodeRefine is the performance module for RAG system engineering. It transforms bloated vector databases into lean, high-precision knowledge assets through three progressive refinement stages: semantic de-duplication, contextual pruning, and topology re-linking.

This documentation covers SDK installation, API endpoints, core concepts, and industry-specific configuration templates.

Quickstart

Get NodeRefine running against your vector database in under 5 minutes.

1. Install the SDK

# Python
pip install noderefine

# Rust
cargo add noderefine

2. Initialize the Client

from noderefine import Client

client = Client(
    api_key="nr_live_your_api_key",
    db_provider="pinecone",  # or "milvus", "qdrant", "weaviate", "chroma", "pgvector"
    db_config={
        "index": "my-knowledge-base",
        "api_key": "pc_..."
    }
)

3. Run Your First Refinement

result = client.refine(
    strategies=["dedup", "prune", "relink"],
    dry_run=True  # Preview changes before applying
)

print(result.summary)
# → Analyzed 128,431 nodes. Found 34,102 redundant, 21,088 pruneable.
# → Estimated noise reduction: 73%. Token savings: ~$42K/month.

# Apply when ready
result.apply()

Authentication

Whitelist Required

NodeRefine is currently in private beta. API keys are only issued to whitelisted users. Request access to receive your credentials.

All API requests require a Bearer token. Once your account is approved, you can generate API keys from the Lab console under Settings → API Keys.

# Include in all requests
Authorization: Bearer nr_live_your_api_key

# Test keys (sandbox)
Authorization: Bearer nr_test_your_api_key

API keys follow two conventions:

  • nr_live_* — Production keys. All refinement operations are permanent.
  • nr_test_* — Sandbox keys. Refinements are simulated (dry_run enforced).
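A simple guard can keep a live key out of non-production environments by checking these prefixes. This is an illustrative sketch, not part of the SDK; the `APP_ENV`, `NODEREFINE_LIVE_KEY`, and `NODEREFINE_TEST_KEY` variable names are hypothetical and should be adapted to your own configuration:

```python
import os

def resolve_api_key():
    """Pick a live key only in production; otherwise fall back to sandbox.

    Environment variable names here are hypothetical examples.
    """
    if os.environ.get("APP_ENV") == "production":
        key = os.environ["NODEREFINE_LIVE_KEY"]
        if not key.startswith("nr_live_"):
            raise ValueError("expected a production (nr_live_*) key")
    else:
        key = os.environ["NODEREFINE_TEST_KEY"]
        if not key.startswith("nr_test_"):
            raise ValueError("expected a sandbox (nr_test_*) key")
    return key
```

Because sandbox keys enforce `dry_run`, a mix-up fails safe: the worst case is a simulated refinement, never an accidental permanent one.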

Python SDK

The official Python SDK provides a high-level interface to all NodeRefine capabilities. Requires Python 3.9+.

Installation

pip install noderefine

# With optional async support
pip install noderefine[async]

Basic Usage

from noderefine import Client, Strategy

client = Client(api_key="nr_live_...")

# Connect to your vector database
collection = client.connect(
    provider="qdrant",
    url="https://your-qdrant-instance.io",
    collection="knowledge-base"
)

# Configure refinement strategies
pipeline = collection.pipeline([
    Strategy.dedup(threshold=0.88),
    Strategy.prune(min_frequency=2, min_similarity=0.32),
    Strategy.relink(max_edges=5, min_coupling=0.65),
])

# Execute with progress tracking
for event in pipeline.stream():
    print(f"[{event.stage}] {event.message}")

Async Usage

import asyncio
from noderefine import AsyncClient

async def main():
    client = AsyncClient(api_key="nr_live_...")
    result = await client.refine(
        collection="legal-docs",
        strategies=["dedup", "prune"]
    )
    print(result.stats)

asyncio.run(main())

Rust SDK

The Rust SDK is optimized for high-throughput, low-latency refinement pipelines in production environments.

Installation

# Cargo.toml
[dependencies]
noderefine = "0.4"
tokio = { version = "1", features = ["full"] }

Basic Usage

use noderefine::{Client, Strategy};

#[tokio::main]
async fn main() -> Result<(), noderefine::Error> {
    let client = Client::new("nr_live_...");

    let result = client
        .refine("knowledge-base")
        .strategy(Strategy::Dedup { threshold: 0.88 })
        .strategy(Strategy::Prune {
            min_freq: 2,
            min_sim: 0.32,
        })
        .execute()
        .await?;

    println!(
        "Refined {} nodes, saved {} tokens",
        result.nodes_processed, result.tokens_saved
    );

    Ok(())
}

Semantic De-duplication

The first stage of the NodeRefine pipeline identifies chunks that are worded differently but carry overlapping semantic meaning. Unlike naive hash-based dedup, NodeRefine uses cross-encoder models to compute pairwise semantic similarity.

How It Works

  1. Candidate Selection — Fast bi-encoder pre-filtering narrows the comparison space from O(n²) to O(n·k) by selecting only the top-k nearest neighbors per node.
  2. Cross-Encoder Scoring — Each candidate pair is scored by a high-precision cross-encoder. Pairs above the threshold (default: 0.88) are marked as duplicates.
  3. Merge Resolution — The node with the highest aggregate retrieval score is promoted. Metadata from the demoted node is absorbed, preserving context breadth.
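The three steps above can be sketched in plain Python. This is an illustrative toy, not NodeRefine's implementation: plain cosine similarity stands in for both the bi-encoder pre-filter and the cross-encoder score, and the node tuples `(id, vector, retrieval_score)` are a hypothetical shape chosen for the example:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dedup(nodes, top_k=2, threshold=0.88):
    """nodes: list of (node_id, vector, retrieval_score) tuples."""
    merged_away = set()
    for i, (nid, vec, score) in enumerate(nodes):
        # 1. Candidate selection: compare only the top-k nearest neighbors,
        #    shrinking the O(n^2) comparison space to O(n*k)
        neighbors = sorted(
            (j for j in range(len(nodes)) if j != i),
            key=lambda j: -cosine(vec, nodes[j][1]),
        )[:top_k]
        for j in neighbors:
            # 2. Scoring: pairs above the threshold are duplicates
            if cosine(vec, nodes[j][1]) >= threshold:
                # 3. Merge resolution: the node with the lower
                #    retrieval score is demoted
                loser = i if score < nodes[j][2] else j
                merged_away.add(nodes[loser][0])
    return [n for n in nodes if n[0] not in merged_away]
```

In the real pipeline the candidate step would use the vector index's own ANN search rather than a full sort, and metadata absorption would happen at merge time.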

Configuration

Strategy.dedup(
    threshold=0.88,            # Semantic similarity threshold (0.0–1.0)
    model="cross-encoder-v3",  # Cross-encoder model variant
    merge="absorb",            # "absorb" | "keep_latest" | "keep_highest"
    batch_size=1000,           # Processing batch size
)

Contextual Pruning

The second stage removes noise from individual chunks. This includes outdated metadata, formatting artifacts, conversion debris (from PDF/HTML extraction), and low-information filler text.

What Gets Pruned

  • Dead metadata — Timestamps, file paths, page numbers that add no retrieval value.
  • Format artifacts — HTML tags, markdown escapes, OCR errors from document conversion.
  • Semantic filler — Repeated boilerplate, disclaimers, headers/footers that appear across multiple chunks.
  • Orphan nodes — Chunks with fewer than N retrievals over a defined period and below a similarity floor.
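The orphan-node rule in the last bullet can be expressed as a small predicate. A minimal sketch, assuming a hypothetical per-node stats dict (`retrievals`, `avg_similarity`, `tags`) rather than NodeRefine's internal representation:

```python
def should_prune(node, min_frequency=2, min_similarity=0.32,
                 preserve_compliance=True):
    """Toy orphan-node check mirroring the prune strategy's knobs.

    node: dict with 'retrievals' (count inside the lookback window),
    'avg_similarity' (mean cosine score when retrieved), and 'tags'.
    """
    # Compliance-tagged data is never pruned
    if preserve_compliance and "compliance" in node.get("tags", []):
        return False
    # Orphan: rarely retrieved AND below the similarity floor
    return (node["retrievals"] < min_frequency
            and node["avg_similarity"] < min_similarity)
```

Note that both conditions must hold: a rarely retrieved chunk that still scores well on similarity is kept, since it may serve long-tail queries.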

Configuration

Strategy.prune(
    min_frequency=2,           # Min retrievals in the lookback period
    min_similarity=0.32,       # Floor cosine similarity to retain
    lookback_days=30,          # Activity lookback window
    preserve_compliance=True,  # Never prune compliance-tagged data
)

Topology Re-linking

The third and most powerful stage builds logical dependency edges between semantically adjacent nodes. This transforms a flat vector store into a traversable knowledge graph.

Benefits

  • Context enrichment — When an LLM retrieves node A, it also receives the most logically coupled nodes B and C, providing richer context without additional queries.
  • Multi-hop reasoning — Edge traversal enables the LLM to follow logical chains across documents, dramatically improving answers to complex questions.
  • Reduced hallucination — By providing structurally connected evidence, the LLM has less incentive to fabricate connections.
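To make the enrichment and multi-hop behavior concrete, here is a toy retrieval expansion over a re-linked graph. The adjacency-dict representation is an assumption for illustration; it is not how NodeRefine stores edges:

```python
def retrieve_with_edges(graph, hits, max_hops=1):
    """Expand a flat hit list by following outgoing edges.

    graph: dict mapping node_id -> list of (neighbor_id, coupling_score)
    hits:  node ids returned by the initial vector search
    """
    seen = list(hits)
    frontier = list(hits)
    for _ in range(max_hops):
        nxt = []
        for nid in frontier:
            # Visit the most strongly coupled neighbors first
            for neighbor, coupling in sorted(graph.get(nid, []),
                                             key=lambda e: -e[1]):
                if neighbor not in seen:
                    seen.append(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
    return seen
```

With `max_hops=1` a hit on node A also pulls in its directly coupled neighbors (context enrichment); raising `max_hops` follows chains across documents (multi-hop reasoning), all without issuing additional vector queries.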

Configuration

Strategy.relink(
    max_edges=5,             # Maximum outgoing edges per node
    min_coupling=0.65,       # Minimum logical coupling score
    edge_model="causal-v2",  # Edge weight model variant
    bidirectional=True,      # Create bidirectional edges
)

API Reference

POST /v1/refine

Trigger a refinement job on a vector collection.

Parameter     Type      Required  Description
collection    string    Yes       Target vector collection name
strategies    string[]  Yes       Array of strategies: "dedup", "prune", "relink"
dry_run       boolean   No        Preview changes without applying (default: false)
config        object    No        Strategy-specific configuration overrides
webhook_url   string    No        URL to receive completion callback
// Request
POST /v1/refine
{
  "collection": "legal-kb-prod",
  "strategies": ["dedup", "prune", "relink"],
  "dry_run": false,
  "config": {
    "dedup": { "threshold": 0.90 },
    "prune": { "preserve_compliance": true }
  }
}

// Response
{
  "job_id": "rf_a1b2c3d4",
  "status": "processing",
  "nodes_total": 128431,
  "estimated_completion": "2026-03-20T15:30:00Z"
}

POST /v1/query

Run a retrieval query with optional before/after comparison.

POST /v1/query
{
  "query": "What are the compliance requirements for data retention?",
  "collection": "legal-kb-prod",
  "top_k": 5,
  "compare": true
}

// Response includes both raw and refined results
{
  "before":  { "chunks": [...], "tokens": 2847, "top1_score": 0.82 },
  "after":   { "chunks": [...], "tokens": 1203, "top1_score": 0.96 },
  "savings": { "token_reduction": "57.7%", "relevance_gain": "17.1%" }
}

GET /v1/status/:job_id

Check the status of a refinement job.

GET /v1/status/rf_a1b2c3d4

{
  "job_id": "rf_a1b2c3d4",
  "status": "completed",
  "stages": {
    "dedup":  { "status": "done", "merged": 34102 },
    "prune":  { "status": "done", "removed": 21088 },
    "relink": { "status": "done", "edges_created": 89412 }
  },
  "noise_reduction": "73%",
  "token_savings_monthly": "$42,180",
  "duration_ms": 184320
}

Industry Templates

Pre-configured refinement templates for high-accuracy industries. Each template includes optimized thresholds, compliance guardrails, and domain-specific heuristics.

Healthcare

Healthcare RAG systems require HIPAA-compliant refinement. The Medical template never prunes PHI-tagged nodes and maintains audit trails for every refinement action.

result = client.refine(
    collection="clinical-guidelines",
    template="medical",
    config={
        "hipaa_mode": True,
        "audit_trail": True,
        "phi_protection": "strict",
        "evidence_level_weight": True,  # Prioritize higher evidence levels
    }
)

Key features: PHI auto-detection and protection, evidence-level weighting, ICD/CPT code preservation, HIPAA audit logging.

Engineering

Technical documentation, API specs, and code repositories benefit from version-aware refinement that respects semver and deprecation patterns.

result = client.refine(
    collection="api-docs-v3",
    template="engineering",
    config={
        "version_aware": True,
        "deprecation_policy": "demote",  # Demote deprecated APIs, don't delete
        "code_block_protection": True,
        "relink_by_module": True,
    }
)

Key features: Semver-aware conflict resolution, code block integrity, module-based re-linking, deprecation demotion.

LangChain Integration

Drop NodeRefine into your existing LangChain pipeline as a retriever wrapper.

from langchain.vectorstores import Pinecone
from noderefine.integrations import NodeRefineRetriever

# Wrap your existing vector store
vectorstore = Pinecone.from_existing_index("my-index", embeddings)

retriever = NodeRefineRetriever(
    vectorstore=vectorstore,
    api_key="nr_live_...",
    strategies=["dedup", "relink"],
    top_k=5
)

# Use as a standard LangChain retriever
docs = retriever.get_relevant_documents("What is our refund policy?")

LlamaIndex Integration

NodeRefine plugs into LlamaIndex as a node post-processor.

from llama_index.core import VectorStoreIndex
from noderefine.integrations import NodeRefinePostProcessor

index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(
    node_postprocessors=[
        NodeRefinePostProcessor(
            api_key="nr_live_...",
            strategies=["dedup", "prune"]
        )
    ]
)

response = query_engine.query("Summarize Q4 revenue trends")

Need Access or Help?

NodeRefine is invite-only during private beta. Request access to get your API credentials, or reach out if you need support.

Request Access Discord Community support@noderefine.com