| Field | Value |
| --- | --- |
| Name | knowledge-base-rag |
| Category | automation |
| Complexity | advanced |
| Tags | rag, embeddings, pgvector, google-drive, document-processing, llm-generation, nats-events, s3-storage |
| Author | randybias |
| Min Version | 0.1.0 |
Ingest documents from Google Drive, generate embeddings, store in pgvector, and answer questions with source citations. The ingest pipeline runs on a daily schedule; the query path is an independent entry point triggered manually. Uses OpenAI for embeddings and Claude for answer generation.
Ingest path (scheduled):
poll-drive → store-originals → extract-and-chunk → generate-embeddings → store-vectors
Query path (independent):
answer-query
| Node | Purpose |
| --- | --- |
| poll-drive | Poll Google Drive folders for new or updated documents |
| store-originals | Store original documents in S3 |
| extract-and-chunk | Extract text and split into chunks |
| generate-embeddings | Generate vector embeddings via OpenAI API |
| store-vectors | Store embeddings in pgvector |
| answer-query | Retrieve relevant chunks, generate answer with citations |
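To make the extract-and-chunk step more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: the function name `chunk_text`, the character-based window, and the `chunk_size` and `overlap` parameters are assumptions, and the scaffold's actual chunking strategy may differ.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    (Hypothetical helper; not taken from the scaffold source.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # otherwise a trailing chunk made only of overlap would be emitted.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades some storage and embedding cost for better recall at chunk boundaries; a token-based or sentence-aware splitter would be a reasonable alternative.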
manual: starts the query path on demand
cron: runs the ingest pipeline daily at 6:00 AM (0 6 * * *)
| Service | Type | Required |
| --- | --- | --- |
| Google Drive API | External | Yes |
| OpenAI API | External | Yes (embeddings) |
| Anthropic API | External | Yes (answer generation) |
| Slack webhook | External | Optional |
| tentacular-postgres | Exoskeleton | Yes (with pgvector extension) |
| tentacular-rustfs | Exoskeleton | Yes |
| tentacular-nats | Exoskeleton | Optional |
| Key | Default | Description |
| --- | --- | --- |
| timeout | 300s | Per-node timeout |
| retries | 1 | Retry count per node |
| drive_folder_ids | (placeholder) | Google Drive folder IDs to ingest |
| query | (empty) | Question to answer (for query path) |
| top_k | 5 | Number of nearest chunks to retrieve |
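As a rough illustration of what `top_k` controls, the sketch below ranks stored chunks by cosine similarity to a query embedding and keeps the best `top_k`. This is an in-memory stand-in, not the scaffold's code: in the real workflow the ranking would presumably happen inside pgvector, and the similarity metric, the `top_k_chunks` helper, and the `id`/`embedding` dictionary layout are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], chunks: list[dict], top_k: int = 5) -> list[dict]:
    """Return the top_k chunks most similar to query_vec, best match first.

    Each chunk is assumed to be a dict with an "embedding" list
    (hypothetical layout chosen for this example).
    """
    scored = [(cosine_similarity(query_vec, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


# Usage with toy 2-dimensional embeddings.
chunks = [
    {"id": "a", "embedding": [1.0, 0.0]},
    {"id": "b", "embedding": [0.0, 1.0]},
    {"id": "c", "embedding": [0.9, 0.1]},
]
nearest = top_k_chunks([1.0, 0.0], chunks, top_k=2)
```

A larger `top_k` gives the answer model more context to cite but also more irrelevant text and a longer prompt, so the default of 5 is a middle-ground starting point.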
google.access_token — Google API access token for Drive
openai.api_key — OpenAI API key for embedding generation
anthropic.api_key — Claude API key for answer generation
slack.webhook_url — Slack webhook for notifications (optional)
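One way to catch a missing credential before a run is a small pre-flight check over the secret names listed above. The sketch assumes the secrets have already been loaded into a nested dictionary that mirrors the dotted names (for example `{"openai": {"api_key": "..."}}`); how the scaffold actually loads secrets is not specified here, and `missing_secrets` is a hypothetical helper.

```python
# Secret names taken from the list above; Slack is optional, so it is not required.
REQUIRED_SECRETS = ("google.access_token", "openai.api_key", "anthropic.api_key")
OPTIONAL_SECRETS = ("slack.webhook_url",)


def missing_secrets(secrets: dict) -> list[str]:
    """Return the dotted names of required secrets that are absent or empty.

    `secrets` is assumed to be a nested dict keyed by section, then key
    (an illustrative layout, not the scaffold's documented format).
    """
    missing: list[str] = []
    for dotted in REQUIRED_SECRETS:
        section, key = dotted.split(".", 1)
        section_values = secrets.get(section) or {}
        if not section_values.get(key):
            missing.append(dotted)
    return missing
```

Calling `missing_secrets(loaded)` and aborting when the result is non-empty fails fast with a clear message, rather than partway through the ingest pipeline.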
tntc scaffold init knowledge-base-rag
tntc scaffold init knowledge-base-rag my-custom-name
tntc scaffold info knowledge-base-rag
Scaffold source: quickstarts/knowledge-base-rag/