Karpathy shares 'LLM Knowledge Base' architecture that bypasses RAG with an evolving markdown library maintained by AI
Key Points:
- Andrej Karpathy has introduced a novel "LLM Knowledge Base" approach that uses large language models (LLMs) to maintain and actively manage research knowledge in human-readable Markdown files, addressing the common AI problem of context loss across sessions.
- Unlike traditional Retrieval-Augmented Generation (RAG) systems, which rely on vector databases and embeddings, Karpathy's method compiles raw data into structured wikis with summaries, backlinks, and continuous linting, creating a self-healing and auditable knowledge base.
- This approach offers significant enterprise potential by transforming unstructured corporate data into dynamic, real-time "Company Bibles," enabling knowledge synthesis that goes beyond simple search.
- The system favors a "file-over-app" philosophy, leveraging open Markdown files managed locally (e.g., with Obsidian), ensuring data ownership, transparency, and future-proofing compared to SaaS-dependent vector DB solutions.
- Looking ahead, Karpathy envisions evolving the knowledge base into synthetic data for fine-tuning smaller, personalized LLMs, effectively creating custom AI models that internalize a researcher’s or organization’s unique knowledge assets.
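The "continuous linting" of backlinks described above can be sketched in a few lines. This is an illustrative stand-in, not Karpathy's actual tooling: it assumes Obsidian-style `[[wiki-links]]` and simply reports links that do not resolve to an existing note, so an LLM (or human) can repair them.

```python
import re
from pathlib import Path

# Obsidian-style [[Note Name]] links, possibly with |alias or #heading suffixes
WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def lint_knowledge_base(root: Path) -> list[str]:
    """Flag broken wiki-links in a folder of Markdown notes.

    A minimal sketch of the 'continuous linting' idea: every [[link]]
    must resolve to an existing .md note under `root`.
    """
    notes = {p.stem for p in root.glob("**/*.md")}
    problems = []
    for path in sorted(root.glob("**/*.md")):
        for target in WIKI_LINK.findall(path.read_text(encoding="utf-8")):
            # Strip alias ("|") and heading ("#") suffixes before resolving
            name = target.split("|")[0].split("#")[0].strip()
            if name not in notes:
                problems.append(f"{path.name}: broken link to [[{name}]]")
    return problems
```

Run periodically (or on every LLM edit), such a check keeps the Markdown wiki "self-healing": broken references surface as plain text diagnostics the model can act on.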