{"id":53959,"date":"2025-09-25T16:35:24","date_gmt":"2025-09-25T06:35:24","guid":{"rendered":"https:\/\/www.cloudproinc.com.au\/?p=53959"},"modified":"2025-09-25T16:35:27","modified_gmt":"2025-09-25T06:35:27","slug":"supercharge-langchain-apps-with-an-llm-cache","status":"publish","type":"post","link":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/","title":{"rendered":"Supercharge LangChain apps with an LLM Cache"},"content":{"rendered":"\n<p>In this post we show how to make LangChain applications faster, cheaper, and more reliable by caching LLM outputs.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>An LLM cache rests on one idea: do not recompute answers you already paid for. Caching turns repeat prompts into instant responses, smoothing traffic spikes and protecting your budget. You will learn what is happening under the hood, when to use (and avoid) caching, and how to deploy it from laptop to production.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-llm-caching-matters\">Why LLM caching matters<\/h2>\n\n\n\n<p><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/category\/llm\/\">LLM <\/a>calls are slow compared to memory or network cache lookups, and they cost money. Many workloads repeat the same prompts: unit tests, evaluations, deterministic pipelines, or user flows with minor variations. A cache cuts latency from seconds to milliseconds and eliminates duplicate spend. 
As a bonus, it reduces provider rate-limit pressure and improves perceived reliability when an upstream API blips.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-langchain-caching-works-under-the-hood\">How LangChain caching works under the hood<\/h2>\n\n\n\n<p><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/category\/langchain\/\">LangChain <\/a>ships a pluggable LLM cache that sits behind its model interfaces. When you call an LLM or ChatModel, LangChain computes a cache key that includes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The serialized model and parameters (e.g., model name, temperature, tools)<\/li>\n\n\n\n<li>The full prompt (or message list) after formatting<\/li>\n<\/ul>\n\n\n\n<p>If the key exists, LangChain returns the stored generations. If not, it calls the provider and stores the result for future hits. Backends range from in-memory (fast, ephemeral) to SQLite (local persistence) to Redis (shared, production-grade). There are also semantic caches that use embeddings to match \u201csimilar\u201d prompts, not just exact strings.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-when-to-cache-and-when-not-to\">When to cache and when not to<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Cache: evaluation runs, prompt-engineering loops, deterministic chains, knowledge-base queries that change slowly, and expensive multi-step workflows.<\/li>\n\n\n\n<li>Be careful: prompts with real-time data (dates, stock prices), user-personalized content, or prompts where the latest context changes the answer.<\/li>\n\n\n\n<li>Mitigate staleness: set TTLs, include cache-busting context (e.g., content version), or use semantic cache with conservative thresholds.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-quick-start-with-an-in-memory-cache\">Quick start with an in-memory cache<\/h2>\n\n\n\n<p>Great for local development and tests.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install langchain 
langchain-openai\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-f2ad9df9fc6575a924e376c37499880d\"><code>import os\nfrom langchain.globals import set_llm_cache\nfrom langchain.cache import InMemoryCache\nfrom langchain_openai import ChatOpenAI\n\nos.environ&#91;\"OPENAI_API_KEY\"] = \"&lt;your-key&gt;\"\n\n# Enable global LLM cache\nset_llm_cache(InMemoryCache())\n\nllm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)\n\nresp1 = llm.invoke(\"Explain vector databases in one sentence.\")\nresp2 = llm.invoke(\"Explain vector databases in one sentence.\")  # served from cache\n\nprint(resp1.content)\n<\/code><\/pre>\n\n\n\n<p>Note: the cache key includes the model and parameters. If you change temperature or system prompts, you will get a new entry.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-persist-results-with-sqlite\">Persist results with SQLite<\/h2>\n\n\n\n<p>Use this when you want responses to survive restarts or share across small teams via a file.<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-57910695c8b8e8be8c1f19ba4df17674\"><code>from langchain.globals import set_llm_cache\nfrom langchain.cache import SQLiteCache\n\nset_llm_cache(SQLiteCache(database_path=\".langchain_cache.db\"))\n<\/code><\/pre>\n\n\n\n<p>SQLite is simple and reliable. 
Store the database with your experiment artifacts for reproducibility.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-scale-out-with-redis-in-production\">Scale out with Redis in production<\/h2>\n\n\n\n<p>Redis gives you a shared cache across app servers, eviction policies, metrics, and high availability.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install redis\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-3509b701274fa72cca184031a9eddda2\"><code>from redis import Redis\nfrom langchain.globals import set_llm_cache\nfrom langchain.cache import RedisCache\n\nredis_client = Redis(host=\"localhost\", port=6379, db=0)\nset_llm_cache(RedisCache(redis_client))\n<\/code><\/pre>\n\n\n\n<p>Tip: use a dedicated database or key prefix per environment (dev\/stage\/prod) to avoid accidental cross-talk.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-go-beyond-exact-matches-with-semantic-caching\">Go beyond exact matches with semantic caching<\/h2>\n\n\n\n<p>Exact caching misses when prompts differ by even small wording changes. 
A semantic cache stores an embedding of each prompt and returns a prior answer when a new prompt\u2019s embedding is \u201cclose enough.\u201d This is powerful for chat UX and search-style prompts.<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-316bf006579aabaa4e6cdc1927ee1d17\"><code>pip install redis\npip install langchain-openai\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-86e31b2d1c88e49b24d927eee6a2bdff\"><code>from langchain_openai import OpenAIEmbeddings\nfrom langchain.cache import RedisSemanticCache\nfrom langchain.globals import set_llm_cache\n\nemb = OpenAIEmbeddings(model=\"text-embedding-3-small\")\nset_llm_cache(\n    RedisSemanticCache(\n        redis_url=\"redis:\/\/localhost:6379\/0\",\n        embedding=emb,\n        score_threshold=0.85  # stricter for higher precision\n    )\n)\n<\/code><\/pre>\n\n\n\n<p>Choose a threshold carefully. Start strict (e.g., 0.85\u20130.9 cosine similarity) and loosen only if evaluation shows you are missing matches that should have hit.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-bypass-or-invalidate-the-cache\">Bypass or invalidate the cache<\/h2>\n\n\n\n<p>Sometimes you want a fresh answer even if a cache entry exists.<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-cc0b24fd570874c627ecfc323989e311\"><code>from langchain.globals import get_llm_cache, set_llm_cache\n\n# Temporarily disable the cache for a single call\nprev = get_llm_cache()\nset_llm_cache(None)\ntry:\n    fresh = llm.invoke(\"Explain vector databases in one sentence.\")\nfinally:\n    set_llm_cache(prev)\n\n# Clear the cache (supported by most backends)\ncache = get_llm_cache()\nif hasattr(cache, \"clear\"):\n    cache.clear()\n<\/code><\/pre>\n\n\n\n<p>For SQLite, you can also delete the database file. 
For Redis, consider expiring keys or selective deletion by prefix.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-use-caching-in-lcel-chains\">Use caching in LCEL chains<\/h2>\n\n\n\n<p>Caching works seamlessly inside LangChain Expression Language pipelines.<\/p>\n\n\n\n<pre class=\"wp-block-code has-white-color has-black-background-color has-text-color has-background has-link-color wp-elements-f149bf46fc9bdd674891045bfbf7a8fb\"><code>from langchain_core.prompts import ChatPromptTemplate\nfrom langchain_core.output_parsers import StrOutputParser\nfrom langchain_openai import ChatOpenAI\nfrom langchain.globals import set_llm_cache\nfrom langchain.cache import SQLiteCache\n\nset_llm_cache(SQLiteCache(\".langchain_cache.db\"))\n\nprompt = ChatPromptTemplate.from_messages(&#91;\n    (\"system\", \"You are a concise assistant.\"),\n    (\"human\", \"Summarize: {text}\")\n])\n\nllm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)\nchain = prompt | llm | StrOutputParser()\n\nout1 = chain.invoke({\"text\": \"LLM caching reduces repeated computation and cost.\"})\nout2 = chain.invoke({\"text\": \"LLM caching reduces repeated computation and cost.\"})  # cached\nprint(out1)\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-operational-tips\">Operational tips<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keys and versions: include content version, tenant, and prompt template version in your prompts to control cache scope.<\/li>\n\n\n\n<li>TTLs: for Redis, set expiration where appropriate to avoid staleness and unbounded growth.<\/li>\n\n\n\n<li>Observability: track cache hit rate. 
For Redis, expose INFO stats and keyspace metrics; for SQLite, log cache hits in your application.<\/li>\n\n\n\n<li>Warmups: prefill the cache after deploy by replaying common prompts to avoid cold-start latency for users.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-security-and-compliance\">Security and compliance<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Do not cache secrets, PII, or raw user inputs if policy prohibits persistence. Use in-memory cache or encryption-at-rest where required.<\/li>\n\n\n\n<li>Segment caches by customer or environment to prevent data leakage.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-common-pitfalls\">Common pitfalls<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Stale data: set TTLs and embed context versioning.<\/li>\n\n\n\n<li>Hidden misses: whitespace or minor prompt differences cause misses; normalize prompts and prefer templates.<\/li>\n\n\n\n<li>Over-caching: avoid caching queries that depend on time or mutable state.<\/li>\n\n\n\n<li>High variance prompts: with high temperature or randomness, cached outputs may not represent expected variability; consider caching only deterministic steps.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-backends-at-a-glance\">Backends at a glance<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>InMemoryCache: fastest, process-local, great for tests.<\/li>\n\n\n\n<li>SQLiteCache: single-file persistence, simple ops, good for laptops and CI.<\/li>\n\n\n\n<li>RedisCache: shared, scalable, supports TTLs and ops tooling.<\/li>\n\n\n\n<li>RedisSemanticCache: fuzzy matching with embeddings for prompt variants.<\/li>\n\n\n\n<li>Other integrations: GPTCache and vendor caches can be swapped in if they implement LangChain\u2019s cache interface.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>LLM caching is a low-effort, high-impact optimization for LangChain apps. 
Start with in-memory during development, move to SQLite for reproducibility, and adopt Redis (exact or semantic) in production. With careful scoping, TTLs, and observability, you will cut latency, trim spend, and improve reliability without changing your application logic.<\/p>\n\n\n\n<ul class=\"wp-block-yoast-seo-related-links yoast-seo-related-links\">\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/20\/build-data-driven-apps-with-streamlit\/\">Build Data Driven Apps With Streamlit<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/15\/what-are-cypher-queries\/\">What Are Cypher Queries<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/running-prompts-with-langchain\/\">Running Prompts with LangChain<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/langchain-architecture-explained\/\">LangChain Architecture Explained<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/document-definition-in-langchain\/\">Document Definition in LangChain<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Cut latency and costs by caching LLM outputs in LangChain. 
Learn what to cache, when not to, and how to ship in-memory, SQLite, and Redis caches.<\/p>\n","protected":false},"author":1,"featured_media":53961,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_yoast_wpseo_focuskw":"Supercharge LangChain apps with an LLM Cache","_yoast_wpseo_title":"","_yoast_wpseo_metadesc":"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.","_yoast_wpseo_opengraph-title":"","_yoast_wpseo_opengraph-description":"","_yoast_wpseo_twitter-title":"","_yoast_wpseo_twitter-description":"","_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[13,94,77],"tags":[],"class_list":["post-53959","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","category-langchain","category-llm"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Supercharge LangChain apps with an LLM Cache - CPI Consulting<\/title>\n<meta name=\"description\" content=\"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Supercharge LangChain apps with an LLM Cache\" \/>\n<meta property=\"og:description\" content=\"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.\" 
\/>\n<meta property=\"og:url\" content=\"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/\" \/>\n<meta property=\"og:site_name\" content=\"CPI Consulting\" \/>\n<meta property=\"article:published_time\" content=\"2025-09-25T06:35:24+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-09-25T06:35:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.cloudproinc.com.au\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost-1024x683.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"683\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"CPI Staff\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"CPI Staff\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/\"},\"author\":{\"name\":\"CPI Staff\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/person\\\/192eeeb0ce91062126ce3822ae88fe6e\"},\"headline\":\"Supercharge LangChain apps with an LLM Cache\",\"datePublished\":\"2025-09-25T06:35:24+00:00\",\"dateModified\":\"2025-09-25T06:35:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/\"},\"wordCount\":831,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#primaryimage\"},\"thumbnailUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png\",\"articleSection\":[\"Blog\",\"LangChain\",\"LLM\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/\",\"url\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/\",\"name\":\"Su
percharge LangChain apps with an LLM Cache - CPI Consulting\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#primaryimage\"},\"thumbnailUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png\",\"datePublished\":\"2025-09-25T06:35:24+00:00\",\"dateModified\":\"2025-09-25T06:35:27+00:00\",\"description\":\"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#primaryimage\",\"url\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png\",\"contentUrl\":\"\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png\",\"width\":1536,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/2025\\\/09\\\/25\\\/supercharge-langchain-apps-with-an-llm-cache\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/cloudproinc.com.au\\\/
\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Supercharge LangChain apps with an LLM Cache\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#website\",\"url\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/\",\"name\":\"Cloud Pro Inc - CPI Consulting Pty Ltd\",\"description\":\"Cloud, AI &amp; Cybersecurity Consulting | Melbourne\",\"publisher\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#organization\",\"name\":\"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd\",\"url\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/favfinalfile.png\",\"contentUrl\":\"\\\/wp-content\\\/uploads\\\/2022\\\/01\\\/favfinalfile.png\",\"width\":500,\"height\":500,\"caption\":\"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd\"},\"image\":{\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/cloudproinc.azurewebsites.net\\\/#\\\/schema\\\/person\\\/192eeeb0ce91062126ce3822ae88fe6e\",\"name\":\"CPI 
Staff\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g\",\"caption\":\"CPI Staff\"},\"sameAs\":[\"http:\\\/\\\/www.cloudproinc.com.au\"],\"url\":\"https:\\\/\\\/www.cloudproinc.com.au\\\/index.php\\\/author\\\/cpiadmin\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Supercharge LangChain apps with an LLM Cache - CPI Consulting","description":"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/","og_locale":"en_US","og_type":"article","og_title":"Supercharge LangChain apps with an LLM Cache","og_description":"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.","og_url":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/","og_site_name":"CPI Consulting","article_published_time":"2025-09-25T06:35:24+00:00","article_modified_time":"2025-09-25T06:35:27+00:00","og_image":[{"width":1024,"height":683,"url":"https:\/\/www.cloudproinc.com.au\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost-1024x683.png","type":"image\/png"}],"author":"CPI 
Staff","twitter_card":"summary_large_image","twitter_misc":{"Written by":"CPI Staff","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#article","isPartOf":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/"},"author":{"name":"CPI Staff","@id":"https:\/\/cloudproinc.azurewebsites.net\/#\/schema\/person\/192eeeb0ce91062126ce3822ae88fe6e"},"headline":"Supercharge LangChain apps with an LLM Cache","datePublished":"2025-09-25T06:35:24+00:00","dateModified":"2025-09-25T06:35:27+00:00","mainEntityOfPage":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/"},"wordCount":831,"commentCount":0,"publisher":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#organization"},"image":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png","articleSection":["Blog","LangChain","LLM"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/","url":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/","name":"Supercharge LangChain apps with an LLM Cache - CPI 
Consulting","isPartOf":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#primaryimage"},"image":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png","datePublished":"2025-09-25T06:35:24+00:00","dateModified":"2025-09-25T06:35:27+00:00","description":"Learn how to supercharge LangChain apps with an LLM cache for enhanced speed and reduced costs in your applications.","breadcrumb":{"@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#primaryimage","url":"\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png","contentUrl":"\/wp-content\/uploads\/2025\/09\/supercharge-langchain-apps-with-an-llm-cache-for-speed-and-cost.png","width":1536,"height":1024},{"@type":"BreadcrumbList","@id":"https:\/\/www.cloudproinc.com.au\/index.php\/2025\/09\/25\/supercharge-langchain-apps-with-an-llm-cache\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/cloudproinc.com.au\/"},{"@type":"ListItem","position":2,"name":"Supercharge LangChain apps with an LLM Cache"}]},{"@type":"WebSite","@id":"https:\/\/cloudproinc.azurewebsites.net\/#website","url":"https:\/\/cloudproinc.azurewebsites.net\/","name":"Cloud Pro Inc - CPI Consulting Pty 
Ltd","description":"Cloud, AI &amp; Cybersecurity Consulting | Melbourne","publisher":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/cloudproinc.azurewebsites.net\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/cloudproinc.azurewebsites.net\/#organization","name":"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd","url":"https:\/\/cloudproinc.azurewebsites.net\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/cloudproinc.azurewebsites.net\/#\/schema\/logo\/image\/","url":"\/wp-content\/uploads\/2022\/01\/favfinalfile.png","contentUrl":"\/wp-content\/uploads\/2022\/01\/favfinalfile.png","width":500,"height":500,"caption":"Cloud Pro Inc - Cloud Pro Inc - CPI Consulting Pty Ltd"},"image":{"@id":"https:\/\/cloudproinc.azurewebsites.net\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/cloudproinc.azurewebsites.net\/#\/schema\/person\/192eeeb0ce91062126ce3822ae88fe6e","name":"CPI Staff","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/2d96eeb53b791d92c8c50dd667e3beec92c93253bb6ff21c02cfa8ca73665c70?s=96&d=mm&r=g","caption":"CPI 