Articles tagged with "LLM"

Embedded Embeddings Database: Building a low cost serverless RAG solution

Retrieval-Augmented Generation (RAG) solutions are an impressive way to talk to one’s data. One of the challenges of RAG solutions is the associated cost, often driven by the vector database. In a previous blog article I presented how to tackle this issue by using Athena with Locality Sensitive Hashing (LSH) as a knowledge database. One the of the main limitations with Athena is the latency and the low number of concurrent queries. In this new blog article, I present a new low-cost serverless solution that makes use of an embedded vector database, SQLite, to achieve a low cost while maintaining high concurrency.

Building a low cost serverless Retrieval-Augmented Generation (RAG) solution

Large language models (LLMs) can generate complex text and solve numerous tasks such as question-answering, information extraction, and text summarization. However, they may suffer from issues such as information gaps or hallucinations. In this blog article, we will explore how to mitigate these issues using Retrieval Augmented Generation (RAG) and build a low-cost solution in the process.

RAG AI-LLM Databases on AWS: do not pay for oversized, go Serverless instead

The RAG - Retrieval Augmented Generation is an approach to reduce hallucination using LLMs (Large Language Models). With RAG you need a storage solution, which is a vector-store in most cases. When you have the task to build the infrastructure for such a use case, you have to decide which database to use. Sometimes, the best solution is not the biggest one. Then you should go serverless to a smaller solution, which fits the use-case better. In this post, I introduce some of the solutions and aid you in deciding which one to choose.

GO-ing to production with Bedrock RAG Part 1

The way from a cool POC (proof of concept), like a walk in monets garden, to a production-ready application for an RAG (Retrieval Augmented Generation) application with Amazon Bedrock and Amazon Kendra is paved with some work. Let`s get our hands dirty. With streamlit and langchain, you can quickly build a cool POC. This two-part blog is about what comes after that.

Climb the (bed)rock with Python, Javascript and GO

Bedrock is now available in eu-central-1. It’s time to get real and use it in applications. Reading all blog posts about Bedrock, you might get the impression that Python and LangChain is the only way to do it. Quite the opposite! As Bedrock makes calling the models available as AWS API, all AWS SDKs are supported! This post shows how to use Bedrock with Python, Javascript and GO.