Articles tagged with "genai"

Embedded Embeddings Database: Building a low cost serverless RAG solution

Retrieval-Augmented Generation (RAG) solutions are an impressive way to talk to one’s data. One of the challenges of RAG solutions is the associated cost, often driven by the vector database. In a previous blog article I presented how to tackle this issue by using Athena with Locality Sensitive Hashing (LSH) as a knowledge database. One of the main limitations of Athena is its latency and its low limit on concurrent queries. In this new blog article, I present a new low-cost serverless solution that uses SQLite as an embedded vector database, keeping costs low while maintaining high concurrency.
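To make the idea concrete, here is a minimal sketch of what an embedded SQLite vector store can look like; the schema, helper names, and brute-force cosine search are assumptions for illustration, not the article’s exact implementation:

```python
import sqlite3
import struct

# Assumed schema: one row per text chunk, embedding stored as a float32 blob.
conn = sqlite3.connect("knowledge.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)"
)

def to_blob(vec):
    # Pack a list of floats into a compact float32 blob for storage.
    return struct.pack(f"{len(vec)}f", *vec)

def from_blob(blob):
    # Unpack a float32 blob back into a tuple of floats.
    return struct.unpack(f"{len(blob) // 4}f", blob)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

def top_k(query_vec, k=3):
    # Brute-force nearest-neighbour scan; fine for corpora of modest size.
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vec, from_blob(emb)), text) for text, emb in rows]
    return sorted(scored, reverse=True)[:k]
```

A single-file database like this can ship directly inside a serverless function’s deployment package, which is plausibly where the cost and concurrency advantages come from.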

Who-Is-RAG?

We’ve used a gamified approach to showcase how Retrieval-Augmented Generation enables businesses to use Large Language Models in combination with their company data. Based on the popular board game Who-Is-It?, we built a demo.

Building a low cost serverless Retrieval-Augmented Generation (RAG) solution

Large language models (LLMs) can generate complex text and solve numerous tasks such as question-answering, information extraction, and text summarization. However, they may suffer from issues such as information gaps or hallucinations. In this blog article, we will explore how to mitigate these issues using Retrieval Augmented Generation (RAG) and build a low-cost solution in the process.
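As a rough, self-contained sketch of the RAG pattern itself (the `embed` and `generate` placeholders stand in for a real embedding model and LLM, which are not specified here):

```python
# Toy RAG flow: retrieve the most relevant chunks, then ground the prompt.

def embed(text: str) -> list[float]:
    # Placeholder embedding; a real solution would call an embedding model.
    return [float(ord(c) % 7) for c in text[:16].ljust(16)]

def generate(prompt: str) -> str:
    # Placeholder LLM call; a real solution would call a hosted model.
    return f"(answer grounded in a {len(prompt)}-character prompt)"

def answer(question: str, corpus: list[str], k: int = 2) -> str:
    q = embed(question)
    # Rank chunks by similarity to the question (dot product as a stand-in).
    ranked = sorted(corpus, key=lambda c: -sum(a * b for a, b in zip(q, embed(c))))
    context = "\n".join(ranked[:k])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

print(answer("What is our refund policy?", ["Refunds within 30 days.", "Shipping is free."]))
```

The retrieval step closes the information gap, and constraining the prompt to the retrieved context is what curbs hallucinations.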

Changing of the Guards - GenAI pattern to Bedrock service

10th of July: The ten new features announced at the AWS NY Summit show a trend in Amazon Bedrock: implementing prompt engineering patterns as services. One of the best practices to defend against prompt injection attacks is guardrails. Here, I do a deep dive into the new Guardrails features “contextual grounding filter” and “independent API to call your guardrails.” Note: Guardrails currently works with English ONLY.
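For orientation, a minimal sketch of calling the standalone guardrail API with boto3 might look like the following; the guardrail ID, version, and texts are placeholders, so verify the field names against the current boto3 documentation:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.apply_guardrail(
    guardrailIdentifier="my-guardrail-id",  # placeholder: your guardrail's ID
    guardrailVersion="1",
    source="OUTPUT",  # validate a model response; "INPUT" checks user prompts
    content=[
        # Qualifiers let the contextual grounding filter compare the answer
        # against its source document and the original question.
        {"text": {"text": "Retrieved source text ...", "qualifiers": ["grounding_source"]}},
        {"text": {"text": "The user's question ...", "qualifiers": ["query"]}},
        {"text": {"text": "The model's answer ...", "qualifiers": ["guard_content"]}},
    ],
)
print(response["action"])  # "GUARDRAIL_INTERVENED" or "NONE"
```

Because the API is independent of model invocation, the same guardrail can screen content from any model, not only Bedrock-hosted ones.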

RAG AI-LLM Databases on AWS: do not pay for oversized databases, go serverless instead

Retrieval Augmented Generation (RAG) is an approach to reducing hallucinations when using LLMs (Large Language Models). With RAG you need a storage solution, which in most cases is a vector store. When you are tasked with building the infrastructure for such a use case, you have to decide which database to use. Sometimes the best solution is not the biggest one; a smaller serverless solution may fit the use case better. In this post, I introduce some of the options and help you decide which one to choose.

GO-ing to production with Bedrock RAG Part 1

The way from a cool POC (proof of concept), like a walk in Monet’s garden, to a production-ready RAG (Retrieval Augmented Generation) application with Amazon Bedrock and Amazon Kendra is paved with some work. Let’s get our hands dirty. With Streamlit and LangChain, you can quickly build a cool POC; this two-part blog is about what comes after that.
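For context, a POC of that kind can be as small as the following sketch (the model ID and Kendra index ID are placeholders, and the article’s own code may differ):

```python
import streamlit as st
from langchain_aws import ChatBedrock
from langchain_aws.retrievers import AmazonKendraRetriever

# Placeholders: swap in your own model ID and Kendra index ID.
llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")
retriever = AmazonKendraRetriever(index_id="YOUR-KENDRA-INDEX-ID")

question = st.text_input("Ask a question about your documents")
if question:
    docs = retriever.invoke(question)
    context = "\n\n".join(d.page_content for d in docs)
    reply = llm.invoke(f"Context:\n{context}\n\nQuestion: {question}")
    st.write(reply.content)
```

Turning something like this into a production-ready service is the gap the two-part series addresses.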