LLM | tecRacer Amazon AWS Blog

23 Oct '24

Embedded Embeddings Database: Building a low cost serverless RAG solution

Retrieval-Augmented Generation (RAG) solutions are an impressive way to talk to one’s data. One of the challenges of RAG solutions is the associated cost, often driven by the vector database. In a previous blog article I presented how to tackle this issue by using Athena with Locality Sensitive Hashing (LSH) as a knowledge database. One the of the main limitations with Athena is the latency and the low number of concurrent queries. In this new blog article, I present a new low-cost serverless solution that makes use of an embedded vector database, SQLite, to achieve a low cost while maintaining high concurrency.

Read Blog

21 Aug '24

Improving Accessibility by Generating Image-alt texts using GenAI

Written by Maurice Borgmeier

In this article, we’ll be using GenAI to generate alternative texts for images in Markdown documents, which will help people relying on screen readers to access your content.

Read Blog

09 Aug '24

Building a low cost serverless Retrieval-Augmented Generation (RAG) solution

Written by Franck Awounang Nekdem

Large language models (LLMs) can generate complex text and solve numerous tasks such as question-answering, information extraction, and text summarization. However, they may suffer from issues such as information gaps or hallucinations. In this blog article, we will explore how to mitigate these issues using Retrieval Augmented Generation (RAG) and build a low-cost solution in the process.

Read Blog

03 Apr '24

RAG AI-LLM Databases on AWS: do not pay for oversized, go Serverless instead

Written by Gernot Glawe

The RAG - Retrieval Augmented Generation is an approach to reduce hallucination using LLMs (Large Language Models). With RAG you need a storage solution, which is a vector-store in most cases. When you have the task to build the infrastructure for such a use case, you have to decide which database to use. Sometimes, the best solution is not the biggest one. Then you should go serverless to a smaller solution, which fits the use-case better. In this post, I introduce some of the solutions and aid you in deciding which one to choose.

Read Blog

11 Jan '24

GO-ing to production with Bedrock RAG Part 2: Develop, Deploy and Test the RAG Backend with SAM&Postman

Written by Gernot Glawe

In part one, we took the journey from a POC monolith to a scaleable two-tier architecture. The focus is on the DevOps KPI deployment time and the testability. With the right tools - AWS SAM and Postman - the dirty work becomes a nice walk in the garden again. See what a KEBEG stack can achieve!

Read Blog

10 Dec '23

GO-ing to production with Bedrock RAG Part 1

Written by Gernot Glawe

The way from a cool POC (proof of concept), like a walk in monets garden, to a production-ready application for an RAG (Retrieval Augmented Generation) application with Amazon Bedrock and Amazon Kendra is paved with some work. Let`s get our hands dirty. With streamlit and langchain, you can quickly build a cool POC. This two-part blog is about what comes after that.

Read Blog

22 Oct '23

Climb the (bed)rock with Python, Javascript and GO

Written by Gernot Glawe

Bedrock is now available in eu-central-1. It’s time to get real and use it in applications. Reading all blog posts about Bedrock, you might get the impression that Python and LangChain is the only way to do it. Quite the opposite! As Bedrock makes calling the models available as AWS API, all AWS SDKs are supported! This post shows how to use Bedrock with Python, Javascript and GO.

Read Blog

20 Sep '23

Stop LLM/GenAI hallucination fast: Serverless Kendra RAG with GO

Written by Gernot Glawe

RAG is a way to approach the “hallucination” problem with LLM: A contextual reference increases the accuracy of the answers. Do you want to use RAG (Retrieval Augmented Generation) in production? The Python langchain library may be too slow for your production services. So what about serverless RAG in fast GO Lambda?

Read Blog

Articles tagged with "LLM"