Days 13-15: My LLMTwin Writes!
This article covers days 13-15 of my 75 Days Of Generative AI series. Today I’ll share the last part of the process, where I use RAG techniques to generate an article on the topic “The Future of AI in Content Creation”, written in my writing style as derived from my LinkedIn posts and Substack articles.
If you are interested in the previous articles, which cover the feature pipeline and the data pipeline, here are the links:
Generating an article from the LLM Twin - this article
Using Pinecone for Vector DB
Pinecone is another option for a vector database, and it is the one I ended up using here. The previous article mentioned Qdrant as the option, but I decided to move to Pinecone for the following reasons:
Much better integration with Langchain (our framework for RAG), both for adding data and for retrieval
The option of a free index with almost 1 GB of storage available
Pinecone has been designed with a strong focus on developer experience, incorporating user feedback to refine its features. I personally found it more developer-friendly.
Qdrant is no doubt one of the best vector DBs out there, but my personal choice while working and learning was Pinecone.
Setting up an index in Pinecone
The process is very simple.
Just sign up and start with the free plan
There is a clear option to create a new index, where you can specify the name, the dimension of your vector embeddings, the capacity mode, and the cloud vendor. The dimension is determined by the embedding model you are using.
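If you prefer to create the index from code rather than the console, here is a minimal sketch using the Pinecone Python client. The dimension of 768 matches the default HuggingFaceEmbeddings model used later in this article; the cloud and region are just example values.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="pinecone_api_key")
# 768 dimensions matches the default HuggingFaceEmbeddings model
# (sentence-transformers/all-mpnet-base-v2)
pc.create_index(
    name="llmtwin",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # example cloud/region
)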
Updating the data pipeline to add vectors to Pinecone instead
Using Pinecone instead of Qdrant was very straightforward. Replace the Qdrant connector mentioned there with the following:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone

def store_document(document):
    # Create HuggingFace embeddings (defaults to sentence-transformers/all-mpnet-base-v2)
    embeddings = HuggingFaceEmbeddings()
    # Initialize Pinecone with your API key and connect to the "llmtwin" index
    pc = Pinecone(api_key="pinecone_api_key")
    index = pc.Index("llmtwin")
    # Create a PineconeVectorStore backed by that index
    vector_store = PineconeVectorStore(index=index, embedding=embeddings)
    # Split the document into 1,000-character chunks with no overlap
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents([document])
    # Embed the chunks and upsert them into the index
    vector_store.add_documents(docs)
    return vector_store
As mentioned earlier, it’s fairly straightforward to add data to Pinecone thanks to its Langchain integration, including utilities like CharacterTextSplitter.
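As a quick illustration, here is how you might call it with a Langchain Document (the sample content and metadata below are made up):
from langchain_core.documents import Document

# Hypothetical sample document standing in for a scraped LinkedIn post
doc = Document(
    page_content="My latest LinkedIn post about building an LLM Twin...",
    metadata={"source": "linkedin"},
)
store_document(doc)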
Writing an article with my writing style
Now the only part left is to fetch this stored data and use it in a RAG pipeline to generate an LLM-written article that doesn’t sound robotic!
Retrieving the stored embedding data
We start by creating a retriever object from Langchain, using the same embeddings and pulling the data from our index “llmtwin”:
from langchain_community.embeddings import HuggingFaceEmbeddings
from pinecone import Pinecone
from langchain_pinecone import PineconeVectorStore

# Use the same embeddings that were used when storing the documents
embeddings = HuggingFaceEmbeddings()
# Initialize Pinecone with your API key and connect to the index
pc = Pinecone(api_key="pinecone_api_key")
index = pc.Index("llmtwin")
# Create a PineconeVectorStore and expose it as a retriever
vector_store = PineconeVectorStore(index=index, embedding=embeddings)
retriever = vector_store.as_retriever()
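As a quick sanity check, you can ask the retriever for the stored chunks most similar to a query (the query string below is just an example):
# Fetch the stored chunks most similar to a sample query
docs = retriever.get_relevant_documents("AI in content creation")
for doc in docs:
    print(doc.page_content[:200])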
Using Groq as the source of our LLM
Due to a lack of computational resources, this time I decided to go the API route. The most straightforward way is to use the OpenAI APIs, but they are paid. The next best option, which I ended up using, is the Groq API. Not only is it free, but it supports the latest Llama 3.1 models and is blazing fast!
Again, the integration with Langchain is straightforward here:
import os
os.environ["GROQ_API_KEY"] = "groq_api_key"

from langchain_groq import ChatGroq

# temperature=0 keeps the output deterministic
llm = ChatGroq(temperature=0, model_name="llama-3.1-70b-versatile")
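A quick sanity check (with an arbitrary prompt) confirms the model responds:
# Simple one-off call to verify the Groq connection works
print(llm.invoke("Say hello in one sentence.").content)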
Building the RetrievalQA chain
Chains are one of the most important concepts in Langchain: they are sequences of operations that process and transform data step by step to accomplish a specific task in a natural language processing application.
Here is an example of our chain:
from langchain.chains import RetrievalQA

# Create the RAG chain; chain_type="stuff" simply stuffs all
# retrieved chunks into the prompt as context
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)
Prompting to generate the article
We craft a detailed prompt that instructs the model to generate an article based on the writing style of the retrieved documents. The RAG chain processes this prompt, retrieving relevant style examples from Pinecone and using them to inform the language model's generation of a new article.
The generated article is then printed, reflecting the writing style learned from the LinkedIn posts and Substack articles stored in Pinecone.
# Prompt for article generation
article_prompt = """
Based on the writing style of the provided LinkedIn posts and Substack articles, write a new article on the topic of 'The Future of AI in Content Creation'.
Ensure the article:
1. Matches the tone and style of the reference content
2. Is approximately 500 words long
3. Includes an engaging introduction, 3-4 main points, and a conclusion
4. Uses examples or anecdotes similar to those in the reference content
Article:
"""
# Generate the article
response = rag_chain({"query": article_prompt})
# Print the generated article
print(response['result'])
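Since we set return_source_documents=True, the response also carries the retrieved chunks, which is handy for verifying which style examples actually informed the generation:
# Inspect the chunks the retriever pulled in for this query
for doc in response['source_documents']:
    print(doc.metadata, doc.page_content[:100])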
Output
I’ll just put the output here and compare it with the same prompt and the same LLM generating output without RAG. It’s up to you to judge which one sounds more human!
Excerpt from RAG generated article
As I sit down to write this article, I'm reminded of the numerous times I've been asked, "Will AI replace human writers?" My response has always been, "Not quite." While AI has made tremendous progress in generating content, its true potential lies in augmenting human creativity, not replacing it. In this article, we'll explore the future of AI in content creation and how it's poised to revolutionize the way we produce and consume content.
Excerpt from generic LLM
In the ever-evolving landscape of digital media, artificial intelligence (AI) is emerging as a powerful ally for content creators. As we stand on the brink of a new era, AI's potential to transform content creation is both exciting and profound. From streamlining workflows to enhancing creativity, let's explore how AI is reshaping the future of content creation.
Bonus: Observability using LangSmith
Although we will be delving deeper into observability in upcoming articles, I just wanted to give a flavor of it here. LangSmith is a tool that helps developers build, test, and optimize natural language processing applications efficiently. It’s an amazing way to visualise your whole RAG chain, which can clearly help you understand what’s happening under the hood.
Getting started
Create an account here
Create an API key and then create a new project
Set the following environment variables before you initialize your Langchain LLM
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY="<your-api-key>"
LANGCHAIN_PROJECT="pr-elderly-temporary-39"
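If you'd rather configure this from Python, you can set the same variables in code before creating the LLM and the chain (the project name is the auto-generated one from my setup):
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "pr-elderly-temporary-39"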
Here is an example of output for my RAG chain for LLMTwin
There is a lot more that can be done here which I will cover soon!
Conclusion
We’ve finally built an end-to-end app using the power of LLMs! Although it’s something very basic, it’s still very powerful.
Next we’ll take our journey up a notch: we will be building a production-grade personal finance app. So stay tuned!