Resume Automation 1/2
I have been wanting to play with LLMs for a while now, but I hadn't found a good project to use them on. Then I remembered how repetitive it is to adapt my resume and motivation letter to each job description. What if I could build a system that loads job offers and adapts a CV and cover letter template accordingly?
LLMs are particularly suited for such a task, thanks to both their interpretative abilities and their content generation. Their potential is now well known through renowned applications such as OpenAI's ChatGPT, Anthropic's Claude, or Meta's Llama, which now serve as the base for countless applications ⇒
https://theresanaiforthat.com/.
At the time of writing (2024), LLMs, and GenAI more generally, have already disrupted several industries and, as they keep improving, are threatening entire job categories. LLMs can code entire applications with scary accuracy (especially in Python).

https://evalplus.github.io/leaderboard.html
So, since I am soon obsolete and out of a job, a resume automation system might come in handy.
RAG
General LLMs are good at producing content but ultimately do not know anything about me. I first need to feed information about me and my professional experience to the models, so that when it comes to hand-picking the most suitable skills and experiences for a specific job description, the system knows where to find them.
For that, I could paste my CV into the prompt as plain text and then pass in a query to extract the relevant information. This would work, but the most common approach in the field is RAG (Retrieval-Augmented Generation).
RAG combines two key AI abilities: finding information and creating responses. Think of it like a smart assistant who first looks up facts in a huge library (retrieval), then uses those facts to give you a well-informed answer (generation). Unlike a plain LLM that answers from its training data alone, RAG blends both skills to provide more accurate and relevant responses.

RAG with LangChain
LangChain is a software framework that facilitates the integration of large language models into applications. The framework is open-source and has Python and JavaScript clients.
RAG always follows the same steps: load the source documents, split them into chunks, embed and store the chunks in a vector database, then retrieve the most relevant chunks for a query and generate an answer from them.
LangChain has an extensive number of LLMs you can plug in, including OpenAI, Aphrodite, and MistralAI models, among others (full list here).
However, most of these require a paid API, hence my choice to turn toward a self-hosted LLM.
I turned to Ollama, a tool that allows you to run open-source large language models locally.
First, follow these instructions to set up and run a local Ollama instance:
- Download and install Ollama on one of the supported platforms (including Windows Subsystem for Linux).
- Fetch an available LLM model via ollama pull, e.g., ollama pull llama3 (view the list of available models in the model library).
- This will download the default tagged version of the model. Typically, the default points to the latest, smallest-sized parameter model.
- On Linux (or WSL), the models will be stored at /usr/share/ollama/.ollama/models.
- Specify the exact version of a model of interest like so: ollama pull vicuna:13b-v1.5-16k-q4_0 (view the various tags for the Vicuna model in this instance).
- To view all pulled models, use ollama list.
- To chat directly with a model from the command line, use ollama run.
- View the Ollama documentation for more commands, or run ollama help in the terminal to see the available commands.
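With the model pulled, a quick sanity check from Python (a minimal sketch using the langchain_ollama integration that the rest of this post relies on; the prompt is just an example) confirms that the local model responds:
# Minimal check that the locally served model answers
from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.1:8b")  # assumes llama3.1:8b was pulled with ollama pull
print(llm.invoke("Introduce yourself in one sentence.").content)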
data_directory = "rag_docs"  # folder containing the PDFs to index
db_path = "chroma"  # where the Chroma vector store is persisted
model_name = "llama3.1:8b"  # Ollama model used for the embeddings
Load
Here we want to load the several PDFs present in the rag_docs folder. Those PDFs are multiple CVs and motivation letters I have written, as well as a PDF export of my LinkedIn profile.
from langchain_community.document_loaders import PyPDFLoader
import os

docs = []
for file in os.listdir(data_directory):
    if file.endswith('.pdf'):
        pdf_path = os.path.join(data_directory, file)
        loader = PyPDFLoader(pdf_path)
        docs.extend(loader.load())
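A quick look at what was loaded (purely a sanity check): PyPDFLoader produces one Document per page, with the source file recorded in the metadata.
print(f"Loaded {len(docs)} pages from {data_directory}")
print(docs[0].metadata)  # e.g. source file and page number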
Split
Text splitters break large Documents into smaller chunks. This is useful both for indexing data and for passing it to a model, since large chunks are harder to search over and won't fit in a model's finite context window.
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
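A quick check (just for illustration) of how many chunks the pages were split into:
print(f"{len(docs)} pages split into {len(splits)} chunks")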
Store
We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a VectorStore and Embeddings model.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_chroma import Chroma

vectordb = Chroma.from_documents(
    documents=splits,
    persist_directory=db_path,
    embedding=OllamaEmbeddings(model=model_name),
)
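Before wiring up a full chain, a direct similarity search against the store (the query string here is just an example) shows which chunks come back for a given query:
hits = vectordb.similarity_search("technical skills", k=3)
for doc in hits:
    print(doc.metadata.get("source"), "->", doc.page_content[:100])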
Let’s test it
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
retriever = vectordb.as_retriever()
llm_model = ChatOllama(model="llama3.1:8b")
system_prompt = (
    """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context, a set of resumes
and motivation letters from Corentin, to answer the question.
If you don't know the answer, say that you don't know.
Use three sentences maximum and keep the answer concise.

{context}"""
)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm_model, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
results = rag_chain.invoke({"input": "List 5 technical skills from Corentin?"})
print(results['answer'])
Based on the provided context, here are 5 technical skills listed for Corentin:
1. Time series analysis (trend analysis, smoothing, fault detection)
2. Open data mining coupled with GIS
3. Energy demand prediction using ANN (Artificial Neural Networks)
4. Genetic algorithms for renewable energies optimization
5. Backend development using Java and BIM models
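Besides the answer, the output of create_retrieval_chain also carries the retrieved chunks under the context key, which makes it easy to check where the model got its facts:
for doc in results["context"]:
    print(doc.metadata.get("source"), "->", doc.page_content[:80])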
RAG with kotaemon
Another (and maybe simpler) way to quickly implement a RAG pipeline is to use the open-source kotaemon.
Kotaemon is an open-source tool for building document-based Q&A applications. It features a Gradio web interface, supports multiple document formats (PDF, DOC), and uses RAG to provide sourced answers with document previews. The framework is extensible and includes built-in vector storage for efficient document retrieval.


Install With Docker
Both lite and full versions of the Docker image exist. With full, the extra packages of unstructured will be installed as well; this supports additional file types (.doc, .docx, …), but the cost is a larger Docker image size. For most users, the lite image should work well in most cases.
- To use the lite version: docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -it --rm ghcr.io/cinnamon/kotaemon:main-lite
- To use the full version: docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -it --rm ghcr.io/cinnamon/kotaemon:main-full
Once everything is set up correctly, you can go to http://localhost:7860/ to access the WebUI.
Kotaemon uses GHCR to store its Docker images; all images can be found here.
Note: to access the DB with Python, copy it to your local directory:
sudo docker cp kotaemon:/app/ktem_app_data/user_data /mnt/c/Users/Corentin/OneDrive/website/projects/LLM Scraping
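From there, the copied data can be inspected with the chromadb client; this is just a sketch, and the sub-folder below is an assumption since the exact layout depends on the kotaemon version:
import chromadb
# The path inside the copied user_data folder is an assumption; adjust to where the Chroma files live
client = chromadb.PersistentClient(path="user_data/vectorstore")
print(client.list_collections())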