In progress
Book Recommendation Engine
A recommendation project that uses vector embeddings to measure semantic similarity between books
and produce stronger suggestions than a keyword-only pipeline.
Overview
Semantic retrieval instead of simple title matching.
The goal is to encode book metadata, descriptions, and related signals into embeddings so the engine can rank titles
by conceptual similarity rather than relying on exact overlap in words or genres alone.
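As a rough illustration of ranking by conceptual similarity, the sketch below scores a query vector against a tiny catalog using cosine similarity. The three-dimensional vectors, the `catalog` dictionary, and the function names are made up for the example; real embeddings would come from whichever model the project settles on and would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, catalog, top_k=3):
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors (hypothetical): real embeddings are produced by a model, not written by hand.
catalog = {
    "Dune": [0.9, 0.1, 0.3],
    "Foundation": [0.8, 0.2, 0.4],
    "Pride and Prejudice": [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.35]
results = rank_by_similarity(query, catalog, top_k=2)
```

Because the score depends on vector direction rather than shared words, two books with no overlapping title or genre terms can still rank close together.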
The catalog is built from millions of Open Library records, which I cleaned, normalized, and loaded into a local PostgreSQL server
so the recommendation pipeline works against a structured catalog instead of loose raw exports.
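A minimal sketch of what one cleaning step might look like before a record is written to the database. It assumes the commonly seen Open Library JSON shape, where `description` can be either a plain string or an object with a `value` field. The output field names (`ol_key`, `title`, `description`, `subjects`) are hypothetical and not the project's actual schema.

```python
import re


def extract_description(raw):
    """Return plain description text whether the raw value is a string or a {"value": ...} object."""
    if isinstance(raw, dict):
        raw = raw.get("value", "")
    if not isinstance(raw, str):
        return ""
    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", raw).strip()


def normalize_record(record):
    """Map one raw record to a flat row, or return None if it has no usable title."""
    title = re.sub(r"\s+", " ", record.get("title") or "").strip()
    if not title:
        return None
    # Lowercase, trim, and de-duplicate subject labels.
    subjects = sorted(
        {s.strip().lower() for s in (record.get("subjects") or []) if isinstance(s, str) and s.strip()}
    )
    return {
        "ol_key": record.get("key"),
        "title": title,
        "description": extract_description(record.get("description")),
        "subjects": subjects,
    }


row = normalize_record({
    "key": "/works/OL1W",
    "title": "  Dune ",
    "description": {"type": "/type/text", "value": "A desert\n planet."},
    "subjects": ["Science Fiction", "science fiction ", ""],
})
```

Rows in this flat shape can then be inserted with a parameterized query through a PostgreSQL driver of choice.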
Current status
Current build phase
- Embedding strategy and dataset shape are still being refined.
- Millions of Open Library records have already been cleaned and inserted into a local PostgreSQL database.
- Google Books API integration is set up to fetch cover images for titles that exist in the catalog.
- Repository and live demo links are placeholders until the project is published.
- The current focus is retrieval quality, ranking logic, and evaluation workflow.
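The cover lookup mentioned in the list above could be sketched as follows. This assumes titles are matched by ISBN against the public Google Books volumes endpoint and that the thumbnail is read from `volumeInfo.imageLinks`. The helper names are hypothetical, the HTTP request itself is left out, and the actual integration may match titles differently.

```python
from urllib.parse import urlencode

GOOGLE_BOOKS_VOLUMES_URL = "https://www.googleapis.com/books/v1/volumes"


def build_cover_lookup_url(isbn):
    """Build a volumes search URL for a single ISBN (assumed matching strategy)."""
    query = urlencode({"q": f"isbn:{isbn}", "maxResults": 1})
    return f"{GOOGLE_BOOKS_VOLUMES_URL}?{query}"


def extract_cover_url(response_json):
    """Return the first available cover URL from a parsed volumes response, or None."""
    for item in response_json.get("items", []):
        links = item.get("volumeInfo", {}).get("imageLinks", {})
        url = links.get("thumbnail") or links.get("smallThumbnail")
        if url:
            # Prefer https so the image does not trigger mixed-content warnings.
            return url.replace("http://", "https://", 1)
    return None


# Hand-written sample payload (not real API output), trimmed to the fields used here.
sample_response = {
    "items": [
        {"volumeInfo": {"title": "Dune", "imageLinks": {"thumbnail": "http://example.com/dune.jpg"}}}
    ]
}
cover = extract_cover_url(sample_response)
```

Returning `None` when no image is found lets the interface fall back to a placeholder cover instead of failing.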
Planned flow
Planned recommendation flow
- Normalize cleaned book metadata and descriptive text from the local PostgreSQL catalog into a searchable corpus.
- Generate vector embeddings for each title or feature bundle.
- Compare vectors with similarity search to retrieve related books.
- Attach Google Books cover images for titles with matching external metadata.
- Layer ranking signals to improve usefulness and reduce shallow matches.
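One possible reading of the final ranking step, sketched under assumptions: candidates arrive from the similarity search with a `similarity` score, duplicate editions are collapsed by case-insensitive title, and books by the same author as the seed title are nudged down so the list is not dominated by obvious matches. The penalty value, field names, and the `rerank` function are hypothetical choices for illustration, not the project's settled ranking logic.

```python
def rerank(candidates, seed_author, top_k=5, same_author_penalty=0.15):
    """Re-order similarity-search candidates using two simple extra signals."""
    seen_titles = set()
    scored = []
    # Walk candidates from most to least similar so the best copy of a duplicate title wins.
    for cand in sorted(candidates, key=lambda c: c["similarity"], reverse=True):
        title_key = cand["title"].casefold().strip()
        if title_key in seen_titles:
            continue  # skip repeat editions of a title already kept
        seen_titles.add(title_key)
        score = cand["similarity"]
        if cand["author"] == seed_author:
            score -= same_author_penalty  # down-weight same-author matches
        scored.append({**cand, "score": score})
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:top_k]


# Made-up similarity scores for illustration only.
candidates = [
    {"title": "Dune Messiah", "author": "Frank Herbert", "similarity": 0.95},
    {"title": "Foundation", "author": "Isaac Asimov", "similarity": 0.90},
    {"title": "foundation", "author": "Isaac Asimov", "similarity": 0.85},
    {"title": "Hyperion", "author": "Dan Simmons", "similarity": 0.82},
]
ranked = rerank(candidates, seed_author="Frank Herbert")
```

Keeping the re-ranking separate from the vector search makes it easier to evaluate each signal on its own as the ranking logic evolves.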
What it demonstrates
What this project demonstrates
This project combines applied machine learning, data modeling, and product thinking:
the aim is not just to build a model, but to shape an experience that returns recommendations people would actually want to use.