In progress

Book Recommendation Engine

A recommendation project that uses vector embeddings to measure semantic similarity between books, with the aim of producing more relevant suggestions than a keyword-only pipeline.

Overview

Semantic retrieval instead of simple title matching.

The goal is to encode book metadata, descriptions, and related signals into embeddings so the engine can rank titles by conceptual similarity rather than relying on exact overlap in words or genres alone.
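As a rough illustration of ranking by conceptual similarity, here is a minimal sketch that scores catalog entries against a query vector with cosine similarity. The three-dimensional vectors and the titles are made up for the example, and the function names are hypothetical; real embeddings would come from whichever model the project settles on and would have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, catalog, top_k=3):
    """Return up to top_k (title, score) pairs, most similar first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors, hand-written for illustration only (not real embeddings).
catalog = {
    "Space Saga A": [0.9, 0.1, 0.0],
    "Space Saga B": [0.8, 0.2, 0.1],
    "Regency Romance C": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
ranked = rank_by_similarity(query, catalog, top_k=2)
```

Because the score depends only on vector direction, two books can rank as close neighbors even when their titles and genre labels share no words.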

To supply the catalog, I pulled millions of book records from Open Library, cleaned and normalized them, and loaded them into a local PostgreSQL database so the recommendation pipeline can query a structured catalog instead of loose raw exports.
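The cleaning step might look something like the sketch below, which flattens one raw record into a row ready for insertion. The field names (`key`, `title`, `description`, `subjects`) and the handling of `description` as either a plain string or a dict with a `value` entry are assumptions about the export format, not a description of the actual cleaning code, and `normalize_record` is a hypothetical helper.

```python
import re

def _collapse_whitespace(text):
    """Replace runs of whitespace with single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text or "").strip()

def normalize_record(raw):
    """Flatten one raw record into a row dict, or return None if it has no usable title."""
    title = _collapse_whitespace(raw.get("title"))
    if not title:
        return None  # skip records that cannot be shown to a reader

    # Assumption: description is either a string or {"value": "..."}.
    description = raw.get("description")
    if isinstance(description, dict):
        description = description.get("value")
    if not isinstance(description, str):
        description = None
    description = _collapse_whitespace(description) or None

    # Lowercase, trim, and de-duplicate subject labels.
    subjects = sorted({
        s.strip().lower()
        for s in (raw.get("subjects") or [])
        if isinstance(s, str) and s.strip()
    })
    return {
        "key": raw.get("key"),
        "title": title,
        "description": description,
        "subjects": subjects,
    }

# Hypothetical raw record, for illustration only.
raw = {
    "key": "/works/OL1W",
    "title": "  The   Example  Book ",
    "description": {"type": "/type/text", "value": "A  story.\n"},
    "subjects": ["Fiction", "fiction ", "Space"],
}
row = normalize_record(raw)
```

Keeping the normalization as a pure function makes it easy to test on sample records before running it over the full dump and batching the resulting rows into PostgreSQL.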

Current status

Current build phase

  • Embedding strategy and dataset shape are still being refined.
  • Millions of Open Library records have already been cleaned and inserted into a local PostgreSQL database.
  • Google Books API integration is set up to fetch cover images for titles that exist in the catalog.
  • Repository and live demo links are placeholders until the project is published.
  • The current focus is retrieval quality, ranking logic, and evaluation workflow.
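For the cover lookup mentioned in the list above, a minimal sketch using only the standard library could look like this. It assumes titles are matched by ISBN through the Google Books `volumes` endpoint and that the cover is read from `volumeInfo.imageLinks`; how the project actually matches catalog entries is not stated here, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"

def build_cover_query(isbn):
    """Build a volumes search URL for a single ISBN (assumed matching key)."""
    return GOOGLE_BOOKS_URL + "?" + urlencode({"q": f"isbn:{isbn}", "maxResults": 1})

def extract_cover_url(payload):
    """Return the first thumbnail URL in a volumes response, or None if there is none."""
    for item in payload.get("items", []):
        links = item.get("volumeInfo", {}).get("imageLinks", {})
        url = links.get("thumbnail") or links.get("smallThumbnail")
        if url:
            return url
    return None

def fetch_cover_url(isbn, timeout=10):
    """Query the API over the network and return a cover URL or None."""
    with urlopen(build_cover_query(isbn), timeout=timeout) as resp:
        return extract_cover_url(json.load(resp))
```

Separating URL construction and response parsing from the network call lets the parsing logic be tested against saved responses, and makes it straightforward to add caching or rate limiting around `fetch_cover_url` later.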

Planned flow

Planned recommendation flow

  1. Assemble the cleaned book metadata and descriptive text from the local PostgreSQL catalog into a searchable corpus.
  2. Generate vector embeddings for each title or feature bundle.
  3. Compare vectors with similarity search to retrieve related books.
  4. Attach Google Books cover images for titles with matching external metadata.
  5. Layer ranking signals to improve usefulness and reduce shallow matches.
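Step 5 is still being designed, but one way to layer ranking signals on top of raw similarity is sketched below. The signals (`has_description`, `same_work_as_seed`) and the weights are hypothetical placeholders chosen for illustration, not the project's final ranking logic.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    similarity: float        # score from the similarity search in step 3
    has_description: bool    # hypothetical quality signal
    same_work_as_seed: bool  # e.g. another edition of the book the user started from

def rerank(candidates, desc_bonus=0.05, duplicate_penalty=0.5):
    """Sort candidates by similarity adjusted with simple additive signals."""
    def score(c):
        s = c.similarity
        if c.has_description:
            s += desc_bonus          # prefer entries a reader can actually evaluate
        if c.same_work_as_seed:
            s -= duplicate_penalty   # push down shallow "same book again" matches
        return s
    return sorted(candidates, key=score, reverse=True)

# Made-up candidates for illustration only.
candidates = [
    Candidate("Seed Book, 2nd edition", 0.98, True, True),
    Candidate("Related A", 0.80, True, False),
    Candidate("Related B", 0.82, False, False),
]
ordered = [c.title for c in rerank(candidates)]
```

In this toy run the near-duplicate edition has the highest raw similarity but ends up last, which is the kind of shallow match the ranking layer is meant to suppress.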

What it demonstrates

What this project demonstrates

This project combines applied machine learning, data modeling, and product thinking: the goal is not just to build a model, but to shape an experience that returns recommendations people would actually want to use.