This project evaluates various Hugging Face large language models (LLMs) for a car dealership chatbot application called "Car-ing is Sharing". The chatbot is designed to assist customers and provide support to human agents through multiple NLP functionalities.
- Sentiment Analysis: Classify car reviews as positive or negative
- Text Translation: Translate customer reviews between English and Spanish
- Question Answering: Extract specific information from car reviews
- Text Summarization: Generate concise summaries of longer car reviews
.
├── car.jpeg # Project image
├── Makefile # Build automation
├── notebook.ipynb # Main analysis notebook
├── requirements.txt # Project dependencies
└── data/
    ├── car_reviews.csv # Car review dataset
    └── reference_translations.txt # Translation reference data
- Clone the repository:
git clone https://github.com/Shuyib/HF_model_preview.git
cd HF_model_preview
- Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
This project leverages several pre-trained models:
- Sentiment Analysis:
  - distilbert-base-uncased-finetuned-sst-2-english
  - Qwen/Qwen2.5-1.5B-Instruct (via API)
- Translation:
  - Helsinki-NLP/opus-mt-en-es
- Question Answering:
  - deepset/minilm-uncased-squad2
- Summarization:
  - cnicu/t5-small-booksum
  - facebook/bart-large-cnn
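The locally run models above map naturally onto the `pipeline` function from transformers. The following is a minimal sketch rather than the notebook's actual code: `TASK_MODELS` and `build_pipeline` are hypothetical names, and the pairing of each model with a pipeline task string is an assumption based on the task names in the list. Qwen/Qwen2.5-1.5B-Instruct is left out because the project reaches it via API, and where a task lists two models the sketch arbitrarily picks one.

```python
# Hypothetical mapping from a transformers pipeline task to one listed model.
TASK_MODELS = {
    "sentiment-analysis": "distilbert-base-uncased-finetuned-sst-2-english",
    "translation_en_to_es": "Helsinki-NLP/opus-mt-en-es",
    "question-answering": "deepset/minilm-uncased-squad2",
    "summarization": "facebook/bart-large-cnn",
}


def build_pipeline(task: str):
    """Return a transformers pipeline for one of the tasks in TASK_MODELS."""
    if task not in TASK_MODELS:
        raise ValueError(f"Unknown task: {task!r}")
    # Imported lazily so the mapping can be inspected without transformers installed.
    from transformers import pipeline

    return pipeline(task, model=TASK_MODELS[task])
```

Calling `build_pipeline("sentiment-analysis")` downloads the model on first use; the returned object can then be called on a review string and yields a list of dictionaries with `label` and `score` keys.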
- Open the Jupyter notebook:
jupyter notebook notebook.ipynb
- Run the cells to see model evaluations for:
- Sentiment analysis with accuracy and F1 metrics
- Translation quality with BLEU score calculation
- Question answering capabilities
- Text summarization quality
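To make the sentiment metrics concrete, here is a plain-Python sketch of accuracy and binary F1. The notebook may well compute these with the `evaluate` library instead; the function names and the encoding (1 for positive, 0 for negative) are assumptions made for illustration.

```python
def accuracy(references, predictions):
    """Fraction of predictions that match the reference labels."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    correct = sum(r == p for r, p in zip(references, predictions))
    return correct / len(references)


def f1_score(references, predictions, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(references, predictions))
    tp = sum(r == positive and p == positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only, not taken from the car review dataset.
refs = [1, 0, 1, 1, 0]
preds = [1, 0, 0, 1, 1]
print(accuracy(refs, preds), f1_score(refs, preds))
```

With these five made-up labels, three predictions are correct, and there are two true positives, one false positive, and one false negative, so accuracy is 3/5 and F1 is 2/3.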
Some models are accessed through the Hugging Face API and require an access token. Create a token in your Hugging Face account settings on huggingface.co and expose it as the HF_TOKEN environment variable (the name is case-sensitive):
export HF_TOKEN="your_huggingface_token"
Alternatively, set it from Python before running the notebook:
import os
os.environ["HF_TOKEN"] = "your_huggingface_token"
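Code that needs the token can then read it back from the environment instead of hard-coding it. This is a minimal sketch; `get_hf_token` is a hypothetical helper name, not part of this project or of any Hugging Face library.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the Hugging Face token from the environment, failing early if unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; create a token on huggingface.co and export it first"
        )
    return token
```

Failing early with a clear message tends to be easier to debug than an authentication error raised deep inside an API call.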
This project includes a comprehensive Makefile with useful commands:
- make install: Set up the environment
- make format: Format code
- make lint: Lint code
- make test: Run tests
- make clean: Clean up environment

Run make help to see all available commands.
- Python 3.10 or higher (developed with 3.11.9)
- See requirements.txt for Python dependencies:
- transformers
- evaluate
- datasets
- sentencepiece
- openai
- tenacity
- ipykernel
- torch
Creative Commons Zero v1.0 Universal (CC0 1.0)
- Hugging Face for providing the model infrastructure
- Datasets are from DataCamp projects
- Special thanks to the teams behind the pre-trained models used in this project, including Helsinki-NLP, the Qwen team, and Facebook.