Using Hugging Face LLMs for sentiment analysis, translation, summarization, and extractive question answering

Shuyib/HF_model_preview


Car-ing is Sharing: LLM Model Review

Project Overview

This project evaluates various Hugging Face large language models (LLMs) for a car dealership chatbot application called "Car-ing is Sharing". The chatbot is designed to assist customers and provide support to human agents through multiple NLP functionalities.

Key Features

  • Sentiment Analysis: Classify car reviews as positive or negative
  • Text Translation: Translate customer reviews between English and Spanish
  • Question Answering: Extract specific information from car reviews
  • Text Summarization: Generate concise summaries of longer car reviews

Project Structure

.
├── car.jpeg                   # Project image
├── Makefile                   # Build automation
├── notebook.ipynb             # Main analysis notebook
├── requirements.txt           # Project dependencies
└── data/
    ├── car_reviews.csv        # Car review dataset
    └── reference_translations.txt # Translation reference data

Installation

  1. Clone the repository:
git clone https://github.com/Shuyib/HF_model_preview.git
cd HF_model_preview
  2. Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Models Used

This project leverages several pre-trained models:

  • Sentiment Analysis:

    • distilbert-base-uncased-finetuned-sst-2-english
    • Qwen/Qwen2.5-1.5B-Instruct (via API)
  • Translation:

    • Helsinki-NLP/opus-mt-en-es
  • Question Answering:

    • deepset/minilm-uncased-squad2
  • Summarization:

    • cnicu/t5-small-booksum
    • facebook/bart-large-cnn
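
As a rough illustration of how the local checkpoints above could be loaded, the sketch below maps each task to one model and builds a `transformers` pipeline on demand. The checkpoint names come from the list above. The task strings, the `TASK_MODELS` dictionary, and the `build_pipeline` helper are assumptions made for this example and may not match the notebook or every `transformers` release. The Qwen model is left out because it is accessed via API.

```python
# Hypothetical helper, not taken from the notebook.
# Checkpoint names are from the "Models Used" list; task strings are assumed
# to be valid `transformers.pipeline` task names in the installed version.
TASK_MODELS = {
    "sentiment-analysis": "distilbert-base-uncased-finetuned-sst-2-english",
    "translation_en_to_es": "Helsinki-NLP/opus-mt-en-es",
    "question-answering": "deepset/minilm-uncased-squad2",
    "summarization": "cnicu/t5-small-booksum",
}


def build_pipeline(task, model=None):
    """Return a pipeline for `task`, using the default checkpoint unless `model` is given."""
    if task not in TASK_MODELS:
        raise KeyError(f"Unknown task {task!r}; expected one of {sorted(TASK_MODELS)}")
    # Imported lazily so the mapping can be inspected without loading transformers.
    from transformers import pipeline

    return pipeline(task, model=model or TASK_MODELS[task])
```

For example, `build_pipeline("summarization", model="facebook/bart-large-cnn")` would swap in the second summarization checkpoint. The first call for each checkpoint downloads its weights.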

Usage

  1. Open the Jupyter notebook:
jupyter notebook notebook.ipynb
  2. Run the cells to see model evaluations for:
    • Sentiment analysis with accuracy and F1 metrics
    • Translation quality with BLEU score calculation
    • Question answering capabilities
    • Text summarization quality
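
To make the sentiment metrics concrete, here is a dependency-free sketch of accuracy and binary F1. It assumes the classifier returns dictionaries with a `label` of `POSITIVE` or `NEGATIVE`, and that the true labels are encoded as 1 and 0. The function names are illustrative, and the notebook may compute these scores differently, for instance with the `evaluate` package listed in the requirements.

```python
# Illustrative only: maps assumed pipeline labels to integers and scores them.
LABEL_TO_INT = {"POSITIVE": 1, "NEGATIVE": 0}


def labels_to_ints(pipeline_outputs):
    """Convert [{'label': 'POSITIVE', ...}, ...] into [1, ...]."""
    return [LABEL_TO_INT[item["label"]] for item in pipeline_outputs]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example data, not results from the project.
outputs = [{"label": "POSITIVE"}, {"label": "NEGATIVE"}, {"label": "NEGATIVE"},
           {"label": "POSITIVE"}, {"label": "POSITIVE"}]
y_true = [1, 0, 1, 1, 0]
y_pred = labels_to_ints(outputs)
print(accuracy(y_true, y_pred), f1_binary(y_true, y_pred))
```

F1 is reported alongside accuracy because it stays informative when positive and negative reviews are unevenly represented.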

API Access

Some models (for example Qwen/Qwen2.5-1.5B-Instruct, which is accessed via API) require a Hugging Face access token. Create a token in your account on huggingface.co and set it as an environment variable, either in your shell or from Python:

# In a shell:
export HF_TOKEN="your_huggingface_token"

# Or from Python, for example in a notebook cell:
import os
os.environ["HF_TOKEN"] = "your_huggingface_token"
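
A small guard can make a missing token fail early with a readable message. The sketch below is an assumption about how one might read the variable. `get_hf_token` is a hypothetical helper, not something the notebook is known to define.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in `env_var`, or raise a clear error if it is missing or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable before calling API-backed models."
        )
    return token
```

Reading the token from the environment keeps it out of the notebook source, which matters if the notebook is committed or shared.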

Development

The Makefile provides the following commands:

  • make install: Set up the environment
  • make format: Format code
  • make lint: Lint code
  • make test: Run tests
  • make clean: Clean up environment

Run make help to see all available commands.

Requirements

  • Python 3.10 or higher (3.11.9 was used for development)
  • See requirements.txt for Python dependencies:
    • transformers
    • evaluate
    • datasets
    • sentencepiece
    • openai
    • tenacity
    • ipykernel
    • torch
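
Before opening the notebook, it can help to confirm that the interpreter and packages match the list above. The following is a minimal sketch. The helper names are made up for this example, and it assumes each package's import name matches the name listed in the requirements.

```python
import importlib.util
import sys

# Package names copied from the requirements list above.
REQUIRED = [
    "transformers", "evaluate", "datasets", "sentencepiece",
    "openai", "tenacity", "ipykernel", "torch",
]


def python_ok(version=None, minimum=(3, 10)):
    """Return True if `version` (default: the running interpreter) meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing packages:", missing_packages() or "none")
```

`find_spec` only locates a top-level package and does not import it, so the check stays fast even for heavy dependencies such as torch.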

License

Creative Commons CC0 1.0 Universal

Acknowledgments

  • Hugging Face for providing the model infrastructure
  • Datasets are from DataCamp projects
  • Special thanks to the teams behind the pre-trained models used in this project, including the Helsinki-NLP, Qwen, and Facebook teams.