Project

Toxic Comment Classification using Deep Learning

Multilingual toxic comment classifier built by fine-tuning XLM-RoBERTa on 360k+ comments across 7 languages. It uses custom language-aware attention, predicts 6 toxicity categories, and reaches AUC 0.92+ on the main categories. Deployed to Hugging Face with Streamlit and FastAPI interfaces.

February 2025 - April 2025 | My part: Machine Learning Engineer | With: Nauman Pathan

Python, Deep Learning, NLP, XLM-RoBERTa, PyTorch


Overview

This is a multilingual toxic comment classification system that identifies toxic content in 7 languages (English, Russian, Turkish, Spanish, French, Italian, Portuguese). It uses language-aware transformers with custom attention mechanisms to classify each comment into 6 toxicity categories.
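To make the output side of such a classifier concrete, here is a minimal sketch of how six raw model scores (logits) might be turned into per-category decisions. It is not taken from the project's code. It assumes a multi-label setup in which each category gets its own independent sigmoid, so one comment can carry several labels at once. The category names, the `decode_logits` helper, and the 0.5 threshold are all hypothetical placeholders; only the seven languages come from the description above.

```python
import math

# Languages listed in the overview, as ISO 639-1 codes.
SUPPORTED_LANGUAGES = ("en", "ru", "tr", "es", "fr", "it", "pt")

# ASSUMPTION: the write-up does not name its six categories. These names are
# placeholders in the style of common toxicity datasets.
CATEGORIES = (
    "toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate",
)


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_logits(logits, lang, threshold=0.5):
    """Map six logits to (probabilities, flagged categories).

    ASSUMPTION: multi-label decoding, one independent sigmoid per category,
    with a single hypothetical decision threshold shared by all categories.
    """
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    if len(logits) != len(CATEGORIES):
        raise ValueError(f"expected {len(CATEGORIES)} logits, got {len(logits)}")

    probs = {name: sigmoid(z) for name, z in zip(CATEGORIES, logits)}
    flagged = [name for name, p in probs.items() if p >= threshold]
    return probs, flagged
```

Calling `decode_logits([2.0, -3.0, 0.0, -1.0, 1.5, -4.0], "en")` returns a probability for every category plus the list of categories at or above the threshold. A real deployment would likely tune the threshold per category, since rare labels tend to need a different operating point from common ones.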

Key Features