PERSONALISED VOICE ASSISTANT USING WHISPER AND GPT
Author Name

ARJUNRAJA V, SRIMAN S, MOHINTH R, ANUSHMA MAHALAKSHMI A, Mrs SUSEELA D

Abstract

In the evolving landscape of artificial intelligence, personalized voice assistants have become essential tools for enhancing user experience through natural interaction and automation. This project, titled "Personalized Voice Assistant with Whisper and GPT," integrates state-of-the-art technologies—OpenAI's Whisper and GPT-3—to establish a highly adaptable, context-aware voice interaction system. The Whisper model, known for its high-accuracy speech recognition capabilities, is optimized to handle diverse accents and noisy environments, supported by an active learning mechanism that enables continuous performance improvement based on user feedback. Complementing Whisper,

The assistant is designed with multi-modal capabilities, supporting Speech-to-Text (STT), Text-to-Speech (TTS), Speech-to-Speech (STS), and Text-to-Text (TTT) interactions, thereby offering users multiple interaction options. To enhance natural language processing (NLP), the project leverages open-source libraries such as spaCy and NLTK, facilitating robust text processing and semantic analysis. The primary objective of this project is to develop a responsive, user-centric assistant that adapts to individual preferences while ensuring seamless, human-like interactions. Key components include fine-tuning GPT-3 to enhance domain-specific language modeling, active learning for dynamic improvement in speech recognition accuracy, and real-time TTS synthesis for natural and coherent responses. By integrating Flask for a web-based interface, the system is accessible across platforms, enhancing usability and ensuring broad reach. The system architecture prioritizes privacy and data ethics, employing measures to safeguard user data while ensuring transparency in AI functionalities.
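The four interaction modes can be read as combinations of an input modality and an output modality around a shared text-generation step. The sketch below illustrates that reading with standard-library Python only. It is not the authors' implementation: the `Assistant` class, the `MODES` table, and the three stub functions are hypothetical names introduced here, and the stubs merely stand in for Whisper (transcription), a GPT model (reply generation), and a TTS engine (synthesis).

```python
from dataclasses import dataclass
from typing import Callable, Union

# Assumption: each mode is (speech input?, speech output?), and every
# request passes through the text-generation step in between.
MODES = {
    "STT": (True, False),   # user speaks, assistant replies in text
    "TTS": (False, True),   # user types, assistant replies in speech
    "STS": (True, True),    # user speaks, assistant replies in speech
    "TTT": (False, False),  # user types, assistant replies in text
}


@dataclass
class Assistant:
    """Hypothetical pipeline wiring; the callables are pluggable backends."""
    transcribe: Callable[[bytes], str]   # placeholder for a Whisper wrapper
    generate: Callable[[str], str]       # placeholder for a GPT wrapper
    synthesize: Callable[[str], bytes]   # placeholder for a TTS wrapper

    def handle(self, mode: str, payload: Union[str, bytes]) -> Union[str, bytes]:
        """Route one request through the stages that `mode` requires."""
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        speech_in, speech_out = MODES[mode]

        if speech_in:
            if not isinstance(payload, bytes):
                raise TypeError(f"{mode} expects audio bytes")
            text = self.transcribe(payload)
        else:
            if not isinstance(payload, str):
                raise TypeError(f"{mode} expects a text string")
            text = payload

        reply = self.generate(text)
        return self.synthesize(reply) if speech_out else reply


# Stub backends so the sketch runs without any model or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")      # pretends the audio bytes are the words


def fake_generate(prompt: str) -> str:
    return f"You said: {prompt}"      # pretends to be a language model


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")       # pretends the bytes are audio


assistant = Assistant(fake_transcribe, fake_generate, fake_synthesize)
print(assistant.handle("TTT", "hello"))
print(assistant.handle("STS", b"hello"))
```

Keeping the backends as injected callables means a real speech recognizer, language model, or synthesizer could replace a stub without changing the routing logic, and the same `handle` method could sit behind a web endpoint such as the Flask interface the abstract mentions.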

Keywords: Personalized voice assistant, Whisper, GPT-3, speech recognition, natural language processing, Speech-to-Text, Text-to-Speech, Speech-to-Speech, Text-to-Text, cloud-based deployment, data privacy.



Published On :
2024-12-05
