LyrAssist - AI-Powered Lyric Transcription & Video Generation
Full-stack web application that automatically transcribes audio/video files and generates synchronized lyric videos using OpenAI Whisper, WhisperX, and Demucs AI models
Interactive visualization tool analyzing 20 years of Manhattan pedestrian infrastructure data using React, D3.js, and geospatial processing
Multi-agent reinforcement learning implementation combining parameter sharing with Independent Q-Learning, scaling robustly from 2 to 5 agents
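The parameter-sharing Independent Q-Learning setup can be sketched with a toy tabular example. The cooperative matching game, agent ids, and hyperparameters below are illustrative stand-ins, not the project's actual environment:

```python
# Sketch of Independent Q-Learning (IQL) with parameter sharing:
# all agents update one shared Q-table, each conditioning on its own
# observation (here just an agent id string). Illustrative only.
import random
from collections import defaultdict

class SharedQLearner:
    def __init__(self, n_actions, alpha=0.5, gamma=0.9, eps=0.1):
        self.q = defaultdict(float)          # (obs, action) -> value
        self.n_actions = n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, obs):
        # epsilon-greedy action selection
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(obs, a)])

    def update(self, obs, action, reward, next_obs, done):
        # standard Q-learning backup, applied independently per agent
        target = reward if done else reward + self.gamma * max(
            self.q[(next_obs, a)] for a in range(self.n_actions))
        self.q[(obs, action)] += self.alpha * (target - self.q[(obs, action)])

# Toy cooperative one-step game: both agents are rewarded when their
# actions match, so the shared table should learn to coordinate.
random.seed(0)
learner = SharedQLearner(n_actions=2)
for _ in range(2000):
    a0 = learner.act("agent_0")
    a1 = learner.act("agent_1")
    reward = 1.0 if a0 == a1 else 0.0
    learner.update("agent_0", a0, reward, "agent_0", done=True)
    learner.update("agent_1", a1, reward, "agent_1", done=True)
```

In a deep-RL version, the shared table would be replaced by one neural Q-network whose weights all agents use, which is what makes the approach scale as the agent count grows.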
Published in International Conference on Data Science and Applications (ICDSA 2024), 2024
This study aims to improve the predictive accuracy of emotional state classification along the valence, arousal, dominance, and liking dimensions by applying a long short-term memory (LSTM) network to multi-channel EEG signals.
Citation: Sateesh, S. K., Sparsh, B. K., & Uma, D. (2024). "Decoding Human Emotions: Analyzing Multi-channel EEG Data Using LSTM Networks." International Conference on Data Science and Applications. Springer Nature Singapore, 503-515.
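The LSTM pipeline the paper describes can be sketched as a single forward pass in NumPy. The channel count, hidden size, sequence length, and 4-way output head below are illustrative assumptions, not the paper's exact architecture:

```python
# Minimal LSTM forward pass over an EEG-like sequence (illustrative shapes).
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input, forget, output, and candidate gates computed
    from the current input x and previous hidden state h."""
    Hs = h.shape[0]
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    z = W @ x + U @ h + b                       # stacked pre-activations (4H,)
    i, f, o, g = z[:Hs], z[Hs:2*Hs], z[2*Hs:3*Hs], z[3*Hs:]
    c_new = sig(f) * c + sig(i) * np.tanh(g)    # updated cell state
    h_new = sig(o) * np.tanh(c_new)             # updated hidden state
    return h_new, c_new

# Hypothetical setup: 32 EEG channels per time step, hidden size 16,
# 128-sample window; the final hidden state feeds a 4-way linear head
# (one score each for valence, arousal, dominance, liking).
rng = np.random.default_rng(0)
D, H, T = 32, 16, 128
W = rng.normal(size=(4 * H, D)) * 0.1
U = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for _ in range(T):
    x_t = rng.normal(size=D)                    # stand-in for one EEG sample
    h, c = lstm_step(x_t, h, c, W, U, b)
head = rng.normal(size=(4, H)) * 0.1
logits = head @ h                               # one score per emotion dimension
```

A trained version would learn `W`, `U`, `b`, and `head` from labeled EEG trials rather than drawing them at random.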
Published in International Conference on Multi-disciplinary Trends in Artificial Intelligence (MIWAI 2024), 2024
This survey reviews meta-learning approaches used in audio and speech processing. Meta-learning is applied where model performance must be maximized with few annotated samples, making it well suited to low-resource audio tasks.
Citation: Raimon, A., Masti, S., Sateesh, S. K., Vengatagiri, S., & Das, B. (2024). "Meta-learning in Audio and Speech Processing: An End to End Comprehensive Review." International Conference on Multi-disciplinary Trends in Artificial Intelligence. Springer Nature Singapore, 140-154.