I'm working on an audio project. My goal is to count the number of people who speak in an audio file. We can assume the noise has already been removed from the audio. For example, if two people are talking in the audio, the program should return 2; if three people are talking, it should return 3, and so on. I don't need speech recognition; I just want to know how many people are talking. What is the best way to solve this problem?
How can I count the number of people speaking in an audio file?
1.9k views Asked by Kacem ICHAKDI At
There is 1 answer
If I am correct, you are looking for speaker diarization: splitting the audio into segments and labelling each segment with the speaker it belongs to. The number of distinct labels is then your speaker count. In this thread someone listed a few options for Python: "Python Speaker Recognition".

Otherwise, if you want to take the easier way, you can let Google do it for you with their Cloud Speech-to-Text API. Not free, but also really cool. More about that right here: https://cloud.google.com/speech-to-text/docs/multiple-voices
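Whichever diarization tool you pick, the last step is the same: turn its output into a number. Here is a minimal sketch of that step only, assuming the tool gives you (start, end, speaker label) segments in seconds. The function name `count_speakers`, the `min_total_seconds` threshold, and the sample segments are made up for illustration and are not part of any library; the threshold is there because diarizers sometimes emit a spurious extra speaker for a short noise or overlap.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# One diarization segment: (start_seconds, end_seconds, speaker_label).
# This layout is an assumption; adapt it to whatever your diarizer returns.
Segment = Tuple[float, float, str]


def count_speakers(segments: Iterable[Segment],
                   min_total_seconds: float = 1.0) -> int:
    """Count speakers whose total talk time reaches min_total_seconds."""
    talk_time = defaultdict(float)
    for start, end, speaker in segments:
        if end > start:  # skip empty or malformed segments
            talk_time[speaker] += end - start
    # Speakers with very little total speech are treated as diarization noise.
    return sum(1 for total in talk_time.values() if total >= min_total_seconds)


# Hypothetical diarization output for a 20-second recording.
example_segments = [
    (0.0, 4.2, "A"),
    (4.2, 9.0, "B"),
    (9.0, 12.5, "A"),
    (12.5, 12.8, "C"),  # 0.3 s blip, probably not a real third speaker
    (12.8, 20.0, "B"),
]

print(count_speakers(example_segments))                         # 2
print(count_speakers(example_segments, min_total_seconds=0.0))  # 3
```

With word-level output such as the speaker tags described on the Google page linked above, the same idea applies: collect the tag of each word and count the distinct values, optionally ignoring tags that appear only a handful of times.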