I have the Coqui STT yesno model from GitHub, which is an ultra-compact speech recognition model that only recognises two words: yes and no.
I have the `yesno.pbmm` and `yesno.scorer` files.
I also have a tarball `coqui-yesno-checkpoints.tar.gz` containing the following files:
```
ls -l coqui-yesno-checkpoints/
total 7356
-rw-r--r-- 1 jeremiah jeremiah      12 Jul 27 00:00 alphabet.txt
-rw-r--r-- 1 jeremiah jeremiah 1066076 Jul 26 23:59 best_dev-1909.data-00000-of-00001
-rw-r--r-- 1 jeremiah jeremiah    1377 Jul 26 23:59 best_dev-1909.index
-rw-r--r-- 1 jeremiah jeremiah 4476795 Jul 26 23:59 best_dev-1909.meta
-rw-r--r-- 1 jeremiah jeremiah      83 Jul 27 00:00 best_dev_checkpoint
-rw-r--r-- 1 jeremiah jeremiah    3861 Jul 27 00:02 flags.txt
-rw-r--r-- 1 jeremiah jeremiah 1368905 Jul 27 00:05 yesno-64dims.logs
-rw-r--r-- 1 jeremiah jeremiah  594622 Jul 27 00:05 yesno-64dims.logs.lm-optimizer
```
How do I convert this model into a .tflite file for use with Coqui on embedded devices?
Looking for a command-line solution that can be easily scripted.
Coqui STT is forked from Mozilla's DeepSpeech. I wrote the PlayBook for DeepSpeech, alongside Josh Meyer, one of the founders of Coqui. The Coqui STT training documentation doesn't cover exporting a model to
`tflite`, so I would recommend starting from the DeepSpeech documentation, as the syntax is generally interchangeable. That is, you need to run `train.py` from Coqui, specifying the checkpoints you have, and pass the `--export_tflite` flag. This should create a `tflite` model from the checkpoints. By default, DeepSpeech would export a protobuf (`.pb`) file, but I don't know what Coqui's default export format is. At a guess, this would look something like:
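(Untested sketch: the flag names below come from the DeepSpeech training docs and are assumed to carry over unchanged to Coqui's `train.py`; `yesno-tflite` is just an example output directory.)

```bash
# Export the trained yesno checkpoints as a TFLite model.
# Flag names are assumed from DeepSpeech; if any have been renamed in Coqui,
# `python3 train.py --helpfull` should list the current ones.
python3 train.py \
  --checkpoint_dir coqui-yesno-checkpoints \
  --alphabet_config_path coqui-yesno-checkpoints/alphabet.txt \
  --export_tflite \
  --export_dir yesno-tflite
```

If Coqui has kept DeepSpeech's export convention, the exported model should appear in the export directory as `output_graph.tflite`.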
Hope this helps - interested to hear how you go.