Author Topic: Windows Text To Speech Library  (Read 4922 times)


Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Windows Text To Speech Library
« on: January 27, 2022, 06:52:50 am »
I turned the PowerShell stuff into a simple little library that people can use in their projects, and here it is:

Code: QB64:
_Title "Steve's Powershell Speech Library"

Speech_IoR 'initialize or reset speech options
Speech_SaP "Hello World, This is a normal speed demo of David's voice" 'speak and print
Speech_Speaker "Ziva"
Speech_Say "Hello again.  This is a normal speed demo of Ziva's voice." 'just speak this one
Speech_Speaker "David"
Speech_Speed -10
Speech_SaP "And now I'm speaking as David, but I'm speaking veeery slow."
Speech_Speaker "Ziva"
Speech_Speed 5
Speech_SaP "And now I'm a very hyper Ziva!"
Speech_Speed 0
Speech_Volume 30
Speech_SaP "And now I'm whispering to you that I'm done with my demo!"


'$INCLUDE:'TextToSpeech.BM'

 

As you can see, all the commands are prefixed with "Speech_" to keep the sub names unique and clearly related, and to keep them from clashing with user variable names and such.

Routines in this little package are:

Speech_IoR -- Init or Reset.  Call this first to initialize the settings (and turn volume up to 100, or else you'll be speaking on a MUTE channel)

Speech_Speaker -- Change the default speaker.  Currently I only support "David" and "Ziva", but feel free to change or add to this if your system has other voices installed via language/voice packs.

Speech_Speed -- Set a value from -10 to 10 to adjust the speed of the speaker.  0 is the default, -10 is sloooow, and 10 is faaaast.

Speech_Volume -- Set a value from 0 to 100 to adjust how loud you're going to be speaking with the voices.

Speech_OutTo -- Use this to change where you want the speech to go.  The only options right now are your speakers or a file.  It's not in the demo because I didn't want to randomly save junk to folks' drives, so here's an example:

        Speech_OutTo "MyTextToFile.wav"
        Speech_OutTo "Speaker"
        Speech_OutTo "" 'defaults/resets to speaker

Speech_Say -- Just says the text you specify with the settings you gave it previously.

Speech_SaP -- Says and Prints the text you specify to the screen as a quick print and speak shortcut.  Uses previous settings.

Speech_ToWav -- Converts text to a wav file and saves it to disk wherever you specify.  Since it's not in the short demo above, usage looks like this:

       Speech_ToWav "Hello World.  This is the text I'm saving to a file!", "MyFile.wav"

speak -- This is the master command with all the options built into it.  You can skip everything else if you want to use this as a standalone command that does everything at once.  Every other routine ends up calling this one, so you can bypass a few steps by calling it directly.



And that's basically it for now.  Windows Speech Synthesizer is quite a powerful little tool, with a ton of options which we can utilize with it, but I figure this is the basics of what someone would need to be able to do with it for a program.  It seems to handle what I need from it for now.

If you guys need it to do more, feel free to ask and I'll see about adding extra functionality as people need it.  Or, feel free to make the changes necessary yourself and then share them here with us so everybody else can enjoy any extra tweaks you guys add into the code.



To make use of this:

1) Download the library: 
2) Move it to your QB64 folder.
3) Put '$INCLUDE:'TextToSpeech.BM' at the end of your program.
4) Call Speech_IoR inside your code to initialize everything.
5) Call the other subs as you want to make use of them and alter the settings to your specific needs.

It's that simple!  ;D
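
For the two routines the demo skips, a minimal sketch built only from the calls described above might look like this. The file names are placeholders, and it assumes that once Speech_OutTo points at a file, Speech_Say writes there instead of playing aloud.

```qb64
Speech_IoR ' initialize settings first (this also sets volume to 100)
Speech_Speaker "David"
Speech_Speed 0

' One-shot: write a single phrase straight to a wav file.
Speech_ToWav "This sentence goes to a file.", "MyFile.wav"

' Redirect output, speak, then switch back (assumed behavior of Speech_OutTo).
Speech_OutTo "MyTextToFile.wav"
Speech_Say "This one should land in the wav file instead of the speakers."
Speech_OutTo "" ' reset to the speakers
Speech_Say "And this one plays out loud again."

'$INCLUDE:'TextToSpeech.BM'
```

The sample text deliberately contains no apostrophes, just to keep the example as plain as possible.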
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: Windows Text To Speech Library
« Reply #1 on: January 27, 2022, 07:46:37 am »
Oh I guess saving to file wasn't too hard either, nice thanks.

Offline SMcNeill

Re: Windows Text To Speech Library
« Reply #2 on: January 27, 2022, 07:55:31 am »
Quote from bplus: "Oh I guess saving to file wasn't too hard either, nice thanks."

Why sound so surprised?  I illustrated the process yesterday for you guys here: https://qb64forum.alephc.xyz/index.php?topic=4598.msg140111#msg140111

Offline bplus

Re: Windows Text To Speech Library
« Reply #3 on: January 27, 2022, 08:11:46 am »
Quote from SMcNeill: "Why sound so surprised?  I illustrated the process yesterday for you guys here: https://qb64forum.alephc.xyz/index.php?topic=4598.msg140111#msg140111"

I missed it in the code. I thought you were re-showing the code and then presenting .wav files so Johnno could hear what the replies sounded like. Plus, I thought storing a .wav file would be like storing an image file, which is why I didn't expect to see it in code that short!

Offline OldMoses

  • Seasoned Forum Regular
  • Posts: 469
Re: Windows Text To Speech Library
« Reply #4 on: January 27, 2022, 01:06:27 pm »
This is great! Now I can have my computer spew random filthy language. ;)

Offline SierraKen

  • Forum Resident
  • Posts: 1454
Re: Windows Text To Speech Library
« Reply #5 on: January 27, 2022, 02:00:00 pm »
Pretty nice Steve, thanks. But I'm curious, why no STOP, PAUSE, or RESUME examples?

Offline SMcNeill

Re: Windows Text To Speech Library
« Reply #6 on: January 27, 2022, 04:41:47 pm »
Quote from SierraKen: "Pretty nice Steve, thanks. But I'm curious, why no STOP, PAUSE, or RESUME examples?"

I figured you'd ask about those.  ;)

Truth is, I'd thought about adding all those features to the library, but then I figured it was just an added layer of complexity that most people won't need.

The problem with Pause, Stop, Resume, Cancel is that you have to run the routines with a SHELL _DONTWAIT command, which comes with a few issues attached.

Let's say I have code like this:

DIM out(10) As String
out(1) = "Hello World"
out(2) = "My name is Steve"
out(3) = "I like cheese"
FOR I = 1 to 3
   Speech_say out(i)
NEXT

As things currently exist, the program will say the lines one after the other, and wait for each line to finish before moving on to the next one.

If I write the routine to SHELL _DONTWAIT by default, then it'd start playing the first line, move on and start playing the second line, and then move on and play the third line all at the same time.
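
To hear the overlap for yourself, here's a rough sketch that skips the library and shells out to PowerShell directly. The command string is only a guess at the general shape of such a call (it uses System.Speech's SpeechSynthesizer) and is not necessarily what TextToSpeech.BM runs, and SpeakCommand$ is a made-up helper name. The point is just the difference between SHELL and SHELL _DONTWAIT.

```qb64
DIM phrase(1 TO 3) AS STRING
DIM i AS INTEGER

phrase(1) = "Hello World"
phrase(2) = "My name is Steve"
phrase(3) = "I like cheese"

' Blocking: each SHELL call returns only after PowerShell exits,
' so every phrase finishes before the next one starts.
FOR i = 1 TO 3
    SHELL _HIDE SpeakCommand$(phrase(i))
NEXT

' Non-blocking: each SHELL call returns immediately,
' so all three phrases start almost at once and talk over each other.
FOR i = 1 TO 3
    SHELL _DONTWAIT _HIDE SpeakCommand$(phrase(i))
NEXT

' Hypothetical helper: builds a PowerShell one-liner that speaks txt.
' Assumes txt contains no single or double quotes.
FUNCTION SpeakCommand$ (txt AS STRING)
    DIM q AS STRING
    q = CHR$(34) ' double quote character
    SpeakCommand$ = "powershell -Command " + q + "Add-Type -AssemblyName System.Speech; (New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('" + txt + "')" + q
END FUNCTION
```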

You'd need some way to check whether a line is still being spoken, so you could add in something like:

FOR I = 1 to 3
   Speech_say out(i)
   While Speech_speaking: Wend 'Wait for speech to finish before moving on to the next line.
NEXT


At that point, you might as well add handles so you can control each speech instance individually... 

FOR I = 1 to 3
   handle = Speech_say (out(i))
   While Speech_speaking (handle): Wend 'Wait for speech to finish before moving on to the next line.
NEXT

And of course, you'd want to make certain to clean up and free handles and such afterwards:

FOR I = 1 to 3
   handle = Speech_say (out(i))
   While Speech_speaking (handle): Wend 'Wait for speech to finish before moving on to the next line.
   Speech_Close handle
NEXT

And in the end, it just makes things a bit more flexible but a lot more complex. 

Honestly, it's probably something I'll end up adding to the library over time, in future updates, but I've learned from past experience:  It's best to start things off as simple as possible and then expand from there as people adjust and learn the library.  It's the same way I developed the SaveImage library -- originally it was just a SaveBMP library.  Then I added text screen captures.  Then PNG support.  GIF came a little later, and then JPG files...  Image dithering and color reductions came after that....

Start out with the very basics so folks can learn the library and make use of it -- and learn its limitations -- and then over time add and extend functionality as more people realize they need it and are willing to make use of it. 



And that's why there's basically no support for Pause, Resume, Stop.... (yet).  ;D

Offline SierraKen

Re: Windows Text To Speech Library
« Reply #7 on: January 27, 2022, 07:02:07 pm »
Thanks Steve. :) Yeah, I ran into the overlapping with my LCD clock (different thread). Taking out the _DONTWAIT stops the overlapping. Of course the bouncing ball also stops while it speaks, but there wasn't much I could do about that when calling the same speak SUB twice. I tried to say it all in one call, but I found out that the speak sub has a limit on how much text you can throw into one command line at once.
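
One possible workaround for the length limit is to break long text at spaces and speak it piece by piece. This is only a sketch: SayInChunks is a made-up name, and the 60 character limit is an arbitrary number chosen for the demo, since the real limit isn't stated anywhere in this thread.

```qb64
Speech_IoR
SayInChunks "This is a fairly long passage that gets broken into shorter pieces so that no single call has to carry all of the text at once.", 60

' Hypothetical helper: speaks txt in pieces of at most maxLen characters,
' breaking at a space where possible.
SUB SayInChunks (txt AS STRING, maxLen AS LONG)
    DIM remaining AS STRING
    DIM cut AS LONG

    IF maxLen < 1 THEN EXIT SUB
    remaining = _TRIM$(txt)

    DO WHILE LEN(remaining) > maxLen
        ' Walk backwards from just past the limit to find the last space.
        cut = maxLen + 1
        DO WHILE cut > 0
            IF MID$(remaining, cut, 1) = " " THEN EXIT DO
            cut = cut - 1
        LOOP
        IF cut = 0 THEN cut = maxLen ' no space found: hard split at the limit

        Speech_Say RTRIM$(LEFT$(remaining, cut))
        remaining = LTRIM$(MID$(remaining, cut + 1))
    LOOP

    IF LEN(remaining) > 0 THEN Speech_Say remaining
END SUB

'$INCLUDE:'TextToSpeech.BM'
```

Because Speech_Say waits for each piece to finish, the pieces play back to back, though you may hear a short gap between them.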

Offline SierraKen

Re: Windows Text To Speech Library
« Reply #8 on: January 27, 2022, 07:05:22 pm »
On a side note, I also found out that you can't use the ' symbol (apostrophe), as in the word today's. For some reason the command won't run with it.
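
A likely culprit is that the text ends up inside a single-quoted PowerShell string, where a lone ' ends the string early. That's an assumption about how the library builds its command, not something confirmed here. In PowerShell, a literal apostrophe inside a single-quoted string is written as two apostrophes, so a small helper (EscapeApostrophes$ is a made-up name) could double them before speaking:

```qb64
PRINT EscapeApostrophes$("today's") ' prints today''s

' Hypothetical helper: doubles every apostrophe in txt so the text
' can sit inside a single-quoted PowerShell string.
FUNCTION EscapeApostrophes$ (txt AS STRING)
    DIM result AS STRING
    DIM ch AS STRING
    DIM i AS LONG

    FOR i = 1 TO LEN(txt)
        ch = MID$(txt, i, 1)
        IF ch = "'" THEN
            result = result + "''"
        ELSE
            result = result + ch
        END IF
    NEXT

    EscapeApostrophes$ = result
END FUNCTION
```

With the library included, usage would be something like Speech_Say EscapeApostrophes$("Today's forecast is sunny."). If the library turns out to quote the text some other way, simply stripping the apostrophes is a cruder fallback.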