Build Your Own Personal Voice Assistant with Python - Code Masala Bytes - Helping Developers Solve Real World Problems

Are you ready to create your own personal voice assistant? With Python, it is possible! In this blog post, we’ll show you how to create a personal voice assistant that can search Google, YouTube, and Wikipedia, perform calculations, open applications, and more, all through simple voice commands. The assistant takes advantage of Python’s versatility and uses several libraries and APIs to process commands and provide output.

Let’s dive in!

Why Python for Building a Voice Assistant?

Python is ideal for creating personal voice assistants due to its simplicity and large library ecosystem. You can easily integrate Google Speech Recognition, Google Text-to-Speech (gTTS), the WolframAlpha API for computation, and Selenium for web operations. This flexibility makes Python a good fit for developing custom automation scripts, including voice assistants.

Features of the Python Voice Assistant

With just a few lines of code, your personal voice assistant can:

Convert speech to text and vice versa.
Search Google, Wikipedia, or YouTube using voice commands.
Open applications installed on your system.
Perform mathematical calculations using the WolframAlpha API.

Required Libraries

Here is a list of the external Python packages you will need.

gTTS: Converts text to speech using Google Text-to-Speech.
SpeechRecognition: Recognizes speech and converts it to text (imported as speech_recognition).
Selenium: Automates browser activities such as web searches.
wolframalpha: Performs mathematical calculations through the WolframAlpha API.
playsound: Plays audio files from the system.

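Microphone input through SpeechRecognition also requires PyAudio, which you may need to install separately. You can install the libraries listed above with: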

Bash

pip install gtts speechrecognition selenium wolframalpha playsound
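After installing, it can help to confirm that each package is importable before running the assistant. The snippet below is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical, and the list uses import names, which differ from some pip names.

```python
from importlib.util import find_spec

# Import names, not pip names: "speechrecognition" is imported as "speech_recognition".
REQUIRED_MODULES = ["gtts", "speech_recognition", "selenium", "wolframalpha", "playsound"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, rerun the pip command above for that package.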

Step-by-Step Code Breakdown

1. Basic Setup: Speech to Text and Text to Speech. First, we need to set up the speech recognition and text-to-speech components. The code below accepts voice input, converts it to text, and speaks a response to the user.

Python

import os

import playsound
import speech_recognition as sr
import wolframalpha
from gtts import gTTS
from selenium import webdriver

def assistant_speaks(output):
    """Print the reply, then speak it aloud through a temporary MP3 file."""
    print("Assistant: ", output)
    to_speak = gTTS(text=output, lang='en', slow=False)
    to_speak.save("output.mp3")
    playsound.playsound("output.mp3", True)
    os.remove("output.mp3")

def get_audio():
    """Listen on the microphone and return the recognized text, or None on failure."""
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        audio = r.listen(source, phrase_time_limit=5)
    try:
        text = r.recognize_google(audio, language='en-US')
        print("You: ", text)
        return text
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the recognition service could not be reached.
        assistant_speaks("Sorry, I couldn't understand. Please try again!")
        return None

2. Main Program: User Interaction. In this section, we ask for the user’s name and then start a loop that listens for the user’s commands. The assistant keeps listening until the user says “Bye” or “Exit”.

Python

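if __name__ == "__main__":
    assistant_speaks("What's your name?")
    # get_audio() returns None when recognition fails, so fall back to a neutral greeting.
    name = get_audio() or "there"
    assistant_speaks("Hello, " + name + ". How can I assist you today?")

    while True:
        assistant_speaks("What can I do for you?")
        command = get_audio()
        if command is None:
            continue  # nothing was recognized, so ask again
        command = command.lower()

        if "bye" in command or "exit" in command:
            assistant_speaks("Goodbye!")
            break

        process_text(command)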
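The loop above needs a microphone and speakers, which makes it awkward to try out. One possible variation, sketched here under the assumption that the listening, speaking, and handling functions are passed in as arguments, lets you exercise the same loop with scripted or typed input. The name run_assistant and its parameters are hypothetical and not part of the original program.

```python
def run_assistant(listen, speak, handle):
    """Run the command loop until the user says 'bye' or 'exit'.

    listen() returns recognized text or None, speak(text) voices a reply,
    and handle(command) processes one lower-cased command.
    """
    while True:
        speak("What can I do for you?")
        heard = listen()
        if not heard:
            continue  # nothing recognized, so prompt again
        command = heard.lower()
        if "bye" in command or "exit" in command:
            speak("Goodbye!")
            break
        handle(command)
```

With the functions from this post, the voice version would be run_assistant(get_audio, assistant_speaks, process_text), while run_assistant(input, print, print) gives a keyboard-only dry run.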

3. Processing User Commands: The process_text() function routes each command to the right handler: web searches, calculations, or application launches.

Python

def process_text(text):
    """Route a lower-cased command to the matching handler by keyword."""
    if "search" in text or "play" in text:
        search_web(text)
    elif "calculate" in text:
        calculate_query(text)
    elif "open" in text:
        open_application(text)
    else:
        assistant_speaks("I can search the web for you. Do you want me to do that?")
        response = get_audio()
        # get_audio() returns None on failure, so guard before the substring test.
        if response and 'yes' in response.lower():
            search_web(text)
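process_text() calls calculate_query(), which is not defined anywhere in this post. The following is a minimal sketch of what it might look like, not the original implementation. It assumes the wolframalpha package's Client.query() interface, an App ID stored in a hypothetical WOLFRAMALPHA_APP_ID environment variable, and two illustrative parameters (client and speak) that let the function run without network access or audio.

```python
import os


def extract_expression(command):
    """Return the text after the word 'calculate' ('' if the word is absent)."""
    _, _, expression = command.partition("calculate")
    return expression.strip()


def calculate_query(command, client=None, speak=print):
    """Send the spoken expression to WolframAlpha and speak the first result."""
    expression = extract_expression(command)
    if not expression:
        speak("Please tell me what to calculate.")
        return None
    if client is None:
        # Imported lazily so the parsing helper works without the package.
        import wolframalpha
        # WOLFRAMALPHA_APP_ID is a hypothetical variable name; supply your own App ID.
        client = wolframalpha.Client(os.environ["WOLFRAMALPHA_APP_ID"])
    result = client.query(expression)
    try:
        answer = next(result.results).text
    except StopIteration:
        # WolframAlpha returned no result pods for this expression.
        speak("Sorry, I couldn't find an answer.")
        return None
    speak("The answer is " + answer)
    return answer
```

Inside the assistant you would pass speak=assistant_speaks so the answer is read aloud instead of printed.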

4. Web Search and Application Handling

The assistant can search the web and open applications like Chrome and Word using voice commands.

Python

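def search_web(query):
    """Open a browser on YouTube, Wikipedia, or Google for the spoken query."""
    # Selenium needs Firefox and its driver to be available on this machine.
    driver = webdriver.Firefox()
    # Dropping the first word removes the command verb (e.g. "search" or "play").
    terms = query.split()[1:]
    if 'youtube' in query:
        assistant_speaks("Opening YouTube")
        driver.get("https://www.youtube.com/results?search_query=" + '+'.join(terms))
    elif 'wikipedia' in query:
        assistant_speaks("Opening Wikipedia")
        driver.get("https://en.wikipedia.org/wiki/" + '_'.join(terms))
    else:
        assistant_speaks("Searching Google")
        driver.get("https://www.google.com/search?q=" + '+'.join(terms))

def open_application(query):
    """Launch a known application. os.startfile() is available on Windows only."""
    # The paths below are machine-specific; adjust them to your installation.
    if "chrome" in query:
        assistant_speaks("Opening Chrome")
        os.startfile("C:/Program Files (x86)/Google/Chrome/Application/chrome.exe")
    elif "word" in query:
        assistant_speaks("Opening Word")
        os.startfile("C:/ProgramData/Microsoft/Windows/Start Menu/Programs/Microsoft Office/Word.lnk")
    else:
        assistant_speaks("Sorry, I don't know how to open that application.")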
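search_web() builds its URLs by joining words with "+" or "_", which breaks for queries containing characters such as "+" or "&". The sketch below shows one way to build the same three URLs with proper escaping from the standard library. The helper name build_search_url and the LEADING_FILLER word list are assumptions made for illustration.

```python
from urllib.parse import quote, quote_plus

# Leading words that name the action or site rather than the topic (an assumed list).
LEADING_FILLER = {"search", "play", "open", "google", "youtube", "wikipedia", "for", "on"}


def build_search_url(command):
    """Return the URL that search_web() could open for a spoken command."""
    text = command.lower()
    words = text.split()
    # Strip command words from the front so only the topic remains.
    while words and words[0] in LEADING_FILLER:
        words.pop(0)
    if "youtube" in text:
        return "https://www.youtube.com/results?search_query=" + quote_plus(" ".join(words))
    if "wikipedia" in text:
        return "https://en.wikipedia.org/wiki/" + quote("_".join(words))
    return "https://www.google.com/search?q=" + quote_plus(" ".join(words))
```

A call such as driver.get(build_search_url(query)) could then replace the string concatenation in each branch of search_web().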

Wrapping It All Up

Once you have completed these steps, you can combine the functions into a fully functional personal voice assistant. Your assistant can:

Understand voice commands.
Respond verbally.
Search the web, open applications, and perform calculations.

You can further improve this project by integrating natural language processing (NLP) to understand more complex questions, or by adding more APIs for extra functionality.

Common voice commands you can use:

“Search Google for Python tutorials”
“Play popular YouTube encoded songs”
“Wikipedia Python programming language”
“Open Microsoft Word”
“Count 5+
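To see how commands like these are routed, the sketch below mirrors the keyword checks from process_text() in a side-effect-free function. The name route and the "fallback" label are hypothetical and used only for illustration.

```python
def route(command):
    """Mirror the keyword checks in process_text() and name the handler that would run."""
    text = command.lower()
    if "search" in text or "play" in text:
        return "search_web"
    if "calculate" in text:
        return "calculate_query"
    if "open" in text:
        return "open_application"
    # No keyword matched: the assistant offers a web search instead.
    return "fallback"


for command in [
    "Search Google for Python tutorials",
    "Wikipedia Python programming language",
    "Open Microsoft Word",
]:
    print(command, "->", route(command))
```

Because the checks are substring tests, a command such as "display settings" would also reach search_web, since "display" contains "play". Matching whole words is one way to tighten this.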

This project is highly extensible, so feel free to add features such as smart home integration or email management as your Python skills improve.

Conclusion

Building a personal voice assistant in Python is a great way to improve your programming skills and automate everyday tasks. With libraries like gTTS, SpeechRecognition, Selenium, and WolframAlpha, you can implement a wide range of functionalities. Happy coding!
