This project enables real-time streaming of audio (and optionally video or screen captures) from your local device to Google Gemini using the Live API. It allows you to interact with Gemini through both text and voice, supporting conversational AI responses.

Creature-112/gemini-live

🎥 Gemini Live: Real-Time Streaming to Google Gemini

Welcome to the Gemini Live repository! This project streams audio, and optionally video or screen captures, from your local device to Google Gemini in real time using the Live API. You can interact with Gemini through both text and voice and receive conversational AI responses.

🚀 Features

  • Real-Time Audio Streaming: Stream audio directly from your device to Google Gemini.
  • Video and Screen Capture Support: Optionally include video or screen captures in your streams.
  • Conversational AI: Engage with Gemini using both text and voice, making your interactions more dynamic.
  • Easy Setup: Simple installation and setup process for quick access.
  • Cross-Platform: Works on various operating systems, including Windows, macOS, and Linux.

📦 Installation

To get started with Gemini Live, download the latest release from the Releases section. Choose the package for your operating system and follow the installation instructions.

Prerequisites

  • Python 3.7 or higher
  • Required libraries (see below)

Required Libraries

You will need to install the following libraries:

pip install requests
pip install websocket-client
pip install opencv-python
pip install pyaudio
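
To confirm that the four packages above installed correctly, you can check whether their modules can be found. The following is a minimal sketch and not part of the project; `missing_modules` is a hypothetical helper name. Note that two import names differ from the package names: `websocket-client` is imported as `websocket`, and `opencv-python` as `cv2`.

```python
import importlib.util

# Maps each pip package listed above to the module name it installs.
REQUIRED = {
    "requests": "requests",
    "websocket-client": "websocket",
    "opencv-python": "cv2",
    "pyaudio": "pyaudio",
}


def missing_modules(required=REQUIRED):
    """Return the pip package names whose modules cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Any package reported as missing can be installed with the matching `pip install` command above.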

🔧 Usage

  1. Start the Application: Run the main script to initiate the streaming process.

    python main.py

  2. Configure Settings: Edit the configuration file to set your preferences, including audio and video settings.

  3. Begin Streaming: Once configured, you can start streaming to Google Gemini.

  4. Interact with Gemini: Use voice commands or text inputs to engage with the AI.
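
Streaming in step 3 amounts to cutting captured audio into small pieces and sending each piece over a connection. The sketch below covers only the framing part and uses the standard library, so it runs without a microphone. It assumes raw 16-bit mono PCM at 16 kHz. The chunk duration, the helper names (`split_pcm`, `encode_chunk`), and the JSON envelope are illustrative assumptions; they are not taken from this project's code and do not represent the Live API's actual message schema.

```python
import base64
import json

SAMPLE_RATE = 16_000   # samples per second (assumed)
SAMPLE_WIDTH = 2       # bytes per sample for 16-bit PCM (assumed)
CHUNK_MS = 100         # duration of each chunk in milliseconds (assumed)


def split_pcm(pcm: bytes, chunk_ms: int = CHUNK_MS):
    """Yield consecutive chunks of raw PCM, each at most chunk_ms long."""
    chunk_bytes = SAMPLE_RATE * SAMPLE_WIDTH * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def encode_chunk(chunk: bytes) -> str:
    """Wrap one chunk in a hypothetical JSON envelope with base64 audio."""
    return json.dumps({
        "mime_type": f"audio/pcm;rate={SAMPLE_RATE}",
        "data": base64.b64encode(chunk).decode("ascii"),
    })


if __name__ == "__main__":
    silence = bytes(SAMPLE_RATE * SAMPLE_WIDTH)  # one second of silence
    messages = [encode_chunk(c) for c in split_pcm(silence)]
    print(len(messages), "messages")
```

In a real run, each message would be sent through an open WebSocket connection as it is produced rather than collected in a list, and the audio would come from the microphone instead of a silent buffer.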

📈 Project Structure

gemini-live/
│
├── main.py             # Main application script
├── config.json         # Configuration file for settings
├── requirements.txt    # Required Python libraries
├── README.md           # Project documentation
└── assets/             # Additional assets (images, etc.)
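
This README does not document what `config.json` contains, so the keys below are hypothetical placeholders. As a minimal sketch, settings could be loaded with the standard library and merged over defaults, so that a missing file or a missing key does not stop the application. `load_config` and `DEFAULTS` are illustrative names, not the project's own.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real config.json keys are not documented here.
DEFAULTS = {
    "audio": {"sample_rate": 16000, "channels": 1},
    "video": {"enabled": False, "source": "camera"},
}


def load_config(path="config.json", defaults=DEFAULTS):
    """Return the defaults, overridden section by section by the JSON file.

    Assumes every top-level value in the file is a JSON object.
    """
    # Copy each section so the shared defaults are never mutated.
    config = {section: dict(values) for section, values in defaults.items()}
    file = Path(path)
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        for section, values in user.items():
            config.setdefault(section, {}).update(values)
    return config
```

With this approach, a `config.json` containing only `{"video": {"enabled": true}}` would switch video on while leaving every other setting at its default.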

🌐 Topics

This project covers various topics related to AI and real-time processing:

  • gemini
  • gemini-2-0-flash
  • gemini-2-0-flash-live
  • gemini-ai
  • gemini-api
  • gemini-flash
  • google-genai
  • google-generative-ai
  • live
  • live-video-processing
  • python
  • python-generative-ai
  • realtime
  • realtime-video-processor
  • video-analysis

📖 Documentation

For detailed documentation on using Gemini Live, refer to the Wiki section, which covers advanced features and troubleshooting tips.

🤝 Contributing

We welcome contributions! If you want to help improve Gemini Live, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Make your changes and commit them.
  4. Push your branch to your forked repository.
  5. Create a pull request.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

💬 Support

If you encounter any issues or have questions, feel free to open an issue in the repository. You can also check the Releases section for updates and bug fixes.

📅 Roadmap

  • Q1 2024: Implement additional features for enhanced video processing.
  • Q2 2024: Expand support for more audio formats.
  • Q3 2024: Integrate machine learning capabilities for improved AI interactions.

📸 Screenshots

Streaming Example

Configuration Settings

🎉 Acknowledgments

  • Thanks to the Google Gemini team for providing the Live API.
  • Special thanks to all contributors and users who provide feedback.

For more information, visit the Releases section.

🛠️ Tools Used

  • Python
  • OpenCV
  • WebSocket
  • PyAudio

📣 Community

Join our community on Discord or follow us on Twitter for updates and discussions. Your feedback is valuable and helps us improve.


Thank you for your interest in Gemini Live! We look forward to seeing how you use this project to enhance your interactions with Google Gemini. Happy streaming!
