Welcome to the Gemini Live repository! This project streams audio, and optionally video or screen captures, in real time from your local device to Google Gemini using the Live API. With Gemini Live, you can interact with Gemini through both text and voice and receive conversational AI responses.
- Real-Time Audio Streaming: Stream audio directly from your device to Google Gemini.
- Video and Screen Capture Support: Optionally include video or screen captures in your streams.
- Conversational AI: Engage with Gemini using both text and voice, making your interactions more dynamic.
- Easy Setup: Simple installation and setup process for quick access.
- Cross-Platform: Works on various operating systems, including Windows, macOS, and Linux.
To get started with Gemini Live, you need to download the latest release. Visit the Releases section to find the necessary files. Download the appropriate package for your operating system and follow the installation instructions.
- Python 3.7 or higher
- Required libraries (see below)
You will need to install the following libraries:

```shell
pip install requests websocket-client opencv-python pyaudio
```
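After installing, you may want to confirm the packages resolved correctly before launching the app. The helper below is a minimal, hypothetical sketch (not part of the repository) that checks importability using only the standard library. Note that `opencv-python` imports as `cv2` and `websocket-client` as `websocket`:

```python
import importlib.util

def missing_modules(modules):
    """Return the subset of module names that cannot be imported."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

# opencv-python installs the "cv2" module; websocket-client installs "websocket".
required = ["requests", "websocket", "cv2", "pyaudio"]
print("Missing:", missing_modules(required))
```

An empty list means all dependencies are in place.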
1. Start the Application: Run the main script to initiate the streaming process.

   ```shell
   python main.py
   ```

2. Configure Settings: Edit the configuration file to set your preferences, including audio and video settings.

3. Begin Streaming: Once configured, you can start streaming to Google Gemini.

4. Interact with Gemini: Use voice commands or text inputs to engage with the AI.
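As a rough illustration of the streaming step: raw PCM audio is typically split into small chunks and sent over the websocket as base64-encoded JSON messages. The message shape and chunk size below are assumptions for illustration, not the exact Live API wire format:

```python
import base64
import json

CHUNK_SIZE = 1024  # bytes of raw 16-bit PCM per message (illustrative)

def encode_audio_chunk(pcm_bytes: bytes, mime_type: str = "audio/pcm;rate=16000") -> str:
    """Wrap one chunk of raw PCM audio in a JSON envelope (hypothetical shape)."""
    return json.dumps({
        "mime_type": mime_type,
        "data": base64.b64encode(pcm_bytes).decode("ascii"),
    })

def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Yield fixed-size slices of a byte buffer."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

# Example: split 3 KB of silence into three messages.
silence = bytes(3 * CHUNK_SIZE)
messages = [encode_audio_chunk(c) for c in chunks(silence)]
print(len(messages))  # 3
```

In the real application, each message would be passed to the websocket client's send method as the audio device produces data.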
```
gemini-live/
│
├── main.py            # Main application script
├── config.json        # Configuration file for settings
├── requirements.txt   # Required Python libraries
├── README.md          # Project documentation
└── assets/            # Additional assets (images, etc.)
```
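The `config.json` referenced above might look something like the following; the keys shown are illustrative assumptions, not the project's actual schema:

```json
{
  "audio": {
    "sample_rate": 16000,
    "channels": 1,
    "chunk_size": 1024
  },
  "video": {
    "enabled": false,
    "source": "camera"
  },
  "api": {
    "model": "gemini-2.0-flash-live"
  }
}
```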
This project covers various topics related to AI and real-time processing:
- gemini
- gemini-2-0-flash
- gemini-2-0-flash-live
- gemini-ai
- gemini-api
- gemini-flash
- google-genai
- google-generative-ai
- live
- live-video-processing
- python
- python-generative-ai
- realtime
- realtime-video-processor
- video-analysis
For detailed documentation on using Gemini Live, refer to the Wiki section. This will guide you through advanced features and troubleshooting tips.
We welcome contributions! If you want to help improve Gemini Live, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Push your branch to your forked repository.
- Create a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
If you encounter any issues or have questions, feel free to open an issue in the repository. You can also check the Releases section for updates and bug fixes.
- Q1 2024: Implement additional features for enhanced video processing.
- Q2 2024: Expand support for more audio formats.
- Q3 2024: Integrate machine learning capabilities for improved AI interactions.
- Thanks to the Google Gemini team for providing the Live API.
- Special thanks to all contributors and users who provide feedback.
For more information, visit the Releases section.
- Python
- OpenCV
- WebSocket
- PyAudio
Join our community on Discord or follow us on Twitter for updates and discussions. Your feedback is valuable and helps us improve.
Thank you for your interest in Gemini Live! We look forward to seeing how you use this project to enhance your interactions with Google Gemini. Happy streaming!