The Flickr Scraper is a Python tool designed to help you gather images from Flickr for creating custom datasets, particularly useful for Ultralytics YOLO model training. Based on your search criteria, this tool simplifies the process of collecting relevant images for various computer vision tasks, streamlining your dataset preparation workflow. Learn more about datasets in our blog post on the best computer vision datasets.
- Keyword Search: Find images on Flickr using specific keywords relevant to your project.
- Direct Download: Easily download images to assemble your computer vision dataset.
- Streamlined Data Collection: Simplify gathering training data for YOLO models.
Ensure you have Python 3.7 or later installed. The necessary dependencies can be installed using pip:
pip install -U -r requirements.txt
Key packages include:
- flickrapi: A Python wrapper for the Flickr API, essential for interacting with Flickr services. You can find more details on the flickrapi PyPI page.
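For orientation, here is a minimal sketch of what a keyword search through flickrapi might look like. It is not taken from flickr_scraper.py: the helper names pick_url and search_urls are hypothetical, and the choice of size extras (url_o, url_l, url_c) is an assumption. Calling search_urls requires a valid Flickr API key and secret plus network access.

```python
import xml.etree.ElementTree as ET


def pick_url(photo, sizes=("url_o", "url_l", "url_c")):
    """Return the first URL attribute present on a photo element, or None.

    The attribute order (original, large, medium) is an illustrative choice.
    """
    for attr in sizes:
        url = photo.get(attr)
        if url:
            return url
    return None


def search_urls(key, secret, text, n=10):
    """Collect up to n image URLs for a text query (hypothetical helper)."""
    import flickrapi  # third-party; imported here so pick_url works without it

    flickr = flickrapi.FlickrAPI(key, secret)  # default response format is ElementTree
    urls = []
    # walk() pages through flickr.photos.search results and yields photo elements
    for photo in flickr.walk(text=text, extras="url_o,url_l,url_c", per_page=100, sort="relevance"):
        url = pick_url(photo)
        if url:
            urls.append(url)
        if len(urls) >= n:
            break
    return urls
```

Not every photo exposes an original-size URL, which is why the sketch falls back to smaller sizes instead of assuming url_o is always present.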
To set up the Flickr scraper on your local machine, follow these steps using Git:
# Clone the repository
git clone https://github.com/ultralytics/flickr_scraper
# Navigate to the project directory
cd flickr_scraper
# Install the required packages
pip install -U -r requirements.txt
Before you begin scraping images:
- Get a Flickr API Key: Obtain your unique API key and secret by applying here.
- Configure API Credentials: Insert your API key and secret into the flickr_scraper.py script:
# flickr_scraper.py
# Replace with your actual Flickr API key and secret
key = "YOUR_API_KEY"
secret = "YOUR_API_SECRET"
- Execute the Script: Run the script from your terminal, specifying your search query, the number of images to fetch (--n), and the --download flag to save them locally. Downloaded images are saved by default to the flickr_scraper/images/ directory, organized into subfolders based on the search query.

Important: Be mindful of Flickr's API rate limits and terms of service. Excessive requests may lead to temporary or permanent blocking. Refer to the official Flickr API documentation for detailed usage guidelines.
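One common way to stay polite toward a rate-limited API is to retry failed calls with an increasing delay. The sketch below is a generic exponential backoff helper, not something flickr_scraper.py is known to do; the name call_with_backoff, the retry count, and the delay values are all illustrative assumptions and do not reflect Flickr's actual limits.

```python
import time


def call_with_backoff(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt seconds, then retry.

    Re-raises the last exception once the retries are exhausted.
    The sleep argument is injectable so the helper can be tested without waiting.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

A download or search call can then be wrapped as call_with_backoff(lambda: some_request()), where some_request stands for whatever function performs the request.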
Example command to download 10 images matching 'honeybees on flowers':
python3 flickr_scraper.py --search 'honeybees on flowers' --n 10 --download
You should see output indicating the download progress:
0/10 https://live.staticflickr.com/21/38596887_40df118fd9_o.jpg
...
9/10 https://live.staticflickr.com/1770/43276172331_e779b8c161_o.jpg
Done. (4.1s)
All images saved to /Users/glennjocher/PycharmProjects/flickr_scraper/images/honeybees_on_flowers/
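To collect images for several classes in one go, the command can also be driven from Python. This is a minimal sketch that only uses the flags shown in this README (--search, --n, --download); the function names build_command and scrape_all are hypothetical, and the script path assumes the current directory is the cloned repository.

```python
import subprocess
import sys


def build_command(query, n=10, download=True, script="flickr_scraper.py"):
    """Build the argument list for one scraper run (hypothetical helper)."""
    cmd = [sys.executable, script, "--search", query, "--n", str(n)]
    if download:
        cmd.append("--download")
    return cmd


def scrape_all(queries, n=10):
    """Run the scraper once per query, stopping at the first failing run."""
    for query in queries:
        subprocess.run(build_command(query, n), check=True)
```

For example, scrape_all(["honeybees on flowers", "bumblebees"], n=10) would run the script twice in sequence. Passing the query as a single list element avoids shell quoting issues with spaces.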
The downloaded images will be available in the specified folder (e.g., images/honeybees_on_flowers/), ready for annotation, further processing, or direct use in training your models.
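Before annotating, it can help to take a quick inventory of what was downloaded. The following sketch lists image files in a folder and flags zero-byte files, which may indicate failed downloads. The helper names and the set of extensions are assumptions made for illustration, not part of the scraper.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, extend as needed)
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return sorted paths of image files located directly inside folder."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def find_empty(paths):
    """Return the paths whose file size is zero bytes."""
    return [p for p in paths if p.stat().st_size == 0]


if __name__ == "__main__":
    folder = Path("images/honeybees_on_flowers")  # example folder from this README
    if folder.is_dir():
        images = list_images(folder)
        print(f"{len(images)} images, {len(find_empty(images))} empty")
```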
If the Flickr Scraper tool helps your research or work, please consider citing it using the following DOI:
Contributions are welcome! We value input from the community to fix bugs, add features, or improve documentation. Please see our Contributing Guide for details on how to get started. Don't forget to share your experiences and feedback by completing our Survey. Thank you to all our contributors!
Ultralytics provides two licensing options for this project:
- AGPL-3.0 License: An OSI-approved open-source license, ideal for students and enthusiasts who wish to contribute and share improvements publicly. See the LICENSE file for details.
- Enterprise License: Designed for commercial applications, this license allows for the integration of Ultralytics software and AI models into commercial products and services without the open-source requirements of AGPL-3.0. If your use case involves commercial deployment, please contact us through Ultralytics Licensing.
For bug reports, feature suggestions, or contributions, please visit GitHub Issues. For broader questions and discussions about Ultralytics projects, join our active community on Discord! Explore the full range of our resources at Ultralytics Docs.