ultralytics/flickr_scraper

🚀 Introduction

The Flickr Scraper is a Python tool that gathers images from Flickr for building custom datasets, which is particularly useful for training Ultralytics YOLO models. It collects images matching your search criteria, streamlining dataset preparation for a range of computer vision tasks. Learn more about datasets in our blog post on the best computer vision datasets.

🌟 Key Features

  • Keyword Search: Find images on Flickr using specific keywords relevant to your project.
  • Direct Download: Easily download images to assemble your computer vision dataset.
  • Streamlined Data Collection: Simplify gathering training data for YOLO models.

🔧 Requirements

Ensure you have Python 3.7 or later installed. The necessary dependencies can be installed using pip:

pip install -U -r requirements.txt

Key packages include:

  • flickrapi: A Python wrapper for the Flickr API, essential for interacting with Flickr services. You can find more details on the flickrapi PyPI page.
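
If you want to confirm the requirements before running anything, a check along the following lines may help. This is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it only verifies the Python version stated above and that the flickrapi package can be found.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this README


def check_environment(required=("flickrapi",)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python %d.%d or later is required" % MIN_PYTHON)
    for name in required:
        # find_spec returns None when a top-level module cannot be found
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result means both checks passed; otherwise each entry names what is missing.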

πŸ› οΈ Installation

To set up the Flickr scraper on your local machine, follow these steps using Git:

# Clone the repository
git clone https://github.com/ultralytics/flickr_scraper

# Navigate to the project directory
cd flickr_scraper

# Install the required packages
pip install -U -r requirements.txt

βš™οΈ Running the Scraper

Before you begin scraping images:

  1. Get a Flickr API Key: Obtain your unique API key and secret by applying through Flickr's API key request page.
  2. Configure API Credentials: Insert your API key and secret into the flickr_scraper.py script:
# flickr_scraper.py
# Replace with your actual Flickr API key and secret
key = "YOUR_API_KEY"
secret = "YOUR_API_SECRET"
  3. Execute the Script: Run the script from your terminal, specifying your search query, the number of images to fetch (--n), and the --download flag to save them locally. Downloaded images are saved by default to the flickr_scraper/images/ directory, organized into subfolders based on the search query.

    Important: Be mindful of Flickr's API rate limits and terms of service. Excessive requests may lead to temporary or permanent blocking. Refer to the official Flickr API documentation for detailed usage guidelines.
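
One common way to stay polite toward a rate-limited API is to retry failed requests with an exponentially growing pause. The sketch below illustrates that idea only; it is not taken from flickr_scraper.py. The helper name call_with_backoff, the default of 4 attempts, and the 1-second base delay are all assumptions, and catching every Exception is a simplification because the exact error raised on throttling is not described here.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Pause 1x, 2x, 4x, ... the base delay before the next try
            sleep(base_delay * 2 ** attempt)
```

The sleep parameter exists so the pause can be swapped out when experimenting; in normal use the default time.sleep applies. Appropriate limits should come from the official Flickr API documentation rather than from these illustrative values.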

Example command to download 10 images matching 'honeybees on flowers':

python3 flickr_scraper.py --search 'honeybees on flowers' --n 10 --download

You should see output indicating the download progress:

0/10 https://live.staticflickr.com/21/38596887_40df118fd9_o.jpg
...
9/10 https://live.staticflickr.com/1770/43276172331_e779b8c161_o.jpg
Done. (4.1s)
All images saved to /Users/glennjocher/PycharmProjects/flickr_scraper/images/honeybees_on_flowers/
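
The URLs in the output above end in _o.jpg, Flickr's suffix for original-size images. As a rough picture of how such URLs might be gathered with flickrapi, here is a hedged sketch. It is not a copy of flickr_scraper.py: the function names collect_urls and search_flickr are hypothetical, and the per_page and sort values are illustrative choices. It relies on flickrapi's walk() generator and the url_o value of the Flickr API's extras parameter.

```python
def collect_urls(photos, n):
    """Return up to n original-size URLs from an iterable of photo records.

    Each record only needs a .get() method, which both dicts and the
    ElementTree elements yielded by flickrapi provide.
    """
    urls = []
    if n <= 0:
        return urls
    for photo in photos:
        url = photo.get("url_o")  # None when no original-size URL is exposed
        if url:
            urls.append(url)
            if len(urls) == n:
                break  # stop before requesting more results than needed
    return urls


def search_flickr(key, secret, query, n):
    """Search Flickr for query and return up to n image URLs (needs network)."""
    import flickrapi  # imported lazily so collect_urls works without it

    flickr = flickrapi.FlickrAPI(key, secret)
    photos = flickr.walk(text=query, extras="url_o", per_page=100, sort="relevance")
    return collect_urls(photos, n)
```

Keeping the filtering in collect_urls separate from the network call makes it possible to try the logic on plain dictionaries without an API key.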

The downloaded images will be available in the specified folder (e.g., images/honeybees_on_flowers/), ready for annotation, further processing, or direct use in training your models.
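
The folder name in the example suggests that spaces in the search query become underscores. A minimal sketch of that mapping follows, assuming nothing beyond what the example output shows; the helper name query_to_dir is hypothetical, and the actual script may treat other characters differently.

```python
from pathlib import Path


def query_to_dir(query, root="images"):
    """Map a search query to an output folder, e.g. 'a b' -> images/a_b."""
    return Path(root) / query.strip().replace(" ", "_")
```

For instance, query_to_dir("honeybees on flowers") yields the relative path images/honeybees_on_flowers.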

📜 Citation

If the Flickr Scraper tool helps your research or work, please consider citing it using the following DOI:

DOI

🤝 Contributing

Contributions are welcome! We value input from the community to fix bugs, add features, or improve documentation. Please see our Contributing Guide for details on how to get started. Don't forget to share your experiences and feedback by completing our Survey. Thank you 🙏 to all our contributors!

©️ License

Ultralytics provides two licensing options for this project:

  • AGPL-3.0 License: An OSI-approved open-source license, ideal for students and enthusiasts who wish to contribute and share improvements publicly. See the LICENSE file for details.
  • Enterprise License: Designed for commercial applications, this license allows for the integration of Ultralytics software and AI models into commercial products and services without the open-source requirements of AGPL-3.0. If your use case involves commercial deployment, please contact us through Ultralytics Licensing.

📬 Contact

For bug reports, feature suggestions, or contributions, please visit GitHub Issues. For broader questions and discussions about Ultralytics projects, join our active community on Discord! Explore the full range of our resources at Ultralytics Docs.

