Update Apify log formatter to contain logger name (#116)
## Description
Since an Actor can contain many different loggers, it is useful to
include the logger name at the beginning of each log line.
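
The idea can be sketched with the standard library `logging` module alone. The snippet below is a minimal illustration, not the Apify SDK's actual formatter: the names `NamePrefixFormatter`, `build_handler`, and `demo` are hypothetical, and the real formatter also handles colors and the extra-fields suffix seen in the output below, which this sketch omits.

```python
import io
import logging


class NamePrefixFormatter(logging.Formatter):
    """Hypothetical formatter that prepends the logger name in brackets."""

    def format(self, record: logging.LogRecord) -> str:
        # The default Formatter renders just the message (plus traceback, if any).
        message = super().format(record)
        return f"[{record.name}] {record.levelname} {message}"


def build_handler(stream) -> logging.Handler:
    """Create a stream handler that uses the name-prefixing formatter."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(NamePrefixFormatter())
    return handler


def demo() -> list[str]:
    """Log through two differently named loggers and return the formatted lines."""
    stream = io.StringIO()
    handler = build_handler(stream)
    names = ["apify", "scrapy.core.engine"]
    for name in names:
        logger = logging.getLogger(name)
        logger.setLevel(logging.INFO)
        logger.addHandler(handler)
        logger.propagate = False  # keep the demo output isolated from root handlers

    logging.getLogger("apify").info("Initializing actor...")
    logging.getLogger("scrapy.core.engine").info("Spider opened")

    # Detach the handler so repeated calls do not duplicate output.
    for name in names:
        logging.getLogger(name).removeHandler(handler)
    return stream.getvalue().splitlines()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because the prefix comes from `record.name`, any third-party logger (Scrapy, Twisted, and so on) is labeled without changes on its side, as long as its records reach a handler that uses this formatter.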
## Example (Scrapy Actor)
### Before
```
$ apify run --purge
Info: All default local stores were purged.
Run: /home/vdusek/Apify/actor-templates/templates/python-scrapy/.venv/bin/python3 -m src
INFO Initializing actor...
INFO System info ({"apify_sdk_version": "1.1.4", "apify_client_version": "1.4.1", "python_version": "3.11.5", "os": "linux"})
INFO Actor is being executed...
INFO Scrapy 2.11.0 started (bot: titlebot)
INFO Versions: lxml 4.9.3.0, libxml2 2.10.3, cssselect 1.2.0, parsel 1.8.1, w3lib 2.1.2, Twisted 22.10.0, Python 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)], pyOpenSSL 23.2.0 (OpenSSL 3.1.2 1 Aug 2023), cryptography 41.0.3, Platform Linux-6.5.5-200.fc38.x86_64-x86_64-with-glibc2.37
INFO Enabled addons:
[] ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO Telnet Password: 1d21357fcef1a014
INFO Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.logstats.LogStats'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO Overridden settings:
{'BOT_NAME': 'titlebot',
'DEPTH_LIMIT': 1,
'NEWSPIDER_MODULE': 'src.spiders',
'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
'ROBOTSTXT_OBEY': True,
'SCHEDULER': 'src.apify.scheduler.ApifyScheduler',
'SPIDER_MODULES': ['src.spiders'],
'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
INFO Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats',
'src.apify.middlewares.ApifyRetryMiddleware'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO Enabled item pipelines:
['src.apify.pipelines.ActorDatasetPushPipeline',
'src.pipelines.TitleItemPipeline'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO Spider opened ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO TelnetConsole starting on 6023
INFO Telnet console listening on 127.0.0.1:6023 ({"crawler": "<scrapy.crawler.Crawler object at 0x7fc405aaf110>"})
INFO TitleSpider is parsing <200 https://apify.com>...
INFO TitleSpider is parsing <200 https://apify.com/templates>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/enterprise>...
INFO TitleSpider is parsing <200 https://crawlee.dev>...
INFO TitleSpider is parsing <200 https://apify.com/store>...
INFO TitleSpider is parsing <200 https://apify.com/actors>...
INFO TitleSpider is parsing <200 https://docs.apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/storage>...
INFO TitleSpider is parsing <200 https://apify.com/proxy>...
INFO TitleSpider is parsing <200 https://apify.com/integrations>...
INFO TitleSpider is parsing <200 https://apify.com/data-for-generative-ai?ref=top_nav>...
INFO TitleSpider is parsing <200 https://blog.apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/partners>...
INFO TitleSpider is parsing <200 https://apify.com/about>...
INFO TitleSpider is parsing <200 https://apify.com/ideas>...
INFO TitleSpider is parsing <200 https://apify.com/pricing>...
INFO TitleSpider is parsing <200 https://docs.apify.com>...
INFO TitleSpider is parsing <200 https://docs.apify.com/academy/web-scraping-for-beginners>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://docs.apify.com/academy/apify-platform>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/partners/actor-developers>...
INFO TitleSpider is parsing <200 https://apify.com/use-cases>...
INFO TitleSpider is parsing <200 https://apify.com/data-for-generative-ai>...
INFO TitleSpider is parsing <200 https://discord.com/invite/jyEM2PRvMU>...
INFO TitleSpider is parsing <200 https://apify.com/product-matching-ai>...
INFO Ignoring response <403 https://www.g2.com/products/apify/reviews>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO TitleSpider is parsing <200 https://apify.com/success-stories>...
INFO TitleSpider is parsing <200 https://console.apify.com/sign-in>...
INFO TitleSpider is parsing <200 https://apify.com/store/scrapers/universal-web-scrapers>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/enterprise>...
INFO TitleSpider is parsing <200 https://apify.com/>...
INFO TitleSpider is parsing <200 https://apify.com/streamers/youtube-scraper>...
INFO Ignoring response <403 https://www.trustradius.com/products/apify/reviews>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO TitleSpider is parsing <200 https://console.apify.com>...
INFO Ignoring response <403 https://crozdesk.com/it/platform-as-a-service-paas/apify>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO TitleSpider is parsing <200 https://console.apify.com/sign-up>...
INFO TitleSpider is parsing <200 https://apify.com/terms-of-use>...
INFO TitleSpider is parsing <200 https://apify.com/privacy-policy>...
INFO TitleSpider is parsing <200 https://apify.com/quacker/twitter-scraper>...
INFO TitleSpider is parsing <200 https://apify.com/cookie-policy>...
INFO TitleSpider is parsing <200 https://apify.com/apify/cheerio-scraper>...
INFO Ignoring response <403 https://www.capterra.com/reviews/150854/Apify>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO TitleSpider is parsing <200 https://apify.com/apify/web-scraper>...
INFO TitleSpider is parsing <200 https://docs.apify.com/cli/>...
INFO TitleSpider is parsing <200 https://apify.com/compass/crawler-google-places>...
INFO TitleSpider is parsing <200 https://consent.youtube.com/ml?continue=https://www.youtube.com/apify?cbrd%3D1&gl=CZ&hl=en&cm=2&pc=yt&src=1>...
INFO TitleSpider is parsing <200 https://apify.com/apify/puppeteer-scraper>...
INFO TitleSpider is parsing <200 https://github.com/apify>...
INFO TitleSpider is parsing <200 https://apify.com/junglee/amazon-crawler>...
INFO TitleSpider is parsing <200 https://apify.com/voyager/booking-scraper>...
INFO TitleSpider is parsing <200 https://help.apify.com/en/>...
INFO TitleSpider is parsing <200 https://stackoverflow.com/questions/tagged/apify>...
INFO Closing spider (finished) ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO Dumping Scrapy stats:
{'downloader/exception_count': 2,
'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 2,
'downloader/request_bytes': 21499,
'downloader/request_count': 84,
'downloader/request_method_count/GET': 84,
'downloader/response_bytes': 2156228,
'downloader/response_count': 84,
'downloader/response_status_count/200': 70,
'downloader/response_status_count/302': 4,
'downloader/response_status_count/308': 3,
'downloader/response_status_count/403': 6,
'downloader/response_status_count/404': 1,
'elapsed_time_seconds': 3.797489,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2023, 10, 2, 18, 8, 25, 118412, tzinfo=datetime.timezone.utc),
'httpcompression/response_bytes': 10134095,
'httpcompression/response_count': 73,
'httperror/response_ignored_count': 4,
'httperror/response_ignored_status_count/403': 4,
'item_scraped_count': 56,
'log_count/INFO': 71,
'memusage/max': 75239424,
'memusage/startup': 75239424,
'request_depth_max': 1,
'response_received_count': 77,
'robotstxt/forbidden': 2,
'robotstxt/request_count': 17,
'robotstxt/response_count': 17,
'robotstxt/response_status_count/200': 14,
'robotstxt/response_status_count/403': 2,
'robotstxt/response_status_count/404': 1,
'start_time': datetime.datetime(2023, 10, 2, 18, 8, 21, 320923, tzinfo=datetime.timezone.utc)} ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO Spider closed (finished) ({"spider": "<TitleSpider 'title_spider' at 0x7fc403a0c290>"})
INFO (TCP Port 6023 Closed)
INFO Exiting actor ({"exit_code": 0})
```
### After
```
$ apify run --purge
Info: All default local stores were purged.
Run: /home/vdusek/Apify/actor-templates/templates/python-scrapy/.venv/bin/python3 -m src
[apify] INFO Initializing actor...
[apify] INFO System info ({"apify_sdk_version": "1.1.4", "apify_client_version": "1.4.1", "python_version": "3.11.5", "os": "linux"})
[apify] INFO Actor is being executed...
[scrapy.utils.log] INFO Scrapy 2.11.0 started (bot: titlebot)
[scrapy.utils.log] INFO Versions: lxml 4.9.3.0, libxml2 2.10.3, cssselect 1.2.0, parsel 1.8.1, w3lib 2.1.2, Twisted 22.10.0, Python 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)], pyOpenSSL 23.2.0 (OpenSSL 3.1.2 1 Aug 2023), cryptography 41.0.3, Platform Linux-6.5.5-200.fc38.x86_64-x86_64-with-glibc2.37
[scrapy.addons] INFO Enabled addons:
[] ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[scrapy.extensions.telnet] INFO Telnet Password: 565a012bc27d0fc0
[scrapy.middleware] INFO Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.logstats.LogStats'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[scrapy.crawler] INFO Overridden settings:
{'BOT_NAME': 'titlebot',
'DEPTH_LIMIT': 1,
'NEWSPIDER_MODULE': 'src.spiders',
'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
'ROBOTSTXT_OBEY': True,
'SCHEDULER': 'src.apify.scheduler.ApifyScheduler',
'SPIDER_MODULES': ['src.spiders'],
'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
[scrapy.middleware] INFO Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats',
'src.apify.middlewares.ApifyRetryMiddleware'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[scrapy.middleware] INFO Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[scrapy.middleware] INFO Enabled item pipelines:
['src.apify.pipelines.ActorDatasetPushPipeline',
'src.pipelines.TitleItemPipeline'] ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[scrapy.core.engine] INFO Spider opened ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[scrapy.extensions.logstats] INFO Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[twisted] INFO TelnetConsole starting on 6023
[scrapy.extensions.telnet] INFO Telnet console listening on 127.0.0.1:6023 ({"crawler": "<scrapy.crawler.Crawler object at 0x7f9c770fac10>"})
[apify] INFO TitleSpider is parsing <200 https://apify.com>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/data-for-generative-ai?ref=top_nav>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/templates>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/enterprise>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/integrations>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/storage>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/actors>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/proxy>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/partners>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/partners/actor-developers>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/data-for-generative-ai>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/product-matching-ai>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/use-cases>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/about>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/success-stories>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/ideas>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/pricing>...
[apify] INFO TitleSpider is parsing <200 https://docs.apify.com/academy/web-scraping-for-beginners>...
[apify] INFO TitleSpider is parsing <200 https://docs.apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://docs.apify.com/academy/apify-platform>...
[apify] INFO TitleSpider is parsing <200 https://docs.apify.com>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://blog.apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/streamers/youtube-scraper>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/apify/web-scraper>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/compass/crawler-google-places>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/quacker/twitter-scraper>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/apify/cheerio-scraper>...
[apify] INFO TitleSpider is parsing <200 https://crawlee.dev>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/apify/puppeteer-scraper>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/store>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/enterprise>...
[apify] INFO TitleSpider is parsing <200 https://discord.com/invite/jyEM2PRvMU>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/>...
[scrapy.spidermiddlewares.httperror] INFO Ignoring response <403 https://crozdesk.com/it/platform-as-a-service-paas/apify>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[scrapy.spidermiddlewares.httperror] INFO Ignoring response <403 https://www.trustradius.com/products/apify/reviews>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[scrapy.spidermiddlewares.httperror] INFO Ignoring response <403 https://www.g2.com/products/apify/reviews>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[apify] INFO TitleSpider is parsing <200 https://apify.com/terms-of-use>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/privacy-policy>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/cookie-policy>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/junglee/amazon-crawler>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/voyager/booking-scraper>...
[apify] INFO TitleSpider is parsing <200 https://console.apify.com/sign-in>...
[apify] INFO TitleSpider is parsing <200 https://docs.apify.com/cli/>...
[apify] INFO TitleSpider is parsing <200 https://consent.youtube.com/ml?continue=https://www.youtube.com/apify?cbrd%3D1&gl=CZ&hl=en&cm=2&pc=yt&src=1>...
[scrapy.spidermiddlewares.httperror] INFO Ignoring response <403 https://www.capterra.com/reviews/150854/Apify>: HTTP status code is not handled or not allowed ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[apify] INFO TitleSpider is parsing <200 https://github.com/apify>...
[apify] INFO TitleSpider is parsing <200 https://console.apify.com>...
[apify] INFO TitleSpider is parsing <200 https://console.apify.com/sign-up>...
[apify] INFO TitleSpider is parsing <200 https://help.apify.com/en/>...
[apify] INFO TitleSpider is parsing <200 https://stackoverflow.com/questions/tagged/apify>...
[apify] INFO TitleSpider is parsing <200 https://apify.com/store/scrapers/universal-web-scrapers>...
[scrapy.core.engine] INFO Closing spider (finished) ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[scrapy.statscollectors] INFO Dumping Scrapy stats:
{'downloader/exception_count': 2,
'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 2,
'downloader/request_bytes': 21563,
'downloader/request_count': 84,
'downloader/request_method_count/GET': 84,
'downloader/response_bytes': 2156572,
'downloader/response_count': 84,
'downloader/response_status_count/200': 70,
'downloader/response_status_count/302': 4,
'downloader/response_status_count/308': 3,
'downloader/response_status_count/403': 6,
'downloader/response_status_count/404': 1,
'elapsed_time_seconds': 2.714584,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2023, 10, 2, 17, 56, 23, 977190, tzinfo=datetime.timezone.utc),
'httpcompression/response_bytes': 10134231,
'httpcompression/response_count': 73,
'httperror/response_ignored_count': 4,
'httperror/response_ignored_status_count/403': 4,
'item_scraped_count': 56,
'log_count/INFO': 71,
'memusage/max': 75722752,
'memusage/startup': 75722752,
'request_depth_max': 1,
'response_received_count': 77,
'robotstxt/forbidden': 2,
'robotstxt/request_count': 17,
'robotstxt/response_count': 17,
'robotstxt/response_status_count/200': 14,
'robotstxt/response_status_count/403': 2,
'robotstxt/response_status_count/404': 1,
'start_time': datetime.datetime(2023, 10, 2, 17, 56, 21, 262606, tzinfo=datetime.timezone.utc)} ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[scrapy.core.engine] INFO Spider closed (finished) ({"spider": "<TitleSpider 'title_spider' at 0x7f9c76fe43d0>"})
[twisted] INFO (TCP Port 6023 Closed)
[apify] INFO Exiting actor ({"exit_code": 0})
```
### After (screenshot with colored output)
