@tkotthakota-adobe tkotthakota-adobe commented Dec 23, 2025

Adds bot protection detection and alerting to the Task Processor. The processor now identifies sites that are blocked by bot protection services (Cloudflare, Akamai, Imperva, etc.) and skips unnecessary processing for them.

    1. CloudWatch-Based Detection
      Single Source of Truth: Query CloudWatch logs from Content Scraper for bot protection events
      Filter Pattern: "[BOT-BLOCKED]" "${siteId}" with time-based filtering
      Time Window: Starts at onboardStartTime to capture all scraping activity during onboarding
    2. Fail-Fast Bot Protection Check
      Early Detection: Check for bot protection BEFORE fetching scrape results
      Abort Immediately: Return early with bot protection details when detected
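
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `buildBotBlockedFilterPattern`, `checkThenFetch`, `queryLogs`, `fetchScrapeResults`, and the shape of the returned object are all hypothetical names chosen for the example. Only the filter pattern and the use of `onboardStartTime` as the window start come from the description; ending the window at the current time is an assumption. In a real handler, `queryLogs` would presumably wrap a CloudWatch Logs query such as `FilterLogEventsCommand` from `@aws-sdk/client-cloudwatch-logs`, which accepts `filterPattern`, `startTime`, and `endTime`.

```javascript
// Sketch only: helper names and return shapes are assumptions, not the PR's code.

// Builds the filter pattern from the description. In CloudWatch Logs filter
// syntax, two quoted terms separated by a space must both appear in an event.
function buildBotBlockedFilterPattern(siteId) {
  return `"[BOT-BLOCKED]" "${siteId}"`;
}

// Fail-fast flow: query the logs first and return early when any
// bot protection event is found, so scrape results are never fetched.
// `queryLogs` and `fetchScrapeResults` are injected async functions (hypothetical).
async function checkThenFetch({
  siteId,
  onboardStartTime,
  queryLogs,
  fetchScrapeResults,
  now = Date.now,
}) {
  const events = await queryLogs({
    filterPattern: buildBotBlockedFilterPattern(siteId),
    startTime: onboardStartTime, // window start, per the description
    endTime: now(), // assumption: the window ends at the time of the check
  });

  if (events.length > 0) {
    // Abort immediately with the bot protection details.
    return { status: 'bot-protected', blockedCount: events.length, scrapeResults: null };
  }

  const scrapeResults = await fetchScrapeResults(siteId);
  return { status: 'ok', blockedCount: 0, scrapeResults };
}
```

Injecting `queryLogs` keeps the ordering (log check before scrape fetch) testable without AWS credentials. A caller would treat a `bot-protected` status as the signal to alert and stop.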

Logs:
"[BOT-BLOCKED] Bot protection detected"
https://spacecat.coralogix.com/#/query-new/logs?id=kHDgr5BOZMaJ9LAOhMZCV&page=0&startTime=1768321614979&endTime=1768408014979

[BOT-BLOCKED] Bot protection detected: 94 URLs blocked (from CloudWatch logs) for site https://abbvie.com

Tests:
https://cq-dev.slack.com/archives/C060T2PPF8V/p1768350318773359?thread_ts=1768348500.256659&cid=C060T2PPF8V

https://jira.corp.adobe.com/browse/SITES-37727

@tkotthakota-adobe tkotthakota-adobe marked this pull request as draft December 23, 2025 02:33
@github-actions

This PR will trigger no release when merged.

@codecov

codecov bot commented Dec 23, 2025

Codecov Report

❌ Patch coverage is 99.41406% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/tasks/opportunity-status-processor/handler.js | 95.38% | 3 Missing ⚠️ |

@tkotthakota-adobe tkotthakota-adobe marked this pull request as ready for review January 14, 2026 01:01