Any
⚙️ Technical
Intermediate
Build a Web Scraper
Design and write a complete web scraping solution for any data extraction task.
The Prompt
Help me build a web scraper for the following task: Target website: [URL or describe the type of site] Data to extract: [exactly what fields/information you need] Scale: [one-time scrape / daily / real-time / millions of pages] Programming language: [Python / Node.js / other] Output format: [CSV / JSON / database / other] Authentication required: [yes — describe login process / no] Anti-scraping measures present: [CAPTCHAs / rate limiting / JavaScript rendering / login walls / other] Legal/ethical context: [public data / own site testing / research / other] Provide a complete scraping solution: TECH STACK RECOMMENDATION: - Library/framework choice (BeautifulSoup / Scrapy / Playwright / Puppeteer / etc.) - Justification based on the site's characteristics - Additional tools needed COMPLETE CODE: - Full working scraper code - Data extraction logic with selectors - Error handling for missing fields - Rate limiting and polite crawling delays - Retry logic for failed requests - Output to specified format HANDLING ANTI-SCRAPING: - User agent rotation - Request header management - IP rotation approach (if needed) - CAPTCHA handling strategy - JavaScript rendering solution (if needed) DATA CLEANING: - Normalising extracted data - Handling missing or malformed values - Deduplication logic SCALING: - How to handle large volumes - Parallelisation approach - Storage strategy at scale MONITORING: - How to detect when the site structure changes - Alerting on extraction failures - Logging approach ETHICAL AND LEGAL NOTES: - robots.txt compliance check - Rate limiting best practices - Data storage and usage considerations
📝 Fill in the blanks
Replace these placeholders with your own content:
[URL or describe the type of site]
[exactly what fields/information you need]
[one-time scrape / daily / real-time / millions of pages]
[Python / Node.js / other]
[CSV / JSON / database / other]
[yes — describe login process / no]
[CAPTCHAs / rate limiting / JavaScript rendering / login walls / other]
[public data / own site testing / research / other]
How to use this prompt
1
Copy the prompt
Click "Copy Prompt" above to copy the full prompt text to your clipboard.
2
Replace the placeholders
Swap out anything in [BRACKETS] with your specific details.
3
Paste into Any
Open your preferred AI assistant and paste the prompt to get started.