
Data Extraction

Write a structured prompt and approach for extracting specific data fields from unstructured text, documents, or web content.

The Prompt

# Data Extraction

You are a data engineering specialist and NLP workflow designer. Your task is to create a structured approach for extracting specific data from the described source material.

## Input Details

- **Source type:** [EMAILS / PDF_DOCUMENTS / WEB_PAGES / NEWS_ARTICLES / CONTRACTS / INVOICES / FORMS]
- **Data fields to extract:** [LIST_TARGET_FIELDS]
- **Volume:** [SINGLE_DOCUMENT / BATCH_OF_DOCUMENTS / ONGOING_STREAM]
- **Output format needed:** [JSON / CSV / TABLE / STRUCTURED_SUMMARY]
- **Tool or approach:** [PYTHON / AI_PROMPT / NO_CODE_TOOL / MANUAL_STRUCTURED]

## Instructions

Create a data extraction plan including:
1. **Extraction Schema** — define each target field with: field name, data type, example value, and extraction rule (where to find it in the source)
2. **Extraction Prompt** (for AI-based extraction) — a reusable, structured prompt that instructs the AI to extract the defined fields in the specified output format
3. **Edge Case Handling** — what to do when a field is missing, ambiguous, or has multiple values
4. **Validation Rules** — rules for checking extracted data quality (format, range, required vs. optional)
5. **Output Format Template** — the exact JSON, CSV header row, or table structure for the output
6. **Error Handling** — how to flag and handle extraction failures or low-confidence extractions
7. **Scaling Notes** — how to adapt this approach for batch or automated extraction

## Output Format

Provide, in this order: an extraction schema table, a reusable AI extraction prompt (copy-paste ready), an output format template, a validation rules list, edge case and error handling rules, and scaling notes.
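As an illustration outside the prompt text itself, the schema, edge case, and validation items in the instructions could be sketched in Python roughly as follows. The invoice fields, the `FieldSpec` class, and the `validate_record` helper are all hypothetical names chosen for this example. They are one possible shape for such a plan, not a prescribed implementation.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FieldSpec:
    """One row of a hypothetical extraction schema."""
    name: str
    dtype: type
    required: bool = True
    pattern: Optional[str] = None  # regex the value must fully match, if set


# Assumed example schema for an invoice source; adjust to your own fields.
INVOICE_SCHEMA = [
    FieldSpec("invoice_number", str, pattern=r"INV-\d{4,}"),
    FieldSpec("total_amount", float),
    FieldSpec("due_date", str, required=False, pattern=r"\d{4}-\d{2}-\d{2}"),
]


def validate_record(record: Dict[str, Any], schema: List[FieldSpec]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for spec in schema:
        value = record.get(spec.name)
        # Edge case: missing field. Only an error when the field is required.
        if value is None:
            if spec.required:
                problems.append(f"{spec.name}: missing required field")
            continue
        # Edge case: multiple values. Flag for review instead of guessing.
        if isinstance(value, list):
            problems.append(f"{spec.name}: multiple values, needs review")
            continue
        # Accept whole numbers where a float is expected (JSON has one number type).
        if spec.dtype is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, spec.dtype):
            problems.append(
                f"{spec.name}: expected {spec.dtype.__name__}, got {type(value).__name__}"
            )
            continue
        if spec.pattern and not re.fullmatch(spec.pattern, str(value)):
            problems.append(f"{spec.name}: value {value!r} does not match expected format")
    return problems


good = {"invoice_number": "INV-00123", "total_amount": 1250, "due_date": None}
bad = {
    "invoice_number": "123",
    "total_amount": "1,250",
    "due_date": ["2024-01-01", "2024-02-01"],
}
print(validate_record(good, INVOICE_SCHEMA))
for problem in validate_record(bad, INVOICE_SCHEMA):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a batch run collect every failure per document and route flagged records to manual review.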


How to use this prompt

1. Copy the prompt. Click "Copy Prompt" above to copy the full prompt text to your clipboard.

2. Replace the placeholders. Swap out anything in [BRACKETS] with your specific details.

3. Paste into GPT-4o. Open your preferred AI assistant and paste the prompt to get started.
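
If you fill the template programmatically, for example when generating one prompt per document in a batch, the placeholder step can be sketched in Python as below. The regular expression assumes placeholders are written as in this prompt: uppercase words with underscores, optionally separated by " / ". The helpers `fill_placeholders` and `unfilled` are hypothetical names for this sketch.

```python
import re
from typing import List

# Matches [LIST_TARGET_FIELDS] as well as [JSON / CSV / TABLE / STRUCTURED_SUMMARY].
PLACEHOLDER = re.compile(r"\[[A-Z_]+(?: / [A-Z_]+)*\]")


def fill_placeholders(template: str, values: List[str]) -> str:
    """Replace bracketed placeholders with values, in order of appearance."""
    found = PLACEHOLDER.findall(template)
    if len(found) != len(values):
        raise ValueError(
            f"template has {len(found)} placeholders, got {len(values)} values"
        )
    remaining = iter(values)
    # A function replacement avoids backslash-escape surprises in the values.
    return PLACEHOLDER.sub(lambda match: next(remaining), template)


def unfilled(text: str) -> List[str]:
    """List any placeholders still present, e.g. as a check before sending."""
    return PLACEHOLDER.findall(text)


template = (
    "- **Source type:** [EMAILS / PDF_DOCUMENTS]\n"
    "- **Data fields to extract:** [LIST_TARGET_FIELDS]"
)
filled = fill_placeholders(template, ["INVOICES", "invoice_number, total_amount"])
print(filled)
print(unfilled(filled))
```

Checking `unfilled` before sending the prompt catches the common mistake of leaving a bracketed option list in place.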