Data Extraction
Transform research PDFs into structured, editable spreadsheets with AI-powered extraction, Google Sheets integration, and human validation.
PDFs in. Structured spreadsheets out. No more manual copy-paste marathons.
Data extraction is the bridge between your included studies and your analysis. It's also one of the most error-prone steps in a systematic review: values transcribed by hand from PDFs into spreadsheets pick up mistakes that propagate through your entire analysis. mapped's Multimodal Extraction Engine reads each paper's text and visual layout together, much as a human reader would, and outputs structured data ready for validation.
How AI Extraction Works
mapped's Multimodal Extraction Engine processes each PDF both visually and textually. The workflow runs in five steps:
- Upload your included studies — PDF or full-text format
- Define your extraction template — specify which data points you need (study characteristics, interventions, outcomes, effect sizes, confidence intervals, etc.)
- AI extracts data — the engine reads the full document, including tables, figures, and supplementary text
- Review and validate — every extracted value is presented for human confirmation before entering your dataset
- Export to spreadsheet — validated data flows into your extraction table
The engine doesn't just scrape text — it understands document structure. It knows that a number in a table cell under "95% CI" is a confidence interval, not a p-value.
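To make the "95% CI" point concrete, here is a minimal sketch of header-aware interpretation. It is not mapped's implementation: the function name `interpret_cell`, the regular expressions, and the returned dictionary shape are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical pattern: two numbers separated by "to", a comma, or a dash.
_INTERVAL = re.compile(r"(-?\d+(?:\.\d+)?)\s*(?:to|,|-|–)\s*(-?\d+(?:\.\d+)?)")
_NUMBER = re.compile(r"\d*\.\d+|\d+")


def interpret_cell(header: str, text: str) -> Optional[dict]:
    """Decide what a table cell means from the column header above it."""
    header_key = header.lower()

    # A column labelled "95% CI" holds an interval with two bounds.
    if re.search(r"\bci\b|confidence interval", header_key):
        match = _INTERVAL.search(text)
        if match is None:
            return None
        return {
            "kind": "confidence_interval",
            "lower": float(match.group(1)),
            "upper": float(match.group(2)),
        }

    # A column labelled "P value" holds a single number.
    if re.search(r"\bp\b|p-value", header_key):
        match = _NUMBER.search(text)
        if match is None:
            return None
        return {"kind": "p_value", "value": float(match.group(0))}

    # Unknown header: leave the cell for a human to classify.
    return None


print(interpret_cell("95% CI", "0.82 to 1.94"))
print(interpret_cell("P value", "0.03"))
```

The same digits are read differently depending on the header, which is the behaviour the paragraph above describes. A real engine would also use layout and surrounding text, which this sketch ignores.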
Complex Table Handling
Research papers are notorious for complex tables: multi-level headers, merged cells, footnotes with asterisks, and data split across multiple pages. mapped handles:
- Multi-level column headers — correctly mapping values to the right variable
- Merged cells — understanding that a spanning cell applies to all rows beneath it
- Footnotes and annotations — capturing the symbols and their explanations
- Split tables — tables that continue across multiple pages are unified
- Landscape tables — orientation doesn't affect extraction
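Three of the cases above (multi-level headers, merged cells, split tables) can be sketched on a table that has already been read into rows of cells. This is an illustrative sketch, not mapped's engine: the helper names are hypothetical, and it assumes that `None` marks a cell covered by a span from the left (headers) or from above (body rows).

```python
from typing import Optional

Row = list[Optional[str]]


def flatten_headers(top: Row, bottom: Row) -> list[str]:
    """Combine a two-level header into one label per column."""
    labels: list[str] = []
    group: Optional[str] = None
    for upper, lower in zip(top, bottom):
        if upper is not None:
            group = upper  # a new spanning header starts here
        if group and lower:
            labels.append(f"{group} / {lower}")
        else:
            labels.append(lower or group or "")
    return labels


def fill_merged_cells(rows: list[Row], column: int) -> list[Row]:
    """Copy a spanning cell's value down into the blank cells beneath it."""
    filled: list[Row] = []
    last: Optional[str] = None
    for row in rows:
        row = list(row)  # copy so the input is not modified
        if row[column] is None:
            row[column] = last
        else:
            last = row[column]
        filled.append(row)
    return filled


def join_split_table(fragments: list[list[Row]]) -> list[Row]:
    """Stitch page fragments together, keeping the header row only once."""
    if not fragments or not fragments[0]:
        return []
    header = fragments[0][0]
    joined: list[Row] = [header]
    for fragment in fragments:
        repeats_header = bool(fragment) and fragment[0] == header
        joined.extend(fragment[1:] if repeats_header else fragment)
    return joined


top = ["Study", "Intervention", None, "Control", None]
bottom = [None, "Mean", "SD", "Mean", "SD"]
print(flatten_headers(top, bottom))
```

Flattening the header gives every value an unambiguous variable name such as "Intervention / SD", which is what mapping values "to the right variable" amounts to in practice.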
Google Sheets Integration
Extracted data flows directly into Google Sheets, enabling:
- Real-time collaboration — your extraction team can review, edit, and validate data simultaneously
- Version history — Google's automatic versioning means you never lose an edit
- Familiar interface — no learning curve; it's the spreadsheet environment researchers already know
- Formula support — add calculations, conditional formatting, or data validation rules
- Export flexibility — download as Excel, CSV, or PDF at any time
Human Validation Loop
AI extracts. You verify. This is non-negotiable.
Every AI-extracted value is flagged with a confidence indicator:
| Confidence | What it means | Action needed |
|---|---|---|
| High | Clear, unambiguous value in the source | Quick confirmation |
| Medium | Value extracted but context is ambiguous | Careful review against source |
| Low | Uncertain extraction, possibly from complex formatting | Manual verification required |
No value enters your final dataset without explicit human confirmation. This loop ensures that your meta-analysis input is clean, accurate, and defensible.
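The rule can be expressed as a small gate. The sketch below is an assumption about how such a loop could be modelled, not mapped's code: `ExtractedValue`, `review_queue`, and `final_dataset` are hypothetical names, and reviewing low-confidence values first is an illustrative design choice.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Reviewer action for each level, taken from the table above.
ACTION = {
    Confidence.HIGH: "Quick confirmation",
    Confidence.MEDIUM: "Careful review against source",
    Confidence.LOW: "Manual verification required",
}


@dataclass
class ExtractedValue:
    name: str
    value: str
    confidence: Confidence
    confirmed: bool = False  # set to True only by a human reviewer


def review_queue(values: list[ExtractedValue]) -> list[ExtractedValue]:
    """Return unconfirmed values, least certain first (an assumed ordering)."""
    order = {Confidence.LOW: 0, Confidence.MEDIUM: 1, Confidence.HIGH: 2}
    pending = [v for v in values if not v.confirmed]
    return sorted(pending, key=lambda v: order[v.confidence])


def final_dataset(values: list[ExtractedValue]) -> dict[str, str]:
    """Only confirmed values pass, whatever their confidence level."""
    return {v.name: v.value for v in values if v.confirmed}


values = [
    ExtractedValue("sample_size", "120", Confidence.HIGH),
    ExtractedValue("sd_control", "4.1", Confidence.LOW),
]
values[0].confirmed = True  # the reviewer checked this value against the PDF
print(final_dataset(values))
```

Note that a high confidence indicator never sets `confirmed` on its own. Confidence only changes how much effort the review takes, not whether it happens.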
Extraction Template Builder
mapped provides a structured template builder so you can standardize what gets extracted across all studies:
- Study characteristics — author, year, country, setting, study design
- Population — sample size, demographics, inclusion criteria
- Intervention details — type, dose, duration, delivery method
- Comparator details — control group specifics
- Outcomes — primary and secondary, measurement methods, timepoints
- Effect measures — means, standard deviations, odds ratios, hazard ratios, confidence intervals
- Quality indicators — funding source, conflict of interest declarations
Custom fields can be added for domain-specific data points. The template ensures consistency across your entire team.
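As a rough illustration of what a standardized template buys you, the sketch below models a template as a list of named fields and checks a study record against it. The classes `TemplateField` and `ExtractionTemplate` and the field names are hypothetical, not mapped's data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TemplateField:
    name: str
    category: str
    required: bool = True


@dataclass
class ExtractionTemplate:
    fields: list[TemplateField] = field(default_factory=list)

    def add_custom_field(self, name: str, required: bool = False) -> None:
        """Append a domain-specific field under a 'Custom' category."""
        self.fields.append(TemplateField(name, "Custom", required))

    def missing(self, record: dict[str, object]) -> list[str]:
        """List required fields that are absent or empty in a study record."""
        return [
            f.name
            for f in self.fields
            if f.required and record.get(f.name) in (None, "")
        ]


template = ExtractionTemplate([
    TemplateField("author", "Study characteristics"),
    TemplateField("year", "Study characteristics"),
    TemplateField("sample_size", "Population"),
    TemplateField("funding_source", "Quality indicators", required=False),
])
template.add_custom_field("biomarker_assay")

# Placeholder record for illustration only.
record = {"author": "Example et al.", "year": 2021, "sample_size": ""}
print(template.missing(record))  # ['sample_size']
```

Because every reviewer fills in the same fields, gaps like the empty sample size surface per study instead of at analysis time.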
Handling Supplementary Materials
Researchers increasingly publish detailed data in supplementary files, appendices, and online-only materials. mapped's extraction engine processes supplementary PDFs alongside the main manuscript, so data reported only in a supplement is not overlooked.
Why This Step Matters
"Garbage in, garbage out" applies directly to meta-analysis. A single mistyped standard deviation or incorrectly transcribed sample size can distort your pooled estimate. mapped's combination of AI extraction, human validation, and structured templates ensures that the data feeding your analysis is accurate, complete, and traceable back to its source.
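A small worked example shows the size of the effect. The sketch below pools two hypothetical studies with a standard fixed-effect, inverse-variance model, then repeats the calculation with one standard deviation mistyped as 1.2 instead of 12. All numbers are invented for illustration, and the choice of a fixed-effect model is an assumption, not a statement about how mapped analyzes data.

```python
import math


def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two arms and its standard error."""
    md = m1 - m2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, se


def pooled_fixed_effect(effects):
    """Inverse-variance weighted average of (estimate, standard error) pairs."""
    weights = [1 / se**2 for _, se in effects]
    weighted_sum = sum(w * md for w, (md, _) in zip(weights, effects))
    return weighted_sum / sum(weights)


# Hypothetical studies: (mean, SD, n) for intervention, then control.
study_a = mean_difference(10, 12, 50, 15, 12, 50)
study_b = mean_difference(14, 12, 50, 15, 12, 50)

# Same study A, but the intervention SD was typed as 1.2 instead of 12.
study_a_typo = mean_difference(10, 1.2, 50, 15, 12, 50)

correct = pooled_fixed_effect([study_a, study_b])
distorted = pooled_fixed_effect([study_a_typo, study_b])
print(f"correct pooled estimate:   {correct:.2f}")
print(f"with one mistyped SD:      {distorted:.2f}")
```

With these inputs the pooled mean difference moves from -3.0 to roughly -3.66. The typo shrinks study A's standard error, so the study receives about twice the weight it should and pulls the pooled estimate toward its own result.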
Next step: With your data extracted, move to Quality Assessment →