Extract data, not headaches
No more squinting at PDFs with a spreadsheet open in a split screen. mapped's Multimodal Extraction Engine reads research papers — tables, figures, footnotes, scanned pages — and outputs structured, validated data straight into Google Sheets.
Updated April 2026
Data Extraction
Systematic Data Collection
Key Capabilities
Multimodal PDF Extraction
The Multimodal Extraction Engine reads PDFs visually and textually — not just OCR'd text. It understands that a number under a '95% CI' header is a confidence interval, not a p-value. Tables, figures, and supplementary text are processed as document structure, not flattened pages.
Complex Table Handling
Multi-level column headers map values to the right variable. Merged spanning cells apply correctly to rows beneath. Footnotes and asterisk annotations are captured with their explanations. Tables continuing across pages are unified. Landscape orientation works without configuration.
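To make the multi-level header idea concrete, here is a minimal sketch of how stacked header rows could be flattened into one label per column. It assumes a merged spanning cell arrives as a value followed by empty cells to its right. The function name `flatten_headers` and the sample table are hypothetical and do not describe mapped's internals.

```python
def flatten_headers(header_rows):
    """Combine stacked header rows into one label per column.

    Assumption: in every row except the last, an empty cell (None or "")
    continues the merged cell to its left. The last row holds leaf labels
    and is taken as-is.
    """
    filled = []
    for level, row in enumerate(header_rows):
        is_leaf = level == len(header_rows) - 1
        last_seen = ""
        out = []
        for cell in row:
            if cell:
                last_seen = cell
            # Leaf cells are never forward-filled; spanning cells are.
            out.append((cell or "") if is_leaf else last_seen)
        filled.append(out)
    # Join the non-empty levels for each column, top level first.
    return [" / ".join(p for p in parts if p) for parts in zip(*filled)]


# Hypothetical two-level header with two spanning cells.
headers = flatten_headers([
    ["Characteristic", "Intervention", None, "Control", None],
    [None, "Mean", "SD", "Mean", "SD"],
])
print(headers)
```

Each value in the table body can then be keyed by its flattened label, so an SD under "Control" is not confused with an SD under "Intervention".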
Scanned-Document Support
Older trials and grey literature often arrive as scanned PDFs. mapped's extraction pipeline runs OCR and then the multimodal engine, so a scanned 1995 RCT report extracts cleanly into the same structured template as a born-digital 2026 paper. Scanned-document support was added in January 2026.
Live Google Sheets Integration
Extracted data flows directly into Google Sheets. Your extraction team edits collaboratively in real time with Google's native version history, comments, and conditional formatting. Export to Excel, CSV, or PDF whenever you need a snapshot. No separate tooling, no learning curve.
Confidence-Tiered Human Review
Every extracted value carries a confidence tier. High: the value is unambiguous and only needs confirmation. Medium: the value was extracted with some contextual ambiguity and needs careful review. Low: the value is uncertain, often because of complex formatting, and requires manual verification. No value enters the final dataset without explicit human confirmation, and the audit trail records who confirmed what, and when.
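The review rule described here can be sketched as a small data model: each value has a tier, and only explicitly confirmed values reach the final dataset, whatever their tier. All names below (`Tier`, `ExtractedValue`, `final_dataset`) are hypothetical illustrations, not mapped's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    HIGH = "high"      # unambiguous, just confirm
    MEDIUM = "medium"  # some context ambiguity, review carefully
    LOW = "low"        # uncertain, verify manually


@dataclass
class ExtractedValue:
    name: str
    value: str
    tier: Tier
    confirmed_by: Optional[str] = None
    confirmed_at: Optional[datetime] = None

    def confirm(self, reviewer: str) -> None:
        """Record who confirmed the value and when (the audit trail)."""
        self.confirmed_by = reviewer
        self.confirmed_at = datetime.now(timezone.utc)


def final_dataset(values: List[ExtractedValue]) -> List[ExtractedValue]:
    """Keep only values a human has confirmed; the tier alone never suffices."""
    return [v for v in values if v.confirmed_by is not None]


values = [
    ExtractedValue("mean_age", "54.2", Tier.HIGH),
    ExtractedValue("sd_age", "9.1", Tier.LOW),
]
values[1].confirm("reviewer_a")
confirmed = final_dataset(values)
```

Note that the high-tier value is still excluded until someone confirms it: in this sketch the tier guides how much attention a value needs, not whether it needs review.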
Custom Extraction Templates
Standardize what gets extracted across all studies: study characteristics, population, intervention, comparator, outcomes, effect measures, and quality indicators. Templates are set at the project level, so the entire team extracts consistently, and adding a new field mid-review backfills it across already-extracted studies, not just new ones.
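At the data level, backfilling a new template field might look like the sketch below: the field is added to the project template, every already-extracted study gets an empty slot for it, and the studies still awaiting a value are returned for re-extraction. The function `add_field` and the row layout are assumptions made for illustration only.

```python
def add_field(template, studies, new_field):
    """Add new_field to the project template and backfill existing studies.

    Returns the IDs of studies whose value for new_field is still pending.
    None marks "not yet extracted" in this sketch.
    """
    if new_field not in template:
        template.append(new_field)
    for row in studies:
        row.setdefault(new_field, None)
    return [row["study_id"] for row in studies if row[new_field] is None]


# Hypothetical project state before the change.
template = ["study_id", "population"]
studies = [
    {"study_id": "S1", "population": "adults"},
    {"study_id": "S2", "population": "children"},
]
pending = add_field(template, studies, "comparator")
```

Using `setdefault` means a study that already has a value for the field keeps it, so running the backfill twice does not overwrite earlier work.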
Every value, back to its source.
Source quote. Page reference. Confidence rating. Cited conversion method — yes, including Wan et al. 2014, Scenario 3. Every cell. Every time. Reviewer queries close themselves.
“Haptoglobin (median, g/L) for CBA group at Day 1 was approximately 0.8 (range bars approximately 0.3 to 1.2). Figure 1 text: 'Haptoglobin (median, g/L) P < 0.001 … CBA group at Day 1.'”
Median (IQR) → Mean ± SD
Method: Wan et al. 2014, Scenario 3 (Q1/M/Q3)
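For readers who want to check such a conversion by hand, the sketch below applies the Scenario 3 estimators from Wan et al. 2014: the mean is the average of Q1, the median, and Q3, and the SD is the interquartile range divided by twice a normal quantile that depends on the sample size n. The function name and the sample numbers are hypothetical, and this is not necessarily how mapped implements the conversion.

```python
from statistics import NormalDist


def wan2014_scenario3(q1, median, q3, n):
    """Estimate (mean, sd) from Q1, median, Q3 and sample size n.

    Wan et al. 2014, Scenario 3:
      mean ~ (q1 + median + q3) / 3
      sd   ~ (q3 - q1) / (2 * Phi^-1((0.75 n - 0.125) / (n + 0.25)))
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not q1 <= median <= q3:
        raise ValueError("expected q1 <= median <= q3")
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


# Hypothetical input: median 12 (IQR 10 to 16) in a group of 50.
mean, sd = wan2014_scenario3(10.0, 12.0, 16.0, 50)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```

For large n the divisor approaches about 1.35, which is the familiar IQR/1.35 rule of thumb. The estimate assumes roughly normal data, so heavily skewed outcomes deserve a note in the extraction record.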
Frequently asked questions
- What is mapped's Data Extraction?
- It's an AI-powered pipeline that turns research-paper PDFs into structured spreadsheets. The Multimodal Extraction Engine reads tables, figures, footnotes, and scanned pages, then routes every value through human confirmation with confidence-tiered review and live Google Sheets sync.
- Does mapped handle scanned PDFs?
- Yes. The pipeline runs OCR on scanned documents and feeds the result through the same multimodal engine. Older trials and grey literature extract into the same structured template as born-digital papers. Scanned-PDF support was added in the January 2026 release, which also brought a 15% accuracy improvement.
- How does mapped compare to EPPI-Reviewer for extraction?
- EPPI-Reviewer is a respected full-pipeline tool with strong qualitative-research features. mapped focuses on AI-driven multimodal PDF extraction, with live Google Sheets collaboration and per-value confidence tiers, which makes it a smoother fit for medical and quantitative reviews. In short, EPPI-Reviewer is stronger on qualitative coding; mapped is stronger on PDF-to-table accuracy and team collaboration.
- Who is Data Extraction for?
- Anyone extracting structured data from research papers for systematic reviews, meta-analyses, scoping reviews, or evidence syntheses. It is particularly valuable for teams of two or more, where collaborative editing matters, and for reviews with complex tables that can take an hour each to transcribe manually.
- How much does Data Extraction cost?
- The free tier includes basic data extraction for one active project. The Mapped Project tier (list $119/project, currently $79 launch pricing) unlocks the full Multimodal Extraction Engine, scanned-PDF support, custom templates, and live Google Sheets sync. Custom Enterprise plans add unlimited projects. See mappedresearch.com/pricing.
- Can I export extracted data to Excel or statistical software?
- Yes. Extracted data lives in Google Sheets natively, and exports cleanly to Excel, CSV, or PDF. From there it flows into R, Stata, SPSS, or directly into mapped's own meta-analysis module without re-entry.
Comparing tools? See how mapped stacks up against EPPI-Reviewer on the workflow you actually run.