Methodology

Technical notes on data sources, classification rules, metrics, and known limitations for the ATMP Research Platform.


What are ATMPs?

Advanced Therapy Medicinal Products (ATMPs) are a class of medicines regulated in the EU under Regulation (EC) No 1394/2007. They fall into four sub-classes:

Abbreviation | Full name | Description
GTMP | Gene Therapy Medicinal Product | Recombinant nucleic acid used to regulate, repair, replace, add, or delete a genetic sequence
sCTMP | Somatic Cell Therapy Medicinal Product | Cells that have been manipulated to change biological characteristics, or used to treat or diagnose disease
TEP | Tissue Engineered Product | Contains or consists of engineered cells or tissues; intended to repair, regenerate, or replace human tissue
cATMP | Combined ATMP | An ATMP that integrates a medical device in addition to cells or tissues

This platform covers research publications related to all four ATMP classes, identified via MeSH term classification rather than regulatory approval status. The corpus therefore includes basic research, translational studies, and clinical investigations, not only EMA-approved products.


MeSH ATMP classification

Source vocabulary

The identification strategy uses the Medical Subject Headings (MeSH) 2026 descriptor vocabulary (31,110 terms), parsed from the NLM XML release (desc2026.xml). Publications are classified as ATMP-related if indexed in Dimensions under one or more of 112 classified MeSH descriptors.
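As a rough illustration of how the descriptor vocabulary can be read, the sketch below parses a tiny inline stand-in for desc2026.xml with Python's standard library. The element names follow the NLM descriptor XML layout; the record identifiers (DEMO0001, DEMO0002) and tree numbers are placeholders invented for the example, and parse_descriptors is a hypothetical helper, not the platform's actual parser.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for desc2026.xml. UIs and tree numbers are placeholders.
SAMPLE_XML = """<DescriptorRecordSet>
  <DescriptorRecord>
    <DescriptorUI>DEMO0001</DescriptorUI>
    <DescriptorName><String>Genetic Therapy</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>X01.100</TreeNumber>
      <TreeNumber>X02.200.300</TreeNumber>
    </TreeNumberList>
    <ConceptList>
      <Concept>
        <TermList>
          <Term><String>Gene Therapy</String></Term>
        </TermList>
      </Concept>
    </ConceptList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>DEMO0002</DescriptorUI>
    <DescriptorName><String>Tissue Engineering</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>X03.400</TreeNumber>
    </TreeNumberList>
  </DescriptorRecord>
</DescriptorRecordSet>"""


def parse_descriptors(xml_text):
    """Return one dict per descriptor: UI, canonical name, tree numbers."""
    root = ET.fromstring(xml_text)
    descriptors = []
    for record in root.iter("DescriptorRecord"):
        descriptors.append({
            "ui": record.findtext("DescriptorUI"),
            # Canonical DescriptorName only; entry terms under ConceptList are ignored.
            "name": record.findtext("DescriptorName/String"),
            "tree_numbers": [t.text for t in record.findall("TreeNumberList/TreeNumber")],
        })
    return descriptors


descriptors = parse_descriptors(SAMPLE_XML)
for d in descriptors:
    print(d["ui"], d["name"], d["tree_numbers"])
```

For the full release file, a streaming parser such as ET.iterparse would likely be preferable to loading the whole document into memory.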

Classification methodology

MeSH terms were selected in two passes:

  1. Keyword matching: terms were screened against a curated list of ATMP-relevant keywords spanning gene therapy, cell therapy, tissue engineering, vectors, editing tools, and regulatory biology
  2. Tree-number matching: MeSH tree numbers were used to capture entire sub-trees of related concepts (e.g., all descendants of the descriptor D020871, Gene Transfer Techniques)

This produced a three-tier classification:

Expert validation of the classification is ongoing. The "Edge Cases" tier may be refined in future versions.

Note: Only canonical DescriptorName values are queried in Dimensions, never entry terms (synonyms). Dimensions indexes publications by canonical descriptor; querying synonyms would be redundant and introduce false positives.
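The two selection passes described above can be sketched roughly as follows. This is a minimal illustration, not the platform's code: the keyword list, the tree-number prefix, and the descriptor records (with placeholder UIs and tree numbers) are all assumptions made for the example.

```python
# Hypothetical inputs: a short keyword list and one sub-tree root.
KEYWORDS = ("genetic therapy", "tissue engineering")
TREE_PREFIXES = ("X05.200",)

# Placeholder descriptor records; UIs and tree numbers are invented.
DESCRIPTORS = [
    {"ui": "DEMO0001", "name": "Genetic Therapy", "tree_numbers": ["X01.100"]},
    {"ui": "DEMO0002", "name": "Electroporation", "tree_numbers": ["X05.200.310"]},
    {"ui": "DEMO0003", "name": "Asthma", "tree_numbers": ["X08.127"]},
]


def in_subtree(tree_number, prefix):
    """True if tree_number is the sub-tree root itself or one of its descendants."""
    return tree_number == prefix or tree_number.startswith(prefix + ".")


def select_terms(descriptors, keywords, tree_prefixes):
    """Apply both passes and record which pass selected each descriptor."""
    selected = {}
    for d in descriptors:
        name = d["name"].lower()
        by_keyword = any(k in name for k in keywords)
        by_tree = any(
            in_subtree(tn, p) for tn in d["tree_numbers"] for p in tree_prefixes
        )
        if by_keyword or by_tree:
            selected[d["ui"]] = {
                "name": d["name"],
                "keyword": by_keyword,
                "tree": by_tree,
            }
    return selected


selected = select_terms(DESCRIPTORS, KEYWORDS, TREE_PREFIXES)
for ui, info in selected.items():
    print(ui, info)
```

Matching on the prefix plus a trailing dot keeps the tree pass from picking up sibling branches whose numbers merely begin with the same characters.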

Scheme D technology domains

For Scheme D analysis, ATMP publications are further grouped into 8 technology domains aligned with the VR/VINNOVA Excellence Clusters for Groundbreaking Technologies grant framework:

Code | Domain | Description
F1 | DNA tailoring | Gene editing, CRISPR, DNA repair and modification
F2 | Identity & Fate reprogramming | iPSCs, cell differentiation, epigenetic reprogramming
F3 | Delivery | Vectors, nanoparticles, delivery vehicles
F4 | Sensing & Control systems | Synthetic biology, gene switches, biosensors
E1 | Phenotyping | Omics, single-cell analysis, biomarkers
E2 | Bioprocessing | Cell culture, bioreactors, growth factors
E3 | Preclinical modelling | Animal models, organoids, disease models
E4 | Manufacturing | GMP processes, quality control, scale-up

Each of the 112 MeSH terms is assigned a primary Scheme D domain (and optionally a secondary domain for multi-domain terms). Long-format assignments allow a paper to contribute to multiple domains if it is indexed under terms from different domains.
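A minimal sketch of the long-format idea is shown below. The term-to-domain mapping and the paper identifiers are invented for illustration, only primary domains are used, and the sketch assumes a paper counts once per domain even when several of its terms map to the same domain; the platform's actual rules may differ.

```python
from collections import Counter

# Hypothetical primary-domain assignments for a few MeSH terms.
PRIMARY_DOMAIN = {
    "Gene Editing": "F1",
    "CRISPR-Cas Systems": "F1",
    "Genetic Vectors": "F3",
}

# Hypothetical papers and the classified terms they are indexed under.
PAPER_TERMS = {
    "pub.1": ["Gene Editing", "Genetic Vectors"],
    "pub.2": ["Gene Editing", "CRISPR-Cas Systems"],
}


def to_long_format(paper_terms, primary_domain):
    """Return sorted, unique (pub_id, domain) rows, one per paper and domain."""
    rows = set()
    for pub_id, terms in paper_terms.items():
        for term in terms:
            domain = primary_domain.get(term)
            if domain is not None:
                rows.add((pub_id, domain))
    return sorted(rows)


rows = to_long_format(PAPER_TERMS, PRIMARY_DOMAIN)
papers_per_domain = Counter(domain for _, domain in rows)
print(rows)
print(papers_per_domain)
```

Here pub.1 contributes to both F1 and F3, while pub.2 contributes to F1 only once, so domain totals can exceed the number of unique papers.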


Publication data

Source

Publications are retrieved from the Dimensions API (Digital Science) using institution-level queries filtered by MeSH descriptor. Dimensions aggregates publications from PubMed, Crossref, and other bibliographic databases.

Query strategy

Each of the 112 ATMP MeSH descriptors is queried independently. Results are deduplicated by Dimensions publication ID (pub_id). A publication qualifies as ATMP-related if it carries at least one of the 112 classified MeSH terms in its Dimensions metadata.
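The deduplication step can be sketched as below. The per-descriptor result lists are invented stand-ins for query responses (no API call is made), and merge_query_results is a hypothetical helper; the only field taken from the text is pub_id.

```python
def merge_query_results(results_by_descriptor):
    """Merge per-descriptor result lists into one record per pub_id."""
    merged = {}
    for descriptor, records in results_by_descriptor.items():
        for record in records:
            pub_id = record["pub_id"]
            entry = merged.setdefault(
                pub_id, {"pub_id": pub_id, "matched_terms": set()}
            )
            # Keep track of every descriptor under which the paper was retrieved.
            entry["matched_terms"].add(descriptor)
    return merged


# Invented example: pub.2 is returned by two independent descriptor queries.
results = {
    "Genetic Therapy": [{"pub_id": "pub.1"}, {"pub_id": "pub.2"}],
    "Tissue Engineering": [{"pub_id": "pub.2"}, {"pub_id": "pub.3"}],
}

corpus = merge_query_results(results)
print(len(corpus))
print(sorted(corpus["pub.2"]["matched_terms"]))
```

Retaining the set of matched descriptors alongside each pub_id means the deduplicated corpus can still be broken down by term afterwards.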

Country and institution attribution

Country and institutional affiliations are taken directly from Dimensions' parsed affiliation data. Each publication can have multiple country attributions (one per affiliated institution). The unit of analysis in cross-country comparisons is therefore the paper-country pair, not the unique paper.
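The difference between paper-country pairs and unique papers can be shown with a small worked example. The affiliation rows are invented, and the sketch assumes that a country is counted once per paper even when several of the paper's institutions are in that country.

```python
from collections import Counter

# Invented affiliation rows: (pub_id, country code), one per affiliated institution.
affiliations = [
    ("pub.1", "SE"),
    ("pub.1", "SE"),  # second Swedish institution on the same paper
    ("pub.1", "US"),
    ("pub.2", "US"),
]

# Collapse to unique paper-country pairs (assumption: one count per paper and country).
pairs = set(affiliations)

papers_per_country = Counter(country for _, country in pairs)
unique_papers = len({pub_id for pub_id, _ in pairs})

print(papers_per_country)
print(sum(papers_per_country.values()), unique_papers)
```

In this example the country totals sum to 3 although there are only 2 unique papers, so country counts should not be added together to estimate corpus size.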

Known limitations


Citation metrics

Citation counts

Raw citation counts are from Dimensions and represent forward citations to each focal publication as of the download date (2026). Self-citations are not excluded.

Relative Citation Ratio (RCR)

The Relative Citation Ratio (Hutchins et al., 2016) is a field- and time-normalised citation metric from the NIH iCite database. An RCR of 1.0 means a paper has been cited at the same rate as the average for its field and year; an RCR above 1.0 indicates an above-average citation rate.


Commercial potential score (compot)

The commercial potential score (compot) is a proprietary metric provided by Dimensions (Digital Science). It estimates the likelihood that a publication will be cited in a patent, based on the citation patterns of similar papers in the Dimensions network.

Papers without a compot score are excluded from all commercial potential analysis. Results represent the subset of papers for which Dimensions has computed a score, which may not be a random sample.
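Because the scored subset may not be representative, it can help to report coverage next to any commercial potential result. The sketch below assumes each paper is a dict with a compot field that is None when no score exists; the field name and the sample values are illustrative only.

```python
def compot_subset(papers):
    """Return the papers that have a score and the share of papers covered."""
    scored = [p for p in papers if p.get("compot") is not None]
    coverage = len(scored) / len(papers) if papers else 0.0
    return scored, coverage


# Invented sample: one paper has no score, one has a genuine score of 0.0.
papers = [
    {"pub_id": "pub.1", "compot": 0.42},
    {"pub_id": "pub.2", "compot": None},
    {"pub_id": "pub.3", "compot": 0.0},
]

scored, coverage = compot_subset(papers)
print([p["pub_id"] for p in scored], round(coverage, 3))
```

Testing against None rather than truthiness keeps papers with a score of 0.0 in the analysis.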


Altmetric data

Source

Altmetric mentions are retrieved from the Altmetric Details API (Altmetric.com / Digital Science) for all publications with a DOI. Coverage depends on Altmetric having indexed the publication.

Coverage types

Type | Description
News | Mentions in news outlets tracked by Altmetric
Blogs | Mentions in research or science blogs
Patents | Patent applications or grants that cite the publication (via USPTO, EPO, WIPO, and national patent offices)
Policy documents | Government policy documents and reports citing the publication
Clinical guidelines | Clinical practice guidelines citing the publication
Clinical trials | Registered clinical trials (ClinicalTrials.gov and WHO ICTRP) that cite the publication

Patent jurisdiction classification

Patent citations are classified into jurisdiction groups based on the filing office:

Group | Offices included
US | USPTO (United States Patent and Trademark Office)
EP | EPO (European Patent Office)
WIPO | PCT international applications
CN | CNIPA (China National Intellectual Property Administration)
JP | JPO (Japan Patent Office)
KR | KIPO (Korean Intellectual Property Office)
RestEurope | All other European national patent offices (including SE, which is additive)
RestWorld | All remaining offices

Sweden is additive: Swedish patents (SE jurisdiction) are counted in both RestEurope and the separate SE column. This is intentional: it allows comparison of Sweden against its regional peer group without double-subtraction. Because SE is a subset of RestEurope, the two columns must not be summed.
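The grouping and the additive treatment of Sweden might look roughly like the sketch below. The two-letter jurisdiction codes, the list of European national offices (an illustrative subset), and the function name are assumptions; only the group names and the SE rule come from the text.

```python
# Offices that map directly to their own group (codes are assumed).
DIRECT_GROUPS = {
    "US": "US",
    "EP": "EP",
    "WO": "WIPO",  # PCT international applications
    "CN": "CN",
    "JP": "JP",
    "KR": "KR",
}

# Illustrative subset of European national offices, not a complete list.
EUROPEAN_NATIONAL = {"SE", "DE", "FR", "GB", "NL", "DK", "FI", "NO"}


def jurisdiction_columns(code):
    """Return every table column a patent from this jurisdiction counts towards."""
    if code in DIRECT_GROUPS:
        return [DIRECT_GROUPS[code]]
    if code in EUROPEAN_NATIONAL:
        columns = ["RestEurope"]
        if code == "SE":
            # Additive rule: Swedish patents also appear in the separate SE column.
            columns.append("SE")
        return columns
    return ["RestWorld"]


for code in ("US", "WO", "SE", "DE", "BR"):
    print(code, jurisdiction_columns(code))
```

Returning a list of columns rather than a single label makes the overlap explicit, which is a reminder that SE and RestEurope totals are not disjoint.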

Policy and guideline source classification

242 unique policy sources and 997 unique guideline sources were manually classified by scope:

Classification | Description
international | WHO, UN, OECD, ICH, and other supranational bodies
eu_regional | EMA, European Commission, ECDC, and EU bodies
national | National health ministries, agencies, and regulatory bodies

Each source was assigned a certainty level (high / medium). 235 of 242 policy sources and 933 of 997 guideline sources were classified at high certainty. The SE column in policy/guideline tables uses location == "SE" (Altmetric's reported document location), not the source name.


Clinical trials

Source

Clinical trial metadata is retrieved from Dimensions via the clinical_trials endpoint. Dimensions links publications to registered trials via citation and metadata matching.

Trial phase classification

Trials are grouped into three phase groups, plus a fourth category for trials whose phase is not reported:

Group | Included phases
Early | Phase 1, Phase 1/2
Mid | Phase 2, Phase 2/3
Late | Phase 3, Phase 3/4, Phase 4
Not Reported | Phase not specified or "N/A"
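The phase grouping above amounts to a lookup table, sketched here. The exact phase strings in the source data are an assumption, and the sketch sends any unrecognised label to Not Reported, which may not match how the platform handles unexpected values.

```python
PHASE_GROUPS = {
    "Phase 1": "Early",
    "Phase 1/2": "Early",
    "Phase 2": "Mid",
    "Phase 2/3": "Mid",
    "Phase 3": "Late",
    "Phase 3/4": "Late",
    "Phase 4": "Late",
}


def phase_group(phase):
    """Map a raw phase label to Early, Mid, Late, or Not Reported."""
    if phase is None:
        return "Not Reported"
    # Assumption: labels outside the table (including "N/A") are treated as unreported.
    return PHASE_GROUPS.get(phase.strip(), "Not Reported")


for raw in ("Phase 1/2", "Phase 2", "Phase 4", "N/A", None):
    print(raw, "->", phase_group(raw))
```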

Trial geography

Trial geography is based on the country of the registering organisation (trials_orgs.csv). A trial can be attributed to multiple countries. The translation rate of a group is the number of its publications linked to at least one trial, divided by the total number of publications in that group.
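The translation rate can be computed as in the sketch below. The publication and trial identifiers are invented, and the function name and input shapes are assumptions made for the example.

```python
def translation_rate(pub_ids, pub_trial_links):
    """Share of publications in a group that are linked to at least one trial."""
    pubs = set(pub_ids)
    if not pubs:
        return None  # undefined for an empty group
    # A publication counts once, however many trials it is linked to.
    linked = {pub_id for pub_id, _ in pub_trial_links if pub_id in pubs}
    return len(linked) / len(pubs)


# Invented group of four publications; pub.1 is linked to two trials.
group = ["pub.1", "pub.2", "pub.3", "pub.4"]
links = [
    ("pub.1", "trial.A"),
    ("pub.1", "trial.B"),
    ("pub.3", "trial.C"),
    ("pub.9", "trial.D"),  # outside the group, ignored
]

print(translation_rate(group, links))  # 2 of 4 publications are linked -> 0.5
```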


Funder classification

Extraction

Funder names are extracted from Dimensions publication metadata. Each paper can have multiple funders. 6,092 unique funder name strings were extracted from the ATMP corpus.

Classification taxonomy

Category | Description
public_se | Swedish public research councils and grant agencies (Vetenskapsrådet (VR), FORMAS, FORTE, MISTRA, Vinnova, KAW)
public_eu | EU funding bodies (Horizon 2020, Horizon Europe, ERC, Marie Curie)
public_foreign | Public research councils and government agencies outside Sweden and the EU
corporate | Private companies and industry funders
foundation | Private philanthropic foundations (Wellcome Trust, Gates Foundation, etc.)
unknown | Unclassified or unrecognised funder name

Coverage note

252 of 6,092 unique funders have been classified (48 public_se, 108 foundation, 83 public_foreign, 7 public_eu, 6 corporate). The long tail of funders (≤15 papers each) is intentionally left as unknown. Classification coverage is concentrated in the high-volume funders that drive the majority of funded papers.

Dimensions funder name strings often differ from assumed short forms (e.g., "Wellcome Trust Ltd" not "Wellcome Trust"). Name matching is therefore imperfect, and some known funders may be missed due to string variation.
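One possible way to reduce misses from string variation is to normalise names before lookup. The sketch below is an illustration of that idea, not the platform's matching code: the suffix list, the lookup table, and both function names are assumptions.

```python
import re

# Hypothetical list of legal-form suffixes to strip from the end of a name.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "ab", "gmbh"}


def normalize(name):
    """Lower-case, drop punctuation, and strip trailing legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


# Hypothetical classification table keyed by normalised name.
CLASSIFIED = {
    normalize("Wellcome Trust"): "foundation",
    normalize("Vinnova"): "public_se",
}


def classify_funder(raw_name):
    """Look up a raw funder string; unmatched names fall back to unknown."""
    return CLASSIFIED.get(normalize(raw_name), "unknown")


for raw in ("Wellcome Trust Ltd", "Wellcome Trust Ltd.", "VINNOVA", "Some Other Funder"):
    print(raw, "->", classify_funder(raw))
```

Normalisation of this kind only handles case, punctuation, and suffix differences; abbreviations and translated names would still need manual mapping.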


Sample restrictions


Key decisions log

Date | Decision
2026-05-07 | Query Dimensions by DescriptorName only, not entry terms
2026-05-17 | Policy/guideline sources classified manually (242 + 997 sources)
2026-05-17 | SE additive in patent and trial tables (SE ⊂ RestEurope, SE ⊂ Europe)
2026-05-19 | Scheme D taxonomy finalised (F1–F4 Fundamental + E1–E4 Enabling)
2026-05-22 | All 18 formerly-Misfit Scheme D terms reclassified by domain expert
2026-05-25 | RCR used as primary quality metric; scipot unavailable (all NULL)

Citation

If citing this platform or its outputs, please use:

ATMP Research Platform (2026). Descriptive analysis of global ATMP research output and Sweden's position. Developed in support of the VR/VINNOVA Excellence Clusters for Groundbreaking Technologies proposal. Stockholm: Stockholm School of Economics.

Data: Dimensions (publications, citations, clinical trials) · Altmetric (patent, policy, guideline, news, blog mentions) · NIH iCite (RCR) · MeSH 2026 ATMP classification: 112 terms, expert-validated. Platform built with Observable Framework 1.13.4 and DuckDB.