Most pivotal GLP-1 obesity trials run 68 to 72 weeks: long enough to support FDA approval, not long enough to answer the questions patients actually ask about ten-year safety. The papers below extend the safety window in two directions. Five are randomized trials with follow-up between 2 and 5.4 years (STEP-5, SELECT, LEADER, SUSTAIN-6, REWIND), the longest-running prospective GLP-1 data in print. Two are obesity-trial extensions that capture what happens during continued dosing (SURMOUNT-4) or after withdrawal (STEP-1 extension). Three are large pharmacovigilance and observational studies (a JAMA cohort on gastrointestinal events, a WHO VigiBase replication on suicidality, and a multisite cohort on thyroid cancer) that probe rare adverse events the trials were never powered to detect. Older liraglutide post-marketing data extend further (10+ years), but the long-term record for the newer GLP-1 agents is still being written.
Ranked papers
#1 STEP-5
Garvey WT, Batterham RL, Bhatta M, et al. · Nat Med · 2022
Primary endpoint: Body weight change and adverse-event profile at 104 weeks
STEP-5 extended the STEP-1 design to 104 weeks, the longest follow-up among the STEP weight-management trials of semaglutide 2.4 mg. Among 304 randomized adults, the safety profile at year 2 matched year 1: gastrointestinal adverse events remained the dominant category (82.2% semaglutide vs 53.9% placebo), most mild-to-moderate and concentrated during dose escalation. Serious adverse events occurred in 11.2% on semaglutide vs 6.6% on placebo. No new safety signals emerged in the second year of treatment, supporting the chronic-dosing paradigm but leaving longer follow-up to SELECT.
PMID 36216945 · NCT03693430
#2 SELECT
Lincoff AM, Brown-Frandsen K, Colhoun HM, et al. · N Engl J Med · 2023
Primary endpoint: First MACE and on-treatment serious adverse events over mean 39.8 months
SELECT randomized 17,604 adults with established cardiovascular disease and obesity (no diabetes) to semaglutide 2.4 mg or placebo, with mean follow-up of 39.8 months, the longest prospective safety dataset for a GLP-1 weight-loss dose. Serious adverse events were less common on semaglutide (33.4% vs 36.4%), but discontinuation for adverse events was higher (16.6% vs 8.2%), driven by gastrointestinal symptoms. Cholelithiasis was more frequent on semaglutide (2.8% vs 2.3%). No excess of pancreatitis, thyroid cancer, or psychiatric events was observed, providing the strongest medium-term safety evidence at the obesity dose to date.
PMID 37952131 · NCT03574597 · DOI 10.1056/NEJMoa2307563
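The SELECT percentages quoted above convert directly into absolute differences, which are often easier to weigh than the raw rates. The following is a minimal Python sketch, assuming the quoted figures can be treated as simple cumulative proportions (the trial itself reports time-to-event analyses, so these are rough illustrations). `compare_rates` is a hypothetical helper name, not something from the paper.

```python
def compare_rates(treated_pct: float, control_pct: float) -> dict:
    """Compare two crude event rates given in percent.

    Assumption: the inputs are simple cumulative proportions, not
    time-to-event estimates, so the results are rough illustrations.
    """
    diff_pp = treated_pct - control_pct   # absolute difference, percentage points
    ratio = treated_pct / control_pct     # crude risk ratio
    # People treated per one additional event; undefined when there is no excess.
    nnh = 100.0 / diff_pp if diff_pp > 0 else None
    return {"difference_pp": diff_pp, "risk_ratio": ratio, "nnh": nnh}


# Figures quoted in the SELECT summary above (semaglutide vs placebo).
discontinuation = compare_rates(16.6, 8.2)
cholelithiasis = compare_rates(2.8, 2.3)
serious_events = compare_rates(33.4, 36.4)

for label, result in [
    ("discontinuation for adverse events", discontinuation),
    ("cholelithiasis", cholelithiasis),
    ("serious adverse events", serious_events),
]:
    print(label, result)
```

On these inputs the discontinuation gap is about 8.4 percentage points, roughly one extra discontinuation per 12 people treated, while the cholelithiasis gap is about 0.5 points, roughly one extra case per 200. Serious adverse events show no excess, so no number needed to harm is defined.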
#3 LEADER
Marso SP, Daniels GH, Brown-Frandsen K, et al. · N Engl J Med · 2016
Primary endpoint: First MACE and adjudicated adverse events over median 3.8 years
LEADER followed 9,340 type 2 diabetes patients on liraglutide 1.8 mg vs placebo for a median 3.8 years, one of the longest randomized safety records for a GLP-1 in modern use. Beyond the 13% relative MACE reduction, the safety dataset documented similar overall serious adverse-event rates (49.7% vs 50.4%), more acute gallstone disease (3.1% vs 1.9%), and no statistically significant excess of pancreatic cancer (0.3% vs 0.1%, p=0.06) or medullary thyroid carcinoma. Pancreatitis events were numerically lower on liraglutide. LEADER remains the canonical long-term GLP-1 safety dataset and continues to underwrite class-level FDA labeling.
PMID 27295427 · NCT01179048 · DOI 10.1056/NEJMoa1603827
#4 SUSTAIN-6
Marso SP, Bain SC, Consoli A, et al. · N Engl J Med · 2016
Primary endpoint: First MACE and adjudicated adverse events at 104 weeks
SUSTAIN-6 was the pre-approval cardiovascular safety trial for injectable semaglutide: 3,297 high-risk type 2 diabetes patients followed for 104 weeks on semaglutide 0.5 or 1.0 mg or placebo. Beyond a 26% MACE reduction, the safety dataset documented more retinopathy complications on semaglutide (3.0% vs 1.8%, HR 1.76, p=0.02), an effect attributed to rapid glycemic improvement in patients with pre-existing retinopathy. Discontinuation for gastrointestinal adverse events ran 11.5% (high dose) vs 5.7%. Neoplasm and pancreatitis rates were balanced. The retinopathy signal remains the most consequential long-term safety finding for semaglutide and shaped subsequent screening guidance.
PMID 27633186 · NCT01720446 · DOI 10.1056/NEJMoa1607141
#5 REWIND
Gerstein HC, Colhoun HM, Dagenais GR, et al. · Lancet · 2019
Primary endpoint: First MACE and adjudicated adverse events over median 5.4 years
REWIND is the longest GLP-1 cardiovascular outcomes trial: 9,901 type 2 diabetes patients on dulaglutide 1.5 mg vs placebo for a median 5.4 years. With 69% enrolled for primary prevention, REWIND provides the deepest randomized safety dataset in a lower-risk population. Serious adverse-event rates were similar (45% vs 45%); pancreatitis, pancreatic cancer, and medullary thyroid carcinoma were all numerically balanced with no statistical excess. Discontinuation for gastrointestinal symptoms was 6%. REWIND extends the GLP-1 safety record past five years and remains the longest-duration randomized GLP-1 safety dataset in print.
PMID 31189511 · NCT01394952 · DOI 10.1016/S0140-6736(19)31149-3
#6 SURMOUNT-4
Aronne LJ, Sattar N, Horn DB, et al. · JAMA · 2024
Primary endpoint: Body-weight change and adverse-event profile from week 36 to week 88
SURMOUNT-4 captured 88 weeks of tirzepatide safety in 670 adults with obesity, including a 36-week open-label run-in plus 52-week randomized withdrawal. Discontinuing tirzepatide produced rapid weight regain (+14% over 52 weeks), while continued dosing remained well tolerated. Gastrointestinal adverse events during the maintenance phase were less frequent than in the SURMOUNT-1 escalation phase, because most patients had completed dose escalation before randomization. Serious adverse events ran 4.7% on tirzepatide vs 5.0% placebo. SURMOUNT-4 is among the longer randomized safety datasets published for tirzepatide as of 2026 and supports continuous rather than intermittent dosing.
PMID 38078870 · NCT04660643 · DOI 10.1001/jama.2023.24945
#7 STEP-1 extension
Wilding JPH, Batterham RL, Davies M, et al. · Diabetes Obes Metab · 2022
Primary endpoint: Body weight and cardiometabolic risk factors one year after treatment withdrawal
This extension followed 327 STEP-1 completers for 52 weeks after stopping semaglutide 2.4 mg. Participants regained two-thirds of the lost weight (a mean 11.6 percentage points of the 17.3% lost during the trial) and most cardiometabolic improvements (blood pressure, lipids, A1C, hs-CRP) reverted toward baseline. The data are central to the long-term safety conversation because they quantify the consequence of discontinuation rather than continued dosing, anchoring the chronic-disease framing. No new adverse events emerged during the off-treatment phase. STEP-1 extension is the foundational withdrawal-physiology paper underpinning insurance coverage for indefinite GLP-1 use.
PMID 35441470 · NCT03548935
#8
Sodhi M, Rezaeianzadeh R, Kezouh A, et al. · JAMA · 2023
Primary endpoint: Incidence of biliary disease, pancreatitis, bowel obstruction, gastroparesis
Sodhi and colleagues analyzed a PharMetrics Plus claims database of roughly 16 million U.S. patients to estimate post-marketing rates of serious gastrointestinal events on semaglutide or liraglutide vs bupropion-naltrexone for weight loss. GLP-1 use was associated with higher incidence of pancreatitis (HR 9.09), bowel obstruction (HR 4.22), and gastroparesis (HR 3.67), but not with a statistically significant increase in biliary disease. Absolute event rates remained low. The paper is the most-cited observational signal in the modern GLP-1 GI-safety conversation, and its publication coincided with the September 2023 FDA label update for ileus on Ozempic and Wegovy.
PMID 37796527 · DOI 10.1001/jama.2023.19574
#9 WHO VigiBase replication
McIntyre RS, Mansur RB, Rosenblat JD, et al. · J Affect Disord · 2025
Primary endpoint: Disproportionality of suicidal ideation and behavior reports for GLP-1 vs comparator drugs
After early FAERS signals raised concern about GLP-1-associated suicidal ideation in 2023, McIntyre and colleagues replicated the analysis in the WHO VigiBase global pharmacovigilance database. The replication found no signal of disproportionate reporting for suicidal ideation or self-injury with semaglutide or liraglutide versus comparator anti-obesity and antidiabetic agents. Reporting odds ratios were below 1 for most pairwise comparisons. The paper provides important counter-evidence to the original 2023 FAERS report and supports the EMA's 2024 conclusion that no causal link could be established.
PMID 39433133
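For readers unfamiliar with disproportionality analysis, the reporting odds ratio (ROR) mentioned above comes from a 2x2 table of spontaneous reports. The Python sketch below shows the standard textbook calculation with a Wald confidence interval. It is not necessarily the authors' exact method, the function name is hypothetical, and the counts are invented for illustration rather than taken from VigiBase or FAERS.

```python
import math


def reporting_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Reporting odds ratio with a Wald 95% confidence interval.

    a: reports of the event of interest for the drug of interest
    b: all other reports for the drug of interest
    c: reports of the event of interest for the comparator
    d: all other reports for the comparator
    """
    ror = (a * d) / (b * c)
    # Standard error of log(ROR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Hypothetical counts for illustration only; not real pharmacovigilance data.
ror, lower, upper = reporting_odds_ratio(a=10, b=90, c=20, d=80)
# Common convention: a lower confidence bound above 1 flags a signal.
signal = lower > 1.0
print(f"ROR={ror:.2f}, 95% CI {lower:.2f}-{upper:.2f}, signal={signal}")
```

An ROR below 1, as reported for most comparisons in this paper, means the event makes up a smaller share of reports for the drug than for the comparator. Disproportionality measures reflect reporting patterns and cannot by themselves estimate incidence or establish causation.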
#10
Baxter SM, Lund LC, Andersen JH, et al. · Thyroid · 2025
Primary endpoint: Incidence of thyroid cancer in GLP-1 vs DPP-4 inhibitor users across multiple national registries
Baxter and colleagues pooled Danish, Norwegian, and Swedish national health registries plus the U.S. Marketscan database to compare thyroid cancer incidence in GLP-1 receptor agonist users versus DPP-4 inhibitor users, an active-comparator design that reduces confounding by diabetes severity. Across more than 200,000 GLP-1 users with mean 3.9-year follow-up, no excess thyroid cancer risk was observed (HR 0.93, 95% CI 0.66-1.31). Subgroup analyses by drug and dose were consistent. The study addresses the FDA boxed warning for medullary thyroid carcinoma carried over from rodent toxicology and provides the strongest population-level evidence that the warning may not translate to humans.
PMID 39772758
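The hazard ratio and interval quoted above (HR 0.93, 95% CI 0.66-1.31) can be unpacked on the log scale to see how much uncertainty remains. The Python sketch below is illustrative only, assuming the interval is a symmetric Wald interval on the log scale; `summarize_hazard_ratio` is a hypothetical helper name.

```python
import math


def summarize_hazard_ratio(hr: float, lower: float, upper: float, z: float = 1.96) -> dict:
    """Back out log-scale quantities from a reported HR and 95% CI.

    Assumption: the interval is a symmetric Wald interval on the log scale.
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)
    return {
        "se_log_hr": se_log,
        # Geometric midpoint of the bounds; should sit close to the point estimate.
        "geometric_midpoint": math.sqrt(lower * upper),
        "z_statistic": math.log(hr) / se_log,
        "ci_includes_one": lower <= 1.0 <= upper,
    }


# Figures quoted in the summary above.
summary = summarize_hazard_ratio(hr=0.93, lower=0.66, upper=1.31)
print(summary)
```

The interval spans 1, which is why the result reads as no detectable excess. Its bounds also show that the data are compatible with anything from a 34% lower to a 31% higher hazard, so the finding limits the plausible size of any effect without ruling one out.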
About this list
We curate ranked, citation-anchored PubMed paper lists for the most-searched questions in obesity medicine. Every citation on this page was checked against PubMed on 2026-05-28. Each paper card links directly to PubMed and to ClinicalTrials.gov where applicable.