Chemical Structure, CAS Number, and Synthesis of Semaglutide Research Peptide Explained
RESEARCH DISCLAIMER: Semaglutide, as supplied by Palmetto Peptides, is a research peptide for in vitro laboratory and qualified preclinical research use only. It is not intended for human or veterinary use. This article is for qualified researchers and peptide chemistry professionals.
Last Updated: March 19, 2026 | Reading Time: ~11 minutes | Author: Palmetto Peptides Research Team
Quick Answer: Semaglutide (CAS 910463-68-2) has the molecular formula C187H291N45O59 and a molecular weight of approximately 4,113.58 g/mol. It is a 31-amino acid GLP-1 analog that differs from native GLP-1(7-37) in three ways: an Aib substitution at position 8 for DPP-4 resistance, a Lys-to-Arg substitution at position 34 that leaves Lys-26 as the only lysine, and a C18 fatty diacid conjugated at Lys-26 through a gamma-Glu/OEG hydrophilic linker for albumin binding. Synthesis uses Fmoc SPPS with a dedicated side-chain conjugation step at Lys-26, followed by preparative HPLC purification.
Semaglutide at a Glance: Key Identifiers
Researchers sourcing this compound can find semaglutide research peptide at Palmetto Peptides, available as a ≥98% purity, CoA-verified peptide for preclinical laboratory use.
Before diving into structural details, here is the core reference data that any researcher or peptide chemist working with semaglutide needs:
| Identifier | Value |
|---|---|
| Common Name | Semaglutide |
| CAS Number | 910463-68-2 |
| Molecular Formula | C187H291N45O59 |
| Molecular Weight | ~4,113.58 g/mol |
| Peptide Length | 31 amino acids |
| Sequence Basis | Human GLP-1(7-37) |
| Sequence Homology to GLP-1 | ~94% (29 of 31 residues identical) |
| Key Modifications | Aib at position 8; Arg at position 34; C18 diacid-gamma-Glu-OEG-OEG at Lys26 |
| Purity (Research Grade) | ≥98% by HPLC |
| Physical Form | White to off-white lyophilized powder |
| Solubility | Aqueous, best with mild acetic acid |
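As a quick sanity check on the table above, the molecular weight can be recomputed from the molecular formula. This is a minimal sketch: the atomic weights are conventional average values chosen for illustration (the article does not say which table its figure is based on), so other atomic-weight tables may differ in the last decimal place.

```python
# Average atomic weights in g/mol (assumed values, not taken from the article).
ATOMIC_WEIGHT = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}

# Element counts for semaglutide, C187H291N45O59.
SEMAGLUTIDE = {"C": 187, "H": 291, "N": 45, "O": 59}


def average_mw(counts):
    """Sum count * atomic weight over every element in the formula."""
    return sum(ATOMIC_WEIGHT[element] * n for element, n in counts.items())


mw = average_mw(SEMAGLUTIDE)
print(f"{mw:.2f} g/mol")
```

With these atomic weights the result rounds to the ~4,113.58 g/mol listed in the table.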
The Amino Acid Sequence
Semaglutide's 31-amino acid sequence is as follows, with non-standard residues noted:
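```
Position: 1   2   3   4   5   6   7   8   9   10
Residue:  His Aib Glu Gly Thr Phe Thr Ser Asp Val
Position: 11  12  13  14  15  16  17  18  19  20
Residue:  Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys*
Position: 21  22  23  24  25  26  27  28  29  30  31
Residue:  Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly (C-terminus)
```
*The Lys at sequence position 20 (Lys-26 in GLP-1 numbering) carries the C18 fatty diacid conjugate via the gamma-Glu/OEG linker.
Note on numbering: The table above counts residues 1 to 31 from the N-terminus. The literature usually uses GLP-1(7-37) numbering instead, in which the N-terminal histidine is position 7, so GLP-1 position = sequence position + 6. That is why Aib (sequence position 2) is referred to as position 8, the acylated lysine (sequence position 20) as Lys-26, and the arginine at sequence position 28 as Arg-34.
Comparing with native GLP-1(7-37), in GLP-1 numbering:
- Position 8: Ala (in GLP-1) --> Aib (in semaglutide) [DPP-4 resistance modification]
- Position 26: Lys (in GLP-1) --> Lys with C18 fatty diacid conjugate (in semaglutide) [albumin binding modification]
- Position 34: Lys (in GLP-1) --> Arg (in semaglutide) [leaves Lys-26 as the only lysine, so the conjugate attaches at a single site]
- The other 29 residues are identical to human GLP-1(7-37), which is where the ~94% figure (29 of 31) comes from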
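The comparison with native GLP-1(7-37) can be reproduced with a short script. The sketch below is illustrative: the native sequence is taken from the general literature rather than from this article, and the function and variable names are hypothetical.

```python
# Three-letter residue lists, N-terminus first.
NATIVE_GLP1_7_37 = ("His Ala Glu Gly Thr Phe Thr Ser Asp Val "
                    "Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys "
                    "Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Gly").split()
SEMAGLUTIDE = ("His Aib Glu Gly Thr Phe Thr Ser Asp Val "
               "Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys "
               "Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly").split()

GLP1_OFFSET = 6  # GLP-1 numbering starts at His-7: position = 0-based index + 7


def backbone_differences(native, analog):
    """Return (sequence position, GLP-1 position, native, analog) per mismatch."""
    return [(i + 1, i + 1 + GLP1_OFFSET, n, a)
            for i, (n, a) in enumerate(zip(native, analog)) if n != a]


diffs = backbone_differences(NATIVE_GLP1_7_37, SEMAGLUTIDE)
identity = 1 - len(diffs) / len(NATIVE_GLP1_7_37)
lys_positions = [i + 1 + GLP1_OFFSET
                 for i, residue in enumerate(SEMAGLUTIDE) if residue == "Lys"]

for seq_pos, glp1_pos, native, analog in diffs:
    print(f"sequence {seq_pos} / GLP-1 {glp1_pos}: {native} -> {analog}")
print(f"identity: {identity:.1%}; lysines at GLP-1 positions {lys_positions}")
```

The script reports two backbone substitutions (Ala8 to Aib, Lys34 to Arg) and a single remaining lysine at position 26. The fatty diacid conjugate does not appear in this comparison because it modifies the Lys-26 side chain rather than replacing the residue.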
Structural Modification 1: Aib at Position 8
What Aib Is
Alpha-aminoisobutyric acid (Aib) is a non-proteinogenic amino acid with two methyl groups on the alpha-carbon (making it a dialkyl amino acid). Unlike the standard 20 amino acids, Aib has no free hydrogen on the alpha-carbon, which gives it unique conformational properties:
- Strong helix-inducing tendency (often drives alpha-helix formation in peptide sequences)
- High resistance to many proteolytic enzymes, because the second alpha-methyl group sterically shields the neighboring peptide bonds
Why Position 8?
DPP-4 removes the N-terminal dipeptide from its substrates by cleaving the bond between the second and third residues, and it strongly prefers substrates whose second residue is proline or alanine. In native GLP-1(7-37), the second residue (Ala-8 in GLP-1 numbering) is alanine, making the peptide an ideal DPP-4 substrate.
In semaglutide, this alanine is replaced with Aib. Counted from the N-terminus it is the second residue; in GLP-1(7-37) numbering it is position 8, hence the heading above. The extra alpha-methyl group adds enough steric bulk to keep DPP-4 from cleaving the site efficiently.
This single substitution blocks the main route by which native GLP-1 is inactivated, which is the reason for its ~2-minute half-life. The rest of semaglutide's half-life extension comes from the albumin-binding modification described next.
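The substrate rule described above can be written as a one-line check. This is a deliberately simplified sketch of that rule only (second residue is Ala or Pro); it is not a model of real DPP-4 kinetics, and the names are hypothetical. The exendin-4 N-terminus is included for contrast and comes from the general literature.

```python
# Second-residue identities named in the text as preferred by DPP-4.
DPP4_PREFERRED_SECOND_RESIDUE = {"Ala", "Pro"}


def is_likely_dpp4_substrate(residues):
    """Simplified rule: the second residue from the N-terminus is Ala or Pro."""
    return len(residues) >= 3 and residues[1] in DPP4_PREFERRED_SECOND_RESIDUE


# First three residues of each peptide, N-terminus first.
n_termini = {
    "native GLP-1(7-37)": ["His", "Ala", "Glu"],
    "semaglutide": ["His", "Aib", "Glu"],
    "exendin-4": ["His", "Gly", "Glu"],
}

results = {name: is_likely_dpp4_substrate(seq) for name, seq in n_termini.items()}
for name, flagged in results.items():
    print(f"{name}: {'likely substrate' if flagged else 'not flagged'}")
```

Only native GLP-1 is flagged, which matches the reasoning in the text: swapping the second residue is enough to take the peptide out of the enzyme's preferred substrate class.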
Structural Modification 2: The Fatty Diacid Conjugate at Lys-26
This is the more structurally complex of semaglutide's two modifications, and it is the feature that distinguishes semaglutide's half-life profile from liraglutide and most other GLP-1 analogs.
Components of the Conjugate
The full fatty acid modification at Lys-26 consists of four chemical units joined in series by amide bonds, starting from the lysine side chain:
```
Lys-26 (epsilon-amine)
|
OEG (8-amino-3,6-dioxaoctanoic acid, "mini-PEG" spacer 1)
|
OEG (8-amino-3,6-dioxaoctanoic acid, "mini-PEG" spacer 2)
|
gamma-Glu (glutamic acid via gamma-carboxyl)
|
C18 fatty diacid (octadecanedioic acid)
```
OEG (mini-PEG spacers) are short polyethylene glycol-like units that provide hydrophilicity and flexibility to the linker, helping to keep the fatty diacid chain from folding back and occluding the receptor binding regions of the peptide. The first OEG is attached directly to the Lys-26 epsilon-amine, creating a branch off the lysine side chain.
Gamma-glutamic acid sits between the OEG spacers and the fatty diacid. It is joined to the second OEG through its gamma-carboxyl group (not the standard alpha-carboxyl used in peptide bonds), its alpha-amine carries the fatty diacid, and its alpha-carboxyl remains free.
C18 fatty diacid (octadecanedioic acid, an 18-carbon chain with carboxyl groups at both ends) is the albumin-binding moiety. One carboxyl end forms an amide bond with the gamma-Glu amine; the other end remains free and is the primary albumin-binding site.
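The conjugate can be cross-checked by formula bookkeeping: add the four building blocks to the peptide backbone and subtract one water for each amide bond formed. In the sketch below, the building-block formulas are those of the free acids, and the backbone formula C152H230N42O47 is an assumption derived from native GLP-1(7-37) (C151H228N40O47) adjusted for Ala to Aib (+CH2) and Lys to Arg (+N2); neither is stated in the article.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Turn a simple formula such as 'C6H13NO4' into element counts."""
    return Counter({el: int(n or 1)
                    for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula)})


WATER = parse_formula("H2O")
BACKBONE = parse_formula("C152H230N42O47")  # assumed, see lead-in

# Free-acid building blocks of the Lys-26 side chain.
BUILDING_BLOCKS = {
    "OEG spacer 1": "C6H13NO4",
    "OEG spacer 2": "C6H13NO4",
    "gamma-Glu": "C5H9NO4",
    "C18 diacid": "C18H34O4",
}

total = Counter(BACKBONE)
for formula in BUILDING_BLOCKS.values():
    total.update(parse_formula(formula))  # add the building block
    total.subtract(WATER)                 # each unit forms one new amide bond

matches = total == parse_formula("C187H291N45O59")
print(dict(total), "matches semaglutide formula:", matches)
```

Four units mean four amide bonds (Lys to OEG, OEG to OEG, OEG to gamma-Glu, gamma-Glu to diacid), so four waters are removed in total. Under these assumptions the result is C187H291N45O59, the formula given for semaglutide.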
Why Fatty Diacid Rather Than Fatty Acid?
Liraglutide, the predecessor GLP-1 analog, uses a C16 fatty monoacid (palmitic acid) attached via a glutamic acid spacer. This provides albumin binding but yields a shorter half-life (~13 hours) compared to semaglutide's 165 to 184 hours.
Semaglutide's C18 fatty diacid provides stronger albumin binding affinity compared to a monoacid of similar chain length, primarily because the diacid geometry allows for a more favorable interaction surface with albumin's fatty acid binding pockets. The addition of the two OEG spacers provides additional separation from the peptide backbone, allowing more flexible albumin engagement and reducing steric interference with GLP-1R binding when the peptide is released from albumin.
Synthesis Chemistry
Step 1: Solid-Phase Peptide Synthesis (SPPS)
Semaglutide's 31-amino acid backbone is assembled by Fmoc (9-fluorenylmethyloxycarbonyl) solid-phase peptide synthesis. In this approach:
- The C-terminal amino acid (Gly-37 in GLP-1 numbering, residue 31 of the sequence) is anchored to a solid resin support
- Amino acids are added sequentially from C-terminus to N-terminus
- Each coupling cycle involves: Fmoc deprotection (piperidine), washing, coupling (with activating reagents such as HATU/DIPEA), and, where used, capping of unreacted amines
- Special coupling conditions are required around Aib-8 (the second residue from the N-terminus), because its two alpha-methyl groups hinder both the coupling of Aib itself and the coupling of the next residue onto it
- Lys-26 is incorporated with an orthogonal protecting group on its epsilon-amine (typically Dde or ivDde, which are selectively removable later)
- In the on-resin route described here, the assembled peptide stays on the resin with its other side-chain protecting groups in place until the Lys-26 conjugation in Step 2 is complete; only then is it globally deprotected and cleaved from the resin using a TFA/scavenger cocktail
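The assembly order in the steps above can be sketched as a simple coupling plan. This is an illustrative outline rather than a validated synthesis protocol: it uses the full 31-residue sequence ending in Gly-37, the note strings are placeholders, and all names are hypothetical.

```python
SEQUENCE = ("His Aib Glu Gly Thr Phe Thr Ser Asp Val "
            "Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys "
            "Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly").split()

GLP1_OFFSET = 6            # GLP-1 position = 0-based index + 7
HINDERED = {"Aib"}         # residues needing special coupling conditions
ORTHOGONAL_LYS = 26        # GLP-1 position of the Dde/ivDde-protected lysine


def coupling_plan(sequence):
    """List residues in the order they go onto the resin (C- to N-terminus)."""
    plan = []
    last = len(sequence) - 1
    for index in range(last, -1, -1):
        residue = sequence[index]
        position = index + 1 + GLP1_OFFSET
        notes = []
        if index == last:
            notes.append("anchored to resin")
        if residue in HINDERED:
            notes.append("hindered residue: special coupling conditions")
        if index < last and sequence[index + 1] in HINDERED:
            notes.append("couples onto hindered amine")
        if position == ORTHOGONAL_LYS:
            notes.append("epsilon-amine protected with Dde/ivDde")
        plan.append((position, residue, notes))
    return plan


plan = coupling_plan(SEQUENCE)
for position, residue, notes in plan:
    if notes:
        print(f"{residue}-{position}: {'; '.join(notes)}")
```

The plan starts at Gly-37 on the resin and ends at His-7, flagging the three points the text singles out: the orthogonally protected Lys-26, the hindered Aib-8, and the residue coupled directly after it.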
Step 2: Selective Deprotection and Fatty Acid Conjugation
This is the step that separates semaglutide synthesis from standard linear peptide production:
- The Dde/ivDde protecting group on Lys-26 is selectively removed using hydrazine solution while other protecting groups remain in place
- The side chain (C18 diacid-gamma-Glu-OEG-OEG) is coupled to the free epsilon-amine of Lys-26 through the carboxyl group of the OEG unit at its peptide-facing end
- This conjugation uses standard amide coupling chemistry but requires careful optimization to achieve complete coupling without side reactions
- Final global deprotection and cleavage follow
Step 3: Purification and Lyophilization
The crude conjugated peptide is purified by preparative reversed-phase HPLC, often using a C4 or C8 column, since the fatty diacid chain increases retention on C18 phases and can make elution harder to optimize. Gradient elution with acetonitrile/water/TFA mobile phases separates semaglutide from truncated sequences, deletion peptides, and incompletely conjugated material.
After purification, the peptide is analyzed by analytical HPLC and LC-MS to confirm purity and identity, then lyophilized to yield the final white to off-white powder shipped with each Palmetto Peptides order.
What This Means for Analytical Verification
Understanding semaglutide's synthesis explains why proper analytical verification is important:
- HPLC purity confirms separation from synthesis byproducts
- MS identity, with observed m/z values (typically multiply charged ions) consistent with MW 4,113.58, confirms the fatty acid conjugation was completed (a truncated or unconjugated peptide would have a distinctly different mass)
- Net peptide content accounts for TFA or acetate counter-ions from purification and residual water in the lyophilized cake
All three data points together provide genuine confidence that what you are working with is semaglutide. View our lot-specific CoA on the Semaglutide Research Peptide Product Page to see these data for current lots.
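To make the MS identity check concrete, the m/z values expected for multiply protonated ions can be estimated from the molecular weight. This is a rough sketch: it uses the average MW, so the values approximate the center of each isotope envelope rather than the monoisotopic peak, and the unconjugated-backbone mass (~3,397.71) is an assumption derived from the molecular formulas, not a figure from the article.

```python
PROTON_MASS = 1.007276       # Da
SEMAGLUTIDE_MW = 4113.58     # average MW from the article
UNCONJUGATED_MW = 3397.71    # assumed: backbone without the Lys-26 side chain


def mz(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a given neutral mass and charge state."""
    return (neutral_mass + charge * PROTON_MASS) / charge


expected = {z: round(mz(SEMAGLUTIDE_MW, z), 2) for z in (3, 4, 5)}
unconjugated = {z: round(mz(UNCONJUGATED_MW, z), 2) for z in (3, 4, 5)}

for z in expected:
    print(f"z={z}: conjugated {expected[z]}, unconjugated {unconjugated[z]}")
```

Under these assumptions the conjugated peptide appears near m/z 1372.20, 1029.40, and 823.72 for the 3+, 4+, and 5+ ions. The unconjugated backbone falls well below each of these values, which is why MS readily distinguishes complete from incomplete conjugation.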
Comparison with Related Peptide Structures
| Structural Feature | Semaglutide | Liraglutide | Exendin-4 | Tirzepatide |
|---|---|---|---|---|
| Length (aa) | 31 | 31 | 39 | 39 |
| C-terminal | Gly-OH (free acid) | Gly-OH (free acid) | Pro-Ser-Ser-Gly-Ala-Pro-Pro-Pro-Ser-NH2 | Exendin-4-like tail ending in Ser-NH2 |
| Position 2 (vs. GLP-1) | Aib (DPP-4 resistant) | Ala (native residue) | Gly (DPP-4 resistant) | Aib (DPP-4 resistant) |
| Fatty acid type | C18 diacid | C16 monoacid | None | C20 diacid |
| Linker (fatty acid to Lys) | gamma-Glu-OEG-OEG | gamma-Glu | N/A | gamma-Glu-OEG-OEG |
| Albumin binding | Strong | Moderate | None | Strong |
| MW (g/mol) | ~4,114 | ~3,751 | ~4,187 | ~4,813 |
Summary
Semaglutide's chemical architecture is the result of deliberate engineering decisions that optimize GLP-1R engagement, DPP-4 resistance, and albumin-mediated half-life extension. Its CAS number 910463-68-2, molecular formula C187H291N45O59, and MW of ~4,113.58 g/mol are the key identifiers researchers and chemists use to verify identity. The synthesis requires both Fmoc SPPS and a specialized fatty acid conjugation step, which is why analytical verification via both HPLC purity and MS identity is essential for research applications.
For related reading, see our articles on Mechanism of Action of Semaglutide Research Peptide in Preclinical Laboratory Models and Purity Testing and Quality Standards for Semaglutide Research Peptides.
Frequently Asked Questions
What is the CAS number for semaglutide?
910463-68-2.
What is the molecular formula and weight of semaglutide?
C187H291N45O59, approximately 4,113.58 g/mol.
What are the key structural modifications in semaglutide?
Aib at position 8 (DPP-4 resistance), Arg in place of Lys at position 34 (leaving Lys-26 as the only lysine), and a C18 fatty diacid chain at Lys-26 via a gamma-Glu/OEG linker (albumin binding).
How is semaglutide synthesized?
Fmoc SPPS assembles the peptide backbone, followed by selective Lys-26 deprotection and fatty acid conjugation, then preparative HPLC purification and lyophilization.
What is the amino acid sequence of semaglutide?
His-Aib-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys(C18 conjugate)-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly.
For qualified researchers, semaglutide research peptide is available from Palmetto Peptides with full Certificate of Analysis documentation.
References
- Lau J, Bloch P, Schaffer L, et al. Discovery of the once-weekly glucagon-like peptide-1 (GLP-1) analogue semaglutide. Journal of Medicinal Chemistry. 2015;58(18):7370-7380. https://doi.org/10.1021/acs.jmedchem.5b00726
- Knudsen LB, Lau J. The discovery and development of liraglutide and semaglutide. Frontiers in Endocrinology. 2019;10:155. https://doi.org/10.3389/fendo.2019.00155
- Muller TD, Finan B, Bloom SR, et al. Glucagon-like peptide 1 (GLP-1). Molecular Metabolism. 2019;30:72-130. https://doi.org/10.1016/j.molmet.2019.09.010
- Marbury TC, Flint A, Jacobsen JB, et al. Pharmacokinetics and tolerability of a single dose of semaglutide. Clinical Pharmacokinetics. 2017;56(11):1381-1390. https://doi.org/10.1007/s40262-017-0528-2
- Fields GB, Noble RL. Solid phase peptide synthesis utilizing 9-fluorenylmethoxycarbonyl amino acids. International Journal of Peptide and Protein Research. 1990;35(3):161-214. https://doi.org/10.1111/j.1399-3011.1990.tb00939.x
Palmetto Peptides | Research Peptides for Qualified Researchers | palmettopeptides.com
Research Use Only. Not for human or veterinary use.