Abstract
It is widely acknowledged that sialyl Lewis X (sLeX), the composition and linkage of which are N-acetylneuraminic acid (Neu5Ac) α2-3 galactose (Gal) β1-4 [fucose (Fuc) α1-3] N-acetylglucosamine, is usually attached to the cell surface. It presents as a terminal structure on either glycoproteins or glycolipids and has been demonstrated to be related to various biological processes, such as fertilization and selectin binding. Due to the vital role of sLeX, its synthesis as well as its determination approaches have attracted considerable attention from many researchers. In this review, the focus is sLeX on glycoproteins. The biological importance of sLeX in fertilization and development, immunity, cancers, and other aspects will be first introduced. Then the chemical and enzymatic synthesis of sLeX including the contributions from more than 15 international research groups will be described, followed by a brief view of the sLeX detection focusing on monosaccharides and linkages. This review is valuable for those readers who are interested in the chemistry and biology of sLeX.
Keywords
Sialyl Lewis X, biological function, synthesis, mass spectrometry, glycoproteins
Introduction
Proteins and DNA can be readily produced on a laboratory scale, but the same is difficult for glycans. The difficulties include the diversity of glycan components and the complexity of monosaccharide linkages within each glycan, together with the technical limitations of current detection tools [1]. In addition, unlike proteins, which use DNA as a template, glycan biosynthesis is not template driven [2–4]. More importantly, isomeric glycans that share an identical chemical formula but differ in structure can be used to build varied polysaccharides, yet they are hard to tell apart on the basis of molecular weight alone [5]. As a result, the study of glycans has lagged behind research into other macromolecules [5].
Glycans on mammalian glycoconjugates are crucial for biological processes [5, 6]. Abnormal glycan modifications are usually related to a variety of diseases. For instance, reverse lectin-based enzyme-linked immunosorbent assay (ELISA) showed increased fucosylation on haptoglobin in sera of ovarian cancer patients, while the protein level of haptoglobin remained the same between the patients and controls [7]. In addition, glycans can be tissue specific. For instance, mouse brain N-glycans, which are predominantly composed of high-mannose (Man) and fucosylated/bisected structures, are less complex in sequence and variety than those of other tissues [8]. Glycans on glycoproteins usually possess conserved core structures; for instance, the core structure of N-glycans contains two N-acetylglucosamine (GlcNAc) residues and three Man residues [9]. Based on the core structures, glycans can be elaborated to form various terminal structures. Sialyl Lewis X (sLeX) is one of these terminal structures, and its structure is shown in Figure 1 [10]. It is usually observed on glycoproteins and glycolipids, such as N-glycans, neolacto-series glycosphingolipids, and other glycoconjugates [11–13]. In this review, the focus is sLeX on glycoproteins.
Structure and location of the tetrasaccharide sLeX. The fucose (Fuc) is α1-3 linked to the GlcNAc, and the galactose (Gal), which carries an α2-3 linked Neu5Ac, is β1-4 linked to the same GlcNAc; the resulting structure is Neu5Ac α2-3Gal β1-4 (Fucα1-3) GlcNAc.
sLeX is a tetrasaccharide that is usually attached to the surface of cells. It is also known as cluster of differentiation 15s (CD15s) [14, 15]. The term Lewis in sLeX originates from the family name of people who suffered from a red blood cell incompatibility [16, 17]. Research on red blood cells led to the introduction of sLeX [18]. The biosynthesis of sLeX involves N-acetylglucosaminyltransferases, β1-4 galactosyltransferases, α1-3 fucosyltransferases (α1-3 fucosyltransferases 3, 5, 6 and 7) and α2-3 sialyltransferases (β galactoside α2-3 sialyltransferases 3, 4 and 6), which are responsible for adding the GlcNAc, the β1-4 linked Gal, the α1-3 linked Fuc and the α2-3 linked Neu5Ac respectively [19–21].
sLeX is a very important red blood cell antigen present on the glycoconjugates on the plasma membrane of the cell [22–24]. It plays an essential role in various biological processes. For instance, Pang et al. [25] and Wang et al. [26] reported that sLeX is the most abundant terminal sequence on the glycans of human zona pellucida glycoproteins involved in sperm-egg binding. Recently, Puan et al. [27] found that basophil rolling is dependent on sLeX expression.
sLeX can be further structurally modified. Adding a sulfate group to the C6 of the GlcNAc results in 6-sulfo-sLeX (Figure 2A), adding a sulfate group to the C6 of the Gal results in 6’-sulfo-sLeX (Figure 2B), and adding sulfate groups to both the C6 of the GlcNAc and the C6 of the Gal yields 6’, 6-bisulfo-sLeX (Figure 2C) [28, 29]. 6-sulfo-sLeX is considered a ligand for L-selectin (contributing to L-selectin recognition) [24, 30, 31], and among the three sulfated sLeX structures it binds L-selectin most strongly [28]. The synthesis of 6-sulfo-sLeX is catalyzed by GlcNAc-6-O-sulfotransferases (GlcNAc6STs) [24, 32]. To date, five GlcNAc6STs have been identified in humans and four in mice [33, 34]. One of them, GlcNAc6ST-2 (also termed L-selectin ligand sulfotransferase), was known to be expressed specifically in high endothelial venules (HEVs), although it was later discovered in mouse colon as well [35, 36]. Initially, 6-sulfo-sLeX was found at the nonreducing termini of the core 2 and extended core 1 branches of O-glycans. However, it was subsequently also found on the N-glycans of HEV glycoproteins in mutant mice that lack core 2 and extended core 1 O-glycans [32, 37].
Chemical structures of 6-sulfo-sLeX (A), 6’-sulfo-sLeX (B), and 6’, 6-bisulfo-sLeX (C). R: residues
Note. Reprinted from “Systematic chemoenzymatic synthesis of O-sulfated sialyl Lewis x antigens,” by Santra A, Yu H, Tasnima N, Muthana MM, Li Y, Zeng J, et al. Chem Sci. 2016;7:2827–31 (https://pubs.rsc.org/en/content/articlelanding/2016/SC/C5SC04104J). © The Royal Society of Chemistry 2016.
With intensive studies focusing on sLeX, the importance of this tetrasaccharide in cell biology has become clear. Therefore, access to this tetrasaccharide has become increasingly necessary.
The total synthesis of sLeX in a complex-type ganglioside was first reported by Japanese scientists in 1991 [38]. Before that, sLeX was isolated from the human kidney and found to be a tumor-associated antigen [39, 40]. The synthesis of sLeX will be introduced here in two aspects: chemical synthesis and enzymatic synthesis. For the chemical synthesis of the sLeX tetrasaccharide, the reported synthetic routes will be classified into four strategies according to the sequence of installing monosaccharide/disaccharide building blocks. Compared with chemical synthesis, which requires great effort in protecting group installation and removal, enzymatic synthesis uses unprotected sugar compounds as building blocks and gives the product through highly selective glycosylations.
Biological functions of sLeX
With more research focusing on sLeX, the significance of this tetrasaccharide in cell biology has been gradually discovered.
In fertilization and development
Human fertilization starts with spermatozoa binding to the oocyte coat, termed the zona pellucida; the resulting fertilized egg is known as a zygote [41]. Glycans on the zona pellucida had long been implicated in sperm binding, but their structures remained enigmatic until 2011, when Pang et al. [25] reported that sLeX is the most abundant terminal sequence on the glycans of human zona pellucida glycoproteins involved in sperm-egg binding, which implied that the sLeX sequence represents the main carbohydrate ligand for sperm-egg binding in humans. Recently, Wang et al. [26] found that the sLeX on the zona pellucida could bind to a protein, chromosome 1 open reading frame 56 (C1orf56), on human spermatozoa.
Embryo development relies on the adhesion of trophoblast cells to the maternal uterus and the diversion of maternal blood to the placenta. The process of trophoblast adhesion is integrin dependent [42, 43]. Therefore, it is likely that trophoblast adhesion to the endometrium during implantation and placentation follows a mechanism similar to that used by selectins and their carbohydrate-based ligands. Liu et al. [44] reported that the sLeX/L-selectin adhesion system at the maternal and embryonic interface regulates the adhesion of the embryo to the maternal uterine epithelium. They set up an in vitro implantation model using a human trophoblast cell line (JAR) and a human uterine epithelial cell line (RL95-2) and found that sLeX was expressed on JAR cells. After JAR cells were transfected with fucosyltransferase VII, which is responsible for synthesizing sLeX, sLeX synthesis increased and, simultaneously, the percentage adhesion of JAR cells to the RL95-2 monolayer increased markedly [43, 44]. In 2016, sLeX-containing N-glycans were identified in human trophoblasts, which supported the previous research and underlined the importance of sLeX in embryo development [12].
However, due to the sample limitation and ethical issues, it is not easy to perform research using a human embryo. Therefore, animal models were employed. The first comprehensive glycomic analysis on zebrafish embryos showed that oligomannose-type glycans and complex N-glycans with galactosyl sLeX antennae were major N-glycans at all developmental stages of the whole zebrafish embryonic samples [45, 46], and this suggested the importance of galactosyl sLeX in the embryo development.
In immunity
sLeX has been identified on around 10% of resting human memory T lymphocytes using anti-sLeX antibodies [47, 48], which indicates its importance in immunity.
In mammals, lymphocyte circulation occurs in lymphatic and vascular areas, which exposes lymphocytes maximally to invading pathogens. Lymphocytes leave the vascular area via lymph nodes, pass through the lymphoid organs, and finally return to the vascular system [48–50]. This circulation path is proposed to rely on glycans displayed on the specialized endothelial cells of HEVs [51].
Lymphocyte homing is regulated via adhesive interactions between lymphocytes and HEVs; in particular, L-selectin on lymphocytes binds to the tetrasaccharide 6-sulfo-sLeX on HEVs. It has been reported that GlcNAc6STs control lymphocyte homing through synthesis of the ligand 6-sulfo-sLeX on HEVs, and that this ligand is located on either the mucin-type branched core 2 O-glycan or the extended core 1 O-glycan of the endothelial sialomucin CD34 [31, 49]. The function of this carbohydrate structure in lymphocyte homing has been demonstrated mainly through research using several mouse models deficient in the related glycosyltransferases [52]. In detail, studies employing mice doubly deficient in β galactoside α2-3 sialyltransferases 4 and 6 uncovered the coordinated involvement of these two sialyltransferases in the synthesis of functional oligosaccharides that mediate lymphocyte homing to HEVs [53]. Studies using mice deficient in both α1-3 fucosyltransferases 4 and 7 revealed that the fucosylation of 6-sulfo-sLeX in peripheral lymph node HEVs is vital for the interaction with L-selectin [54]; compared with wild-type mice, lymphocyte homing to peripheral lymph nodes was impaired by more than 80% in these fucosyltransferase double-knockout (DKO) mice [52, 54]. Studies using GlcNAc6ST-1 and GlcNAc6ST-2 DKO mice showed a reduction of more than 70% in lymphocyte homing, which suggested the importance of GlcNAc-6-O-sulfation in L-selectin ligand synthesis [31, 55].
Asthma is a chronic inflammatory disease that results in severe leukocyte infiltration in the lungs. Infiltration requires the binding of sLeX on the leukocytes to the E- and P-selectins on the endothelial surface at the site of inflammation [56]. These two selectins probably function similarly in regulating the infiltration: sLeX-capped O-glycans on P-selectin glycoprotein ligand 1 (PSGL-1) at the leukocyte surface bind to the two selectins and thus make leukocytes roll in the direction of blood flow [57–60]. Without the binding of PSGL-1 to the two selectins, leukocytes cannot initiate rolling on the endothelial surface [27, 61, 62]. For leukocytes to roll, selectin-sLeX binding must overcome the hydrodynamic force of the blood flow. Zhang et al. [63] identified the molecular determinants within sLeX that contribute to the binding using single-molecule dynamic force spectroscopy; two determinants in sLeX are required for selectin binding, namely the Fuc and the terminal Neu5Ac.
In cancers
sLeX is a well-known cancer-associated carbohydrate structure [48, 64, 65]. It plays a vital role in tumor cell metastasis, and increased levels of sLeX correlate positively with metastasis [65]. For instance, when mouse melanoma B16-F1 cells were transfected with α1-3 fucosyltransferase 3 to express sLeX, the transfected cells with a high sLeX expression level became highly metastatic compared to wild-type B16-F1 cells or cells with a low sLeX expression level. In addition, sLeX overexpression on B16-F1 cells resulted in apoptosis in lung tissues, indicating that these cells were eliminated by natural killer cells [66].
It also plays an important role in cancer cell invasion. For instance, Gomes et al. [67] found that the expression of β galactoside α2-3 sialyltransferase 4 in MKN45 gastric cancer cells resulted in sLeX expression and subsequently caused an increased invasive phenotype, via c-Met activation, both in vitro and in the in vivo chicken chorioallantoic membrane (CAM) model. However, it is not clear whether the sLeX is on glycoprotein or glycolipid.
Glycosylation has emerged as a cancer hallmark; some of the biomarkers used in oncology are cancer-associated glycans, and sLeX is one of them [68]. Serum sLeX has been proposed as a marker for the detection of breast cancer [69, 70]. An increase in the sLeX level on ceruloplasmin in pancreatic adenocarcinoma patients was identified via N-glycan analysis of ceruloplasmin. After immunoprecipitation with an anti-ceruloplasmin antibody and analysis by western blot, the sLeX/ceruloplasmin ratio in sera from pancreatic adenocarcinoma patients tended to be higher than that from healthy controls and chronic pancreatitis patients [71]. Tang et al. [72] reported that sLeX could be used as a biomarker for pancreatic cancer in combination with sialyl Lewis A (sLeA); the combination differentiated 109 pancreatic cancers from 91 benign pancreatic diseases with 79% accuracy (74% sensitivity and 78% specificity), which was noticeably better than employing sLeA alone.
Others
sLeX plays an important role in virus attachment. For instance, coronaviruses can cause human respiratory tract infections and deadly worldwide outbreaks of pneumonia [73]. Middle East respiratory syndrome coronavirus (MERS-CoV) targets the epithelial cells of the respiratory tract in humans. Proteins or glycolipids carrying sLeX on the surface of human airway epithelial cells can be used as an attachment receptor by MERS-CoV, thereby increasing infection efficiency. Removing cell surface Neu5Ac with neuraminidase inhibited MERS-CoV from entering human airway cells [73, 74].
Additionally, sLeX is essential for angiogenesis. Its function in angiogenesis is demonstrated by the observation that the emergence of tube-like networks of endothelial cells, induced by co-culture with cancer cells, can be inhibited by sLeX antibodies [24, 75]. Once sLeX biosynthesis was blocked, the ability of hepatocarcinoma cells to promote angiogenesis was hindered [24, 76].
The syntheses of sLeX
In nature, sLeX exists in glycoproteins and glycolipids in many different forms. The sLeX tetrasaccharide is either directly linked to the peptide or present as a terminal structure of a more complex oligosaccharide. Among those diverse structures, the tetrasaccharide 1, comprising lactosamine bearing α2-3 sialylation and α1-3 fucosylation, has been a popular synthetic target over the past three decades, owing to its potential in anti-adhesion drug development. Furthermore, although its synthesis is no longer particularly difficult for chemists armed with the currently available glycosylation methodologies, the challenges posed by the α sialylation and α fucosylation make sLeX a good target for demonstrating the methodologies developed since the 1990s. Herein, this review will focus on the chemical and enzymatic syntheses of the sLeX tetrasaccharide, including contributions from more than 15 research groups.
Chemical synthesis
For the chemical synthesis of sLeX tetrasaccharide, the reported synthetic routes can be classified into four strategies according to the sequence of installing monosaccharide/disaccharide building blocks (Figure 3). For a designated strategy, the orthogonal protecting groups and glycosyl donor types (with different leaving groups) can be varied to pursue high reactivity and stereoselectivity.
Four strategies for chemical synthesis of sLeX tetrasaccharide
In 1991, Nicolaou et al. [77] reported the chemical total synthesis of sLeX based on Strategy 1 (Figure 4). The protected lactosamine 4 was synthesized from GlcNAc acceptor 3 and Gal fluoride donor 2. After releasing the 3-OH group by a two-step protocol of allyl removal (double bond migration and acidolysis), a highly efficient α-fucosylation was achieved using Fuc fluoride donor 5. Next, the sialic acid glycosyl donor 7 bearing an equatorial 3-PhS group was used to facilitate the α-sialylation. The 3-PhS group served as an auxiliary and controlled the α-selectivity via neighboring group participation. Finally, the PhS group was removed via a radical process, and sLeX 1 was obtained after global deprotection. In the following years, this BC + D + A strategy (Figure 3) was adopted by Hasegawa et al. [78, 79], Jain et al. [80, 81], Vig et al. [82], Ellervik and Magnusson [83], Herzner and Kunz [84], Filser et al. [85], and Lu et al. [86] to complete syntheses of the sLeX tetrasaccharide and related conjugates, using different sialic acid glycosyl donors and Fuc donors.
An overview of Strategy 1 for chemical synthesis of sLeX tetrasaccharide
In 1992, Danishefsky et al. [87–89] reported the synthesis of 1 based on Strategy 2 (Figure 5). The disaccharide 11 was first prepared via α-fucosylation at the more reactive 3-OH of the glycal acceptor 10. Subsequently, without protecting group manipulation, the less reactive 4-OH was involved in the β-galactosylation using donor 12. After removing the three benzoyl groups on the Gal ring of the trisaccharide 13, the sialyl chloride donor 14 reacted selectively with the most reactive 3-OH, followed by acetylation, to give the tetrasaccharide 15 in good overall yield. Serving as a surrogate of the glucosamine, the glycal in 15 was subjected to an iodonium-mediated two-step transformation (iodoamination and aziridine formation/ring-opening) to install the 2-amino group, and product 16 was deprotected globally to give 1. This CD + B + A strategy (Figure 3) was adopted by Sprengard et al. [90], Kretzschmar and Stahl [91], Misra et al. [92], and Dekany et al. [93] in their synthetic work, although glucosamine building blocks were used directly instead of the glycal in the first glycosylation step.
An overview of Strategy 2 for chemical synthesis of sLeX tetrasaccharide
In 1998, to further improve the efficiency of the synthetic route, Baba et al. [94] reported the synthesis based on Strategy 3 (Figure 6), in which the tetrasaccharide was assembled in a convergent AB + CD manner (Figure 3). The disaccharide 19, which was prepared from 17 and 18 in high yield, was subjected to regioselective benzylidene opening and glycosylation with disaccharide donor 20. The tetrasaccharide product 21 was then transformed to 1 after global deprotection. In this route, the two key glycosylation reactions were facilitated by dimethyl(methylthio)sulfonium triflate (DMTST) mediated activation of methyl thioglycosides, which was developed by the same group. This strategy was also adopted by Gege et al. [95] and Akçay et al. [96] to obtain sLeX derivatives.
An overview of Strategy 3 for chemical synthesis of sLeX tetrasaccharide
In 2003, Pazynina et al. [97] reported Strategy 4, in which the tetrasaccharide 27 was assembled in a C + AB + D manner (Figure 3, Figure 7). The linear trisaccharide 24 was prepared from disaccharide donor 22 and acceptor 23, and the chloroacetyl (ClAc) group was selectively removed by ethylenediamine to release the 3-OH. Subsequently, α-fucosylation with donor 25 gave product 26, and global deprotection gave 27 in good yield. This work also demonstrated that the hindered trisaccharide can serve as a good acceptor in the α-fucosylation.
An overview of Strategy 4 for chemical synthesis of sLeX tetrasaccharide
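The four assembly orders described above can be summarized compactly. The following is only a minimal sketch: the letter codes (A = Neu5Ac, B = Gal, C = GlcNAc, D = Fuc) are an assumption inferred from the route descriptions in the text, since Figure 3 itself is not reproduced here, and the helper names are hypothetical.

```python
# Letter codes are an ASSUMPTION inferred from the text, not taken from Figure 3:
# A = Neu5Ac, B = Gal, C = GlcNAc, D = Fuc.
RESIDUES = {"A": "Neu5Ac", "B": "Gal", "C": "GlcNAc", "D": "Fuc"}

# Order in which building blocks are brought together, as described in the text.
STRATEGIES = {
    1: ["BC", "D", "A"],  # lactosamine first, then Fuc, then Neu5Ac
    2: ["CD", "B", "A"],  # fucosylated glucosamine surrogate, then Gal, then Neu5Ac
    3: ["AB", "CD"],      # convergent coupling of two disaccharides
    4: ["C", "AB", "D"],  # GlcNAc plus sialylated Gal, Fuc installed last
}


def describe(blocks):
    """Spell out a strategy such as ['AB', 'CD'] with residue names."""
    return " + ".join("-".join(RESIDUES[code] for code in block) for block in blocks)


def couplings_between_blocks(blocks):
    """Number of glycosylations needed to join the listed blocks to each other."""
    return len(blocks) - 1


for number, blocks in STRATEGIES.items():
    # Sanity check: every strategy uses each of the four residues exactly once.
    assert sorted("".join(blocks)) == list("ABCD")
    print(number, describe(blocks), couplings_between_blocks(blocks))
```

Under these assumed codes, only Strategy 3 joins two pre-formed disaccharides in a single coupling, which is what makes it convergent.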
In 2012, leveraging Strategy 4, Esposito et al. [98] and Kröck et al. [99] developed the solid phase synthesis of sLeX derivative 35. As shown in Figure 8, resin 28, with a cleavable linker installed, was used as the support for the whole synthesis. The GlcNAc building block was first installed by reaction with donor 29, and the resin-bound acceptor 30 was obtained after 9-fluorenylmethyloxycarbonyl (Fmoc) removal. Then, a Lewis acid catalyzed glycosylation using disaccharide donor 31 gave the linear trisaccharide, and the new acceptor 32 was obtained after levulinoyl (Lev) deprotection. Finally, N-iodosuccinimide (NIS) mediated α-fucosylation using donor 33 gave the resin-bound tetrasaccharide 34, which was transformed into 35 via radical based reduction of the trichloroacetamide and global deprotection. This solid phase synthesis, comprising 6 on-resin steps, 1 linker cleavage step, and 2 in-solution deprotection steps, gave 35 in 15% overall yield with minimized purification efforts.
An overview of solid phase synthesis of sLeX derivative
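To put the reported 15% overall yield in perspective, it can be converted into an average yield per step. This is a back-of-the-envelope sketch that assumes all nine steps (6 on-resin, 1 cleavage, 2 in-solution) contribute equally, which the original reports do not claim; the variable names are illustrative.

```python
overall_yield = 0.15   # overall yield of 35 reported in the text
n_steps = 6 + 1 + 2    # on-resin + linker cleavage + in-solution deprotection

# Geometric mean: the uniform per-step yield y that satisfies y ** n_steps == overall_yield.
# ASSUMPTION: every step is equally efficient, which is a simplification.
avg_step_yield = overall_yield ** (1 / n_steps)

print(f"average yield per step: {avg_step_yield:.1%}")
```

Under this simplification, each step would proceed in roughly 81% yield on average.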
Enzymatic synthesis
Compared with chemical synthesis, which requires great effort in protecting group installation and removal, enzymatic synthesis (Figure 9) uses unprotected sugar compounds as building blocks and gives the product through highly selective glycosylations. The first attempt at the enzymatic synthesis of sLeX was reported by Palcic et al. in 1989 [100]. In this work, the α-sialylation of the disaccharide N-acetyllactosamine (LacNAc) was catalyzed by porcine submaxillary α2-3 sialyltransferase, while the α-fucosylation was catalyzed by an α3/4 fucosyltransferase isolated from the milk of a human Le(a+b-) donor. A similar synthesis using enzymes obtained from different sources was reported by de Vries et al. [101]. In 1991, Dumas et al. [102] reported that a recombinant α3/4 fucosyltransferase could catalyze the α-fucosylation of a series of disaccharide acceptors and of 3’-sialyl LacNAc. A chemically synthesized trisaccharide with a reducing end block was also suitable for this transformation [103, 104]. This strategy was also applied to the synthesis of sLeX analogs bearing N-modifications [105].
Enzymatic synthesis of sLeX tetrasaccharide. PEP: phosphoenolpyruvate
In 1992, Ichikawa et al. [106] reported the enzymatic total synthesis of sLeX from monosaccharides. In this work, uridine diphosphate (UDP)-Gal, cytidine 5’-monophospho (CMP)-Neu5Ac, and guanosine diphosphate (GDP)-Fuc were used as the donors in the three enzymatic glycosylation steps respectively. Using this process, an sLeX analog containing 13C-labelled Gal was synthesized. The same process was also applied to the synthesis of sLeX-containing glycopeptides [107] and further extended to solid phase oligosaccharide synthesis [108]. Because the limited availability of the glycosyl donors hampered scale-up, Ichikawa et al. [106] developed two multiple enzyme systems to generate UDP-Gal and CMP-Neu5Ac from glucose-1-phosphate (Glc-1-P) and Neu5Ac respectively. In another work, reported by Hayashi et al. [109], UDP-Gal was generated from UDP-Glc by UDP-Gal epimerase (UDPGE) catalyzed epimerization. The linear trisaccharide was synthesized via enzymatic β-galactosylation and chemical α-sialylation. By this process, a series of sLeX analogues with N-modification on the glucosamine unit was obtained.
In 2011, a new multiple enzyme system for the α-sialylation reaction was developed by Sugiarto et al. [110]. In this system, the enzyme Neisseria meningitidis CMP-sialic acid synthetase (NmCSS) catalyzed the synthesis of CMP-Neu5Ac from Neu5Ac and cytidine triphosphate (CTP), while the maltose binding protein-viral α2-3 sialyltransferase I (N-terminal 30 amino acids truncated)-His6 tag fusion (MBP-Δ30vST3Gal-I-His6) catalyzed the α-sialylation. This system was used by the same group in the synthesis of sulfated sLeX analogues [29]. In 2019, Tasnima et al. [111] reported a new system to facilitate the gram-scale synthesis of sLeX. In this new process, both UDP-Gal and GDP-Fuc were synthesized enzymatically, from Gal and Fuc respectively. The CMP-Neu5Ac was synthesized from N-acetylmannosamine (ManNAc) by two enzymatic reactions catalyzed by Pasteurella multocida sialic acid aldolase (PmNanA) and NmCSS and used in the α-sialylation, as they had demonstrated before [112]. More importantly, mannosamine derivatives bearing modifications at the 6-OH and 2-NH2 sites were well tolerated by these enzymes, and a series of sLeX analogues was synthesized in good yields.
The detection of sLeX
Since the focus is sLeX on glycoproteins, it is necessary to consider releasing glycans from glycopeptides/glycoproteins. The glycan isolation method mainly depends on how the glycans are attached to the protein. An N-glycan is attached to the protein through an asparagine (Asn) residue in the conserved motif Asn-X-serine (Ser)/threonine (Thr), in which X can be any amino acid except proline (Pro) [113, 114]. N-glycans can be released from the protein by peptide-N-glycosidases (PNGases) [115, 116]. An O-linked glycan is attached to either a Ser or a Thr residue [21]. Unlike N-glycans, which share one core structure, mucin-type O-glycans are built on eight core structures. No enzyme comparable to the PNGases can remove all O-linked glycans, so chemical approaches, e.g., reductive β-elimination, are usually employed to release them. For the released glycans, purification steps are usually required to remove salts and reagents before mass spectrometry (MS) analysis [21, 114]. Additionally, chemical derivatization, such as permethylation, is usually employed to improve sensitivity and reproducibility [117, 118]. As previously mentioned, sLeX can be located on either N-glycans or O-glycans; samples therefore need to be processed as previously described [12, 119, 120], and the procedure for sample preparation will not be addressed here. Because sLeX consists of four monosaccharides (a GlcNAc, a Fuc, a Gal, and a Neu5Ac), detecting sLeX means determining the presence of Neu5Ac, Fuc, Gal, and GlcNAc. However, as the GlcNAc is innermost, it is relatively fixed and usually not the focus. To determine the presence of these four residues, a tandem mass spectrometry (MS2) experiment can be designed to detect a potential sLeX based on the observation, upon fragmentation, of a fragment ion with a mass to charge ratio (m/z) of 803.29 (or m/z 1021.4 if it is permethylated and sodiated).
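The diagnostic value of m/z 803.29 can be checked from the monoisotopic residue masses of the four monosaccharides. The sketch below assumes a singly protonated, underivatized oxonium-type fragment (sum of the four residue masses plus one proton); the function and dictionary names are illustrative and not taken from any software package.

```python
# Monoisotopic residue masses in Da (free monosaccharide minus H2O).
RESIDUE_MASS = {
    "Neu5Ac": 291.09542,  # C11H17NO8
    "Gal": 162.05282,     # C6H10O5
    "Fuc": 146.05791,     # C6H10O4
    "GlcNAc": 203.07937,  # C8H13NO5
}
PROTON = 1.007276  # mass of a proton in Da


def oxonium_mz(residues, charge=1):
    """m/z of a protonated fragment built only from the listed residues."""
    neutral = sum(RESIDUE_MASS[name] for name in residues)
    return (neutral + charge * PROTON) / charge


slex_mz = oxonium_mz(["Neu5Ac", "Gal", "Fuc", "GlcNAc"])
print(f"{slex_mz:.2f}")  # 803.29
```

Because sLeA has the same composition, this ion indicates a candidate sLeX only; it does not by itself establish the linkages.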
Furthermore, the linkage between every two monosaccharides needs to be determined. Linkage information is vital; without it, glycans are likely to be identified incorrectly. For instance, sLeA is easily mistaken for sLeX because both have the same monosaccharide composition, but the linkages among Gal, Fuc, and GlcNAc differ: Gal β1-4(Fuc α1-3)GlcNAc in sLeX but Gal β1-3(Fuc α1-4)GlcNAc in sLeA [24].
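The sLeX/sLeA ambiguity can be made concrete by writing each structure as a set of linkages. This is a simplified illustration; the tuple encoding (donor, linkage, acceptor) is an assumption made for this sketch, and "a"/"b" stand for α/β.

```python
# Each linkage is encoded as (donor residue, linkage, acceptor residue).
SLEX = {
    ("Neu5Ac", "a2-3", "Gal"),
    ("Gal", "b1-4", "GlcNAc"),
    ("Fuc", "a1-3", "GlcNAc"),
}
SLEA = {
    ("Neu5Ac", "a2-3", "Gal"),
    ("Gal", "b1-3", "GlcNAc"),
    ("Fuc", "a1-4", "GlcNAc"),
}


def composition(linkages):
    """Set of residues present, ignoring how they are connected."""
    return {residue for donor, _, acceptor in linkages for residue in (donor, acceptor)}


same_composition = composition(SLEX) == composition(SLEA)
shared_linkages = SLEX & SLEA
differing_linkages = SLEX ^ SLEA

print(same_composition)            # True: composition alone cannot separate them
print(sorted(differing_linkages))  # the Gal and Fuc attachments to GlcNAc
```

Only the two attachments to GlcNAc differ, which is why composition-level evidence has to be complemented by linkage analysis.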
Detection approaches for Neu5Ac α2-3Gal
Sialic acids, a series of nine carbon acidic monosaccharides, usually exist as the terminal sugars on glycoproteins at the cell surface [121, 122]. Neu5Ac is a member of the sialic acid family, and it is the most common sialic acid in humans [21, 123].
Usually, glycans containing Neu5Ac require extra considerations. Because sialic acid linkages are highly labile compared to other glycosidic bonds, sialic acid residues are easily lost during ionization in mass spectrometric analysis [124]. The presence of sialic acid on glycoconjugates poses other analytical difficulties. For instance, the negative charge on the monosaccharide complicates quantitation. Additionally, the presence of sialyl linkage isomers increases the difficulty of analyzing sialylated glycans [124, 125].
There are several linkages for Neu5Ac: α2-3, α2-6, α2-8 and α2-9 [6, 21, 126]. In nature, Neu5Ac is α2-3 or α2-6 linked to Gal and N-acetylgalactosamine (GalNAc), α2-6 linked to GlcNAc, and α2-8 or α2-9 linked to a second Neu5Ac residue [21, 127]. In most cases, distinguishing α2-3 from α2-6 linked Neu5Ac is adequate.
Enzyme treatment
Sialidases are a large group of enzymes, and the majority of them cleave terminal sialic acids from complex carbohydrates on glycoconjugates [123, 128]. Sialidase S, one of these enzymes, detaches only non-reducing terminal unbranched α2-3 linked Neu5Ac from glycoconjugates [129]. It is usually employed together with sialidase A which can cleave all non-reducing terminal Neu5Ac from glycoconjugates [130].
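The logic of pairing the two sialidases can be sketched as a small decision function. It assumes an idealized experiment: complete digestion, a glycan carrying a single terminal Neu5Ac, and a readout of whether that residue is lost after each treatment. The function name and return labels are hypothetical.

```python
def infer_sialyl_linkage(lost_after_sialidase_s: bool, lost_after_sialidase_a: bool) -> str:
    """Interpret an idealized differential sialidase digest of one sialylated glycan.

    Sialidase S is taken to remove only alpha2-3 linked Neu5Ac and sialidase A
    to remove all terminal Neu5Ac, as stated in the text.
    """
    if lost_after_sialidase_s and not lost_after_sialidase_a:
        # Sialidase A should remove anything sialidase S removes.
        raise ValueError("inconsistent digest results")
    if not lost_after_sialidase_a:
        return "no terminal Neu5Ac"
    if lost_after_sialidase_s:
        return "alpha2-3"
    return "not alpha2-3 (e.g. alpha2-6)"


print(infer_sialyl_linkage(True, True))   # alpha2-3
print(infer_sialyl_linkage(False, True))  # not alpha2-3 (e.g. alpha2-6)
```

Real digests are rarely this clean, so the result is best treated as supporting evidence alongside the MS-based approaches.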
Derivatization coupled with MS
Normally, α2-3 and α2-6 linked Neu5Ac are present at the termini of glycans in humans, and they are difficult to distinguish. Derivatization of Neu5Ac was designed to discriminate the different linkages by MS.
MS is a powerful analytical technique that has been used intensively in glycomics [131, 132]. Nishikaze et al. [133] reported a derivatization termed sialic acid linkage specific alkylamidation (SALSA) (Figure 10), which consists of two sequential alkylamidation steps. As a result of the reactions, α2-6 and α2-3 linked Neu5Ac residues are differentiated by a mass difference of 28.031 dalton (Da) in the matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrum [133]. In addition, α2-6 and α2-3 linked Neu5Ac can be differentiated by electrospray ionization tandem mass spectrometry (ESI MS2) via the mass difference caused by modification with methylamine and isopropylamine [134].
SALSA combined with chemoselective glycan purification using hydrazide beads and linkage-specific sialic acid stabilization [133]. α2-6 and α2-3 linked Neu5Ac residues would gain +41 Da and +13 Da respectively.
Note. Reprinted from “Differentiation of sialyl linkage isomers by one-pot sialic acid derivatization for mass spectrometry-based glycan profiling,” by Nishikaze T, Tsumoto H, Sekiya S, Iwamoto S, Miura Y, Tanaka K. Anal Chem. 2017;89:2353–60 (https://pubs.acs.org/doi/10.1021/acs.analchem.6b04150). © 2017, American Chemical Society.
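The 28.031 Da difference quoted for SALSA follows from the two amines involved. As a minimal sketch, each amidation is modelled as the carboxylic acid gaining the amine and losing water; the helper names are illustrative, and no linkage assignment is implied by the calculation itself.

```python
# Monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


WATER = {"H": 2, "O": 1}
METHYLAMINE = {"C": 1, "H": 5, "N": 1}     # CH3NH2
ISOPROPYLAMINE = {"C": 3, "H": 9, "N": 1}  # (CH3)2CHNH2


def amidation_shift(amine):
    """Net mass change when -COOH reacts with an amine and loses H2O."""
    return mass(amine) - mass(WATER)


methyl_shift = amidation_shift(METHYLAMINE)
isopropyl_shift = amidation_shift(ISOPROPYLAMINE)

print(f"methylamidation:    {methyl_shift:+.3f} Da")
print(f"isopropylamidation: {isopropyl_shift:+.3f} Da")
print(f"difference:         {isopropyl_shift - methyl_shift:.3f} Da")
```

The difference equals the mass of C2H4, which is why the two derivatives are separated by about 28.031 Da per sialic acid.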
Zhou et al. [135] also reported a two-step derivatization, illustrated in Figure 11, by which α2-3 and α2-6 linked Neu5Ac on N-glycans can be distinguished by MALDI-TOF. After derivatization, α2-3 linked Neu5Ac forms a lactone that is subsequently converted to the amide, whereas α2-6 linked Neu5Ac is dimethylamidated; in theory, this results in mass shifts of –0.984 Da for α2-3 sialylated lactose and +27.047 Da for α2-6 sialylated lactose in the MALDI-TOF mass spectrum.
Figure 11. Scheme of the two-step derivatization method. α2-3 and α2-6 linked Neu5Ac residues would have mass shifts of –0.984 Da and +27.047 Da, respectively [135].
Note. Reprinted from “Two-step derivatization and mass spectral distinction of α2,3 and α2,6 sialic acid linkages on N-glycans by MALDI-TOF,” by Zhou XX, Yang S, Yang GL, Tan ZQ, Guan F. Chin Chem Lett. 2019;30:676–80 (https://www.sciencedirect.com/science/article/abs/pii/S1001841718304844). © 2018 Published by Elsevier B.V. on behalf of Chinese Chemical Society and Institute of Materia Medica, Chinese Academy of Medical Sciences.
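Because each derivatized Neu5Ac contributes a fixed shift, the total shift of a glycan indicates how many residues of each linkage it carries. The following is a minimal sketch of that bookkeeping, using the two theoretical shifts quoted above; the function names, the tolerance of 0.02 Da, and the example inputs are hypothetical and are not taken from [135].

```python
# Theoretical per-residue shifts quoted in the text (Da).
SHIFT_A23 = -0.984   # alpha2-3 linked Neu5Ac after amidation
SHIFT_A26 = 27.047   # alpha2-6 linked Neu5Ac after dimethylamidation


def expected_shift(n_a23, n_a26):
    """Total mass shift for a glycan carrying the given linkage counts."""
    return n_a23 * SHIFT_A23 + n_a26 * SHIFT_A26


def assign_linkages(observed_shift, n_sialic, tol=0.02):
    """List (n_a23, n_a26) pairs whose expected shift matches the observed one.

    n_sialic is the total number of Neu5Ac residues; tol is an assumed
    matching tolerance in Da.
    """
    matches = []
    for n_a26 in range(n_sialic + 1):
        n_a23 = n_sialic - n_a26
        if abs(expected_shift(n_a23, n_a26) - observed_shift) <= tol:
            matches.append((n_a23, n_a26))
    return matches


# Enumerate the three possible disialylated compositions.
for n_a23, n_a26 in [(2, 0), (1, 1), (0, 2)]:
    print(n_a23, n_a26, f"{expected_shift(n_a23, n_a26):+.3f} Da")
```

For a disialylated glycan, for example, the three possible compositions are separated by about 28 Da, so a hypothetical observed shift of +26.063 Da would be assigned to one α2-3 and one α2-6 linked Neu5Ac.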
Hydrophilic interaction liquid chromatography coupled with MS
Traditional glycomic profiling cannot readily differentiate sialylated N-glycan linkage isomers [136]. Tao et al. [137] presented a liquid chromatography-selected reaction monitoring (LC-SRM) approach that makes it possible to quantitate individual Neu5Ac linkage isomers. The LC method separates sialylated N-glycan isomers differing in α2-3 and α2-6 linkages on a superficially porous particle penta-hydrophilic interaction liquid chromatography (HILIC) column, and selected reaction monitoring (SRM) detection provides the relative quantitation of each Neu5Ac linkage isomer.
Usually, isomeric glycans are resolved according to the ratio of α2-3 to α2-6 Neu5Ac linkages present in the glycoform, with α2-3 linked Neu5Ac glycoforms eluting before α2-6 linked ones [138, 139], as shown in Figure 12. Figure 12 also shows that the peptide backbone causes the glycopeptides to be retained longer in the HILIC separation than the released glycans alone, although the presence of the peptide leads to only minimal shifts in retention [138].
Figure 12. Comparison of the separation obtained from released N-glycans and glycopeptides of fetuin by HILIC [138]. A) HILIC separation of procainamide-labeled, released N-glycans; B) HILIC separation of glycopeptides with the same peptide backbone. PEP denotes the peptide, whose sequence is LCPDCPLLAPLNDSR; glycosylation occurs on N.
Note. Reprinted from “Resolving isomeric glycopeptide glycoforms with hydrophilic interaction chromatography (HILIC),” by Huang Y, Nie Y, Boyes B, Orlando R. J Biomol Tech. 2016;27:98–104 (https://doi.org/10.7171/jbt.16-2703-003). © Association of Biomolecular Resource Facilities.
Additionally, Yang et al. [140] reported a two-step solid-phase matrix-based method for the sequential derivatization of glycopeptides containing α2-6 and α2-3 linked Neu5Ac. The mass shifts arise from ethyl esterification, which modifies α2-6 linked Neu5Ac, and ethylenediamine amidation, which derivatizes α2-3 linked Neu5Ac. This results in a mass difference of 14.0268 Da between an α2-3 and an α2-6 linked Neu5Ac [140].
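The quoted mass difference can be reproduced with a short calculation. This is a sketch that assumes each reaction is a simple condensation of the Neu5Ac carboxyl group with ethanol or ethylenediamine and loss of one water molecule; the monoisotopic masses are rounded, so the last digit may differ slightly from the value in [140].

```latex
\begin{align*}
\Delta m_{\text{ethyl ester}}
  &= M(\mathrm{C_2H_6O}) - M(\mathrm{H_2O})
   = 46.04186 - 18.01056 \approx 28.0313\ \text{Da} \\
\Delta m_{\text{ethylenediamine amide}}
  &= M(\mathrm{C_2H_8N_2}) - M(\mathrm{H_2O})
   = 60.06875 - 18.01056 \approx 42.0582\ \text{Da} \\
\Delta m_{\text{amide}} - \Delta m_{\text{ester}}
  &\approx 42.0582 - 28.0313 \approx 14.027\ \text{Da}
\end{align*}
```

The result agrees with the reported difference of 14.0268 Da per Neu5Ac residue to within rounding.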
Others
As technology has developed, additional detection approaches have emerged. For instance, ion mobility spectrometry (IMS) has been shown to separate and identify α2-3 and α2-6 linked Neu5Ac on released N-glycans [141]. IMS adds a separation dimension that resolves isobaric species by their gas-phase collision cross-section (CCS); however, its application is usually limited by the availability of databases of carbohydrate CCS values [142, 143].
In addition, lectins can be used for detection. Lectins are a group of glycan-binding proteins that bind to specific glycan structures [144]. Sambucus nigra (elderberry) agglutinin (SNA) is reported to recognize α2-6 linked Neu5Ac [145, 146], while Maackia amurensis lectin II (MALII) binds to α2-3 linked sialic acids [147].
Detection approaches for Gal β1-4(Fucα1-3)GlcNAc
Gal β1-4(Fucα1-3)GlcNAc is termed Lewis X.
Fuc is a 6-deoxy hexose in the L-configuration found in a great variety of organisms. In mammals, there are 13 fucosyltransferases responsible for transferring Fuc from GDP-Fuc to glycoconjugates [148]. In humans, there are six α1-3 fucosyltransferases [20]. For a more detailed description of Fuc, please see these reviews [148–150]. A study of 3,299 mammalian oligosaccharides showed that Fuc was found in approximately 7.2% of the oligosaccharides studied and was thus the second most common component [151]. Fuc occurs in several linkages: α1-2, α1-3, α1-4, and α1-6. In nature, Fuc is α1-2 linked to Gal; α1-3, α1-4, and α1-6 linked to GlcNAc; and α1-2 and α1-4 linked to Fuc [148, 152]. For sLeX detection, it is the α1-3 linked Fuc that must be determined.
Gal differs from glucose in the configuration of the hydroxyl group at the C4 position [153]. It exists either as a free sugar or bound to other monosaccharide units in various linkages in glycoproteins and glycolipids [154, 155]. For a more detailed description of Gal, please see these reviews [154, 156, 157]. So far, 19 distinct galactosyltransferases have been characterized in mammals, which give rise to four linkages for Gal: α1-3, α1-4, β1-3, and β1-4 [158]. Gal linked to GalNAc via an α1-3 linkage is observed in the O-glycan core 8 structure [21]. Gal linked to Gal via an α1-4 linkage is observed in the glycosphingolipid Gb4 structure [159]. Gal linked to GlcNAc via β1-3 and β1-4 linkages is observed in type 1 and type 2 LacNAc, respectively [160, 161].
MS2 fragmentation
During MS2 fragmentation (Figure 13), the substituent at the C3 position of the glycan ring can be β-eliminated [3, 162], which helps to identify 1-3 linked Fuc. For instance, the Fuc in Figure 14 can be confirmed to be 3-linked because the signal at m/z 2386 corresponds to the loss of a Fuc from the C3 position of GlcNAc via β-elimination. The fragment ion at m/z 660 is consistent with an oxonium ion for a glycan structure consisting of Gal, GlcNAc, and Fuc, and its concurrent ion at m/z 1955 is also observed. Because C3 of GlcNAc is occupied by Fuc, the Gal should be 1-4 linked; the sequence of these three monosaccharides is thus Gal β1-4(Fucα1-3)GlcNAc.
Figure 13. The mechanism of β-elimination at the 3-position of the oxonium ion during MS fragmentation. The glycans illustrated here are permethylated.
Figure 14. Annotated MALDI-TOF/time of flight (TOF) MS2 spectrum of the permethylated N-glycan at m/z 2592 in human cytotrophoblasts. Data were acquired in the form of [M + Na]+ ions. To simplify the annotation, only fragment ions related to Fuc have been annotated. Peaks were annotated with putative fragment ions according to the molecular weight.
Note. Reprinted from “Mass spectrometric investigation of biomedically important glycosylation,” by Chen Q. London: Imperial College London; 2015 (https://spiral.imperial.ac.uk/handle/10044/1/56202). CC BY NC ND.
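The fragment assignments discussed for Figure 14 can be cross-checked with simple mass arithmetic. The following is a minimal sketch under two assumptions that are not stated in the source: that the β-eliminated species is a neutral permethylated fucose of composition C9H18O5 (fucose, C6H12O5, carrying three O-methyl groups), and that the ions at m/z 660 and m/z 1955 are both sodiated and arise from the same cleavage, so that their sum equals the precursor m/z plus one sodium. Because the text quotes integer m/z values, a wide tolerance of 0.5 is assumed. All function names are hypothetical.

```python
# Monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956, "Na": 22.9897692809}

# Assumed composition of the neutral permethylated fucose that is eliminated.
PERMETHYL_FUC = {"C": 9, "H": 18, "O": 5}


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def is_fuc_elimination(precursor_mz, fragment_mz, tol=0.5):
    """True if the neutral loss matches elimination of permethylated fucose."""
    loss = precursor_mz - fragment_mz
    return abs(loss - mono_mass(PERMETHYL_FUC)) <= tol


def are_complementary(mz_a, mz_b, precursor_mz, tol=0.5):
    """True if two sodiated fragments add up to the sodiated precursor plus Na."""
    return abs((mz_a + mz_b) - (precursor_mz + MONO["Na"])) <= tol


print(f"permethylated Fuc loss: {mono_mass(PERMETHYL_FUC):.3f} Da")
print("2592 -> 2386 is a Fuc elimination:", is_fuc_elimination(2592, 2386))
print("660 and 1955 are complementary:", are_complementary(660, 1955, 2592))
```

With the nominal values from the text, the 206 Da loss from m/z 2592 to m/z 2386 matches the assumed fucose elimination, and the ions at m/z 660 and m/z 1955 add up as expected for a complementary pair.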
Lectins
As shown in Table 1, α1-3 linked Fuc in sLeX or LeX(Y) is the preferred binding site for Lotus tetragonolobus lectin [163], which indicates that this lectin can be used for LeX determination together with previously mentioned methods.
Table 1. Specifications of the five Fuc-specific lectins [163]
Number | Lectin | Preferred binding Fuc | Substances used for elution |
---|---|---|---|
1 | Lens culinaris agglutinin | Fucα1-6GlcNAc | methyl α-D-mannoside |
2 | Lotus tetragonolobus lectin | Fucα1-3GlcNAc, Fucα1-3 (Lewis X and Y), sLeX | L-Fuc |
3 | Ulex europaeus lectin I | Fucα1-2Gal β1-4Glc(NAc) | L-Fuc |
4 | Aleuria aurantia lectin | Fucα1-2, Fucα1-3/4, Fucα1-6GlcNAc | L-Fuc |
5 | Aspergillus oryzae lectin | Fucα1-2, Fucα1-3/4, Fucα1-6GlcNAc | L-Fuc |
Note. Adapted from “Comparison of fucose-specific lectins to improve quantitative AFP-L3 assay for diagnosing hepatocellular carcinoma using mass spectrometry,” by Lee J, Yeo I, Kim Y, Shin D, Kim J, Kim Y, et al. J Proteome Res. 2022;21:1548–57 (https://pubs.acs.org/doi/10.1021/acs.jproteome.2c00196). © 2022, American Chemical Society.
Indeed, some research groups have used this lectin specifically for α1-3 linked Fuc analysis, because α1-3 linked Fuc is characteristic of the LeX determinant [164, 165]. For instance, Yu et al. [165] used Lotus tetragonolobus lectin to recognize α1-3 linked Fuc within type 2 glycans in a functional glycomic analysis of human milk glycans. Similarly, Lis-Kuberka et al. [166] used Ulex europaeus lectin I for α1-2 linked Fuc, Lotus tetragonolobus lectin for α1-3 linked Fuc, and Lens culinaris lectin for α1-6 linked Fuc in their investigation of human milk glycoproteins.
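The choice of lectin for a given Fuc linkage can be expressed as a simple lookup over Table 1. The following is a minimal sketch; it reduces each lectin's preferred binding structure to the Fuc linkage alone and ignores the acceptor sugar, which is a simplifying assumption, and the names `LECTIN_LINKAGES`, `lectins_for`, and `selective_lectins` are hypothetical.

```python
# Fuc linkages recognized by each lectin, simplified from Table 1.
LECTIN_LINKAGES = {
    "Lens culinaris agglutinin": {"a1-6"},
    "Lotus tetragonolobus lectin": {"a1-3"},
    "Ulex europaeus lectin I": {"a1-2"},
    "Aleuria aurantia lectin": {"a1-2", "a1-3", "a1-4", "a1-6"},
    "Aspergillus oryzae lectin": {"a1-2", "a1-3", "a1-4", "a1-6"},
}


def lectins_for(linkage):
    """All lectins in the table that recognize the given Fuc linkage."""
    return sorted(
        name for name, linkages in LECTIN_LINKAGES.items() if linkage in linkages
    )


def selective_lectins(linkage):
    """Lectins that recognize only the given Fuc linkage (in this simplified table)."""
    return sorted(
        name for name, linkages in LECTIN_LINKAGES.items() if linkages == {linkage}
    )


print(lectins_for("a1-3"))
print(selective_lectins("a1-3"))
```

In this simplified view, three lectins bind α1-3 linked Fuc, but only Lotus tetragonolobus lectin is selective for it, which is consistent with its use for LeX determination in the studies cited above.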
Conclusions
sLeX is a tetrasaccharide that is usually attached to the cell surface and is of great biological importance; it plays important roles in human sperm-egg binding and embryo development, and it also has vital functions in immunity and cancer. However, owing to the diversity of glycan components, the complexity of monosaccharide linkages within every glycan, and the technical limitations of current detection tools, studies of glycans have lagged behind research on proteins and DNA. Chemical and enzymatic syntheses of sLeX provide an important and irreplaceable way to study it, and this review has summarized the related synthetic approaches, including contributions from more than 15 international research groups. Currently, the fundamental idea in detecting sLeX is to determine the presence of Neu5Ac, Fuc, Gal, and GlcNAc with the correct linkages between them; to achieve this, glycoside hydrolases, chemical derivatization, HILIC, MS, and lectins can be employed jointly. This review will be valuable for researchers who are interested in the importance of sLeX in biological processes and will help to advance the understanding of sLeX.
Abbreviations
CMP-Neu5Ac: | cytidine 5’-monophospho-N-acetylneuraminic acid |
Da: | dalton |
Fuc: | fucose |
Gal: | galactose |
GDP-Fuc: | guanosine diphosphate-fucose |
GlcNAc: | N-acetylglucosamine |
GlcNAc6STs: | N-acetylglucosamine-6-O-sulfotransferases |
HEVs: | high endothelial venules |
HILIC: | hydrophilic interaction liquid chromatography |
IMS: | ion mobility spectrometry |
LacNAc: | N-acetyllactosamine |
MALDI-TOF: | matrix-assisted laser desorption/ionization-time of flight |
Man: | mannose |
MERS-CoV: | Middle East respiratory syndrome coronavirus |
MS: | mass spectrometry |
MS2: | tandem mass spectrometry |
Neu5Ac: | N-acetylneuraminic acid |
PEP: | phosphoenolpyruvate |
sLeA: | sialyl Lewis A |
sLeX: | sialyl Lewis X |
UDP-Gal: | uridine diphosphate galactose |
Declarations
Author contributions
QC: Conceptualization, Writing—original draft, Writing—review & editing. HL: Writing—original draft, Writing—review & editing. XL: Conceptualization, Writing—original draft, Writing—review & editing, Supervision.
Conflicts of interest
The authors declare that they have no conflicts of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent to publication
Not applicable.
Availability of data and materials
Not applicable.
Funding
The work was funded by the Laboratory for Synthetic Chemistry and Chemical Biology Limited under the Health@InnoHK Program by the Innovation and Technology Commission. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Copyright
© The Author(s) 2023.