Single-Cell Transcriptome Analysis Defines Expression of Kabuki Syndrome-Associated KMT2D Targets and Interacting Partners

Objectives. Kabuki syndrome (KS) is a rare genetic disorder characterized by developmental delay, retarded growth, and cardiac, gastrointestinal, neurocognitive, renal, craniofacial, dental, and skeletal defects. KS is caused by mutations in the genes encoding histone H3 lysine 4 methyltransferase (KMT2D) and histone H3 lysine 27 demethylase (KDM6A), which are core components of the complex of proteins associated with histone H3 lysine 4 methyltransferase SET1 (SET1/COMPASS). Using single-cell RNA data, we examined the expression profiles of Kmt2d and Kdm6a in the mouse dental pulp. In the incisor pulp, Kmt2d and Kdm6a colocalize with other genes of the SET1/COMPASS complex comprised of the WD-repeat protein 5 gene (Wdr5), the retinoblastoma-binding protein 5 gene (Rbbp5), absent, small, and homeotic 2-like protein-encoding gene (Ash2l), nuclear receptor cofactor 6 gene (Ncoa6), and Pax-interacting protein 1 gene (Ptip1). In addition, we found that Kmt2d and Kdm6a coexpress with the downstream target genes of the Wingless and Integrated (WNT) and sonic hedgehog signaling pathways in mesenchymal stem/stromal cells (MSCs) at different stages of osteogenic differentiation. Taken together, our results suggest an essential role of KMT2D and KDK6A in directing lineage-specific gene expression during differentiation of MSCs.

Mutations in KMT2D are the most common cause of KS and account for 75% of cases, whereas mutations in KDM6A cause up to 5% of cases [6][7][8]. In mice, Kmt2d and Kdm6a are essential during early embryonic development and exhibit a broad and distinct expression pattern in most adult tissues [6,7]. As part of the complex of proteins associated with histone H3 lysine 4 methyltransferase SET1 (SET1/ COMPASS), KMT2D and KDM6A physically associate with a protein module comprised of the WD-repeat protein 5 (WDR5), retinoblastoma-binding protein 5 (RBBP5), absent, small, and homeotic 2-like protein-encoding protein (ASH2L) and Dpy-30 histone methyltransferase complex regulatory subunit (DPY30) and nuclear receptor cofactor 6 (NCOA6), Pax-interacting protein 1 (PTIP), and PTIP-associated protein 1 (PA1) [6]. Recent research has shown that association of KMT2D with the histone acetyltransferases p300 and CBP encoded by EP300 and CREBBP is capable of establishing active enhancer states enriched in histone H3 lysine 4 monomethylation and histone H3 lysine 27 acetylation (H3K4me1/H3K27ac) to facilitate long-distance gene activation [9]. In addition to p300/CBP, the interplay between KMT2D and the SWI/SNF related, matrix associated, and actin-dependent ATP-dependent chromatin remodeling factors (SMARCA4 and SMARCB1) promotes cell type-specific enhancer activation [10]. Research has demonstrated that haploinsufficiency of KMT2D is sufficient to lead to the classical KS phenotype [11]. Mechanistically, haploinsufficiency of KMT2D causes structural changes in chromatin, which affects the mechanical properties of the nucleus [12]. By contrast, haploinsufficiency of KDM6A leads to postnatal growth restriction, microcephaly, cerebral atrophy, seizures, facial dysmorphism, and cleft palate [13]. KDM6A plays an important role in definitive endoderm differentiation through modulating the Wingless and Integrated (WNT) signaling pathway [14]. A recent study also established that KDM6A controls human neural differentiation and dendritic morphology [15]. KDM6A is capable of changing the composition of bivalent promoters by removing histone H3 lysine 27 trimethylation (H3K27me3) marks, which in turn leads to selective upregulation of neural genes. These findings are also supported by work by Dhar et al. showing that KDM6A is required for the activation of bivalent genes during mouse embryonic stem cell differentiation [16].
Previously, using single-cell RNA sequencing (scRNAseq), we characterized the cellular composition of the mouse incisor dental pulp [17]. Our study revealed distinct patterns of cell state heterogeneity in mesenchymal stem/stromal cells (MSCs) undergoing different stages of differentiation, including differentiated cells representing osteoblasts and odontoblasts. In this study, we examined the expression profile of Kmt2d and Kdm6a in different subpopulations of pulp cells. Our investigation revealed that genes encoding members of the WRAD protein complex display partially overlapping expression patterns in MSCs. We also noted that members of the WNT and sonic hedgehog signaling (SHH) pathways, which are known downstream targets of KMT2D [18], exhibit similar expression patterns across distinct subclasses of pulp cells. Collectively, our analysis sheds new light on the role of KMT2D and KDK6A in the lineage commitment of MSCs.

Materials and Methods
2.1. scRNA-seq. Incisor dental pulps from 6-day wild-type mice were obtained as described previously [19]. The cell suspension was then assessed with a Countess II FL Automated Cell Counter (Thermo Fisher Scientific Inc., Waltham, MA). A total of 8,000 cells were loaded into for capture into single channels of the 10× Genomics Chromium Controller (10× Genomics, Pleasanton, CA). After cell lysis, complementary DNA was synthesized and amplified for library preparation and sequencing with the HiSeq 4000 (Illumina, San Diego, CA).

2.2.
Genome-Wide ATAC-seq, hMeDIP-seq, and Bulk RNAseq. We used the Active Motif (Carlsbad, CA) service to perform Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq), hydroxymethylated DNA immunoprecipitation sequencing (hMeDIP-seq), and bulk RNAseq. The extraction of RNA was performed with the RNAeasy Mini/Midi kit (Qiagen, Germantown, MD). Whole-transcriptome analysis was performed with the Illumina NextSeq 500. The Burrows-Wheeler Aligner (BWA) algorithm with default settings was used to map the pairedend sequencing reads to the mouse genome. For the hMeDIP-seq experiment, we applied the Monarch Genomic DNA Isolation kit (New England Biolabs, Ipswich, MA) to isolate genomic DNA. The sonicated DNA was then ligated to the Illumina adaptors. The antibody AM39791 to 5hmC was used to produce DNA demethylation tags. Next, the libraries were generated from immunoprecipitated DNA and sequenced with the NextSeq 500. Input DNA without the immunoprecipitation step was used as a control.

Computational Analysis.
The sequencing reads with more than one mismatch were excluded. The STAR aligner was used, and only reads with MAPQ scores greater than 255 were included. The 10× Genomics barcodes and a unique molecular identifier threshold were used for filtering and generation of a pulp digital counts matrix. The expression pattern of the pulp cells was measured by dispersion and dimensionality reduction with uniform manifold approximation and projection (UMAP) [20][21][22]. The neighborhood clustering graph was performed with the Leiden algorithm [23]. The neighborhood graph was corrected using the batch remover BBKNN [24]. The Illumina base call files were converted to FASTQ format and aligned to the mm10 genome. The MACS2 peak-calling program was used for determining chromatin accessibility across the genome [25]. The BWA algorithm with default settings was used to map hMeDIPseq reads. The MACS peak finding algorithm was used to map methylated segments (18). The 5hmC enrichment was presented as average of values for all target regions. The STAR aligner and the Subread package were used for RNAseq fragments, feature counts (FPKM assignment to genes), and DESeq2 differential analysis [26][27][28].

Expression of Members of the SET1/COMPASS and PRC2
Complexes Associated with KMT2D and KDM6A. KMT2D and KDM6A function as components of the transcriptional activation complex SET1/COMPASS to establish open chromatin domains associated with active enhancers (H3K4me1-rich) and promoters (H3K4me2/3-rich) [6]. Protein-protein interaction networks are critical for a system-level understanding of gene regulatory processes. By 2 Stem Cells International analyzing the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) interaction network (https:// string-db.org), we discovered that both KMT2D and KDM6A have a specific set of interacting partners (Figure 1(a)). Among the members of the SET1/COMPASS complex, we identified KMT2C, ASH2L, WDR5, RBBP5, KDM4B, p300, NCOA6, SMARCA4, PAXIP1, and MEN1. The STRING analysis revealed that enhancer of zeste homolog 2 (EZH2) and suppressor of zeste 12 protein homolog (SUZ12), key components of the polycomb repressive complex 2 (PRC2), physically interact with KMT2D and KDM6A. Previously, we reported the results of an scRNA-seq analysis to define the expression pattern of developmental genes in the incisor dental pulp [17]. Based on the expression of key genes, we grouped the pulp cells into 16 clusters (Figure 1(b)). In the current study, we performed a more in-depth analysis of the Kmt2d and Kdm6a expression. Kmt2d is mainly expressed in clusters 1, 2, 3 8, 9, 13, 14 Figure 2).

Chromatin
Accessibility of the KMT2D-and KDM6-Associated Factors and Downstream Target Genes. According to the scRNA-seq data, Rbbp3, Kdm4b, and Axin2 exhibit relatively weak expression in the dental pulp (Figures 2 and 3). We next investigated the genomic structure of these genes for specific epigenetic marks and chromatin accessibility. Open chromatin regions and the DNA demethylation mark 5hmC are reliable indicators of active  3 Stem Cells International genomic states. Previously, using ATAC-seq and hmeDIPseq assays, we identified accessible chromatin regions and genome-wide enrichment of 5hmC in the mouse dental pulp [30,31]. By analyzing these datasets, we found that the vast majority of genes encoding protein partners and downstream targets of KMT2D and KDM6A exhibit open chromatin enriched in 5hmC (data not shown). Our analysis also revealed that, despite the weak expression of Rbbp3, Kdm4b, and Axin2, the genomic regions across these genes retain open chromatin configurations (Figure 4). Additionally, we detected specific enrichment of 5hmC in Rbbp3, Kdm4b, and Axin2. Collectively, these data suggest that even relatively weakly expressed genes acquire active chromatin states in the mouse dental pulp.

Discussion
Our study revealed that genes encoding KMT2D and KDM6A and other components of the SET1/COMPASS activation complex display overlapping expression patterns within the mouse dental pulp. Sixteen clusters of cells represent MSCs at different stages of osteogenic and odontogenic differentiation as well as pericytes, ameloblasts, smooth muscle cells, islet cells, Schwann cells, vascular endothelial cells, and blood cells [17]. We identified a core set of genes that are common in clusters 1, 2, 3, 8, and 9, which represent the five subpopulations of MSCs ( Figure 5). These genes encode KMT2D, KDM6A, KMT2C, ASH2L, p300, and SMARCA4, which are key components of the SET1/

Stem Cells International
COMPASS complex ( Figure 6). The mouse STRING database revealed that KMT2D and KDM6A also interact with EZH2 and SUZ12, which are subunits of the PRC2. We analyzed the expression patterns of Ezh2 and Suz12 in the incisor dental pulp and found that both genes exhibit vigorous expression in all pulp clusters including MSCs. A recent investigation reported the involvement of KMT2D in regulating WNT/β-catenin and SHH signaling in dental epithelium [18]. Both KMT2D and KDM6A are coexpressed in the dental epithelium of human tooth germs [32]. Hence, we examined the expression profiles of the canonical targets of KMT2D in the dental pulp. We deduced that Wnt5a, Ctnnb1, Tle2, Gli3, and Ptch1 repre-sent a common set of genes for all five clusters of MSCs (Figures 6 and 7).
With the assistance of histone H3K27 acetyltransferases CBP and p300, KMT2D is involved in enhancer activation and cell-type-specific gene expression during differentiation [33]. In chondrocytes, KMT2D regulates the expression of Shox2 [39]. Research has established that a decrease in Shox2 expression in Kmt2d-depleted mouse chondrocytes can release Sox9 inhibition, thereby causing chondrocyte differentiation. Kmt2d is expressed in the developing mouse calvarial osteoblasts, epithelia, and neural tissues [40]. Moreover, the heterozygous loss of Kmt2d impairs the neuromuscular junction, muscle cell differentiation, and myofiber regeneration [2]. In addition, a growing body of research indicates that defects in neural crest development are a major cause of KS [41,42]. Mouse knockout studies have revealed that Kmt2d and Kdm6a are required for proper differentiation of cranial neural crest cells [42,43]. Kmt2d depletion in Xenopus impairs neural crest formation, which is accompanied by reduced levels of H3K4me1 and H3K27ac, a hallmark feature of active chromatin states [41]. Interestingly, in Danio, inactivation of Kmt2d affected all examined tissues, whereas ablation of Kdm6a had a more selective impact on craniofacial and heart development [44]. These findings provide mechanistic insights into the pathogenesis of KS and indicate KMT2D as a key regulator of development and differentiation.
The dental pulp-derived MSCs exhibit multipotent differentiation capacity and are considered a promising stem cell source for tissue engineering and regenerative medicine [45]. Over the past decades, a growing body of scientific evidence shows that epigenetic mechanisms play a major role in lineage specification and gene regulatory networks underlying MSC differentiation [46][47][48]. Therefore, further research toward identifying the molecular pathways through which KMT2D controls gene expression in dental pulp stem  7 Stem Cells International cells will provide a more refined understanding of the mechanisms underlying the pathogenesis of KS.

Data Availability
All datasets from this study will be made available upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.