2,632 results for "Hi-C"
Search Results
2. A method for chromatin domain partitioning based on hypergraph clustering
- Author
-
Gong, Haiyan, Zhang, Sichen, Zhang, Xiaotong, and Chen, Yang
- Published
- 2024
3. Leguminous industrial crop guar (Cyamopsis tetragonoloba): The chromosome-level reference genome de novo assembly
- Author
-
Li, Ji-Han, Li, Meng-Jiao, Li, Wen-Lin, Li, Xin-Yu, Ma, Yu-Bo, Tan, Xin, Wang, Yan, Li, Cai-Xia, and Ma, Xin-Rong
- Published
- 2024
4. A CTCF-binding site in the Mdm1-Il22-Ifng locus shapes cytokine expression profiles and plays a critical role in early Th1 cell fate specification
- Author
-
Liu, Chunhong, Nagashima, Hiroyuki, Fernando, Nilisha, Bass, Victor, Gopalakrishnan, Jaanam, Signorella, Sadie, Montgomery, Will, Lim, Ai Ing, Harrison, Oliver, Reich, Lauren, Yao, Chen, Sun, Hong-Wei, Brooks, Stephen R., Jiang, Kan, Nagarajan, Vijayaraj, Zhao, Yongbing, Jung, Seolkyoung, Philips, Rachael, Mikami, Yohei, Lareau, Caleb A., Kanno, Yuka, Jankovic, Dragana, Aryee, Martin J., Pękowska, Aleksandra, Belkaid, Yasmine, O’Shea, John, and Shih, Han-Yu
- Published
- 2024
5. 3D genome topology distinguishes molecular subgroups of medulloblastoma.
- Author
-
Lee, John, Johnston, Michael, Farooq, Hamza, Chen, Huey-Miin, Younes, Subhi, Suarez, Raul, Zwaig, Melissa, Juretic, Nikoleta, Weiss, William, Ragoussis, Jiannis, Jabado, Nada, Taylor, Michael, and Gallo, Marco
- Subjects
3D genome, CNS tumor, Hi-C, cancer, medulloblastoma, transcriptome, Medulloblastoma, Humans, Cerebellar Neoplasms, DNA Methylation, Genome, Human, Gene Expression Regulation, Neoplastic, Transcriptome, Animals, Male, Female, Mice
- Abstract
Four main medulloblastoma (MB) molecular subtypes have been identified based on transcriptional, DNA methylation, and genetic profiles. However, it is currently not known whether 3D genome architecture differs between MB subtypes. To address this question, we performed in situ Hi-C to reconstruct the 3D genome architecture of MB subtypes. In total, we generated Hi-C and matching transcriptome data for 28 surgical specimens and Hi-C data for one patient-derived xenograft. The average resolution of the Hi-C maps was 6,833 bp. Using these data, we found that insulation scores of topologically associating domains (TADs) were effective at distinguishing MB molecular subgroups. TAD insulation score differences between subtypes were globally not associated with differential gene expression, although we identified a few exceptions near genes expressed in the lineages of origin of specific MB subtypes. Our study therefore supports the notion that TAD insulation scores can distinguish MB subtypes independently of their transcriptional differences.
- Published
- 2024
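The abstract above relies on TAD insulation scores but does not say how they were computed. As a rough illustration only, the following sketch uses one common sliding-window formulation: for each bin, average the contacts between the `window` bins upstream and the `window` bins downstream, then take the log2 ratio to the chromosome-wide mean. The function name, the window size, and the toy matrix are assumptions made for this example and are not taken from the study.

```python
import numpy as np


def insulation_scores(contacts, window):
    """Sliding-window insulation score per bin (illustrative sketch).

    contacts: square, symmetric Hi-C contact matrix (bins x bins).
    window:   number of bins on each side of the bin being scored.
    Bins too close to either end of the matrix are returned as NaN.
    """
    contacts = np.asarray(contacts, dtype=float)
    n = contacts.shape[0]
    raw = np.full(n, np.nan)
    for i in range(window, n - window + 1):
        # Contacts crossing position i: upstream rows against downstream columns.
        raw[i] = contacts[i - window:i, i:i + window].mean()
    # Low (negative) values mark strongly insulated positions, i.e. candidate boundaries.
    return np.log2(raw / np.nanmean(raw))


# Toy matrix (hypothetical data): two 10-bin domains with dense contacts
# inside each domain and sparse contacts between them.
demo = np.ones((20, 20))
demo[:10, :10] = 10.0
demo[10:, 10:] = 10.0
scores = insulation_scores(demo, window=3)
boundary_bin = int(np.nanargmin(scores))
```

On the toy matrix the minimum score falls at bin 10, the position where the two simulated domains meet. Real pipelines typically add matrix normalization and boundary calling on top of this step.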
6. Characterization of chromosome organization in the differentiation of acute myeloid leukemia cells by all-trans retinoic acid
- Author
-
Hu, Yanping, Zhao, Hongchao, Zhao, Yixun, Zheng, Jiawen, Guo, Yongjun, and Ma, Jie
- Published
- 2020
7. Whole-genome sequence and annotation of Penstemon davidsonii.
- Author
-
Alabady, Magdy, Zhang, Mengrui, Rausher, Mark, and Ostevik, Kate
- Subjects
Penstemon davidsonii, Davidsons beardtongue, Hi-C, PacBio Hifi, genome annotation, genome assembly, Penstemon, Molecular Sequence Annotation, Genome, High-Throughput Nucleotide Sequencing, Transcriptome, Chromosomes
- Abstract
Penstemon is the most speciose flowering plant genus endemic to North America. Penstemon species' diverse morphology and adaptation to various environments have made them a valuable model system for studying evolution. Here, we report the first full reference genome assembly and annotation for Penstemon davidsonii. Using PacBio long-read sequencing and Hi-C scaffolding technology, we constructed a de novo reference genome of 437,568,744 bases, with a contig N50 of 40 Mb and L50 of 5. The annotation includes 18,199 gene models, and both the genome and transcriptome assembly contain over 95% complete eudicot BUSCOs. This genome assembly will serve as a valuable reference for studying the evolutionary history and genetic diversity of the Penstemon genus.
- Published
- 2024
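The contig N50 and L50 quoted in the abstract above are standard contiguity statistics: N50 is the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size, and L50 is the number of contigs needed to get there. A minimal sketch follows; the function name and the example lengths are made up for illustration and are not data from the Penstemon assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # `length` is the N50; `count` contigs were needed to reach it (L50).
            return length, count


# Hypothetical contig lengths in Mb (total 300, so the halfway point is 150).
example = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(example)
```

For the example list, the two longest contigs (80 + 70) already cover half of the total, so N50 is 70 and L50 is 2. Read the same way, an L50 of 5 means that five contigs account for at least half of the assembled bases.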
8. A Joint Analysis of RNA-DNA and DNA-DNA Interactomes Reveals Their Strong Association.
- Author
-
Zvezdin, Dmitry S., Tyukaev, Artyom A., Zharikova, Anastasia A., and Mironov, Andrey A.
- Abstract
At the moment, many non-coding RNAs that perform a variety of functions in the regulation of chromatin processes are known. An increasing number of protocols allow researchers to study RNA-DNA interactions and shed light on new aspects of the RNA–chromatin interactome. The Hi-C protocol, which enables the study of chromatin's three-dimensional organization, has already led to numerous discoveries in the field of genome 3D organization. We conducted a comprehensive joint analysis of the RNA-DNA interactome and chromatin structure across different human and mouse cell lines. We show that these two phenomena are closely related in many respects, with the nature of this relationship being both tissue specific and conserved across humans and mice. [ABSTRACT FROM AUTHOR]
- Published
- 2025
9. A near-complete telomere-to-telomere genome assembly for Batrachochytrium dendrobatidis GPL JEL423 reveals a larger CBM18 gene family and a smaller M36 metalloprotease gene family than previously recognized.
- Author
-
Helmstetter, Nicolas, Harrison, Keith, Gregory, Jack, Harrison, Jamie, Ballou, Elizabeth, and Farrer, Rhys A
- Abstract
Batrachochytrium dendrobatidis is responsible for mass extinctions and extirpations of amphibians, mainly driven by the Global Panzootic Lineage (Bd GPL). Bd GPL isolate JEL423 is a commonly used reference strain in studies exploring the evolution, epidemiology, and pathogenicity of chytrid pathogens. These studies have been hampered by the fragmented, erroneous, and incomplete B. dendrobatidis JEL423 genome assembly, which includes long stretches of ambiguous positions and poorly resolved telomeric regions. Here, we present and describe a substantially improved, near telomere-to-telomere genome assembly and gene annotation for B. dendrobatidis JEL423. Our new assembly is 24.5 Mb in length, ∼800 kb longer than the previously published assembly for this organism, comprising 18 nuclear scaffolds and 2 mitochondrial scaffolds and including an extra 839 kb of repetitive sequence. We discovered that the patterns of aneuploidy in B. dendrobatidis JEL423 have remained stable over approximately 5 years. We found that our updated assembly encodes fewer than half the number of M36 metalloprotease genes predicted in the previous assembly. In contrast, members of the crinkling and necrosis gene family were found in similar numbers to the previous assembly. We also identified a more extensive carbohydrate binding module 18 gene family than previously observed. We anticipate our findings, and the updated genome assembly will be a useful tool for further investigation of the genome evolution of the pathogenic chytrids. [ABSTRACT FROM AUTHOR]
- Published
- 2025
10. High-quality genome assembly and annotation of the crested gecko (Correlophus ciliatus).
- Author
-
Huang, Ruyi, Zhang, Jinghang, Lu, Liang, Huang, Song, and Li, Chenhong
- Abstract
Correlophus ciliatus, or the crested gecko, is widely kept as a pet in many countries around the world because it is easy to care for and breed and has a high survival rate. However, there is a limited number of genomic studies on the crested gecko. In this study, we generated a high-quality chromosome-level genome assembly of the crested gecko by combining Nanopore, Illumina, and Hi-C data. The genome assembly has a size of 1.66 Gb, with a scaffold N50 of 109.97 Mb, and 99.52% of the scaffolds anchored on 19 chromosomes. The BUSCO analysis indicated a gene completeness of 90.3% (n = 7,480), including 6,673 (89.2%) single-copy genes and 84 (1.1%) duplicated genes. Additionally, we identified 21,065 protein-coding genes using the MAKER3 annotation toolkit, and repetitive elements made up 41.98% (697.51 Mb) of the genome. Among the protein-coding genes, 21,037 were validated through InterProScan5. Our study is the first to report a chromosome-level genome for the crested gecko. It provides valuable genomic resources for understanding the molecular mechanisms underlying many interesting traits of the species. [ABSTRACT FROM AUTHOR]
- Published
- 2025
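The BUSCO figures in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that "gene completeness" means complete BUSCOs, that is, single-copy plus duplicated genes out of the n = 7,480 searched; that reading is an assumption, since the abstract does not define the term. The variable and function names are illustrative.

```python
# Counts as reported in the abstract.
TOTAL_BUSCOS = 7480
SINGLE_COPY = 6673
DUPLICATED = 84


def percent(count, total=TOTAL_BUSCOS):
    """Percentage of `total`, rounded to one decimal place as in the abstract."""
    return round(100 * count / total, 1)


# Assumption: complete = single-copy + duplicated.
complete = SINGLE_COPY + DUPLICATED

single_copy_pct = percent(SINGLE_COPY)
duplicated_pct = percent(DUPLICATED)
complete_pct = percent(complete)
```

Under that assumption the three reported percentages (89.2%, 1.1%, and 90.3%) are mutually consistent with the reported counts.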
11. Aberrant c-AMP signalling in richter syndrome revealed by single-cell transcriptome and 3D chromatin analysis.
- Author
-
Li, Heng, Xing, Cheng, Li, Ji, Zhan, Yihao, Luo, Ming, Wang, Peilong, Sheng, Yue, and Peng, Hongling
- Subjects
RICHTER syndrome, CELL transformation, CHRONIC leukemia, LYMPHOCYTIC leukemia, CYTOLOGY
- Abstract
Richter syndrome (RS), characterized by aggressive lymphoma arising from chronic lymphocytic leukaemia (CLL), presents a poor response to treatment and grim prognosis. To elucidate RS mechanisms, paired samples from a patient with DLBCL-RS were subjected to single-cell RNA sequencing (scRNA-seq) and high-throughput chromosome conformation capture (Hi-C) sequencing. Over 10,000 cells were profiled via scRNA-seq, revealing the comprehensive B cell transformation in RS. Hi-C sequencing exposed a unique chromatin architecture in RS, with increased proximal and decreased distal interactions. At the compartment scale, the interaction between B compartments was strengthened in DLBCL cells, while topologically associating domains (TADs) in DLBCL had elevated intra-TAD and reduced inter-TAD contacts. Differentially expressed genes at TAD borders between CLL and DLBCL cells highlighted an enrichment of cAMP-mediated signalling. To substantiate the functional relevance of ATF1 and CAP1, genes involved in cAMP-mediated signalling, in the context of cell proliferation, we performed gain- and loss-of-function experiments in relevant cell lines. Collectively, integrated scRNA-seq and Hi-C data suggest that chromatin reorganization and altered cAMP signalling drive RS transformation. [ABSTRACT FROM AUTHOR]
- Published
- 2025
12. Chromosome-Scale Genomes of the Flightless Caterpillar Hunter Beetles Calosoma tepidum and Calosoma wilkesii From British Columbia (Coleoptera: Carabidae).
- Author
-
Gauthier, Jérémy, Blanc, Mickael, and Toussaint, Emmanuel F A
- Subjects
*GROUND beetles, *INSECT evolution, *COMPARATIVE method, *COLONIZATION (Ecology), *INSECT wings
- Abstract
The giant ground beetle genus Calosoma (Coleoptera, Carabidae) comprises ca. 120 species distributed worldwide. About half of the species in this genus are flightless due to a process of wing reduction likely resulting from the colonization of remote habitats such as oceanic islands, highlands, and deserts. This clade is emerging as a new model to study the genomic basis of wing evolution in insects. In this framework, we present the de novo assemblies and annotations of two Calosoma species genomes from British Columbia, Calosoma tepidum and Calosoma wilkesii. Combining PacBio HiFi and Hi-C sequencing, we produce high-quality reference genomes for these two species. Our annotation using long-read RNAseq and existing Coleoptera protein evidence identified a total of 21,976 genes for C. tepidum and 26,814 genes for C. wilkesii. Using synteny analyses, we provide an in-depth comparison of genomic architectures in these two species. We infer an overall pattern of chromosome-scale conservation between the two species, with only minor rearrangements within chromosomes. These new reference genomes represent a major step forward in the study of this group, providing high-quality references that open the door to different approaches such as comparative genomics or population scale resequencing to study the implications of flight evolution. [ABSTRACT FROM AUTHOR]
- Published
- 2025
13. Three Novel Spider Genomes Unveil Spidroin Diversification and Hox Cluster Architecture: Ryuthela nishihirai (Liphistiidae), Uloborus plumipes (Uloboridae) and Cheiracanthium punctorium (Cheiracanthiidae).
- Author
-
Schöneberg, Yannis, Audisio, Tracy Lynn, Ben Hamadou, Alexander, Forman, Martin, Král, Jiří, Kořínková, Tereza, Líznarová, Eva, Mayer, Christoph, Prokopcová, Lenka, Krehenwinkel, Henrik, Prost, Stefan, and Kennedy, Susan
- Subjects
*SPIDER silk, *HOMEOBOX genes, *SILK production, *SPIDERS, *KARYOTYPES
- Abstract
Spiders are a hyperdiverse taxon and among the most abundant predators in nearly all terrestrial habitats. Their success is often attributed to key developments in their evolution such as silk and venom production and major apomorphies such as a whole‐genome duplication. Resolving deep relationships within the spider tree of life has been historically challenging, making it difficult to measure the relative importance of these novelties for spider evolution. Whole‐genome data offer an essential resource in these efforts, but also for functional genomic studies. Here, we present de novo assemblies for three spider species: Ryuthela nishihirai (Liphistiidae), a representative of the ancient Mesothelae, the suborder that is sister to all other extant spiders; Uloborus plumipes (Uloboridae), a cribellate orbweaver whose phylogenetic placement is especially challenging; and Cheiracanthium punctorium (Cheiracanthiidae), which represents only the second family to be sequenced in the hyperdiverse Dionycha clade. These genomes fill critical gaps in the spider tree of life. Using these novel genomes along with 25 previously published ones, we examine the evolutionary history of spidroin gene and structural hox cluster diversity. Our assemblies provide critical genomic resources to facilitate deeper investigations into spider evolution. The near chromosome‐level genome of the 'living fossil' R. nishihirai represents an especially important step forward, offering new insights into the origins of spider traits. [ABSTRACT FROM AUTHOR]
- Published
- 2025
14. Somy evolution in the honey bee infecting trypanosomatid parasite Lotmaria passim.
- Author
-
Markowitz, Lindsey M, Nearman, Anthony, Zhao, Zexuan, Boncristiani, Dawn, Butenko, Anzhelika, Pablos, Luis Miguel de, Marin, Arturo, Xu, Guang, Machado, Carlos A, Schwarz, Ryan S, Palmer-Young, Evan C, and Evans, Jay D
- Subjects
*POLYPLOIDY, *CHROMOSOMES, *ANEUPLOIDY, *TRYPANOSOMATIDAE, *POLLINATORS, *HONEYBEES
- Abstract
Lotmaria passim is a ubiquitous trypanosomatid parasite of honey bees nestled within the medically important subfamily Leishmaniinae. Although this parasite is associated with honey bee colony losses, the original draft genome—which was completed before its differentiation from the closely related Crithidia mellificae —has remained the reference for this species despite lacking improvements from newer methodologies. Here, we report the updated sequencing, assembly, and annotation of the BRL-type (Bee Research Laboratory) strain (ATCC PRA-422) of Lotmaria passim. The nuclear genome assembly has been resolved into 31 complete chromosomes and is paired with an assembled kinetoplast genome consisting of a maxicircle and 30 minicircle sequences. The assembly spans 33.7 Mb and contains very little repetitive content, from which our annotation of both the nuclear assembly and kinetoplast predicted 10,288 protein-coding genes. Analyses of the assembly revealed evidence of a recent chromosomal duplication event within chromosomes 5 and 6 and provided evidence for a high level of aneuploidy in this species, mirroring the genomic flexibility employed by other trypanosomatids as a means of adaptation to different environments. This high-quality reference can therefore provide insights into adaptations of trypanosomatids to the thermally regulated, acidic, and phytochemically rich honey bee hindgut niche, which offers parallels to the challenges faced by other Leishmaniinae during the challenges they undergo within insect vectors, during infection of mammals, and exposure to antiparasitic drugs throughout their multi-host life cycles. This reference will also facilitate investigations of strain-specific genomic polymorphisms, their role in pathogenicity, and the development of treatments for pollinator infection. [ABSTRACT FROM AUTHOR]
- Published
- 2025
15. Association Between Activated Loci of HML-2 Primate-Specific Endogenous Retrovirus and Newly Formed Chromatin Contacts in Human Primordial Germ Cell-like Cells.
- Author
-
Cordazzo Vargas, Bianca and Shioda, Toshihiro
- Subjects
*PLURIPOTENT stem cells, *HUMAN chromatin, *HUMAN genome, *CHROMATIN, *INVERSE relationships (Mathematics)
- Abstract
The pluripotent stem cell (PSC)-derived human primordial germ cell-like cells (PGCLCs) are a cell culture-derived surrogate model of embryonic primordial germ cells. Upon differentiation of PSCs to PGCLCs, multiple loci of HML-2, the hominoid-specific human endogenous retrovirus (HERV), are strongly activated, which is necessary for PSC differentiation to PGCLCs. In PSCs, strongly activated loci of HERV-H family HERVs create chromatin contacts, which are required for pluripotency. Chromatin contacts in the genome of human PSCs and PGCLCs were determined by Hi-C sequencing, and their locations were compared with those of HML-2 loci strongly activated in PGCLCs but silenced in the precursor naïve iPSCs. In both iPSCs and PGCLCs, the size of chromatin contacts was found to be around one megabase, which corresponds to the Topologically Associated Domains in the human genome, but was slightly larger in PGCLCs than in iPSCs. The number of small-sized chromatin contacts diminished while the number of larger-sized contacts increased. The distances between chromatin contacts newly formed in PGCLCs and the degrees of activation of the closest HML-2 loci showed a significant inverse correlation. Our study provides evidence that strong activation of HML-2 provirus loci may be associated with newly formed chromatin contacts in their vicinity, potentially contributing to PSC differentiation to the germ cell lineage. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Chromosome-Level Genome Assembly of Discogobio brachyphysallidos (Teleostei, Cyprinidae) and Population Genomics of the D. brachyphysallidos Complex: Impacts of Geological and Climate Changes on Species Evolution in Southwest China.
- Author
-
Zheng, Lan-Ping, Wu, Li-Li, and Sun, Hua-Ying
- Subjects
*GLACIAL Epoch, *GENETIC variation, *ROHU, *LINKAGE disequilibrium, *CLIMATE change, *PHYLOGEOGRAPHY
- Abstract
The genus Discogobio is distributed in the eastern three rivers on the Yunnan–Guizhou Plateau and its adjacent regions, located to the southeast of the Qinghai–Tibet Plateau. Its origin and evolution are likely influenced by the uplift of the Qinghai-Tibet Plateau. However, the historical impact of geological events on the divergence and distribution of this fish group has not been fully elucidated. In this study, we successfully assembled a chromosome-level genome for Discogobio brachyphysallidos, which is approximately 1.21 Gb in length with a contig N50 of 8.63 Mb. The completeness of the genome assembly was assessed with a BUSCO score of 94.78%. A total of 30,597 protein-coding genes were predicted, with 93.92% functionally annotated. Phylogenetic analysis indicated that D. brachyphysallidos was closely related to Labeo rohita, and the divergence of the subfamily Labeoninae coincided with the significant uplift events of the Qinghai–Tibet Plateau. Additionally, we analyzed 75 samples of D. brachyphysallidos and D. yunnanensis from five populations, yielding 1.82 Tb of clean data and identifying 891,303,336 high-quality SNP sites. Population structure analyses indicated that the populations were clustered into five distinct groups, demonstrating significant genetic differentiation among them and the presence of cryptic species within this genus. Analyses of linkage disequilibrium decay and selective sweep indicated that the Pearl River population exhibited relatively higher genetic diversity compared with the populations from other drainages, and none of the populations showed evidence of expansion. Notably, the two population declines coincided with the early Pleistocene and Quaternary glaciation. It can be assumed that the geological movements of the Qinghai–Tibet Plateau and the Quaternary glaciation contributed to the decline in Discogobio populations and shaped their current size. 
The population genomics results showed that the present distribution pattern of Discogobio was the outcome of a series of geological events following the uplift of the Qinghai–Tibet Plateau. This study reconstructed the geological evolutionary history of the region from the perspective of species evolution. Furthermore, our study presents the first genome-wide analysis of the genetic divergence of Discogobio. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Chromosome-scale genome of the polyphagous pest Anastrepha ludens (Diptera: Tephritidae) provides insights on sex chromosome evolution in Anastrepha.
- Author
-
Congrains, Carlos, Sim, Sheina B, Paulo, Daniel F, Corpuz, Renee L, Kauwe, Angela N, Simmonds, Tyler J, Simpson, Sheron A, Scheffler, Brian E, and Geib, Scott M
- Subjects
*Y chromosome, *X chromosome, *DIPTERA, *INSECT pests, *FRUIT flies, *SEX chromosomes
- Abstract
The Mexican fruit fly, Anastrepha ludens, is a polyphagous true fruit fly (Diptera: Tephritidae) considered 1 of the most serious insect pests in Central and North America to various economically relevant fruits. Despite its agricultural relevance, a high-quality genome assembly has not been reported. Here, we describe the generation of a chromosome-level genome for A. ludens using a combination of PacBio high fidelity long-reads and chromatin conformation capture sequencing data. The final assembly consisted of 140 scaffolds (821 Mb, N50 = 131 Mb), containing 99.27% complete conserved orthologs (BUSCO) for Diptera. We identified the sex chromosomes using 3 strategies: (1) visual inspection of Hi-C contact map and coverage analysis using the HiFi reads, (2) synteny with Drosophila melanogaster, and (3) the difference in the average read depth of autosomal vs sex chromosomal scaffolds. The X chromosome was found in 1 major scaffold (100 Mb) and 8 smaller contigs (1.8 Mb), and the Y chromosome was recovered in 1 large scaffold (6.1 Mb) and 35 smaller contigs (4.3 Mb). Sex chromosomes and autosomes showed considerable differences in transposable element and gene content. Moreover, evolutionary rates of orthologs of A. ludens and Anastrepha obliqua revealed a faster evolution of X-linked, compared with autosome-linked, genes, consistent with the faster-X effect, leading us to new insights on the evolution of sex chromosomes in this diverse group of flies. This genome assembly provides a valuable resource for future evolutionary, genetic, and genomic translational research supporting the management of this important agricultural pest. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. A chromosome-scale genome assembly of mungbean (Vigna radiata).
- Author
-
Khanbo, Supaporn, Phadphon, Poompat, Naktang, Chaiwat, Sangsrakru, Duangjai, Waiyamitra, Pitchaporn, Narong, Nattapol, Yundaeng, Chutintorn, Tangphatsornruang, Sithichoke, Laosatit, Kularb, Somta, Prakit, and Pootakham, Wirulda
- Abstract
Background: Mungbean (Vigna radiata) is one of the most socio-economically important leguminous food crops of Asia and a rich source of dietary protein and micronutrients. Understanding its genetic makeup is crucial for genetic improvement and cultivar development. Methods: In this study, we combined single-tube long-fragment reads (stLFR) sequencing technology with high-throughput chromosome conformation capture (Hi-C) technique to obtain a chromosome-level assembly of V. radiata cultivar 'KUML4'. Results: The final assembly of the V. radiata genome was 468.08 Mb in size, with a scaffold N50 of 40.75 Mb. This assembly comprised 11 pseudomolecules, covering 96.94% of the estimated genome size. The genome contained 253.85 Mb (54.76%) of repetitive sequences and 27,667 protein-coding genes. Our gene prediction recovered 98.3% of the highly conserved orthologs based on Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. Comparative analyses using sequence data from single-copy orthologous genes indicated that V. radiata diverged from V. mungo approximately 4.17 million years ago. Moreover, gene family analysis revealed that major gene families associated with defense responses were significantly expanded in V. radiata. Conclusion: Our chromosome-scale genome assembly of V. radiata cultivar KUML4 will provide a valuable genomic resource, supporting genetic improvement and molecular breeding. This data will also be valuable for future comparative genomics studies among legume species. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Chromosome-level genome assembly of the medicinal insect Blaps rhynchopetera using Nanopore and Hi-C technologies.
- Author
-
Zhang, Wei, Li, Yue, Wang, Qi, Yu, Qun, Ma, Yuchen, Huang, Lei, Zhang, Chenggui, Yang, Zizhong, Wang, Jiapeng, and Xiao, Huai
- Abstract
Blaps rhynchopetera Fairmaire is a significant medicinal resource in southwestern China. We utilized Nanopore and Hi-C technologies in combination to generate a high-quality, chromosome-level assembly of the B. rhynchopetera genome and described its genetic features. Genome surveys revealed that B. rhynchopetera is a highly heterozygous species. The assembled genome was 379.24 Mb in size, of which 96.03% was assigned to 20 pseudochromosomes. A total of 212.93 Mb of repeat sequences were annotated, and 26,824 protein-coding genes and 837 noncoding RNAs were identified. Phylogenetic analysis indicated the divergence of the ancestors of B. rhynchopetera and its closely related species Tenebrio molitor at about 85.6 million years ago. The colinearity analysis showed that some chromosomes of B. rhynchopetera may have had fission events, and it has a good synteny relationship with Tribolium castaneum. Furthermore, in the enrichment analyses, the gene families related to detoxification and immunity of B. rhynchopetera facilitated the understanding of its environmental adaptations, which will serve as a valuable research resource for pest control strategies and conservation efforts of beneficial insects. This high-quality reference genome will also contribute to the conservation of insect species diversity and genetic resources. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Aberrant c-AMP signalling in richter syndrome revealed by single-cell transcriptome and 3D chromatin analysis
- Author
-
Heng Li, Cheng Xing, Ji Li, Yihao Zhan, Ming Luo, Peilong Wang, Yue Sheng, and Hongling Peng
- Subjects
Richter syndrome, Chronic lymphocytic leukaemia, scRNA-seq, Chromosome conformation capture sequencing, Hi-C, cAMP-mediated signalling, Therapeutics. Pharmacology, RM1-950
- Abstract
Richter syndrome (RS), characterized by aggressive lymphoma arising from chronic lymphocytic leukaemia (CLL), presents a poor response to treatment and grim prognosis. To elucidate RS mechanisms, paired samples from a patient with DLBCL-RS were subjected to single-cell RNA sequencing (scRNA-seq) and high-throughput chromosome conformation capture (Hi-C) sequencing. Over 10,000 cells were profiled via scRNA-seq, revealing the comprehensive B cell transformation in RS. Hi-C sequencing exposed a unique chromatin architecture in RS, with increased proximal and decreased distal interactions. At the compartment scale, the interaction between B compartments was strengthened in DLBCL cells, while topologically associating domains (TADs) in DLBCL had elevated intra-TAD and reduced inter-TAD contacts. Differentially expressed genes at TAD borders between CLL and DLBCL cells highlighted an enrichment of cAMP-mediated signalling. To substantiate the functional relevance of ATF1 and CAP1, genes involved in cAMP-mediated signalling, in the context of cell proliferation, we performed gain- and loss-of-function experiments in relevant cell lines. Collectively, integrated scRNA-seq and Hi-C data suggest that chromatin reorganization and altered cAMP signalling drive RS transformation.
- Published
- 2025
21. ppHiC: Interactive exploration of Hi-C results on the ProteinPaint web portal
- Author
-
Akanksha Rajput, Colleen Reilly, Airen Zaldivar Peraza, Jian Wang, Edgar Sioson, Gavriel Matt, Robin Paul, Congyu Lu, Aleksandar Acic, Karishma Gangwani, and Xin Zhou
- Subjects
Hi-C, ProteinPaint, Contact matrix, Visualization, Web server, Genomic rearrangement, Biotechnology, TP248.13-248.65
- Abstract
The ProteinPaint Hi-C tool (ppHiC) facilitates web-based visualization and collaborative exploration of Hi-C data, a vital resource for understanding three-dimensional genomic structures. ppHiC allows researchers to easily analyze large Hi-C datasets on a web browser without requiring the computational expertise that has heretofore limited access to this complex genomic data. The platform is compatible with multiple Hi-C data versions and boasts a highly customizable interface, including a configuration panel for the precise adjustment of key visualization parameters. The tool’s interactive features offer a broad range of views, from whole-genome landscapes to detailed interactions between pairs of loci, that are accessible within a single, integrated environment. Here, we demonstrate how using ppHiC to visualize an altered chromatin conformational landscape in neuroblastoma can inform understanding of the genomic rearrangements in this cancer.
- Published
- 2024
22. DeCGR: an interactive toolkit for deciphering complex genomic rearrangements from Hi-C data
- Author
-
Junping Li, Minghui Sun, Yusen Ye, and Lin Gao
- Subjects
Complex genomic rearrangements, 3D genome, Hi-C, Biotechnology, TP248.13-248.65, Genetics, QH426-470
- Abstract
Background: Complex genomic rearrangements (CGRs) drive the restructuring of chromatin architecture, resulting in significant interactions among rearranged fragments, visible as anomalous interaction blocks in chromatin contact maps generated by chromosome conformation capture technologies such as Hi-C. These blocks not only offer the orientation and genome coordinates of rearranged fragments but also filter out false positive CGRs, thereby facilitating CGR assembly. Despite this, there is a lack of interactive graphical software tailored for this purpose. Results: We present DeCGR, a user-friendly Python toolbox specifically designed for deciphering CGRs in Hi-C data. DeCGR consists of four independent execution components. The Breakpoint Filtering module identifies and filters simple rearrangements, providing the coordinates of rearrangement breakpoints. The Fragment Assembly module automatically assembles CGRs and visualizes the assembly process, facilitating the direct association between anomalous interaction blocks and CGR events. The Validation CGRs module verifies the completeness and accuracy of CGRs by generating the Hi-C map with CGRs through a simulation process and examines the difference from the original Hi-C maps. This module displays both the original and the simulated Hi-C map with highlighted rearranged fragment boundaries for rapid review to assess the CGRs. Finally, the Reconstruct Hi-C Map module provides the reconstructed Hi-C map based on the determined CGRs, allowing users to directly observe the impact of rearrangements on chromatin structure. Conclusions: DeCGR is designed specifically for biologists who aim to explore CGRs from Hi-C data. It provides a validation module to ensure the completeness and correctness of CGRs. Additionally, it allows users to generate CGR assembly results and reconstruct the Hi-C map with just one click. DeCGR provides intuitive visualization results for each module, allowing users to easily associate CGRs with Hi-C maps. DeCGR is operable through a user-friendly graphical interface. Source codes are freely available at https://github.com/GaoLabXDU/DeCGR.
- Published
- 2024
- Full Text
- View/download PDF
23. Graphasing: phasing diploid genome assembly graphs with single-cell strand sequencing
- Author
-
Mir Henglin, Maryam Ghareghani, William T. Harvey, David Porubsky, Sergey Koren, Evan E. Eichler, Peter Ebert, and Tobias Marschall
- Subjects
De novo assembly ,Phasing ,Assembly graph ,Haplotype ,Strand-seq ,Hi-C ,Biology (General) ,QH301-705.5 ,Genetics ,QH426-470 - Abstract
Abstract Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.
- Published
- 2024
- Full Text
- View/download PDF
24. HiCDiffusion - diffusion-enhanced, transformer-based prediction of chromatin interactions from DNA sequences
- Author
-
Mateusz Chiliński and Dariusz Plewczynski
- Subjects
3D genomics ,Hi-C ,Machine learning ,Artificial intelligence ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Abstract Prediction of chromatin interactions from DNA sequence has been a significant research challenge in the last couple of years. Several solutions have been proposed, most of which are based on encoder-decoder architecture, where 1D sequence is convoluted, encoded into the latent representation, and then decoded using 2D convolutions into the Hi-C pairwise chromatin spatial proximity matrix. Those methods, while obtaining high correlation scores and improved metrics, produce Hi-C matrices that are artificial - they are blurred due to the deep learning model architecture. In our study, we propose the HiCDiffusion, sequence-only model that addresses this problem. We first train the encoder-decoder neural network and then use it as a component of the diffusion model - where we guide the diffusion using a latent representation of the sequence, as well as the final output from the encoder-decoder. That way, we obtain the high-resolution Hi-C matrices that not only better resemble the experimental results - improving the Fréchet inception distance by an average of 11 times, with the highest improvement of 56 times - but also obtain similar classic metrics to current state-of-the-art encoder-decoder architectures used for the task.
- Published
- 2024
- Full Text
- View/download PDF
25. Improved simultaneous mapping of epigenetic features and 3D chromatin structure via ViCAR
- Author
-
Sean M. Flynn, Somdutta Dhir, Krzysztof Herka, Colm Doyle, Larry Melidis, Angela Simeone, Winnie W. I. Hui, Rafael de Cesaris Araujo Tavares, Stefan Schoenfelder, David Tannahill, and Shankar Balasubramanian
- Subjects
3D genome structure ,Hi-C ,Histone marks ,G-quadruplex DNA ,Biology (General) ,QH301-705.5 ,Genetics ,QH426-470 - Abstract
Abstract Methods to measure chromatin contacts at genomic regions bound by histone modifications or proteins are important tools to investigate chromatin organization. However, such methods do not capture the possible involvement of other epigenomic features such as G-quadruplex DNA secondary structures (G4s). To bridge this gap, we introduce ViCAR (viewpoint HiCAR), for the direct antibody-based capture of chromatin interactions at folded G4s. Through ViCAR, we showcase the first G4-3D interaction landscape. Using histone marks, we also demonstrate how ViCAR improves on earlier approaches yielding increased signal-to-noise. ViCAR is a practical and powerful tool to explore epigenetic marks and 3D genome interactomes.
- Published
- 2024
- Full Text
- View/download PDF
26. An expanded odorant-binding protein mediates host cue detection in the parasitic wasp Baryscapus dioryctriae basis of the chromosome-level genome assembly analysis
- Author
-
Xiaoyan Zhu, Yi Yang, Qiuyao Li, Jing Li, Lin Du, Yanhan Zhou, Hongbo Jin, Liwen Song, Qi Chen, and Bingzhong Ren
- Subjects
Hi‐C ,Parasitic wasp ,Baryscapus dioryctriae ,Odorant-binding protein ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background Baryscapus dioryctriae (Chalcidoidea: Eulophidae) is a parasitic wasp that parasitizes the pupae of many Pyralidae members and has been used as a biological control agent against Dioryctria pests of pinecones. Results This B. dioryctriae assembly has a genome size of 485.5 Mb with a contig N50 of 2.17 Mb, and scaffolds were assembled onto six chromosomes using Hi-C analysis, significantly increasing the scaffold N50 to 91.17 Mb, with more than 96.13% of the assembled bases located on chromosomes, and an analysis revealed that 94.73% of the BUSCO gene set. A total of 54.82% (279.27 Mb) of the assembly was composed of repetitive sequences and 24,778 protein-coding genes were identified. Comparative genomic analysis demonstrated that the chemosensory perception, genetic material synthesis, and immune response pathways were primarily enriched in the expanded genes. Moreover, the functional characteristics of an odorant-binding protein (BdioOBP45) with ovipositor-biased expression identified from the expanded olfactory gene families were investigated by the fluorescence competitive binding and RNAi assays, revealing that BdioOBP45 primarily binds to the D. abietella-induced volatile compounds, suggesting that this expanded OBP is likely involved in locating female wasp hosts and highlighting a direction for future research. Conclusions Taken together, this work not only provides new genomic sequences for the Hymenoptera systematics, but also the high-quality chromosome-level genome of B. dioryctriae offers a valuable foundation for studying the molecular, evolutionary, and parasitic processes of parasitic wasps.
- Published
- 2024
- Full Text
- View/download PDF
27. HiCMC: High-Efficiency Contact Matrix Compressor
- Author
-
Yeremia Gunawan Adhisantoso, Tim Körner, Fabian Müntefering, Jörn Ostermann, and Jan Voges
- Subjects
Contact matrix ,Hi-C ,3C ,Compression ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background Chromosome organization plays an important role in biological processes such as replication, regulation, and transcription. One way to study the relationship between chromosome structure and its biological functions is through Hi-C studies, a genome-wide method for capturing chromosome conformation. Such studies generate vast amounts of data. The problem is exacerbated by the fact that chromosome organization is dynamic, requiring snapshots at different points in time, further increasing the amount of data to be stored. We present a novel approach called the High-Efficiency Contact Matrix Compressor (HiCMC) for efficient compression of Hi-C data. Results By modeling the underlying structures found in the contact matrix, such as compartments and domains, HiCMC outperforms the state-of-the-art method CMC by approximately 8% and the other state-of-the-art methods cooler, LZMA, and bzip2 by over 50% across multiple cell lines and contact matrix resolutions. In addition, HiCMC integrates domain-specific information into the compressed bitstreams that it generates, and this information can be used to speed up downstream analyses. Conclusion HiCMC is a novel compression approach that utilizes intrinsic properties of contact matrix, such as compartments and domains. It allows for a better compression in comparison to the state-of-the-art methods. HiCMC is available at https://github.com/sXperfect/hicmc .
- Published
- 2024
- Full Text
- View/download PDF
28. DeCGR: an interactive toolkit for deciphering complex genomic rearrangements from Hi-C data.
- Author
-
Li, Junping, Sun, Minghui, Ye, Yusen, and Gao, Lin
- Subjects
SOURCE code ,GENE mapping ,CHROMATIN ,BIOLOGISTS ,GENOMES - Abstract
Background: Complex genomic rearrangements (CGRs) drive the restructuring of chromatin architecture, resulting in significant interactions among rearranged fragments, visible as anomalous interaction blocks in chromatin contact maps generated by chromosome conformation capture technologies such as Hi-C. These blocks not only offer the orientation and genome coordinates of rearranged fragments but also filter out false positive CGRs, thereby facilitating CGR assembly. Despite this, there is a lack of interactive graphical software tailored for this purpose. Results: We present DeCGR, a user-friendly Python toolbox specifically designed for deciphering CGRs in Hi-C data. DeCGR consists of four independent execution components. The Breakpoint Filtering module identifies and filters simple rearrangements, providing the coordinates of rearrangement breakpoints. The Fragment Assembly module automatically assembles CGRs and visualizes the assembly process, facilitating the direct association between anomalous interaction blocks and CGR events. The Validation CGRs module verifies the completeness and accuracy of CGRs by generating the Hi-C map with CGRs through a simulation process and examines the difference from the original Hi-C maps. This module displays both the original and the simulated Hi-C map with highlighted rearranged fragment boundaries for rapid review to assess the CGRs. Finally, the Reconstruct Hi-C Map module provides the reconstructed Hi-C map based on the determined CGRs, allowing users to directly observe the impact of rearrangements on chromatin structure. Conclusions: DeCGR is designed specifically for biologists who aim to explore CGRs from Hi-C data. It provides a validation module to ensure the completeness and correctness of CGRs. Additionally, it allows users to generate CGR assembly results and reconstruct the Hi-C map with just one click. 
DeCGR provides intuitive visualization results for each module, allowing users to easily associate CGRs with Hi-C maps. DeCGR is operable through a user-friendly graphical interface. Source codes are freely available at https://github.com/GaoLabXDU/DeCGR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. The High‐Quality Genome Sequencing and Analysis of Red Raspberry (Rubus idaeus L.).
- Author
-
Zhang, Haopeng, Li, Weihua, Li, Guodong, Liu, Jiaren, Chen, Hongsheng, Zhang, Chunpeng, Zhao, Jinlu, Zhang, Zhicheng, Lv, Qiang, Zhang, Yan, Yang, Guohui, Liu, Ming, and Pinto, Paulo M.
- Subjects
*GENE families , *RUBUS , *CHROMOSOMES , *NUCLEOTIDE sequencing , *ROSACEAE - Abstract
Red raspberry (Rubus idaeus L.), which is an important nutritional source for human health, belongs to fruit crops of the Rosaceae family. Here, we used Pacific Biosciences single‐molecule real‐time (SMRT) sequencing and high‐throughput chromosome conformation capture (Hi‐C) sequencing technologies to assemble genomes and reported a high‐quality Rubus idaeus L. (DNS‐1) genome with 321.29 Mb assembled into seven chromosomes. The LAI score of the DNS‐1 genome assembly was 21.32, belonging to gold quality. Approximately 52.3% of the assembly sequences were annotated as repetitive sequences, and 24.15% were composed of long terminal repeat elements. A total of 29,814 protein‐coding genes and 2474 pseudogenes were predicted in DNS‐1. We characterized the complete genomes of DNS‐1 and compared them to those of seven other species. We found that 652 gene families were unique to DNS‐1 and they were shaped from an ancestor. There were 1000 and 5193 gene families that expanded and contracted in the DNS‐1 genome. The Rubus idaeus L. genome can be used to understand the structure and evolution of Rosaceae genomes and can be developed to identify genes controlling important traits and improve breeding work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Effects of Differentially Methylated CpG Sites in Enhancer and Promoter Regions on the Chromatin Structures of Target LncRNAs in Breast Cancer.
- Author
-
Fan, Zhiyu, Chen, Yingli, Yan, Dongsheng, and Li, Qianzhong
- Subjects
*MACHINE learning , *RNA regulation , *DNA methylation , *GENE expression , *PROMOTERS (Genetics) - Abstract
Aberrant DNA methylation plays a crucial role in breast cancer progression by regulating gene expression. However, the regulatory pattern of DNA methylation in long noncoding RNAs (lncRNAs) for breast cancer remains unclear. In this study, we integrated gene expression, DNA methylation, and clinical data from breast cancer patients included in The Cancer Genome Atlas (TCGA) database. We examined DNA methylation distribution across various lncRNA categories, revealing distinct methylation characteristics. Through genome-wide correlation analysis, we identified the CpG sites located in lncRNAs and the distally associated CpG sites of lncRNAs. Functional genome enrichment analysis, conducted through the integration of ENCODE ChIP-seq data, revealed that differentially methylated CpG sites (DMCs) in lncRNAs were mostly located in promoter regions, while distally associated DMCs primarily acted on enhancer regions. By integrating Hi-C data, we found that DMCs in enhancer and promoter regions were closely associated with the changes in three-dimensional chromatin structures by affecting the formation of enhancer–promoter loops. Furthermore, through Cox regression analysis and three machine learning models, we identified 11 key methylation-driven lncRNAs (DIO3OS, ELOVL2-AS1, MIAT, LINC00536, C9orf163, AC105398.1, LINC02178, MILIP, HID1-AS1, KCNH1-IT1, and TMEM220-AS1) that were associated with the survival of breast cancer patients and constructed a prognostic risk scoring model, which demonstrated strong prognostic performance. These findings enhance our understanding of DNA methylation's role in lncRNA regulation in breast cancer and provide potential biomarkers for diagnosis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. HiCDiffusion - diffusion-enhanced, transformer-based prediction of chromatin interactions from DNA sequences.
- Author
-
Chiliński, Mateusz and Plewczynski, Dariusz
- Subjects
NUCLEOTIDE sequence ,ARTIFICIAL intelligence ,MACHINE learning ,DNA sequencing ,CHROMATIN ,DEEP learning - Abstract
Prediction of chromatin interactions from DNA sequence has been a significant research challenge in the last couple of years. Several solutions have been proposed, most of which are based on encoder-decoder architecture, where 1D sequence is convoluted, encoded into the latent representation, and then decoded using 2D convolutions into the Hi-C pairwise chromatin spatial proximity matrix. Those methods, while obtaining high correlation scores and improved metrics, produce Hi-C matrices that are artificial - they are blurred due to the deep learning model architecture. In our study, we propose the HiCDiffusion, sequence-only model that addresses this problem. We first train the encoder-decoder neural network and then use it as a component of the diffusion model - where we guide the diffusion using a latent representation of the sequence, as well as the final output from the encoder-decoder. That way, we obtain the high-resolution Hi-C matrices that not only better resemble the experimental results - improving the Fréchet inception distance by an average of 11 times, with the highest improvement of 56 times - but also obtain similar classic metrics to current state-of-the-art encoder-decoder architectures used for the task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. HiCMC: High-Efficiency Contact Matrix Compressor.
- Author
-
Adhisantoso, Yeremia Gunawan, Körner, Tim, Müntefering, Fabian, Ostermann, Jörn, and Voges, Jan
- Subjects
CHROMOSOME structure ,MORPHOLOGY ,CELL lines ,COMPRESSORS ,TRANSCRIPTION (Linguistics) - Abstract
Background: Chromosome organization plays an important role in biological processes such as replication, regulation, and transcription. One way to study the relationship between chromosome structure and its biological functions is through Hi-C studies, a genome-wide method for capturing chromosome conformation. Such studies generate vast amounts of data. The problem is exacerbated by the fact that chromosome organization is dynamic, requiring snapshots at different points in time, further increasing the amount of data to be stored. We present a novel approach called the High-Efficiency Contact Matrix Compressor (HiCMC) for efficient compression of Hi-C data. Results: By modeling the underlying structures found in the contact matrix, such as compartments and domains, HiCMC outperforms the state-of-the-art method CMC by approximately 8% and the other state-of-the-art methods cooler, LZMA, and bzip2 by over 50% across multiple cell lines and contact matrix resolutions. In addition, HiCMC integrates domain-specific information into the compressed bitstreams that it generates, and this information can be used to speed up downstream analyses. Conclusion: HiCMC is a novel compression approach that utilizes intrinsic properties of contact matrix, such as compartments and domains. It allows for a better compression in comparison to the state-of-the-art methods. HiCMC is available at https://github.com/sXperfect/hicmc. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Peculiar k-mer Spectra Are Correlated with 3D Contact Frequencies and Breakpoint Regions in the Human Genome.
- Author
-
Hikmat, Wisam Mohammed, Sievers, Aaron, Hausmann, Michael, and Hildenbrand, Georg
- Abstract
Background: It is widely accepted that the 3D chromatin organization in human cell nuclei is not random and recent investigations point towards an interactive relation of epigenetic functioning and chromatin (re-)organization. Although chromatin organization seems to be the result of self-organization of the entirety of all molecules available in the cell nucleus, a general question remains open as to what extent chromatin organization might additionally be predetermined by the DNA sequence and, if so, if there are characteristic differences that distinguish typical regions involved in dysfunction-related aberrations from normal ones, since typical DNA breakpoint regions involved in disease-related chromosome aberrations are not randomly distributed along the DNA sequence. Methods: Highly conserved k-mer patterns in intronic and intergenic regions have been reported in eukaryotic genomes. In this article, we search and analyze regions deviating from average spectra (ReDFAS) of k-mer word frequencies in the human genome. This includes all assembled regions, e.g., telomeric, centromeric, genic as well as intergenic regions. Results: A positive correlation between k-mer spectra and 3D contact frequencies, obtained exemplarily from given Hi-C datasets, has been found indicating a relation of ReDFAS to chromatin organization and interactions. We also searched and found correlations of known functional annotations, e.g., genes correlating with ReDFAS. Selected regions known to contain typical breakpoints on chromosomes 9 and 5 that are involved in cancer-related chromosomal aberrations appear to be enriched in ReDFAS. Since transposable elements like ALUs are often assigned as major players in 3D genome organization, we also studied their impact on our examples but could not find a correlation between ALU regions and breakpoints comparable to ReDFAS. 
Conclusions: Our findings might show that ReDFAS are associated with instable regions of the genome and regions with many chromatin contacts which is in line with current research indicating that chromatin loop anchor points lead to genomic instability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Chromosome‐Level Genome Assembly for the Chinese Serow (Capricornis milneedwardsii) Provides Insights Into Its Taxonomic Status and Evolution.
- Author
-
Li, Anning, Yang, Qimeng, Li, Rongrong, Cai, Keli, Zhu, Li, Wang, Xiaoyu, Cheng, Gong, Wang, Xihong, Lei, Yinghu, Jiang, Yu, and Zan, Linsen
- Subjects
*MYOCARDIUM , *SEROWS , *MUSCLE contraction , *CHROMOSOMES , *CENTROMERE , *KARYOTYPES - Abstract
The Chinese serow (Capricornis milneedwardsii), which is mainly distributed south of the Yellow River in China, has been listed as vulnerable by the International Union for Conservation of Nature (IUCN). However, the reference genome of the serow has not been reported and its taxonomic status is still unclear. Here, we first constructed a high‐quality chromosome‐level reference genome of C. milneedwardsii using PacBio long HiFi reads combined with Hi‐C technology. The assembled genome was ~2.83 Gb in size, with a contig N50 of 100.96 Mb and scaffold N50 of 112.75 Mb, and was anchored onto 24 chromosomes. Furthermore, we found that the Chinese serow was more closely related to muskox, from which it diverged ~4.85 million years ago (Mya). Compared to the karyotype of goat (2n = 60), we found that the Chinese serow (2n = 48) experienced six chromosome fusions, which resulted in the formation of six central centromere chromosomes. We also identified two positively selected genes (MYH6 and DCSTAMP) specific to Chinese serow, which were involved in 'viral myocarditis' and 'Cardiac muscle contraction'. Interestingly, compared to other Caprinae animals, the MYH6 protein of Chinese serow carried two mutations (E1520S and G1521S), which might be related to hypoxia tolerance. The high‐quality reference genome of C. milneedwardsii provides valuable information for the protection of serows and insights into its evolution. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Snow alga Sanguina aurantia as revealed through de novo genome assembly and annotation.
- Author
-
Raymond, Breanna B, Guenzi-Tiberi, Pierre, Maréchal, Eric, and Quarmby, Lynne M
- Subjects
*ERYTHROCYTES , *RNA sequencing , *ASTAXANTHIN , *GREEN algae , *GENOMES - Abstract
To thrive on melting alpine and polar snow, some Chlorophytes produce an abundance of astaxanthin, causing red blooms, often dominated by genus Sanguina. The red cells have not been cultured, but we recently grew a green biciliate conspecific with Sanguina aurantia from a sample of watermelon snow. This culture provided source material for Oxford Nanopore Technology and Illumina sequencing. Our assembly pipeline exemplifies the value of a hybrid long- and short-read approach for the complexities of working with a culture grown from a field sample. Using bioinformatic tools, we separated assembled contigs into 2 genomic pools based on a difference in GC content (57.5 and 55.1%). We present the data as 2 assemblies of S. aurantia variants but explore other possibilities. High-throughput chromatin conformation capture analysis (Hi-C sequencing) was used to scaffold the assemblies into a 96-Mb genome designated as "A" and a 102-Mb genome designated as "B." Both assemblies are highly contiguous: genome A consists of 38 scaffolds with an N50 of 5.4 Mb, while genome B has 50 scaffolds with an N50 of 6.4 Mb. RNA sequencing was used to improve gene annotation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. An expanded odorant-binding protein mediates host cue detection in the parasitic wasp Baryscapus dioryctriae basis of the chromosome-level genome assembly analysis.
- Author
-
Zhu, Xiaoyan, Yang, Yi, Li, Qiuyao, Li, Jing, Du, Lin, Zhou, Yanhan, Jin, Hongbo, Song, Liwen, Chen, Qi, and Ren, Bingzhong
- Subjects
ODORANT-binding proteins ,PARASITIC wasps ,BIOLOGICAL pest control agents ,GENE expression ,GENE families ,OLFACTORY receptors - Abstract
Background: Baryscapus dioryctriae (Chalcidoidea: Eulophidae) is a parasitic wasp that parasitizes the pupae of many Pyralidae members and has been used as a biological control agent against Dioryctria pests of pinecones. Results: This B. dioryctriae assembly has a genome size of 485.5 Mb with a contig N50 of 2.17 Mb, and scaffolds were assembled onto six chromosomes using Hi-C analysis, significantly increasing the scaffold N50 to 91.17 Mb, with more than 96.13% of the assembled bases located on chromosomes, and an analysis revealed that 94.73% of the BUSCO gene set. A total of 54.82% (279.27 Mb) of the assembly was composed of repetitive sequences and 24,778 protein-coding genes were identified. Comparative genomic analysis demonstrated that the chemosensory perception, genetic material synthesis, and immune response pathways were primarily enriched in the expanded genes. Moreover, the functional characteristics of an odorant-binding protein (BdioOBP45) with ovipositor-biased expression identified from the expanded olfactory gene families were investigated by the fluorescence competitive binding and RNAi assays, revealing that BdioOBP45 primarily binds to the D. abietella-induced volatile compounds, suggesting that this expanded OBP is likely involved in locating female wasp hosts and highlighting a direction for future research. Conclusions: Taken together, this work not only provides new genomic sequences for the Hymenoptera systematics, but also the high-quality chromosome-level genome of B. dioryctriae offers a valuable foundation for studying the molecular, evolutionary, and parasitic processes of parasitic wasps. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Light control of three‐dimensional chromatin organization in soybean.
- Author
-
Li, Zhu, Sun, Linhua, Xu, Xiao, Liu, Yutong, He, Hang, and Deng, Xing Wang
- Subjects
*RNA polymerase II , *GENETIC regulation , *GENE expression , *CHROMATIN , *GENETIC transcription - Abstract
Summary: Higher‐order chromatin structure is critical for regulation of gene expression. In plants, light profoundly affects the morphogenesis of emerging seedlings as well as global gene expression to ensure optimal adaptation to environmental conditions. However, the changes and functional significance of chromatin organization in response to light during seedling development are not well documented. We constructed Hi‐C contact maps for the cotyledon, apical hook and hypocotyl of soybean subjected to dark and light conditions. The resulting high‐resolution Hi‐C contact maps identified chromosome territories, A/B compartments, A/B sub‐compartments, TADs (Topologically Associated Domains) and chromatin loops in each organ. We observed increased chromatin compaction under light and we found that domains that switched from B sub‐compartments in darkness to A sub‐compartments under light contained genes that were activated during photomorphogenesis. At the local scale, we identified a group of TADs constructed by gene clusters consisting of different numbers of Small Auxin‐Upregulated RNAs (SAURs), which exhibited strict co‐expression in the hook and hypocotyl in response to light stimulation. In the hypocotyl, RNA polymerase II (RNAPII) regulated the transcription of a SAURs cluster under light via TAD condensation. Our results suggest that the 3D genome is involved in the regulation of light‐related gene expression in a tissue‐specific manner. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. SuperTAD-Fast: Accelerating Topologically Associating Domains Detection Through Discretization.
- Author
-
Ling, Zhao, Zhang, Yu Wei, and Li, Shuai Cheng
- Subjects
*TIME complexity , *INFORMATION theory , *DYNAMIC programming , *CYTOSKELETAL proteins , *SOFTWARE development tools - Abstract
High-throughput chromosome conformation capture (Hi-C) technology captures spatial interactions of DNA sequences into matrices, and software tools are developed to identify topologically associating domains (TADs) from the Hi-C matrices. With structural information theory, SuperTAD adopted a dynamic programming approach to find the TAD hierarchy with minimal structural entropy. However, the algorithm suffers from high time complexity. To accelerate this algorithm, we design and implement an approximation algorithm with a theoretical performance guarantee. We implemented a package, SuperTAD-Fast. Using Hi-C matrices and simulated data, we demonstrated that SuperTAD-Fast achieved great runtime improvement compared with SuperTAD. SuperTAD-Fast shows high consistency and significant enrichment of structural proteins from Hi-C data of human cell lines in comparison with the existing six hierarchical TADs detecting methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Genome structural dynamics: insights from Gaussian network analysis of Hi-C data.
- Author
-
Banerjee, Anupam, Zhang, She, and Bahar, Ivet
- Subjects
*GENETIC regulation , *CHROMOSOME structure , *GENE expression , *CHROMOSOMES , *CHROMATIN - Abstract
Characterization of the spatiotemporal properties of the chromatin is essential to gaining insights into the physical bases of gene co-expression, transcriptional regulation and epigenetic modifications. The Gaussian network model (GNM) has proven in recent work to serve as a useful tool for modeling chromatin structural dynamics, using as input high-throughput chromosome conformation capture data. We focus here on the exploration of the collective dynamics of chromosomal structures at hierarchical levels of resolution, from single gene loci to topologically associating domains or entire chromosomes. The GNM permits us to identify long-range interactions between gene loci, shedding light on the role of cross-correlations between distal regions of the chromosomes in regulating gene expression. Notably, GNM analysis performed across diverse cell lines highlights the conservation of the global/cooperative movements of the chromatin across different types of cells. Variations driven by localized couplings between genomic loci, on the other hand, underlie cell differentiation, underscoring the significance of the four-dimensional properties of the genome in defining cellular identity. Finally, we demonstrate the close relation between the cell type–dependent mobility profiles of gene loci and their gene expression patterns, providing a clear demonstration of the role of chromosomal 4D features in defining cell-specific differential expression of genes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. SnapHiC-G: identifying long-range enhancer–promoter interactions from single-cell Hi-C data via a global background model.
- Author
-
Liu, Weifang, Zhong, Wujuan, Giusti-Rodríguez, Paola, Jiang, Zhiyun, Wang, Geoffery W, Sun, Huaigu, Hu, Ming, and Li, Yun
- Subjects
*HUMAN embryonic stem cells , *NEURAL stem cells , *GENOME-wide association studies , *GENETIC regulation , *SINGLE nucleotide polymorphisms - Abstract
Harnessing the power of single-cell genomics technologies, single-cell Hi-C (scHi-C) and its derived technologies provide powerful tools to measure spatial proximity between regulatory elements and their target genes in individual cells. Using a global background model, we propose SnapHiC-G, a computational method, to identify long-range enhancer–promoter interactions from scHi-C data. We applied SnapHiC-G to scHi-C datasets generated from mouse embryonic stem cells and human brain cortical cells. SnapHiC-G achieved high sensitivity in identifying long-range enhancer–promoter interactions. Moreover, SnapHiC-G can identify putative target genes for noncoding genome-wide association study (GWAS) variants, and the genetic heritability of neuropsychiatric diseases is enriched for single-nucleotide polymorphisms (SNPs) within SnapHiC-G-identified interactions in a cell-type-specific manner. In sum, SnapHiC-G is a powerful tool for characterizing cell-type-specific enhancer–promoter interactions from complex tissues and can facilitate the discovery of chromatin interactions important for gene regulation in biologically relevant cell types. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Effectiveness of machine learning at modeling the relationship between Hi‐C data and copy number variation.
- Author
-
Wang, Yuyang, Sun, Yu, Liu, Zeyu, Chen, Bijia, Chen, Hebing, Ren, Chao, Lin, Xuanwei, Hu, Pengzhen, Jia, Peiheng, Xu, Xiang, Xu, Kang, Liu, Ximeng, Li, Hao, and Bo, Xiaochen
- Subjects
-
*CHROMATIN, *CHROMOSOMES, *BORED piles, *DNA copy number variations, *DEEP learning - Abstract
Copy number variation (CNV) refers to the number of copies of a specific sequence in a genome and is a type of chromatin structural variation. The development of the Hi‐C technique has empowered research on the spatial structure of chromatins by capturing interactions between DNA fragments. We utilized machine‐learning methods including the linear transformation model and graph convolutional network (GCN) to detect CNV events from Hi‐C data and reveal how CNV is related to three‐dimensional interactions between genomic fragments in terms of the one‐dimensional read count signal and features of the chromatin structure. The experimental results demonstrated a specific linear relation between the Hi‐C read count and CNV for each chromosome that can be well qualified by the linear transformation model. In addition, the GCN‐based model could accurately extract features of the spatial structure from Hi‐C data and infer the corresponding CNV across different chromosomes in a cancer cell line. We performed a series of experiments including dimension reduction, transfer learning, and Hi‐C data perturbation to comprehensively evaluate the utility and robustness of the GCN‐based model. This work can provide a benchmark for using machine learning to infer CNV from Hi‐C data and serves as a necessary foundation for deeper understanding of the relationship between Hi‐C data and CNV. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. A chromosome-level reference genome assembly and a full-length transcriptome assembly of the giant freshwater prawn (Macrobrachium rosenbergii).
- Author
-
Pootakham, Wirulda, Sittikankaew, Kanchana, Sonthirod, Chutima, Naktang, Chaiwat, Uengwetwanit, Tanaporn, Kongkachana, Wasitthee, Ampolsak, Kongphop, and Karoonuthaisiri, Nitsara
- Subjects
-
*ALTERNATIVE RNA splicing, *MACROBRACHIUM rosenbergii, *MICROSATELLITE repeats, *AQUACULTURE industry, *GENOMES, *FISH breeding - Abstract
The giant freshwater prawn (Macrobrachium rosenbergii) is a key species in the aquaculture industry in several Asian, African, and South American countries. Despite considerable growth in its production worldwide, the genetic complexities of the various M. rosenbergii morphotypes pose challenges in cultivation. This study reports the first chromosome-scale reference genome and a high-quality full-length transcriptome assembly for M. rosenbergii. We employed PacBio High Fidelity (HiFi) sequencing to obtain an initial draft assembly and further scaffolded it with the chromatin contact mapping (Hi-C) technique to achieve a final assembly of 3.73 Gb with an N50 scaffold length of 33.6 Mb. Repetitive elements constituted nearly 60% of the genome assembly, with simple sequence repeats and retrotransposons being the most abundant. The availability of both the chromosome-scale assembly and the full-length transcriptome assembly enabled us to thoroughly probe alternative splicing events in M. rosenbergii. Among the 2,041 events investigated, exon skipping represented the most prevalent class, followed by intron retention. Interestingly, specific isoforms were observed across multiple tissues. Additionally, within a single tissue type, transcripts could undergo alternative splicing, yielding multiple isoforms. We believe that the availability of a chromosome-level reference genome for M. rosenbergii, along with its full-length transcriptome, will be instrumental in advancing our understanding of giant freshwater prawn biology and enhancing its molecular breeding programs, paving the way for the development of M. rosenbergii with valuable traits in commercial aquaculture. [ABSTRACT FROM AUTHOR]
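Several records in this list report a scaffold or contig N50. As a reading aid, here is a minimal sketch of how that statistic is conventionally computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the listed studies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, in Mb): total 290, half 145.
# 80 + 70 = 150 reaches the halfway point, so the N50 is 70.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

A larger N50 means that more of the assembly sits in long sequences, which is why the value typically rises after Hi-C scaffolding joins contigs into chromosome-scale scaffolds.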
- Published
- 2024
- Full Text
- View/download PDF
43. Recovery of 52 bacterial genomes from the fecal microbiome of the domestic cat (Felis catus) using Hi-C proximity ligation and shotgun metagenomics.
- Author
-
Rojas, Connie, Gardy, Jennifer, Ganz, Holly, and Eisen, Jonathan
- Subjects
Hi-C ,antimicrobial-resistance ,assembly ,domestic cats ,fecal microbiome ,gut bacteria ,gut microbiome ,metagenome-assembled genomes (MAGs) ,shotgun metagenomics - Abstract
We used Hi-C proximity ligation with shotgun sequencing to retrieve metagenome-assembled genomes (MAGs) from the fecal microbiomes of two domestic cats (Felis catus). The genomes were assessed for completeness and contamination, classified taxonomically, and annotated for putative antimicrobial resistance (AMR) genes.
- Published
- 2023
44. Memory CD4+ T cells sequentially restructure their 3D genome during stepwise activation
- Author
-
Alexander I. Ward, Jose I. de las Heras, Eric C. Schirmer, and Ariberto Fassati
- Subjects
3D-genome organization ,memory CD4+ T cells ,sequential immune activation ,gene expression regulation ,Hi-C ,IL-2 ,Biology (General) ,QH301-705.5 - Abstract
Background CD4+ T cells are a highly differentiated cell type that maintain enough transcriptomic plasticity to cycle between activated and memory statuses. How the 1D chromatin state and 3D chromatin architecture support this plasticity is under intensive investigation. Methods Here, we wished to test a commercially available in situ Hi-C kit (Arima Genomics Inc.) to establish whether published performance on limiting cell numbers from clonal cell lines copies across to a primary immune cell type. We achieved comparable contact matrices from 50,000, 250,000, and 1,000,000 memory CD4+ T-cell inputs. We generated multiple Hi-C and RNA-seq libraries from the same biological blood donors under three separate conditions: unstimulated fresh ex vivo, IL-2-only stimulated, and T cell receptor (TCR)+CD28+IL-2-stimulated, conferring increasingly stronger activation signals. We wished to capture the magnitude and progression of 3D chromatin shifts and correlate these to expression changes under the two stimulations. Results Although some genome organization changes occurred concomitantly with changes in gene expression, at least as many changes occurred without corresponding changes in expression. Counter to the hypothesis that topologically associated domains (TADs) are largely invariant structures providing a scaffold for dynamic looping contacts between enhancers and promoters, we found that there were at least as many dynamic TAD changes. Stimulation with IL-2 alone triggered many changes in genome organization, and many of these changes were strengthened by additional TCR and CD28 co-receptor stimulation. Conclusions This suggests a stepwise process whereby mCD4+ T cells undergo sequential buildup of 3D architecture induced by distinct or combined stimuli likely to “prime” or “deprime” them for expression responses to subsequent TCR-antigen ligation or additional cytokine stimulation.
- Published
- 2025
- Full Text
- View/download PDF
45. Using genomics to explore the epidemiology of vancomycin resistance in a sewage system
- Author
-
Emilie Egholm Bruun Jensen, Saria Otani, Ivan Liachko, Benjamin Auch, and Frank M. Aarestrup
- Subjects
Hi-C ,metagenomics ,metagenome assemblies ,antimicrobial resistance ,sewage ,glycopeptide resistance ,Microbiology ,QR1-502 - Abstract
ABSTRACT VanHAX-mediated glycopeptide resistance has been consistently high in one of the three main sewer systems in Copenhagen, Lynetten, for +20 years. To explore this for other glycopeptide resistance genes, and whether the colonization has resulted in establishment of multiple bacterial taxa, we mapped 505 shotgun metagenomic data sets from the inlet of three sewage treatment plants to 831 different glycopeptide resistance genes. Only vanHAX and vanHBX genes were differentially abundant in Lynetten. Analyses of eight contigs suggested limited variations in the flanking regions. Proximity ligation metagenomic analysis of 12 samples from Lynetten identified 441 and 5 paired reads mapping to vanHAX and vanHBX, respectively. The other end of these reads was mapped to generated metagenomic-assembled genomes and NCBI using BLAST. vanHBX could only be linked to the phylum level (Bacillota). Plasmid analysis of vanHBX Hi-C contigs showed that these were mainly located on plasmids reported to be found in enterococci species. Most vanHAX-linked reads could only be linked to phylum and class level, but some reads were assigned to Enterococcus faecium (7 reads), Enterococcus faecalis (4 reads), Paenibacillus apiarius (2 reads), and Paenibacillus thiaminolyticus (27 reads). Ten of the 20 Hi-C contigs containing vanHAX were annotated as plasmid, all reported to be found in Enterococcus species. This study shows that while Hi-C technology is valuable for linking antimicrobial resistance genes to bacterial taxa, it suffers from challenges in reliably mapping the linked read to a genomic region with sufficient taxonomic information. Our results also suggest that over the +20 years of colonizing a sewer system, vanHAX has not become widespread across multiple taxa, remaining primarily in E. faecalis and E. faecium, with the exception of Paenibacillus. IMPORTANCE Long-term colonization of microbial communities with antimicrobial-resistant bacteria is expected to result in sharing of the resistance genes between several different bacterial taxa of the communities. We investigated microbiomes from a sewer, which have been colonized with glycopeptide-resistant bacteria harboring the mobile vanHAX gene cluster for a minimum of 20 years, using metagenomics sequencing and Hi-C. We found that despite the long-term presence in the sewer, the vanHAX genes have seemingly not disseminated widely.
- Published
- 2025
- Full Text
- View/download PDF
46. A chromosome-scale genome assembly of mungbean (Vigna radiata)
- Author
-
Supaporn Khanbo, Poompat Phadphon, Chaiwat Naktang, Duangjai Sangsrakru, Pitchaporn Waiyamitra, Nattapol Narong, Chutintorn Yundaeng, Sithichoke Tangphatsornruang, Kularb Laosatit, Prakit Somta, and Wirulda Pootakham
- Subjects
Mungbean ,Vigna radiata ,Chromosome-scale ,Genome assembly ,Hi-C ,Annotation ,Medicine ,Biology (General) ,QH301-705.5 - Abstract
Background Mungbean (Vigna radiata) is one of the most socio-economically important leguminous food crops of Asia and a rich source of dietary protein and micronutrients. Understanding its genetic makeup is crucial for genetic improvement and cultivar development. Methods In this study, we combined single-tube long-fragment reads (stLFR) sequencing technology with high-throughput chromosome conformation capture (Hi-C) technique to obtain a chromosome-level assembly of V. radiata cultivar ‘KUML4’. Results The final assembly of the V. radiata genome was 468.08 Mb in size, with a scaffold N50 of 40.75 Mb. This assembly comprised 11 pseudomolecules, covering 96.94% of the estimated genome size. The genome contained 253.85 Mb (54.76%) of repetitive sequences and 27,667 protein-coding genes. Our gene prediction recovered 98.3% of the highly conserved orthologs based on Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. Comparative analyses using sequence data from single-copy orthologous genes indicated that V. radiata diverged from V. mungo approximately 4.17 million years ago. Moreover, gene family analysis revealed that major gene families associated with defense responses were significantly expanded in V. radiata. Conclusion Our chromosome-scale genome assembly of V. radiata cultivar KUML4 will provide a valuable genomic resource, supporting genetic improvement and molecular breeding. This data will also be valuable for future comparative genomics studies among legume species.
- Published
- 2024
- Full Text
- View/download PDF
47. A chromosome-scale genome assembly and evaluation of mtDNA variation in the willow leaf beetle Chrysomela aeneicollis.
- Author
-
Bracewell, Ryan, Stillman, Jonathon, Dahlhoff, Elizabeth, Smeds, Elliott, Chatla, Kamalakar, Bachtrog, Doris, Williams, Caroline, and Rank, Nathan
- Subjects
Hi-C ,genome assembly ,mitochondria ,Female ,Male ,Animals ,Coleoptera ,DNA ,Mitochondrial ,Salix ,RNA ,Ribosomal ,16S ,Genome ,Mitochondrial ,Chromosomes - Abstract
The leaf beetle Chrysomela aeneicollis has a broad geographic range across Western North America but is restricted to cool habitats at high elevations along the west coast. Central California populations occur only at high altitudes (2,700-3,500 m) where they are limited by reduced oxygen supply and recent drought conditions that are associated with climate change. Here, we report a chromosome-scale genome assembly alongside a complete mitochondrial genome and characterize differences among mitochondrial genomes along a latitudinal gradient over which beetles show substantial population structure and adaptation to fluctuating temperatures. Our scaffolded genome assembly consists of 21 linkage groups; one of which we identified as the X chromosome based on female/male whole genome sequencing coverage and orthology with Tribolium castaneum. We identified repetitive sequences in the genome and found them to be broadly distributed across all linkage groups. Using a reference transcriptome, we annotated a total of 12,586 protein-coding genes. We also describe differences in putative secondary structures of mitochondrial RNA molecules, which may generate functional differences important in adaptation to harsh abiotic conditions. We document substitutions at mitochondrial tRNA molecules and substitutions and insertions in the 16S rRNA region that could affect intermolecular interactions with products from the nuclear genome. This first chromosome-level reference genome will enable genomic research in this important model organism for understanding the biological impacts of climate change on montane insects.
- Published
- 2023
48. Transcriptional enhancers in human neuronal differentiation provide clues to neuronal disorders
- Author
-
Yoshihara, Masahito, Coschiera, Andrea, Bachmann, Jörg A, Pucci, Mariangela, Li, Haonan, Bhagat, Shruti, Murakawa, Yasuhiro, Weltner, Jere, Jouhilahti, Eeva-Mari, Swoboda, Peter, Sahlén, Pelin, and Kere, Juha
- Published
- 2025
- Full Text
- View/download PDF
49. HiSVision: A Method for Detecting Large-Scale Structural Variations Based on Hi-C Data and Detection Transformer
- Author
-
Zhai, Haixia, Dong, Chengyao, Wang, Tao, and Luo, Junwei
- Published
- 2024
- Full Text
- View/download PDF
50. A chromosome-level genome assembly and annotation of the medicinal plant Lepidium apetalum
- Author
-
Hang Yan, Yunhao Zhu, Haoyu Jia, Yuanjun Li, Yongguang Han, Xiaoke Zheng, Xiule Yue, Le Zhao, and Weisheng Feng
- Subjects
Lepidium apetalum ,Genome assembly ,PacBio sequencing ,Hi-C ,Transcriptome ,Genetics ,QH426-470 - Abstract
Abstract Objectives As a traditional Chinese medicine, Lepidium apetalum is commonly used for purging the lung, relieving dyspnea, and alleviating edema, and has significant pharmacological effects on cardiovascular disease, hyperlipidemia, etc. In addition, the seeds of L. apetalum are rich in unsaturated fatty acids, sterols, and glucosinolates and contain a variety of biologically active compounds. To facilitate genomics, phylogenetic and secondary metabolite biosynthesis studies of L. apetalum, we assembled the high-resolution genome of L. apetalum. Data description We completed a chromosome-level assembly of the L. apetalum genome (2n = 32), using the Illumina HiSeq and PacBio Sequel sequencing platforms as well as the high-throughput chromosome conformation capture (Hi-C) technique. The assembled genome was 296.80 Mb in size, 34.41% in GC content, and 23.89% in repeated sequence content, including 316 contigs with a contig N50 of 16.31 Mb. Hi-C scaffolding resulted in 16 chromosomes occupying 99.79% of the assembled genome sequences. A total of 46 584 genes and 105 pseudogenes were predicted, 98.37% of which can be annotated to Nr, GO, KEGG, TrEMBL, SwissProt, Pfam and KOG databases. The high-quality reference genome generated by this study will provide accurate genetic information for the molecular biology research of L. apetalum.
- Published
- 2024
- Full Text
- View/download PDF