105 results on '"Braga, Marília D. V."'
Search Results
2. Investigating the complexity of the double distance problems
- Author
-
Braga, Marilia D. V., Brockmann, Leonie R., Klerx, Katharina, and Stoye, Jens
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
Two genomes over the same set of gene families form a canonical pair when each of them has exactly one gene from each family. Different distances of canonical genomes can be derived from a structure called breakpoint graph, which represents the relation between the two given genomes as a collection of cycles of even length and paths. Then, the breakpoint distance is equal to n - (c_2 + p_0/2), where n is the number of genes, c_2 is the number of cycles of length 2 and p_0 is the number of paths of length 0. Similarly, when the considered rearrangements are those modeled by the double-cut-and-join (DCJ) operation, the rearrangement distance is n - (c + p_e/2), where c is the total number of cycles and p_e is the total number of even paths. The distance formulation is a basic unit for several other combinatorial problems related to genome evolution and ancestral reconstruction, such as median or double distance. Interestingly, both median and double distance problems can be solved in polynomial time for the breakpoint distance, while they are NP-hard for the rearrangement distance. One way of exploring the complexity space between these two extremes is to consider the {\sigma}_k distance, defined to be n - [c_2 + c_4 + ... + c_k + (p_0 + p_2 + ... +p_k)/2], and increasingly investigate the complexities of median and double distance for the {\sigma}_4 distance, then the {\sigma}_6 distance, and so on. While for the median much effort was done in our and in other research groups but no progress was obtained even for the {\sigma}_4 distance, for solving the double distance under {\sigma}_4 and {\sigma}_6 distances we could devise linear time algorithms, which we present here., Comment: 24 pages, 26 figures
- Published
- 2023
3. Efficient gene orthology inference via large-scale rearrangements
- Author
-
Rubert, Diego P. and Braga, Marília D. V.
- Published
- 2023
- Full Text
- View/download PDF
4. On the Class of Double Distance Problems
- Author
-
Braga, Marília D. V., Brockmann, Leonie R., Klerx, Katharina, Stoye, Jens, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Jahn, Katharina, editor, and Vinař, Tomáš, editor
- Published
- 2023
- Full Text
- View/download PDF
5. Natural family-free genomic distance
- Author
-
Rubert, Diego P., Martinez, Fábio V., and Braga, Marília D. V.
- Subjects
Computer Science - Data Structures and Algorithms ,G.1.6 ,J.3 - Abstract
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments into families, more recently an alternative family-free approach was proposed, and consists of studying the rearrangement distances without prior family assignment. On the one hand the computation of genomic distances in the family-free setting helps to match occurrences of duplicated genes and find homologies, but on the other hand this computation is NP-hard. In this paper, by letting structural rearrangements be represented by the generic double cut and join (DCJ) operation and also allowing insertions and deletions of DNA segments, we propose a new and more general family-free genomic distance, providing an efficient ILP formulation to solve it. Our experiments show that the ILP produces accurate results and can handle not only bacterial genomes, but also fungi and insects, or subsets of chromosomes of mammals and plants., Comment: Workshop on Algorithms in Bioinformatics (WABI) 2020, 24 pages, 10 figures, 8 tables
- Published
- 2020
6. Computing the rearrangement distance of natural genomes
- Author
-
Bohnenkämper, Leonard, Braga, Marília D. V., Doerr, Daniel, and Stoye, Jens
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
The computation of genomic distances has been a very active field of computational comparative genomics over the last 25 years. Substantial results include the polynomial-time computability of the inversion distance by Hannenhalli and Pevzner in 1995 and the introduction of the double-cut and join (DCJ) distance by Yancopoulos et al. in 2005. Both results, however, rely on the assumption that the genomes under comparison contain the same set of unique markers (syntenic genomic regions, sometimes also referred to as genes). In 2015, Shao, Lin and Moret relax this condition by allowing for duplicate markers in the analysis. This generalized version of the genomic distance problem is NP-hard, and they give an ILP solution that is efficient enough to be applied to real-world datasets. A restriction of their approach is that it can be applied only to balanced genomes, that have equal numbers of duplicates of any marker. Therefore it still needs a delicate preprocessing of the input data in which excessive copies of unbalanced markers have to be removed. In this paper we present an algorithm solving the genomic distance problem for natural genomes, in which any marker may occur an arbitrary number of times. Our method is based on a new graph data structure, the multi-relational diagram, that allows an elegant extension of the ILP by Shao, Lin and Moret to count runs of markers that are under- or over-represented in one genome with respect to the other and need to be inserted or deleted, respectively. With this extension, previous restrictions on the genome configurations are lifted, for the first time enabling an uncompromising rearrangement analysis. Any marker sequence can directly be used for the distance calculation. The evaluation of our approach shows that it can be used to analyze genomes with up to a few ten thousand markers, which we demonstrate on simulated and real data.
- Published
- 2020
- Full Text
- View/download PDF
7. Computing the Inversion-Indel Distance
- Author
-
Willing, Eyla, Stoye, Jens, and Braga, Marília D. V.
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
The inversion distance, that is the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be exactly computed thanks to a pioneering approach of Hannenhalli and Pevzner from 1995. In 2000, El-Mabrouk extended the inversion model to perform the comparison of unichromosomal genomes with unequal contents, combining inversions with insertions and deletions (indels) of DNA segments, giving rise to the inversion-indel distance. However, only a heuristic was provided for its computation. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, that can represent several genome rearrangements (including inversions). In 2006, Bergeron, Mixtacki and Stoye showed that the DCJ distance can be computed in linear time with a very simple procedure. As a consequence, in 2010 we gave a linear-time algorithm to compute the DCJ-indel distance. This result allowed the inversion-indel model to be revisited from another angle. In 2013, we could show that, when the diagram that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. In the present work we complete the study of the inversion-indel distance by giving the first algorithm to compute it exactly even in the presence of bad components.
- Published
- 2019
8. Natural family-free genomic distance
- Author
-
Rubert, Diego P., Martinez, Fábio V., and Braga, Marília D. V.
- Published
- 2021
- Full Text
- View/download PDF
9. Computing the Rearrangement Distance of Natural Genomes
- Author
-
Bohnenkämper, Leonard, primary, Braga, Marília D. V., additional, Doerr, Daniel, additional, and Stoye, Jens, additional
- Published
- 2020
- Full Text
- View/download PDF
10. Algorithms for Computing the Family-Free Genomic Similarity Under DCJ
- Author
-
Rubert, Diego P., Medeiros, Gabriel L., Hoshino, Edna A., Braga, Marília D. V., Stoye, Jens, Martinez, Fábio V., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Meidanis, Joao, editor, and Nakhleh, Luay, editor
- Published
- 2017
- Full Text
- View/download PDF
11. A Linear Time Approximation Algorithm for the DCJ Distance for Genomes with Bounded Number of Duplicates
- Author
-
Rubert, Diego P., Feijão, Pedro, Braga, Marília D. V., Stoye, Jens, Martinez, Fábio V., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Frith, Martin, editor, and Storm Pedersen, Christian Nørgaard, editor
- Published
- 2016
- Full Text
- View/download PDF
12. On the Family-Free DCJ Distance
- Author
-
Martinez, Fábio V., Feijão, Pedro, Braga, Marília D. V., Stoye, Jens, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, Series editor, Pevzner, Pavel, Series editor, Waterman, Michael S., Series editor, Brown, Dan, editor, and Morgenstern, Burkhard, editor
- Published
- 2014
- Full Text
- View/download PDF
13. The Potential of Family-Free Genome Comparison
- Author
-
Braga, Marília D. V., Chauve, Cedric, Doerr, Daniel, Jahn, Katharina, Stoye, Jens, Thévenin, Annelyse, Wittler, Roland, Dress, Andreas, Series editor, Linial, Michal, Series editor, Troyanskaya, Olga, Series editor, Vingron, Martin, Series editor, Chauve, Cedric, editor, El-Mabrouk, Nadia, editor, and Tannier, Eric, editor
- Published
- 2013
- Full Text
- View/download PDF
14. An Overview of Genomic Distances Modeled with Indels
- Author
-
Braga, Marília D. V., Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Bonizzoni, Paola, editor, Brattka, Vasco, editor, and Löwe, Benedikt, editor
- Published
- 2013
- Full Text
- View/download PDF
15. Restricted DCJ-Indel Model Revisited
- Author
-
Braga, Marília D. V., Stoye, Jens, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Setubal, João C., editor, and Almeida, Nalvo F., editor
- Published
- 2013
- Full Text
- View/download PDF
16. DCJ-indel Distance with Distinct Operation Costs
- Author
-
da Silva, Poly H., Braga, Marília D. V., Machado, Raphael, Dantas, Simone, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Raphael, Ben, editor, and Tang, Jijun, editor
- Published
- 2012
- Full Text
- View/download PDF
17. Analysis and Implementation of Sorting by Transpositions Using Permutation Trees
- Author
-
Lopes, Marcelo P., Braga, Marilia D. V., de Figueiredo, Celina M. H., de A. Hausen, Rodrigo, Kowada, Luis Antonio B., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Norberto de Souza, Osmar, editor, Telles, Guilherme P., editor, and Palakal, Mathew, editor
- Published
- 2011
- Full Text
- View/download PDF
18. The Problem of Chromosome Reincorporation in DCJ Sorting and Halving
- Author
-
Kováč, Jakub, Braga, Marília D. V., Stoye, Jens, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, and Tannier, Eric, editor
- Published
- 2010
- Full Text
- View/download PDF
19. On Sorting Genomes with DCJ and Indels
- Author
-
Braga, Marília D. V., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, and Tannier, Eric, editor
- Published
- 2010
- Full Text
- View/download PDF
20. Genomic Distance with DCJ and Indels
- Author
-
Braga, Marília D. V., Willing, Eyla, Stoye, Jens, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Moulton, Vincent, editor, and Singh, Mona, editor
- Published
- 2010
- Full Text
- View/download PDF
21. Counting All DCJ Sorting Scenarios
- Author
-
Braga, Marília D. V., Stoye, Jens, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Ciccarelli, Francesca D., editor, and Miklós, István, editor
- Published
- 2009
- Full Text
- View/download PDF
22. Algorithms for Computing the Family-Free Genomic Similarity Under DCJ
- Author
-
Rubert, Diego P., primary, Medeiros, Gabriel L., additional, Hoshino, Edna A., additional, Braga, Marília D. V., additional, Stoye, Jens, additional, and Martinez, Fábio V., additional
- Published
- 2017
- Full Text
- View/download PDF
23. The Solution Space of Sorting by Reversals
- Author
-
Braga, Marília D. V., Sagot, Marie-France, Scornavacca, Celine, Tannier, Eric, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Măndoiu, Ion, editor, and Zelikovsky, Alexander, editor
- Published
- 2007
- Full Text
- View/download PDF
24. Computing the family-free DCJ similarity
- Author
-
Rubert, Diego P., Hoshino, Edna A., Braga, Marília D. V., Stoye, Jens, and Martinez, Fábio V.
- Published
- 2018
- Full Text
- View/download PDF
25. An Algorithm That Builds a Set of Strings Given Its Overlap Graph
- Author
-
Braga, Marília D. V., Meidanis, João, and Rajsbaum, Sergio, editor
- Published
- 2002
- Full Text
- View/download PDF
26. Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes
- Author
-
Diego P. Rubert and Marília D. V. Braga, Rubert, Diego P., Braga, Marília D. V., Diego P. Rubert and Marília D. V. Braga, Rubert, Diego P., and Braga, Marília D. V.
- Abstract
Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the similarities into account, the weighted rearrangement distance between the analyzed genomes (a problem that is NP-hard). The gene matchings are then integrated into gene families in the final step. Although the ILP is quite efficient and could conceptually analyze genomes that are not completely assembled but split in several contigs, our tool failed in completing that task. The main reason is that each ILP pairwise comparison includes an optimal capping that connects each end of a linear segment of one genome to an end of a linear segment in the other genome, producing an exponential increase of the search space. In this work, we design and implement a heuristic capping algorithm that replaces the optimal capping by clustering (based on their gene content intersections) the linear segments into m ≥ 1 subsets, whose ends are capped independently. Furthermore, in each subset, instead of allowing all possible connections, we let only the ends of content-related segments be connected. Although there is no guarantee that m is much bigger than one, and with the possible side effect of resulting in sub-optimal instead of optimal gene matchings, the heuristic works very well in practice, from both the speed performance and the quality of computed solutions. Our experiments on real data show that we can now efficiently analyze fruit fly genomes with unfinished assemblies distributed in hundreds or even thousands of contigs, obtaining orthologies that are more similar to FlyBase orthologies when compared to orthologies computed by other inference tools. Moreover, for complete assemblies the version with heuristic capping reports orthologies that
- Published
- 2022
- Full Text
- View/download PDF
27. The potential of family-free rearrangements towards gene orthology inference
- Author
-
Rubert, Diego P., primary, Doerr, Daniel, additional, and Braga, Marília D. V., additional
- Published
- 2021
- Full Text
- View/download PDF
28. Additional file 1 of Natural family-free genomic distance
- Author
-
Rubert, Diego P., Fábio V. Martinez, and Braga, Marília D. V.
- Subjects
ComputingMethodologies_SIMULATIONANDMODELING ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,ComputingMilieux_COMPUTERSANDEDUCATION ,ComputerApplications_COMPUTERSINOTHERSYSTEMS - Abstract
Additional file 1. Supplementary material on the model and on the experiments, including supplementary figures and tables.
- Published
- 2021
- Full Text
- View/download PDF
29. Natural Family-Free Genomic Distance
- Author
-
Diego P. Rubert and Fábio V. Martinez and Marília D. V. Braga, Rubert, Diego P., Martinez, Fábio V., Braga, Marília D. V., Diego P. Rubert and Fábio V. Martinez and Marília D. V. Braga, Rubert, Diego P., Martinez, Fábio V., and Braga, Marília D. V.
- Abstract
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into families, more recently an alternative model was proposed, which, instead of family classification, simply uses the pairwise similarities between DNA fragments of both genomes to compute their rearrangement distance. This model represents structural rearrangements by the generic double cut and join (DCJ) operation and is then called family-free DCJ distance. It computes the DCJ distance between the two genomes by searching for a matching of their genes based on the given pairwise similarities, therefore helping to find gene homologies. The drawback is that its computation is NP-hard. Another point is that the family-free DCJ distance must correspond to a maximal matching of the genes, due to the fact that unmatched genes are just ignored: maximizing the matching prevents the free lunch artifact of having empty or almost empty matchings giving the smaller distances. In this paper, besides DCJ operations, we allow content-modifying operations of insertions and deletions of DNA segments and propose a new and more general family-free genomic distance. In our model we use the pairwise similarities to assign weights to both matched and unmatched genes, so that an optimal solution does not necessarily maximize the matching. Our model then results in a natural family-free genomic distance, that takes into consideration all given genes and has a search space composed of matchings of any size. We provide an efficient ILP formulation to solve it, by extending the previous formulations for computing family-based genomic distances from Shao et al. (J. Comput. Biol., 2015) and Bohnenkämper et al. (Proc. of RECOMB, 2020). Our experiments sh
- Published
- 2020
- Full Text
- View/download PDF
30. Restricted DCJ-Indel Model Revisited
- Author
-
Braga, Marília D. V., primary and Stoye, Jens, additional
- Published
- 2013
- Full Text
- View/download PDF
31. The Potential of Family-Free Genome Comparison
- Author
-
Braga, Marília D. V., primary, Chauve, Cedric, additional, Doerr, Daniel, additional, Jahn, Katharina, additional, Stoye, Jens, additional, Thévenin, Annelyse, additional, and Wittler, Roland, additional
- Published
- 2013
- Full Text
- View/download PDF
32. On the weight of indels in genomic distances
- Author
-
Ribeiro Leonardo C, Machado Raphael, Braga Marília D V, and Stoye Jens
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background Classical approaches to compute the genomic distance are usually limited to genomes with the same content, without duplicated markers. However, differences in the gene content are frequently observed and can reflect important evolutionary aspects. A few polynomial time algorithms that include genome rearrangements, insertions and deletions (or substitutions) were already proposed. These methods often allow a block of contiguous markers to be inserted, deleted or substituted at once but result in distance functions that do not respect the triangular inequality and hence do not constitute metrics. Results In the present study we discuss the disruption of the triangular inequality in some of the available methods and give a framework to establish an efficient correction for two models recently proposed, one that includes insertions, deletions and double cut and join (DCJ) operations, and one that includes substitutions and DCJ operations. Conclusions We show that the proposed framework establishes the triangular inequality in both distances, by summing a surcharge on indel operations and on substitutions that depends only on the number of markers affected by these operations. This correction can be applied a posteriori, without interfering with the already available formulas to compute these distances. We claim that this correction leads to distances that are biologically more plausible.
- Published
- 2011
- Full Text
- View/download PDF
33. Genomic distance under gene substitutions
- Author
-
Ribeiro Leonardo C, Machado Raphael, Braga Marília D V, and Stoye Jens
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background The distance between two genomes is often computed by comparing only the common markers between them. Some approaches are also able to deal with non-common markers, allowing the insertion or the deletion of such markers. In these models, a deletion and a subsequent insertion that occur at the same position of the genome count for two sorting steps. Results Here we propose a new model that sorts non-common markers with substitutions, which are more powerful operations that comprehend insertions and deletions. A deletion and an insertion that occur at the same position of the genome can be modeled as a substitution, counting for a single sorting step. Conclusions Comparing genomes with unequal content, but without duplicated markers, we give a linear time algorithm to compute the genomic distance considering substitutions and double-cut-and-join (DCJ) operations. This model provides a parsimonious genomic distance to handle genomes free of duplicated markers, that is in practice a lower bound to the real genomic distances. The method could also be used to refine orthology assignments, since in some cases a substitution could actually correspond to an unannotated orthology.
- Published
- 2011
- Full Text
- View/download PDF
34. The Problem of Chromosome Reincorporation in DCJ Sorting and Halving
- Author
-
Kováč, Jakub, primary, Braga, Marília D. V., additional, and Stoye, Jens, additional
- Published
- 2010
- Full Text
- View/download PDF
35. On Sorting Genomes with DCJ and Indels
- Author
-
Braga, Marília D. V., primary
- Published
- 2010
- Full Text
- View/download PDF
36. Counting All DCJ Sorting Scenarios
- Author
-
Braga, Marília D. V., primary and Stoye, Jens, additional
- Published
- 2009
- Full Text
- View/download PDF
37. baobabLUNA: the solution space of sorting by reversals
- Author
-
Braga, Marília D. V.
- Published
- 2009
38. An Algorithm That Builds a Set of Strings Given Its Overlap Graph
- Author
-
Braga, Marília D. V., primary and Meidanis, João, additional
- Published
- 2002
- Full Text
- View/download PDF
39. The Solution Space of Sorting by Reversals
- Author
-
Braga, Marília D. V., primary, Sagot, Marie-France, additional, Scornavacca, Celine, additional, and Tannier, Eric, additional
- Full Text
- View/download PDF
40. On the weight of indels in genomic distances
- Author
-
Braga, Marília D V, primary, Machado, Raphael, additional, Ribeiro, Leonardo C, additional, and Stoye, Jens, additional
- Published
- 2011
- Full Text
- View/download PDF
41. Genomic distance under gene substitutions
- Author
-
Braga, Marília D V, primary, Machado, Raphael, additional, Ribeiro, Leonardo C, additional, and Stoye, Jens, additional
- Published
- 2011
- Full Text
- View/download PDF
42. On the family-free DCJ distance and similarity.
- Author
-
Martinez, Fábio V., Feijão, Pedro, Braga, Marília D. V., and Stoye, Jens
- Subjects
GENETIC transformation ,GENE delivery techniques ,GENOMICS ,GENOMES ,GENETIC regulation - Abstract
Structural variation in genomes can be revealed by many (dis)similarity measures. Rearrangement operations, such as the so called double-cut-and-join (DCJ), are large-scale mutations that can create complex changes and produce such variations in genomes. A basic task in comparative genomics is to find the rearrangement distance between two given genomes, i.e., the minimum number of rearragement operations that transform one given genome into another one. In a family-based setting, genes are grouped into gene families and efficient algorithms have already been presented to compute the DCJ distance between two given genomes. In this work we propose the problem of computing the DCJ distance of two given genomes without prior gene family assignment, directly using the pairwise similarities between genes. We prove that this new family-free DCJ distance problem is APX-hard and provide an integer linear program to its solution. We also study a family-free DCJ similarity and prove that its computation is NP-hard. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
43. On Sorting Genomes with DCJ and Indels.
- Author
-
Braga, Marília D. V.
- Abstract
A previous work of Braga, Willing and Stoye compared two genomes with unequal content, but without duplications, and presented a new linear time algorithm to compute the genomic distance, considering double cut and join (DCJ) operations, insertions and deletions. Here we derive from this approach an algorithm to sort one genome into another one also using DCJ, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
44. The Problem of Chromosome Reincorporation in DCJ Sorting and Halving.
- Author
-
Kováč, Jakub, Braga, Marília D. V., and Stoye, Jens
- Abstract
We study two problems in the double cut and join (DCJ) model: sorting – transforming one multilinear genome into another and halving – transforming a duplicated genome into a perfectly duplicated one. The DCJ model includes rearrangement operations such as reversals, translocations, fusions and fissions. We can also mimic transpositions or block interchanges by two operations: we extract an appropriate segment of a chromosome, creating a temporary circular chromosome, and in the next step we reinsert it in its proper place. Existing linear-time algorithms solving both problems ignore the constraint of reincorporating the temporary circular chromosomes immediately after their creation. For the restricted sorting problem only a quadratic algorithm was known, whereas the restricted halving problem was stated as open by Tannier, Zheng, and Sankoff. In this paper we address this constraint and show how to solve the problem of sorting in O(nlogn) time and halving in O(n
3/2 ) time. [ABSTRACT FROM AUTHOR]- Published
- 2011
- Full Text
- View/download PDF
45. The Solution Space of Sorting by Reversals.
- Author
-
Istrail, Sorin, Pevzner, Pavel, Waterman, Michael S., Măndoiu, Ion, Zelikovsky, Alexander, Braga, Marília D. V., Sagot, Marie-France, Scornavacca, Celine, and Tannier, Eric
- Abstract
In comparative genomics, algorithms that sort permutations by reversals are often used to propose evolutionary scenarios of large scale genomic mutations between species. One of the main problems of such methods is that they give one solution while the number of optimal solutions is huge, with no criteria to discriminate among them. Bergeron et al. [4] started to give some structure to the set of optimal solutions, in order to be able to deliver more presentable results than only one solution or a complete list of all solutions. The structure is a way to group solutions into equivalence classes, and to identify in each class one particular representative. However, no algorithm exists so far to compute this set of representatives except through the enumeration of all solutions, which takes too much time even for small permutations. Bergeron et al. [4] state as an open problem the design of such an algorithm. We propose in this paper an answer to this problem, that is, an algorithm which gives one representative for each class of solutions and counts the number of solutions in each class, with a better theoretical and practical complexity than the complete enumeration method. We give several biological examples where the result is more relevant than a unique optimal solution or the list of all solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
46. On the inversion-indel distance.
- Author
-
Willing, Eyla, Zaccaria, Simone, Braga, Marília D. V., and Stoye, Jens
- Abstract
Background: The inversion distance, that is the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be computed thanks to a pioneering approach of Hannenhalli and Pevzner in 1995. In 2000, El-Mabrouk extended the inversion model to allow the comparison of unichromosomal genomes with unequal contents, thus insertions and deletions of DNA segments besides inversions. However, an exact algorithm was presented only for the case in which we have insertions alone and no deletion (or vice versa), while a heuristic was provided for the symmetric case, that allows both insertions and deletions and is called the inversion-indel distance. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, that can represent several genome rearrangements (including inversions). Among others, the DCJ model gave rise to two important results. First, it has been shown that the inversion distance can be computed in a simpler way with the help of the DCJ operation. Second, the DCJ operation originated the DCJ-indel distance, that allows the comparison of genomes with unequal contents, considering DCJ, insertions and deletions, and can be computed in linear time. Results: In the present work we put these two results together to solve an open problem, showing that, when the graph that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. We also give a lower and an upper bound for the inversion-indel distance in the presence of bad components. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
47. DCJ-indel and DCJ-substitution distances with distinct operation costs.
- Author
-
da Silva, Poly H., Machado, Raphael, Dantas, Simone, and Braga, Marília D. V.
- Subjects
DNA ,GENOMES ,GENOMICS ,CHROMOSOMES ,GENETICS ,GENETIC testing - Abstract
Background: Classical approaches to compute the genomic distance are usually limited to genomes with the same content and take into consideration only rearrangements that change the organization of the genome (i.e. positions and orientation of pieces of DNA, number and type of chromosomes, etc.), such as inversions, translocations, fusions and fissions. These operations are generically represented by the double-cut and join (DCJ) operation. The distance between two genomes, in terms of number of DCJ operations, can be computed in linear time. In order to handle genomes with distinct contents, also insertions and deletions of fragments of DNA – named indels – must be allowed. More powerful than an indel is a substitution of a fragment of DNA by another fragment of DNA. Indels and substitutions are called content-modifying operations. It has been shown that both the DCJ-indel and the DCJ-substitution distances can also be computed in linear time, assuming that the same cost is assigned to any DCJ or content-modifying operation. Results: In the present study we extend the DCJ-indel and the DCJ-substitution models, considering that the content-modifying cost is distinct from and upper bounded by the DCJ cost, and show that the distance in both models can still be computed in linear time. Although the triangular inequality can be disrupted in both models, we also show how to efficiently fix this problem a posteriori. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
48. Computing the Rearrangement Distance of Natural Genomes.
- Author
-
Bohnenkämper L, Braga MDV, Doerr D, and Stoye J
- Subjects
- Algorithms, Models, Genetic, Programming, Linear, Computational Biology, Gene Rearrangement genetics, Genome genetics, Genomics
- Abstract
The computation of genomic distances has been a very active field of computational comparative genomics over the past 25 years. Substantial results include the polynomial-time computability of the inversion distance by Hannenhalli and Pevzner in 1995 and the introduction of the double cut and join distance by Yancopoulos et al. in 2005. Both results, however, rely on the assumption that the genomes under comparison contain the same set of unique markers (syntenic genomic regions, sometimes also referred to as genes). In 2015, Shao et al. relax this condition by allowing for duplicate markers in the analysis. This generalized version of the genomic distance problem is NP-hard, and they give an integer linear programming (ILP) solution that is efficient enough to be applied to real-world datasets. A restriction of their approach is that it can be applied only to balanced genomes that have equal numbers of duplicates of any marker. Therefore, it still needs a delicate preprocessing of the input data in which excessive copies of unbalanced markers have to be removed. In this article, we present an algorithm solving the genomic distance problem for natural genomes, in which any marker may occur an arbitrary number of times. Our method is based on a new graph data structure, the multi-relational diagram, that allows an elegant extension of the ILP by Shao et al. to count runs of markers that are under- or over-represented in one genome with respect to the other and need to be inserted or deleted, respectively. With this extension, previous restrictions on the genome configurations are lifted, for the first time enabling an uncompromising rearrangement analysis. Any marker sequence can directly be used for the distance calculation. The evaluation of our approach shows that it can be used to analyze genomes with up to a few 10,000 markers, which we demonstrate on simulated and real data.
- Published
- 2021
- Full Text
- View/download PDF
49. Sorting Linear Genomes with Rearrangements and Indels.
- Author
-
Braga MD and Stoye J
- Subjects
- Algorithms, Models, Genetic, Gene Rearrangement genetics, Genome genetics, Genomics methods, INDEL Mutation genetics
- Abstract
Rearrangements are mutations that can change the organization of a genome, but not its content. Examples are inversions of DNA segments, translocations of chromosome ends, fusions and fissions of chromosomes. All mentioned rearrangements can be represented by the generic Double Cut and Join (DCJ) operation. However, the DCJ operation also allows circular chromosomes to be created at intermediate steps, even if the compared genomes are linear. In this case it is more plausible to consider a restriction in which the reincorporation of a circular chromosome has to be done immediately after its creation. We call these two consecutive operations an ER composition. It has been shown that an ER composition mimics either an internal block interchange (when two segments in the same chromosome exchange their positions), or an internal transposition (the special case of a block interchange when the two segments are adjacent). The DCJ distance of two genomes is the same, regardless of this restriction, and can be computed in linear time. For comparing two genomes with unequal contents, in addition to rearrangements we have to allow insertions and deletions of DNA segments-named indels. It is already known that the distance in the model combining DCJ and indel operations can be exactly computed. Again, for linear genomes it would be more plausible to adopt a restricted version with ER compositions. This model was studied recently by da Silva et al. (BMC Bioinformatics 13, Suppl. 19, S14, 2012), but only an upper bound for the restricted DCJ-indel distance was provided. Here we first solve an open problem posed in that paper and present a very simple proof showing that the distance, which can be computed in linear time, is the same for both the unrestricted and the restricted DCJ-indel models. We then give a simpler algorithm for computing an optimal restricted DCJ-indel sorting scenario in O(n log n) time. We also relate the DCJ-indel distance to the restricted DCJ-substitution distance, which instead of indels considers a more powerful operation that allows the substitution of a DNA segment by another DNA segment. We show that the DCJ-indel distance is a 2-approximation for the restricted DCJ-substitution distance.
- Published
- 2015
- Full Text
- View/download PDF
50. Restricted DCJ-indel model: sorting linear genomes with DCJ and indels.
- Author
-
da Silva PH, Machado R, Dantas S, and Braga MD
- Subjects
- Algorithms, Chromosomes, Genomics, Mutation, Translocation, Genetic, DNA Breaks, Double-Stranded, Genome genetics, INDEL Mutation, Models, Genetic
- Abstract
Background: The double-cut-and-join (DCJ) is a model that is able to efficiently sort a genome into another, generalizing the typical mutations (inversions, fusions, fissions, translocations) to which genomes are subject, but allowing the existence of circular chromosomes at the intermediate steps. In the general model many circular chromosomes can coexist in some intermediate step. However, when the compared genomes are linear, it is more plausible to use the so-called restricted DCJ model, in which we proceed the reincorporation of a circular chromosome immediately after its creation. These two consecutive DCJ operations, which create and reincorporate a circular chromosome, mimic a transposition or a block-interchange. When the compared genomes have the same content, it is known that the genomic distance for the restricted DCJ model is the same as the distance for the general model. If the genomes have unequal contents, in addition to DCJ it is necessary to consider indels, which are insertions and deletions of DNA segments. Linear time algorithms were proposed to compute the distance and to find a sorting scenario in a general, unrestricted DCJ-indel model that considers DCJ and indels., Results: In the present work we consider the restricted DCJ-indel model for sorting linear genomes with unequal contents. We allow DCJ operations and indels with the following constraint: if a circular chromosome is created by a DCJ, it has to be reincorporated in the next step (no other DCJ or indel can be applied between the creation and the reincorporation of a circular chromosome). We then develop a sorting algorithm and give a tight upper bound for the restricted DCJ-indel distance., Conclusions: We have given a tight upper bound for the restricted DCJ-indel distance. The question whether this bound can be reduced so that both the general and the restricted DCJ-indel distances are equal remains open.
- Published
- 2012
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.