Programming for Bioinformatics

University of Bologna, bioinformatics, 2020

What these notes cover

  • DNA File Formats and Processing:
    • Explores the ASN.1, FASTA, and GenBank formats; GenBank offers richer metadata.
    • Details how DNA strings are compressed (bits packed into hexadecimal digits) and provides Python functions (DecoderDict, DNAFromASN1) for decoding them.
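The decoding step mentioned above can be illustrated with a minimal sketch. The bodies of DecoderDict and DNAFromASN1 are not shown in this summary, so the code below does not reproduce them: the names build_decoder and decode_hex_dna are hypothetical, and the 2-bit code (A=00, C=01, G=10, T=11, two bases per hex digit) is an assumption about the encoding.

```python
from itertools import product

BASES = "ACGT"  # assumed 2-bit order: A=00, C=01, G=10, T=11


def build_decoder():
    """Map each hex digit to the two bases it encodes (4 bits = 2 bases)."""
    decoder = {}
    for value, (first, second) in enumerate(product(BASES, repeat=2)):
        # value runs 0..15 in the same order as the 4-bit patterns
        decoder[format(value, "X")] = first + second
    return decoder


def decode_hex_dna(hex_string):
    """Decode a hex string into a DNA string, two bases per hex digit."""
    decoder = build_decoder()
    return "".join(decoder[digit] for digit in hex_string.upper())


print(decode_hex_dna("1B"))  # hex 1 = 0001, hex B = 1011
```

Under the assumed code, "1B" decodes to ACGT. A real file may also need the sequence length to trim padding in the last digit, which this sketch ignores.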
  • Principal Component Analysis (PCA):
    • Used for dimensionality reduction in high-dimensional biological data.
    • The protocol computes the covariance matrix, finds its eigenvectors and eigenvalues, and projects the data onto the leading eigenvectors to simplify analysis.
    • Applied to comparing genomes and classifying bacterial types by codon frequency, with the resulting clusters shown visually.
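The PCA protocol listed above (covariance, eigen-decomposition, projection) can be sketched with NumPy as follows. This is an illustration under assumptions, not the code from the notes: the function name pca and the random toy data are hypothetical, and rows are assumed to be samples with columns as features (for example, codon frequencies).

```python
import numpy as np


def pca(data, n_components=2):
    """Project rows of `data` (samples x features) onto the top components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)        # remove the per-feature mean
    cov = np.cov(centered, rowvar=False)       # features x features covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    components = eigvecs[:, order]
    return centered @ components, eigvals[order]


# Toy data: 6 samples with 4 features each (stand-in for real measurements)
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 4))
projected, variances = pca(samples, n_components=2)
print(projected.shape)
```

Each returned eigenvalue is the variance of the data along the corresponding projected axis, which is why the largest eigenvalues are kept.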
  • Genetic Sequence Analysis:
    • Discusses codon frequencies as a basis for classifying genomes, with a Python function (CountCodons) to count them.
    • Highlights why sequence alignment is hard: indels (insertions/deletions) and differing sequence lengths make brute-force comparison impractical.
    • Mentions the BLOSUM50 matrix, a standard substitution matrix for protein sequence alignment that scores amino acid pairs by how likely one is to substitute for the other.
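Codon counting, the basis for the genome classification mentioned above, can be sketched as below. The CountCodons function from the notes is not shown in this summary; count_codons and codon_frequencies are hypothetical names, and the sketch assumes non-overlapping triplets read from a given frame, dropping any incomplete trailing codon.

```python
from collections import Counter


def count_codons(dna, frame=0):
    """Count non-overlapping codons starting at offset `frame`."""
    dna = dna.upper()
    # The stop bound len(dna) - 2 ensures every slice is a full triplet
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return Counter(codons)


def codon_frequencies(dna, frame=0):
    """Convert codon counts to relative frequencies (empty dict if no codons)."""
    counts = count_codons(dna, frame)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


print(count_codons("ATGGCCATGTAA"))
```

The frequency vectors produced this way (one per genome) are the kind of high-dimensional input that a method such as PCA can then reduce.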
  • Information Theory and Protein Biology:
    • Introduces Shannon entropy, a measure of how unpredictable a data source is, which underlies the analysis of data encoding.
    • Outlines challenges in protein production, such as inclusion body formation, influenced by factors like temperature, concentration, redox environment, and molecular interactions.
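As a small illustration of the Shannon entropy introduced above, the sketch below computes H = Σ p·log2(1/p) over the symbol frequencies of a sequence, in bits per symbol. The function name shannon_entropy is hypothetical and the example strings are made up for demonstration.

```python
from collections import Counter
from math import log2


def shannon_entropy(sequence):
    """Return the Shannon entropy of `sequence` in bits per symbol."""
    counts = Counter(sequence)
    total = len(sequence)
    # p * log2(1/p) summed over the observed symbols; an empty sequence gives 0
    return sum((n / total) * log2(total / n) for n in counts.values())


print(shannon_entropy("ACGT"))  # four equally likely symbols
print(shannon_entropy("AAAA"))  # a single repeated symbol
```

A uniform four-letter sequence reaches the maximum of 2 bits per symbol, while a sequence made of one repeated symbol carries 0 bits.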
  • Advanced Sequencing Technologies:
    • RNA-Seq: An unbiased method for transcriptomic analysis (isoforms, SNPs, fusions), offering superior dynamic range and de novo capabilities compared to microarrays.
    • ChIP-Seq: Leverages Next-Generation Sequencing (NGS) to map protein-DNA interactions across the genome.
    • Whole-Genome Bisulfite Sequencing (WGBS): A key epigenetic tool for comprehensive analysis of DNA methylation patterns, which have been linked to disease.
  • Comprehensive Glossary:
    • Defines numerous essential bioinformatics terms, including Coverage level, CpG site, Deep sequencing, Epigenetics, GWAS, NGS, SNPs, Targeted resequencing, Transcriptome, and WGS.
    • Includes a list of key acronyms relevant to the field.

Other notes for PROGRAMMING FOR BIOINFORMATICS [course code 69442]