Exam questions (verified)

Final exam Di Lena. Module 3 version 5

Università degli Studi di Bologna, Bioinformatics, 2020

What it covers

  • This document outlines a series of shell scripting exercises designed for bioinformatics students.
  • The core task involves processing a simulated dataset, `casp13.txt`, which contains protein sequence information.
  • The `casp13.txt` file includes data points such as:
    • Header lines (starting with `>`) with identifiers (e.g., `H0974`, `Q3KP22-3`).
    • Source information (e.g., `048503/048504`, `Human`, `E. coli`).
    • Subunit details and residue counts.
    • Actual protein sequences.
  • The exercises require the application of various Unix command-line utilities in pipelines:
    • `cat`: To display file content.
    • `sort`: For sorting lines alphabetically or numerically.
    • `cut`: To extract specific columns or character ranges.
    • `head`: To output the first part of files.
    • `wc`: To count lines, words, and bytes or characters.
    • `grep`: For searching text using patterns (including regular expressions).
    • `awk`: A pattern-scanning and processing language, used here for field extraction and conditional printing.
    • `sed`: A stream editor for filtering and transforming text, used here for deleting lines that match a pattern.
  • Specific questions challenge students to:
    • Determine the count of unique prefixes from sorted protein data after truncating lines.
    • Count specific header lines containing a particular digit.
    • Extract and count unique identifiers from header lines using `awk`.
    • Calculate the total number of sequence lines after removing header and specific sequence lines using `sed`.
  • These exercises practice data manipulation, pattern matching, and pipeline construction on the command line, skills used routinely in bioinformatics work.
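
The four question types listed above can be sketched as short pipelines. This is a minimal illustration, not the exam's actual commands or answers. The sample file contents, the record layout, the 3-character truncation width, the digit `5`, and the `AAA` deletion pattern are all assumptions invented for the example, since the real `casp13.txt` and the exact question parameters are not shown here.

```shell
#!/bin/sh
# Hypothetical stand-in for casp13.txt: the layout and every value below
# are assumptions for illustration, not the real exam data.
workdir=$(mktemp -d)
f="$workdir/casp13.txt"
cat > "$f" <<'EOF'
>H0974 Q3KP22-3, Human, subunit 1, 72 residues
MSEQLLKAVD
GSTAAPLKQE
>H0974 Q3KP22-3, Human, subunit 2, 65 residues
MKTAYIAKQR
>T0950 P12345, E. coli, 20 residues
MKTAYLLPPE
>T0953 Q99999, Human, 10 residues
GSHMAAAKKK
EOF

# 1. Sort the lines, keep the first 3 characters of each, count distinct prefixes.
#    LC_ALL=C gives byte-wise ordering, so equal prefixes end up adjacent for uniq.
#    tr strips the padding that some wc implementations put before the number.
n_prefixes=$(LC_ALL=C sort "$f" | cut -c1-3 | uniq | wc -l | tr -d ' ')

# 2. Count header lines (starting with '>') that contain the digit 5.
n_headers_with_5=$(grep -c '^>.*5' "$f")

# 3. Take the first field of each header, drop the leading '>',
#    and count the distinct identifiers.
n_ids=$(awk '/^>/ { print substr($1, 2) }' "$f" | sort -u | wc -l | tr -d ' ')

# 4. Delete header lines and sequence lines containing "AAA",
#    then count the lines that remain.
n_seq_lines=$(sed -e '/^>/d' -e '/AAA/d' "$f" | wc -l | tr -d ' ')

echo "unique 3-char prefixes:    $n_prefixes"
echo "headers containing '5':    $n_headers_with_5"
echo "unique identifiers:        $n_ids"
echo "remaining sequence lines:  $n_seq_lines"

rm -rf "$workdir"
```

On this made-up sample the four counts are 6, 3, 3 and 4. The real file will give different numbers, and the exam questions may use other column ranges or patterns.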
