Name: BIOMI609 Assignment 1 Solved
SKU: 103775
Price: 30.00 USD
Availability: InStock

Description


1) Write a program in Python, R, or C (pick whichever language you like; these are just the ones I prefer) that takes a FASTQ file as input and prints the distribution of quality scores across all reads. You can summarize the distribution of Q scores at each base position with a statistic of your choice (e.g. mean, mode, median, or quantiles). If you’d like, you can also plot the distribution of Q scores as a box plot, much like the one FastQC generates. Then run your program on the provided FASTQ file and collect its output.

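Note that a FASTQ file stores each read as a four-line record:

Line 1: a sequence identifier, beginning with @
Line 2: the nucleotide sequence
Line 3: a separator line, beginning with +
Line 4: the quality string, with one character per base in line 2

This format is repeated for each read.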
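As a rough illustration of the repeated record layout, the sketch below reads a FASTQ stream four lines at a time. The function name `read_fastq` is hypothetical, and the sketch assumes the common case where every record occupies exactly four lines (no wrapped sequences).

```python
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, sequence, quality) for each four-line record.

    Assumption: records are exactly four lines long (no line wrapping).
    """
    while True:
        header = handle.readline()
        if not header:                      # end of file
            return
        if not header.strip():              # tolerate stray blank lines
            continue
        if not header.startswith("@"):
            raise ValueError(f"expected '@' at start of record, got {header!r}")
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()                   # the '+' separator line, ignored here
        quality = handle.readline().rstrip("\r\n")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].rstrip("\r\n"), sequence, quality
```

Any open text file can be passed as `handle`, for example `with open(path) as handle: for name, seq, qual in read_fastq(handle): ...`, where `path` points at the FASTQ file you were given.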

Hints: The Illumina PHRED quality score encoding can be found here: https://support.illumina.com/help/BaseSpace_OLH_009008/Content/Source/Informatics/BS/QualityScoreEncoding_swBS.htm

The idea is simple: for each character in the quality score line, Q = (ASCII value of that character) - 33. In turn, Q = -10 log₁₀(P_e), where P_e is the probability that the nucleotide base was called incorrectly.
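The two relations above can be sketched in a few lines of Python. The function names `phred_scores` and `error_probability` are made up for illustration, and the offset of 33 is the one stated in the hint.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to Q scores: Q = ASCII value - offset."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Invert Q = -10 * log10(P_e) to get P_e = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)
```

For instance, the character `I` has ASCII value 73, so it encodes Q = 40, which corresponds to an error probability of 10⁻⁴ (one expected miscall in 10,000).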

Here are functions in various languages that convert a character to its ASCII code:

Python: ord()

R: utf8ToInt() (iconv() can also return the raw bytes when called with toRaw = TRUE)

C: When you read a character with scanf() and %c, it is stored in a char, which is already an integer holding the character’s ASCII code, so you can subtract 33 from it directly
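Putting the hints together, one possible shape for the program is sketched below in Python. It is only a sketch under stated assumptions: the names `quality_lines`, `per_position_summary`, and `print_summary` are hypothetical, the file is assumed to contain strictly four-line records with no blank lines, and mean and median are just one choice of summary statistic.

```python
import statistics
from typing import Iterable, Iterator, List, Tuple


def quality_lines(path: str) -> Iterator[str]:
    """Yield the quality string (every fourth line) of each record.

    Assumption: strictly four-line records and no blank lines.
    """
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index % 4 == 3:
                yield line.rstrip("\r\n")


def per_position_summary(
    qualities: Iterable[str], offset: int = 33
) -> List[Tuple[int, float, float]]:
    """Return (position, mean Q, median Q) for each base position."""
    columns: List[List[int]] = []
    for quality in qualities:
        for position, ch in enumerate(quality):
            if position == len(columns):    # first read this long: new column
                columns.append([])
            columns[position].append(ord(ch) - offset)
    return [
        (position + 1, statistics.mean(scores), statistics.median(scores))
        for position, scores in enumerate(columns)
    ]


def print_summary(path: str) -> None:
    """Print one tab-separated line per base position."""
    print("position\tmean_Q\tmedian_Q")
    for position, mean_q, median_q in per_position_summary(quality_lines(path)):
        print(f"{position}\t{mean_q:.2f}\t{median_q}")
```

Calling `print_summary` with the path to the provided FASTQ file prints one row per base position. Keeping a separate list of scores per position also makes it straightforward to swap in quantiles or to feed the columns to a box-plot routine.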

Assign1-cejxlb.zip
