PosterWall Facebook Blog - Upload any text, file, video to your blog - share on Facebook

Posted 12-11-2010 at 11:04 PM by xeeshan

Assignment 2: Regulatory motifs
Bioinformatics algorithms, BI720A, autumn 2010
Total credit points: 20

Instructions for the report layout and how to submit it can be found on the Scio course website.

Problem description

In order to regulate the expression of its genes, the cell employs a large number of transcription factors. These factors are proteins that bind to specific regions in DNA (usually in promoter sequences) and either facilitate or repress transcription of one or more genes. Since the behavior and function of cells are determined by which genes are expressed, it is very important to understand gene regulation.

Regions in the DNA where transcription factors bind are called transcription factor binding sites (TFBSs). By predicting where these are located in the genome of an organism, it is possible to predict which transcription factors regulate which genes.

TFBSs can be represented by motifs, i.e. strings of allowed nucleotides that together constitute a region that a transcription factor can recognize. In addition to A, T, C and G, it is also desirable to represent different combinations of nucleotides in a single position. This can be done using the IUPAC code, which is given below.

W = A or T
S = C or G
R = A or G
Y = C or T
K = G or T
M = A or C
B = C, G, or T (not A)
D = A, G, or T (not C)
H = A, C, or T (not G)
V = A, C, or G (not T)
N = A, C, G, or T (any nucleotide)
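One way to work with the IUPAC code above is to store it as a lookup table and translate each motif into a regular expression. The following is only a minimal sketch in Python; the names `IUPAC` and `motif_to_regex` are made up for illustration and are not part of the assignment.

```python
import re

# IUPAC nucleotide codes from the table above, mapped to the bases they allow.
# The four plain bases map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regular-expression pattern string.

    Hypothetical helper: each symbol becomes a character class such as [AT].
    """
    return "".join("[" + IUPAC[symbol] + "]" for symbol in motif.upper())


pattern = motif_to_regex("TATAWR")
print(pattern)                                # [T][A][T][A][AT][AG]
print(bool(re.fullmatch(pattern, "TATAAG")))  # True
print(bool(re.fullmatch(pattern, "TATACG")))  # False, W does not allow C
```

A symbol that is not in the table raises a `KeyError`, which may be a reasonable way to catch malformed motif files early.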
Task A (12p)
Your first task in this assignment is to implement a program that reads a file containing promoter sequences and a file containing motifs. The program should then determine how many times each motif occurs in each promoter sequence and report these counts to the user.
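A possible starting point for Task A is sketched below in Python. The assignment does not state the file formats, so this sketch assumes FASTA-style promoter input (a `>` header line followed by sequence lines) and parses it from a string; it also assumes that overlapping occurrences should be counted separately. The function names are hypothetical.

```python
# IUPAC codes mapped to the bases they allow (plain bases map to themselves).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def parse_fasta(text):
    """Parse FASTA-style text into {name: sequence}. The format is an assumption."""
    sequences = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line.upper()
    return sequences


def matches_at(sequence, motif, start):
    """True if every motif symbol allows the base at the matching position."""
    return all(sequence[start + i] in IUPAC[m] for i, m in enumerate(motif))


def count_motif(sequence, motif):
    """Count occurrences of an IUPAC motif, including overlapping ones."""
    sequence, motif = sequence.upper(), motif.upper()
    if not motif:
        return 0
    last_start = len(sequence) - len(motif)
    return sum(matches_at(sequence, motif, s) for s in range(last_start + 1))


promoters = parse_fasta(">p1\nGATTACA\n>p2\nATATAT\n")
for name, sequence in promoters.items():
    for motif in ["AY", "ATAT"]:
        print(name, motif, count_motif(sequence, motif))
# p1 AY 2
# p1 ATAT 0
# p2 AY 3
# p2 ATAT 2
```

Scanning every start position, rather than using a non-overlapping search, is what lets `ATAT` be found twice in `ATATAT`; whether that is the intended behavior should be checked against the course instructions.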
Task B (3p)
Extend the program so that the reverse complement of each promoter sequence is determined. The program should then search for motifs in both the promoter sequences in the file and their reverse complements. The output should indicate how many times each motif occurs in each of these.

Task C (5p)
Extend the above program so that it also outputs the positions in the promoter sequences where each motif starts.
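Tasks B and C could be approached along the following lines. This Python sketch makes two assumptions that the assignment text does not settle: start positions are reported 1-based, and positions on the reverse complement are given in the coordinates of the reverse-complemented string. The helper names are illustrative only.

```python
# IUPAC codes mapped to the bases they allow (plain bases map to themselves).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Complement every base, then reverse the result."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def motif_positions(sequence, motif):
    """Return the 1-based start positions of an IUPAC motif (overlaps included)."""
    sequence, motif = sequence.upper(), motif.upper()
    if not motif:
        return []
    positions = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        if all(base in IUPAC[symbol] for base, symbol in zip(window, motif)):
            positions.append(start + 1)
    return positions


promoter = "GATTACA"
reverse = reverse_complement(promoter)
print(reverse)                          # TGTAATC
print(motif_positions(promoter, "AY"))  # [2, 5]
print(motif_positions(reverse, "AY"))   # [5]
```

The number of occurrences on each strand is simply the length of the returned list, so the same function can serve both the counts asked for in Task B and the positions asked for in Task C.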