Student Theses and Dissertations


Amrita Basu

Degree Name

Doctor of Philosophy (PhD)

RU Laboratory

Allis Laboratory


Eukaryotic DNA is packaged with proteins and RNA to form a substance called chromatin. This packaging is dynamic and regulates access to DNA for essential cellular processes such as transcription, replication, and repair. In recent years, studies have shown that regulated changes in the chemical and physical properties of chromatin often lead to dynamic changes in multiple cellular processes by affecting the accessibility of the DNA. These changes can be brought about in part through post-translational modifications of histone proteins, which act by disrupting chromatin contacts or by recruiting effector proteins to chromatin. Acetylation is a well-studied post-translational modification that has been linked to chromatin-associated processes, notably gene regulation. Many studies have contributed to our knowledge of the enzymology underlying acetylation, including efforts to understand the molecular mechanism of substrate recognition by several acetyltransferases, but traditional experiments to determine intrinsic features of substrate and site specificity have proven challenging. In my thesis work, I hypothesize that the primary amino acid sequence surrounding an acetylated lysine plays a critical role in acetylation site selection, and I ask whether there are sequence preferences that enable a lysine acetyltransferase to recognize target lysines. A computational method was devised to examine this hypothesis, and an experimental approach was taken to test my computationally derived predictions. In Chapter 2, I describe my basic computational methods, using a clustering analysis of protein sequences to predict lysine acetylation based on the sequence characteristics of acetylated lysines within histones. I define a local amino acid sequence composition that represents potential acetylation sites by implementing a clustering analysis of histone and non-histone sequences.
I demonstrate that this sequence composition has predictive power on two independent experimental datasets of acetylation marks. In Chapter 3, I describe the experimental validation approach, which uses mass spectrometry to detect acetylation in histone and non-histone proteins. I also report several novel non-histone acetylated substrates in S. cerevisiae. My approach, combined with more traditional experimental methods, may be useful for identifying additional proteins in the acetylome. Finally, in Chapter 4, I describe two bioinformatics approaches: one to predict additional chromatin-associated effector proteins, and another to further understand the evolutionary history and complexity of the Polycomb Group (PcG) proteins in multicellular organisms in order to infer gene expansion, co-evolution, and deletion events.


A Thesis Presented to the Faculty of The Rockefeller University in Partial Fulfillment of the Requirements for the degree of Doctor of Philosophy

Included in

Life Sciences Commons