Archives and Documentation Center
Digital Archives

Identifying passages describing protein-protein interaction detection methods in biomedical full text articles using information retrieval methods


dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Özgür, Arzucan.
dc.contributor.author Aydın, Ferhat.
dc.date.accessioned 2023-03-16T10:02:31Z
dc.date.available 2023-03-16T10:02:31Z
dc.date.issued 2016.
dc.identifier.other CMPE 2016 A84
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12328
dc.description.abstract Information regarding the physical interactions among proteins is crucial, since protein-protein interactions (PPIs) are central to many biological processes. The experimental techniques used to verify PPIs are also vital for characterizing and assessing the reliability of the identified PPIs. Much of the information about PPIs and the experimental methods is available only in the text of the scientific publications that report them. In this thesis, we approach the problem of identifying passages with experimental methods for physical interactions between proteins as an information retrieval search task. The baseline system is based on query matching, where the queries are generated by utilizing the names (including synonyms) of the experimental methods in the Proteomics Standards Initiative - Molecular Interactions (PSI-MI) ontology. We propose two methods in which the baseline queries are expanded by including additional relevant terms. The first method is a supervised approach, where the most salient terms for each experimental method are obtained by using the term frequency-relevance frequency (tf.rf) metric over 13 articles from our manually annotated data set of 30 full text articles, which is made publicly available as an additional contribution of this study. The first method is evaluated on the test set consisting of the remaining 17 articles and achieves a better recall score than the baseline. The second method is an unsupervised approach, where the queries for each experimental method are expanded by using the word embeddings of the names of the experimental methods in the PSI-MI ontology. The second method achieves better recall and F-measure scores than the baseline over the test set.
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2016.
dc.subject.lcsh Protein-protein interactions.
dc.title Identifying passages describing protein-protein interaction detection methods in biomedical full text articles using information retrieval methods
dc.format.pages x, 69 leaves ;
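
The supervised query expansion described in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited formulation of relevance frequency, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive and negative passages containing the term; the record does not state which variant, tokenization, or aggregation the thesis uses. The function names and the toy passages below are hypothetical and are not taken from the thesis data set.

```python
import math
from collections import Counter


def relevance_frequency(term, positive, negative):
    """rf = log2(2 + a / max(1, c)); assumed formulation, not confirmed by the record."""
    a = sum(1 for tokens in positive if term in tokens)  # positive passages with the term
    c = sum(1 for tokens in negative if term in tokens)  # negative passages with the term
    return math.log2(2 + a / max(1, c))


def top_expansion_terms(positive, negative, k=3):
    """Rank terms seen in positive passages by tf * rf and return the k best."""
    # Term frequency is pooled over all positive passages (an illustrative choice).
    tf = Counter(t for tokens in positive for t in tokens)
    scores = {t: tf[t] * relevance_frequency(t, positive, negative) for t in tf}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical tokenized passages: positive ones mention a given experimental
# method, negative ones do not.
positive = [["beads", "antibody", "lysate"], ["antibody", "beads", "cells"]]
negative = [["cells", "growth", "medium"], ["cells", "lysate", "buffer"]]

print(top_expansion_terms(positive, negative, k=2))
```

Terms that occur often in passages annotated with a method and rarely elsewhere receive the highest weights, and these are the candidates that would be appended to the baseline query built from the PSI-MI method names.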

