All functions

abstracts2text()
Validate a data frame with abstracts to construct a vector of strings
example_abstract
Vector with 100 example abstracts from papers. These abstracts are used as an example to create an object of class abstracts.
example_auc
Example AUC value computed on test data using a plus model fitted with a set of abstracts.
example_class
Vector with 100 example class labels for abstracts from papers. It is used as an example to build an object of class abstracts. Each label is either positive (belonging to the corpus of mammal-parasite papers) or unknown.
example_doi
Vector with 100 example DOIs from papers. It is used as an example to build an object of class abstracts.
example_plus
Example of a plus model fitted with abstracts.
example_test
Example of test data. It is a sample of the lacs dataset.
example_title
Vector with 100 example titles from papers. It is used as an example to build an object of class abstracts.
example_train
Example of training data. It is a sample of the lacs dataset.
example_vocabulary
Example of a vocabulary. It is a vocabulary object used to test the new_plus function.
fit_plus()
Fit a plus model to an abstracts object
get_abstracts()
Helper function for the abstracts model
get_dtm()
Get a document term matrix from abstracts
get_vocabulary()
Create vocabulary from abstracts
get_y()
Get the independent variable for an abstracts object
hp3_abstracts
Dataset with 710 abstracts originally selected by the hp3 project. These papers were used to obtain mammal-virus interactions, which were aligned and taxonomically harmonized by Rory Gibb of the clover project. Rory Gibb curated the original hp3 dataset and added the doi and pmid. We retrieved the abstracts through the Entrez API in PubMed using the pmid. The dataset is primarily used to test a lacs model as a classification benchmark. We use the true positive rate (recall) to evaluate classification on this dataset. We consider all these articles positive class papers because they contain information on parasite-host interactions. Our lacs model is trained to find papers with this information. The original dataset is stored at 10.5281/zenodo.596810 and the clover curated dataset was retrieved from 10.5281/zenodo.4435127
lacs()
Create an object of class lacs
lacsSample
Data frame with 600 abstracts. Each abstract belongs to one of two classes, positive or unknown. Abstracts from the positive class come from the ZOVER and GMPD databases. Abstracts from the unknown class are random abstracts retrieved from Crossref.
new_abstracts()
Constructor function for the class abstracts
new_lacs()
Constructor function for the class lacs
predict()
Predicted classification with a lacs model
predictions()
Title
validate_abstracts()
Validate a data frame with abstracts to construct an object of class abstracts via the new_abstracts() function