Title: Interface to the arXiv API
Version: 0.10
Date: 2024-02-29
Description: An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.
URL: https://docs.ropensci.org/aRxiv/, https://github.com/ropensci/aRxiv
BugReports: https://github.com/ropensci/aRxiv/issues
Depends: R (≥ 3.5.0)
License: MIT + file LICENSE
Imports: httr, utils, XML
Suggests: devtools, knitr, rmarkdown, roxygen2, testthat
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.1
NeedsCompilation: no
Packaged: 2024-02-29 16:57:24 UTC; kbroman
Author: Karthik Ram [aut], Karl Broman [aut, cre]
Maintainer: Karl Broman <broman@wisc.edu>
Repository: CRAN
Date/Publication: 2024-02-29 17:12:37 UTC

arXiv subject classifications

Description

arXiv subject classifications: their abbreviations and corresponding descriptions.

Usage

data(arxiv_cats)

Format

A data frame with five columns: the abbreviations of the subject classifications (category), the field of study, subfield of study (within Physics; NA otherwise), a short description, and a longer description.

Source

https://arxiv.org/category_taxonomy

Examples

arxiv_cats
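The table can be filtered to find abbreviations for use in cat: queries. This is a minimal sketch; it assumes the abbreviation column is named category, as the Format section suggests, so check names(arxiv_cats) first.

```r
library(aRxiv)

# Column names as shipped; confirm these before relying on any of them
names(arxiv_cats)

# Rows whose abbreviation starts with "stat."
# (assumes the first column is named `category`)
stat_rows <- arxiv_cats[grepl("^stat\\.", arxiv_cats$category), ]
stat_rows$category
```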

Count number of results for a given search

Description

Count the number of results for a given search. Useful to check before attempting to pull down a very large number of records.

Usage

arxiv_count(query = NULL, id_list = NULL)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

Value

Number of results (integer). An attribute "search_info" contains information about the search parameters and the time at which it was performed.

See Also

arxiv_search(), query_terms, arxiv_cats

Examples



# count papers in category stat.AP (applied statistics)
arxiv_count(query = "cat:stat.AP")

# count papers by Peter Hall in any stat category
arxiv_count(query = 'au:"Peter Hall" AND cat:stat*')

# count papers for a range of dates
#    here, everything in 2013
arxiv_count("submittedDate:[2013 TO 2014]")
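One possible way to use the count as a guard before downloading records is sketched below. The helper fetch_if_small() and its max_records threshold are hypothetical names introduced for illustration; they are not part of the package.

```r
library(aRxiv)

# Hypothetical helper: fetch results only when the match count is manageable.
# `max_records` is an illustrative threshold, not a package setting.
fetch_if_small <- function(query, max_records = 200) {
  # as.integer() drops the "search_info" attribute, leaving a plain count
  n <- as.integer(arxiv_count(query))
  if (n == 0 || n > max_records) {
    message(n, " records match; nothing fetched")
    return(invisible(NULL))
  }
  arxiv_search(query, limit = n)
}

if (can_arxiv_connect()) {
  # A character vector is combined with AND, so this corresponds to the
  # single-string query 'au:"Peter Hall" AND cat:stat*' used above
  res <- fetch_if_small(c('au:"Peter Hall"', "cat:stat*"))
}
```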




Open abstract for results of arXiv search

Description

Open, in a web browser, the abstract page for each of a set of arXiv search results.

Usage

arxiv_open(search_results, limit = 20)

Arguments

search_results

Data frame of search results, as returned from arxiv_search().

limit

Maximum number of abstracts to open in one call.

Details

There is a delay between calls to utils::browseURL(), with the amount taken from the R option "aRxiv_delay" (in seconds); if missing, the default is 3 sec.

Value

(Invisibly) Vector of character strings with URLs of abstracts opened.

See Also

arxiv_search()

Examples

z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
arxiv_open(z)
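The delay described in Details can be adjusted through the "aRxiv_delay" option. The sketch below lengthens it for one call and then restores the previous setting; the value 5 and the choice to open only the first three results are arbitrary illustrations.

```r
library(aRxiv)

# Lengthen the pause between browser calls, remembering the old setting
old <- options(aRxiv_delay = 5)

if (interactive() && can_arxiv_connect()) {
  z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
  # Open only the first three abstracts; the URLs are returned invisibly
  urls <- arxiv_open(head(z, 3))
  print(urls)
}

# Restore whatever was set before
options(old)
```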


The main search function for aRxiv

Description

Allows for programmatic searching of the arXiv pre-print repository.

Usage

arxiv_search(
  query = NULL,
  id_list = NULL,
  start = 0,
  limit = 10,
  sort_by = c("submitted", "updated", "relevance"),
  ascending = TRUE,
  batchsize = 100,
  force = FALSE,
  output_format = c("data.frame", "list"),
  sep = "|"
)

Arguments

query

Search pattern as a string; a vector of such strings is also allowed, in which case the elements are combined with AND.

id_list

arXiv document IDs, as a comma-delimited string or a vector of such strings.

start

An offset for the start of the search (0 is the first record).

limit

Maximum number of records to return.

sort_by

How to sort the results (ignored if id_list is provided).

ascending

If TRUE, sort in ascending order; otherwise descending (ignored if id_list is provided).

batchsize

Maximum number of records to request at one time.

force

If TRUE, force the search request even if it seems extreme (for example, a very large number of records).

output_format

Indicates whether output should be a data frame or a list.

sep

String to use to separate multiple authors, affiliations, DOI links, and categories, in the case that output_format="data.frame".

Value

If output_format="data.frame", the result is a data frame with each row being a manuscript and columns being the various fields.

If output_format="list", the result is a list parsed from the XML output of the search, closer to the raw output from arXiv.

The data frame format has the following columns.

[,1]   id                 arXiv ID
[,2]   submitted          date first submitted
[,3]   updated            date last updated
[,4]   title              manuscript title
[,5]   summary            abstract
[,6]   authors            author names
[,7]   affiliations       author affiliations
[,8]   link_abstract      hyperlink to abstract
[,9]   link_pdf           hyperlink to pdf
[,10]  link_doi           hyperlink to DOI
[,11]  comment            authors' comment
[,12]  journal_ref        journal reference
[,13]  doi                published DOI
[,14]  primary_category   primary category
[,15]  categories         all categories

The contents are all strings; missing values are empty strings ("").

The columns authors, affiliations, link_doi, and categories may have multiple entries separated by sep (by default, "|").

The result has an attribute "search_info" with details of the search parameters, including the time at which the search was completed. Another attribute, "total_results", gives the total number of records that match the query.
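The multi-valued columns can be split back into vectors with base R. This is a minimal sketch that uses a made-up stand-in for the authors column rather than real search output, and assumes the default sep of "|".

```r
# Stand-in values shaped like entries of the `authors` column (illustrative only)
authors <- c("A. Author|B. Author|C. Author", "D. Author", "")

# "|" is a regular-expression metacharacter, so split on it literally
author_list <- strsplit(authors, "|", fixed = TRUE)

# Number of authors per record; the empty string (a missing value) gives 0
lengths(author_list)
```

With real results, the same call would be applied to z$authors, z$affiliations, z$link_doi, or z$categories.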

See Also

arxiv_count(), arxiv_open(), query_terms, arxiv_cats

Examples



# search for author Peter Hall with deconvolution in title
z <- arxiv_search(query = 'au:"Peter Hall" AND ti:deconvolution', limit=2)
attr(z, "total_results") # total no. records matching query
z$title

# search for a set of documents by arxiv identifiers
z <- arxiv_search(id_list = c("0710.3491v1", "0804.0713v1", "1003.0315v1"))
# can also use a comma-separated string
z <- arxiv_search(id_list = "0710.3491v1,0804.0713v1,1003.0315v1")
# Journal references, if available
z$journal_ref

# search for a range of dates (in this case, one day)
z <- arxiv_search("submittedDate:[199701010000 TO 199701012400]", limit=2)
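The start, limit, sort_by, and ascending arguments can be combined to page through results. The following is a sketch under the assumption that the API is reachable; the page size of 5 is arbitrary.

```r
library(aRxiv)

if (can_arxiv_connect()) {
  query <- 'au:"Peter Hall" AND cat:stat*'

  # Most recently submitted first: the first five records, then the next five
  page1 <- arxiv_search(query, start = 0, limit = 5,
                        sort_by = "submitted", ascending = FALSE)
  page2 <- arxiv_search(query, start = 5, limit = 5,
                        sort_by = "submitted", ascending = FALSE)

  # Total number of matches, which can exceed the number of rows returned
  attr(page1, "total_results")

  # Details of the search parameters and completion time
  str(attr(page1, "search_info"))

  # List output stays closer to the raw output from arXiv
  raw <- arxiv_search(query, limit = 1, output_format = "list")
}
```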




Check for connection to arXiv API

Description

Check for connection to arXiv API

Usage

can_arxiv_connect(max_time = 5)

Arguments

max_time

Maximum wait time in seconds

Value

Returns TRUE if connection is established and FALSE otherwise.

Examples


can_arxiv_connect(2)
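A typical use is to skip network-dependent code when the API is unreachable. The sketch below does this inside a testthat test (testthat is listed under Suggests); the test name and the query are illustrative choices.

```r
library(testthat)
library(aRxiv)

test_that("arxiv_count returns a non-negative count", {
  # Skip rather than fail when arXiv cannot be reached within 2 seconds
  skip_if_not(can_arxiv_connect(2), "arXiv API not reachable")

  n <- as.integer(arxiv_count("cat:stat.AP"))
  expect_gte(n, 0)
})
```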



arXiv query field terms

Description

Possible terms that correspond to different fields in arXiv searches.

Usage

data(query_terms)

Format

A data frame with two columns: the term and corresponding description.

Author(s)

Karl W Broman

Source

https://arxiv.org/help/api/user-manual.html

Examples

query_terms
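The field terms are used as prefixes in a query string, as in the au: and ti: examples elsewhere in this manual. The helper below, build_query(), is a hypothetical illustration of assembling such a string in base R and is not part of the package.

```r
# Hypothetical helper: join named field = value pairs into one query string
build_query <- function(...) {
  parts <- c(...)
  paste0(names(parts), ":", parts, collapse = " AND ")
}

# Quote multi-word values so they are searched as a phrase
q <- build_query(au = '"Peter Hall"', ti = "deconvolution")
q
```

The resulting string can be passed as the query argument of arxiv_count() or arxiv_search().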