This data reports the publication output (number of articles and number of citations received) for a few scientists from the start of their career to 2000. Most of the variables are processed from the Microsoft Academic Graph (MAG) data set. A few variables are randomly generated.
data(base_pub, package = "fixest")
base_pub is a data frame with 4,024 observations and 10 variables. It covers 200 different scientists and 51 different years, the last of which is 2000.
author_id: scientist identifier
year: current year
affil_id: affiliation ID of the scientist's current affiliation
affil_name: affiliation name of the scientist's current affiliation (character)
field: field name of the scientist (character), time invariant
nb_pub: number of publications of the scientist for the current year
nb_cites: number of citations received by the scientist's publications of the current year. The count includes citations from articles published up to 2020.
birth_year: birth year of the scientist (this is randomly generated)
is_woman: 1 if the scientist is a woman, 0 otherwise (this is randomly generated)
age: current age of the scientist (formally year - birth_year)
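The description above can be checked against the data with a few base R calls. This is a minimal sketch, assuming the fixest package is installed and ships base_pub as documented here; the figures in the comments repeat the documented values and are not verified output.

```r
# Load the data set from the fixest package (assumed to be installed).
data(base_pub, package = "fixest")

# Dimensions: the documentation states 4,024 observations and 10 variables.
dim(base_pub)

# Number of distinct scientists and years (documented as 200 and 51).
length(unique(base_pub$author_id))
length(unique(base_pub$year))

# Last year covered (documented as 2000).
max(base_pub$year)

# age is documented as year - birth_year; check that this holds on every row.
all(base_pub$age == base_pub$year - base_pub$birth_year)
```

Since birth_year, is_woman and age are randomly generated, these checks confirm only the internal consistency of the data and say nothing about the real scientists.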
The source of this data set is the Microsoft Academic Graph data set, extracted in 2020. MAG is now a defunct project; similar data can be found on OpenAlex.
The variables birth_year, is_woman and age were randomly generated. All other variables have been created from the raw MAG files.