Paul Donner studied Library and Information Science (LIS) at the Humboldt University of Berlin. In 2012 he graduated with a Master of Arts in LIS with a thesis in the field of bibliometrics. He has been working for iFQ since August 2013.
Paul Donner
Research Area Research System and Science Dynamics
Researcher
- +49 30 2064177-21
- +49 30 2064177-99
- Google Scholar
- Orcid
Academic research fields
Bibliometrics, Information visualization
List of projects
List of publications
Donner, P. (2024). Remarks on modified fractional counting. Journal of Informetrics, 18, 101585. https://doi.org/10.1016/j.joi.2024.101585
Donner, P. (2022). Drawbacks of Normalization by Percentile Ranks in Citation Impact Studies. Journal of Library and Information Studies, 20(2), 75-93.
Abstract: This paper discusses drawbacks of the percentile rank method for citation impact normalization which have hitherto been neglected in the bibliometrics literature. The transformation of citation counts to percentile ranks changes ratio scale data into ordinal scale data, for which the notions of the ratio between two values and of the magnitude of a difference between two values are not defined – a substantial loss of information. This distorts citation data particularly severely because the differences between citation counts adjacent in order in publication sets are greater for more highly cited publications and because highly cited publications are more scarce than non-highly cited ones. [...]
Donner, P. (2022). Algorithmic identification of Ph.D. thesis-related publications: a proof-of-concept study. Scientometrics (online first).
Abstract: In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included publications, which we match with records in the Scopus database. We then test supervised classification methods on the task of identifying the correct associated publications among high numbers of potential candidates using features of the thesis and publication records. The results indicate that this approach results in good match quality in general and with the best results attained by the “random forest” classification algorithm.
Donner, P. (2021). Citation analysis of Ph.D. theses with data from Scopus and Google Books. Scientometrics, 126, 9431-9456. https://doi.org/10.1007/s11192-021-04173-w
Abstract: This study investigates the potential of citation analysis of Ph.D. theses to obtain valid and useful early career performance indicators at the level of university departments. For German theses from 1996 to 2018 the suitability of citation data from Scopus and Google Books is studied and found to be sufficient to obtain quantitative estimates of early career researchers’ performance at departmental level in terms of scientific recognition and use of their dissertations as reflected in citations. Scopus and Google Books citations complement each other and have little overlap. Individual theses’ citation counts are much higher for those awarded a dissertation award than others. Departmental level estimates of citation impact agree ...
Donner, P. (2021). Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity. Evaluation of similarity methods on a new short text task. Quantitative Science Studies, 2(3). https://doi.org/10.1162/qss_a_00152
Abstract: Cumulative dissertations are doctoral theses comprised of multiple published articles. For studies of publication activity and citation impact of early career researchers it is important to identify these articles and link them to their associated theses. Using a new benchmark data set, this paper reports on experiments of measuring the bilingual textual similarity between, on the one hand, titles and keywords of doctoral theses, and, on the other hand, articles’ titles and abstracts. The tested methods are cosine similarity and L1 distance in the Vector Space Model (VSM) as baselines, the language-indifferent methods Latent Semantic Analysis (LSA) and trigram similarity, and the language-aware methods fastText and Random Indexing (RI)...
Donner, P. (2020). Validation of the Astro dataset clustering solutions with external data. Scientometrics, 126, 1619–1645. https://doi.org/10.1007/s11192-020-03780-3
Abstract: We conduct an independent cluster validation study on published clustering solutions of a research testbed corpus, the Astro dataset of publication records from astronomy and astrophysics. We extend the dataset by collecting external validation data serving as proxies for the latent structure of the corpus. Specifically, we collect (1) grant funding information related to the publications, (2) data on topical special issues, (3) on specific journals’ internal topic classifications and (4) usage data from the main online bibliographic database of the discipline. The latter three types of data are newly introduced for the purpose of clustering validation and the rationale for using them for this task is set out.
Donner, P., & Schmoch, U. (2020). The implicit preference of bibliometrics for basic research. Scientometrics, 124, 1411-1419. https://doi.org/10.1007/s11192-020-03516-3
Abstract: By individually associating articles to basic or applied research, it is shown that basic articles are cited more frequently than applied ones. Dividing the subject categories of the Web of Science into a basic and an applied part, the mean field-normalization rate is referred to the applied or basic part depending on the research orientation of the paper analysed. By this approach, a distinct difference of the citations for the applied and basic parts of most subject categories is found. However, differences of the citation scores of applied and basic research organisations are found as well, but are less clear. The explanation is that applied and basic research organisations generally publish a mix of basic and applied articles. [...]
Donner, P. (2020). A validation of coauthorship credit models with empirical data from the contributions of PhD candidates. Quantitative Science Studies, 1(2), 551-564.
Abstract: A perennial problem in bibliometrics is the appropriate distribution of authorship credit for coauthored publications. Several credit allocation methods and formulas have been introduced, but there has been little empirical validation as to which method best reflects the typical contributions of coauthors. This paper presents a validation of credit allocation methods using a new data set of author-provided percentage contribution figures obtained from the coauthored publications in cumulative PhD theses by authors from three countries that contain contribution statements. [...]
Donner, P., Rimmert, C., & van Eck, N. J. (2020). Comparing institutional-level bibliometric research performance indicator values based on different affiliation disambiguation systems. Quantitative Science Studies, 1(1), MIT Press, 150-170.
Abstract: The present study is an evaluation of three frequently used institution name disambiguation systems. The Web of Science normalized institution names and Organization Enhanced system and the Scopus Affiliation ID system are tested against a complete, independent institution disambiguation system for a sample of German public sector research organizations. The independent system is used as the gold standard in the evaluations that we perform. We study the coverage of the disambiguation systems and, in particular, the differences in a number of commonly used bibliometric indicators. The key finding is that for the sample institutions, the studied systems provide bibliometric indicator values that have only a limited accuracy. [...]
Hinze, S., Butler, L., Donner, P., & McAllister, I. (2019). Different but similar. A comparison of performance monitoring in the UK, Australia and Germany. In Glänzel, W., Moed, H., Schmoch, U., & Thelwall, M. (Eds.), Springer Handbook of Science and Technology Indicators (pp. 465-484). Basel: Springer International Publishing. https://doi.org/10.1007/978-3-030-02511-3
Donner, P., Rimmert, C., & van Eck, N. J. (2019). Comparing institutional-level bibliometric indicator values based on different affiliation disambiguation systems. Benchmarking Web of Science and Scopus platform tools against a gold-standard data set for Germany. In Catalano, G., Daraio, C., Gregori, M., Moed, H. F., & Ruocco, G. (Eds.), Proceedings of the 17th Conference of the International Society for Scientometrics and Informetrics (ISSI 2019), Vol. 1 (pp. 306-315). Edizioni Efesto. ISBN 978-88-3381-118-5.
Donner, P. (2018). Supplementing citations to PhD theses with citations from Google. In R. Costas, T. Franssen, & A. Yegros-Yegros (Eds.), Proceedings of the 23rd International Conference on Science and Technology Indicators (pp. 465-471). Centre for Science and Technology Studies (CWTS), Leiden, Netherlands.