Hallucinated citations highest in social sciences preprints site
Nature News

Analyses of research repositories are estimating the rates of hallucinated citations in research papers. Credit: patpitchaya/iStock via Getty

The problem of artificial intelligence models ‘hallucinating’ non-existent citations has recently shot to prominence. Now a team of researchers has sifted through 2.5 million papers and preprints to provide the best assessment of their prevalence yet.

Their audit encompassed 111 million references in papers and preprints listed in major repositories including arXiv, bioRxiv, Social Science Research Network (SSRN), and PubMed Central servers, and found that there were 146,932 hallucinated citations in material published in 2025 alone.

The analysis also suggests that the prevalence of hallucinated citations depends on the area of research. SSRN, a preprint server for social sciences research, had the highest rate of hallucinated citations at nearly 2%, almost five times higher than any other major repository.

“We were really amazed by the overall magnitude and dynamics of the whole body of hallucinated citations,” says Yian Yin, assistant professor of information science at Cornell University in Ithaca, New York state, and a co-author of the study. The analysis was posted on the preprint server arXiv (ref. 1) and has not been peer-reviewed.

Bibliographic hallucinations

Yin and his colleagues were prompted to investigate the scale of the problem after spotting some references to unfamiliar work, supposedly authored by researchers they knew. …