Bots are scraping open data — how should researchers respond?
Nature News

90% of open-access data repositories that are part of the Confederation of Open Access Repositories encounter bot scraping. Credit: fdmsd8yea/Getty

Should researchers still be posting their data openly online? It’s a question being debated by some researchers now that bots are routinely mining open-access databases and scientific publications to train artificial-intelligence tools, and in some cases analysing and combining data sets to churn out new results and papers faster than humans can.

Some researchers argue that the potential of automated science to be used for scientific ‘good’, such as speeding up the discovery of new drug targets, means that open data should remain open. But others point to evidence that bots scraping complex data sets can contribute to low-quality research and AI slop, while also allowing the extraction of sensitive data, including patient information. They argue that new rules and technical systems are needed to restrict bot access to databases.

“It’s a pretty big issue everybody should be thinking about, whether you’re for or against AI,” says Andrea Howard, a psychologist at Carleton University in Ottawa, Canada.

Privacy concerns

What is clear is that AI scraping is common. A survey published in June last year by the Confederation of Open Access Repositories found that more than 90% of the member organizations that responded encounter bot scraping, with most of them seeing abnormally high bot activity at least once a week¹. …
Original source: Nature News