Page:Code Swaraj - Carl Malamud - Sam Pitroda.djvu/135

program they have in mind, but that wasn’t how we work and we have always put mission before money.

The foundation had finally come back and agreed to give us $250,000 in January 2017 and said we could have $250,000 more in July after submitting a report, with the remaining $400,000 to come as installments in 2018 and 2019.

It was far more “chunking” of the grant into tranches than I like to see, but I signed the papers.

Auditing Publishers For Shady Practices

I spent the first six months of 2017 working intensively on research on works of government. Working with two professors and a graduate student at the University of North Carolina, and with help from librarians at the University of California and Stanford, we conducted an intensive search of the scholarly literature looking for author affiliations. It is actually non-trivial to find this information searching journal databases because author affiliations can be written in a number of ways.

What we basically did was throw each government agency, one by one, into three different commercial search engines used by libraries and look at the results. For example, if you search for “Centers for Disease Control” you get articles from not only the U.S. agency but also its Chinese counterpart. So, you refine the search to look for the agency name and the word “U.S.” or “United States” or “Atlanta.”
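The refinement step can be pictured as a small filter over affiliation strings. What follows is only a rough sketch under stated assumptions: it uses plain substring matching rather than the query syntax of any of the commercial search engines, the function name `matches_agency` is hypothetical, and the sample affiliations are made up for illustration.

```python
# Agency name and qualifying terms taken from the example in the text.
AGENCY = "Centers for Disease Control"
QUALIFIERS = ["U.S.", "United States", "Atlanta"]


def matches_agency(affiliation, agency=AGENCY, qualifiers=QUALIFIERS):
    """Return True if the affiliation names the agency AND at least one qualifier.

    Matching is case-insensitive substring matching, which is an assumption
    made for this sketch, not the behavior of any particular search engine.
    """
    text = affiliation.lower()
    if agency.lower() not in text:
        return False
    return any(q.lower() in text for q in qualifiers)


if __name__ == "__main__":
    # Made-up affiliation strings, for illustration only.
    samples = [
        "Centers for Disease Control and Prevention, Atlanta, GA",
        "Centers for Disease Control, Taipei, Taiwan",
    ]
    for s in samples:
        print(s, "->", matches_agency(s))
```

A filter like this still over-matches and under-matches, since affiliations are written in many different ways, which is why the results were treated only as articles that appeared to have federal authors.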

The number of results we found was breathtaking. Our initial audit found 1,264,429 articles that appeared to have been authored by federal employees. From that initial list, we conducted a second-stage analysis asking several questions. It is possible for a federal employee to author an article on their own time without federal funds. Even if that article is within the scope of the employee’s area of expertise, that is not a work of government. The work has to be conducted in the course of the employee’s official duties to be considered a work of government and free of copyright. A question we had was whether articles were properly marked as being devoid of copyright, as required by law.

Our analysis allowed us to sort the 1.2 million article citations two ways. First, because they used Digital Object Identifiers, we could determine how many possible works of government were from which publishers. One corporate branch of Reed Elsevier, for example, had 293,769 articles, whereas the American Medical Association had 5,961 articles. In addition, because we had