Page:Code Swaraj - Carl Malamud - Sam Pitroda.djvu/131

Note on Code Swaraj

Other states were even harder. With Odisha, I was able to pull up index files with long lists of issues of the Gazette, and each index file contained a direct URL to each PDF file. By first bringing down the indices, then parsing them for metadata and file addresses, it was fairly trivial to bring in all the PDF files. But most of the states were not that straightforward.
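
The index-then-download approach described for Odisha can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code actually used: the sample index markup, the `gazette.example.org` address, and the names `PdfLinkParser` and `extract_pdf_links` are all hypothetical, and the real index files may be laid out quite differently. The sketch parses an index page already in hand and collects the title and absolute address of each PDF it links to; the downloading step is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect (link text, absolute URL) pairs for anchors that point at PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # absolute URL of the PDF anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.lower().endswith(".pdf"):
                # Resolve relative addresses against the index page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so the link text can serve as simple metadata.
            title = " ".join("".join(self._text).split())
            self.links.append((title, self._href))
            self._href = None


def extract_pdf_links(index_html, base_url):
    """Return a list of (title, url) tuples for every PDF linked from an index page."""
    parser = PdfLinkParser(base_url)
    parser.feed(index_html)
    parser.close()
    return parser.links


# Hypothetical index page; a real gazette index would be fetched over the network.
sample_index = """
<ul>
  <li><a href="/gazette/2017/issue-001.pdf">Issue 1, 2 January 2017</a></li>
  <li><a href="/gazette/2017/issue-002.pdf">Issue 2, 9 January 2017</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

for title, url in extract_pdf_links(sample_index, "https://gazette.example.org/index.html"):
    print(title, "->", url)
```

A sketch like this only works when the index exposes file addresses directly, which is exactly what made Odisha the easy case.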

Most of the state gazettes are based on some Microsoft server software that does not expose the URL (network address) of the PDF files. The problem was that each state had a different, opaque way of retrieving issues. There are several dozen official gazettes in India, one for each state and others for major municipalities like Delhi. Each one is programmed differently.

We had amassed 163,977 total PDF files in the collection, but it was clear that to do this right, we would have to do some serious work on it in 2018. Not only did files have to be brought in for all the gazettes, the collection needed to be kept up to date to be truly useful, and in order to permit the kind of searching across gazettes we really wanted to see, we had to tackle the issue of high-quality optical character recognition on any scanned gazettes, an issue we also faced with the Public Library of India. In addition, as we downloaded gazettes from the Union government and from states and cities, it was clear some of them were improperly labeled or missing, so some serious quality assurance would be necessary.

The purpose of the official journals of government in any country is to allow citizens to know what their government is doing. This was the genesis in the United States of the Federal Register, the official journal of the federal government. There had been a famous court case that reached the Supreme Court in which it turned out that the government sued a group during the Great Depression for noncompliance with regulations, but nobody could actually find those regulations because they had never been published.

At the urging of Justice Brandeis of the Supreme Court, a Harvard Law professor wrote a seminal paper titled “Government in Ignorance of the Law—A Plea for Better Publication of Executive Legislation.” That led to a formal procedure in which all government regulations would first be published in a preliminary fashion, known as a “Notice of Proposed Rulemaking,” so that citizens might know what would be happening, and then the final rules would also be published. The entire regulation would then be incorporated into a consolidated document, the Code of Federal Regulations, which would be kept up to date with all amendments and deletions, along with helpful historical notes and pointers to the enabling statutes.