Searching through RECAP

[Post updated — see note at bottom]

The folks at the CITP have just improved RECAP.

How? They have now added search functionality.

By going to Archive.recapthelaw.org, you can search all the documents gathered by the RECAP Firefox Extension. The simple search gives you quick access, without charge, to documents from U.S. Federal District and Bankruptcy Courts. Further, the search will pull up the full docket sheet for a case, alerting you to documents that are on PACER (and not yet in RECAP) and allowing you to acquire them.

There is a simple and an advanced query page. The advanced query allows you to search by case name, number, court and dates of filing.

Two great features: it provides an RSS feed and an e-mail alert option for your query if you want to track the case.

Further, once you are viewing a case, you can add tags and connect related cases. For the case that I was searching, it even showed that it had been viewed 2 other times (by me).

As to privacy concerns, when you view case details, there is a button for reporting privacy violations. Although we hope that these privacy mistakes won’t occur, this button does help remove that data from the searchable RECAP archive.

The site mentions that it is still in the experimental phase and they welcome feedback.  I might suggest a page of search tips (for example, should we use quotation marks for exact searches?).  Also, I am not sure whether this searches the full text of all the downloaded documents and the tags that users might supply, or just a few fields in the documents (attorney name, nature of suit, etc.).  So, a bit more information on the searching would be helpful.  [I will find out more and add to this post with details.]

But, all that being said, what a great resource.  Now, what should we ask the folks at CITP to do for us next?

[UPDATE: this comes from Dhruv Kapadia, one of the Archive.Recapthelaw.org developers: “Right now the search is limited to the contents of the docket – the descriptions of each document as well as some parts of the metadata associated with the docket. In the future, we may try to incorporate the OCRed text of the documents themselves, but we aren’t doing that currently.”]

What we don’t know….RECAP/Pacer Survey

At the NOCALL Spring Institute in March, I demonstrated the RECAP plug-in.  After the presentation, one attendee stopped by and suggested that not enough librarians are using the plug-in because they just don’t know about it.  I must admit that the comment surprised me — so, I decided to do a quick survey.

With the blessing of the folks at RECAP (CITP at Princeton), I created a super simple survey trying to see  if we (librarians) know about RECAP and if we do, do we use it and teach about it.  I created a survey on Zoomerang and sent the link to the following e-mail lists: LAW-LIB, NOCALL, and All-SIS, and also spread the word on Twitter.

As of May 15th, here is what we saw from the survey:

There were 261 completed surveys.  Law firm librarians represented 18% of the respondents; academic law librarians represented 70%, and state/county/federal librarians represented 6% of the replies.

Ninety percent of the respondents said that they use PACER.

However, 42.4% of the 257 folks who answered the question “Have you ever heard of RECAP?” said “no”.   The academic law librarians comprised nearly 78 percent of the “no” votes (and 63% of the “yes” votes).   Seventy-three percent of the 45 law firm librarians who responded to this question had heard of RECAP.
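As a rough sanity check, the percentages above can be turned back into approximate head counts. This is a minimal sketch: the counts are reconstructed from the rounded percentages quoted in this post, not taken from the raw survey data, so treat them as estimates.

```python
# Approximate respondent counts reconstructed from the rounded percentages
# quoted in the post. These are estimates, not the survey's raw data.

answered = 257      # respondents who answered "Have you ever heard of RECAP?"
no_share = 0.424    # share of those who said "no"

no_count = round(answered * no_share)
yes_count = answered - no_count

# Academic law librarians: "nearly 78 percent" of the "no" group, 63% of "yes".
academic_no = round(no_count * 0.78)
academic_yes = round(yes_count * 0.63)
academic_share = (academic_no + academic_yes) / answered

# Law firm librarians: 73% of the 45 who answered had heard of RECAP.
firm_heard = round(45 * 0.73)

print(f"'no' answers: about {no_count}; 'yes' answers: about {yes_count}")
print(f"academic librarians: about {academic_no} 'no', {academic_yes} 'yes'")
print(f"implied academic share of this question: {academic_share:.0%}")
print(f"law firm librarians who had heard of RECAP: about {firm_heard} of 45")
```

The implied academic share comes out close to the 70% reported for the survey as a whole, so the figures appear to hang together.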

Seventy-two percent of the respondents said that they didn’t have RECAP installed on any computers in their library.  And, of that group, 12% don’t use the plug-in because they use IE or Chrome for their browser (plug-in not compatible with those browsers);  15% don’t have the plug-in installed because their employers don’t allow it; and the largest part, 58%, don’t have it installed because they are unfamiliar with the plug-in.

And, the last question asked if respondents provided training on RECAP or taught RECAP in advanced legal research courses.  Only 6% of the respondents said “yes” to this question.  Ninety-four percent are not providing any training or instruction on RECAP.  (Note: We have been showing our students how to use RECAP and we find that our clinic students are often most receptive to this type of training.)

I hope that after reading this survey, more librarians might want to learn more about RECAP and try to use it at work and with their patrons.  Given the new look and feel of the PACER website (launched this weekend), it is good to know that the RECAP plug-in still works just fine.  What a good time to install RECAP.

More than One Document a Minute

The headline from the Internet Archive posting reads: “Millions of documents from over 350k federal court cases now freely available.”

The millions of documents are all from PACER by way of the RECAP plugin.

As the posting states:

“RECAP is a Firefox Internet browser extension that allows users of the PACER to get free copies of documents they would normally pay for when the Archive has a copy, and if it is not available to then automatically donate the documents after they purchase them from PACER for future users. Therefore the repository on the Internet Archive grows as people use the PACER system with this plug-in. We are currently getting more than one document a minute and some large holdings are being uploaded. We hope that the government will eventually put all of these documents in an open archive, but until then this repository will grow with use.”

Wow.  Growing faster than one document a minute!  (Right now: stop what you are doing and check to see if you have the RECAP plugin installed on your machine — every little bit helps.)

To visit this collection and search the content, go to www.archive.org/details/usfederalcourts.  There you will be able to browse by date (the other browsing features aren’t operational).  You can also do an Advanced Search on the Internet Archive and keyword search through all the available materials by limiting to the Collection Type = usfederalcourts.   VERY COOL.
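For anyone who wants to script that keyword search, here is a minimal sketch that only builds the search URL (it makes no network request). The `advancedsearch.php` endpoint and its `q`, `fl[]`, `rows` and `output` parameters are my understanding of the Internet Archive's search interface, not something taken from the posting, and `build_search_url` is a made-up helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: archive.org's advancedsearch.php endpoint and its parameters.
# None of this appears in the original posting.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(keywords: str, rows: int = 50) -> str:
    """Return a search URL restricted to the usfederalcourts collection."""
    query = f"collection:usfederalcourts AND ({keywords})"
    params = [
        ("q", query),
        ("fl[]", "identifier"),   # fields to return for each hit
        ("fl[]", "title"),
        ("rows", str(rows)),      # number of results per page
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("patent infringement")
print(url)

# Decode the query string again to confirm what will be sent.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

Pasting the printed URL into a browser should be roughly equivalent to filling in the Advanced Search form by hand.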

And, might I add: FREE!

I checked with the good folks who created RECAP at Princeton University’s Center for Information Technology Policy, and they said that, for now, only the high-level case metadata in the RECAP/Internet Archive collection is indexed and searchable by the likes of Google; the underlying dockets, documents and briefs are still hidden from the search robots because of privacy concerns.

Public Means Online

Today’s Washington Post features an editorial supporting the new Public Online Information Act, H.R. 4858.

[Rep. Steve Israel, D-NY] “has introduced the Public Online Information Act (POIA), a sensible and modest bill that could nevertheless be a catalyst for important changes in how the federal government thinks about and handles public information. It could also lead to greater transparency in the workings of the government.”

As the folks at the Sunlight Foundation have noted: “public means online.”

However, the realities of getting the bill passed mean that it does have its limits.  Most notably, “public information generated by Congress, including real-time lobbying registrations, is exempt from the mandatory provision, as are public filings within the judicial branch.”

But with Law.gov and other transparency efforts ongoing, we can be hopeful for even bigger changes down the road.

Yesterday, Carl Malamud gave a rousing talk to the NOCALL Spring Institute about Law.gov.

[By the way, NOCALL throws down an amazing Spring Institute every year — this year was no exception!  Besides the terrific parties, they always pull in a great range of speakers and topics, from Ryan Calo (on Privacy Tools) to Mark Sirkin (on New Roles in the Law Firm of the Future).  Many attendees spoke highly of the forum on the Google Book Settlement, featuring Mary Minow, Gary Reback and Andrew Bridges.  On Saturday, I enjoyed demonstrating the awesome RECAP plug-in — hopefully, more folks will be downloading PACER court documents to the archive. ]

Malamud’s inspirational Law.gov talk got the crowd buzzing.  NOCALL members are already involved in the prototype of a national law inventory for the Law.gov effort.  And, invigorating talks like this one should help spread the word and add more volunteers to the project.    As Malamud mentioned, the inventory will help provide key metrics for the Law.gov report (for example: how many municipalities assert copyright over their regulations).

While the California legal inventory is now underway, more work is needed [READ: please contact me if you would like to volunteer to help!].  And, other AALL chapters/working groups should be starting their legal inventory projects very shortly.

For those who are still curious about Law.gov and for those who are contemplating volunteering for their own state legal inventory project(s), I encourage you to view at least one of Malamud’s Law.gov talks online and/or read his “By the People” pamphlet.

Stay tuned…As “public means online,” Law.gov equals change.

RECAP: cracking open US courtrooms

Any mention of PACER or RECAP will get our attention.   Just this week, RECAP made the news across the pond.  “RECAP: cracking open US courtrooms:  Access to US legal files is being transformed by a Napster-like sharing system called RECAP” by Bobbie Johnson appeared in the Guardian (11/11/2009).

The article starts off: “The legal system is often accused of lagging behind the technological curve – indeed, it is only a couple of years since a high court judge made headlines by saying: ‘I don’t really understand what a website is.’ He later said that the remarks were taken out of context.”  [For more on *that* judge, see here and here.]

Quickly, Johnson moves on to discuss the development of PACER, the Administrative Office of the U.S. Courts’ site for Public Access to Court Electronic Records.  (And, blogged about a lot on this site, here, here and here….)

The article continues:

“Their RECAP tool, as the name suggests, aims to turn PACER on its head: by making legal documents more easily available, and dramatically reducing the cost.

“All of the stuff in Pacer is, essentially, part of the law of the land,” says Harlan Yu, a Princeton PhD student and one of the trio behind Recap. “Our nation is governed by laws, and we feel like the law should be accessible to all. And being accessible, in this day and age, means that the law should be online where it’s most accessible to citizens in a way that is free.””

As the article closes, it brings up some of the privacy concerns confronting RECAP and PACER right now.

 

“For advocates, the bigger question is whether PACER objects: opening access to legal documents is an important part of expanding free data and free information. After all, it was Thomas Jefferson – who made his living practicing the law, among other things – who said that “information is the currency of democracy”.”

For now, the A.O. is in the midst of a survey and evaluation of PACER and the pilot program might re-launch sometime soon... In the meantime, go ahead and install RECAP on your library machines.



US Courts & RECAP – the Latest News

Paul Alan Levy on the Consumer Law and Policy blog writes:

“I got a call this afternoon from Michel Ishakian, the Deputy Chief for IT Policy and Budget at the Administrative Office of the United States courts.  She assured me that they have no problem with counsel using RECAP (discussed here) and that the language sent out by the Northern District of Georgia (see my update to my previous post) is the only language that she disseminated for publication.  She also indicated that she has been in touch with Ed Felten (under whose auspices RECAP was developed) and that, so far as she can tell, he and she are on the same page.

To the extent that messages from some districts sounded more severe, it was simply a matter of reminding all of our ECF filers to be careful about computer security and was not intended to discourage use of RECAP.”

I wonder if there will be a new series of e-mails from the courts to this effect….

Geeks seek to make the law Googleable; RECAP in WSJ

Buried on page W13 of today’s Wall Street Journal  is a must-read piece by Katherine Mangu-Ward, “Transparency Chic.”

As the author makes clear:

. . . no aspect of government remains more locked down than the secretive, hierarchical judicial branch. Digital records of court filings, briefs and transcripts sit behind paywalls like Lexis and Westlaw. Legal codes and judicial documents aren’t copyrighted, but governments often cut exclusive distribution deals, rendering other access methods a bit legally questionable. . . .

Which leads her to discuss RECAP:

. . . [Stephen Schultze, Tim Lee and Harlan Yu] whipped up a sleek little add-on to the popular Firefox Internet browser called RECAP (PACER spelled backward). Legit users of the federal court system download it. Then each time they drop eight pennies, it deposits a copy of the page in the free Internet archive. This data joins other poached information, all of which is formatted, relabeled and made searchable—the kind of customer service government tends to skimp on. . . .

This might be the first mainstream press mention of RECAP, which is something we are all abuzz about here.

The author of the Wall Street Journal piece, Katherine Mangu-Ward, a senior editor at Reason magazine, is apparently a bit of a geek herself, giving a Twitter shoutout to those who helped her write the piece:

@kmanguward Thanks @binarybits @carlmalamud @cshirky @evwayne for info, perspective, and snappy quotes in “Transparency Chic” http://tinyurl.com/navyvj

@evwayne is, of course, our very own Erika Wayne who was interviewed for the piece.

A Note on RECAP’s Commitment to Privacy

Posted on the RECAP site today:

A Note on RECAP’s Commitment to Privacy

We’ve gotten our first official reaction from the judiciary, in the form of a statement on the New Mexico Bankruptcy court’s website. It contains two important points about the PACER terms of use, and a misleading statement about privacy that we want to correct.

First, the good news: the court acknowledges the point we’ve made before: use of RECAP is consistent with the law and the PACER terms of use. The only potential exception is if you’ve received a fee waiver for PACER. In that case, use of RECAP could violate the terms of the fee waiver, which reads: “Any transfer of data obtained as the result of a fee exemption is prohibited unless expressly authorized by the court.” We’re not lawyers, so we don’t know if the court’s interpretation is correct, but we encourage our users to honor the terms of the fee waiver.

Now, an important correction. The statement raises the concern that RECAP could compromise sealed or private documents that attorneys access via the CM/ECF, the system attorneys use for electronic filing and retrieval of documents in pending cases. Protecting privacy is our top priority, and we specifically designed RECAP to safeguard the privacy of CM/ECF documents. As we describe in our privacy FAQ, RECAP is carefully designed not to upload documents from the CM/ECF system. When a user logs into the CM/ECF system, a cookie is set on the user’s browser that’s different from the cookie that’s set when a user is logged into the public PACER system. RECAP monitors for this cookie and automatically deactivates itself whenever the user is logged into CM/ECF. We tested this thoroughly, with some CM/ECF users, before we released the public beta.

We’re confident that RECAP maintains the security model set up by the courts, and that it will never upload documents while a user is logged into CM/ECF. The code is open source, so anyone with concerns is welcome to inspect it for themselves. We’d like to work with the judiciary in the coming weeks to ensure they understand how RECAP protects privacy and security, and to incorporate any further enhancements they might suggest. In the meantime, users can continue using RECAP with the knowledge that it’s designed with privacy as our top priority.
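To make the cookie check described in the note above concrete, here is a minimal sketch of the idea. It is not RECAP's actual code (the real extension is open source and can be inspected directly), and the cookie names `ecf_session` and `pacer_session` are invented for illustration, since the note does not give the real ones.

```python
# Hypothetical cookie names: the real names used by PACER and CM/ECF are not
# given in the note, and this is not RECAP's actual code.
ECF_SESSION_COOKIE = "ecf_session"
PACER_SESSION_COOKIE = "pacer_session"


def uploads_enabled(cookies: dict[str, str]) -> bool:
    """Allow uploads only for a public PACER session with no CM/ECF login."""
    if ECF_SESSION_COOKIE in cookies:
        # A CM/ECF login is active, so the extension deactivates itself.
        return False
    # Otherwise uploads are allowed only when a public PACER session exists.
    return PACER_SESSION_COOKIE in cookies


public_only = uploads_enabled({"pacer_session": "abc"})
with_ecf = uploads_enabled({"pacer_session": "abc", "ecf_session": "xyz"})
logged_out = uploads_enabled({})

print(public_only, with_ecf, logged_out)
```

The design point is that the CM/ECF check comes first: if that cookie is present, nothing else can switch uploading back on.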

RECAP: Turning PACER Around

Meet  RECAP (http://recapthelaw.org) – Be impressed.  Very impressed.

PACER (Public Access to Court Electronic Records) documents sit behind a pay-wall; however, these are public record documents, so once a document has been retrieved from PACER, it may be freely shared.

RECAP  enables us to easily share federal court documents. The goal of this project is to publish an extensive archive of these documents to the public for free (our favorite word).

RECAP is an extension to the Firefox browser. There is a video on the site that demonstrates the extension in action.

RECAP works with Firefox to upload case dockets and documents that you have paid for to the public archive, and notifies you when free versions of documents are already available.
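The sharing loop just described can be sketched in a few lines. This is a toy, in-memory illustration under my own assumptions, not how the extension is implemented: `PublicArchive`, `fetch_document` and `fake_pacer` are hypothetical names, and the real extension notifies you that a free copy exists rather than fetching it for you.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PublicArchive:
    """In-memory stand-in for the shared public archive (hypothetical)."""
    documents: dict[str, bytes] = field(default_factory=dict)


def fetch_document(
    doc_id: str,
    archive: PublicArchive,
    buy_from_pacer: Callable[[str], bytes],
) -> tuple[bytes, str]:
    """Return (content, source). Prefer the free copy; otherwise buy and share."""
    if doc_id in archive.documents:
        return archive.documents[doc_id], "archive (free)"
    content = buy_from_pacer(doc_id)       # the user pays the PACER fee
    archive.documents[doc_id] = content    # the copy is shared for future users
    return content, "PACER (paid, then uploaded)"


def fake_pacer(doc_id: str) -> bytes:
    """Stand-in for a paid PACER download."""
    return f"contents of {doc_id}".encode()


archive = PublicArchive()
first = fetch_document("case-123/doc-1", archive, fake_pacer)
second = fetch_document("case-123/doc-1", archive, fake_pacer)

print(first[1])
print(second[1])
```

The first request pays and uploads; the second request for the same document is served from the archive at no cost, which is why the collection grows simply through ordinary use.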

RECAP is a project of the Center for Information Technology Policy at Princeton University.  It is one of several projects that harness the power of the web to increase government transparency.