Gutenbit

Fast local search across public-domain literary works. Find, browse, and search books from your terminal or Python script.

Install

Gutenbit is not published on PyPI yet, so the quickest way to try it is to run it directly from the GitHub repo:

uvx --from git+https://github.com/keinan1/gutenbit gutenbit --help

If you want to keep it installed for repeated use:

uv tool install git+https://github.com/keinan1/gutenbit

Then run gutenbit --help. Remove it later with uv tool uninstall gutenbit.

Gutenbit stores its database and catalog cache in a .gutenbit/ folder.

To use gutenbit as a project dependency instead of a standalone CLI tool:

uv add git+https://github.com/keinan1/gutenbit

CLI

gutenbit catalog --author "Austen, Jane"                              # find Pride and Prejudice
gutenbit add 1342                                                     # download and store it
gutenbit toc 1342                                                     # inspect numbered sections
gutenbit view 1342                                                    # read the opening
gutenbit view 1342 --section 1 --forward 5                            # jump into chapter 1
gutenbit search "truth universally acknowledged" --book 1342 --phrase
gutenbit search "bennet" --book 1342 --limit 3 --radius 1             # read hits in context

All commands support --json for machine-readable output. CLI-managed state is stored under .gutenbit/ by default, including the database at .gutenbit/gutenbit.db and the catalog cache under .gutenbit/cache/.
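
Because every command accepts --json, the CLI can be driven from a script. The sketch below shows one way to do that from Python. It assumes gutenbit is on your PATH and writes a single JSON document to stdout. The helper name run_gutenbit_json is hypothetical, and the shape of the returned data is not described here, so inspect it before relying on particular keys.

```python
import json
import subprocess


def run_gutenbit_json(*args, executable="gutenbit"):
    """Run a gutenbit subcommand with --json and return the parsed output.

    Hypothetical helper: assumes the command writes one JSON document
    to stdout.
    """
    command = [executable, *args, "--json"]
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, run_gutenbit_json("toc", "1342") would run gutenbit toc 1342 --json and return whatever structure that command emits.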

Python

from gutenbit import Catalog, Database

catalog = Catalog.fetch()
book = catalog.get(1342)

if book is not None:
    with Database(".gutenbit/gutenbit.db") as db:
        db.ingest([book])
        for hit in db.search("truth universally acknowledged", book_id=1342):
            print(hit.title, hit.div1, hit.content[:80])
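
The loop above prints a raw 80-character slice of each hit, which can include line breaks. As a small sketch, the hypothetical helper below turns any object with the title, div1, and content attributes used above into a one-line summary. Nothing else about the hit type is assumed.

```python
def summarize_hit(hit, width=80):
    """Return a one-line summary of a search hit.

    Hypothetical helper: relies only on the title, div1, and content
    attributes shown in the example above.
    """
    # Collapse newlines and repeated spaces so the excerpt stays on one line.
    text = " ".join(hit.content.split())
    if len(text) > width:
        text = text[: width - 3] + "..."
    return f"{hit.title} | {hit.div1} | {text}"
```

Inside the search loop, print(summarize_hit(hit)) could replace the existing print call.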

Next steps

  • Getting Started walks through a complete workflow.
  • Python API covers the library in full.
  • CLI documents every subcommand and flag.
  • Concepts explains how chunking, divisions, and search work.
  • API Reference has auto-generated module documentation.

Project Gutenberg Access

Gutenbit is for individual downloads, not bulk downloading. It prefers official mirrors and uses the main site only as a zip fallback, with a default 2.0-second delay between downloads. Review Project Gutenberg's Robot Access Policy and Terms of Use.
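
To illustrate what a fixed delay between downloads means in practice, here is a minimal sketch of a throttle that waits until at least 2.0 seconds have passed since the previous request. This is not Gutenbit's implementation. The class name PoliteThrottle and its parameters are hypothetical, chosen only for illustration.

```python
import time


class PoliteThrottle:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, delay=2.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until `delay` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() before each download keeps requests at least delay seconds apart. The first call returns immediately because there is no earlier request to space out from.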