Search tool for EPUB e-books

epubgrep

epubgrep is a search tool for EPUB e-books. It does not operate on lines, but on whole files. All newlines will be replaced by spaces and HTML will be stripped. This means you can search for text spanning multiple lines and don't have to worry about HTML tags in the text.
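
The effect of this whole-file approach can be approximated with standard tools. The following is only a rough sketch and not how epubgrep is implemented: flatten_html is a hypothetical helper, and removing tags without inserting a space is an assumption made for this example.

```shell
# Hypothetical helper, not part of epubgrep: join all lines with spaces,
# drop anything that looks like an HTML tag, then collapse repeated spaces.
flatten_html() {
    tr '\n' ' ' | sed -e 's/<[^>]*>//g' -e 's/  */ /g' -e 's/ $//'
}

# The phrase "brown fox" is split by a closing tag and a line break.
sample='<p>The quick <em>brown</em>
fox jumps</p>'

flattened=$(printf '%s\n' "$sample" | flatten_html)

# A plain line-based grep on $sample would not find the phrase;
# on the flattened text it does.
printf '%s\n' "$flattened" | grep 'brown fox'
```

The last command prints the flattened sentence, because the phrase is now contiguous on a single line.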

epubgrep is licensed under the AGPL-3.0-only. The bundled Termcolor is licensed under the BSD-3-Clause license.

Usage

Screenshot of epubgrep, showing the output of 2 book searches.

See man page for more information.

Install

Gentoo

sudo eselect repository enable guru
echo 'app-text/epubgrep' | sudo tee -a /etc/portage/package.accept_keywords/epubgrep
sudo emaint sync -r guru
sudo emerge -a app-text/epubgrep

Debian and Ubuntu

wget -O - https://tastytea.de/tastytea.asc | sudo apt-key add -
sudo add-apt-repository 'deb https://apt.schlomp.space/[code name] [code name] main'
sudo apt install epubgrep

Replace [code name] with the code name of your installation. Packages are available for bullseye (Debian 11), buster (Debian 10), focal (Ubuntu 20.04) and bionic (Ubuntu 18.04).

Tip
If you get the error message that add-apt-repository was not found, install software-properties-common.
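
If you are unsure of your code name, it can usually be read from /etc/os-release. The sketch below assumes that file defines VERSION_CODENAME, which is the case on current Debian and Ubuntu releases but not guaranteed everywhere. codename_from is a hypothetical helper name.

```shell
# Hypothetical helper: print VERSION_CODENAME from an os-release style file.
codename_from() {
    # Source the file in a subshell so its variables do not leak.
    ( . "$1" && printf '%s\n' "${VERSION_CODENAME:-}" )
}

if [ -r /etc/os-release ]; then
    codename=$(codename_from /etc/os-release)
    # Print the repository line with the code name filled in.
    printf 'deb https://apt.schlomp.space/%s %s main\n' "$codename" "$codename"
fi
```

Check that the printed code name is one of those listed above before passing the line to add-apt-repository.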

From source

Dependencies

  • Tested OS: Linux

  • C++ compiler with C++17 support (tested: GCC 8/9/10, clang 6/11)

  • CMake (at least: 3.12)

  • Boost (tested: 1.75.0 / 1.65.0)

  • gettext (tested: 0.21 / 0.19)

  • libarchive (tested: 3.5 / 3.2)

  • fmt (tested: 7.0 / 4.0)

  • AsciiDoc (tested: 9.0 / 8.6)

  • Termcolor (tested: 2.0) (If not found, the bundled version is used.)

  • pugixml (tested: 1.11 / 1.8)

  • nlohmann_json (tested: 3.9 / 2.1)

  • Optional

    • Tests: Catch (tested: 2.13 / 1.10)

Install dependencies in Debian or Ubuntu

This also applies to distributions that are derived from Debian or Ubuntu. You will need at least Debian buster (10) or Ubuntu focal (20.04).

sudo apt install build-essential cmake libboost-program-options-dev \
                 libboost-locale-dev libboost-regex-dev libboost-log-dev \
                 gettext libarchive-dev libfmt-dev asciidoc libpugixml-dev \
                 nlohmann-json-dev

Tip
If nlohmann-json-dev cannot be found, try nlohmann-json3-dev.

Install dependencies in openSUSE

Tested on openSUSE Leap 15.3.

sudo zypper install cmake gcc10-c++ rpm-build \
                    libboost_program_options1_75_0-devel \
                    libboost_locale1_75_0-devel libboost_log1_75_0-devel \
                    fmt-devel libarchive-devel pugixml-devel \
                    nlohmann_json-devel asciidoc

Get source code

Release

Download the current release at schlomp.space.

Development version

git clone https://schlomp.space/tastytea/epubgrep.git

Compile

In a terminal, go to the directory where you unpacked / cloned the source code and then:

cmake -S . -B build
cmake --build build --parallel $(nproc --ignore=1)

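To install, run sudo cmake --install build. To run the tests, configure with -DWITH_TESTS=YES and then run ctest --test-dir build. The --test-dir option needs CMake 3.20 or newer; with older versions, run ctest from inside the build directory.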

Tip
If you are using Debian or Ubuntu, or a distribution that is derived from these, you can run cpack -G DEB in the build directory to generate a .deb file. You can then install it with apt install ./epubgrep-*.deb. If you are using a distribution that uses RPM packages, like openSUSE or Fedora, you can generate a package with cpack -G RPM and install it with zypper install ./epubgrep-*.rpm or dnf install ./epubgrep-*.rpm.

CMake options:

  • -DCMAKE_BUILD_TYPE=Debug for a debug build.

  • -DWITH_TESTS=YES if you want to compile the tests.

  • -DXGETTEXT_CMD=String to set the program to use instead of xgettext.

  • -DFALLBACK_BUNDLED=NO if you don't want to fall back on bundled libraries.

  • -DWITH_SANITIZER=YES to use sanitizers in debug builds.
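
The options can be combined on one command line. As a sketch, a debug build with tests and sanitizers might look like the following. The function name debug_build is made up for this example, and it assumes you are in the top-level source directory.

```shell
# Sketch: configure, build and test a debug build with sanitizers.
# Assumes the current directory is the top of the epubgrep source tree.
debug_build() {
    cmake -S . -B build \
        -DCMAKE_BUILD_TYPE=Debug \
        -DWITH_TESTS=YES \
        -DWITH_SANITIZER=YES &&
    cmake --build build --parallel "$(nproc --ignore=1)" &&
    # Running ctest inside the build directory also works with older CMake.
    (cd build && ctest)
}
```

Call debug_build to run all three steps. Each step only runs if the previous one succeeded.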

Similar projects

  • ripgrep-all can search EPUB files and strips HTML, but does not display page numbers or headings.

  • zipgrep from unzip can search EPUB files but does not strip HTML and does not display page numbers or headings.

Performance

A test with a directory containing 3333 EPUBs and 6269 files in total showed this difference between epubgrep-0.6.2 and ripgrep-all-0.9.6:

% hyperfine "epubgrep 'floor' ~/Books" "rga 'floor' ~/Books"
Benchmark #1: epubgrep 'floor' ~/Books
  Time (mean ± σ):     167.246 s ±  3.848 s    [User: 176.251 s, System: 79.107 s]
  Range (min … max):   161.533 s … 173.647 s    10 runs

Benchmark #2: rga 'floor' ~/Books
  Time (mean ± σ):      9.219 s ±  0.506 s    [User: 17.540 s, System: 12.773 s]
  Range (min … max):    8.571 s …  9.923 s    10 runs

Summary
  'rga 'floor' ~/Books' ran
   18.14 ± 1.08 times faster than 'epubgrep 'floor' ~/Books'
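
The summary line can be reproduced from the means and standard deviations above. The ratio is simply the quotient of the two means. The uncertainty shown here uses first-order error propagation for a quotient, which matches the reported figure, although this README does not say how hyperfine derives it.

```shell
# Recompute the speed-up and its uncertainty from the benchmark output.
speedup=$(LC_ALL=C awk 'BEGIN {
    m1 = 167.246; s1 = 3.848   # epubgrep: mean and standard deviation (s)
    m2 = 9.219;   s2 = 0.506   # rga: mean and standard deviation (s)
    ratio = m1 / m2
    # Relative errors of a quotient add in quadrature.
    err = ratio * sqrt((s1 / m1)^2 + (s2 / m2)^2)
    printf "%.2f %.2f", ratio, err
}')
echo "$speedup"
```

This prints 18.14 1.08, in agreement with the summary.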