= epubgrep
:showtitle:
:toc: preamble
:project: epubgrep
:uri-base: https://schlomp.space/tastytea/{project}
:uri-branch-main: {uri-base}/src/branch/main
:uri-gcc: https://gcc.gnu.org/
:uri-clang: https://clang.llvm.org/
:uri-cmake: https://cmake.org/
:uri-catch: https://github.com/catchorg/Catch2
:uri-boost: https://www.boost.org/
:uri-gettext: https://www.gnu.org/software/gettext/
:uri-libarchive: https://www.libarchive.org/
:uri-fmt: https://github.com/fmtlib/fmt
:uri-asciidoc: http://asciidoc.org/
:uri-termcolor: https://termcolor.readthedocs.io/
:uri-pugixml: https://pugixml.org/
:uri-json: https://nlohmann.github.io/json/
:license: https://schlomp.space/tastytea/{project}/src/branch/main/LICENSE
:license-termcolor: https://schlomp.space/tastytea/{project}/src/branch/main/dist/termcolor/LICENSE
*{project}* is a search tool for EPUB e-books. It does not operate on lines, but
on whole files. All newlines will be replaced by spaces and HTML will be
stripped. This means you can search for text spanning multiple lines and don't
have to worry about HTML tags in the text.
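
The effect of this normalization can be illustrated with standard tools. The following is only a sketch of the idea, not how epubgrep is implemented; the `normalize` helper and the sample markup are made up for illustration.

```shell
# Hypothetical helper: mimic the normalization described above with
# standard tools. This is an illustration, not epubgrep's actual code.
normalize() {
    # Join all lines with spaces first, so that tags and phrases that
    # span several lines end up on one line, then drop anything that
    # looks like an HTML tag.
    tr '\n' ' ' | sed 's/<[^>]*>//g'
}

# Made-up sample: the phrase is split by a newline and an <em> tag.
printf '<p>Hello,\n<em>world</em>!</p>\n' | normalize | grep -o 'Hello, world!'
```

A line-based search for `Hello, world!` would miss this sample, because the phrase is interrupted by both a line break and a tag.
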
{project} is licensed under the link:{license}[AGPL-3.0-only]. The bundled
link:{uri-termcolor}[Termcolor] is licensed under the
link:{license-termcolor}[BSD-3-Clause] license.
== Usage
[alt="Screenshot of epubgrep, showing the output of 2 book searches."]
image::{uri-base}/raw/branch/main/screenshot.png[]
See the
https://schlomp.space/tastytea/{project}/src/branch/main/man/{project}.1.adoc[man
page] for more information.
== Install
[alt="Packaging status" link=https://repology.org/project/epubgrep/versions]
image::https://repology.org/badge/vertical-allrepos/epubgrep.svg[]
=== Gentoo
[source,shell]
--------------------------------------------------------------------------------
sudo eselect repository enable guru
echo 'app-text/epubgrep' | sudo tee -a /etc/portage/package.accept_keywords/epubgrep
sudo emaint sync -r guru
sudo emerge -a app-text/epubgrep
--------------------------------------------------------------------------------
=== Debian and Ubuntu
[source,shell]
--------------------------------------------------------------------------------
wget -O - https://tastytea.de/tastytea.asc | sudo apt-key add -
sudo add-apt-repository 'deb https://apt.schlomp.space/[code name] [code name] main'
sudo apt install epubgrep
--------------------------------------------------------------------------------
Replace _[code name]_ with the code name of your installation. Packages are
available for *bullseye* (Debian 11), *buster* (Debian 10), *focal* (Ubuntu
20.04) and *bionic* (Ubuntu 18.04).
[TIP]
If you get the error message that `add-apt-repository` was not found, install
`software-properties-common`.
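
If you are unsure which code name your installation uses, it can usually be read from `/etc/os-release`. A minimal sketch, assuming your distribution sets `VERSION_CODENAME` in that file; the `codename` helper is made up for illustration.

```shell
# Hypothetical helper: print the code name from an os-release style file.
# Defaults to /etc/os-release; assumes the file sets VERSION_CODENAME.
codename() {
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( . "${1:-/etc/os-release}" && printf '%s\n' "${VERSION_CODENAME}" )
}
```

Running `codename` without arguments prints the value for the running system. Check that it is one of the code names listed above before putting it into the repository line.
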
=== From source
==== Dependencies
* Tested OS: Linux
* C\++ compiler with C++17 support (tested: link:{uri-gcc}[GCC] 8/9/10,
link:{uri-clang}[clang] 6/11)
* link:{uri-cmake}[CMake] (at least: 3.12)
* link:{uri-boost}[Boost] (tested: 1.75.0 / 1.65.0)
* link:{uri-gettext}[gettext] (tested: 0.21 / 0.19)
* link:{uri-libarchive}[libarchive] (tested: 3.5 / 3.2)
* link:{uri-fmt}[fmt] (tested: 7.0 / 4.0)
* link:{uri-asciidoc}[AsciiDoc] (tested: 9.0 / 8.6)
* link:{uri-termcolor}[Termcolor] (tested: 2.0) (If not found, the bundled
version is used.)
* link:{uri-pugixml}[pugixml] (tested: 1.11 / 1.8)
* link:{uri-json}[nlohmann_json] (tested: 3.9 / 2.1)
* Optional
** Tests: link:{uri-catch}[Catch] (tested: 2.13 / 1.10)
===== Install dependencies in Debian or Ubuntu
These instructions also apply to distributions derived from Debian or Ubuntu.
You will need at least Debian buster (10) or Ubuntu focal (20.04).
[source,shell]
--------------------------------------------------------------------------------
sudo apt install build-essential cmake libboost-program-options-dev \
libboost-locale-dev libboost-regex-dev libboost-log-dev \
gettext libarchive-dev libfmt-dev asciidoc libpugixml-dev \
nlohmann-json-dev
--------------------------------------------------------------------------------
[TIP]
If `nlohmann-json-dev` cannot be found, try `nlohmann-json3-dev`.
===== Install dependencies in openSUSE
Tested on openSUSE Leap 15.3.
[source,shell]
--------------------------------------------------------------------------------
sudo zypper install cmake gcc10-c++ rpm-build \
libboost_program_options1_75_0-devel \
libboost_locale1_75_0-devel libboost_log1_75_0-devel \
fmt-devel libarchive-devel pugixml-devel \
nlohmann_json-devel asciidoc
--------------------------------------------------------------------------------
==== Get source code
===== Release
Download the current release at link:{uri-base}/releases[schlomp.space].
===== Development version
[source,shell]
--------------------------------------------------------------------------------
git clone https://schlomp.space/tastytea/epubgrep.git
--------------------------------------------------------------------------------
==== Compile
In a terminal, go to the directory where you unpacked or cloned the source code
and run:
[source,shell]
--------------------------------------------------------------------------------
cmake -S . -B build
cmake --build build --parallel $(nproc --ignore=1)
--------------------------------------------------------------------------------
To install, run `sudo cmake --install build`. To run the tests, run `ctest
--test-dir build`.
[TIP]
If you are using Debian or Ubuntu, or a distribution that is derived from these,
you can run `cpack -G DEB` in the build directory to generate a .deb file. You
can then install it with `+++apt install ./epubgrep-*.deb+++`.
If you are using a distribution that uses RPM packages, like openSUSE or Fedora,
you can generate a package with `cpack -G RPM` and install it with `+++zypper
install ./epubgrep-*.rpm+++` or `+++dnf install ./epubgrep-*.rpm+++`.
.CMake options:
* `-DCMAKE_BUILD_TYPE=Debug` for a debug build.
* `-DWITH_TESTS=YES` if you want to compile the tests.
* `-DXGETTEXT_CMD=String` to set the program to use instead of `xgettext`.
* `-DFALLBACK_BUNDLED=NO` if you don't want to fall back on bundled libraries.
* `-DWITH_SANITIZER=YES` to use sanitizers in debug builds.
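
The options above can be combined on one configure command line. The following sketch configures, builds and tests a debug build with sanitizers. The `run` and `build_debug` helpers and the `DRY_RUN` variable are made up for illustration and are not part of the project.

```shell
# Hypothetical wrapper: run a command, or only print it if DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: debug build with tests and sanitizers enabled.
# The build directory defaults to "build".
build_debug() {
    build_dir="${1:-build}"
    run cmake -S . -B "$build_dir" \
        -DCMAKE_BUILD_TYPE=Debug -DWITH_TESTS=YES -DWITH_SANITIZER=YES &&
        run cmake --build "$build_dir" --parallel &&
        run ctest --test-dir "$build_dir"
}
```

Call `DRY_RUN=1 build_debug` in the source directory to see the three commands without executing them, or `build_debug` to run them.
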
== Similar projects
* link:https://github.com/phiresky/ripgrep-all[ripgrep-all] can search EPUB
files and strips HTML, but does not display page numbers or headings.
* zipgrep from link:http://infozip.sourceforge.net/[unzip] can search EPUB files
but does not strip HTML and does not display page numbers or headings.
== Performance
A test with a directory containing 3333 EPUBs and 6269 files in total showed
this difference between epubgrep-0.6.2 and ripgrep-all-0.9.6:
[source,shellsession]
--------------------------------------------------------------------------------
% hyperfine "epubgrep 'floor' ~/Books" "rga 'floor' ~/Books"
Benchmark #1: epubgrep 'floor' ~/Books
Time (mean ± σ): 167.246 s ± 3.848 s [User: 176.251 s, System: 79.107 s]
Range (min … max): 161.533 s … 173.647 s 10 runs
Benchmark #2: rga 'floor' ~/Books
Time (mean ± σ): 9.219 s ± 0.506 s [User: 17.540 s, System: 12.773 s]
Range (min … max): 8.571 s … 9.923 s 10 runs
Summary
'rga 'floor' ~/Books' ran
18.14 ± 1.08 times faster than 'epubgrep 'floor' ~/Books'
--------------------------------------------------------------------------------
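
The figure in the summary can be reproduced from the two means and standard deviations. The sketch below assumes ordinary first-order error propagation for a quotient of independent values. Whether hyperfine computes it exactly this way is an assumption, but the result matches the output above. The `relative_speed` helper is made up for illustration.

```shell
# Hypothetical helper: ratio of two mean run times with its uncertainty.
# Arguments: slow_mean slow_sd fast_mean fast_sd
relative_speed() {
    LC_ALL=C awk -v m1="$1" -v s1="$2" -v m2="$3" -v s2="$4" 'BEGIN {
        ratio = m1 / m2
        # First-order error propagation for a quotient of two
        # independent quantities.
        sd = ratio * sqrt((s1 / m1) ^ 2 + (s2 / m2) ^ 2)
        printf "%.2f +/- %.2f\n", ratio, sd
    }'
}

# Values taken from the benchmark above; prints "18.14 +/- 1.08".
relative_speed 167.246 3.848 9.219 0.506
```
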
include::{uri-base}/raw/branch/main/CONTRIBUTING.adoc[]