= epubgrep(1)
:doctype: manpage
:Author: tastytea
:Email: tastytea@tastytea.de
:Date: 2021-05-25
:Revision: 0.0.0
:man source: epubgrep
:man manual: General Commands Manual

== NAME

epubgrep - Search tool for EPUB ebooks.

== SYNOPSIS

*epubgrep* [_OPTION_]… _PATTERN_ [_FILE_]…

== DESCRIPTION

*epubgrep* searches EPUB files in much the same way grep searches text files.
It uses the same names for command line switches where possible. However, not
all grep switches are implemented, and some additional switches have been added.

== OPTIONS

*-h*, *--help*::
Display a short help message and exit.

*-V*, *--version*::
Show version, copyright and license.

*-G*, *--basic-regexp*::
_PATTERN_ is a POSIX basic regular expression. This is the default.

*-E*, *--extended-regexp*::
_PATTERN_ is a POSIX extended regular expression.

*--grep*::
In combination with *--basic-regexp* or *--extended-regexp*, _PATTERN_ is
treated as a newline separated list of expressions. A match is found if any of
the expressions in the list matches.

*-P*, *--perl-regexp*::
_PATTERN_ is a Perl regular expression.

*-i*, *--ignore-case*::
Ignore case distinctions in pattern and data.

*-e* _PATTERN_, *--regexp* _PATTERN_::
Use additional _PATTERN_ for matching. Can be used more than once.

*-a*, *--raw*::
Do not clean up text before searching: HTML is not stripped and newlines are
not removed.

*-C* _NUMBER_, *--context* _NUMBER_::
Print _NUMBER_ words of context around matches.
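
As a sketch of how the pattern switches combine: with *--grep*, a single
_PATTERN_ argument can carry several newline separated expressions, while
*--regexp* adds one expression per occurrence. The file name `book.epub` is a
placeholder, and the calls are skipped when epubgrep or the file is missing.
That the two calls find the same matches is an assumption based on the
descriptions above.

[source,shell]
--------------------------------------------------------------------------------
# Build a newline separated list of expressions for --grep.
# printf repeats its format once for every argument.
patterns=$(printf '%s\n' 'thyme' 'oregano')

# book.epub is a placeholder; nothing runs unless it and epubgrep exist.
if command -v epubgrep >/dev/null 2>&1 && [ -f book.epub ]; then
    # One PATTERN argument holding two expressions; either one may match.
    epubgrep --grep -E -i "$patterns" book.epub
    # Presumably comparable: one pattern plus an additional one via -e.
    epubgrep -E -i 'thyme' -e 'oregano' book.epub
fi
--------------------------------------------------------------------------------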

== USAGE

[source,shellsession]
--------------------------------------------------------------------------------
$ epubgrep -i makhno -C 4 The_Bolshevik_Myth.epub
OPS/piece000038.xhtml, Chapter 33. Dark People, page 141: in the campaign against Makhno, and they were exchanging
--------------------------------------------------------------------------------

Each match is printed as <file path in EPUB>, <last headline>, <page number>:
<context before><match><context after>. <last headline> and <page number> may
not be available.

=== Differences to grep

epubgrep operates on whole files rather than on lines. Before searching, all
newlines are replaced by spaces (consecutive newlines are condensed into a
single space) and HTML is stripped. This means you can search for text that
spans multiple lines without having to worry about HTML tags in the text. Use
*--raw* if you want to search the raw files instead.
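
The effect of this clean-up can be approximated with standard tools. The sketch
below is not epubgrep's implementation: the function name `normalize_text` is
made up for illustration, and the tag stripping is deliberately crude.

[source,shell]
--------------------------------------------------------------------------------
# Rough approximation of the clean-up described above (not epubgrep's code).
normalize_text() {
    # tr turns newlines into spaces and squeezes runs of spaces into one;
    # sed then removes anything that looks like an HTML tag.
    tr -s '\n' ' ' | sed 's/<[^>]*>//g'
}

# A phrase broken across lines and wrapped in tags becomes plain text.
printf '<p>the campaign\n\nagainst <em>Makhno</em></p>' | normalize_text
--------------------------------------------------------------------------------

The last command prints `the campaign against Makhno`, which a pattern such as
`campaign against` can now match even though the source text contained line
breaks and markup.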

=== Configuration

Every command line switch can be used as an option in the configuration file. If
the switch takes no value (it is a simple on/off switch), it has to be written
as `option = 1`. Do not use quotation marks around the values, because they will
be taken literally.

Command line options override configuration file options. Options that can
occur more than once are merged.

==== Example configuration file

This example makes epubgrep always use Perl regular expressions and search for
mentions of the words thyme and oregano in every book.

[source,cfg]
--------------------------------------------------------------------------------
perl-regexp = 1
regexp = \b[Tt]hyme\b
regexp = \b[Oo]regano\b
--------------------------------------------------------------------------------

// == EXAMPLES

== FILES

*Configuration file*::
* If `XDG_CONFIG_HOME` is defined: `${XDG_CONFIG_HOME}/epubgrep.conf`
* If `HOME` is defined: `${HOME}/.config/epubgrep.conf`
* Otherwise: `epubgrep.conf`
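
The lookup order above can be reproduced in a shell script, for example to find
out which file to edit. This is a minimal sketch: the function name
`epubgrep_config_path` is hypothetical, and treating an empty variable like an
undefined one is an assumption that epubgrep itself may not share.

[source,shell]
--------------------------------------------------------------------------------
# Print the configuration file path following the lookup order listed above.
# Assumption: an empty variable is treated like an undefined one.
epubgrep_config_path() {
    if [ -n "${XDG_CONFIG_HOME:-}" ]; then
        printf '%s\n' "${XDG_CONFIG_HOME}/epubgrep.conf"
    elif [ -n "${HOME:-}" ]; then
        printf '%s\n' "${HOME}/.config/epubgrep.conf"
    else
        # Neither variable is usable: fall back to the bare file name.
        printf '%s\n' "epubgrep.conf"
    fi
}

epubgrep_config_path
--------------------------------------------------------------------------------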

== KNOWN BUGS

EPUB files with non-ASCII file names only work reliably when the system locale
uses an encoding that contains the necessary characters. Technically, EPUBs must
use UTF-8 for file names, but using only ASCII is usually recommended (ASCII is
valid UTF-8). If your system locale is not UTF-8, files may be silently skipped.
You can work around this by calling epubgrep like this:
`LC_ALL="C.UTF-8" epubgrep`
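
If you hit this regularly, the workaround can be wrapped in a small shell
function. The sketch below uses the hypothetical name `epubgrep_utf8` and
assumes that the `C.UTF-8` locale from the workaround exists on your system. It
applies the override only when `locale charmap` does not already report UTF-8.

[source,shell]
--------------------------------------------------------------------------------
# Run epubgrep with a UTF-8 locale unless the current one already is UTF-8.
epubgrep_utf8() {
    if [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]; then
        epubgrep "$@"
    else
        # Same workaround as above, applied to this one call only.
        LC_ALL="C.UTF-8" epubgrep "$@"
    fi
}
--------------------------------------------------------------------------------

All arguments are passed through unchanged, so `epubgrep_utf8 -i makhno
The_Bolshevik_Myth.epub` behaves like the plain command with the locale
override applied where needed.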

== REPORTING BUGS

Bugtracker: https://schlomp.space/tastytea/epubgrep/issues

E-mail: tastytea@tastytea.de

== SEE ALSO

*perlre*(1)

// LocalWords: epubgrep