= epubgrep(1)
:doctype: manpage
:Author: tastytea
:Email: tastytea@tastytea.de
:Date: 2021-05-24
:Revision: 0.0.0
:man source: epubgrep
:man manual: General Commands Manual
== NAME
epubgrep - Search tool for EPUB ebooks.
== SYNOPSIS
*epubgrep* [_OPTION_]… _PATTERN_ [_FILE_]…
== DESCRIPTION
*epubgrep* searches EPUB files in a way similar to grep. It uses the same names
for command line switches where possible. However, not all grep switches are
implemented, and some additional switches have been added.
== OPTIONS
*-h*, *--help*::
Display a short help message and exit.
*-V*, *--version*::
Show version, copyright and license.
*-G*, *--basic-regexp*::
_PATTERN_ is a POSIX basic regular expression. This is the default.
*-E*, *--extended-regexp*::
_PATTERN_ is a POSIX extended regular expression.
*--grep*::
In combination with *--basic-regexp* or *--extended-regexp*, _PATTERN_ is
treated as a newline-separated list of expressions. A match is found if any of
the expressions in the list matches.
*-P*, *--perl-regexp*::
_PATTERN_ is a Perl regular expression.
*-i*, *--ignore-case*::
Ignore case distinctions in pattern and data.
*-e* _PATTERN_, *--regexp* _PATTERN_::
Use additional _PATTERN_ for matching. Can be used more than once.
*-a*, *--raw*::
Do not clean up text before searching. No HTML stripping, no newline removal.
*-C* _NUMBER_, *--context* _NUMBER_::
Print _NUMBER_ words of context around matches.
== USAGE
[source,shellsession]
--------------------------------------------------------------------------------
$ epubgrep -i makhno -C 4 The_Bolshevik_Myth.epub
OPS/piece000038.xhtml, Chapter 33. Dark People, page 141: in the campaign against Makhno, and they were exchanging
--------------------------------------------------------------------------------
The output has the form <file path in epub>, <last headline>, <page number>:
<context before><match><context after>. The last headline and the page number
may not be available.
=== Differences from grep
epubgrep does not operate on lines, but on whole files. This means you can
search for text spanning multiple lines. All newlines will be replaced by spaces
and HTML will be stripped. Use *--raw* if you want to search in the raw files
instead.
=== Configuration
Every command line switch can be used as an option in the configuration file. If
the switch takes no value (it is a simple on switch), it has to be written as
`option = 1`. Do not put quotation marks around the values; they would be taken
literally.
Command line options override configuration file options. Options that can
occur more than once are merged.
==== Example configuration file
This example makes epubgrep always use Perl regular expressions and search for
mentions of the words thyme and oregano in every book.
[source,cfg]
--------------------------------------------------------------------------------
perl-regexp = 1
regexp = \b[Tt]hyme\b
regexp = \b[Oo]regano\b
--------------------------------------------------------------------------------
// == EXAMPLES
== FILES
*Configuration file*::
* If `XDG_CONFIG_HOME` is defined: `${XDG_CONFIG_HOME}/epubgrep.conf`
* If `HOME` is defined: `${HOME}/.config/epubgrep.conf`
* Otherwise: `epubgrep.conf`
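
The lookup order above can be sketched as a small shell function. This is only
an illustration of the documented order, not code taken from epubgrep. The
function name `epubgrep_config_path` is hypothetical, and treating an empty
variable the same as an undefined one is an assumption.

[source,shell]
--------------------------------------------------------------------------------
# Hypothetical helper: print the configuration file path, following the
# order listed under FILES.
# Assumption: an empty variable is treated like an undefined one.
epubgrep_config_path() {
    if [ -n "${XDG_CONFIG_HOME:-}" ]; then
        printf '%s\n' "${XDG_CONFIG_HOME}/epubgrep.conf"
    elif [ -n "${HOME:-}" ]; then
        printf '%s\n' "${HOME}/.config/epubgrep.conf"
    else
        # Neither variable is usable: fall back to the bare file name.
        printf '%s\n' "epubgrep.conf"
    fi
}
--------------------------------------------------------------------------------

With the function defined, `"${EDITOR:-vi}" "$(epubgrep_config_path)"` would
open the file at that location for editing.
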
== KNOWN BUGS
EPUB files with non-ASCII file names only work reliably when the system locale
uses an encoding that has the necessary characters. Technically, EPUBs must use
UTF-8 for file names, but it is usually recommended to use only ASCII (ASCII is
valid UTF-8). If your system locale is not UTF-8, files may be silently skipped.
You can work around this by calling epubgrep like this:
`LC_ALL="C.UTF-8" epubgrep`
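
To apply the workaround only when it is needed, a wrapper along these lines may
help. It is a sketch, not part of epubgrep: the function name `epubgrep_utf8`
is hypothetical, and it assumes that `locale charmap` reports `UTF-8` for UTF-8
locales and that the `C.UTF-8` locale exists on the system.

[source,shell]
--------------------------------------------------------------------------------
# Hypothetical wrapper: run epubgrep under C.UTF-8 unless the current
# locale already uses UTF-8.
epubgrep_utf8() {
    if [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]; then
        epubgrep "$@"
    else
        # LC_ALL is set for this single invocation only.
        LC_ALL="C.UTF-8" epubgrep "$@"
    fi
}
--------------------------------------------------------------------------------

The wrapper passes all arguments through unchanged, so it can be called with
the same switches as epubgrep itself.
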
== REPORTING BUGS
Bugtracker: https://schlomp.space/tastytea/epubgrep/issues
E-mail: tastytea@tastytea.de
== SEE ALSO
*perlre*(1)
// LocalWords: epubgrep