
= epubgrep(1)
:doctype:       manpage
:Author:        tastytea
:Email:         tastytea@tastytea.de
:Date:          2021-05-27
:Revision:      0.0.0
:man source:    epubgrep
:man manual:    General Commands Manual

== NAME

epubgrep - Search tool for EPUB ebooks.

== SYNOPSIS

*epubgrep* [_OPTION_]… _PATTERN_ [_FILE_]…

== DESCRIPTION

*epubgrep* searches EPUB files in a way similar to grep. It uses the same names
for command line switches where possible. However, not all grep switches are
implemented, and some additional switches have been added.

== OPTIONS

*-h*, *--help*::
Display a short help message and exit.

*-V*, *--version*::
Show version, copyright and license.

*-G*, *--basic-regexp*::
_PATTERN_ is a POSIX basic regular expression. This is the default.

*-E*, *--extended-regexp*::
_PATTERN_ is a POSIX extended regular expression.

*--grep*::
In combination with *--basic-regexp* or *--extended-regexp*, _PATTERN_ is
treated as a newline-separated list of expressions. A match is found if any
expression in the list matches.

*-P*, *--perl-regexp*::
_PATTERN_ is a Perl regular expression.

*-i*, *--ignore-case*::
Ignore case distinctions in pattern and data.

*-e* _PATTERN_, *--regexp* _PATTERN_::
Use additional _PATTERN_ for matching. Can be used more than once.

*-a*, *--raw*::
Do not clean up text before searching. No HTML stripping, no newline removal.

*-C* _NUMBER_, *--context* _NUMBER_::
Print _NUMBER_ words of context around matches.

*--nocolor*::
Do not color matches.

*--no-filename* _WHICH_::
Suppress file names in the output. _WHICH_ is ‘filesystem’ for the file names on
your file system, ‘in-epub’ for the file names inside the EPUB, or ‘all’ for
both. Chapters and page numbers are still printed.

*-r*, *--recursive*::
Read all files under each directory, recursively, following symbolic links only
if they are on the command line. Silently skips directories that are not
readable by the user.

*-R*, *--dereference-recursive*::
Read all files under each directory, recursively. Follow all symbolic
links. Silently skips directories that are not readable by the user.
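
As an illustration of the newline-separated list that *--grep* expects, the
following sketch builds such a _PATTERN_ in the shell. The helper name
`search_herbs` and the two expressions are made up for this example; only the
switches come from the list above.

```shell
# Two expressions separated by a literal newline; either one may match.
patterns=$(printf '%s\n%s' '[Tt]hyme' '[Oo]regano')

# Hypothetical helper: pass the list as the positional PATTERN, followed by
# whatever EPUB files the caller supplies.
search_herbs() {
    epubgrep --extended-regexp --grep "$patterns" "$@"
}
```

Calling `search_herbs book.epub` would then search `book.epub` for either
expression.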

== USAGE

[source,shellsession]
--------------------------------------------------------------------------------
$ epubgrep -i makhno -C 4 The_Bolshevik_Myth.epub
OPS/piece000038.xhtml, Chapter 33. Dark People, page 141: in the campaign against Makhno, and they were exchanging
--------------------------------------------------------------------------------

The output is <file path in epub>, <last headline>, <page number>: <context
before><match><context after>. <last headline> and <page number> may not be available.
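
If the output has to be processed by a script, a line in this format can be
split with plain shell parameter expansion. This is only a sketch: the function
names are hypothetical, the output is assumed to be uncolored, and the approach
breaks if the file path or headline contains `, ` or `: `.

```shell
# Sample line copied from the session above.
line='OPS/piece000038.xhtml, Chapter 33. Dark People, page 141: in the campaign against Makhno, and they were exchanging'

# Print the file path inside the EPUB (first comma-separated field).
match_path() {
    prefix=${1%%: *}            # everything before the first ": "
    printf '%s\n' "${prefix%%, *}"
}

# Print the matched text with its context (everything after the first ": ").
match_context() {
    printf '%s\n' "${1#*: }"
}

match_path "$line"
match_context "$line"
```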

=== Differences from grep

epubgrep does not operate on lines, but on whole files. All newlines will be
replaced by spaces (multiple newlines will be condensed into one space) and HTML
will be stripped. This means you can search for text spanning multiple lines and
don't have to worry about HTML tags in the text. Use *--raw* if you want to
search in the raw files instead.
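
The effect of this clean-up can be approximated with standard tools. The
following is a rough sketch, not epubgrep's actual implementation; the function
name `normalize` and the sample markup are invented for illustration.

```shell
# Replace newlines with spaces (squeezing runs into one space), then strip
# anything that looks like an HTML tag and trim trailing spaces.
normalize() {
    tr -s '\n' ' ' | sed -e 's/<[^>]*>//g' -e 's/ *$//'
}

# A phrase broken across lines and wrapped in markup becomes one plain line.
printf '<p>in the campaign\n\nagainst <em>Makhno</em></p>\n' | normalize
```

After this step, a pattern such as `campaign against` matches even though the
two words were on different lines in the raw file.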

=== Configuration

Every command line switch can be used as an option in the configuration file. If
the switch has no value (it is a simple on switch), it has to be written as
`option = 1`. Do not put quotation marks around the values; they would be taken
literally.

Command line options override configuration file options. Options that can
occur more than once are merged.

==== Example configuration file

This example makes epubgrep always suppress the file names on output, print 2
words of context around matches (unless overridden on the command line) and
search for mentions of the words thyme and oregano in every book.

[source,cfg]
--------------------------------------------------------------------------------
no-filename = all
context = 2
regexp = [Tt]hyme
regexp = [Oo]regano
--------------------------------------------------------------------------------

// == EXAMPLES


== FILES

*Configuration file*::
* If `XDG_CONFIG_HOME` is defined: `${XDG_CONFIG_HOME}/epubgrep.conf`
* If `HOME` is defined: `${HOME}/.config/epubgrep.conf`
* Otherwise: `epubgrep.conf`
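
The lookup order above can be reproduced in a script, for example to find out
which file to edit. This is a sketch that follows the list literally; the
function name `epubgrep_config_path` is hypothetical, and treating "defined" as
"set, even if empty" is an assumption.

```shell
# Print the configuration file path according to the list above.
epubgrep_config_path() {
    if [ -n "${XDG_CONFIG_HOME+set}" ]; then
        printf '%s\n' "${XDG_CONFIG_HOME}/epubgrep.conf"
    elif [ -n "${HOME+set}" ]; then
        printf '%s\n' "${HOME}/.config/epubgrep.conf"
    else
        # Neither variable is set: relative path.
        printf '%s\n' "epubgrep.conf"
    fi
}

epubgrep_config_path
```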


== KNOWN BUGS

EPUB files with non-ASCII file names only work reliably when the system locale
uses an encoding which has the necessary characters. Technically EPUBs must use
UTF-8 for file names but it is usually recommended to only use ASCII (ASCII is
valid UTF-8). If your system locale is not UTF-8, files may be silently skipped.
You can work around this by calling epubgrep like this:
`LC_ALL="C.UTF-8" epubgrep`
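
To apply the workaround only when it is needed, a small wrapper can check the
current character map first. This is a sketch under assumptions: the function
names are hypothetical, and the `C.UTF-8` locale must exist on the system.

```shell
# Print the LC_ALL override to use for a given charmap name,
# or nothing if the charmap is already UTF-8.
utf8_override() {
    case "$1" in
        UTF-8|utf-8|utf8) ;;
        *) printf '%s' 'C.UTF-8' ;;
    esac
}

# Run epubgrep, forcing a UTF-8 locale only if the current one is not UTF-8.
epubgrep_utf8() {
    override=$(utf8_override "$(locale charmap 2>/dev/null)")
    if [ -n "$override" ]; then
        LC_ALL="$override" epubgrep "$@"
    else
        epubgrep "$@"
    fi
}
```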

== REPORTING BUGS

Bugtracker: https://schlomp.space/tastytea/epubgrep/issues

E-mail: tastytea@tastytea.de

== SEE ALSO

*perlre*(1)

//  LocalWords:  epubgrep