= epubgrep(1)
:doctype: manpage
:Author: tastytea
:Email: tastytea@tastytea.de
:Date: 2021-05-27
:Revision: 0.0.0
:man source: epubgrep
:man manual: General Commands Manual

== NAME

epubgrep - Search tool for EPUB ebooks.

== SYNOPSIS

*epubgrep* [_OPTION_]… _PATTERN_ [_FILE_]…

== DESCRIPTION

*epubgrep* searches EPUB files in a similar way to grep. It uses the same names
for command line switches where possible. However, not all grep switches are
implemented, and some switches have been added.

== OPTIONS

*-h*, *--help*::
Display a short help message and exit.

*-V*, *--version*::
Show version, copyright and license.

*-G*, *--basic-regexp*::
_PATTERN_ is a POSIX basic regular expression. This is the default.

*-E*, *--extended-regexp*::
_PATTERN_ is a POSIX extended regular expression.

*--grep*::
In combination with *--basic-regexp* or *--extended-regexp*, _PATTERN_ is
treated as a newline-separated list of expressions. A match is found if any of
the expressions in the list matches.

*-P*, *--perl-regexp*::
_PATTERN_ is a Perl regular expression.

*-i*, *--ignore-case*::
Ignore case distinctions in pattern and data.

*-e* _PATTERN_, *--regexp* _PATTERN_::
Use an additional _PATTERN_ for matching. Can be used more than once.

*-a*, *--raw*::
Do not clean up text before searching. No HTML stripping, no newline removal.

*-C* _NUMBER_, *--context* _NUMBER_::
Print _NUMBER_ words of context around matches.

*--nocolor*::
Do not color matches.

*--no-filename* _WHICH_::
Suppress file names in the output. _WHICH_ is ‘filesystem’ for the file names
on your file system, ‘in-epub’ for the file names inside the EPUB, or ‘all’ for
both. Chapters and page numbers will still be output.

*-r*, *--recursive*::
Read all files under each directory, recursively, following symbolic links only
if they are on the command line. Silently skips directories that are not
readable by the user.

*-R*, *--dereference-recursive*::
Read all files under each directory, recursively. Follow all symbolic
links. Silently skips directories that are not readable by the user.

== USAGE

[source,shellsession]
--------------------------------------------------------------------------------
$ epubgrep -i makhno -C 4 The_Bolshevik_Myth.epub
OPS/piece000038.xhtml, Chapter 33. Dark People, page 141: in the campaign against Makhno, and they were exchanging
--------------------------------------------------------------------------------

The output format is <file path in epub>, <last headline>, <page number>:
<context before><match><context after>. <last headline> and <page number> may
not be available.

=== Differences to grep

epubgrep does not operate on lines, but on whole files. All newlines are
replaced by spaces (consecutive newlines are condensed into one space) and HTML
is stripped. This means you can search for text spanning multiple lines and
don't have to worry about HTML tags in the text. Use *--raw* if you want to
search in the raw files instead.
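The effect of this clean-up on matching can be approximated with standard
tools. The sketch below is not epubgrep's implementation: `normalize_text` is a
hypothetical helper that removes anything between `<` and `>` on a line and
then turns newlines into single spaces, which is much cruder than real HTML
stripping.

```shell
# Hypothetical helper, not part of epubgrep: approximates the clean-up
# described above. Tags that span several lines are not handled.
normalize_text() {
    sed -e 's/<[^>]*>//g' | tr -s '\n' ' '
}

# A phrase that is split across two lines and wrapped in markup ...
sample='<p>in the campaign against
<em>Makhno</em>, and they</p>'

# ... is found once the text has been normalized (prints the match count).
printf '%s\n' "$sample" | normalize_text | grep -c 'against Makhno'
```

Running plain grep on the unmodified sample would find nothing, because the
phrase is interrupted by a line break and an `<em>` tag.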

=== Configuration

Every command line switch can be used as an option in the configuration file. If
the switch takes no value (it is a simple on switch), it has to be written as
`option = 1`. Do not put quotation marks around the values; they would be taken
literally.

Command line options override configuration file options. Options that can
occur more than once are merged.

==== Example configuration file

This example makes epubgrep always suppress all file names in the output, print
2 words of context around matches (unless overridden on the command line) and
search for mentions of the words thyme and oregano in every book.

[source,cfg]
--------------------------------------------------------------------------------
no-filename = all
context = 2
regexp = [Tt]hyme
regexp = [Oo]regano
--------------------------------------------------------------------------------
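To try out the precedence and merging rules without touching a real
configuration, one option is to point `XDG_CONFIG_HOME` (see FILES) at a scratch
directory. The following is a sketch under assumptions: the directory comes from
`mktemp`, the key `ignore-case` is derived from the long switch name as the
rules above suggest, and `book.epub` and the basil pattern are placeholders.

```shell
# Scratch directory that stands in for the real configuration directory.
confdir="$(mktemp -d)"

# A switch without a value is written as "option = 1"; values are unquoted.
cat > "${confdir}/epubgrep.conf" <<'EOF'
ignore-case = 1
context = 2
regexp = [Tt]hyme
EOF

# Show what was written.
cat "${confdir}/epubgrep.conf"

# With this file in place, a call such as the following (placeholder book and
# pattern) should, going by the rules above, print 5 words of context instead
# of 2 and add the -e pattern to the regexp entry from the file:
#
#   XDG_CONFIG_HOME="$confdir" epubgrep -C 5 -e '[Oo]regano' '[Bb]asil' book.epub
```

The scratch directory can be removed afterwards with `rm -r "$confdir"`.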

// == EXAMPLES

== FILES

*Configuration file*::
* If `XDG_CONFIG_HOME` is defined: `${XDG_CONFIG_HOME}/epubgrep.conf`
* If `HOME` is defined: `${HOME}/.config/epubgrep.conf`
* Otherwise: `epubgrep.conf`
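The lookup order above can be written out as a small shell function, for
example to check which file would be read in the current environment. This is a
sketch, not code taken from epubgrep: the function name `epubgrep_config_path`
is made up, and it assumes that "defined" means "set", even if set to an empty
string.

```shell
# Hypothetical helper: prints the configuration file path according to the
# lookup order listed above. "Defined" is taken to mean "set" (assumption).
epubgrep_config_path() {
    if [ -n "${XDG_CONFIG_HOME+set}" ]; then
        printf '%s\n' "${XDG_CONFIG_HOME}/epubgrep.conf"
    elif [ -n "${HOME+set}" ]; then
        printf '%s\n' "${HOME}/.config/epubgrep.conf"
    else
        printf '%s\n' "epubgrep.conf"
    fi
}

# Print the path that applies to the current environment.
epubgrep_config_path
```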

== KNOWN BUGS

EPUB files with non-ASCII file names only work reliably when the system locale
uses an encoding that can represent the necessary characters. Technically,
EPUBs must use UTF-8 for file names, but it is usually recommended to use only
ASCII (ASCII is valid UTF-8). If your system locale is not UTF-8, files may be
silently skipped. You can work around this by calling epubgrep like this:
`LC_ALL="C.UTF-8" epubgrep`
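If epubgrep is called often on a system with a non-UTF-8 locale, the workaround
can be wrapped in a shell function. The sketch below is a convenience built on
assumptions, not part of epubgrep: the name `epubgrep_utf8` is invented, the
current encoding is read with `locale charmap`, and `C.UTF-8` is taken from the
workaround above and may not exist on every system.

```shell
# Hypothetical wrapper: overrides the locale only when the current character
# encoding is not reported as UTF-8.
epubgrep_utf8() {
    case "$(locale charmap 2>/dev/null)" in
        UTF-8|utf-8|UTF8|utf8)
            epubgrep "$@"
            ;;
        *)
            LC_ALL="C.UTF-8" epubgrep "$@"
            ;;
    esac
}
```

It takes the same arguments as epubgrep itself, for example
`epubgrep_utf8 -i makhno The_Bolshevik_Myth.epub`.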

== REPORTING BUGS

Bugtracker: https://schlomp.space/tastytea/epubgrep/issues

E-mail: tastytea@tastytea.de

== SEE ALSO

*perlre*(1)

// LocalWords: epubgrep