Commit Graph

130 Commits

Author SHA1 Message Date
tastytea 03b367ee98
Don't print same file path twice in error message.
zip::exception always has the filename in the message.
2021-05-29 17:37:41 +02:00
tastytea 00e3edb9f2
Only search files in spine, in the right order.
The spine lists all content documents in their linear reading order. So we're
finally getting our results in the right order! 🎉

Since we skip the images and fonts, which usually make up the most bytes in an
EPUB file, the performance increase is immense. I measured 60-70% in a very
short test.

Closes: #1
2021-05-29 17:34:43 +02:00
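The spine lookup described in the commit above can be sketched as follows. This is only an illustration, not epubgrep's actual code: the function name `spine_hrefs` is hypothetical, and the regular expressions assume double-quoted attributes with `id` appearing before `href` in each manifest `<item>`. A real OPF file may need a proper XML parser.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the hrefs of the content documents in the
// order given by the spine. Assumes id="..." precedes href="..." in <item>.
std::vector<std::string> spine_hrefs(const std::string &opf)
{
    static const std::regex re_item{
        R"re(<item\s[^>]*\bid="([^"]+)"[^>]*\bhref="([^"]+)")re"};
    static const std::regex re_itemref{
        R"re(<itemref\s[^>]*\bidref="([^"]+)")re"};

    // Manifest: maps each item id to its path inside the EPUB.
    std::map<std::string, std::string> manifest;
    for (std::sregex_iterator it{opf.begin(), opf.end(), re_item}, end;
         it != end; ++it)
    {
        manifest[(*it)[1].str()] = (*it)[2].str();
    }

    // Spine: item ids in linear reading order.
    std::vector<std::string> hrefs;
    for (std::sregex_iterator it{opf.begin(), opf.end(), re_itemref}, end;
         it != end; ++it)
    {
        const auto found{manifest.find((*it)[1].str())};
        if (found != manifest.end())
        {
            hrefs.push_back(found->second);
        }
    }
    return hrefs;
}
```

Manifest entries that the spine does not reference, such as images and fonts, never end up in the result, which is where the skipped bytes come from.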
tastytea c94d9de0db
Reformat error messages.
One line per error message.
2021-05-29 12:53:14 +02:00
tastytea 4ff796a590
Make regular expressions static variables.
Fewer allocations → faster program. About 17% speed increase with 89 books on up
to 3 cores. Measured using the average of 4 runs.
Before: ~15.5 seconds
 After: ~12.8 seconds

Calls to allocation functions went down from 16,652,583 to 5,059,301.
2021-05-28 19:11:32 +02:00
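A minimal sketch of the technique named in the commit above, assuming a function that is called many times with the same pattern. The helper `strip_tags` and its pattern are hypothetical and not taken from epubgrep; the point is only that a function-local `static const std::regex` is compiled once, on the first call, instead of on every call.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes HTML-like tags from a string.
std::string strip_tags(const std::string &text)
{
    // Constructed once on the first call; later calls reuse the compiled
    // regex and skip the allocations of building it again.
    static const std::regex re_tag{"<[^>]*>"};
    return std::regex_replace(text, re_tag, "");
}
```

Since C++11 the initialization of a function-local static is thread-safe, so the same object can be shared by several worker threads as long as it is only read.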
tastytea 4df7b36dfc
Print matches while still searching.
Previously we printed the matches at the end.
2021-05-28 17:18:34 +02:00
tastytea 59759b5934
Put output stuff into own function in different file.
It got a little crowded in main(). 😊
2021-05-28 17:07:11 +02:00
tastytea 308e2d271f
Skip rest of file if encoding of files in EPUB is broken.
Standard says UTF-8. I don't want to deal with weird Windows-encodings or
whatever this is.

Closes: #7
2021-05-28 13:57:51 +02:00
tastytea 65b46ca846
Do not allow more threads than max_threads. 2021-05-28 11:48:38 +02:00
tastytea c3131e01f0
Add setting to suppress this-is-not-an-EPUB errors. 2021-05-27 21:48:35 +02:00
tastytea 84f600196c
Add error code to zip::exception. 2021-05-27 21:39:01 +02:00
tastytea b96315f8bb
Don't add extra newlines before errors. 2021-05-27 21:03:42 +02:00
tastytea 2b91a839cc
Add --raw and --context again.
Forgot to re-implement them when I overhauled the option parsing…
2021-05-27 21:01:07 +02:00
tastytea 8d5565a72c
Don't write to matches_all simultaneously from different threads.
What did I do yesterday?!? 😬

Closes: #6
2021-05-27 20:42:20 +02:00
tastytea 38bf9be948
Fix some more memory leaks. 2021-05-27 20:11:59 +02:00
tastytea b24ea9b71e
Fix memory leak. 🤦
That's why I don't write C. 😄

This seems to fix issue #6 in single-threaded mode but sometimes throws “double
free or corruption (out)” in multi-threaded mode.

Bug: #6
2021-05-27 20:05:02 +02:00
tastytea fbb87cac81
Remove a few unnecessary .data(), remove unnecessary include. 2021-05-27 19:08:53 +02:00
tastytea c50659a339
Chunk error string to make it easier to translate. 2021-05-27 17:24:19 +02:00
tastytea e64591f204
Rework option parsing, change --no-filename.
Options are now more accessible; --no-filename accepts the values filesystem,
in-epub, or all.
2021-05-27 17:20:00 +02:00
tastytea c376ce8466
Print the EPUB file name if more than 1 input file.
Change --no-filename to mean: Don't print the EPUB file name.
2021-05-27 14:46:23 +02:00
tastytea 0c45e7ac98
Add --recursive and --dereference-recursive.
Closes: #5
2021-05-27 14:45:52 +02:00
tastytea b764f5423c
Put input files into a std::vector<filesystem::path>.
We need that for supporting recursive directory search later.

#
# Previous commits:
#   29ae22c Make regex const.
#   8ed72af Update german translation.
#   a3b0964 Remove old comment.
#   d107ce5 Modify config file example.
2021-05-27 13:46:47 +02:00
tastytea 29ae22cc4a
Make regex const. 2021-05-27 09:46:59 +02:00
tastytea a3b0964873
Remove old comment. 2021-05-26 20:20:21 +02:00
tastytea 7dcf6d599c
Remove debug statements. 2021-05-26 18:25:53 +02:00
tastytea fe02b155f5
Import std::string into epubgrep::search namespace.
2021-05-26 18:02:27 +02:00
tastytea fc0aa02bc9
Use threads if more than one input file is searched.
Use 75% of the available threads (rounded up).

Closes: #4
2021-05-26 17:50:52 +02:00
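The 75% rule from the commit above can be written with integer arithmetic. This is a sketch under assumptions: the names `worker_count` and `default_worker_count` are hypothetical, and the lower bound of one thread is added here because `std::thread::hardware_concurrency()` may return 0 when the value cannot be determined.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: 75% of `available` threads, rounded up,
// but never fewer than 1.
unsigned int worker_count(unsigned int available)
{
    // (3n + 3) / 4 is the integer form of ceil(0.75 * n).
    return std::max(1U, (available * 3 + 3) / 4);
}

// Applies the rule to the number of threads the hardware reports.
unsigned int default_worker_count()
{
    return worker_count(std::thread::hardware_concurrency());
}
```

With 4 available threads this yields 3 workers, and with 3 available threads it also yields 3, because 2.25 is rounded up.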
tastytea 694cb3bc44
Add --no-filename switch.
Suppresses file names in the output.
2021-05-26 09:04:16 +02:00
tastytea fd8db544bd
Add --nocolor switch.
Closes: #2
2021-05-25 11:52:13 +02:00
tastytea b72d3f3420
Color matches bright magenta. 2021-05-25 11:00:05 +02:00
tastytea d3c3062cc0
Add Termcolor dependency and bundle it in dist/. 2021-05-25 10:55:44 +02:00
tastytea ce015954ea
Only initialize search::options once. 2021-05-25 10:02:34 +02:00
tastytea 4644c2afd4
Support CMake 3.12.
Ubuntu 20.04 has 3.16, so requiring 3.17 is a bit mean.
2021-05-25 07:38:07 +02:00
tastytea be229d25d6
Don't demand required options if --help or --version is requested.
Bump version to 0.1.2.
2021-05-25 07:15:04 +02:00
tastytea e1d29c5893
Don't replace stuff in search::cleanup_text() if nothing matched. 2021-05-24 20:02:27 +02:00
tastytea 09090a1c13
Fix bugs in search::context().
- Don't add context if words == 0
- Handle beginning / end of text correctly.
2021-05-24 19:57:15 +02:00
tastytea 1f25daed26
Add basic error handling to search. 2021-05-24 19:10:00 +02:00
tastytea c790c4952c
Extract page numbers. 2021-05-24 18:56:43 +02:00
tastytea bb4a4c719f
Wrap headlines in <H> and </H> during cleanup. 2021-05-24 18:08:40 +02:00
tastytea 8ab7d0f655
Extract headlines. 2021-05-24 17:27:30 +02:00
tastytea 8b21f4a8b9
Don't output empty fields. 2021-05-24 16:37:43 +02:00
tastytea 972ce1d0fe
Don't strip headlines. 2021-05-24 16:37:30 +02:00
tastytea bb1a43ca92
Move cleanup_text(), document functions. 2021-05-24 16:23:07 +02:00
tastytea 84e2b387e5
Clean up text before searching. 2021-05-24 16:01:41 +02:00
tastytea 1979956f03
Add basic search functionality and context output. 2021-05-24 15:35:49 +02:00
tastytea 4e01032c6f
Put regex type and --grep in search::options. 2021-05-24 13:13:15 +02:00
tastytea b2e70a6faa
Mark everything [[nodiscard]], fix some comments. 2021-05-24 13:00:03 +02:00
tastytea 9c769f664d
Clarify documentation: NUM → NUMBER. 2021-05-24 08:23:52 +02:00
tastytea 44ffeb07ec
Add calls to search::search() to main().
All that's missing now is the actual search functionality. 😊
2021-05-24 08:15:04 +02:00
tastytea 3222019c69
Add default value (0) to --context. 2021-05-24 08:14:29 +02:00
tastytea 1f82d9927a
Add skeleton for search::search().
- Type for matches
- Type for options.
2021-05-24 07:52:36 +02:00
tastytea f388dd0511
Add --raw and --context switches. 2021-05-24 07:50:50 +02:00
tastytea 5ac7f92f1d
Add hint to man page. 2021-05-24 06:03:32 +02:00
tastytea 3ad4e49e3d
Don't dump zip files to stdout. 2021-05-24 05:46:12 +02:00
tastytea 2ab4705475
Make --regexp required, show help if it is not given. 2021-05-24 05:45:42 +02:00
tastytea 4b2fbecf93
Print all options at start.
2021-05-23 16:52:32 +02:00
tastytea 03e07dfc3e
Rework option parsing a bit.
- Add --basic-regexp
- Add --grep
- Remove --input-file
- Make it possible to have multiple --regexp
2021-05-23 16:23:07 +02:00
tastytea e773d4b78a
Implement zip::read_file() – Read file in archive; add test.
Also added zip::open_file() and zip::close_file() to deduplicate code.
2021-05-23 08:56:58 +02:00
tastytea 6334b7051f
Newline before printing error messages. 2021-05-23 08:55:15 +02:00
tastytea 28c6c80def
Set C locale, treat EPUB file names as UTF-8.
EPUB file names MUST be UTF-8. ASCII is a subset of UTF-8.
2021-05-23 06:32:56 +02:00
tastytea 6c040fa951
Add first test.
- Compile everything in src/ except main.cpp into a static library
- Link the static library into tests
- Add test for zip::list()
2021-05-21 09:27:31 +02:00
tastytea 8c8a19b86b
Cosmetic changes. 2021-05-21 07:10:46 +02:00
tastytea a941bcced3
Initialize the variables where they're needed. 2021-05-21 07:05:44 +02:00
tastytea 354583bcbd
Add short descriptions. 2021-05-21 06:54:01 +02:00
tastytea 7ecb634473
Return failure on zip error. 2021-05-21 04:12:58 +02:00
tastytea 1229e295ef
Set locale for std::cerr. 2021-05-21 04:10:11 +02:00
tastytea 1a80f770ff
Fix error messages. 2021-05-21 04:04:17 +02:00
tastytea 4e8c6e7489
Add exception for zip processing.
- New dependency: libfmt.
- Translate error messages.
2021-05-21 03:25:42 +02:00
tastytea 222f802015
Basic zip file support.
Dumping the TOC works.
2021-05-21 01:56:37 +02:00
tastytea 031e2f0db6
Use sub-namespaces for functionality-groups. 2021-05-21 01:50:13 +02:00
tastytea 10e84a7707
Translate error messages in main(). 2021-05-21 01:48:55 +02:00
tastytea 7007d5e89a
Add single letter options. 2021-05-20 11:51:08 +02:00
tastytea 231ec20cd5
Rename input → input-file. 2021-05-20 11:43:35 +02:00
tastytea 9a136b87a9
Add --regexp and --input, add both as positional options. 2021-05-20 11:33:33 +02:00
tastytea bfb59da98d
std::filesystem compatibility for older GCC.
The header was in experimental a while ago and the implementation was a
separate library.
2021-05-20 10:37:45 +02:00
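One common way to handle the situation described in the commit above is a preprocessor check that falls back to the experimental header. The sketch below is an assumption about the general pattern, not a copy of epubgrep's code; the alias `fs` and the helper `extension_of` are illustrative names.

```cpp
// Prefer the standard header; fall back to the experimental one on
// compilers that do not ship <filesystem> yet.
#if __has_include(<filesystem>)
#    include <filesystem>
namespace fs = std::filesystem;
#else
#    include <experimental/filesystem>
namespace fs = std::experimental::filesystem;
#endif

#include <string>

// Hypothetical helper that works with either implementation.
std::string extension_of(const fs::path &file)
{
    return file.extension().string();
}
```

The rest of the code then refers only to `fs::path` and friends. On older GCC releases the build may additionally have to link the separate filesystem library, for example with `-lstdc++fs`.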
tastytea cf8fb95777
Switch to fs::path where appropriate.
2021-05-20 10:20:42 +02:00
tastytea f57555fb3a
Fix documentation of get_config_path(). 2021-05-20 10:03:09 +02:00
tastytea 370382d44d
Add basic error handling. 2021-05-20 09:05:52 +02:00
tastytea f95389d76c
Add support for config file. 2021-05-20 09:05:39 +02:00
tastytea 94c25552a1
Add translation support, German translation. 2021-05-20 07:09:21 +02:00
tastytea 678b506a8c
Initial commit.
- Add skeleton
- Add command-line parsing
2021-05-20 04:34:06 +02:00