f59c86e20d
Don't search for whitespace beyond the start/end of the text.
2021-06-06 23:48:06 +02:00
0470acb00e
Make --raw work again.
continuous-integration/drone/push Build is passing
2021-06-06 22:37:09 +02:00
1e29608c7e
Fix positioning of matches in search::search().
2021-06-06 22:34:52 +02:00
9708bb69c8
Don't attempt to access a pointer to nowhere.
2021-06-06 21:34:48 +02:00
b8431019b7
Don't inject page numbers and headline-markers into the text.
continuous-integration/drone/push Build is failing
The metadata is recorded in position → data pairs.
Closes: #13
2021-06-06 21:26:09 +02:00
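A minimal sketch of what "position → data pairs" could look like for page numbers, assuming an ordered map keyed by text offset. The names page_map and page_at are hypothetical and not taken from the project; the actual container may differ.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical container: offset in the cleaned text -> label of the page
// that starts at that offset.
using page_map = std::map<std::size_t, std::string>;

// Return the label of the page that contains `pos`, or an empty string if
// `pos` lies before the first recorded page break.
std::string page_at(const page_map &pages, std::size_t pos)
{
    // upper_bound() finds the first entry that starts after `pos`;
    // the entry before it (if there is one) is the page containing `pos`.
    const auto next = pages.upper_bound(pos);
    if (next == pages.begin())
    {
        return {};
    }
    return std::prev(next)->second;
}
```

Because the text itself stays free of markers, a match position can be mapped to its page afterwards with a single lookup.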
a49c500d0f
Fix <style> and <script> erasure.
I didn't take into account that <script […]/> is possible.
2021-06-06 16:06:14 +02:00
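A rough sketch of an erasure that also covers the self-closing form, assuming lowercase tag names as in XHTML. It uses std::regex to stay standard-library only; the function name and the pattern are illustrative and not the project's actual code.

```cpp
#include <regex>
#include <string>

// Remove <script> and <style> elements, including self-closing ones such as
// <script src="x.js"/>. Assumes lowercase tag names (XHTML).
std::string erase_script_and_style(const std::string &html)
{
    // Either the opening tag ends in "/>" (no body), or it ends in ">" and
    // everything up to the matching closing tag is removed as well.
    static const std::regex re{
        R"(<(script|style)\b[^>]*?(?:/>|>[\s\S]*?</\1\s*>))"};
    return std::regex_replace(html, re, "");
}
```

Without the "/>" branch, a self-closing tag would make the pattern run on to the next closing tag of the same name, or not match at all.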
262aab6671
Add debug log for replacements.
2021-06-06 15:52:09 +02:00
9067b387ef
Fix pagebreak-iterators.
Oopsie! 😄
2021-06-06 15:50:13 +02:00
99e1cd8e98
Re-enabled address sanitizer.
continuous-integration/drone/push Build is passing
Found out what was wrong: I passed boost::regex_search() a pointer into a
substring that was created in place, so match[2] pointed into that temporary
substring. Because match was declared outside of the if-block, match[2]
pointed to freed memory once the block ended. It had no visible effect
because I didn't use match afterwards.
I rewrote the whole thing with iterators. Slightly less readable, slightly
better performance (probably).
2021-06-05 17:45:07 +02:00
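The iterator approach can be sketched roughly as follows. This is an illustration only: it uses std::regex instead of Boost.Regex so that it needs nothing beyond the standard library, and find_group_from is a hypothetical name. The point is that the search runs on a sub-range of the original string, so the sub-matches refer to memory that outlives the match object.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Search `text` for `re`, starting at offset `from`, and return the position
// of capture group 1 relative to the start of `text`.
// Returns std::string::npos if there is no match.
std::size_t find_group_from(const std::string &text, std::size_t from,
                            const std::regex &re)
{
    if (from > text.size())
    {
        return std::string::npos;
    }

    // No temporary substring: both iterators point into `text` itself.
    const auto first = text.cbegin() + static_cast<std::ptrdiff_t>(from);
    std::smatch match;
    if (std::regex_search(first, text.cend(), match, re) && match[1].matched)
    {
        return static_cast<std::size_t>(match[1].first - text.cbegin());
    }
    return std::string::npos;
}
```

Since match is local to the function and only ever refers to text, there is nothing left to dangle once the function returns.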
bdf9a86651
Fix pagebreak-regex and range in which pagebreaks are searched.
2021-06-05 17:18:35 +02:00
f1a0015f28
Disable address sanitizer.
It complains about boost/regex/v5/sub_match.hpp:57:30 and I can't figure out
what's wrong or how to ignore it.
2021-06-05 14:24:53 +02:00
12e1c64fc0
Make text formatting more readable.
2021-06-05 13:34:48 +02:00
4026937f08
Don't return pointer to freed memory address.
2021-06-04 23:14:36 +02:00
de2001a442
Fix nlohmann_json error with old versions.
continuous-integration/drone/push Build is passing
Old versions of nlohmann_json do not have support for std::string_view,
std::filesystem::path(?) and std::pair(?).
2021-06-01 22:35:52 +02:00
6a4511099f
Do not add empty matches to matches_all.
2021-06-01 20:15:05 +02:00
21989aabfe
Fix JSON output.
Bug: #3
2021-06-01 20:14:36 +02:00
f1cb16f6d0
Add JSON output.
Closes: #3
2021-06-01 19:17:44 +02:00
7b4b9edfe5
Rename file names in search::matches to make it more clear.
2021-06-01 19:15:00 +02:00
88e4e78db8
Add nlohmann_json dependency.
2021-06-01 18:47:12 +02:00
b0b6c00a90
Log application exit.
continuous-integration/drone/push Build is passing
2021-06-01 18:22:15 +02:00
a7fae314b3
Log some progress info to log file.
continuous-integration/drone/push Build is passing
2021-06-01 17:17:00 +02:00
6278779029
Don't mask previous failures.
2021-06-01 17:06:25 +02:00
40e39dc0e7
Umm… nothing to see here… 😄
continuous-integration/drone/push Build is passing
2021-06-01 16:47:47 +02:00
07915bdf87
Add lots of debug output.
2021-06-01 15:32:10 +02:00
017059cb5b
Make options::options printable (for use in debug output).
2021-06-01 15:25:39 +02:00
a8db304bf1
Add DEBUGLOG macro.
Adds the severity and prepends the function name.
2021-06-01 15:24:19 +02:00
580f08b823
Output info messages to stderr with --debug.
2021-06-01 13:52:41 +02:00
28c0a5a797
Add --debug switch and enable debugging if it is on.
2021-06-01 13:41:54 +02:00
b12f88003b
Fix text logging and debug logging.
2021-06-01 13:41:20 +02:00
17b6017fe0
Rename init_debug() → enable_debug(), add documentation.
2021-06-01 13:36:34 +02:00
12a1c47259
Make log_path a variable again.
We don't need log_dir() twice after all.
2021-06-01 11:09:40 +02:00
a8f2b7dfb6
Add equipment for debug logs.
2021-06-01 11:02:06 +02:00
c35434e745
Simplify LOG macro.
continuous-integration/drone/push Build is passing
We only have one logger, no need to specify it every time.
2021-05-31 22:43:30 +02:00
4c1bae86ba
Add fatal errors.
Errors are fatal if the program has to stop immediately.
2021-05-31 22:29:35 +02:00
1fee4f5afd
Fix file error reporting.
Not sure why exceptions don't have that info… 🤷
2021-05-31 22:22:04 +02:00
80e2e9d05d
Re-add -DBOOST_LOG_DYN_LINK
continuous-integration/drone/push Build is passing
It seems we still need it. The combination of Boost 1.65.1 and CMake 3.12
does not work otherwise. Not sure whose fault it is.
2021-05-31 21:20:41 +02:00
c30a8b40be
Older Boost versions need log_setup in addition to log.
continuous-integration/drone/push Build is failing
1.75.0 works without it, 1.74.0 does not.
Removed -DBOOST_LOG_DYN_LINK again.
2021-05-31 20:57:36 +02:00
1d02c3bd6d
Add workaround for old CMake←→Boost combinations.
continuous-integration/drone/push Build is failing
2021-05-31 20:15:54 +02:00
77d013c12a
Change config file path.
continuous-integration/drone/push Build was killed
Existing old config file is copied over.
2021-05-31 19:37:11 +02:00
b966be3021
Log suppressed errors to log file.
2021-05-31 19:27:36 +02:00
18f8600174
Change log file directory.
2021-05-31 19:26:19 +02:00
11572d5b29
Use logger for warnings and errors.
2021-05-31 19:12:07 +02:00
ac5b31f2d5
Add logger.
2021-05-31 18:50:41 +02:00
cf583c6d7f
Don't compile sources twice.
Not sure why it would compile them twice if they are set to PUBLIC, but okay. 🤷
2021-05-31 17:24:08 +02:00
78ada56226
Make input files required.
It will be difficult to parse EPUB files from stdin and the usefulness is
questionable.
2021-05-31 15:31:41 +02:00
11a8989370
CMake: Make GLOB work with new files (most of the time).
continuous-integration/drone/push Build is passing
Caveats: <https://cmake.org/cmake/help/latest/command/file.html#filesystem>
2021-05-31 11:06:29 +02:00
76ed0c9dbf
Un-escape named and numbered entities in documents before searching.
continuous-integration/drone/push Build is passing
2021-05-30 23:32:35 +02:00
8a9be5d45b
Add helpers::unescape_html() & tests.
2021-05-30 23:32:30 +02:00
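For illustration, a simplified stand-in for what un-escaping named and numbered entities involves. It is not the project's helpers::unescape_html(): the names are hypothetical, the entity table covers only a handful of names, and surrogate code points are not filtered out.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <string>

// Encode one Unicode code point (up to U+10FFFF) as UTF-8.
std::string to_utf8(unsigned long cp)
{
    std::string out;
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Replace &name;, &#NN; and &#xHH; with their UTF-8 text. Unknown or
// malformed entities are copied through unchanged.
std::string unescape_html_sketch(const std::string &in)
{
    static const std::map<std::string, std::string> named{
        {"amp", "&"},   {"lt", "<"},   {"gt", ">"},
        {"quot", "\""}, {"apos", "'"}, {"nbsp", "\xC2\xA0"}};

    std::string out;
    std::size_t pos = 0;
    while (pos < in.size())
    {
        const std::size_t amp = in.find('&', pos);
        if (amp == std::string::npos)
        {
            out.append(in, pos, std::string::npos);
            break;
        }
        out.append(in, pos, amp - pos); // Copy the text before the '&'.

        const std::size_t semi = in.find(';', amp + 1);
        if (semi == std::string::npos)
        {
            out.append(in, amp, std::string::npos);
            break;
        }

        const std::string name = in.substr(amp + 1, semi - amp - 1);
        std::string replacement;
        if (name.size() > 1 && name[0] == '#')
        {
            const bool hex = (name[1] == 'x' || name[1] == 'X');
            const std::string digits = name.substr(hex ? 2 : 1);
            try
            {
                std::size_t used = 0;
                const unsigned long cp =
                    std::stoul(digits, &used, hex ? 16 : 10);
                if (used == digits.size() && cp <= 0x10FFFF)
                {
                    replacement = to_utf8(cp);
                }
            }
            catch (const std::exception &)
            {
                // Not a number: leave the entity untouched.
            }
        }
        else
        {
            const auto it = named.find(name);
            if (it != named.end())
            {
                replacement = it->second;
            }
        }

        if (replacement.empty())
        {
            out += '&'; // Not an entity we know; keep the '&' and move on.
            pos = amp + 1;
        }
        else
        {
            out += replacement;
            pos = semi + 1;
        }
    }
    return out;
}
```

Un-escaping before the search means a query for "ä" also finds text that was stored as "&#xE4;" in the document.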
7ddfe32e30
Move is_whitespace() and urldecode() to helpers.
2021-05-30 21:52:52 +02:00
94564fa914
Strip whitespace from headlines.
2021-05-30 21:16:24 +02:00