PDF→EPUB: Use types for menus and keyboard keys; small fixes.
parent de98ade9c0
commit 92446b6ba1
@@ -1,7 +1,7 @@
 ---
 title: "How I convert PDFs to EPUB semi-automatically"
 slug: how-i-convert-pdfs-to-epub
-description: "A guide to clean EPUBs from PDFs using Calibre, Emacs and time."
+description: "A step by step guide to clean EPUBs from PDFs using Calibre, Emacs and time."
 date: 2021-03-15T04:12:00+01:00
 type: posts
 draft: false
@@ -12,6 +12,7 @@ tags:
 ---
 
 :source-highlighter: pygments
+:experimental: true
 
 :url-calibre: https://calibre-ebook.com/
 :url-calibre-convert: https://manual.calibre-ebook.com/conversion.html#pdfconversion
@@ -31,25 +32,28 @@ have a lot of footnotes.
 One option is to use Calibre to convert and then fix the result, but I have
 found that I get better results in less time when I create a new
 link:{wp-epub}[EPUB], copy the PDF's content into link:{url-emacs}[Emacs], clean
-it up there and then copy it over to Calibre. This process is what I'd like to
+it up there and then copy it over to Calibre. This process is what I want to
 share with you here. You will need Calibre, Emacs or another editor with
 keyboard macros and some knowledge of link:{wp-xhtml}[XHTML] and
-link:{wp-css}[CSS] to follow this recipe. It will take long and is boring, but
+link:{wp-css}[CSS] to follow this guide. It will take long and is boring, but
 the result is a clean and enjoyable book.
 
 == Create a new book in Calibre
 
-Click on “Add books” → “Add empty book”. Then fill in the metadata and select
+Click on menu:Add books[Add empty book]. Then fill in the metadata and select
 “EPUB” as format. You can add more metadata and a cover image by right-clicking
 the book and then selecting “Edit metadata”. Open Calibre's editor by right
 clicking on the book and selecting “Edit book”. You start with a single XHTML
 file, `start.xhtml`. I always use that for the title page, the copyright notice
 and so on. You can force a page break to separate the title and the copyright
 notice with CSS: Add `style="page-break-after: always;"` to the last element of
-the virtual “page” or use a CSS class. To add a CSS file click “File” → “New
-file” and enter a filename ending with `.css`. Add the CSS file by right
-clicking on `start.xhtml` in the file browser and selecting “Link
-stylesheets…”. Note that the in-built preview does not show page breaks.
+the virtual “page” or use a CSS class. To add a CSS file click menu:File[New
+file] and enter a filename ending with `.css`. Add the CSS file to the document
+by right clicking on `start.xhtml` in the file browser and selecting “Link
+stylesheets…”.
 
+[NOTE]
+The built-in preview does not show page breaks.
+
 Your files should look similar to this:
 
@@ -139,15 +143,17 @@ end of the lines.
 --------------------------------------------------------------------------------
 
 Make sure that `auto-fill-mode` is disabled. Position the cursor at the start of
-the buffer and press `<f3>` to start recording a macro. Press `<end>`
-`<deletechar>` `SPC` (space bar) and then `<f4>` to stop recording. If there is
-a hyphen at the end of the current line, press `<backspace>` 2 times. Press
-`<f4>` to call the macro and repeat until you are at the end of the
-paragraph. Move the cursor to the first line of the next paragraph and repeat.
+the buffer and press kbd:[<f3>] to start recording a macro. Press kbd:[<end>]
+kbd:[<deletechar>] kbd:[SPC] (space bar) and then kbd:[<f4>] to stop
+recording. If there is a hyphen at the end of the current line, press
+kbd:[<backspace>] 2 times. Press kbd:[<f4>] to call the macro and repeat until
+you are at the end of the paragraph. Move the cursor to the first line of the
+next paragraph and repeat…
 
 Now you should have a text file with 1 paragraph per line. We need to wrap all
 lines in `<p>` tags, except block quotes and sub-headlines. Either use another
-macro (“<p>” `<end>` “</p>” `<down>` `<down>` `<home>`) or this elisp function:
+macro (`<p> kbd:[<end>] </p> kbd:[<down>] kbd:[<down>] kbd:[<home>]`) or this
+elisp function:
 
 [source,elisp]
 --------------------------------------------------------------------------------
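The line-joining keyboard macro described in the hunk above can also be written as a small script. The following is a minimal Python sketch and not part of this commit or the original guide: the function name `join_wrapped_lines` is hypothetical, and it assumes that paragraphs in the copied text are separated by blank lines.

```python
def join_wrapped_lines(text: str) -> str:
    """Collapse each blank-line-separated paragraph onto a single line.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = []
    for block in text.split("\n\n"):
        joined = ""
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue  # skip stray empty lines inside a block
            if joined.endswith("-"):
                # Word hyphenated across a line break: drop the hyphen
                # and glue the two halves together.
                joined = joined[:-1] + line
            elif joined:
                # Ordinary line break: replace it with a single space.
                joined += " " + line
            else:
                joined = line
        if joined:
            paragraphs.append(joined)
    return "\n\n".join(paragraphs)
```

Unlike the manual macro, this sketch removes every hyphen at the end of a line, so a legitimate hyphen that happens to fall at a line break (as in "well-known") would have to be restored by hand.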
@@ -170,14 +176,16 @@ hyperlink-able, so we can't just wrap them in plain `<p>` tags, they need IDs. I
 like to use `<span>1</span><p id="fn1">[…]</p>` if there is only one
 footnote-section or `<span>1</span><p id="fn1_1">[…]</p>` for
 chapter-footnotes. We are going to use a macro with a counter to generate
-consecutively numbered IDs. First, set the counter to 1 with `C-x C-k
-C-c` “1”. Then, record this macro:
+consecutively numbered IDs. First, set the counter to 1 with `kbd:[C-x]
+kbd:[C-k] kbd:[C-c] 1`. Then, record this macro:
 
-“<span>” `C-x C-k` `<tab>` `C-u` “-1” `C-x C-k C-a` “</span><p id="fn” `C-x C-k`
-`<tab>` “">” `<end>` “</p>” `<down>` `<down>` `<home>`
+`<span> kbd:[C-x] kbd:[C-k] kbd:[<tab>] kbd:[C-u] -1 kbd:[C-x] kbd:[C-k]
+kbd:[C-a] </span><p id="fn kbd:[C-x] kbd:[C-k] kbd:[<tab>] "> kbd:[<end>] </p>
+kbd:[<down>] kbd:[<down>] kbd:[<home>]`
 
-`C-u` “-1” `C-x C-k C-a` “adds” -1 to the counter, so that we can use the same
-number again.
+[NOTE]
+`kbd:[C-u] -1 kbd:[C-x] kbd:[C-k] kbd:[C-a]` “adds” -1 to the counter, so that
+we can use the same number again.
 
 Call the macro until every footnote is wrapped and copy them to Calibre.
 
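The counter macro in the hunk above produces one numbered `<span>` and one `<p>` with a matching ID per footnote. As an illustration of the same output, here is a minimal Python sketch that is not part of this commit or the original guide: the function name `wrap_footnotes` and its parameters are hypothetical, and it assumes one footnote per non-empty line with no leading footnote number in the text.

```python
def wrap_footnotes(text: str, id_prefix: str = "fn", start: int = 1) -> str:
    """Wrap each non-empty line as a numbered footnote paragraph.

    Assumption: every non-empty line holds exactly one footnote.
    The counter is used twice per footnote, once for the visible
    number and once for the ID, like the macro that adds -1 to
    reuse the same value.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    wrapped = [
        f'<span>{n}</span><p id="{id_prefix}{n}">{line}</p>'
        for n, line in enumerate(lines, start=start)
    ]
    return "\n\n".join(wrapped)
```

For chapter footnotes, a call such as `wrap_footnotes(text, id_prefix="fn1_")` would yield IDs of the form `fn1_1`, `fn1_2` and so on.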
@@ -195,14 +203,14 @@ Press `<f3>` to search through the text and `C-r` to replace.
 
 == Finishing touches
 
-Click “Tools” → “Table of Contents” → “Edit table of Contents”, remove the
+Click menu:Tools[Table of Contents > Edit table of Contents], remove the
 existing entry and click “Generate ToC from major headings” or “Generate ToC
 from all headings”.
 
-Click “Tools” → “Set semantics” and set the location of the title page,
+Click menu:Tools[Set semantics] and set the location of the title page,
 copyright page, beginning of text and so on.
 
-Select “Tools” → “Check book” and fix the errors.
+Select menu:Tools[Check book] and fix the errors.
 
 You're done! Enjoy your cleanly formatted book. 😊
 