There is a seemingly endless stream of articles we come across daily, and little time to read them. A quick glance at Hacker News in between tasks can yield half a dozen open tabs with articles to read at some point later. There are many services that help with this problem by maintaining a queue of these articles for you. You know the type: things like Instapaper or Pocket.
In this article, we will build a service like that. The hard way.
I’d like to spend less time at the computer, and read in a more unhurried, offline manner. For me, this means reading these articles on my Kindle. I would also like to introduce a bigger gap between collecting an article and reading it, to correct for any recency bias. If it sits in the queue for two weeks, maybe I will realize I don’t care to read it?
Let’s call this project sa2k — send articles to Kindle. We will not be using Amazon’s email-a-file-to-your-Kindle feature. This will be a Golang project, so let’s start with:
$ mkdir sa2k
$ cd sa2k
$ go mod init github.com/honza/sa2k
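We will pull in a handful of third-party libraries as we go; there’s no need to go get them one by one, since running this once the imports below are in place fetches everything:
$ go mod tidy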
Let’s create a simple CLI project using Cobra, in main.go:
package main

import (
    "fmt"
    "os"

    "github.com/spf13/cobra"
)

var rootCmd = &cobra.Command{
    Use:   "sa2k",
    Short: "sa2k --- send articles to kindle",
}

func main() {
    if err := rootCmd.Execute(); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
}
And the simplest action will be adding a new article:
var addCmd = &cobra.Command{
    Use:   "add",
    Short: "Add an article to the queue",
    RunE: func(cmd *cobra.Command, args []string) error {
        return Add(config, Title, args)
    },
}
Next, hook it up to the Cobra mechanism, in main():
rootCmd.AddCommand(addCmd)
Now for the implementation of Add(). We will accept a config struct which tells us where to store articles, etc. Title is a CLI flag that we can use to override whatever the website’s <title> is, and args is a list of article URLs to enqueue.
The Config struct looks like this:
type Config struct {
    // Where to store epubs of articles we want to read
    EpubDir string `json:"epub_dir"`
    // Once we send an article to the Kindle, where should we archive it?
    ArchiveDir string `json:"archive_dir"`
    // Where is the Kindle going to be mounted?
    KindleDocDir string `json:"kindle_doc_dir"`
}
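The resulting config.json is plain JSON; a filled-in example might look like this (the Kindle mount path is only an illustration, yours will differ):
{
  "epub_dir": "epubs",
  "archive_dir": "archive",
  "kindle_doc_dir": "/run/media/you/Kindle/documents"
}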
We probably want a way to initialize this config, so let’s add an init command.
var initCmd = &cobra.Command{
    Use:   "init",
    Short: "Init sa2k dir",
    RunE: func(cmd *cobra.Command, args []string) error {
        configDir := path.Join(xdg.ConfigHome, "sa2k")

        if _, err := os.Stat(configDir); !os.IsNotExist(err) {
            fmt.Println("Already configured")
            return nil
        }

        if err := os.Mkdir(configDir, 0744); err != nil {
            return err
        }

        config := Config{
            EpubDir:      "epubs",
            ArchiveDir:   "archive",
            KindleDocDir: "",
        }

        b, err := json.MarshalIndent(config, "", " ")
        if err != nil {
            return err
        }

        configFile := path.Join(configDir, "config.json")
        return ioutil.WriteFile(configFile, b, 0744)
    },
}
And import the required libraries:
import (
    "encoding/json"
    "io/ioutil"
    "os"
    "path"

    "github.com/adrg/xdg"
    "github.com/spf13/cobra"
)
Next, hook it up to the Cobra mechanism, in main():
rootCmd.AddCommand(addCmd)
rootCmd.AddCommand(initCmd)
Add the Title flag at the top level:
var Title string
And hook it up to Cobra:
rootCmd.PersistentFlags().StringVar(&Title, "title", "", "")
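With the flag registered, the page’s title can be overridden per article; the URL here is only an illustration:
$ go run main.go add --title "A Better Title" https://example.com/some-article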
And a global config:
// Global config
var config Config
which we can populate at the top of main():
func main() {
    var err error

    config, err = GetConfig()
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
}
The GetConfig function is nothing special:
func GetConfig() (Config, error) {
    var config Config

    configPath := path.Join(xdg.ConfigHome, "sa2k", "config.json")

    if _, err := os.Stat(configPath); os.IsNotExist(err) {
        return config, fmt.Errorf("No config file present. Run init first.")
    }

    contents, err := ioutil.ReadFile(configPath)
    if err != nil {
        return config, err
    }

    err = json.Unmarshal(contents, &config)
    if err != nil {
        return config, err
    }

    if !path.IsAbs(config.EpubDir) {
        config.EpubDir = path.Join(xdg.ConfigHome, "sa2k", config.EpubDir)
    }

    if !path.IsAbs(config.ArchiveDir) {
        config.ArchiveDir = path.Join(xdg.ConfigHome, "sa2k", config.ArchiveDir)
    }

    if !path.IsAbs(config.KindleDocDir) {
        return config, fmt.Errorf("Kindle Doc Dir should be absolute")
    }

    return config, nil
}
Alright, with that housekeeping out of the way, let’s get back to Add. Here is the signature:
func Add(config Config, title string, urls []string) error
We will use readability to download each of the urls, and simplify the HTML structure. Then, we will create an epub file based on that HTML content.
func Add(config Config, title string, urls []string) error {
    for _, url := range urls {
        article, err := PrepareArticle(config, url)
        if err != nil {
            return err
        }

        title := title
        if title == "" {
            title = article.Title
        }

        err = createKfx(config, url, title, article.Byline, article.Content)
        if err != nil {
            return fmt.Errorf("epub error: %w", err)
        }
    }

    return nil
}
The PrepareArticle function does the network parts:
func PrepareArticle(config Config, url string) (*readability.Article, error) {
    var article readability.Article
    var err error

    if strings.HasPrefix(url, "http") {
        client := &http.Client{Timeout: 30 * time.Second}

        resp, getErr := client.Get(url)
        if getErr != nil {
            return nil, getErr
        }
        defer resp.Body.Close()

        if resp.StatusCode > 399 {
            return nil, fmt.Errorf("Status code: %d for %s", resp.StatusCode, url)
        }

        article, err = readability.FromReader(resp.Body, nil)
    } else {
        f, openErr := os.Open(url)
        if openErr != nil {
            return nil, openErr
        }
        defer f.Close()

        article, err = readability.FromReader(f, nil)
    }

    if err != nil {
        return nil, err
    }

    return &article, nil
}
And import readability:
readability "github.com/go-shiori/go-readability"
What on earth is KFX? It’s the latest and greatest ebook format for Kindles, and it features excellent typography, which is why I like it. Once we have the epub file ready, we will use the amazing Calibre KFX Output plugin to do the conversion. The plugin can be driven from the CLI, so there is no need to bother with the Calibre GUI. It’s available from the link above, and it’s trivial to install. You will also need to download Amazon Kindle Previewer, and if you are on Linux, you will need to install it via Wine. Once installed, everything is seamless.
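Before wiring the conversion into the Go code, it’s worth double-checking that both external tools are on your PATH, since we will be shelling out to them:
$ pandoc --version
$ ebook-convert --version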
In createKfx, we will write the article’s HTML and metadata to disk, use Pandoc to create the epub file, and then convert it to KFX. This might look like a mouthful, but it’s pretty simple:
func createKfx(config Config, url string, title string, author string, content string) error {
    slug := slugify.Slugify(title)

    now := time.Now()
    ts := now.Format("20060102-150405")

    metaFilename := fmt.Sprintf("%s-%s.txt", ts, slug)
    epubFilename := fmt.Sprintf("%s-%s.epub", ts, slug)
    kfxFilename := fmt.Sprintf("%s-%s.kfx", ts, slug)
    htmlFilename := fmt.Sprintf("%s-%s.html", ts, slug)
    coverFilename := fmt.Sprintf("%s-%s.png", ts, slug)

    if config.EpubDir != "" {
        os.MkdirAll(config.EpubDir, 0744)

        metaFilename = filepath.Join(config.EpubDir, metaFilename)
        epubFilename = filepath.Join(config.EpubDir, epubFilename)
        kfxFilename = filepath.Join(config.EpubDir, kfxFilename)
        htmlFilename = filepath.Join(config.EpubDir, htmlFilename)
        coverFilename = filepath.Join(config.EpubDir, coverFilename)
    }

    err := ioutil.WriteFile(htmlFilename, []byte(content), 0644)
    if err != nil {
        return err
    }

    title = strings.Trim(title, "\"“”")

    meta := fmt.Sprintf(`---
title: >
  %s
author: >
  %s
date: %s
...`, title, author, now.Format("2006-01-02"))

    err = ioutil.WriteFile(metaFilename, []byte(meta), 0644)
    if err != nil {
        return err
    }

    err = GenerateCover(title, coverFilename)
    if err != nil {
        return fmt.Errorf("failed to generate cover: %w", err)
    }

    pandocCmd := fmt.Sprintf("pandoc -o %s --epub-cover-image %s %s %s",
        epubFilename, coverFilename, metaFilename, htmlFilename)

    pandocOutput, err := runShellCommand(pandocCmd)
    if err != nil {
        return fmt.Errorf("pandoc error: %s - %w", pandocOutput, err)
    }

    convertCmd := fmt.Sprintf("ebook-convert %s %s", epubFilename, kfxFilename)

    convertOutput, err := runShellCommand(convertCmd)
    if err != nil {
        return fmt.Errorf("kfx error: %s - %w", convertOutput, err)
    }

    err = os.Remove(metaFilename)
    if err != nil {
        return err
    }

    err = os.Remove(htmlFilename)
    if err != nil {
        return err
    }

    err = os.Remove(coverFilename)
    if err != nil {
        return err
    }

    return nil
}
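To make the shell-out concrete, here is roughly what the two generated commands look like for a hypothetical article titled “Some Article” (the timestamp is made up, and the absolute epubs directory prefix is omitted for readability):
pandoc -o 20230421-101500-some-article.epub --epub-cover-image 20230421-101500-some-article.png 20230421-101500-some-article.txt 20230421-101500-some-article.html
ebook-convert 20230421-101500-some-article.epub 20230421-101500-some-article.kfx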
We want our articles to look nice on the Kindle, so we generate a cover:
import (
    "embed"

    "github.com/fogleman/gg"
    "github.com/golang/freetype/truetype"
    "golang.org/x/image/font"
)

//go:embed EBGaramondSC12-Regular.ttf
var fontFS embed.FS

func LoadFontFace(path string, points float64) (font.Face, error) {
    fontBytes, err := fontFS.ReadFile(path)
    if err != nil {
        return nil, err
    }

    f, err := truetype.Parse(fontBytes)
    if err != nil {
        return nil, err
    }

    face := truetype.NewFace(f, &truetype.Options{
        Size: points,
    })

    return face, nil
}
func GenerateCover(text string, outputFilename string) error {
    const W = 1600
    const H = 2560

    dc := gg.NewContext(W, H)
    dc.SetRGB(1, 1, 1)
    dc.Clear()
    dc.SetRGB(0, 0, 0)

    fontFace, err := LoadFontFace("EBGaramondSC12-Regular.ttf", 140)
    if err != nil {
        return err
    }

    dc.SetFontFace(fontFace)

    const h = 180
    y := H/2 - h/2

    dc.DrawStringWrapped(text, 800, float64(y), 0.5, 0.5, 1400.0, 1.3, gg.AlignCenter)

    return dc.SavePNG(outputFilename)
}
And runShellCommand is a cheeky helper:
func runShellCommand(cmd string) (string, error) {
    output, err := exec.Command(
        "bash",
        "-c",
        cmd,
    ).CombinedOutput()
    if err != nil {
        return string(output), err
    }
    return string(output), nil
}
We can finally collect articles with:
$ go run main.go add "https://..."
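And since PrepareArticle falls back to opening anything that doesn’t start with http as a local file, a saved HTML page works too (the filename is just an example):
$ go run main.go add ./some-saved-page.html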
Next, we will want to sync any queued articles to the Kindle. Let’s add a sync command:
var syncCmd = &cobra.Command{
    Use:   "sync",
    Short: "sync articles to kindle",
    RunE: func(cmd *cobra.Command, args []string) error {
        return Sync(config)
    },
}
// and in main()
rootCmd.AddCommand(addCmd)
rootCmd.AddCommand(initCmd)
rootCmd.AddCommand(syncCmd)
And the implementation copies any KFX files to the Kindle, and moves epubs to the archive in case we ever need them in the future:
func Sync(config Config) error {
    if _, err := os.Stat(config.KindleDocDir); os.IsNotExist(err) {
        return fmt.Errorf("Kindle not connected")
    }

    epubs, err := ioutil.ReadDir(config.EpubDir)
    if err != nil {
        return err
    }

    if _, err := os.Stat(config.ArchiveDir); os.IsNotExist(err) {
        os.Mkdir(config.ArchiveDir, 0744)
    }

    for _, epub := range epubs {
        src := path.Join(config.EpubDir, epub.Name())
        kindle := path.Join(config.KindleDocDir, epub.Name())
        archive := path.Join(config.ArchiveDir, epub.Name())

        if strings.HasSuffix(src, "epub") {
            os.Rename(src, archive)
            continue
        }

        err := copy(src, kindle)
        if err != nil {
            return fmt.Errorf("Failed to copy file to Kindle: %w", err)
        }

        os.Remove(src)
    }

    return nil
}
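Sync relies on a small copy helper which isn’t shown above; a minimal version just streams the bytes across. (Naming it copy to match the call shadows Go’s builtin of the same name at package scope, which is harmless here because we never use the builtin.)
func copy(src, dst string) error {
    // Minimal sketch of a file copy: open the source, create the
    // destination, stream the bytes, and flush to disk.
    in, err := os.Open(src)
    if err != nil {
        return err
    }
    defer in.Close()

    out, err := os.Create(dst)
    if err != nil {
        return err
    }
    defer out.Close()

    if _, err := io.Copy(out, in); err != nil {
        return err
    }

    return out.Sync()
}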
OK! We are getting somewhere.
I also want to make it easy to collect articles, so let’s make a Chrome extension. The requirement is simple: click a button in the browser, and the URL of the current tab gets added to the queue.
Create a chrome directory, and create some files:
$ mkdir chrome
$ touch chrome/manifest.json
$ touch chrome/background.js
In manifest.json, let’s add:
{
  "manifest_version": 2,
  "name": "sa2k",
  "version": "0.1.0",
  "description": "desc",
  "browser_action": {},
  "background": {
    "scripts": [
      "background.js"
    ]
  },
  "permissions": ["activeTab", "nativeMessaging", "notifications"]
}
In background.js, we will create a port, and set up some listeners on that port. The port will start our CLI Go program, and keep it running. Chrome will communicate with the process over stdin and stdout.
var port = chrome.runtime.connectNative('sa2k.host');

chrome.browserAction.onClicked.addListener(function(tab) {
  port.postMessage({"url": tab.url});
});
When we click the extension’s button in the Chrome UI, we send a message to the sa2k process with the URL of the current tab.
Before Chrome is willing to run some random program on your computer, it needs to know that this is safe.
Create this file:
~/.config/google-chrome/NativeMessagingHosts/sa2k.host.json
… with the following contents:
{
  "name": "sa2k.host",
  "description": "sa2k",
  "path": "<absolute path to your GOPATH>/bin/sa2k",
  "type": "stdio",
  "allowed_origins": ["chrome-extension://<your extension id>/"]
}
You can find the ID of the extension on the extension configuration page in Chrome.
We will also need to install our Go program with:
$ go install .
OK. Chrome will start our program with a single argument, and it’s the string "chrome-extension://<your ID>". So, let’s modify our main():
func main() {
    var err error

    config, err = GetConfig()
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }

    logFilename := path.Join(xdg.ConfigHome, "sa2k", "log")

    file, err := openLogFile(logFilename)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }

    log.SetOutput(file)
    log.SetFlags(log.LstdFlags | log.Lshortfile | log.Lmicroseconds)

    if len(os.Args) > 1 {
        if strings.HasPrefix(os.Args[1], "chrome-extension://") {
            err = Receive(config)
            if err != nil {
                log.Println("Receive failed:", err)
                os.Exit(1)
            }
            return
        }
    }

    rootCmd.PersistentFlags().StringVar(&Title, "title", "", "")

    rootCmd.AddCommand(addCmd)
    rootCmd.AddCommand(initCmd)
    rootCmd.AddCommand(syncCmd)

    if err := rootCmd.Execute(); err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
}
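The openLogFile helper does nothing clever; opening the file in append mode is all it needs to be, something like:
func openLogFile(name string) (*os.File, error) {
    // Create the log file if it doesn't exist, and append to it on every run.
    return os.OpenFile(name, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0644)
}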
And Receive is where the magic happens. The native messaging protocol uses JSON for encoding the messages, and each message is prefixed with its length as a 32-bit integer in native byte order. Everything is sent in binary over stdin and stdout.
We will need a shovel for reading the header:
func readHeader(reader io.Reader) (uint32, error) {
    // Read message length.
    var length uint32

    if err := binary.Read(reader, binary.LittleEndian, &length); err != nil {
        if err == io.EOF {
            return 0, fmt.Errorf("EOF")
        }
        return length, err
    }

    return length, nil
}
You may recall that os.Stdin implements the io.Reader interface. readHeader returns the size of the message body that follows.
Next, let’s create a message type:
type IncomingMessage struct {
    URL string `json:"url"`
}
Now we can start implementing Receive:
func Receive(config Config) error {
    log.Println("Receiving now...")

    for {
        length, err := readHeader(os.Stdin)
        if err != nil {
            log.Println("failed to read header", err)
            return err
        }

        if length == 0 {
            log.Println("length is zero")
            return nil
        }

        var message IncomingMessage

        // Read message body.
        if err := json.NewDecoder(io.LimitReader(os.Stdin, int64(length))).Decode(&message); err != nil {
            log.Println("failed to parse body")
            return err
        }

        go func() {
            err := Add(config, "", []string{message.URL})
            if err != nil {
                log.Println("Add error", err)
                return
            }

            log.Println("Add done, sending message")
            SendMessage(os.Stdout, &OutgoingMessage{Type: Ready})
        }()

        log.Println("sending success accept message")
        SendMessage(os.Stdout, &OutgoingMessage{Type: Accepted})
    }
}
We are looping forever, decoding each message, using our Add function to enqueue new URLs, and then sending messages back to Chrome when we have status updates. Of course, the long Add process happens asynchronously in a goroutine.
Let’s fill out the message sending parts. First the types:
type MessageType int

const (
    Accepted MessageType = 1
    Ready    MessageType = 2
)

type OutgoingMessage struct {
    Type MessageType `json:"type"`
}
And then the actual sending:
func writeHeader(writer io.Writer, length int) error {
    header := make([]byte, 4)
    binary.LittleEndian.PutUint32(header, (uint32)(length))

    if n, err := writer.Write(header); err != nil || n != len(header) {
        return err
    }

    return nil
}

func SendMessage(writer io.Writer, v interface{}) error {
    message, err := json.Marshal(v)
    if err != nil {
        return err
    }

    length := len(message)

    if err := writeHeader(writer, length); err != nil {
        return err
    }

    // Write message body.
    if n, err := writer.Write(message); err != nil || n != length {
        return err
    }

    return nil
}
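To make the framing concrete: OutgoingMessage{Type: Accepted} marshals to {"type":1}, which is 10 bytes, so what actually hits stdout is the 4-byte little-endian length followed by the JSON itself:
0a 00 00 00  7b 22 74 79 70 65 22 3a 31 7d
(length: 10) ({"type":1} as UTF-8 bytes)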
Now we can go back, and update the background.js file to accept these messages:
port.onMessage.addListener(function (response) {
  if (response.type === 1) {
    var opt = {
      type: "basic",
      title: "Added!",
      message: "URL added to the list, processing now...",
      iconUrl: "icon.png"
    };
    chrome.notifications.create(null, opt);
  }

  if (response.type === 2) {
    var opt = {
      type: "basic",
      title: "Ready!",
      message: "Processed, and ready to sync",
      iconUrl: "icon.png"
    };
    chrome.notifications.create(null, opt);
  }
});
That’s it! Now we can click a Chrome extension button, and the article at the current URL will be turned into an ebook in the background. When you connect your Kindle to your computer, you can run sa2k sync to grab new articles. Magic.
This article was first published on April 21, 2023. As you can see, there are no comments. I invite you to email me with your comments, criticisms, and other suggestions. Even better, write your own article as a response. Blogging is awesome.