A while ago I wrote about reading articles offline by printing them on paper. The benefits were reduced eye strain and a superior reading experience, without being tethered to any device.

I continue to consume articles this way, with one exception: long-form articles that run 20+ pages. I often can’t read them in one session, which adds the burden of keeping track of where I left off. Reading should be fun, not a chore.

I solved this problem a different way: convert the article to an audio file and add it to a podcast feed so I can listen to it while driving. Let the podcast software keep track of where I left off.

My prior experiments with text-to-speech systems failed to impress. The first was probably in ’96 or ’97, when I had Windows read various articles to me from a local encyclopedia program. While I thought it was pretty cool, the proof was in the usage: I stopped using it within days.

How much has text-to-speech improved since then, and how much will it cost me?

Google Cloud was the first service I tried. They support many different voices and accents. I wrote a script that extracts the text of an article, sends it to Google Cloud, and saves the result as a WAV file locally. Since I didn’t want to try out all the different voices, the script picks one at random each time.
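To make the "send it to Google Cloud, pick a random voice, save a WAV" step concrete, here is a minimal sketch using the google-cloud-texttospeech Python client. It is not the script itself. The function names and the three-voice shortlist are assumptions made up for illustration, and it assumes Google Cloud credentials are already configured on the machine.

```python
import random

# Hypothetical shortlist for illustration; the API can list the voices
# that are actually available to your project.
CANDIDATE_VOICES = ["en-US-Wavenet-D", "en-GB-Wavenet-B", "en-AU-Wavenet-C"]


def pick_voice(rng=random):
    """Pick a voice at random and derive its language code from the name."""
    name = rng.choice(CANDIDATE_VOICES)
    language_code = "-".join(name.split("-")[:2])  # "en-US-Wavenet-D" -> "en-US"
    return language_code, name


def synthesize_to_wav(text, out_path):
    """Send text to Google Cloud Text-to-Speech and save the audio as a WAV file."""
    # Imported here so the voice picker works without the client installed.
    # pip install google-cloud-texttospeech
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    language_code, name = pick_voice()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(
            language_code=language_code, name=name
        ),
        # LINEAR16 audio comes back with a WAV header already attached.
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.LINEAR16
        ),
    )
    with open(out_path, "wb") as f:
        f.write(response.audio_content)
```

Calling synthesize_to_wav("Some article text", "article.wav") would write one file. The API limits how much text a single request can carry, so a long article would probably have to be split into chunks and the audio stitched together afterwards; that part is left out here.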

The quality is better than I expected. Listening to small snippets, most people will not realize that these are computer-generated voices. Only when you listen to a whole article do you start noticing that the tone variation is relatively periodic.

How much does this cost me?

Nothing.

Google gives enough credits per month that, so far, I have not had to pay them a dime. [1]

Here’s an example: a New Yorker article about Dan Ariely.

You can find my script here, with some documentation here. It works by using Mozilla’s Readability feature, which is built into Firefox, coupled with Simon Willison’s shot-scraper tool. The hardest part, surprisingly, was generating the XML file that podcast readers need. I could not find a simple Python library to generate it, nor could I find any specs on what should be in the XML file. I ended up hacking an existing podcast’s XML, modifying it for my needs.
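For anyone facing the same XML problem, here is a minimal sketch of building a feed with nothing but Python's standard library. It is not what my script does. It assumes that a plain RSS 2.0 channel with one enclosure element per episode is enough for a podcast client to pick up the audio, which may not hold for every app. The function name build_feed, the episode dictionary keys, and the example.com URLs are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, description, episodes):
    """Return an RSS 2.0 document as a string, one <item> per episode."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # The audio URL doubles as the unique id in this sketch.
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS dates use the RFC 822 style, e.g. "Mon, 01 Jan 2024 12:00:00 +0000".
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure element is what points the client at the audio file.
        ET.SubElement(
            item,
            "enclosure",
            {
                "url": ep["url"],
                "length": str(ep["size_bytes"]),
                "type": ep["mime_type"],
            },
        )

    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical episode data for illustration only.
example_feed = build_feed(
    "My Reading List",
    "https://example.com/podcast",
    "Articles converted to audio",
    [
        {
            "title": "A long article",
            "url": "https://example.com/audio/article.wav",
            "size_bytes": 12345678,
            "mime_type": "audio/wav",
            "published": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
        }
    ],
)
```

Saving example_feed to a file, hosting it next to the audio, and pointing a podcast app at its URL is the general idea. Some apps may want extra tags beyond these.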

Footnotes

[1] There was one month where I saw a one-cent charge from Google, and I’m not sure whether it was due to this or was one of those credit card verification charges that check if the card is valid.