Using LLMs to generate small semantic perturbations for language learning writing practice

Still images of this GIF are at the bottom. Learning to read a language is mostly a game of getting massive quantities of comprehensible input. Learning to write that same language is a whole ’nother ballgame. But, using the 4-quadrant Anki card setup from my earlier post, I think I’m finding more and more ways to make this as amenable to spaced repetition as possible. One thing I’ve been experimenting with, to surprising success, is using LLMs to generate “semantic perturbations” on sentences I already “know” how to write, where “know” = “have in active review in Anki”, for our purposes. ...

July 4, 2025

LLM tutored writing practice for secondary language acquisition

Language learning for the contemporary adult learner can be broken down roughly into four highly correlated, but distinct, skillsets.

| | Passive understanding | Active production |
|---|---|---|
| The written word | Reading | Writing |
| The spoken word | Listening | Speaking |

You may know from my FOSS software that I have been learning Finnish for the past 4 years or so. For the first few years I pretty much focused exclusively on reading comprehension, as I consider that to be the easiest quadrant to skill up in first. This focus put me in the interesting position for some time of being able to read most YA fiction and tax documents while being unable to order a pizza for myself on the phone. ...

June 1, 2025

The 10 sentences heuristic for foreign vocabulary acquisition

> In order to learn a word, we need to come across it several times. It seems that the minimum amount of times we need to meet a word is somewhere around 7 or 8 meetings, but it's very hard to put a figure on it.
>
> -- Paul Nation, [2020 Victoria University of Wellington](https://www.youtube.com/watch?v=FlJj8vpJxfE)

He’s right, but that never stopped me. I say 10 sentences in a specific practice: When you come across a word you don’t know enough times for it to bother you, ...

April 1, 2024

Cloud translation is more expensive than I thought

Example from yesterday’s news. Count ’em yourself – there’s 76 of them there. Mass i18n efforts like this are, I think, an underappreciated benefit of what static site generators like Hugo can give you. Actually, especially Hugo – its multi-language support is very good, like darn near everything about the platform once you get past the initial learning curve. Another underappreciated benefit: When building HTML pages is fast, you can afford to build a lot of them. A quick `hugo && cd public/ && fd html | wc -l` tells us that there are about 2700 HTML files on the site, which Hugo builds in under 3000 ms on my machine. The GitHub Action run which built the site as of today took a glacial 35 seconds by comparison. ...

November 28, 2023