I just jailbroke my Kindle and have a few EPUBs, and thought this might be a good time to change my approach to vocabulary.

What I’d like to do is learn the vocabulary for my reading before I read it, instead of after, or as I’m reading it.

My dream piece of software would do the following:

  1. Resolve all words down to their most basic form (i.e., singular for nouns, infinitive for verbs, etc.). My language is French.

  2. Count occurrences of each word.

  3. Filter out words I already know.

  4. Define the words with a bilingual dictionary to English, including the original context sentence.

  5. Make Anki cards for me to study.

  6. God-tier programming: also include idiomatic expressions as vocabulary.

Does this exist?

Edit: Or help me assemble a pipeline that gets each of these tasks done separately.
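For what it's worth, steps 1 to 3 of the list above can be sketched in a few lines of Python. This is only a rough illustration, not a finished tool: `TOY_LEMMAS` is a made-up lookup table standing in for a real French lemmatizer (something like spaCy's French models, which give a lemma per token, would be the obvious swap-in), and `unknown_lemma_counts` and the `known` set are hypothetical names. The tokenizer is deliberately naive and splits on apostrophes and hyphens, so "l'homme" becomes "l" and "homme".

```python
import re
from collections import Counter

# Hypothetical toy lemma table, for illustration only. A real pipeline
# would replace this with a proper French lemmatizer.
TOY_LEMMAS = {
    "chats": "chat",
    "mange": "manger",
    "mangent": "manger",
}

# Runs of Unicode letters (no digits, no underscores). Naive on purpose:
# elisions and hyphenated words are split into separate tokens.
WORD_RE = re.compile(r"[^\W\d_]+")


def lemmatize(word, lemmas=TOY_LEMMAS):
    """Return the base form if the table knows it, else the word itself."""
    return lemmas.get(word, word)


def unknown_lemma_counts(text, known, lemmas=TOY_LEMMAS):
    """Count lemmas in `text`, dropping any lemma listed in `known`."""
    words = WORD_RE.findall(text.lower())
    counts = Counter(lemmatize(w, lemmas) for w in words)
    return Counter({lemma: n for lemma, n in counts.items() if lemma not in known})


text = "Les chats mangent. Le chat mange la souris."
known = {"le", "la", "les"}  # words the reader already knows
for lemma, n in unknown_lemma_counts(text, known).most_common():
    print(lemma, n)
```

Sorting by frequency means the most common unknown words come first, which is probably the order you would want to study them in before reading.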

  • emb@lemmy.worldM
    9 days ago

    JPDB.io does something like this for Japanese. I’m not sure you can actually import books, but it basically combines some kind of parser with a dictionary API, an example sentence corpus, and its own spaced repetition system.

    There’s gotta be something along those lines out there for most languages, but I can’t say I know of the tools. Honestly, the breaking-down-into-a-base-word part is probably in the dictionary’s domain: if you give it a conjugated verb, it should usually be able to tell. But some ambiguities need context, and I’m not sure how to account for that.

    AnkiConnect lets you tap into the Anki APIs, and Wiktionary or (from a quick search) Collins should have a dictionary API available for French-English. If the dictionary APIs are good, you could probably get pretty far with basic sentence parsing.
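As a rough sketch of the AnkiConnect side, assuming the add-on is installed and listening on its default local address: the `addNote` action takes a JSON body like the one built below. The deck name, tag, and card layout are placeholders I made up, and the "Basic" note type with "Front"/"Back" fields is assumed to exist in your collection. The definition is passed in by hand here, since the dictionary lookup is a separate problem.

```python
import json
import urllib.request

# AnkiConnect's default local address (Anki must be running with the add-on).
ANKI_CONNECT_URL = "http://127.0.0.1:8765"


def build_add_note(word, definition, context, deck="French Vocab"):
    """Build an AnkiConnect `addNote` request body.

    The deck name, tag, and field layout are illustrative assumptions.
    """
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": "Basic",  # assumes the stock Basic note type
                "fields": {
                    "Front": f"{word}<br><i>{context}</i>",
                    "Back": definition,
                },
                "tags": ["prestudy"],
            }
        },
    }


def send(payload, url=ANKI_CONNECT_URL):
    """POST the payload to AnkiConnect and return its decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_add_note("chat", "cat", "Le chat dort.")
print(json.dumps(payload, ensure_ascii=False, indent=2))
# With Anki open, send(payload) would submit the card.
```

Keeping the payload builder separate from `send` means you can inspect or batch the cards before anything touches your collection.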

    But yeah, feels like there’s gotta be something ready-made for it. Wish I knew and could point you in a direction.

    • schipelblorp@sh.itjust.worksOP
      9 days ago

      I’ve only done enough programming to know this is very possible. A word count is probably all I’d need to do this manually. I’m just wondering if this is one of those things I do instead of learning, so the less time I spend on it, the better I’ll feel.
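If a plain word count really is enough, a minimal version can be done with the standard library alone, since an EPUB is a ZIP archive of (X)HTML files. The sketch below makes some simplifying assumptions: it counts every `.xhtml`/`.html`/`.htm` file in the archive (so table-of-contents pages and any inline style text are included), it assumes UTF-8, and it does no lemmatization. `epub_word_counts` is just a name chosen for this example.

```python
import re
import zipfile
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def epub_word_counts(epub):
    """Count lowercase words across all HTML files in an EPUB.

    `epub` may be a file path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            parser = TextExtractor()
            parser.feed(archive.read(name).decode("utf-8", errors="replace"))
            parser.close()
            text = " ".join(parser.chunks)
            # Runs of Unicode letters; accented characters are kept.
            counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return counts


# Example call (hypothetical file name):
# for word, n in epub_word_counts("book.epub").most_common(50):
#     print(word, n)
```

From there the frequency list can be skimmed by hand, which keeps the tooling time close to zero.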