It’s Monday, so time for an update on Sourcery.info. I didn’t get quite as much done since the last update as I was hoping, having gotten quite involved in the plumbing. A lot of the time went into working out the little details, like what happens when you upload two files with the same name, or how to ensure that everything gets processed in an orderly fashion with queues. These are the problems you don’t know to think about before you start building, and they’re what make software development so wildly inaccurate in its budgets and time estimates. You simply don’t know what you don’t know. (Until you smash your head against it in the dark.)

One major development to report is that Nvidia, the company that makes the graphics cards that have become so valuable to bitcoin miners and AI, has released “Chat with RTX”, which does similar things to what I’m building. I’m not too worried, since the use-case is somewhat different, but I’m definitely going to be stealing ideas from them. Also, I don’t see why a developer working part-time for free on a project can’t take on a company that’s just passed Amazon and Google (Alphabet) in market cap.

I asked you last week about what computers you have access to, what budgets you could potentially muster for an airgapped machine, and how much you work with very sensitive info. Thanks to all those who replied! The good news is that we can definitely work with what you have.

What I’m asking of you this week is for some sample documents. Please don’t send me any sensitive stuff. I’m looking for documents you wish you’d had a tool to quickly parse and get meaningful information from. Some examples could be annual reports, government gazettes, court transcripts or dockets, or anything else that you’ve had to slog through for information on a story. There’s no limit on length, format, or the number of individual docs. I’ll need to test for many different cases, so the more test documents I have, the better.

Until next week,