Show HN: Semantic Grep – A Word2Vec-powered search tool
13 by arunsupe | 2 comments on Hacker News. A much-improved new version. It searches for words similar to the query: for example, "death" will also find "dying", "dead", "killing", and so on. Incredibly useful for exploring large text datasets where exact matches are too restrictive.
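The core idea is to expand the query with its nearest neighbours in embedding space and then match lines against the expanded set. Below is a minimal, illustrative sketch of that idea in Python using gensim and a pretrained word2vec model; the actual semantic-grep tool has its own implementation, options, and model handling, and the function name, threshold, and model file here are assumptions for illustration only.

```python
# Illustrative sketch of word2vec-expanded grep (not the actual tool's code).
import re
import sys

from gensim.models import KeyedVectors  # assumes gensim is installed


def semantic_grep(path, query, model, topn=10, threshold=0.55):
    """Print lines containing the query word or any word2vec-similar word."""
    # Expand the query with its nearest neighbours in embedding space.
    neighbours = {w.lower() for w, score in model.most_similar(query, topn=topn)
                  if score >= threshold}
    targets = neighbours | {query.lower()}
    word_re = re.compile(r"[A-Za-z']+")
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, 1):
            words = {w.lower() for w in word_re.findall(line)}
            if words & targets:
                print(f"{lineno}: {line.rstrip()}")


if __name__ == "__main__":
    # Usage: python sgrep.py <word2vec.bin> <file> <query>
    # Assumes a word2vec model in binary format (e.g. the GoogleNews vectors).
    model = KeyedVectors.load_word2vec_format(sys.argv[1], binary=True)
    semantic_grep(sys.argv[2], sys.argv[3], model)
```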
Show HN: Zerox – document OCR with GPT-4o-mini
14 by themanmaran | 5 comments on Hacker News. This started out as a weekend hack with gpt-4o-mini, using the very basic strategy of "just ask the AI to OCR the document". But it turned out to perform better than our current implementation of Unstructured/Textract, at pretty much the same cost. I've tested almost every variant of document OCR over the past year, especially for things like table/chart extraction, and I've found that rules-based extraction has always been lacking. Documents are meant to be a visual representation, after all, with weird layouts, tables, charts, etc. Using a vision model just makes sense! In general, I'd categorize this solution as slow, expensive, and non-deterministic. But 6 months ago it was impossible, and 6 months from now it'll be fast, cheap, and probably more reliable!
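For readers curious what "just ask the AI to OCR the document" looks like in practice, here is a minimal sketch using the OpenAI Python SDK to send one page image to a vision model and ask for a markdown transcription. This is not Zerox's actual code; the prompt wording, model choice, and function name are assumptions, and a real pipeline would first rasterize PDF pages to images and stitch the per-page outputs together.

```python
# Sketch of the "ask a vision model to transcribe the page" approach
# (illustrative only; not Zerox's implementation).
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ocr_page(image_path: str, model: str = "gpt-4o-mini") -> str:
    """Send one page image to a vision model and return a markdown transcription."""
    with open(image_path, "rb") as fh:
        b64 = base64.b64encode(fh.read()).decode("ascii")
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe this document page to markdown. "
                         "Preserve tables and headings; do not add commentary."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


# Example: print(ocr_page("page_001.png"))
```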
Show HN: I generated 70k audiobooks with OpenAI Text-to-Speech
12 by evan_ry | 8 comments on Hacker News. Hey HN. I’m Ivan, a hacker from Ukraine. For about a year, I’ve been working on Listenly — an app for listening to text content with OpenAI's natural-sounding text-to-speech model. At some point, I realized it would be cool to take all the public-domain e-books and create audio versions of them. So I did it... kind of. It would cost an immense amount of money to generate all the audio up front (OpenAI TTS costs approximately $0.84 per hour of audio; 11labs, for comparison, is about 10 times more expensive). So I took a more gradual approach. I took all the metadata from the Project Gutenberg catalog (about 70 GB of dirty XML), cleaned it, put it into my database, and created a browsable catalog. When the first user visits a book page on Listenly, I download the full text of the book, save it in my cloud storage, and calculate the price for audio generation based on the book's length. Then, if the user decides to purchase it, we generate the audio. I know it’s not perfect. I've burned out a couple of times already while working on it. But I still need to show it to the world, and I’ll be glad to hear your feedback. Peace.
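To make the "price before generating" step concrete, here is a rough back-of-the-envelope sketch of how a per-book estimate could be derived from text length. The $0.84-per-hour figure comes from the post; the characters-per-minute speaking rate is an assumption (roughly 150 words per minute at about 6 characters per word), so treat the numbers as illustrative, not as Listenly's actual pricing logic.

```python
# Rough per-book cost estimate from text length (illustrative assumptions).
CHARS_PER_MINUTE = 900        # assumed speaking rate in characters per minute
COST_PER_AUDIO_HOUR = 0.84    # USD per hour of audio, figure from the post


def estimate_audiobook_cost(text: str) -> tuple[float, float]:
    """Return (estimated hours of audio, estimated generation cost in USD)."""
    minutes = len(text) / CHARS_PER_MINUTE
    hours = minutes / 60
    return hours, hours * COST_PER_AUDIO_HOUR


if __name__ == "__main__":
    # A typical public-domain novel is on the order of 500k characters.
    hours, cost = estimate_audiobook_cost("x" * 500_000)
    print(f"~{hours:.1f} h of audio, ~${cost:.2f} to generate")
```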
Show HN: I coded my own JSON translation tool to easily localize my side project
12 by jboschpons | 9 comments on Hacker News. Hi HN, I’m Joan, the developer of Quicklang. I made this app to easily translate and keep in sync all the localization JSON files for my side projects. While searching online for a similar tool, I only found enterprise solutions that don't allow direct editing of JSON files. Before building Quicklang, I used ChatGPT to translate changes to my JSON translation files. However, ChatGPT only lets you submit short content for translation into another language (even if you provide a .json file), and each time I had to request translations for one language at a time. So I decided to build an app that sends only the changes I’ve made to the OpenAI API and translates them into all the target languages for my side projects (see the sketch below). Technical details: I used Next.js to build the front end and back end, and I use a custom VPS (an EC2 instance) on AWS to handle the translation process, because translation can take several minutes and Vercel Functions time out after 10 seconds by default (up to 60 seconds on the Hobby plan). Finally, I save the translation files in an S3 bucket. What’s next? I want to add features like change history, the ability to pass extra context to the OpenAI API to make translations as accurate as possible, and maybe an API so developers can interact with the tool programmatically. Let me know your thoughts and feedback. It’s been a blast working on this so far, and I think it’s just neat :)
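A minimal sketch of the "translate only what changed" idea: diff the source-language JSON against its last synced version, send only the added or modified keys to the model, and merge the result back into each target-language file. This is not Quicklang's code (the real app is a Next.js service); the function names and model choice are hypothetical, and the sketch assumes flat key/value locale files.

```python
# Sketch of diff-based JSON localization (hypothetical names, flat JSON assumed).
import json

from openai import OpenAI

client = OpenAI()


def changed_entries(old: dict, new: dict) -> dict:
    """Keys that were added, or whose source text changed, since the last sync."""
    return {k: v for k, v in new.items() if old.get(k) != v}


def translate_entries(entries: dict, target_lang: str) -> dict:
    """Ask the model to translate only the changed key/value pairs."""
    prompt = (
        f"Translate the JSON values below into {target_lang}. "
        "Keep the keys unchanged and return valid JSON only.\n"
        + json.dumps(entries, ensure_ascii=False)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)


# Example flow (paths and languages are placeholders):
#   old_en = json.load(open("en.previous.json"))
#   new_en = json.load(open("en.json"))
#   delta = changed_entries(old_en, new_en)
#   for lang in ["es", "fr", "de"]:
#       translated = translate_entries(delta, lang)
#       # ...merge `translated` into the existing {lang}.json and upload to S3
```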