Show HN: Hacker News in Slow Italian - AI-generated podcast (with code)
13 by lakySK | 7 comments on Hacker News. There are plenty of podcasts for listening to slow, basic Italian, but they often just talk about random things I'm not that interested in. Nothing a few hours of tinkering with Python cannot solve these days! Introducing Hacker News in Slow Italian. Each episode is generated automatically, using the GPT-4 API to summarise the top articles on Hacker News; the summaries are then fed to Play.ht for text-to-speech. The (very short) code is available on GitHub: https://ift.tt/8og6Ipl
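The linked repo has the real implementation; purely as an illustration of the pipeline described (fetch top stories from the public HN Firebase API, summarise via the GPT-4 chat API, hand the script to TTS), here is a rough sketch. The prompt wording, the story count, and the omission of the actual Play.ht call are my assumptions, not the author's code:

    import requests
    from openai import OpenAI  # assumes the official openai>=1.0 package

    HN_API = "https://hacker-news.firebaseio.com/v0"
    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def top_story_ids(n=5):
        ids = requests.get(f"{HN_API}/topstories.json", timeout=10).json()
        return ids[:n]

    def story(item_id):
        return requests.get(f"{HN_API}/item/{item_id}.json", timeout=10).json()

    def summarise_in_slow_italian(title, url):
        # Prompt wording is illustrative, not the author's actual prompt.
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": (
                    "Riassumi questo articolo in italiano semplice e lento, "
                    f"adatto a chi sta imparando la lingua: {title} ({url})"
                ),
            }],
        )
        return resp.choices[0].message.content

    script = "\n\n".join(
        summarise_in_slow_italian(s.get("title", ""), s.get("url", ""))
        for s in map(story, top_story_ids())
    )
    # The resulting script would then be sent to Play.ht's TTS API
    # (endpoint and auth omitted here; see the linked repo for the real call).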
Ask HN: How does archive.is bypass paywalls?
21 by flerovium | 21 comments on Hacker News. If it simply visits sites, it will face a paywall too. If it identifies itself as archive.is, then other people could identify themselves the same way.
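One hypothesis that usually comes up in these threads (an assumption on my part, not something the post confirms) is that many paywalled sites whitelist search-engine crawlers by User-Agent, so an archiver could present a crawler's headers. A minimal sketch of testing that idea:

    import requests

    # Hypothetical check: fetch the same article with a default client UA and
    # with Googlebot's published UA string, then compare response sizes.
    # Whether a given site actually gates on User-Agent is an assumption.
    URL = "https://example.com/some-paywalled-article"  # placeholder URL
    GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                    "+http://www.google.com/bot.html)")

    plain = requests.get(URL, timeout=10)
    as_bot = requests.get(URL, headers={"User-Agent": GOOGLEBOT_UA}, timeout=10)

    print(len(plain.text), len(as_bot.text))  # a large gap hints at UA-based gating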
Ask HN: I have 176 logins/accounts. How many do you have?
29 by bojangleslover | 41 comments on Hacker News. Here is a screenshot of my Bitwarden: https://ift.tt/JyMuAEx They include some really important things, such as: health insurance, G-Suite for work, Bill.com (which I use to get paid), IRS.gov (which I use to get un-paid), the UK Companies House register, Interactive Brokers, and my bank. Obviously, anything with OAuth is "bundled" into my Google account, so if anything this is a huge underestimate. I'm asking because of how insane auth has become. I know companies like 1Password and Bitwarden are working on this, and overall they do a great job. But I still have a near-stroke every time I have to do the "forgot my password" loop or use Duo Mobile or other 2FA. The only really good auth features I've ever encountered are Apple's "fill from Messages" feature and Touch ID.
Has HN Changed? I assume it's just me
19 by travisgriggs | 29 comments on Hacker News. I've been reading HN for a while. I made my first comment on May 31, 2018, and have gone through cycles of engagement since then. But for the last few weeks, even months, something has changed for me. I still scan the top articles daily, but where historically there was almost always at least one thing in the top 30 I'd be interested in (sometimes many), lately none of it interests me nearly as much. My guess is that this is just burnout or age, me changing. But I was curious whether it's a more widespread effect that others are experiencing. Perhaps the downturn in the tech industry has simply led to less of a "this is the place to be and the things to know!" feeling in general.
Committing changes to a 130GB Git repository without full checkouts [video]
5 by eliomattia | 0 comments on Hacker News. Hey HN, I would appreciate feedback on a version-control-for-data toolset I am building, creatively called the Data Manager.

When working with large data repositories, full checkouts are problematic. To my knowledge, many git-for-data solutions create a new copy of the entire dataset for each commit, and none of them allow contributing to a data repo without a full checkout. The video presents a workflow that does not require full checkouts of the datasets yet still allows committing changes through Git. Specifically, it becomes possible to check out kilobytes in order to commit changes to a 130-gigabyte repository, including versions. Note that only diffs are committed, at row, column, and cell level, so the diffing shown in the GUI will look odd: it interprets the old diff as the file to be compared against the new one, when in fact both are just diffs.

The goal of the Data Manager is to version datasets (and structured data in general) in a storage-efficient way, and to easily identify and deploy to S3 the dataset snapshots, identified by repository and commit SHA (and optionally a tag), that need to be pulled for processing. S3 is also used to upload heavy files, which are then referenced by pointer, not URL, in Git commits.

The no-full-checkout workflow shown applies naturally to adding data, and it can be extended to edits or deletions provided the old data is known. That requirement ensures the diffs are bidirectional, enabling navigation of Git history both forward and backward, which is useful when caching snapshots. The burden of checking out and building snapshots from diff history is currently borne by localhost, but that may change, as mentioned in the video. Smart navigation of Git history from the nearest available snapshots, building snapshots with Spark, and other ways to save on data transfer and compute are being evaluated. This paradigm also makes it possible to hibernate or clean up history on S3 for datasets no longer needed to create snapshots, such as deleted ones, if snapshots of earlier commits are not required. Individual data entries could also be removed for GDPR compliance using versioning on S3 objects, orthogonal to Git.

The prototype already cures the pain point I built it for: it was impossible to (1) uniquely identify and (2) make available behind an API multiple versions of a collection of datasets and config parameters, (3) without overburdening HDDs due to small but frequent changes to any of the datasets in the repo, and (4) while being able to see the diffs in Git for each commit, to enable collaborative discussion, reverting, or further editing if necessary.

Some background: I am building natural language AI algorithms that are (a) easily retrainable on editable training datasets, meaning changes or deletions in the training data are reflected quickly, without traces of past training and without retraining the entire language model (sounds impossible), and (b) able to explain decisions back to individual training data. LLMs have fixed training datasets, whereas editable datasets call for a system that manages data efficiently; I also wanted something that integrates naturally with common, tried-and-tested tools such as Git, S3, and MySQL, hence the Data Manager. I am considering open-sourcing it: is that the best way to go? Which license should I choose?
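To make the bidirectional-diff idea concrete, here is a toy sketch (my own illustration, not the Data Manager's actual format): a commit stores only added and removed rows, so a later snapshot can be rebuilt by replaying diffs forward, and an earlier version recovered by replaying them backward, without a full checkout ever living locally:

    # Row-level bidirectional diff: what gets committed is the diff, not the data.
    def make_diff(old_rows, new_rows):
        old, new = set(old_rows), set(new_rows)
        return {"added": sorted(new - old), "removed": sorted(old - new)}

    def apply_forward(rows, diff):
        return sorted((set(rows) - set(diff["removed"])) | set(diff["added"]))

    def apply_backward(rows, diff):
        return sorted((set(rows) - set(diff["added"])) | set(diff["removed"]))

    v1 = ["alice,42", "bob,17"]
    v2 = ["alice,42", "carol,99"]

    d = make_diff(v1, v2)                        # only this diff is committed
    assert apply_forward(v1, d) == sorted(v2)    # replay history forward
    assert apply_backward(v2, d) == sorted(v1)   # ...or navigate backward

Storing both "added" and "removed" is what makes the diff invertible; a forward-only diff (added rows alone) could not be walked backward from a cached snapshot.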
Show HN: On the security of the Linux disk encryption LUKS
18 by proxystore | 1 comment on Hacker News. In the past few days there have been uncertainties and concerns about LUKS (“Linux Unified Key Setup”), the disk-encryption scheme widely used on Linux. We publish our assessment of it here.
Ask HN: What do you use for ML Hosting
25 by blululu | 15 comments on Hacker News. I'm trying to set up a server to run ML inference. I need to provision a somewhat beefy GPU with a decent amount of RAM (8-16 GB). Does anyone here have personal experience with, and recommendations about, the various companies operating in this space?
Ask HN: Who wants to be hired? (May 2023)
33 by whoishiring | 80 comments on Hacker News. Share your information if you are looking for work. Please use this format:

Location:
Remote:
Willing to relocate:
Technologies:
Résumé/CV:
Email:

Readers: please only email these addresses to discuss work opportunities.