Show HN: Ploomber Cloud (YC W22) – run notebooks at scale without infrastructure
25 by idomi | 3 comments on Hacker News.

Hi, we’re Ido & Eduardo, the founders of Ploomber. We’re launching Ploomber Cloud today, a service that allows data scientists to scale their work from their laptops to the cloud.

Our open-source users ( https://ift.tt/m7U5Fc8 ) usually start their work on their laptops; however, their local environment often falls short, and they need more resources. Typical workloads run out of memory or need more compute to squeeze the best performance out of a model. Ploomber Cloud eases this transition by letting users move their existing projects into the cloud quickly, without extra configuration. Users can also request custom resources for specific tasks (vCPUs, GPUs, RAM).

Both of us experienced this challenge firsthand. Analysis usually starts in a local notebook or script, and whenever we wanted to run our code on larger infrastructure, we had to refactor it (e.g., rewrite our notebooks using Kubeflow’s SDK) and add a bunch of cloud configuration. Ploomber Cloud is a lot simpler: if your notebook or script runs locally, you can run it in the cloud with no code changes and no extra configuration. You can also go back and forth between your local/interactive environment and the cloud.

We built Ploomber Cloud on top of AWS. Users only need to declare their dependencies via a requirements.txt file, and Ploomber Cloud takes care of building the Docker image and storing it on ECR. Part of this implementation is open source and available at https://ift.tt/zveBPkY . Once the Docker image is ready, we spin up EC2 instances to run the user’s pipeline in a distributed fashion (for example, running hundreds of ML experiments in parallel) and store the results in S3. Users can monitor execution through the logs and download artifacts. If the source code for a given pipeline task hasn’t changed, we reuse cached artifacts and skip redundant computation, severely cutting each run’s cost, especially for pipelines that require GPUs.
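The skip-if-source-unchanged caching described above can be sketched roughly like this (a minimal illustration of the idea, not Ploomber's actual implementation; the cache-index file, function names, and return values are all hypothetical):

```python
import hashlib
import json
from pathlib import Path

CACHE_FILE = Path("cache_index.json")  # hypothetical local index of task fingerprints

def source_hash(source: str) -> str:
    """Fingerprint a task's source code."""
    return hashlib.sha256(source.encode()).hexdigest()

def run_task(name: str, source: str, execute) -> str:
    """Run a pipeline task, skipping execution when its source is unchanged."""
    index = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    digest = source_hash(source)
    if index.get(name) == digest:
        return "cached"      # fingerprint matches: reuse the stored artifact
    execute()                # source changed (or first run): compute for real
    index[name] = digest     # record the new fingerprint for next time
    CACHE_FILE.write_text(json.dumps(index))
    return "executed"
```

On a second run with identical source, the stored fingerprint matches and the expensive step is skipped, which is where the cost savings on GPU pipelines would come from.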
Users can sign up for Ploomber Cloud for free and get started quickly. We made a significant effort to simplify the experience ( https://ift.tt/IjarnuC ). There are three plans ( https://ift.tt/Kf6jNTY ): the Community plan is free with limited computing; the Teams plan has a flat $50 monthly fee plus usage-based billing; and the Enterprise plan includes SLAs and custom pricing.

We’re thrilled to share Ploomber Cloud with you! So if you’re a data scientist who has experienced these endless cycles of getting a machine by going through an ops team, an ML engineer who helps data scientists scale their work, or you simply have feedback, please share your thoughts! We love discussing these problems, since exchanging ideas sparks exciting discussions and brings our attention to issues we haven’t considered before. You may also reach out to me at ido@ploomber.io.
Britain's Andy Murray maintains his record of never losing in the first round at Wimbledon with an encouraging win against Australia's James Duckworth.
from BBC News - Home https://ift.tt/hZ2wJar
via IFTTT
Ask HN: How on earth are you using your Apple computer with external displays?
15 by n42 | 13 comments on Hacker News.

I own four different Apple computers: a 2017 MacBook Pro, an M1 MacBook Air, an M1 MacBook Pro, and most recently a maxed-out Mac Studio. In that timespan I have also had three different Windows desktops that I built, and a ThinkPad running Windows or Linux depending on the mood.

I have spent countless dollars on cables and adapters in an attempt to find the magic combination. I have read DisplayPort specs, and I know every brand of certified cable. I now know far more than I would ever care to know about the DisplayPort and HDMI protocols. I have tried four different brands and models of monitor; for one of those models, I had three of the exact same unit. All combinations work flawlessly with anything that is not one of the Apple devices, so I have all but eliminated any of these components being the problem.

Depending on the device and the day, I will get:

- Visual artifacts like snow, lines, flickering
- Failure to support native resolution on high-resolution monitors
- Failure to support high refresh rates
- Forced scaling: the monitor is detected as a TV and interlacing is used
- Most reliably of all, failure to wake from sleep without plugging/unplugging, doing a dance of power cycling my monitor or device until it finally works, or just giving up and logging into my Windows PC because today I can't use my Apple computer

It's never all at once, but it's always at least one thing. In the time I have owned any of these devices, I have, without exaggeration, not once had the expected experience of sitting down at my desk and starting my day without fighting my computer to work properly with my monitor.

Searching the internet, I can't be alone. As far as I can tell, other people experience all of the problems I have, and no one has an answer. I'm at a breaking point after ordering this $4k desktop Mac Studio and waiting three months for it to arrive. I had hoped that, since this is a device that requires an external display, they had at least worked it out with this one. They did not.

So how does the entire professional industry working on Apple computers manage to start their day, every day, like this? Am I insane? Is no one else dealing with this? Are you all just using the built-in display? This has been going on for YEARS for me, across multiple generations of devices.
Show HN: Data Diff – compare tables of any size across databases
35 by hichkaker | 0 comments on Hacker News.

Gleb, Alex, Erez and Simon here – we are building an open-source tool for comparing data within and across databases at any scale. The repo is at https://ift.tt/rjsvCWb , and our home page is https://datafold.com/ .

As a company, Datafold builds tools for data engineers to automate the most tedious and error-prone tasks that fall through the cracks of the modern data stack, such as data testing and lineage. We launched two years ago with a tool for regression-testing changes to ETL code ( https://ift.tt/sJIzVjG ). It compares the data produced before and after a code change and shows the impact on values, aggregate metrics, and downstream data applications.

While working with many customers on improving their data engineering experience, we kept hearing that they needed to diff their data across databases to validate data replication between systems. There were three main use cases for such replication:

(1) To perform analytics on transactional data in an OLAP engine (e.g., PostgreSQL > Snowflake)
(2) To migrate between transactional stores (e.g., MySQL > PostgreSQL)
(3) To leverage data in a specialized engine (e.g., PostgreSQL > Elasticsearch)

Despite multiple vendors (e.g., Fivetran, Stitch) and open-source products (Airbyte, Debezium) solving data replication, there was no tooling for validating the correctness of that replication. When we researched how teams were going about this, we found that most were either:

- Running manual checks: e.g., starting with COUNT(*) and then digging into the discrepancies, which often took hours to pinpoint the inconsistencies
- Using distributed MPP engines such as Spark or Trino to download the complete datasets from both databases and compare them in memory – an expensive process requiring complex infrastructure

Our users wanted a tool that could:

(1) Compare datasets quickly (seconds/minutes) at large scale (millions/billions of rows) across different databases
(2) Have minimal network IO and database workload overhead
(3) Provide straightforward output: basic stats and which rows are different
(4) Be embedded into a data orchestrator such as Airflow, to run right after the replication process

So we built Data Diff, an open-source package available through pip. Data Diff can be run from the CLI or wrapped into any data orchestrator such as Airflow, Dagster, etc. To achieve speed at scale with minimal overhead, Data Diff checksums the data in both databases and uses binary search to identify diverging records. That way, it can compare arbitrarily large datasets in logarithmic time and IO, transferring only a tiny fraction of the data over the network. For example, it can diff tables with 25M rows in ~10s and 1B+ rows in ~5m across two physically separate PostgreSQL databases while running on a typical laptop.

We've launched this tool under the MIT license so that any developer can use it, and to encourage contributions of other database connectors. We didn't want to charge engineers for such a fundamental use case. We make money by charging a license fee for advanced solutions such as column-level data lineage, CI workflow automation, and ML-powered alerts.
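The checksum-and-bisect idea can be sketched as a toy, in-memory illustration (an assumption-laden simplification: it treats both tables as equal-length, key-aligned lists of rows, whereas the real tool pushes checksum queries down into SQL on each database):

```python
import hashlib

def checksum(rows) -> str:
    """Aggregate checksum over a segment of (key, value) rows."""
    h = hashlib.sha256()
    for key, value in rows:
        h.update(f"{key}:{value}".encode())
    return h.hexdigest()

def diff(a, b):
    """Return keys whose values differ, bisecting only mismatched segments.

    `a` and `b` stand in for two key-aligned tables queried over the network.
    Matching segments are pruned after comparing one checksum each, so only
    O(log n) segments around each divergence are ever examined row by row.
    """
    if checksum(a) == checksum(b):
        return []                  # segment matches: transfer nothing further
    if len(a) == 1:
        return [a[0][0]]           # narrowed down to a single diverging row
    mid = len(a) // 2
    return diff(a[:mid], b[:mid]) + diff(a[mid:], b[mid:])
```

With one corrupted row in a large table, only the checksums along one root-to-leaf bisection path are compared, which is where the logarithmic IO claim comes from.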
Ask HN: Having trouble getting senior applicants, wondering what to do about it
25 by throw1138 | 85 comments on Hacker News.

We're a fairly typical, run-of-the-mill, mid-size enterprise software vendor trying to hire fully remote SWEs in the "DevOps" software space (Linux, containers, k8s, yadda yadda). We post in the usual places, including Who's Hiring, but we haven't even managed to backfill a retirement from six months ago, and we're junior-heavy already. Benefits and salary are good (though salary isn't posted in the ad), and the people are great, though the work requires a reasonably deep understanding of the underlying platforms, which a lot of people seem to dislike.

I'm wondering if the work being a higher percentage non-code is what's causing us trouble, if we're just rubbish at hiring in general, or if it's something else. What's everyone else's experience attracting applications from senior talent in this market, and what is everyone doing to increase their attractiveness?

Current hiring process:

- Resume screened by in-house recruiter
- 30m call with them
- Resume passed up to engineering
- Hour-long call with hiring manager (typically the engineering manager of the team the candidate would join)
- Take-home technical assignment (~4h) or similar, at the candidate's choosing
- Presentation of the technical assignment to the team
- Offer
Ask HN: Best dev tool pitches of all time?
53 by swyx | 26 comments on Hacker News. Hey folks! I'm trying to actively get better at pitching developer tools, so I had the idea of crowdsourcing an inspiration list of the "best of all time". The vibe I'm going for is pitches that left you with a clear "before" and "after" division in your life, where you not only "got it" but also kept referring to it from that point onward. Obvious candidates are DHH's 15-minute Rails demo (and I've been told the Elixir LiveView demo is similar) and Solomon Hykes' Docker demo. What other pitches are like that? (Or ones that successfully pitch a developer tool in a different way, up to your interpretation.)
Rory McIlroy makes a strong start to the US Open to join unheralded Englishman Callum Tarren, Sweden's David Lingmerth and American Joel Dahmen in the lead.
from BBC News - Home https://ift.tt/A2uxSJ6
Are V8 isolates the future of computing?
12 by pranay01 | 4 comments on Hacker News. I was reading this article on Cloudflare Workers ( https://ift.tt/1zxDvup ), and it seemed like isolates have a significant advantage over serverless technologies like Lambda. What are the downsides of V8 isolates? Is it poorer security isolation?
Ask HN: Is there a TV on the market without “Smart TV” features?
89 by nborwankar | 84 comments on Hacker News. Or is there at least one where Smart mode can be turned off verifiably AND it doesn’t keep enticing you to turn it on by withholding ease of use or some convenience feature until you just give up?
Jamaica's Shericka Jackson beats Elaine Thompson-Herah & Great Britain's Dina Asher-Smith to win the 200m in 21.91 seconds at the Diamond League meeting in Rome.
from BBC News - Home https://ift.tt/PsahAuy
Wales reach a World Cup for the first time since 1958 as Gareth Bale's deflected free-kick sees them beat Ukraine 1-0 in Sunday's play-off final in Cardiff.
from BBC News - Home https://ift.tt/e8MoA25
I'm Afraid We're Shutting Down
36 by RBBronson123 | 4 comments on Hacker News.

So it’s with deep professional and personal sadness that I must announce my plans to shut down 70 Million Resources, Inc., the parent company of 70 Million Jobs (the first national, for-profit employment platform for people with criminal records) and Commissary Club (the first mobile social network for this population).

When I launched 70MR in 2016, I was motivated to build a company that could short-circuit the pernicious cycles of recidivism in this country: cycles that destroy lives, tear apart families, and decimate communities. I sought to disrupt the sleepy reentry industry by applying technology, focusing on data, employing an aggressive, accountable team, and moving with some urgency. And, for the first time, approaching the challenge as a national, for-profit venture. This approach, which I named “RaaS” (Reentry as a Service), turned out to be wildly effective, and by the beginning of 2020 we were delivering on our mission of “double bottom line returns”: building a big, successful business while doing massive social good.

With the help of Y Combinator and nearly 1,500 investors, I assembled a team and got to work. We succeeded in facilitating employment for thousands of deserving men and women and became operationally profitable. However, the pandemic had other plans for us. When it hit in force in March 2020, companies terminated nearly all of our placed workers and halted hiring for two years. Our revenue dropped like a rock to almost nothing. I immediately responded by paring our expenses to the bone and letting team members go. There was no opportunity to raise additional funding, so I began injecting my own money into the company, money I barely have, just to keep the lights on. When the economy and job market came storming back, we were inundated with inbound requests for our services. Our perseverance seemed to be paying off.

Except then we were hit with a new gut punch: “The Great Resignation.” Now our workers were reluctant to come back to work, and if they did accept a job, they’d often leave after only a few days. It became obvious that we lacked the resources to weather this new storm while hoping and praying the world would normalize soon. (It still hasn’t.)

Our coffers are empty. We’ve incurred a relatively small amount of debt (which I personally guaranteed) that I hope to negotiate down. All employees have been paid what they were owed (except for me). I will explore the sale of the assets we hold.

On a personal note, I can’t tell you how grateful and humbled I’ve been that so many would entrust their investment or business to me. For a person who’s done time in prison (me), it’s almost impossible to ask for someone’s trust. I have not yet forgiven myself for the things I did that ultimately got me into trouble, but I will be eternally grateful to those who assisted me in my efforts to settle the score and win back my karma.

From the beginning I was blessed with an unbelievable team of smart, funny, passionate young people who shared my ambition to cause change. They stuck with me/us until the very end. I’m most saddened by the millions of formerly incarcerated men and women whom we won’t be able to help. These are some of the most sincere, honest, and heroic people I’ve ever met. It was my life’s honor to work with them.

I’m pretty sure I’ll continue my reentry work; several prominent organizations have indicated their interest in my assuming a leadership role. I need to work, and I need to continue my work. I’m so sorry for this outcome, despite the good we’ve done. I’m not sure we could have done anything differently or better, but ultimately, I take full responsibility. Needless to say, if you have any thoughts or suggestions, please don’t hesitate to reach out, here or at Richard@70MillionJobs.com.
This has been the greatest experience of my life; it couldn’t have happened without my getting a second chance. Richard