ntfy.sh v2.18.0 was written by AI
(github.com)
from ueiqkkwhuwjw@lemmy.world to selfhosted@lemmy.world on 08 Mar 09:45
https://lemmy.world/post/43988094
According to the release:
Adds experimental PostgreSQL support
The code was written by Cursor and Claude
14,997 added lines of code, and 10,202 lines removed
reviewed and heavily tested over 2-3 weeks
This makes me uneasy, especially as ntfy is an internet facing service. I am now looking for alternatives.
Am I overreacting or do you all share the same concern?
#selfhosted
Definitely share your initial concern. Without strong review processes to ensure that every line of code follows the intent of the human developer, there’s no way of knowing what exactly is in there and the implications for the human users. And I’m not just talking about bugs.
They say it’s reviewed, but the temptation to blindly trust is there. In this case, developer appears to have taken some care.
Let us hope so. Handle with care to ensure responsibility is not offloaded to a machine instead of a person.
The size of that changeset means that it’s inherently unreviewable.
The commit history is something I’ve seen only in the PRs that even the most dysfunctional companies would demand a rewrite for.
Also, 2-3 weeks review? PostgreSQL support could be added in that time without the need for a damn „vibe check”. Hell, it would probably take less time than that.
To be fair they would have needed to spend time testing the manual implementation as well.
The problem I see mainly is that even if this rolls out perfectly, the erratic and changing nature of LLMs still makes it pointless as a proof of concept. Next time Claude might fuck up in a fringe way that’s not covered by unit tests and is missed by manual tests.
On the other hand, I guess I’ve been guilty myself on numerous occasions of introducing fringe bugs into production code, but at least I learn from it.
I made my statement as a BDD/TDD practitioner.
The core goal of software engineering is not to deliver said code, but to deliver it in a framework that lets others, and consequently me in a week’s time, contribute easily. This makes both future improvements and bug fixes easier.
Dumping a ~25000 lines changeset with a git history that’s almost designed to confuse is antithetical to both engineering and open source.
Yeah, it could easily have added a couple of lines of code that send everything to North Korean hackers because it found that in a bunch of repositories, or that just log passwords to public logs, or other things an experienced developer would never do. “AI” only replicates what it sees most often and as more spam and junk repos are added to its training data because “AI” companies are too concerned with profit to teach it properly, it could do tons of random stuff. It’s like training a developer by giving them random examples from the internet rather than specific ones. Of course they pick up bad habits. Even if it “works” it is almost never efficient or secure.
If you use ntfy mainly as a Unified Push distributor on Android, then I highly recommend switching to an XMPP client that can do the same.
I was also using it for notifications but I’ll probably switch to E-Mail for that and find an alternative UP distributor.
Conversations is working very well on my phone as UP distributor.
Do you recommend an app?
The first three on this list can do it: joinjabber.org/docs/apps/android/
Explanation here: joinjabber.org/tutorials/service/unifiedpush/
Uh. I’d really prefer if people experimented with new technology a bit more cautiously and not directly jump to “the biggest release […] ever done”.
Upvote and comment on: github.com/binwiederhier/ntfy/issues/1645
They just replied:
This makes me think that they didn’t review or test it at all, lmao
Thanks for the link! As a short aside for the other people here: Try not to spam developers. That usually achieves the opposite and makes them miserable, when we want them to not burn out, and write good software for us. A thumbs-up emoji is the correct reaction for the average person. Or for the pros - a code-review highlighting specific issues within the code.
Yeah, this is now inherently untrustworthy. Better to switch to an alternative.
Do you know any? I’ve never really looked beyond ntfy.sh until now
I only know NextPush (Nextcloud App), but there is also something called Autopush I think?
Gotify is supposedly a good alternative. Looking into it myself now.
Gotify is not UP compatible still AFAIK. That’s why I went to ntfy.
There’s SunUp on F-droid, but I don’t know anything about them.
That’s from Mozilla, another AI company…
Ugh, seriously? Great…
(Edit) I don’t think this is true? They use Mozilla’s push services, but nothing about their Codeberg repo (yes, it’s on Codeberg, not Github) indicates they’re part of Mozilla.
Read the README
How about you tell me what you see that I missed?
The app itself might be fine, but you are either using the Mozilla services or the backend written by Mozilla. Sadly Mozilla has lost all the good will it had and is just another silicon valley AI company these days, and seems to prefer it that way.
Sure. All I said was that it doesn’t actually seem to be run by Mozilla, like you implied it was.
If you use ntfy for UnifiedPush: unifiedpush.org/users/distributors/
I recently switched to gotify. Push notifications to iOS aren’t as good but I’m happy with it.
I’m sorry, how many lines of code for that?
if you want to send one notification from your desktop to your phone, it’s easy. but from any device to (m)any other, with guaranteed delivery and no doubles? shit gets complicated.
So it’s a little more than just sending notifications, then.
no, it’s literally all in service of sending notifications. but there’s a lot involved. android doesn’t have a way to receive them natively for example, you need to go through google’s services. so ntfy has to emulate the firebase api. then there’s the “exactly once” requirement, which is basically the two generals problem turned up to eleven because every platform syncs differently and you need some way to store messages that are in the process of transmitting. then there’s the matter of punching through NAT, so you need a STUN/TURN setup on the server.
and that’s on top of the fact that every platform requires different build options, manifests, certificates, etc.
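To make the “exactly once” point above concrete, here is a minimal sketch of the usual workaround: the sender retries until it gets an acknowledgement (at-least-once delivery) and the receiver drops duplicates by message ID. This is not ntfy's actual implementation; the `Sender` and `Receiver` classes, the message IDs, and the `ack_lost` flag are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical client that tolerates redelivery by remembering message IDs."""
    seen_ids: set = field(default_factory=set)
    shown: list = field(default_factory=list)

    def deliver(self, msg_id: str, body: str) -> bool:
        """Show the message only on first receipt; always acknowledge."""
        if msg_id not in self.seen_ids:
            self.seen_ids.add(msg_id)
            self.shown.append(body)
        return True  # ack duplicates too, so the sender can stop retrying


@dataclass
class Sender:
    """Hypothetical server side: keeps each message until it is acknowledged."""
    pending: dict = field(default_factory=dict)

    def publish(self, msg_id: str, body: str) -> None:
        self.pending[msg_id] = body

    def flush(self, receiver: Receiver, ack_lost: bool = False) -> None:
        """Try to deliver everything pending; forget only acknowledged messages."""
        for msg_id, body in list(self.pending.items()):
            acked = receiver.deliver(msg_id, body)
            if acked and not ack_lost:  # ack_lost simulates a dropped acknowledgement
                del self.pending[msg_id]


sender, phone = Sender(), Receiver()
sender.publish("m1", "backup finished")
sender.flush(phone, ack_lost=True)  # message arrives, but the ack never comes back
sender.flush(phone)                 # sender retries; the receiver ignores the duplicate
```

The message is transmitted twice but shown once. The cost is that both sides have to store state (pending messages on the server, seen IDs on the client), which is part of why the storage layer matters for a service like this.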
They are not even trusting it themselves. This is from the release notes
Fuck that.
Classic “test in production” strategy, very solid!
Test in production is the best. We spent months warning about data bugs and nobody batted an eye (upstream bug, not our responsibility, but we noticed). When it was launched in prod we just pointed out that the bug nobody fixed was still there, and immediately a war room was formed and the bug was fixed within an hour.
It honestly seems more efficient to let shit hit the fan than to fight everybody to do their job.
You’re implying a shitty capitalist company that nobody cares for if it burns down. A tool like this though that is self-hosted by a lot of people (29.1k stars on GH!) and that is internet-facing is very different.
Then, let’s just call it “massive decentralized surprise testing”
For sure, the song of the hero who fixed the production bug is oft sung at meetings but the loser who prevented the bug to begin with gets no credit.
Testing in production is the most idiotic concept of the last 10 years or so, and it is mainly driven by the incompetence of project managers.
Imagine if you get sold a car by a company, for 100k, then it starts having major issues and the car company tells you: “we’ll fix it”.
While that does not necessarily apply to software or services or webapps, the logic still stands. You are selling bugs to people. Bugs that could have been caught, with some risk management and planning.
Edit: F-ing ios keyboard.
I completely agree. I work on an internal solution, which is a part of a very large product. It’s not a live product, only part of a pipeline that runs on a predetermined schedule. Our bit is the only one with actual business/performance KPIs; most of the other teams measure only “user story/CR points”. If the other teams screw up, it will impact our performance unless we prove it’s their fault. And if it’s their fault, they open a US/bug which improves their metrics (one more US closed). Our team has to think ahead and try to do things well in one go, because our bugfixing doesn’t count as work. But our speed is measured against people who benefit from half doing stuff. When we made a massive effort, we got complaints that we were slow. Now we make less effort and once in a blue moon we have to do a hotfix. More often than not, when we have a production issue it is due to the other teams that run before us in the pipeline, so we even had to develop checks on our input because they won’t add checks to their outputs. And they won’t because that’s a CR that requires extra funding that’s not approved, but we had to create them for our own sanity.
Yes, I’m looking to move out haha
Consider a donation to help people providing you the open source software you seem to depend upon.
Using a helper tool to perform tasks on code, whether it is AI or the IDE’s built-in features, can reduce the workload of benevolent developers who have not asked you to use their software.
Maybe the language was not appropriate but get real. With the little revenue generated by the people complaining, agentic AI coding might be the only way to bring features without pushing benevolent devs to burnout.
Hmm, no, I think I’ll just uninstall.
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
[Thread #146 for this comm, first seen 8th Mar 2026, 10:40] [FAQ] [Full list] [Contact] [Source code]
I just set up a ntfy server for Unified Push earlier this week to use with Matrix. Now I have to turn around and immediately replace it…
Same here. Literally just set it up and now this.
I hope the author will roll this back or someone else makes a fork. I don’t want to immediately switch technology to XMPP/Matrix/… and have to do it all over again.
You could, in the meantime, simply not upgrade to the version that uses AI.
Since, from what I’m seeing around, people are having issues looking for an alternative.
Definitely time to find an alternative. What the actual fuck is this
NOOOOOOOOO
Oh ffs…
Thanks for the heads-up
I’ll embrace the inevitable fork.
Time for a fork?
Time for a knife!^[I kid, I kid] Violence is the answer!
I’ve been meaning to put something like this in my setup for a while, but definitely not this now! List of alternatives in the Custom Communication section at awesome selfhosted
I have the same concern..
fuck
I meant to ask already: what is the actual technical difference between mqtt and ntfy? For me it feels pretty similar technique, just one is used for push service and the other not. So it feels like reinventing the wheel. Maybe somebody here can enlighten me?
I think the main difference is that services adapt to mqtt while ntfy adapts to services to send the msgs. Also, ntfy offers push notifications on your Android device.
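One practical difference behind the answer above: MQTT clients hold a connection to a broker and speak MQTT's own protocol, while publishing to ntfy is a plain HTTP POST (or PUT) to a topic URL, so anything that can make an HTTP request can send a notification. A minimal sketch using only the Python standard library follows; the server `ntfy.example.com`, the topic `home-alerts`, and the helper name `build_ntfy_request` are assumptions made up for illustration, and the request is only built, not sent.

```python
import urllib.request


def build_ntfy_request(base_url: str, topic: str, message: str,
                       title: str = "") -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that publishes `message` to `topic`."""
    # ntfy reads an optional notification title from the "Title" header.
    headers = {"Title": title} if title else {}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{topic}",
        data=message.encode("utf-8"),  # the request body is the notification text
        headers=headers,
        method="POST",
    )


# Hypothetical self-hosted server and topic name, for illustration only.
req = build_ntfy_request("https://ntfy.example.com", "home-alerts",
                         "Backup finished", title="NAS")
# urllib.request.urlopen(req) would actually send it.
```

An MQTT publisher would need a client library and a broker session for the same job, which is roughly what “services adapt to mqtt” means in practice.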
@ueiqkkwhuwjw just this quote at the start of the release notes
> 14,997 added lines of code, and 10,202 lines removed, all from one pull request
This is already a major red flag even without the ai stuff right? Can't believe anyone would flaunt that like this.
The “single pull request” is a merge release from 79 separate commits. It’s the sum of all work, it doesn’t mean all of it was changed in one go.
I'm quite hesitant about the idea of AI writing my code. At some point your AI won't help you fix a certain bug and you will have to go through all of this AI slop. Not to mention you deploy debt-laden code.
Ai can be powerful and destructive at the same time. (note: I didn't use Ai to write this).
Ai coding can help a lot in accelerating software development. In the right hands, that is. Meaning the software engineer still reviews the code, tests it, and takes responsibility. In those cases there is nothing wrong with using Ai for software development.
The problem is that some programmers are using AI without even looking at the end results. They just approve everything, commit, push and release. That approach is wrong, and inexperienced engineers especially might fall into this trap. In that case the code most likely has a lot of duplication and is full of bugs and other issues, some of which you encounter for the first time because it wasn't tested.
In the latter story, you feel the impact. And the downsides of Ai. And only see the negatives of Ai. You might say it's Ai slop even. Or vibe coded. Which is correct.
Tldr: Ai can be very powerful in the right hands. It still requires a lot of human time and effort to get it correct, and if the engineer is too lazy you feel the consequences. If you have an experienced software engineer who takes responsibility for the code, reviews it thoroughly, and tests all corner cases, then AI can be powerful and helpful.
Was this written with genAI? Even the TLDR is padded fluff of common talking points
I’m halfway with you, and halfway just considering that people think it’s relevant to include a tl;dr in a barely three paragraph comment. The feeling with tl;dr for me is a summary similar to a closing paragraph, and if anyone thinks that one sentence (“Ai coding can help a lot in accelerating software development.”) is somehow worthy of being summarized as if the point was proven (“Ai can be very powerful in the right hands”)… well, it sounds like shit because it is shit. Maybe it’s ai, maybe it’s just a really rushed dude making a throwaway comment in the fediverse, and maybe it’s just a person who is confident enough in their mind that they forget they haven’t made an actually decent argument outside of their past, and concluding as if they brought that past argument forth here is eye-raising.
Considering he’s on his own instance… I’m going to bet the context is somewhere between throwaway comment and invoking past assertions without citing them.
You can run my text through Ai checkers if you wish. But it's not Ai generated.
I'm not just on my own instance. I'm the creator of the software: Mbin. Previously known as kbin.
People need tldr today, due to TikTok. 😅
Haha. I'm not a native English speaker. But it's not Ai generated.
I try to keep it common for general people to understand it. If you have follow up questions shoot. I have 25+ years of software engineering experience.
But my point is that developers can use Ai, Ai tools become much better for coding, as long as the developer still understands the code. Since some developers don't even bother looking at the code anymore...
Also I can't really answer the question if it's bad or not what happens to ntfy.sh since it really depends on how the maintainer is using Ai here. Whether he did test the code, and read all the generated code.
Ai in itself isn't the problem here.
Agreed. I have a sense that, eventually, development communities will figure out etiquette and policies to govern LLM usage. But how do you enforce that kind of policy? Right now, it’s essentially a judgement call by the maintainers. It’s hard to catch sneaky LLM usage.
On the other hand, I think there are objectively good ways to use LLMs for software:
Indeed also read the paper called Programming as Theory building. From 1985. Which is very relevant today again. Since people lose the connection with the code due to Ai.
One of my favorite papers! On a similar note, I recently started reading A Philosophy of Software Design by John Ousterhout. Although it’s a lot more recent (2018), I’d argue it’s required reading in light of the LLM hype craze.
there is this repo that lists some slopware : codeberg.org/small-hack/open-slopware maybe someone can add it
Awesome page, thanks. Have bookmarked.
Harfbuzz though? That’s going to take some replacing. Hopefully someone will fork an earlier version. The thing that it does (accurate multi-script font shaping) is difficult to do; requires a lot of rule-of-thumb knowledge that’s unlikely to be possessed by a single person, needs a lot of collaboration.
I think there’s room for a little bit of nuance that page doesn’t do a great job of describing. In my opinion there’s a huge difference between volunteer maintainers using AI PR checks as a screening measure to ease their review burden and focusing their actual reviews on PRs that pass the AI checks, and AI-deranged lone developers flooding the code with “AI features” and slopping out 10kloc PRs for no obvious reason.
Just because a project is using AI code reviews or has an AGENTS.md is not necessarily a red flag. A yellow flag, maybe, but the evidence that the Linux Kernel itself is on that list should serve as an example of why you can’t just kneejerk anti-AI here. If you know anything about Linus Torvalds you know he has zero tolerance for bad code, and the use of AI is not going to change that despite everyone’s fears. If it doesn’t work out, Linus will be the first one to throw it under the bus.
Upvote this guy
Lol my project has an AGENTS.md and its contents are basically, “Don’t use AI agents on this codebase.”
did not know that the serde developer tolnay is a military apologist. I’m disgusted. serde is a very good tool… I’ll think about what to do about this. such a shame…
the linux kernel is on that list, bro it’s time to switch!
Also Chrome, Firefox and Ladybird!
Time to switch to Plan9!
oh no. not ladybird! You were supposed to save us!
Heck. Guess I won't be hosting that then
ntfy for some home projects but now I will not.
That’s concerning. If it was “I generated a function with an LLM and reviewed it myself” I’d be much less concerned, but 14k added lines and 10k removed lines is crazy. We already know that LLMs don’t generate up to scratch code quality…
I won’t use PostgreSQL with ntfy, and keep an eye on it to see if they continue down this path for other parts of ntfy. If so I’ll have to switch to another UP provider.
I’m assuming this is some sort of canary message to indicate that the code base has been compromised, the author can’t talk about it, and everyone should immediately stop using the service. Surely no-one would be unwise enough to commit this otherwise?
Even ignoring the huge red LLM flag, a 25kLOC delta in a single PR should be cause for instant rejection as there’s no way to fully understand or test it, let alone in 2-3 weeks.
Not to pick at nits, but it would be VERY different if it was 1k lines added and 24k lines removed. There’s something extremely satisfying about removing 10k+ lines of unnecessary code.
Sure, that would be a little different, but unless you could make a convincing argument, backed up with a solid set of unit tests, at the least, as to why and how you were able to remove that much code whilst only adding a comparatively small amount, I’d still be inclined to reject it and ask for it to be broken down into smaller units.
Now, that explanation might be something along the lines of it being dead code that is not called from anywhere, or even that it was a patched version of an upstream library, and the patch is now included in that upstream, in which case, fair enough, good work, and thanks very much. As a rewrite or refactor though, it’s too big to sensibly review and needs breaking down into separate features.
Absolutely, the author needs to be able to reason about their changes, no matter what. The reason I think the two situations are fundamentally different, though, is that it’s a lot easier to validate the existence of features than it is the non-existence of bugs or malicious behavior. The biggest risk to removing code is breaking preexisting features, whereas the biggest risk to adding code is introducing malicious behavior.
Well now I certainly am glad I didn’t migrate from Gotify as I’ve been slowly planning.
Damn, I guess I’ll stick to the older release for now. Hopefully a viable alternative/fork comes around.
Look, if he wanted to introduce AI code, whatever, but doing it all at once in a 14k line change is crazy.
Surely it would be better to introduce AI by letting it handle misc changes here and there instead of starting with the “biggest release ever done” (his words), no?
Fuck, I love ntfy, it’s one of the best self hosted push notification systems I’ve used. It has been flawless so far.
Don’t like this.
No thumb down reaction emoji 🤔
Upvote and comment on: github.com/binwiederhier/ntfy/issues/1645
Please add this to the post.
I switched to Gotify when I ran into an issue where ntfy would delete old API tokens when creating more than 20. The only thing missing in Gotify is UnifiedPush; other than that it actually feels more solid than ntfy to me.
Oh goddamn it, I’m using this and don’t have an alternative lined up
What is your concern? If it’s a generic “AI”, then I can assure you that pretty much every software has AI code in it already. Heck, Linus is accepting PRs where AI has been used.
AI is useful. It produces useful code.
Like creative writing, it won’t produce something novel. But man, 75% of code is just boiler plate. AI can do a lot for boilerplate.
That does not absolve anyone of committing crap code. Put your name to it. Own it. Take the consequence of delivering shit code or great code, no matter how it was written. Don’t let AI be a crutch. But you’d be a goddamn fool not to use it where it’s right (boilerplate, test writing, tedious changes etc.)
There’s a big difference between “AI was used in some capacity” and “Entirely vibe coded”
Of course. And when I hear “vibe coded”, I hear someone starting with “make me a cool app” and going from there, with zero understanding of the technical architecture.
If you have a thorough, deeply thought through technical spec, then AI can write a great amount of tests up against that spec, say, and you’ve got a fantastic base for TDD.
I honestly feel like a lot of the downvotes are people thinking AI means “clueless programmer having an AI do its work for you”. Many highly productive, deeply technical developers use it every day.
Idk man by the sounds of it, the AI implemented the entire back end change, adding 14k lines of generated code. The dev doesn’t even seem confident with his own testing. Sounds like it’s closer to the vibe-coded end of the scale to me.
I’ve been meaning to give Ntfy a shot but now I likely won’t. If I wanted a vibe coded project I’d just do it myself.
Massive changes made by robit in what has been a pretty stable utility for years is (obviously?) my main concern. It’s absolutely a crutch, and seeing a dev lean on it like this gives me the same feeling Coach must’ve got seeing his star player limping into the big game on a real one. If dude wants to check out and let the machine run his project fine, but I’ll be looking for something someone still cares about and works on.
I think you’d be a fool to use it. At this point it’s subsidized by their need for training data/desire to manufacture dependency, but that won’t be the case for long. It’s expensive, detrimental to your skills, and damaging to both our planet and society. It centralizes and gatekeeps access to information, the most powerful resource of all. “Treat it like an inexperienced dev” managers say, while it replaces their opportunities to gain experience. How are they supposed to even tell great code from shit when everything they’re exposed to has been run through the averaging machine?
I saved your comment for the added arguments against AI.
If using ntfy for UnifiedPush: unifiedpush.org/users/distributors/
I’m a developer
I sometimes use AI for an answer to a complicated problem, because normally I’d open up 20 pages and have to go through them all to find the right answer
AI gets me the answer right away, though it likely is completely wrong or at least partially wrong. Either way, it gives me a general direction and with that I only have to search through one or two pages to confirm, so the same process is just a little faster.
I have also used AI on a couple of occasions to ask it to write code for a complicated problem. Again, you don’t copy the code, god no, it’s always the worst, and in 80% of the cases it is still at least riddled with bugs, or just complete bullshit. However, it might give me an alternative idea or a direction to take to implement or fix this complicated feature problem.
That’s the extent to which I’ve used AI and for the foreseeable future that won’t change because AI still can’t code. It’s still wildly flailing around and it might produce something that implements a certain functionality, but it’s a guarantee that that functionality will have more bugs and security holes than features
I am also a developer and agree entirely.
Asking for advice, examples or the occasional boilerplate is at most how I use AI and certainly not integrated directly into my IDE.
I understand this comment. AI sometimes saves a ton of mental power and time when I’m stuck on an issue. It can give some really good suggestions. Also, AI is a godsend for frontend shit. I don’t care what y’all say, I’m never touching CSS and HTML ever again. lmao.
It looks like that tool is more or less built by a single developer (you already trust their judgment anyways!), and even though the code came through in a single PR it was a merge from a branch that had 79 separate commits: github.com/binwiederhier/ntfy/pull/1619
Also glancing through it a bit, huge portions of that are straightforward refactors or even just formatting changes caused by adding a new backend option.
I’m not going to say it’s fine, but they didn’t just throw Claude at a problem and let it rewrite 25k lines of code unnecessarily.
Wow a differentiated opinion on AI use :)
Something like graphite.com to create stacked PRs that are reviewable probably would have helped. Can be replicated with local LLMs or remote AI providers with locally configured agentic workflows. Never used graphite personally, but I’ve seen some open source maintainers use it to split up large PRs.
Any AI usage immediately discredits the software for me, because it calls into question all of their past and future work.
Oh boy, do I have bad news about 90% of the internet for you…
Linus sent an email recently to the Kernel Mailing List trashing AI slop and rejecting AI generated patches. The fact that he used it to play around with a script doesn’t invalidate the fact that he distrusts code written by LLMs when it actually matters.
you mean this statement? theregister.com/…/linus_versus_llms_ai_slop_docs/…
If yes, your statement does not really match what Linus said.
we’re all so fucked
Well, Telegram does the same thing for free.
Telegram does the thing for your sweet juicy data
In reality, how big of a risk is it currently? I just started using it, just for fun and personal projects. If the previous version didn’t have security vulnerabilities, then there is no rush to update, or am I missing something?
What’s the difference between ntfy (android app) and ntfy.sh?
Ntfy.sh is the hosted version. Hosted by the author. Ntfy (android, ios) is the app that you use as a client.
I’ve never used ntfy.sh
I’ve only used the Ntfy app for UnifiedPush, which some apps need, and they recommend ntfy. Does this affect the app then? Ah, if so, what alternative can I use for just that purpose?
Gotify is probably the next best thing, at least in terms of self hosted. Though doesn’t have the wide support of ntfy.
Sigh. Time to switch to gotify
been using EMQX plus an MQTT client on my phone for a few months now, I like it better than gotify since the app was chewing through my battery like a vampire.
it might be better now since my issues happened three-ish years ago.
This EMQX?
Seems it’s no longer FOSS?
I’ve been using Gotify for a few notifications from Home Assistant and it doesn’t appear to be eating my battery.
It’s a little more responsive than ntfy - sometimes ntfy doesn’t alert for ages after the trigger (could be phone power saving the wifi…), but then I also get realerts from yesterday… not had that with Gotify.
that’s the one.
FOSS or not, it still runs just fine on my infra. I prefer it over something like rabbitmq because it has a pretty slick admin webgui.
I’ll have to give gotify another try.
Lot of hate for a project maintained by a volunteer and offered for free here. Nobody forces this free stuff on you.
ts getting you pinned to 2.17 in the compose file 🥹🤞🥀
I’m so tired of that.
I’m using it for scripts notifications + unifiedpush. I don’t know where to start to find the fitting alternative.