What would it take to build AI ethically?

ladybugs@lemmy.world · edit-2 20 hours ago

Hetare King@piefed.social · 10 hours ago

I consider these to be the main ethical issues with specifically LLMs and generative AI in general:

1. Using people’s work as training data without consent.
2. The high cost of training a model, meaning that only a few entities in the world can actually do so, and so only a few people get to decide what the knowledge base and “slant” of the model are. This is true even for open source models.
3. The high resource cost of using a model relative to the value of its output.
4. People with malicious intent being empowered by it far, far more than anyone else.
5. The model producing the response to the query directly instead of leading to the source, leaving the source without any way to benefit and the user without any context cues they can use to verify the reliability of the information.
6. Infinite and automated production of misinformation, libel and psychological manipulation.
7. Inducing psychosis in people.

Point 1 can be resolved by the people training AI just making different choices. Many won’t unless they’re forced to, but in principle they could.

Points 2 and 3 could hypothetically be resolved in the future with better technology.

The rest are basically inherent to the technology, and you can at best try and mostly fail to reduce the risk. So as far as I’m concerned, what it would take to build AI ethically is to train it for very specific purposes and have it be used as statistical models by people who know what they’re doing.

Though I do see some potential for ethical LLMs by using them to perform vector searches instead of generating text, basically turning them into smarter search engines.
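To make the "smarter search engine" idea concrete, here is a minimal sketch of a vector search that returns sources instead of generated text. Everything in it is an assumption for illustration: the corpus, the example.org URLs, and especially `embed`, which is a toy bag-of-words stand-in for whatever embedding model a real system would use.

```python
import math
from collections import Counter

# Hypothetical corpus: each entry keeps its source URL so results lead back to the author.
DOCS = [
    {"url": "https://example.org/birds", "text": "identifying bird song from audio recordings"},
    {"url": "https://example.org/bread", "text": "baking sourdough bread at home"},
    {"url": "https://example.org/plants", "text": "identifying plants from volunteer photos"},
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, docs=DOCS, top_k=2):
    """Return the URLs of the most similar documents, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(d["text"])), d["url"]) for d in docs]
    scored.sort(reverse=True)
    # Drop documents with no overlap at all instead of padding the results.
    return [url for score, url in scored[:top_k] if score > 0]
```

The design point is that `search` only ranks and returns links: the user still lands on the original page, so the source keeps its traffic and the reader keeps the context needed to judge reliability.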

SaneMartigan@lemmy.world · 9 hours ago

A socialist society built on equity for all. Automating jobs with AI to give people more free time would be great if it wasn’t making a rich minority richer.

valar@lemmy.ca · 18 hours ago

Pay for all the content you train the model on

one_old_coder@piefed.social · 17 hours ago

Or only use your own data, which would make it useless. I thought about RAGs but Wikipedia says:

These documents supplement information from the LLM’s pre-existing training data

It seems that RAGs use stolen data anyway.

And it still wouldn’t solve the issue that managers demand 10 times more work for free, it wouldn’t stop making workers crazy with the flood of reviews in programming, and execs wouldn’t stop dreaming of laying off everyone.

Yaky@slrpnk.net · 17 hours ago

Do you mean any AI, or text-generating LLMs?

I am fairly certain Cornell built BirdNET / Merlin Bird ID song identification using recordings in the public domain or with permissive licenses.

Same goes for iNaturalist and Seek using volunteer-submitted and identified photos.

So it’s possible to build domain-specific models with fewer ethical issues, but the push is for bullshit generators, unfortunately.

A Sharky Anthro@fedia.io · 13 hours ago

As long as moneyed interests are involved with the idea of trying to turn a profit off of LLMs (which they have successfully conflated with AI), ethics will never be considered in any capacity. As it stands right now, a real AI is a pipe dream because techbros shot their shot way too early instead of funding multidisciplinary efforts to understand consciousness, the brain, and ways to simulate real reasoning/thought. With LLMs they’ve created a Frankenstein’s Monster that will never gain any form of intelligence, as it cannot possibly replicate the complex consciousness and reasoning process that living things possess.

Realistically, I’d prefer humanity give up on this for the moment, because the technology to sustain data centers and cool them without severe environmental impact to OUR ONLY HOME PLANET is insufficient. Until we sort out our economic, social, and political issues, an actual artificial intelligence should be a low as fuck priority.

Zerush@lemmy.ml · 7 hours ago

For example, an AI search assistant not biased by big corps: it uses web content in real time rather than its own knowledge base of stolen content, uses the LLM only to correctly interpret the user’s input for a semantic search, and ideally also runs on renewable energy.

cloudy1999@sh.itjust.works · edit-2 15 hours ago

I despise “AI” for quite a few reasons: It’s built on theft, it empowers the fascists and oligarchs, its masters seek to dis-empower or replace human workers and creatives, its name is a deception as well as its primary use case, etc. This community doesn’t need a rehash. I personally despise AI because I love the programming craft and I worry about a future where code is only generated, or worse: generated autonomously. Don’t get me started on “AI first” companies. Fuck that.

“AI” is an anti-human technology.

Now, separate “AI” and all its awfulness from LLM as an algorithm/data structure. Can LLMs be ethical? I honestly don’t know whether the good can be isolated from the bad. I started to brainstorm this out below, but the more I write, the less convinced I am that there’s a middle way. I’m afraid that much of the perceived benefit of LLMs is derived from the universal theft of training data.

Dear reader, please consider the following a brainstorm only from a non-expert Anti who’s trying in good faith to find a path.

–

Here are some possible ethical use cases:

Natural Language Interface - Like a Text-based User Interface (TUI) or Graphical User Interface (GUI) or Command Line Interface (CLI), but instead discerns user intent from human language
Pattern Recognition - Some of LLMs’ legitimate accomplishments have been their ability to pore over decades of human work and detect patterns that otherwise would have been missed. Examples: Recent Erdős and Knuth news. LLMs are reasonable at code review and bug/security flaw detection
Summarization/Search - LLMs and their precursors have been rehashing summaries of well-trodden topics in training data for years. Crafting summaries for human consumption seems an ‘ok’ use case, with the understanding that hallucinations are unavoidable. Examples: API documentation, code examples, encyclopedia-like snippets

IMO, an ethical LLM solution might have attributes like these. Disclaimer: I’m not an expert so some of this may be nonsense (“brainstorm”):

Public audit trail of training data
Author consent, voluntary or paid, for participation in training data
Harnesses should have a query-able manifest of valid operations. All user input should map to one of them
Harnesses should strictly require human acknowledgement before executing an operation, and especially when interacting with external systems
Human-first output - should encourage human learning and thought, not seek to replace it
Signed output - this one is tricky. I don’t know how to accomplish it. It would be great if LLM output could be signed in a way that excluded it from future training. The signature would also serve as notice to humans that the content is explicitly from an LLM. Web browsers could then have configurations to filter LLM content out so that users can consent to consume it. This solution may not be part of LLMs themselves
Limited topic/training data - imagine an LLM that’s only for recipes or only for a specific programming API or a specific news site. A smaller model should use fewer resources
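As a rough illustration of the two harness attributes above (a query-able manifest and mandatory human acknowledgement), here is a minimal sketch. The operation names, the `external` flag, and the `run` signature are all hypothetical, and the step where an LLM maps free-form user language onto an operation name is deliberately left out.

```python
# Hypothetical manifest: the only operations the harness will ever run.
MANIFEST = {
    "list_recipes": {"description": "List stored recipe names", "external": False},
    "send_email": {"description": "Send an email", "external": True},
}

def list_operations():
    """Query-able manifest: callers can ask what is allowed."""
    return sorted(MANIFEST)

def run(operation, handlers, confirm):
    """Run `operation` only if it is in the manifest and a human approves it.

    `handlers` maps operation names to zero-argument callables.
    `confirm` is a callable that shows a prompt to a human and returns True or False.
    """
    if operation not in MANIFEST:
        # Anything that does not map to a known operation is rejected outright.
        raise ValueError(f"unknown operation: {operation!r}")
    prompt = f"Run {operation}? ({MANIFEST[operation]['description']})"
    if MANIFEST[operation]["external"]:
        prompt += " [touches an external system]"
    if not confirm(prompt):
        return None  # The human declined, so nothing is executed.
    return handlers[operation]()
```

In an interactive setting, `confirm` could be something like `lambda p: input(p + " [y/N] ").strip().lower() == "y"`, so that nothing runs without an explicit "y" from a person.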

I have high doubts that these qualities can be achieved due to complexity and cost. Such is the price of legitimacy.

–

OK, that’s all. I’m going back now to stewing in my disdain for “AI”.

minor tweaks

Jared White ✌️ [HWC]@humansare.social · 6 hours ago

I don’t agree summarization is an OK use case as a product. Maybe as a one off thing that a user explicitly requests of their own data within an office suite or whatever.

Meanwhile, I’ve been looking at the mincemeat “Google AI mode” makes of my essays, completely changing the meaning and giving people false conclusions which misrepresent my position. It’s shockingly awful.

cloudy1999@sh.itjust.works · 5 hours ago

Yes, they cannot reason at all, despite clever marketing names like ‘reasoning models’. A responsible operator must verify all output, something humans can’t collectively be trusted to do. Even when verification is performed, we must ask ourselves whether ‘old-fashioned’ thinking wouldn’t have given an equally good or better result. IMO, it’s hard to find anything positive about this technology.

Something related I’ve been thinking about: they’re unable to produce truth or lies, only output.

Lucy :3@feddit.org · 13 hours ago

A dedicated power supply drawing only on existing renewables, your own hardware manufacturer so you don’t disturb the rest of the market, a cooling solution that magically does not harm the environment, and a DC somewhere it doesn’t disturb anything.

And of course an ethical dataset. So data that was explicitly provided for training.

Goldholz @lemmy.blahaj.zone · 13 hours ago

Okay hear me out. DC underground. The generated heat is used to heat houses!

Sanctus@anarchist.nexus · 19 hours ago

Getting a repository where you commit your own work under legal threat to be trained on. Then tear the data centers down and only allow them to run locally. But it’s kind of too late for all that. Pandora’s box has been opened.

CrocodilloBombardino@piefed.social · 19 hours ago

if it were built outside of a capitalist system and in a way that is in ecological balance with the planet

ratrace@lemmy.zip · 13 hours ago

Well first you plug in a toaster with a extention cord not plugged into a GFI outlet, next you fill up the tub and the last step is you get into that tub with said toaster as you submerge. Thats how you build ethical AI

JustTesting@lemmy.hogru.ch · 15 hours ago

There’s https://apertvs.ai/, probably as close as it gets. Government funded, made by universities. Afaik its datacenter is powered by hydro. But it is an academic project, so it still uses Common Crawl and other publicly available datasets, which is considered ok practice in academia but still means consent is opt-out if something is publicly available. And of course no one uses this, because there’s no marketing and it’s not as ‘good’ as models trained on stolen data.

Plus you could still argue that the energy and taxpayer money could be better spent elsewhere.

ratrace@lemmy.zip · 13 hours ago

Universities are the tip of the baby killing military industrial complex. You people are all so silly. There is no such thing as ethical AI.

CombatWombat@feddit.online · edit-2 19 hours ago

Anil Dash claims he’s made an ethical AI: https://www.anildash.com/2026/04/28/one-good-ai-is-here/

I haven’t taken the time to verify his claims personally, but it sounds like a reasonable attempt:

What’s good? Something that checks every box I can think of for our most immediately positive goals: it’s trained entirely with data that were consensually gathered; it’s completely open source and open weights, so anybody can examine it to know exactly how it works and what biases or flaws it might have; it’s designed to run on ordinary computers that normal people have access to — including those that can run entirely on renewable and responsible energy sources. And it is controlled by creators, not extractors, people who are inarguably on the side of artists and creatives and those who make art and culture in the world, designed to support and enable and empower their expression. No billionaires or guests of Epstein’s island were involved in the creation of this technology.

ruuster13@lemmy.zip · 19 hours ago

YOUR data was already stolen to build these models. You are already being actively exploited. This applies to everyone who has ever put something out onto the internet. The best way to usher in an era of ethical AI is to use the tools that exist now in whatever way necessary to ensure we beat the fascists worldwide and then restore sanity to the law.

ratrace@lemmy.zip · 13 hours ago

If I pray to jesus I can become a born again virgin.