Control the robots, incentivize the humans – Cointelegraph Magazine

By Max Parasol, Cointelegraph. May 2, 2023

Text generator ChatGPT is the fastest-growing consumer app ever, and it’s still growing rapidly.

But the dirty secret of AI is that humans are still needed to create, label and structure training data — and training data is very expensive. The dark side of this is that an exponential feedback loop is being created where AI is a surveillance technology. And so, managing the humans in the AI loop is crucial.

Some experts believe that if robots do take over the world, they’d better be controlled by decentralized networks. And humans must be incentivized to prepare the data sets. Blockchain and tokens can help… but can blockchain save humanity from AI?

ChatGPT is just regurgitated data

ChatGPT is a big deal according to famed AI researcher Ben Goertzel, given that “the ChatGPT thing caused the Google founders to show up at the office for the first time in years!” he laughs. Goertzel is the founder of blockchain-based AI marketplace SingularityNET and an outspoken proponent of artificial general intelligence (AGI) — computers thinking for themselves. That means he sees where ChatGPT falls short more clearly than most.

“What’s interesting about ChatGPT and other neuro models is that they achieve a certain amount of generality without having much ability to generalize. They achieve a general scope of ability relative to an individual human by having so much training data.”

Ben Goertzel and his robot Desdemona (How to prevent AI from ‘annihilating humanity’ using blockchain)

In other words, ChatGPT is really one function achieved by the brute force of having so much data. “This is not the way humans achieve breadth by iterative acts of creative generalization,” he says, adding, “It’s a hack; it’s a beautiful hack; it’s very cool. I think it is a big leap forward.”

He’s not discounting where that hack can take us either. “I won’t be shocked if GPT-7 can do 80% of human jobs,” he says. “That’s big but it doesn’t mean they can be human-level thinking machines. But they can do a majority of human-level jobs.”

Logic predicated on experience remains harder for AI than scraping the internet. Predicate logic means that humans know how to open bottle caps, for example, but AIs need trillions of data points to learn that simple task. And good large language models (LLMs) can still turn language into presumptive logic, including paraconsistent logic, or self-contradictory logic, explains Goertzel.

“If you feed them the whole web, almost anything you ask them is covered somewhere on the web.”

Goertzel notes that means part of Magazine’s questioning is redundant.

“I’ve been asked the same questions about ChatGPT 10 times in the last three weeks, so we could’ve just asked ChatGPT what I think about ChatGPT. Neuromodels can generate everything I said in the last two months, I don’t even need to be saying it.”

*ChatGPT-4 hasn’t been updated recently enough to tell us what Goertzel thinks in the past three weeks. But if it had, it could. (GPT-4 via Forfront.ai)*

Goertzel is important in AI thinking because he specializes in AGI. He says that he and 90% of his AGI colleagues think LLMs like ChatGPT are partly a distraction from this goal. But he adds LLMs can also contribute to and accelerate the work on all kinds of innovation that could play a role in AGIs. For example, LLMs will expedite the advancement of coding. LLMs can even help ordinary people with no coding abilities to build a phone or web app. That means non-tech founders can use LLMs to build tech startups. “AI should democratize the creation of software technology and then a little bit down the road hardware technology.”

Goertzel founded SingularityNET as an attempt to use blockchain and open-source technology to distribute access to the tech that controls AGIs to everyone, rather than let it stay in the hands of monopolies. Goertzel notes that ChatGPT and other text apps deploy publicly viewable open-source algorithms. And so, the security infrastructure for their data sets and how users participate in this tech revolution is now at a crucial juncture.

For that matter, so is AI development more widely. In March, OpenAI co-founder Elon Musk and more than 1,000 other tech leaders called for a halt to developing or rolling out AI systems more powerful than GPT-4. Their open letter warned of “profound risks to society and humanity.” The letter argued the pause would provide time to implement “shared safety protocols” for AI systems. “If such a pause cannot be enacted quickly, governments should step in and institute a moratorium,” they posited.

Goertzel is more of an optimist about the tech’s potential to improve our lives rather than destroy them, but he’s been working on this stuff since the 1970s.

I respect the concerns but am not gonna sign this. LLMs won’t become AGIs. They do pose societal risks, as do many things. They also have great potential for good. Social pressure for slowing R&D should be reserved for bioweapons and nukes etc. not complex cases like this.

— Ben Goertzel (@bengoertzel) March 29, 2023

Reputation systems needed

Humayun Sheikh was a founding investor in the famed AI research lab DeepMind where he supported commercialization for early-stage AI and deep neural network technology. Currently, he leads Fetch.ai as CEO and founder. It’s a startup developing an autonomous future with deep tech.

He argues that the intersection between blockchain and AI is economically driven, as the funding required to train AI models is prohibitively expensive except for very large organizations. “The entire premise behind crypto is the democratization of technology and access to finance. Rather than having one monopolized entity have the entire ownership of a major AI model, we envision the ownership to be divided among the people who contributed to its development.”

“One way we can absolutely encourage the people to stay in the loop is to involve them in the development of AI from the start, which is why we believe in decentralizing AI technology. Whether it’s people training AI from the start or having them test and validate AI systems, ensuring regular people can take ownership of the AI model is a strong way to keep humans in the loop. And we want to do this while keeping this democratization grounded in proper incentivization mechanisms.”

One approach to this is via emerging reputation systems and decentralized social networks. For example, SingularityNET spin-off Rejuve is tokenizing and crowdsourcing bio data submissions from individuals, hoping to use AI to analyze and cross-match them with animal and insect data and discover which parts of the genome can make us live longer. It’s an AI-driven, Web3-based longevity economy. The thinking is that open science should be paid for and data depositors should be rewarded for their contributions.

Humayun Sheikh says data marketplaces are a must.

“The development of AI is dependent on human training. Reputation systems can deliver quality assurance for the data, and decentralized social networks can ensure that a diverse slate of thoughts and views are included in the development process. Acceleration of AI adoption will bring forth the challenge of developing un-opinionated AI tech.”

Blockchain-based AI governance can also help, argues Sheikh, who says it ensures transparency and decentralized decision-making via an indisputable record of the data collected and decisions made that can be seen by everyone. “But blockchain technology is only one piece of the puzzle. Rules and standards, as we see in DAOs, are always going to be needed for trustworthy governance,” he says.

Goertzel notes that “you can’t buy and sell someone else’s reputation,” and tokens have network effects. Blockchain-based reputation systems for AI can ensure consumers can tell the difference between AI fakes and real people, and also provide the transparency needed to hold AI model builders accountable for their AI constructions. In this view, there needs to be some standard for tokenized measurement of reputation adopted across the blockchain community and then the mainstream tech ecosystem.

And in turn, reputation systems can expedite AI innovations. “This is not the path to quick money but it is part of the path for blockchain to dominate the global economy. There’s a bit of a tragedy of the commons with blockchains in the reputation space. Everyone will benefit from a shared reputation system.”

Blockchains for data set management

Data combined with AI is good for many things — it can diagnose lung cancer — but governments around the world are very concerned with how to govern data.

The key issue is who owns the data sets. The distinctions between open and closed sources are blurred, and their interactions have become very subtle. AI algorithms are usually open-source, but the parameters of the data sets and the data sets themselves are usually proprietary and closed, including for ChatGPT.

The public doesn’t know what data was used to train ChatGPT-4, so even though the algorithms are public, the AI can’t be replicated. Various people have theorized it was trained using data sets including Google and Twitter — meanwhile, Google denied it trained its own AI called Bard with data and conversations with ChatGPT, further muddying the waters of who owns what and how.

Famed AI VC Kai-Fu Lee often says open-source AI is the greatest human collaboration in history, and AI research papers usually contain their data sets for reproducibility, or for others to copy. But despite Lee’s statements, data, when attached to academic research, is often mislabelled and hard to follow “in the most incomprehensible, difficult and annoying way,” says Goertzel. Even open data sets, such as for academic papers, can be unstructured, mislabelled, unhelpful and generally hard to replicate.

So, there is clearly a sweet spot in data pre-processing where AI meets blockchain. There’s an opportunity for crypto firms and DAOs to create the tools for the decentralized infrastructure for cleaning up training data sets. Open-source code is one thing, but protection of the data is crucial.

“You need ways to access live AI models, but in the end, someone has to pay for the computer running the process,” notes Goertzel. This could mean making users pay for AI access via a subscription model, he says, but tokenomics are a natural fit. So, why not incentivize good data sets for further research? “Data analysis pipelines” for things like genomics data could be built by crypto firms. LLMs could do this stuff well already, but “most of these pre-processing steps could be done better by decentralized computers,” says Goertzel, “but it’s a lot of work to build it.”

Human-AI collaboration: Oceans of data needing responsible stewards

One practical way to think about AI-human collaboration then is the idea of “computer-aided design” (CAD), says Trent McConaghy, the Canadian founder of Ocean Protocol. Engineers have benefited from AI-powered CAD since the 1980s. “It’s an important framing: It’s humans working in the loop with computers to accomplish goals while leveraging the strengths of both,” he says.

McConaghy started working in AI in the 1990s for the Canadian government and spent 15 years building AI-powered CAD tools for circuit design. He wrote one of the very first serious articles about blockchains for AI in 2016.

CAD gives us a practical framing for AI-human collaboration. But these AI-powered CAD tools still need data.

Imagine trying to hand design a chip with 10 billion parts. Yet, people do it. How?

The answer is AI.

Engineers have had AI-powered computer-aided design (CAD) for chips, cars, etc for decades. With 10x+ productivity.

Now, *everyone else* gets AI-powered CAD. Expect 10xs.

— Trent McConaghy (@trentmc0) March 20, 2023

McConaghy founded Ocean Protocol in 2017 to address the issue. Ocean Protocol is a public utility network to securely share AI data while preserving privacy. “It’s an AI play using blockchain, and it’s about democratizing data for the planet.” Impressively, it’s the sixth-most active crypto project on GitHub.