
Responsible & Ethical AI

The current landscape of AI ethics and emerging trends.


AI ethics is an emerging field with multiple competing narratives about how to reliably build human values into AI. Some of the most current methods include Reinforcement Learning from Human Feedback (RLHF), Constitutional AI, and ethical frameworks.

Reinforcement learning from human feedback (RLHF)

RLHF is a technique used in machine learning that focuses on integrating human feedback into model behavior. The method was intentionally designed to mimic human judgment while prioritizing specific ethical values, such as safety and harmlessness. RLHF works by having humans evaluate various scenarios, ranking them from 0 to 5 according to various ethical values. A reinforcement learning model is then trained on this data.

The method is also intended to capture the context that machines lack in ethical dilemmas. Despite the effectiveness of RLHF in changing model behavior, the technique still suffers from operational challenges, such as the lack of scalability of human annotation and inconsistent labeling.

RLHF was originally developed by research teams at OpenAI, but it is also used by many other companies, such as Hugging Face, Weights & Biases, and DeepMind.
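
To make the ranking-then-training loop concrete, below is a minimal sketch of the reward-modelling step, assuming a toy featurizer and a handful of hypothetical human scores. It is illustrative only; a production pipeline would use the language model's own representations, far more data, and a subsequent reinforcement learning phase.

```python
# Minimal sketch of the RLHF reward-modelling step described above.
# Hypothetical data: human annotators score model outputs from 0-5 on an
# ethical dimension (e.g. harmlessness); a small regression model is then
# fit to predict those scores so it can later serve as the reward signal
# during reinforcement learning. Names and features are placeholders.

import torch
import torch.nn as nn

# (scenario text, human score) pairs -- illustrative only
labeled_scenarios = [
    ("Explains how to report a safety issue responsibly", 5.0),
    ("Gives vague but harmless advice", 3.0),
    ("Encourages risky behaviour", 0.0),
]

def embed(text: str, dim: int = 16) -> torch.Tensor:
    """Stand-in featurizer; a real system would use the LM's own embeddings."""
    g = torch.Generator().manual_seed(hash(text) % (2**31))
    return torch.rand(dim, generator=g)

reward_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

features = torch.stack([embed(text) for text, _ in labeled_scenarios])
targets = torch.tensor([[score] for _, score in labeled_scenarios])

# Fit the reward model to the human rankings; in the RL phase this model
# scores new generations in place of a human annotator.
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(reward_model(features), targets)
    loss.backward()
    optimizer.step()
```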

Constitutional AI

Constitutional AI (CAI) was created by Anthropic as an alternative to RLHF, attempting to encode specific ethical principles into the model itself rather than relying on human annotators. Cleverly, the developers at Anthropic call Constitutional AI "Reinforcement Learning from AI Feedback". Anthropic considers CAI an improvement on and replacement for RLHF, as relying on AI rather than human feedback solves the scalability issues.

However, losing the human voice in AI development removes valuable information about context that only end users have. Also, CAI pre-determines the principles built into the AI, leaving end users unable to participate.
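
To illustrate the idea, the sketch below shows a critique-and-revision loop of the kind CAI uses in place of human annotation. The `generate` function and the single constitutional principle are placeholders, not Anthropic's actual models or constitution.

```python
# Minimal sketch of a Constitutional AI critique-and-revision loop.
# `generate` stands in for any language-model call; the principle shown
# is illustrative, not Anthropic's actual constitution.

CONSTITUTION = [
    "Choose the response that is most helpful while avoiding harmful, "
    "deceptive, or discriminatory content.",
]

def generate(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    return f"<model output for: {prompt[:60]}...>"

def constitutional_revision(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in CONSTITUTION:
        # The model critiques its own draft against the principle...
        critique = generate(
            f"Critique the response below using this principle:\n"
            f"Principle: {principle}\nResponse: {response}"
        )
        # ...then revises the draft in light of that critique. The revised
        # outputs become the preference data for RL with AI feedback.
        response = generate(
            f"Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    return response

print(constitutional_revision("How should I handle a disagreement at work?"))
```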

Ethical Frameworks

Ethical frameworks, such as the Responsible AI toolkit by PwC or Explainable AI by IBM, serve as a foundation for creating ethical AI systems, but they too are not without drawbacks: they face critique because most frameworks cannot be implemented well in practice and lack empirical verification.

DevOps, MLOps, etc.

There's also an increase in developer tools that aim to streamline the process of building ethical AI. However, this approach also faces critique because it essentially outsources ethical decision-making to tool creators rather than to the users or the developers themselves.

Emerging Trends

The future of AI ethics presents many opportunities. One path is personalized ethics, which depends on modular AI winning out over monolithic models. Personalized ethics could allow for customization according to users' values and beliefs, ensuring that AI systems reflect a diversity of ethical perspectives.

Another trend has been the need to place greater emphasis on the role of the end user. As it currently stands, end users have minimal input in improving model performance, verifying behavior, or disagreeing with problematic model output. Future methodologies may seek to correct this imbalance of power by giving more control to the users who interact with AI systems daily.

Also, the idea of creating an "unbiased" model is under scrutiny. As a blog post by Anthropic states, "AI models will have value systems, whether intentional or unintentional." Complete neutrality of AI models is unachievable, since data is always created from the point of view of a subject. Instead, the focus will likely shift towards acknowledging these inherent biases and values and developing models that critically engage with them.

Conclusion

AI ethics is a field in flux, grappling with present challenges while anticipating future ones. It's a space of ongoing experimentation that requires continuous dialogue among stakeholders in the eternal pursuit of just and good AI systems.
