Ensuring Responsible AI: The Role of Supervised Fine-Tuning (SFT) in Upholding Integrity and Privacy Regulations

Authors

  • Tejaskumar Pujari, Independent Researcher
  • Anshul Goel, Independent Researcher
  • Ashwin Sharma, Independent Researcher

DOI:

https://doi.org/10.56127/ijst.v3i3.1968

Keywords:

Responsible AI, Supervised Fine-Tuning (SFT), Data Privacy, GDPR, AI Act, Integrity, Bias Mitigation, Differential Privacy, Federated Learning, AI Governance, Ethical AI, Model Transparency

Abstract

AI is increasingly deployed in high-stakes domains such as healthcare, finance, education, and public governance, where systems must uphold fairness, accountability, transparency, and privacy. This paper examines the critical role of Supervised Fine-Tuning (SFT) in aligning large AI models with ethical principles and with regulatory frameworks such as the GDPR and the EU AI Act.
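To make the SFT step concrete, the sketch below fine-tunes a small causal language model on curated, policy-vetted prompt/response pairs using the Hugging Face transformers and datasets libraries. This is a minimal illustration of the technique, not the paper's actual setup: the base model ("gpt2"), the single training example, and all hyperparameters are placeholder assumptions.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # illustrative base model, not the paper's
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Curated demonstrations vetted for privacy and fairness compliance.
examples = [{"text": "User: Can I share a client's medical file with a vendor?\n"
                     "Assistant: Not without a lawful basis and the client's "
                     "consent; doing so would breach GDPR."}]

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length",
                    max_length=128)
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]  # causal-LM labels
    return enc

train_ds = Dataset.from_list(examples).map(tokenize, batched=True,
                                           remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out", num_train_epochs=1,
                           per_device_train_batch_size=1, report_to="none"),
    train_dataset=train_ds,
)
trainer.train()
```

In practice the training set would contain thousands of reviewed demonstrations, with the curation process itself documented for audit purposes.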

The paper takes an interdisciplinary approach, combining regulatory analysis, technical research, and case studies. It proposes integrating privacy-preserving techniques, namely differential privacy, secure multiparty computation, and federated learning, into the SFT pipeline and subsequent deployment.
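As one example of pairing SFT with a privacy-preserving technique, the sketch below hand-rolls the core of DP-SGD in PyTorch: per-example gradient clipping to bound each record's sensitivity, followed by calibrated Gaussian noise on the aggregated gradient. The toy linear model, clip norm, and noise multiplier are illustrative assumptions; a production system would use a vetted library with a formal privacy accountant rather than this sketch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(16, 2)                  # stand-in for a small fine-tuned head
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_mult, lr = 1.0, 1.0, 0.1  # illustrative, untuned values

X = torch.randn(8, 16)                    # toy fine-tuning batch
y = torch.randint(0, 2, (8,))

# Accumulate per-example gradients, each clipped to bound its sensitivity.
summed = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(X, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = min(1.0, (clip_norm / (total + 1e-6)).item())
    for s, g in zip(summed, grads):
        s.add_(g, alpha=scale)

# Add Gaussian noise calibrated to the clip norm, then take one SGD step.
with torch.no_grad():
    for p, s in zip(model.parameters(), summed):
        noise = torch.randn_like(s) * noise_mult * clip_norm
        p -= lr * (s + noise) / len(X)
```

The same clipping-plus-noise step slots into an SFT loop unchanged; only the model and loss differ.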

The research also advocates incorporating Human-in-the-Loop (HITL) oversight and Explainable AI (XAI) to ensure ongoing human review and interpretability. SFT is positioned not only as a technical method but also as a core enabler of responsible AI governance and public trust.
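As a sketch of the HITL idea, the snippet below routes model outputs that touch sensitive categories to a human reviewer before release. The generate() stub, the keyword heuristic, and the console approval step are all hypothetical stand-ins for a real inference call and review workflow, shown only to illustrate the gating pattern.

```python
SENSITIVE_TERMS = {"diagnosis", "ssn", "passport", "salary"}

def generate(prompt: str) -> str:
    """Stand-in for the fine-tuned model's inference call."""
    return f"[model response to: {prompt!r}]"

def needs_review(text: str) -> bool:
    # Flag anything touching a sensitive category for human oversight.
    lowered = text.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)

def answer(prompt: str) -> str:
    draft = generate(prompt)
    if needs_review(prompt) or needs_review(draft):
        verdict = input(f"Reviewer: release this output? [y/N]\n{draft}\n> ")
        if verdict.strip().lower() != "y":
            return "This request has been escalated to a human specialist."
    return draft

if __name__ == "__main__":
    print(answer("Summarize the patient's diagnosis for the insurer."))
```

A deployed version would replace the keyword check with a trained policy classifier and log every reviewer decision for accountability.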

Published

2024-10-31

How to Cite

Tejaskumar Pujari, Anshul Goel, & Ashwin Sharma. (2024). Ensuring Responsible AI: The Role of Supervised Fine-Tuning (SFT) in Upholding Integrity and Privacy Regulations. International Journal Science and Technology, 3(3). https://doi.org/10.56127/ijst.v3i3.1968
