Balancing Innovation and Privacy: A Red Teaming Approach to Evaluating Phone-Based Large Language Models under AI Privacy Regulations

Authors

  • Mangesh Pujari, Independent Researcher
  • Anil Kumar Pakina, Independent Researcher
  • Anshul Goel, Independent Researcher

DOI:

https://doi.org/10.56127/ijst.v2i3.1956

Keywords:

Large Language Models (LLMs), Red Teaming, AI Privacy, Mobile AI, Privacy Regulations, Adversarial Testing, Model Evaluation, Data Security, Innovation and Ethics, Privacy-Preserving AI

Abstract

The rapid deployment of large language models (LLMs) on mobile devices has introduced significant privacy concerns, particularly regarding data collection, user profiling, and compliance with evolving privacy and AI regulations such as the GDPR and the EU AI Act. While these on-device LLMs promise improved latency and user experience, their potential to inadvertently leak sensitive information remains understudied. This paper proposes a red teaming framework that systematically assesses the privacy risks of phone-based LLMs by simulating adversarial attacks to identify vulnerabilities in model behavior, data storage, and inference processes.

We evaluate popular mobile LLMs under scenarios such as prompt injection, side-channel exploitation, and unintended memorization, measuring their compliance with strict privacy-by-design principles. Our findings reveal critical gaps in current safeguards, including susceptibility to context-aware deanonymization and insufficient data minimization. We further discuss regulatory implications, advocating for adaptive red teaming as a mandatory evaluation step in AI governance. By integrating adversarial testing into the development lifecycle, stakeholders can preemptively align phone-based AI systems with legal and ethical privacy standards while maintaining functional utility.

Published

2023-10-30

How to Cite

Mangesh Pujari, Anil Kumar Pakina, & Anshul Goel. (2023). Balancing Innovation and Privacy: A Red Teaming Approach to Evaluating Phone-Based Large Language Models under AI Privacy Regulations. International Journal Science and Technology, 2(3), 117–127. https://doi.org/10.56127/ijst.v2i3.1956
