FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model



Hu, Jinwei, Huang, Zhenglin, Yin, Xiangyu, Ruan, Wenjie, Cheng, Guangliang, Dong, Yi ORCID: 0000-0003-3047-7777 and Huang, Xiaowei
(2025) FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model.

falcon.pdf - Author Accepted Manuscript
Access to this file is restricted: awaiting official publication and publisher embargo.
After the embargo period, this file will be available under a Creative Commons Attribution license.

Item Type: Conference Item (Unspecified)
Divisions: Faculty of Science & Engineering
Faculty of Science & Engineering > School of Computer Science & Informatics
Faculty of Science & Engineering > School of Computer Science & Informatics > Artificial Intelligence
Depositing User: Symplectic Admin
Date Deposited: 05 Dec 2025 08:44
Last Modified: 05 Dec 2025 08:47
URI: https://livrepository.liverpool.ac.uk/id/eprint/3195870