
"Contextual integrity defines privacy as the appropriateness of information flows within specific social contexts, that is, disclosing only the information strictly necessary to carry through a given task, such as booking a medical appointment. According to Microsoft's researchers, today's LLMs lack this kind of contextual awareness and can potentially disclose sensitive information, thereby undermining user trust. The first approach focuses on inference-time checks, i.e., safeguards applied when a model generates its response."
"PrivacyChecker follows a relatively simple pipeline. First, it extracts information from the user's request; next, it classifies it according to a privacy judgement; and, optionally, it injects privacy guidelines into the prompt to ensure the model knows how to handle detected sensitive information. PrivacyChecker is model-agnostic and can be used with existing models without retraining. On the static PrivacyLens benchmark, PrivacyChecker was shown to reduce information leakage from 33.06% to 8.32% on GPT4o and from 36.08% to 7.30% on DeepSeekR1,"
Beyond prompt-level checks, PrivacyChecker integrates with system prompts and tool calls and can act as a gate on external tools, which lets it fit into agentic workflows without retraining. In the PrivacyLens evaluations it reduced leakage while maintaining task completion. Complementing the inference-time approach, CI-CoT and CI-RL aim to train models to reason about contextual privacy during generation.
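The tool-gating idea can be sketched in the same spirit: before an agent forwards arguments to an external tool, only the fields judged necessary for the task are allowed through. The gated_tool_call helper, the necessary_fields argument, and the booking example are hypothetical; in a real deployment the set of necessary fields would come from the privacy judgement step rather than being hard-coded.

```python
# Hedged sketch of gating an external tool call: redact arguments the task
# does not need before the tool ever sees them. Names are illustrative only.

from typing import Any, Callable


def gated_tool_call(
    tool: Callable[..., Any],
    necessary_fields: set[str],
    **arguments: Any,
) -> Any:
    """Forward only the arguments judged necessary for the task; redact the rest."""
    safe_args = {
        name: value if name in necessary_fields else "[REDACTED]"
        for name, value in arguments.items()
    }
    return tool(**safe_args)


if __name__ == "__main__":
    def book_appointment(patient: str, date: str, note: str) -> str:
        return f"Booked {patient} on {date} ({note})"

    # The patient and date are necessary for booking; the free-text note, which
    # happens to contain an SSN, is not, so it never reaches the external tool.
    print(gated_tool_call(
        book_appointment,
        necessary_fields={"patient", "date"},
        patient="Alice",
        date="2025-03-14",
        note="SSN 123-45-6789",
    ))
```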