Mateusz Dziemian

mattmdjaga

AI & ML interests

Interested in AI safety.

Recent Activity

new activity 15 days ago

mattmdjaga/segformer_b2_clothes:test

new activity 15 days ago

mattmdjaga/segformer_b2_clothes:segformer_b2_clothes

new activity about 2 months ago

mattmdjaga/segformer_b2_clothes:segformer_b2_clothes

authored 2 papers 4 months ago

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition

Paper • 2507.20526 • Published Jul 28 • 1

Deceptive Automated Interpretability: Language Models Coordinating to Fool Oversight Systems

Paper • 2504.07831 • Published Apr 10

authored 2 papers about 1 year ago

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Paper • 2410.09024 • Published Oct 11, 2024 • 1

Applying Refusal-Vector Ablation to Llama 3.1 70B Agents

Paper • 2410.10871 • Published Oct 8, 2024 • 1