AI safety tests found to rely on 'obvious' trigger words; with easy rephrasing, models labeled 'reasonably safe' suddenly fail, with attacks succeeding up to 98% of the time. New corporate research ...
Unwind by hunting for words on the day’s themed list — they’re hidden horizontally, vertically, diagonally and backward. Come back daily for a new theme or to browse the recent archive. Jump in to ...