CMAT: A Cross-Model Adversarial Texture for Scanned Document Privacy Protection

Xiaoyu Ye1, Jingjing Yu2,*, Jungang Li1, Yiwen Zhao1, Qiutong Liu2
1Peking University, 2Capital University of Economics and Business
*Corresponding Author

Abstract

With the development of deep learning-based text detection and recognition, extracting text from smartphone-captured or scanned images has become effortless. However, this convenience gives rise to privacy concerns, especially for individuals and organizations unwilling to let their documents be read and analyzed by deep learning models.

We propose a novel solution for scanned document privacy protection. First, we propose a cross-model universal adversarial texture that mimics natural paper textures to safeguard scanned document images against multiple text detection models. Second, we curate a comprehensive dataset of scanned financial documents, including line-level and character-level text annotations for both English and Chinese documents, thereby providing diverse data for evaluating our protection method.

Experimental results indicate a significant reduction in H-mean scores of various text detection models, demonstrating the efficacy of our cross-model universal adversarial texture in preserving document privacy.

Method


Our pipeline to craft the cross-model adversarial texture.

We use Toroidal Cropping (TC) to generate and optimize adversarial textures of arbitrary size, and we leverage the Tree-structured Parzen Estimator (TPE) approach to search for optimal model weights. For more details, please refer to our paper.
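The core idea of Toroidal Cropping is that the texture is treated as a torus: a crop that runs past an edge wraps around to the opposite side, so a fixed-size optimizable texture can cover pages of any size and tiles seamlessly. The sketch below is our own minimal illustration of this wrap-around indexing (the function name and interface are assumptions, not the paper's code):

```python
import numpy as np

def toroidal_crop(texture, top, left, height, width):
    """Crop a patch from a texture with wrap-around (toroidal) boundaries.

    Any (top, left) offset is valid: indices past an edge wrap to the
    opposite side, so the texture behaves like a seamless tile.
    Works for (H, W) or (H, W, C) arrays.
    """
    H, W = texture.shape[0], texture.shape[1]
    rows = np.arange(top, top + height) % H
    cols = np.arange(left, left + width) % W
    return texture[np.ix_(rows, cols)]

# A 4x4 texture cropped across its bottom-right corner:
tex = np.arange(16).reshape(4, 4)
patch = toroidal_crop(tex, 3, 3, 2, 2)
# patch wraps around: [[15, 12], [3, 0]]
```

Because the crop is differentiable with respect to the texture values (it is pure indexing), gradients from any cropped view flow back to the same shared texture during optimization.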

Dataset


Scanned page sample with annotations in our dataset.

Our dataset, AdvFinDocument, is collected from publicly available business annual reports and is representative of sensitive financial documents.
It contains both English and Chinese scanned PDF pages, carefully labeled with character-level and line-level text detection bounding box annotations, making it a benchmark for evaluating future protection methods.

Experiment


Cross-model attack.

Results show a balanced and effective attack against three text detection models on both the benchmark dataset FUNSD and our proposed AdvFinDocument.
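The reported metric is the H-mean, the harmonic mean of detection precision and recall that is standard in text-detection benchmarks; a lower H-mean under attack means fewer text regions are correctly detected, i.e. stronger protection. A minimal sketch of the metric:

```python
def h_mean(precision, recall):
    """Harmonic mean of precision and recall (a.k.a. F1 / H-mean).

    Standard evaluation metric for text detection; under an adversarial
    texture, a lower H-mean indicates better privacy protection.
    """
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

For example, a detector with precision 0.8 and recall 0.6 on attacked pages scores an H-mean of about 0.686, versus 1.0 on a perfectly detected clean page.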

Attack ε

We explore the effect of textures with different perturbation magnitudes on PSENet text-line detection. Results show that even when the perturbation is very small, the proposed adversarial texture can still hide the majority of the text, demonstrating the effectiveness of the proposed method.
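One common way to impose such a perturbation budget is to add the (signed, rescaled) texture to the clean page and clip back to the valid pixel range, so that no pixel moves by more than ε. The blend below is our own hedged sketch of this step (the paper's exact compositing may differ):

```python
import numpy as np

def apply_texture(doc, texture, eps):
    """Overlay an adversarial texture on a document image under budget eps.

    doc and texture are float arrays in [0, 1] with the same shape.
    The texture is recentered to [-1, 1] and scaled by eps, so each pixel
    of the clean page is perturbed by at most eps before clipping.
    """
    perturbation = eps * (texture - 0.5) * 2.0   # signed, in [-eps, eps]
    return np.clip(doc + perturbation, 0.0, 1.0)
```

At ε = 0.05 a mid-gray pixel (0.5) moves to at most 0.55, which is why the attacked pages remain visually close to clean scans while still suppressing detection.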

[Figure: PSENet detection results on the same page under increasing perturbation: Clean, ε=0.05, ε=0.06, ε=0.07, ε=0.08, ε=0.09.]


Model Weight

With model weights included, the attack performance across all three models becomes more balanced, and the overall attack performance improves.
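Conceptually, the per-model attack losses are combined into a single weighted objective, and TPE searches the weight vector so that no single detector dominates the optimization. A minimal sketch of the weighted combination (names and normalization are our assumptions for illustration):

```python
def ensemble_loss(per_model_losses, weights):
    """Weighted combination of per-model attack losses.

    per_model_losses: one attack loss per target detector.
    weights: non-negative weights (e.g. proposed by a TPE search);
    normalizing by their sum keeps the objective's scale comparable
    across different weight proposals.
    """
    if len(per_model_losses) != len(weights):
        raise ValueError("one weight per model is required")
    total_weight = sum(weights)
    if total_weight == 0.0:
        raise ValueError("weights must not all be zero")
    weighted = sum(w * loss for w, loss in zip(weights, per_model_losses))
    return weighted / total_weight
```

With uniform weights this reduces to the plain average; non-uniform weights let the search up-weight a detector that the texture is currently failing to fool, which is what balances cross-model performance.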

Related Links

These excellent works have inspired us.

T-SEA: Transfer-based Self-Ensemble Attack on Object Detection. A framework for universal (cross-model and cross-instance) patch-based adversarial attacks.

Adversarial Texture for Fooling Person Detectors in the Physical World. Proposed Toroidal Cropping (TC) to generate continuous adversarial textures of arbitrary size.

BibTeX

@misc{Ye2024CMAT,
  author={Xiaoyu Ye and Jingjing Yu and Jungang Li and Yiwen Zhao and Qiutong Liu},
  title={CMAT: A Cross-Model Adversarial Texture for Scanned Document Privacy Protection},
  year={2024},
  url={https://github.com/LJungang/CMAT}
}