Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data