In the age of graph data – such as social networks, business relationship graphs, or knowledge graphs – sharing these datasets for research or application purposes is increasingly common. But what if the structure of a graph itself contains sensitive information? Even without revealing the node contents, simply disclosing the existence of edges can lead to privacy breaches.
Traditional approaches to Differential Privacy (DP) focus on protecting data during model training. In this paper, the authors go a step further: they aim to protect privacy at the moment of graph data publishing. They propose an elegant method based on Gaussian Differential Privacy (GDP) that enables learning the structure of a graph while maintaining strong privacy guarantees.
Problem and Assumptions
- We have real graph data $G$ which should not be shared in raw form.
- We want to generate a synthetic graph $\tilde{G}$ that:
  - preserves statistical properties of $G$,
  - enables model training as if on $G$,
  - satisfies differential privacy with respect to $G$.
Mathematical Background
Differential Privacy
A mechanism $M$ satisfies $(\varepsilon, \delta)$-DP if for any neighboring datasets $D, D'$ differing in one element, and any set of outputs $S$:
$$ \Pr[M(D) \in S] \leq e^{\varepsilon} \Pr[M(D') \in S] + \delta $$
For graphs, neighboring datasets are typically taken to differ in a single edge, so the guarantee protects the existence of individual relationships.
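As a concrete illustration (my example, not taken from the paper), randomized response on a single bit satisfies this definition with $\varepsilon = \ln 3$ and $\delta = 0$ when the true bit is reported with probability 3/4:

```python
import math
import random

def randomized_response(bit: int, p_keep: float = 0.75) -> int:
    """Report the true bit with probability p_keep, otherwise flip it."""
    return bit if random.random() < p_keep else 1 - bit

# Privacy check: for any output, the probability ratio between the two
# possible inputs is at most 0.75 / 0.25 = 3, i.e. epsilon = ln(3), delta = 0.
epsilon = math.log(0.75 / 0.25)
```

The same inequality-based definition applies to graph mechanisms; only the notion of "neighboring" changes.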
Gaussian Differential Privacy (GDP)
GDP expresses privacy through a hypothesis-testing lens: a mechanism $M$ is $\mu$-GDP if distinguishing $D$ from $D'$ based on the output of $M$ is at least as hard as distinguishing $\mathcal{N}(0, 1)$ from $\mathcal{N}(\mu, 1)$ from a single sample:
$$ T(M(D), M(D')) \geq G_\mu := T(\mathcal{N}(0, 1), \mathcal{N}(\mu, 1)) $$
where $T$ denotes the trade-off function between type I and type II errors. Smaller $\mu$ means stronger privacy.
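A standard way to achieve $\mu$-GDP (due to Dong, Roth, and Su) is the Gaussian mechanism: add $\mathcal{N}(0, (\Delta/\mu)^2)$ noise to a statistic with sensitivity $\Delta$. A minimal sketch, using a noisy edge count as the released statistic:

```python
import numpy as np

def gaussian_mechanism(value: float, sensitivity: float, mu: float,
                       rng: np.random.Generator) -> float:
    """Release value + N(0, (sensitivity / mu)^2) noise, which is mu-GDP."""
    sigma = sensitivity / mu
    return value + rng.normal(0.0, sigma)

# Example: release the edge count of a graph. Adding or removing one
# edge changes the count by exactly 1, so the sensitivity is 1.
rng = np.random.default_rng(0)
noisy_count = gaussian_mechanism(42.0, sensitivity=1.0, mu=0.5, rng=rng)
```

Lowering `mu` widens the noise (here $\sigma = 1/\mu$), trading accuracy for privacy.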
Parameter Estimation
$$ \ell(\theta; G) = \sum_{(i,j)} y_{ij} \log \sigma(\theta_{ij}) + (1 - y_{ij}) \log(1 - \sigma(\theta_{ij})) $$
where $y_{ij} \in \{0, 1\}$ indicates whether edge $(i,j)$ is present in $G$, and $\sigma$ is the sigmoid function mapping the logit $\theta_{ij}$ to an edge probability.
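To make the estimation step concrete, here is a minimal sketch (my simplification, not the paper's model) in which all edges share a single logit $\theta$, i.e. an Erdős–Rényi model; the likelihood above is then maximized at the logit of the observed edge density:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_likelihood(theta, y):
    """Bernoulli log-likelihood of edge indicators y under logits theta."""
    p = sigmoid(theta)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Ten possible edges, three of them present: edge density 0.3.
y = np.array([1, 0, 0, 1, 0, 0, 0, 0, 0, 1], dtype=float)
theta_hat = np.log(0.3 / 0.7)  # logit of the density maximizes the likelihood
```

Richer models give each pair $(i, j)$ its own logit $\theta_{ij}$, but the likelihood has the same form.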
Algorithm (briefly)
- Input graph $G$
- Choose probabilistic model of graph
- Estimate $\theta$ with Gaussian noise
- Generate synthetic graph $\tilde{G} \sim P_\theta$
- Publish $\tilde{G}$
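The steps above can be sketched end to end. This toy version (my illustration, not the authors' implementation) uses an Erdős–Rényi model whose single sufficient statistic, the edge count, is privatized with the Gaussian mechanism before the synthetic graph is sampled:

```python
import numpy as np

def publish_synthetic_graph(adj: np.ndarray, mu: float,
                            rng: np.random.Generator) -> np.ndarray:
    """Privately fit an Erdos-Renyi model to adj and sample a synthetic graph."""
    n = adj.shape[0]
    m = n * (n - 1) // 2                 # number of possible undirected edges
    iu = np.triu_indices(n, k=1)
    edge_count = adj[iu].sum()
    # One edge changes the count by 1 (sensitivity 1), so adding
    # N(0, (1 / mu)^2) noise makes the released count mu-GDP.
    noisy_count = edge_count + rng.normal(0.0, 1.0 / mu)
    p_hat = float(np.clip(noisy_count / m, 0.0, 1.0))
    # Sample the synthetic graph from the privately estimated model.
    edges = rng.random(m) < p_hat
    synth = np.zeros_like(adj)
    synth[iu] = edges
    return synth + synth.T
```

Everything downstream of the noisy count is post-processing, so publishing $\tilde{G}$ inherits the $\mu$-GDP guarantee of the estimate.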
Experimental Results
On citation datasets such as Cora and Citeseer:
- Statistical similarity preserved
- Trained models performed well
- Good utility retained even at low $\mu$, i.e. under strong privacy
Conclusion
The GDP-based method allows:
- protecting node relationships,
- generating realistic synthetic graphs,
- training high-performing models.
An important step toward privacy-preserving graph data sharing.
📚 Link
Based on the publication arXiv:2507.19116.