Page 191 Cite

Suggested Citation:"References." National Academies of Sciences, Engineering, and Medicine. 2024. A Roadmap for Disclosure Avoidance in the Survey of Income and Program Participation. Washington, DC: The National Academies Press. doi: 10.17226/27169.

References

Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., & Zhang, L. (2016, October). Deep learning with differential privacy. In Proceedings of the 2016 Association for Computing Machinery Special Interest Group on Security, Audit, and Control (ACM SIGSAC) Conference on Computer and Communications Security (pp. 308–318).

Abowd, J. M. (2021). Declaration of John Abowd, State of Alabama v. United States Department of Commerce, Case No. 3:21-CV-211-RAH-ECM-KCN.

Abowd, J., Ashmead, R., Cumings-Menon, R., Garfinkel, S., Heineck, M., Heiss, C., Johns, R., Kifer, D., Leclerc, P., Machanavajjhala, A., Moran, B., Sexton, W., Spence, M., & Zhuravlev, P. (2022). The 2020 Census Disclosure Avoidance System TopDown Algorithm. Harvard Data Science Review (Special Issue 2). https://doi.org/10.1162/99608f92.529e3cb9

Abowd, J. M., & Schmutte, I. M. (2019). An economic analysis of privacy protection and statistical accuracy as social choices. American Economic Review, 109(1), 171–202.

Abowd, J., Stinson, M., & Benedetto, G. (2006). Final report to the Social Security Administration on the SIPP/SSA/IRS public use file project. U.S. Census Bureau.

Abowd, J. M., & Vilhuber, L. (2008, September 24–26). How protective are synthetic data? In Proceedings of the Privacy in Statistical Databases: UNESCO Chair in Data Privacy International Conference (pp. 239–246). Springer.

An, D., & Little, R. J. (2007). Multiple imputation: An alternative to top coding for statistical disclosure control. Journal of the Royal Statistical Society: Series A (Statistics in Society), 170(4), 923–940.

Arnold, C., & Neunhoeffer, M. (2021). Really useful synthetic data: A framework to evaluate the quality of differentially private synthetic data [Conference session]. Workshop on Economics of Privacy and Data Labor, 37th International Conference on Machine Learning, Vienna, Austria, 2020. https://doi.org/10.48550/arXiv.2004.07740

Balle, B., Barthe, G., & Gaboardi, M. (2018). Privacy amplification by subsampling: Tight analyses via couplings and divergences. Advances in Neural Information Processing Systems, 31. https://papers.nips.cc/paper_files/paper/2018/file/3b5020bb891119b9f5130f1fea9bd773-Paper.pdf

Balle, B., & Wang, Y. X. (2018). Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising [Conference session]. 35th International Conference on Machine Learning. https://doi.org/10.48550/arXiv.1805.06530

Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2014). Hierarchical modelling and analysis for spatial data. CRC Press.

Barak, B., Chaudhuri, K., Dwork, C., Kale, S., McSherry, F., & Talwar, K. (2007). Privacy, accuracy, and consistency too: A holistic solution to contingency table release. In Proceedings of the 26th Association for Computing Machinery Symposium on Principles of Database Systems (pp. 273–282). https://dl.acm.org/doi/10.1145/1265530.1265569

Barrientos, A. F., Bolton, A., Balmat, T., Reiter, J. P., de Figueiredo, J. M., Machanavajjhala, A., & DeLong, M. (2018). Providing access to confidential research data through synthesis and verification: An application to data on employees of the U.S. federal government. The Annals of Applied Statistics, 12(2), 1124–1156. https://www.jstor.org/stable/26542565

Battese, G. E., Harter, R. M., & Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83(401), 28–36.

Beimel, A., Brenner, H., Kasiviswanathan, S. P., & Nissim, K. (2014). Bounds on the sample complexity for private learning and private data release. Machine Learning, 94, 401–437. https://doi.org/10.1007/s10994-013-5404-1

Bell, W. R., Basel, W. W., & Maples, J. J. (2016). An overview of the U.S. Census Bureau’s Small Area Income and Poverty Estimates Program. In M. Pratesi (Ed.), Analysis of poverty data by small area estimation (pp. 349–378). Wiley.

Bell, W. R., Datta, G. S., & Ghosh, M. (2013). Benchmarking small area estimators. Biometrika, 100(1), 189–202.

Benavent, R., & Morales, D. (2016). Multivariate Fay–Herriot models for small area estimation. Computational Statistics & Data Analysis, 94, 372–390.

Benedetto, G., Linse, K., & Parker, E. (2022, June 10). Improving disclosure avoidance procedures for the Current Population Survey public use file [Conference session]. https://apps.bea.gov/fesac/meetings/2022-06-10/Paper-Current-Population-Survey-PUF-Disclosure-Avoidance-Proposal-to%20FESAC-061022.pdf

Benedetto, G., Stanley, J. C., & Totty, E. (2018). The creation and use of the SIPP Synthetic Beta v7.0 (Working Paper). U.S. Census Bureau. https://www.census.gov/library/working-papers/2018/adrm/SIPP-Synthetic-Beta.html

Benedetto, G., Stinson, M., & Abowd, J. M. (2013). The creation and use of the SIPP Synthetic Beta (Working Paper). U.S. Census Bureau. https://census.gov/content/dam/Census/programs-surveys/sipp/methodology/SSBdescribe_nontechnical.pdf

Bennett, N., & King, M. D. (2022, April 7–9). Student debt and its co-occurrence with other types of debt [Conference session]. Annual Meeting of the Population Association of America. Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/2022/demo/sehsd-wp2022-09.pdf

Binder, D. A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51(3), 279–292.

Blum, A., Dwork, C., McSherry, F., & Nissim, K. (2005). Practical privacy: The SuLQ framework. In Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS ’05) (pp. 128–138). Association for Computing Machinery. https://doi.org/10.1145/1065167.1065184

Blum, A., Ligett, K., & Roth, A. (2013). A learning theory approach to noninteractive database privacy. Journal of the ACM, 60(2), 1–25.

Boch, S. J., Taylor, D. M., Danielson, M. L., Chisolm, D. J., & Kelleher, K. J. (2020). Home is where the health is: Housing quality and adult health outcomes in the Survey of Income and Program Participation. Preventive Medicine, 132, 105990. https://doi.org/10.1016/j.ypmed.2020.105990

Bonnéry, D., Feng, Y., Henneberger, A. K., Johnson, T. L., Lachowicz, M., Rose, B. A., Shaw, T., Stapleton, L. M., Wooley, M. E., & Zheng, Y. (2019). The promise and limitations of synthetic data as a strategy to expand access to state-level multi-agency longitudinal data. Journal of Research on Educational Effectiveness, 12(4), 616–647.

Borjas, G. J., & Hilton, L. (1996). Immigration and the welfare state: Immigrant participation in means-tested entitlement programs. The Quarterly Journal of Economics, 111(2), 575–604.

Bowen, C. M. (2022). Protecting your privacy in a data-driven world. Chapman and Hall/CRC.

Bowen, C. M., & Liu, F. (2020). Comparative study of differentially private synthesis methods. Statistical Science, 35(2), 280–307. https://doi.org/10.1214/19-STS742

Bowen, C. M., Liu, F., & Su, B. (2021). Differentially private data release via statistical election to partition sequentially. METRON, 79, 1–31.

Bowen, C. M., & Snoke, J. (2021). Comparative study of differentially private synthetic data algorithms from the NIST PSCR differential privacy synthetic data challenge. Journal of Privacy and Confidentiality, 11(1). http://dx.doi.org/10.29012/jpc.748

boyd, d., & Sarathy, J. (2022). Differential perspectives: Epistemic disconnects surrounding the U.S. Census Bureau’s use of differential privacy. Harvard Data Science Review (Special Issue 2). https://doi.org/10.1162/99608f92.66882f0e

Bradley, J. R., Holan, S. H., & Wikle, C. K. (2015). Multivariate spatio-temporal models for high-dimensional areal data with application to Longitudinal Employer-Household Dynamics. Annals of Applied Statistics, 9, 1761–1791.

Bradley, J. R., Holan, S. H., & Wikle, C. K. (2016). Multivariate spatio-temporal survey fusion with application to the American Community Survey and Local Area Unemployment Statistics. Stat, 5(1), 224–233. https://onlinelibrary.wiley.com/doi/full/10.1002/sta4.120

Bradley, J. R., Holan, S. H., & Wikle, C. K. (2020). Bayesian hierarchical models with conjugate full-conditional distributions for dependent data from the natural exponential family. Journal of the American Statistical Association, 115, 2037–2052.

Bradley, J. R., Wikle, C. K., & Holan, S. H. (2016). Bayesian spatial change of support for count-valued survey data with application to the American Community Survey. Journal of the American Statistical Association, 111(514), 472–487.

Bruckmeier, K., Müller, G., & Riphahn, R. T. (2014). Who misreports welfare receipt in surveys? Applied Economics Letters, 21(12), 812–816.

Bu, Z., Dong, J., Long, Q., & Su, W. J. (2020). Deep learning with Gaussian differential privacy. Harvard Data Science Review, 2(3). https://doi.org/10.1162/99608f92.cfc5dd25

Bun, M., Drechsler, J., Gaboardi, M., McMillan, A., & Sarathy, J. (2022). Controlling privacy loss in sampling schemes: An analysis of stratified and cluster sampling [Conference session]. 3rd Symposium on Foundations of Responsible Computing. https://doi.org/10.4230/LIPIcs.FORC.2022.1

Bun, M., Gaboardi, M., Neunhoeffer, M., & Zhang, W. (2023). Continual release of differentially private synthetic data (Working Paper). https://doi.org/10.48550/arXiv.2306.07884

Burgette, L. F., & Reiter, J. P. (2010). Multiple imputation for missing data via sequential regression trees. American Journal of Epidemiology, 172(9), 1070–1076.

Capps, R., Bachmeier, J. D., & Van Hook, J. (2018). Estimating the characteristics of unauthorized immigrants using U.S. census data: Combined sample multiple imputation. Annals of the American Academy of Political and Social Science, 677(1), 165–179. https://doi.org/10.1177/0002716218767383

Carless, W. (2020, March 11). Census launches online after last-minute software switch. The Center for Investigative Reporting. https://revealnews.org/article/census-launches-online-tomorrow-after-last-minute-software-switch/

Carlini, N., Chien, S., Nasr, M., Song, S., Terzis, A., & Tramèr, F. (2022). Membership inference attacks from first principles [Conference session]. 2022 IEEE Symposium on Security and Privacy, San Francisco. https://doi.org/10.1109/SP46214.2022.9833649

Carlini, N., Hayes, J., Nasr, M., Jagielski, M., Sehwag, V., Tramèr, F., Balle, B., Ippolito, D., & Wallace, E. (2023). Extracting training data from diffusion models. https://doi.org/10.48550/arXiv.2301.13188

Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T., Song, D., Erlingsson, U., Oprea, A., & Raffel, C. (2021). Extracting training data from large language models [Conference session]. 30th USENIX Security Symposium (USENIX Security 21). https://www.usenix.org/conference/usenixsecurity21/presentation/carlini-extracting

Carman, K. G., Liu, J., & White, C. (2020). Accounting for the burden and redistribution of health care costs: Who uses care and who pays for it. Health Services Research, 55(2), 224–231.

Carr, M. D., & Wiemers, E. E. (2020). The role of education in long-run earnings inequality and mobility. Russell Sage Foundation and the Washington Center for Equitable Growth.

Carr, M. D., Wiemers, E. E., & Moffitt, R. A. (2023, May 4–5). Using synthetic data to estimate earnings dynamics: Evidence from the SIPP GSF and SIPP SSB [Conference session]. NBER Conference on Data Privacy Protection and Applied Research: Methods, Approaches, and Their Consequences, Cambridge, MA.

Census Bureau. (2021). 2020 Survey of Income and Program Participation users’ guide. Washington, DC. https://www2.census.gov/programs-surveys/sipp/tech-documentation/methodology/2020_SIPP_Users_Guide_OCT21.pdf

Cha, Y. (2010). Reinforcing separate spheres: The effect of spousal overwork on men’s and women’s employment in dual-earner households. American Sociological Review, 75(2), 303–329.

Chan, T. H. H., Shi, E., & Song, D. (2011, November). Private and continual release of statistics. ACM Transactions on Information and System Security, 14(3). https://doi.org/10.1145/2043621.2043626

Charest, A. S. (2011). How can we analyze differentially-private synthetic datasets? Journal of Privacy and Confidentiality, 2(2).

Chaudhuri, K., Monteleoni, C., & Sarwate, A. D. (2011). Differentially private empirical risk minimization. Journal of Machine Learning Research, 12, 1069–1109. https://jmlr.org/papers/volume12/chaudhuri11a/chaudhuri11a.pdf

Clemens, J., & Wither, M. (2019). The minimum wage and the Great Recession: Evidence of effects on the employment and income trajectories of low-skilled workers. Journal of Public Economics, 170, 53–67. https://www.sciencedirect.com/science/article/abs/pii/S0047272719300052

Cohen, A. (2022). Attacks on deidentification’s defenses [Conference session]. 31st USENIX Security Symposium. The Advanced Computing Systems Association. https://www.usenix.org/system/files/sec22-cohen.pdf

Cohen, A., & Nissim, K. (2020). Linear program reconstruction in practice. Journal of Privacy and Confidentiality, 10(1). https://doi.org/10.29012/jpc.711

Commission on Evidence-Based Policymaking. (2017). The promise of evidence-based policy-making. Commission on Evidence-Based Policymaking, Washington, DC. www2.census.gov/adrm/fesac/2017-12-15/Abraham-CEP-final-report.pdf

Couzin, J. (2008, September 5). Whole-genome data not anonymous, challenging assumptions. Science, 321(5894), 1278. https://www.science.org/doi/full/10.1126/science.321.5894.1278

Creamer, J., Shrider, E., & Edwards, A. (2020). Estimated 17.8% of adults ages 25–34 lived in their parents’ household last year. U.S. Census Bureau. https://www.census.gov/library/stories/2020/09/more-young-adults-lived-with-their-parents-in-2019.html

Cressie, N., & Wikle, C. K. (2011). Statistics for spatio-temporal data. John Wiley & Sons.

Daily, D. (2022, December 14). Disclosure avoidance protections for the American Community Survey. U.S. Census Bureau. https://www.census.gov/newsroom/blogs/random-samplings/2022/12/disclosure-avoidance-protections-acs.html

Dalenius, T., & Reiss, S. P. (1982). Data-swapping: A technique for disclosure control. Journal of Statistical Planning and Inference, 6(1), 73–85. https://www.sciencedirect.com/science/article/pii/0378375882900581

de Valpine, P., Turek, D., Paciorek, C. J., Anderson-Bergman, C., Lang, D. T., & Bodik, R. (2017). Programming with models: Writing statistical algorithms for general model structures with NIMBLE. Journal of Computational and Graphical Statistics, 26(2), 403–413.

Desfontaines, D., & Pejó, B. (2020). SoK: Differential privacies. Proceedings on Privacy Enhancing Technologies, 2020(2), 288–313. https://doi.org/10.3929/ethz-b-000451916

Differential Privacy Team. (2017, December). Learning with privacy at scale. Machine Learning Research, Apple Computer. https://machinelearning.apple.com/research/learning-with-privacy-at-scale

Ding, B., Kulkarni, J., & Yekhanin, S. (2017). Collecting telemetry data privately [Conference session]. Advances in Neural Information Processing Systems 30. https://papers.nips.cc/paper_files/paper/2017/hash/253614bbac999b38b5b60cae531c4969-Abstract.html

Dinur, I., & Nissim, K. (2003). Revealing information while preserving privacy [Conference session]. 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. https://doi.org/10.1145/773153.773173

Domingo-Ferrer, J., Mateo-Sanz, J., & Torra, V. (2001, June). Comparing SDC methods for microdata on the basis of information loss and disclosure risk [Conference session]. Proceedings of ETK-NTTS 2001. Office for Official Publications of the European Communities. https://www.researchgate.net/publication/229034399_Comparing_SDC_Methods_for_Microdata_on_the_Basis_of_Information_Loss_and_Disclosure#fullTextFileContent

Dondero, M., & Altman, C. E. (2020). Immigrant policies as health policies: State immigrant policy climates and health provider visits among U.S. immigrants. SSM—Population Health, 10, 100559. https://doi.org/10.1016/j.ssmph.2020.100559

Drechsler, J., & Reiter, J. P. (2011). An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets. Computational Statistics and Data Analysis, 55(12), 3232–3243.

Duchi, J. C., Jordan, M. I., & Wainwright, M. J. (2018). Minimax optimal procedures for locally private estimation. Journal of the American Statistical Association, 113(521), 182–201.

Duncan, G. T., Elliot, M., & Salazar-González, J-J. (2011). Statistical confidentiality. Springer.

Duncan, G. T., & Lambert, D. (1986). Disclosure-limited data dissemination. Journal of the American Statistical Association, 81(393), 10–18.

———. (1989). The risk of disclosure for microdata. Journal of Business & Economic Statistics, 7(2), 207–217.

Dwork, C. (2006). Differential privacy. In M. Bugliesi, B. Preneel, V. Sassone, & I. Wegener (Eds.), Automata, languages and programming (pp. 1–12). Springer.

Dwork, C., McSherry, F., Nissim, K., & Smith, A. (2006). Calibrating noise to sensitivity in private data analysis. In S. Halevi & T. Rabin (Eds.), Theory of cryptography (pp. 265–284). Springer-Verlag. https://doi.org/10.1007/11681878_14

———. (2017). Calibrating noise to sensitivity in private data analysis. Journal of Privacy and Confidentiality, 7(3), 17–51.

Dwork, C., Naor, M., Pitassi, T., & Rothblum, G. N. (2010). Differential privacy under continual observation [Conference session]. 42nd ACM Symposium on Theory of Computing (STOC ’10). Association for Computing Machinery. https://doi.org/10.1145/1806689.1806787

Dwork, C., Smith, A., Steinke, T., & Ullman, J. (2017). Exposed! A survey of attacks on private data. Annual Review of Statistics and Its Application. https://doi.org/10.1146/annurev-statistics-060116-054123

Eggleston, J., & Reeder, L. (2018). Does encouraging record use for financial assets improve data accuracy? Evidence from administrative data. Public Opinion Quarterly, 82(4), 686–706.

El Emam, K., Buckeridge, D., Tamblyn, R., Neisa, A., Jonker, E., & Verma, A. (2011). The re-identification risk of Canadians from longitudinal demographics. BMC Medical Informatics and Decision Making, 11(46).

El Emam, K., Mosquera, L., & Bass, J. (2020). Evaluating identity disclosure risk in fully synthetic health data: Model development and validation. Journal of Medical Internet Research, 22(11), e23139.

Elamir, E., & Skinner, C. J. (2006). Record-level measures of disclosure risk for survey micro-data. Journal of Official Statistics, 22, 525–539.

Elliot, M., Manning, A. M., Mayes, K., Gurd, J., & Bane, M. (2005, November 9–11). SUDA: A program for detecting special uniques [Conference session]. Joint United Nations Economic Commission for Europe (UNECE)/European Commission Statistical Office of the European Communities (EUROSTAT). https://unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.46/2005/wp.44.e.pdf

Erlingsson, U., Pihur, V., & Korolova, A. (2014). RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response [Conference session]. 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS ’14). Association for Computing Machinery. https://doi.org/10.1145/2660267.2660348

Eugenio, E. C., & Liu, F. (2021, July). Construction of differentially private empirical distributions from a low-order marginals set through solving linear equations with ℓ₂ regularization [Conference session]. The 2021 Computing Conference. Springer International Publishing.

Fay, R. E. III, & Herriot, R. A. (1979). Estimates of income for small places: An application of James-Stein procedures to census data. Journal of the American Statistical Association, 74(366a), 269–277.

Fellegi, I. P., & Sunter, A. B. (1969). A theory for record linkage. Journal of the American Statistical Association, 64(328), 1183–1210.

Fienberg, S. E., & McIntyre, J. (2005). Data swapping: Variations on a theme by Dalenius and Reiss. Journal of Official Statistics, 21(2), 309–323. https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/data-swapping-variations-on-a-theme-by-dalenius-and-reiss.pdf

Fienberg, S. E., & Slavković, A. B. (2005). Preserving the confidentiality of categorical statistical data bases when releasing information for association rules. Data Mining and Knowledge Discovery, 11, 155–180. https://link.springer.com/article/10.1007/s10618-005-0010-x

Foote, A. D., Machanavajjhala, A., & McKinney, K. (2019). Releasing earnings distributions using differential privacy: Disclosure avoidance system for post-secondary employment outcomes (PSEO). Journal of Privacy and Confidentiality, 9(2). https://doi.org/10.29012/jpc.722

Foulds, J., Geumlek, J., Welling, M., & Chaudhuri, K. (2016). On the theory and practice of privacy-preserving Bayesian data analysis [Conference session]. 32nd Conference on Uncertainty in Artificial Intelligence, Association for Uncertainty in Artificial Intelligence. https://doi.org/10.48550/arXiv.1603.07294

Francis, P. (2022). A note on the misinterpretation of the US Census re-identification attack. In J. Domingo-Ferrer & M. Laurent (Eds.), Privacy in statistical databases (pp. 299–311). Springer, Cham. https://doi.org/10.1007/978-3-031-13945-1_21

Fry, R., Passel, J. S., & Cohn, D. (2020, September 4). A majority of young adults in the U.S. live with their parents for the first time since the Great Depression. Pew Research Center. https://www.pewresearch.org/short-reads/2020/09/04/a-majority-of-young-adults-in-the-u-s-live-with-their-parents-for-the-first-time-since-the-great-depression/

Garfinkel, S., Abowd, J., & Martindale, C. (2019). Understanding database reconstruction attacks on public data. Communications of the ACM, 62(3), 46–53. https://doi.org/10.1145/3287287

Gelman, A., & Little, T. C. (1997). Post-stratification into many categories using hierarchical logistic regression. Survey Methodology, 23, 127.

Gibson-Davis, C. (2011). Mothers but not wives: The increasing lag between nonmarital births and marriage. Journal of Marriage and Family, 73(1), 264–278.

Gittleman, M., Klee, M. A., & Kleiner, M. M. (2018). Analyzing the labor market outcomes of occupational licensing. Industrial Relations, 57(1), 57–100. https://onlinelibrary.wiley.com/doi/10.1111/irel.12200

Gong, R., & Meng, X-L. (2020). Congenial differential privacy under mandated disclosure [Conference session]. 2020 ACM-IMS Foundations of Data Science Conference.

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27. https://proceedings.neurips.cc/paper_files/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf

Gumber, C., & Sullivan, B. (2022, July). Occupation, earnings, and job characteristics. Current Population Reports. U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/publications/2022/demo/p70-178.pdf

Hall, M., & Greenman, E. (2015). The occupational cost of being illegal in the United States: Legal status, job hazards, and compensating differentials. International Migration Review, 49(2), 406–442.

Hall, R., Rinaldo, A., & Wasserman, L. (2012). Differential privacy for functions and functional data. Journal of Machine Learning Research, 14(1). https://dl.acm.org/doi/pdf/10.5555/2567709.2502603

Hardt, M., Ligett, K., & McSherry, F. (2012). A simple and practical algorithm for differentially private data release. In F. Pereira, C. J. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems, 25.

Hardt, M., & Rothblum, G. N. (2010). A multiplicative weights mechanism for privacy-preserving data analysis [Conference session]. 2010 IEEE 51st Annual Symposium on Foundations of Computer Science. https://doi.org/10.1109/FOCS.2010.85

Hawes, M. (2021, May 7). Understanding the 2020 Census Disclosure Avoidance System: Simulated reconstruction-abetted re-identification attack on the 2010 Census [Webinar]. U.S. Census Bureau. https://www.census.gov/data/academy/webinars/2021/disclosure-avoidance-series/simulated-reconstruction-abetted-re-identification-attack-on-the-2010-census.html

Hay, M., Rastogi, V., Miklau, G., & Suciu, D. (2009). Boosting the accuracy of differentially-private histograms through consistency. Proceedings of the VLDB Endowment, 3(1). https://doi.org/10.48550/arXiv.0904.0942

Heikkilä, M., Jälkö, J., Dikmen, O., & Honkela, A. (2019). Differentially private Markov chain Monte Carlo. Advances in Neural Information Processing Systems, 32, 4113–4123.

Hidiroglou, M. A., & You, Y. (2016). Comparison of unit level and area level small area estimators. Survey Methodology, 42(1), 41–61.

Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.

Homer, N., Szelinger, S., Redman, M., Duggan, D., Tembe, W., Muehling, J., Pearson, J. V., Stephan, D. A., Nelson, S. F., & Craig, D. W. (2008). Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics, 4(8), e1000167. https://doi.org/10.1371/journal.pgen.1000167

Hotz, V. J., Bollinger, C. R., Komarova, T., Manski, C. F., Moffitt, R. A., Nekipelov, D., Sojourner, A., & Spencer, B. D. (2022). Balancing data privacy and usability in the federal statistical system. Proceedings of the National Academy of Sciences, 119(31), e2104906119.

Hryshko, D., John, C., & McCue, K. (2017). Trends in earnings inequality and earnings instability among U.S. couples: How important is assortative matching? Labour Economics, 48, 168–182. https://doi.org/10.1016/j.labeco.2017.08.006

Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Schulte Nordholt, E., Spicer, K., & de Wolf, P. D. (2012). Statistical disclosure control. Wiley.

Irvin, C., & Czajka, J. (2010). Simulation of Medicaid and SCHIP eligibility: Implications of findings from 10 states—Final report. Mathematica Policy Research, Inc. https://aspe.hhs.gov/reports/simulation-medicaid-schip-eligibility-implications-findings-10-states-final-report

Jain, P., Raskhodnikova, S., Sivakumar, S., & Smith, A. (2022). The price of differential privacy under continual observation. International Conference on Machine Learning. https://doi.org/10.48550/arXiv.2112.00828

Jälkö, J., Dikmen, O., & Honkela, A. (2016). Differentially private variational inference for non-conjugate models. https://doi.org/10.48550/arXiv.1610.08749

Jälkö, J., Prediger, L., Honkela, A., & Kaski, S. (2022). DPVIm: Differentially private variational inference improved. https://doi.org/10.48550/arXiv.2210.15961

Janicki, R., Holan, S. H., Irimata, K. M., Livsey, J., & Raim, A. (2023). Spatial change of support models for differentially private decennial census counts of persons by detailed race and ethnicity. Journal of Statistical Theory and Practice, 17(2), 31.

Janicki, R., Raim, A. M., Holan, S. H., & Maples, J. J. (2022). Bayesian nonparametric multivariate spatial mixture mixed effects models with application to American Community Survey special tabulations. The Annals of Applied Statistics, 16(1), 144–168.

Jarmin, R. S. (2021). Disclosure avoidance for the 2020 Census: An introduction. U.S. Census Bureau. https://www2.census.gov/library/publications/decennial/2020/2020-census-disclosure-avoidance-handbook.pdf

Jordon, J., Yoon, J., & Van Der Schaar, M. (2019, May 6–9). PATE-GAN: Generating synthetic data with differential privacy guarantees [Conference session]. International Conference on Learning Representations, New Orleans.

Kamath, G., & Ullman, J. (2020). A primer on private statistics. https://doi.org/10.48550/arXiv.2005.00010

Karr, A. F., Kohnen, C. N., Oganian, A., Reiter, J. P., & Sanil, A. P. (2006). A framework for evaluating the utility of data altered to protect confidentiality. The American Statistician, 60(3), 224–232.

Karwa, V., Krivitsky, P. N., & Slavković, A. B. (2017). Sharing social network data: Differentially private estimation of exponential family random-graph models. Journal of the Royal Statistical Society, Series C (Applied Statistics), 481–500.

Kasiviswanathan, S. P., Lee, H. K., Nissim, K., Raskhodnikova, S., & Smith, A. (2011). What can we learn privately? SIAM Journal on Computing, 40(3). https://doi.org/10.1137/090756090

Kifer, D., Abowd, J. M., Ashmead, R., Cumings-Menon, R., Leclerc, P., Machanavajjhala, A., Sexton, W., & Zhuravlev, P. (2022). Bayesian and frequentist semantics for common variations of differential privacy: Applications to the 2020 Census (Working Paper No. CED-WP-2022-004). U.S. Census Bureau.

Kim, C., Tamborini, C. R., & Sakamoto, A. (2018). The sources of life chances: Does education, class category, occupation, or short-term earnings predict 20-year long-term earnings? Sociological Science, 5, 206–233. https://sociologicalscience.com/articles-v5-9-206/

King, M., & Giefer, K. G. (2021, June 30). Most children receiving SNAP get at least one other social safety net benefit. U.S. Census Bureau. https://www.census.gov/library/stories/2021/06/most-children-receiving-snap-get-at-least-one-other-social-safety-net-benefit.html

Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. https://doi.org/10.48550/arXiv.1312.6114

Kinney, S. K., Reiter, J. P., & Miranda, J. (2014). SynLBD 2.0: Improving the Synthetic Longitudinal Business Database. Statistical Journal of the IAOS, 30(2), 129–135.

Klein, M., Moura, R., & Sinha, B. (2019). Multivariate normal inference based on singly imputed synthetic data under plug-in sampling (Statistics #2019-06). Research Report Series, U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/2019/adrm/RRS2019-06.pdf

Klein, M., & Sinha, B. (2015). Inference for singly imputed synthetic data based on posterior predictive sampling under multivariate normal and multiple linear regression models. Sankhya B, 77, 293–311. https://doi.org/10.1007/s13571-015-0100-8

Krenzke, T., Li, J., & Li, L. (2014). An evaluation of the impact of missing data on disclosure risk measures [Conference session]. Survey Research Methods Section of the American Statistical Association. American Statistical Association. http://www.asasrms.org/Proceedings/y2014/files/311082_86754.pdf

Lambert, D. (1993). Measures of disclosure risk and harm. Journal of Official Statistics, 9, 313–331.

Lee, J., Kim, M., Jeong, Y., & Ro, Y. (2022, June). Differentially private normalizing flows for synthetic tabular data generation. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), 7345–7353.

Leftin, J., Smith, J., Cunnyngham, K., & Trippe, C. (2014, February 28). Creation of the 2011 MATH SIPP+ Microsimulation Model and Database (Working Paper). https://www.mathematica.org/publications/creation-of-the-2011-math-sipp-microsimulation-model-and-database

Li, B., Chen, C., Liu, H., & Carin, L. (2019, April 16–18). On connecting stochastic gradient MCMC and differential privacy. Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 89, 557–566.

Li, J., Li, L., Krenzke, T., & Chang, W. Y. (2021, August 8). An approach to estimate the reidentification risk in longitudinal survey microdata [Virtual conference session]. Joint Statistical Meetings 2021. American Statistical Association.

Little, R. J. (1993). Statistical analysis of masked data. Journal of Official Statistics, 9(2), 407–426.

Little, R. J., Liu, F., & Raghunathan, T. E. (2004). Statistical disclosure techniques based on multiple imputation. In A. Gelman & X. Meng (Eds.), Applied Bayesian modelling and causal inference from incomplete-data perspectives (pp. 141–152). Wiley. http://dx.doi.org/10.1002/0470090456.ch13

Liu, F. (2022). Model-based differentially private data synthesis and statistical inference in multiply synthetic differentially private data. Transactions on Data Privacy, 15(3), 141–175.

Liu, F., & Zhao, X. (2023). Disclosure risk from homogeneity attack in differentially privately sanitized frequency distribution. IEEE Transactions on Dependable and Secure Computing, 20(5), 3927–3939. https://doi.org/10.1109/TDSC.2022.3220592

Lohr, S. L., & Raghunathan, T. E. (2017). Combining survey data with other data sources. Statistical Science, 32(2), 293–312.

Machanavajjhala, A., Kifer, D., Abowd, J., Gehrke, J., & Vilhuber, L. (2008). Privacy: Theory meets practice on the map [Conference session]. 2008 IEEE 24th International Conference on Data Engineering. IEEE. https://doi.org/10.1109/ICDE.2008.4497436

Malec, D., Davis, W. W., & Cao, X. (1999). Model-based small area estimates of overweight prevalence using sample selection adjustment. Statistics in Medicine, 18(23), 3189–3200.

Maples, J. J. (2017). Improving small area estimates of disability: Combining the American Community Survey with the Survey of Income and Program Participation. Journal of the Royal Statistical Society (Series A: Statistics in Society), 180(4), 1211–1227.

McClure, D., & Reiter, J. P. (2012). Differential privacy and statistical disclosure risk measures: An investigation with binary synthetic data. Transactions on Data Privacy, 5(3), 535–552.

McKenna, L. (2019, April). A history of the Survey of Income and Program Participation and disclosure avoidance. U.S. Census Bureau. https://www2.census.gov/adrm/CED/Papers/CY19/2019-04-McKenna-sipp%20and%20da.pdf

McKenna, R., Miklau, G., & Sheldon, D. (2021). Winning the NIST contest: A scalable and general approach to differentially private data. Journal of Privacy and Confidentiality, 11(3). https://journalprivacyconfidentiality.org/index.php/jpc/article/view/778/727

McKernan, S.-M., Ratcliffe, C., & Braga, B. (2021). The effect of the U.S. safety net on material hardship over two decades. Journal of Public Economics, 197, 104403. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8153365/

Medalia, C., Meyer, B. D., O’Hara, A. B., & Wu, D. (2019). Linking survey and administrative data to measure income, inequality, and mobility. International Journal of Population Data Science, 4(1). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8142965/

Meng, X. L. (1994). Multiple-imputation inferences with uncongenial sources of input. Statistical Science, 9(4), 538–558.

Messing, S., DeGregorio, C., Hillenbrand, B., King, G., Mahanti, S., Mukerjee, Z., Nayak, C., Persily, N., State, B., & Wilkins, A. (2020). Facebook privacy-protected full URLs data set. Harvard Dataverse, Social Science One, V10. https://doi.org/10.7910/DVN/TDOAPG

Meyer, B. D., Wu, D., Mooers, V., & Medalia, C. (2021). The use and misuse of income data and the rarity of extreme poverty in the United States. Journal of Labor Economics, 39(S1). https://www.russellsage.org/sites/default/files/use%20and%20misuse%20of%20income%20data.pdf

Mohadjer, L., Rao, J. N. K., Liu, B., Krenzke, T., & Van De Kerckhove, W. (2012). Hierarchical Bayes small area estimates of adult literacy using unmatched sampling and linking models. Journal of the Indian Society of Agricultural Statistics, 66(1), 55–64.

Monte, L. M. (2017, March). Multiple partner fertility research brief. Current Population Reports. U.S. Census Bureau. https://www.census.gov/content/dam/census/library/publications/2017/demo/p70br-146.pdf

Nandy, S., Holan, S. H., & Schweinberger, M. (2023). A socio-demographic latent space approach to spatial data when geography is important but not all-important. https://doi.org/10.48550/arXiv.2304.03331

Narayanan, A., & Felten, E. W. (2014). No silver bullet: De-identification still doesn’t work. http://randomwalker.info/publications/no-silver-bullet-de-identification.pdf

Narayanan, A., & Shmatikov, V. (2008). Robust de-anonymization of large sparse datasets [Conference session]. 2008 IEEE Symposium on Security and Privacy. IEEE. https://doi.org/10.1109/SP.2008.33

Narayanan, A., & Shmatikov, V. (2010, June). Myths and fallacies of “Personally Identifiable Information.” Communications of the Association for Computing Machinery, 53(6), 24–26. https://doi.org/10.1145/1743546.1743558

National Academies of Sciences, Engineering, and Medicine. (2018). The 2014 redesign of the Survey of Income and Program Participation: An assessment. The National Academies Press. https://doi.org/10.17226/24864

National Research Council. (1989). The Survey of Income and Program Participation: An interim assessment. The National Academies Press.

———. (1993). The future of the Survey of Income and Program Participation. The National Academies Press. https://doi.org/10.17226/2072

———. (2009). Reengineering the Survey of Income and Program Participation. The National Academies Press. https://doi.org/10.17226/12715

Near, J. P., & He, X. (2021). Differential privacy for databases. Foundations and Trends in Databases, 11(2), 109–225. http://dx.doi.org/10.1561/1900000066

Nissenbaum, H. (2010). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

Nixon, M. P., Barrientos, A. F., Reiter, J. P., & Slavković, A. (2022). A latent class modeling approach for generating synthetic data and making posterior inferences from differentially private counts. Journal of Privacy and Confidentiality, 12(1). https://doi.org/10.29012/jpc.768

Nowok, B., Raab, G. M., & Dibben, C. (2016). Synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software, 74(11), 1–26.

Office of Management and Budget. (2013, May 9). Open data policy—Managing information as an asset [Memorandum No. M-13-13]. https://rosap.ntl.bts.gov/view/dot/34954

———. (2019a, April 24). Improving implementation of the Information Quality Act [Memorandum No. M-19-15]. https://www.whitehouse.gov/wp-content/uploads/2019/04/M-19-15.pdf

———. (2019b, July 10). Phase 1 implementation of the Foundations for Evidence-Based Policymaking Act of 2018: Learning agendas, personnel, and planning guidance [Memorandum No. M-19-23]. https://www.whitehouse.gov/wp-content/uploads/2019/07/m-19-23.pdf

Papamakarios, G., Nalisnick, E., Rezende, D. J., Mohamed, S., & Lakshminarayanan, B. (2021). Normalizing flows for probabilistic modelling and inference. The Journal of Machine Learning Research, 22(1), 2617–2680.

Park, D. K., Gelman, A., & Bafumi, J. (2006). State-level opinions from national surveys: Poststratification using multilevel logistic regression. In J. E. Cohen (Ed.), Public opinion in state politics (pp. 209–228). Stanford University Press.

Park, M., Foulds, J., Chaudhuri, K., & Welling, M. (2020). Variational Bayes in private settings (VIPS). Journal of Artificial Intelligence Research, 68, 109–157.

Parker, P. A., Holan, S. H., & Janicki, R. (2020). Conjugate Bayesian unit-level modeling of count data under informative sampling designs. Stat, 9(1), e267. https://onlinelibrary.wiley.com/doi/abs/10.1002/sta4.267

Parker, P. A., Holan, S. H., & Janicki, R. (2022). Computationally efficient Bayesian unit-level models for non-Gaussian data under informative sampling with application to estimation of health insurance coverage. The Annals of Applied Statistics, 16(2), 887–904.

Parker, P. A., Holan, S. H., & Janicki, R. (2023a). Conjugate modeling approaches for small area estimation with heteroscedastic structure. Journal of Survey Statistics and Methodology, smad002, 1–20. https://doi.org/10.1093/jssam/smad002

Parker, P. A., Janicki, R., & Holan, S. H. (2023b). A comprehensive overview of unit-level modeling of survey data for small area estimation under informative sampling. Journal of Survey Statistics and Methodology, 11(4), 829–857. https://doi.org/10.1093/jssam/smad020

Pilkauskas, N. V., & Cross, C. (2018). Beyond the nuclear family: Trends in children living in shared households. Demography, 55(6), 2283–2297. https://doi.org/10.1007/s13524-018-0719-y

Pistner, M., Slavković, A., & Vilhuber, L. (2018). Synthetic data via quantile regression for heavy-tailed and heteroskedastic data. In J. Domingo-Ferrer & F. Montes (Eds.), Privacy in statistical databases. Springer-Verlag. https://doi.org/10.1007/978-3-319-99771-1_17

Polson, N. G., Scott, J. G., & Windle, J. (2013). Bayesian inference for logistic models using Pólya–Gamma latent variables. Journal of the American Statistical Association, 108(504), 1339–1349.

Porter, A. T., Wikle, C. K., & Holan, S. H. (2015). Small area estimation via multivariate Fay–Herriot models with latent spatial dependence. Australian & New Zealand Journal of Statistics, 57(1), 15–29.

Potok, N., & Hart, N. (2022). A Blueprint for Implementing the National Secure Data Service: Initial Governance and Administrative Priorities for the National Science Foundation. Data Foundation.

Privacy Preserving Techniques Task Team. (2023). UN Handbook on Privacy-Preserving Computation Techniques. BigData UN Global Working Group. https://unstats.un.org/bigdata/task-teams/privacy/UN%20Handbook%20for%20Privacy-Preserving%20Techniques.pdf

Qardaji, W., Yang, W., & Li, N. (2014, June). PriView: Practical differentially private release of marginal contingency tables. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (Association for Computing Machinery), 1435–1446. https://doi.org/10.1145/2588555.2588575

Quick, H., Holan, S. H., Wikle, C. K., & Reiter, J. P. (2015). Bayesian marked point process modelling for generating fully synthetic public use data with point-referenced geography. Spatial Statistics, 14, 439–451.

Quintana, D. S. (2020). A synthetic dataset primer for the biobehavioural sciences to promote reproducibility and hypothesis generation. eLife, 9, e53275.

Raab, G. M., Nowok, B., & Dibben, C. (2018). Practical data synthesis for large samples. Journal of Privacy and Confidentiality, 7(3), 67–97. https://doi.org/10.29012/jpc.v7i3.407

Raghunathan, T. E. (2016). Missing data analysis in practice. CRC Interdisciplinary Statistics Series. Chapman & Hall.

Raghunathan, T. E. (2021). Synthetic data. Annual Review of Statistics and Its Application, 8, 129–140.

Raghunathan, T. E., Lepkowski, J. M., Van Hoewyk, J., & Solenberger, P. (2001). A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27(1), 85–96.

Raghunathan, T. E., Reiter, J. P., & Rubin, D. B. (2003). Multiple imputation for statistical disclosure limitation. Journal of Official Statistics, 19(1), 1.

Raghunathan, T. E., Solenberger, P., Berglund, P., & Van Hoewyk, J. (2016). IVEware: Imputation and variance estimation software (V0.3). Survey Methodology Program, Survey Research Center, Institute for Social Research, University of Michigan. https://src.isr.umich.edu/wp-content/uploads/iveware_manual_revised.pdf

Rao, J. N., & Molina, I. (2015). Small area estimation. John Wiley & Sons.

Rein, D. B., Zhang, P., Wirth, K. E., Lee, P. P., Hoerger, T. J., McCall, N., Klein, R., Tielsch, J. M., Vijan, S., & Saaddine, J. (2006). The economic burden of major adult visual disorders in the United States. Archives of Ophthalmology, 124(12), 1754–1760.

Reiter, J. P. (2003). Inference for partially synthetic, public use microdata sets. Survey Methodology, 29, 181–188. http://www2.stat.duke.edu/~jerry/Papers/sm03.pdf

———. (2005a). Releasing multiply-imputed, synthetic public use microdata: An illustration and empirical study. Journal of the Royal Statistical Society, Series A, 168, 185–205.

———. (2005b). Using CART to generate partially synthetic public use microdata. Journal of Official Statistics, 21(3), 441.

———. (2012). Statistical approaches to protecting confidentiality for microdata and their effects on the quality of statistical inferences. Public Opinion Quarterly, 76(1), 163–181.

———. (2021). Assessing uncertainty when using linked administrative records. In A. Y. Chun, M. D. Larsen, G. Durrant, & J. P. Reiter (Eds.), Administrative records for survey methodology (pp. 139–153). Wiley. https://doi.org/10.1002/9781119272076.ch6

———. (2023). Synthetic data: A look back and a look forward. Transactions on Data Privacy, 16(1), 15–24.

Reiter, J. P., & Mitra, R. (2009). Estimating risks of identification disclosure in partially synthetic data. Journal of Privacy and Confidentiality, 1(1), 99–110.

Rezende, D., & Mohamed, S. (2015). Variational inference with normalizing flows [Conference session]. The 32nd International Conference on Machine Learning.

Ribar, D. C. (2005). Transitions from welfare and the employment prospects of low-skill workers. Southern Economic Journal, 71(3), 514–533. https://onlinelibrary.wiley.com/doi/abs/10.1002/j.2325-8012.2005.tb00655.x

Rinott, Y., O’Keefe, C., Shlomo, N., & Skinner, C. (2018). Confidentiality and differential privacy in the dissemination of frequency tables. Statistical Science, 33(3), 358–385.

Rocher, L., Hendrickx, J. M., & De Montjoye, Y. (2019). Estimating the success of re-identifications in incomplete datasets using generative models. Nature Communications, 10(1), 3069.

Rubin, D. B. (1993). Discussion: Statistical disclosure limitation. Journal of Official Statistics, 9(2), 461–468.

Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319–392.

Ruggles, S., Flood, S., Sobek, M., Brockman, D., Cooper, G., Richards, S., & Schouweiler, M. (2023). IPUMS USA: Version 13.0 [dataset]. IPUMS. https://doi.org/10.18128/D010.V13.0

Ruggles, S., & Magnuson, D. (2020, June). Census technology, politics, and institutional change, 1790–2020. Journal of American History, 107(1), 19–51. https://doi.org/10.1093/jahist/jaaa007

Sabia, J. J., & Nielsen, R. B. (2015). Minimum wages, poverty, and material hardship: New evidence from the SIPP. Review of Economics of the Household, 13(1), 95–134.

Sakshaug, J. W., & Raghunathan, T. E. (2010). Synthetic data for small area estimation [Conference session]. International Conference on Privacy in Statistical Databases.

Savitsky, T. D., & Toth, D. (2016). Bayesian estimation under informative sampling. Electronic Journal of Statistics, 10(1), 1677–1708.

Scherer, Z. (2022). Leave usage following a first birth among men in the United States: Evidence from new nationally representative data (Working Paper No. 2022-05). U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/2022/demo/sehsd-wp2022-05.pdf

Scherer, Z., & Mayol-Garcia, Y. (2020, November). Transitions in parental presence among children: 2017. Current Population Reports. U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/publications/2020/demo/p70-169.pdf

Scherpf, E., Newman, C., & Prell, M. (2015). Improving the assessment of SNAP targeting using administrative records (Economic Research Report No. 186). U.S. Department of Agriculture, Economic Research Service. https://www.ers.usda.gov/publications/pub-details/?pubid=45371

Scott, D. W. (2015). Multivariate density estimation: Theory, practice, and visualization. John Wiley & Sons.

Seeman, J., Slavković, A., & Reimherr, M. (2020). Private posterior inference consistent with public information: A case study in small area estimation from synthetic Census data. In J. Domingo-Ferrer & K. Muralidhar (Eds.), Privacy in statistical databases V12276. Springer. https://link.springer.com/chapter/10.1007/978-3-030-57521-2_23

Sei, Y., Okumura, H., Takenouchi, T., & Ohsuga, A. (2017). Anonymization of sensitive quasi-identifiers for l-diversity and t-closeness. IEEE Transactions on Dependable and Secure Computing, 16(4), 580–593.

Shaefer, H. L., & Edin, K. (2013). Rising extreme poverty in the United States and the response of federal means-tested transfer programs. Social Service Review, 87(2), 250–268.

Sharma, M., Hutchinson, M., Swaroop, S., Honkela, A., & Turner, R. E. (2019). Differentially private federated variational inference [Conference session]. 33rd Conference on Neural Information Processing Systems (NeurIPS). https://doi.org/10.48550/arXiv.1911.10563

Shlomo, N. (2010). Releasing microdata: Disclosure risk estimation, data masking and assessing utility. Journal of Privacy and Confidentiality, 2(1), 73–91.

Shlomo, N. (2020). Integrating differential privacy in the statistical disclosure control tool-kit for synthetic data production. In J. Domingo-Ferrer & K. Muralidhar (Eds.), Privacy in statistical databases V12276. Springer. https://link.springer.com/chapter/10.1007/978-3-030-57521-2_23

Shlomo, N., & De Waal, T. (2008). Protection of micro-data subject to edit constraints against statistical disclosure. Journal of Official Statistics, 24(2), 1–26.

Shlomo, N., Krenzke, T., & Li, J. (2019). Confidentiality protection approaches for survey weighted frequency tables. Transactions on Data Privacy, 12(3), 145–168.

Shlomo, N., & Skinner, C. (2022). Measuring risk of re-identification in microdata: State-of-the art and new directions paper. Journal of the Royal Statistical Society, Series A: Statistics in Society, 185(4), 1644–1662. http://dx.doi.org/10.1111/rssa.12902

Shlomo, N., & Young, C. (2008). Invariant post-tabular protection of census frequency counts. In J. Domingo-Ferrer & Y. Saygin (Eds.), Privacy in statistical databases (pp. 77–89). Springer.

Shokri, R., Stronati, M., Song, C., & Shmatikov, V. (2017). Membership inference attacks against machine learning models [Conference session]. 2017 IEEE Symposium on Security and Privacy, San Jose, CA. https://doi.org/10.1109/SP.2017.41

Si, Y., Pillai, N. S., & Gelman, A. (2015). Bayesian nonparametric weighted sampling inference. Bayesian Analysis, 10(3), 605–625. https://doi.org/10.1214/14-BA924

Skinner, C. J. (1989). Domain means, regression and multivariate analysis. In C. J. Skinner, D. Holt, & T. M. F. Smith (Eds.), Analysis of complex surveys (pp. 80–84). Wiley.

Skinner, C. J., & Elliot, M. J. (2002). A measure of disclosure risk for microdata. Journal of the Royal Statistical Society, Series B, 64, 855–867.

Skinner, C. J., & Holmes, D. (1998). Estimating the re-identification risk per record in microdata. Journal of Official Statistics, 14, 361–372.

Skinner, C. J., & Shlomo, N. (2008). Assessing identification risk in survey micro-data using log linear models. Journal of the American Statistical Association, 103(483), 989–1001.

Slavković, A., & Seeman, J. (2023). Statistical data privacy: A song of privacy and utility. Annual Review of Statistics and Its Application, 10, 189–218. https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-033121-112921

Slavković, A. B., & Fienberg, S. E. (2004). Bounds for cell entries in two-way tables given conditional relative frequencies. In J. Domingo-Ferrer & V. Torra (Eds.), Privacy in statistical databases, V3050 (pp. 30–43). Springer.

Smith, K., Williams, A. R., & Mudrazija, S. (2021, October 21). Modeling Income in the Near Term (Urban Institute Research Report). Urban Institute. https://www.urban.org/sites/default/files/publication/104958/modeling-income-in-the-near-term.pdf

Snoke, J., Raab, G. M., Nowok, B., Dibben, C., & Slavković, A. (2018). General and specific utility measures for synthetic data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 181(3), 663–688.

Snoke, J., & Slavković, A. (2018). pMSE Mechanism: Differentially private synthetic data with maximal distributional similarity. https://www.semanticscholar.org/reader/3433dd5f43ace6f53f96f6a76af901b56d9a5ce2

Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 32. https://proceedings.neurips.cc/paper_files/paper/2019/file/3001ef257407d5a371a96dcd947c7d93-Paper.pdf

Stan Development Team. (2023). RStan: the R interface to Stan. R package version 2.21.8. https://mc-stan.org/

Stanley, J., & Totty, E. (2021). A penny synthesized is a penny earned? An exploratory analysis of accuracy in the SIPP Synthetic Beta. U.S. Census Bureau. https://www2.census.gov/adrm/CED/Papers/CY21/2021-006-StanleyTotty.pdf

Stanley, J., & Totty, E. (2023, May 4–5). A penny synthesized is a penny earned? An exploratory analysis of accuracy in the SIPP Synthetic Beta [Conference session]. NBER Conference on Data Privacy Protection and Applied Research: Methods, Approaches, and Their Consequences, Cambridge, MA. https://conference.nber.org/conf_papers/f175215.pdf

Stykes, J. B., & Guzzo, K. B. (2019). Multiple-partner fertility: Variation across measurement approaches. In R. Schoen (Ed.), Analytical family demography (pp. 215–239). Springer, Cham. https://doi.org/10.1007/978-3-319-93227-9_10

Su, B., Wang, Y., Schiavazzi, D. E., & Liu, F. (2023). Differentially private normalizing flows for density estimation, data synthesis, and variational inference with application to electronic health records. https://doi.org/10.48550/arXiv.2302.05787

Sun, A., Parker, P. A., & Holan, S. H. (2022). Analysis of household pulse survey public-use microdata via unit-level models for informative sampling. Stats, 5(1), 139–153.

Takahashi, T., Takagi, S., Ono, H., & Komatsu, T. (2020). Differentially private variational autoencoders with term-wise gradient aggregation. https://doi.org/10.48550/arXiv.2006.11204

Tamborini, C. R., & Villarreal, A. (2021). Immigrants’ employment stability over the Great Recession and its aftermath. Demography, 58(5), 1867–1895.

Tao, Y., McKenna, R., Hay, M., Machanavajjhala, A., & Miklau, G. (2022). Benchmarking differentially private synthetic data generation algorithms [Conference session]. 3rd AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-22), Association for the Advancement of Artificial Intelligence. https://doi.org/10.48550/arXiv.2112.09238

Taylor, S., MacDonald, G., Ueyama, K., & Bowen, C. M. (2021). A privacy-preserving validation server prototype [Technical paper]. Urban Institute.

Tersine, A. G. (2022, September 7). Source and accuracy statement for the Survey of Income and Program Participation calendar year 2021 data collection public use files (S&A-26) [Memorandum]. U.S. Census Bureau. https://www2.census.gov/programs-surveys/sipp/tech-documentation/source-accuracy-statements/2021/sipp-2021-SA-07-SEP22.pdf

Thompson, D., & King, M. D. (2022, February). Income sources of older households: 2017. Current Population Reports. U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/publications/2022/demo/p70br-177.pdf

Thompson, G., Broadfoot, S., & Elazar, D. L. (2013, October 28–30). Methodology for the automatic confidentialisation of statistical outputs from remote servers at the Australian Bureau of Statistics [Conference session]. Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality, Ottawa, Canada. https://unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.46/2013/Topic_1_ABS.pdf

Tournier, A. J., & de Montjoye, Y. (2022). Expanding the attack surface: Robust profiling attacks threaten the privacy of sparse behavioral data. Science Advances, 8(3). https://doi.org/10.1126/sciadv.abl6464

Tran, T., Reimherr, M., & Slavković, A. (2023). Differentially private synthetic heavy-tailed data. https://doi.org/10.48550/arXiv.2309.02416

Tudor, C., Cornish, G., & Spicer, K. (2014). Intruder testing on the 2011 UK Census: Providing practical evidence for disclosure protection. Journal of Privacy and Confidentiality, 5(2). https://doi.org/10.29012/jpc.v5i2.632

Ullman, J., & Vadhan, S. (2020). PCPs and the hardness of generating synthetic data. Journal of Cryptology, 33(4), 2078–2112.

Van den Burg, G. J. J., & Williams, C. K. I. (2021). On memorization in probabilistic deep generative models [Conference session]. 35th Conference on Neural Information Processing Systems (NeurIPS 2021). https://proceedings.neurips.cc/paper_files/paper/2021/file/eae15aabaa768ae4a5993a8a4f4fa6e4-Paper.pdf

Van Hook, J., & Glick, J. E. (2007). Immigration and living arrangements: Moving beyond economic need versus acculturation. Demography, 44(2), 225–249.

Vandendijck, Y., Faes, C., Kirby, R. S., Lawson, A., & Hens, N. (2016). Model-based inference for small area estimation with sampling weights. Spatial Statistics, 18, 455–473.

Vaughan, D. R., Haley, B., & Dajani, A. (2021). Ten years later: Self-sufficiency of welfare mothers before the Great Recession. Poverty & Public Policy, 13(3), 1–40. https://onlinelibrary.wiley.com/doi/abs/10.1002/pop4.308

Vietri, G., Archambeau, C., Aydore, S., Brown, W., Kearns, M., Roth, A., Siva, A., Tang, S., & Wu, S. Z. (2022). Private synthetic data for multitask learning and marginal queries. Proceedings, Advances in Neural Information Processing Systems, 35. https://proceedings.neurips.cc//paper_files/paper/2022/hash/7428310c0f97f1c6bb2ef1be99c1ec2a-Abstract-Conference.html

Vilhuber, L. (2019, August 13–16). Utility of two synthetic data sets mediated through a validation server: Experience with the Cornell Synthetic Data Server [Conference session]. Conference on Current Trends in Survey Statistics 2019 at the Institute for Mathematical Sciences, National University of Singapore, Singapore. https://ecommons.cornell.edu/items/c86c85db-3496-4fde-aafb-40f1963e06d9

Vu, D., & Slavković, A. (2009). Differential privacy for clinical trial data: Preliminary evaluations [Conference session]. The 2009 IEEE International Conference on Data Mining Workshops, ICDMW ’09, Washington, DC. IEEE Computer Society.

Waites, C., & Cummings, R. (2021, July). Differentially private normalizing flows for privacy-preserving density estimation. Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 1000–1009. https://dl.acm.org/doi/10.1145/3461702.3462625

Wang, H., & Reiter, J. P. (2012). Multiple imputation for sharing precise geographies in public use data. The Annals of Applied Statistics, 6(1), 229.

Wang, J. C., Holan, S. H., Nandram, B., Barboza, W., Toto, C., & Anderson, E. (2012). A Bayesian approach to estimating agricultural yield based on multiple repeated surveys. Journal of Agricultural, Biological, and Environmental Statistics, 17, 84–106.

Wang, Y. X., Fienberg, S., & Smola, A. (2015, June). Privacy for free: Posterior sampling and stochastic gradient Monte Carlo. International Conference on Machine Learning, 2493–2502.

Warren, L., & Tettenhorst, A. (2022, September). Poverty dynamics: 2017–2019. Current Population Reports. U.S. Census Bureau. https://www.census.gov/content/dam/Census/library/publications/2022/demo/p70br-179.pdf

Wasserman, L., & Zhou, S. (2010). A statistical framework for differential privacy. Journal of the American Statistical Association, 105(489), 375–389.

Williams, A. R., Snoke, J., Bowen, C. M., & Barrientos, A. F. (2023, May 4–5). Disclosing economists’ privacy perspectives: A survey of American Economic Association members on differential privacy and data fitness for use standards [Conference session]. Data Privacy Protection and the Conduct of Applied Research: Methods, Approaches and Their Consequences, Cambridge, MA. National Bureau of Economic Research. https://conference.nber.org/conf_papers/f178417.pdf

Wilson, R. J., Zhang, C. Y., Lam, W., Desfontaines, D., Simmons-Marengo, D., & Gipson, B. (2020). Differentially private SQL with bounded user contribution. Proceedings on Privacy Enhancing Technologies, 2020(2), 230–250. https://doi.org/10.2478/popets-2020-0025

Winkler, W. E. (2006). Overview of record linkage and current research directions. Statistical Research Division Report. U.S. Census Bureau. http://www.census.gov/srd/papers/pdf/rrs2006-02.pdf

Wong, K. S., Tu, N. A., Bui, D. M., Ooi, S. Y., & Kim, M. H. (2019). Privacy-preserving collaborative data anonymization with sensitive quasi-identifiers [Conference session]. 2019 12th CMI Conference on Cybersecurity and Privacy (CMI), Copenhagen, Denmark.

Woo, M. J., Reiter, J. P., Oganian, A., & Karr, A. F. (2009). Global measures of data utility for microdata masked for disclosure limitation. Journal of Privacy and Confidentiality, 1(1).

Xie, L., Lin, K., Wang, S., Wang, F., & Zhou, J. (2018). Differentially private generative adversarial network. https://doi.org/10.48550/arXiv.1802.06739

Xu, J., Ren, X., Lin, J., & Sun, X. (2018). DP-GAN: Diversity-promoting generative adversarial network for generating informative and diversified text [Conference session]. 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium. Association for Computational Linguistics. https://doi.org/10.48550/arXiv.1802.01345

Yancey, W. E., Winkler, W. E., & Creecy, R. H. (2002). Disclosure risk assessment in perturbative micro-data protection. In J. Domingo-Ferrer (Ed.), Inference control in statistical databases (pp. 135–151). Springer.

Yıldırım, S., & Ermis, B. (2019). Exact MCMC with differentially private moves. Statistics and Computing, 29(5), 947–963.

Yousefpour, A., Shilov, I., Sablayrolles, A., Testuggine, D., Prasad, K., Malek, M., Nguyen, J., Ghosh, S., Bharadwaj, A., Zhao, J., Cormode, G., & Mironov, I. (2021, December 6–14). Opacus: User-friendly differential privacy library in PyTorch [Conference session]. NeurIPS 2021. Neural Information Processing Systems. https://doi.org/10.48550/arXiv.2109.12298

Yu, M., He, Y., & Raghunathan, T. E. (2022). A semiparametric multiple imputation approach to fully synthetic data for complex surveys. Journal of Survey Statistics and Methodology, 10(3), 618–641. https://doi.org/10.1093/jssam/smac016

Zhang, J., Cormode, G., Procopiuc, C. M., Srivastava, D., & Xiao, X. (2017). PrivBayes: Private data release via Bayesian networks. ACM Transactions on Database Systems (TODS), 42(4), 1–41.

Zhang, Z., Yan, C., & Malin, B. A. (2022). Membership inference attacks against synthetic health data. Journal of Biomedical Informatics, 125, 103977.
