2020 Census: Citizenship, Science, Politics, and Privacy



Good morning. We're going to go ahead and get started. We're waiting for Senator Peters, but he's on his way, so he'll arrive during these first two comments. We want to get going because we have a busy schedule.

Thank you all for coming. I'm David Lam, director of the Institute for Social Research. It's my great pleasure to welcome you all to this conference on the 2020 Census: Citizenship, Science, Politics, and Privacy. Welcome also to those who are watching the live stream. I want everybody to know that we are live-streaming and recording this, and it could possibly be published at a later date, so be aware of that. This conference is co-sponsored by the Institute for Social Research and the Gerald R. Ford School of Public Policy.

We've got an excellent group of speakers this morning to talk about many aspects of the 2020 census. Senator Peters, as I mentioned, will be arriving shortly, and I'll introduce him a bit later. To begin our program, we're honored to have a keynote presentation from Al Fontenot, who is the Associate Director for Decennial Census Programs at the U.S. Census Bureau, which is an enormous undertaking at this stage of the census cycle. He came to the Census Bureau after retiring from a successful 40-year private-sector career, culminating in almost 20 years as president and CEO of several manufacturing companies. I was enjoying chatting with him about his history: after having a whole career, he decided to help with the 2010 census, just as a little side thing, temporarily. He has now done virtually everything you can do in the census, including serving as assistant regional director of the Los Angeles region, regional director of the Chicago region, chief of the field division, and assistant director of both the field and decennial directorates prior
to his becoming the Associate Director for Decennial Census Programs, which is an enormous job. It's very impressive, and I think quite unusual, that he combines this long history in the private sector with all of these different positions in the field work of both the American Community Survey and the decennial census. So we're looking forward to hearing his perspectives on how things are going for the 2020 census. Let's welcome Al Fontenot.

Thank you very much. Good morning, everyone, on this beautiful rainy morning here in Ann Arbor. It's my pleasure to be here today to present an overview of the census. I'm going to be highlighting the innovations we've implemented and the processes and programs that we have established to ensure that we conduct a complete and accurate 2020 census.

Before we get started, just a brief reminder. You all know this, but we remind everyone in every presentation that the purpose of the census is to conduct the enumeration of the population and housing and disseminate the results to the President, the states, and the American people, as mandated in Article I, Section 2 of the U.S. Constitution. The primary purpose is the apportionment of representation among the states for the House of Representatives, but there are many other important uses of census data, including drawing congressional and state legislative districts, school districts, and voting precincts; distributing over $675 billion annually to states and local communities; and informing the decisions that governments and private-sector businesses make every day.

As we think about the census, it's a very complex undertaking at best, and in this environment it's even more difficult, with a series of challenges for the 2020 census. Response rates for surveys and censuses worldwide are declining. People are more concerned about sharing
information, and trust in government is also declining. Households are more complex, diverse, and dynamic: blended families may include two primary residences for the children, and other households may include more than one family or multiple generations. The United States continues to be a highly mobile nation, as about 15% of the population moves in any given year. More and more households speak a language other than English as the primary language in the household. The current low unemployment levels, while they are good for the nation, make it very challenging for us to hire the 400,000 temporary census workers that we will need to complete our operation. And in 2020 we'll be competing for attention with both the Olympics and the presidential election.

This slide provides a snapshot of how we conduct the census. I like to provide this overview before getting into the innovations that we have developed and tested throughout this decade. Starting in the lower left corner, establishing where to count is how we start: by building the address list, what we refer to as the Master Address File. This includes every address in the country and forms the basis for the census. Then we motivate people to respond. This is one of the places where partners are very important to us. People need to know how important their response is and that their census response is safe. We're counting on our partners to help us get that message out. Self-response will be easier than ever this census: we're going to be providing three modes and multiple languages in which people can respond. As in prior censuses, we also make sure that we're counting and enumerating people who are living or staying in non-traditional housing, which in some cases we call group quarters, and we'll also provide an opportunity for people experiencing homelessness to be included in the census. Nevertheless, some people
will not respond, so we follow up with them by knocking on their door, an average of about six times if necessary. Some will take a lot more; some will respond on the first knock. Our partners are helpful in this phase as well, because they encourage people to cooperate with our enumerators when they come to their residence. And finally, we tabulate and release the census data, first to the President and then to the country.

Now let's dive in a bit so I can tell you how we're doing things differently from previous censuses. The Census Bureau is effectively moving the decennial census into the 21st century by leveraging existing technologies to automate multiple operations and increase efficiencies. This means moving beyond the antiquated paper-and-pencil approach of 2010 to a modernized, tech-driven approach to achieve a complete and accurate census. This modernization effort applies to multiple aspects of the decennial census, in what we call our four key innovation areas. First, re-engineering address canvassing: that's changing the way we build our address list. Second, optimizing self-response: making it quick, easy, and safe for everyone to self-respond. Third, utilizing administrative records and third-party data to reduce respondent burden: that's using information that people have already given the government to reduce the need to go and ask them those same questions over and over again. And finally, re-engineering field operations: that's using technology to be more efficient in everything we do to manage those 400,000 people and collect the data that we will be collecting through non-response follow-up operations and during the address canvassing operation.

When we say establishing where to count, we mean identifying all the addresses where people do or could live. We'll conduct a 100 percent review and update of the nation's address list in preparation for the 2020 census. We'll be building on the use of the handheld devices that we pioneered in the 2010 census, and we'll use
tablets and laptops to verify addresses in the field. By using geospatial imagery to review the nation's addresses, we'll be able to verify approximately 70% of the addresses in our data center instead of verifying all of them by walking every street in the nation. In 2010 we walked 11.2 million census blocks to try to verify our address list; we'll be able to save 70 percent of that activity through geospatial imagery. We'll use multiple data sources to identify areas with address changes, such as the Postal Service's Delivery Sequence File and the local government input that we get through LUCA, our Local Update of Census Addresses program. I just wanted to mention that that program is so far in its response phase. We've already gone out and solicited information, and we have received responses and updates from state and local governments representing 95 percent of the housing in the United States. So governments take advantage of LUCA to make sure that their address list is reflected in our Master Address File.

During this phase we'll also delineate the types of enumeration areas, including those designated as what we call Update Leave, where our census staff deliver a complete questionnaire packet, including an invitation to respond online, telephone numbers to reach our call centers, and a paper questionnaire, to housing units in neighborhoods and communities that do not have city-style addresses, and also to areas affected by natural disasters. For example, we have designated the entire island of Puerto Rico as Update Leave. Areas affected by fires, hurricanes, and floods, such as California for fires and the Gulf Coast of Florida for hurricane impact, will also be handled on an Update Leave basis, because they do not have reliable postal delivery and they do not have city-style addresses still intact in their street map. So what we're going to do is an enumerator will go out and leave a packet and then, using the electronic device, get a map spot for where
that location is on a longitude and latitude basis. That will help us improve our address list of where people are. And then we're going to leverage workload models and technology to efficiently manage and route on-the-ground staff assignments for in-field address canvassing and for non-response follow-up.

When we say motivate people to respond, we're talking about two primary areas. First, our nationwide communications and partnership campaign. We are building on the success of using paid advertising in prior censuses, and we're going to focus our communications about the 2020 census using advanced modeling techniques to increase awareness and self-response in those areas that are most likely to be unaware of the benefits that census data provide for their local community. In addition to our traditional print and broadcast media advertising, for the first time we'll be able to use focused segmentation to reach specific audiences using digital media and digital advertising.

Second, but in many ways more important, we motivate people to respond to the census by providing a safe and secure environment and systems into which they can submit their personal information, and we let them know that the information they give to the census is safe and confidential. Let me start with what's behind that statement. When we say your data is safe with the Census Bureau, we ensure that respondent data is safe through the Census Bureau's culture of data stewardship. It's a comprehensive framework designed to protect information over the course of its lifecycle: from collection on encrypted, secure devices, where data are secured while at rest on the instrument; to transmission, where data are encrypted immediately; to storage inside the census firewall; and through the entire processing of the information. And finally, through the 2020 Disclosure Avoidance System, we assure that all data provided on census forms remain
confidential in all census publications. Our census design balances data security with user experience, but make no mistake: data security is and continues to be first. We must contain cyber threats as soon as they are detected to protect respondents' data. Then we focus on user experience, to sustain response service through any potential cyber issues so that respondents may continue to respond. The Census Bureau is working closely with experts from the federal cybersecurity community, led by the Department of Homeland Security, and with top private industry partners to test, review, and validate our internal and external cybersecurity systems, design, and processes. Communicating what we do to ensure respondents' data are secure and confidential is a vital part of motivating response. People need to know their responses are not just confidential but also secure. This is why we say responding to the census is important, easy, and safe. This is just one of the messages we'll be consistently disseminating through our Integrated Partnership and Communications program.

Let's look at our Integrated Partnership and Communications program. It consists of many elements, as you can see from the slide. It encompasses both the National and Community Partnership Programs, the integrated communications contract, and the enterprise communications work done by the Census Communications Directorate. The program is designed to design, produce, implement, and monitor an integrated program for the 2020 census. Our contract supports the 2020 mission and provides us with tools and resources to ensure a complete and accurate census in 2020. Young & Rubicam (Y&R) is our communications contractor this year, and they bring extensive world-class marketing and communications expertise. They bring team leadership, strategy development, dynamic creative development and execution, operational systems, and financial stewardship to the table, making this contract a key component of a successful census. The goal of the
Integrated Partnership and Communications program is to engage and motivate people to self-respond, preferably via the internet, and to raise awareness and encourage response throughout the entire 2020 census process, including non-response follow-up. This operation communicates the importance of participating in the 2020 census by supporting field recruitment efforts to get us a diverse and qualified census workforce; engaging and motivating people to self-respond, preferably via the internet; raising and keeping awareness high throughout the entire 2020 census; and effectively supporting dissemination of census data to stakeholders and the public at the completion of the census. It's comprised of many components, but the major components include website and mobile, media relations, paid advertising, Statistics in Schools, and social media.

The mission of the National Partnership Program is to establish relationships with organizations and business entities that have a national reach and scope, to get their engagement in leveraging trusted, familiar voices to increase response to the census, and to develop sustaining and transformational engagement. That means thinking of creative ideas that we, the Census Bureau, may not have thought of, but that they, in their corporate marketing and development programs, have thought of as creative ways to reach people. We're utilizing that to transform some of our communication.

The Community Partnership Program is one that most of us are familiar with. It focuses primarily on motivating diverse communities at the local level toward greater participation in the 2020 census, and on mobilizing community leaders to be engaged with their constituents and with the Census Bureau, so that their constituents understand that the community center across the street, the roads in front of the house, and the schools in the neighborhood are all supported because of the data that those constituents provide to the Census Bureau. The census is about the
people, not about the government. We also ask our partners to reach out to populations with historically low response rates, because we know those are more challenging to count.

So how is this done? Three ways. We educate people about the 2020 census; we encourage community partners to motivate people to respond; and thirdly, we engage grassroots organizations to reach out to their constituents, who we know have often been more difficult to reach. We'll be hiring 1,500 partnership specialists from local communities to work directly with local governments, community organizations, church groups, schools, and libraries to develop an effective local movement, customized with personalized messaging and encouraging self-response in the language and in the context that people are familiar with in their local communities.

We're leveraging trusted voices throughout all elements of the partnership program. As you can see on the left side of the slide, we have a number of organizations and groups that are key elements of the partnership program, from state complete count commissions through faith-based organizations, to higher education and census on campus, to LGBTQ outreach. We are looking to utilize these organizations as effectively as possible to communicate the census message.

The state complete count commission is a new concept for 2020. These state complete count commissions are established by either state legislative action or gubernatorial action, and they are a central and critical component of the partnership program. The higher level of commitment and resources provided by states enhances the efforts to conduct the census and get a complete count. Just as a point of information, the state of California has already committed over 100 million dollars for 2019 and 2020 to its state complete count commission's efforts to push the census in the state. As a state, they
realized that they could have seats at risk and substantial federal funding at risk, and California has made the calculation that this is well worth the investment for them at this time. One of the advantages is that state officials do understand the benefits and impact of the 2020 census, and they promote the critical importance of participation at all levels of government within the state. Our regional staff continue to identify officials at the county, municipal, and community levels to organize complete count committees there, to increase awareness about the census and motivate residents to respond. Complete count committees at the local level will energize local communities and provide more financial and manpower resources for us to implement an effective 2020 census. We're at the early stages of complete count committee organization: 38 states have already committed to forming state complete count committees or commissions, and over 500 local organizations have already established local complete count committees.

Previous censuses asked the public to respond primarily by mail. We mailed you a questionnaire, you filled it out, and you mailed it back. For the first time, the internet will be the primary response option, making it easy for people to respond anytime and anywhere. While we are encouraging people to respond online, we will provide options for people to respond by telephone or paper questionnaire in addition to the online option. The use and adoption of the internet by all sectors of our society let us design an easy-to-use internet option that can be used on home computers, at public libraries, and in other public forums. It's designed for all those who use mobile technology, whether on their laptops, their tablets, or their smartphones. It enables you to respond wherever you are and whenever you want, with or without your census ID: the maximum flexibility to respond online. As we were developing this plan to use a variety of
technologies to modernize the census, we asked ourselves: what about people who are impacted by the computer literacy gap or are on the other side of the digital divide? We don't want to leave out people who don't have access to Wi-Fi or don't have electronic devices. What about people who are not comfortable using technology, or who just don't want to put their personal information online? What about people whose primary language is not English?

As we went through this drill of questioning, we developed Census Questionnaire Assistance centers. These are call centers that people can call toll-free; the number will be included in all of our mailings and in our literature. Number one, they can have a human answer the phone and provide them assistance on how to fill out their form, including online. Number two, they can get answers to any other questions they may have about the census. And number three, they can actually give the interview right over the telephone.

We'll also be sending a paper questionnaire in the first of our six mailings, our invitation to respond to the census, to areas that do not have strong internet connectivity and that have households that data tell us are less likely to use the internet. On the fourth mailing, which happens about three weeks later, anyone who has not responded to the census will receive a paper questionnaire. So people can respond however they choose to do so.

One of our goals in conducting the 2020 census is to generate the largest possible self-response, thus reducing the number of households requiring expensive follow-up. We'll be motivating everyone by encouraging self-response through micro-targeted advertising, tailored contact strategies, and multiple mailings.

As I mentioned previously, what about people whose primary language is not English? Our language program this year is far stronger than it was 10 years ago:
we'll be providing an internet response option in 12 languages in addition to English, and providing video instructions and guides in 59 additional languages. The agents in our call centers will be fluent and able to take responses in the 13 languages in which we will offer internet response, and we'll also have a TDD telephone communication device for the deaf.

We have another type of enumeration, which we call group quarters enumeration. Group quarters are places where people live or stay in a group living arrangement. These places are owned or managed by an entity or organization providing housing and/or services for the residents. The services may include custodial or medical care as well as other types of assistance, and residency is commonly restricted to those receiving those services. This is not your typical household-type living arrangement, so people living in group quarters are usually not related to each other. Examples of group quarters include nursing facilities, residential treatment centers, college, university, or seminary student housing, correctional facilities, inpatient hospice facilities, Job Corps centers, and vocational training facilities that have residential capabilities. We also use service-based enumeration as part of this operation to count people who are experiencing homelessness, by going to soup kitchens, shelters, outdoor locations, and encampments to attempt to count those people in our census.

For the first time, we're going to be using smartphone technology to capture responses in the field. We'll also take advantage of automation to effectively and efficiently manage and route our on-the-ground field staff. Area census offices will rely on automation to collect and manage payroll information (all these things were done on paper in 2010), to more efficiently do case assignment (done on paper in 2010), and to automatically determine the optimal travel routing. In 2010, enumerators figured it out on their own, and we drove millions of miles more than perhaps was necessary. We worked with the United Parcel
Service and their routing software to figure out how to route people in the most efficient manner. That allows us to reduce our number of physical offices. We've just finished our 2018 census test, which we ran in Providence, Rhode Island. It was very successful. It was a test to determine if our systems could work together and if people could work using those systems, and we found it to be extremely effective. One of the things the test showed was a 30% increase in productivity with these technological field enhancements. Some of that was achieved through shorter routings, and some was achieved through the devices and other things that helped, but we were able to achieve that, and we're very excited about what that provides us.

I want to take just a moment to touch on the use of data that people have already provided the federal government, what we call administrative records. We'll be using data we get from the U.S. Postal Service, the IRS, the Social Security Administration, and other agencies to remove vacant and non-existent addresses from the non-response follow-up workload, and in some cases we will use records to enumerate households that do not respond to the census. But I want to stress and emphasize that we will only be using those where we have very high confidence that the data for a particular household are accurate and complete. If not, as is generally the case in virtually all of our traditionally undercounted populations, we will always send someone to the door to conduct the interview. Using administrative records does enable us to identify millions of vacant and non-existent housing units, and it reduces the need and cost of multiple visits to knock on doors to identify and verify those units. We'll verify those statuses by sending mailings and receiving multiple undeliverable-address returns from the U.S. Postal Service coded vacant, and then we'll check by having one person go out and verify that
those units are actually vacant or non-existent. In 2010, vacant and non-existent housing units accounted for 14.4 million housing units of our non-response follow-up effort, so we will have significant savings with this approach.

And finally, consistent with prior censuses, we'll deliver the apportionment counts to the President and the redistricting counts to the states. We will build upon efforts in recent decades to give the public greater access to the data. We'll provide flexible tools allowing the public to view 2020 census data any way they want. Improvements will include visualizations, easier search functionality, and improved access to data tables and data sets.

I'd like to thank you for your time. This concludes my presentation, which I hope you found informative. Feel free to contact us through any of the multiple ways listed here, or I can personally be contacted at AE Fontenot at census.gov. I will make one statement: I have not mentioned the citizenship question, on advice of counsel, since we are in litigation at the moment. The only thing I am able to say on that is that the Department of Commerce website states that the Secretary, acting under his authority in Title 13, chose to add the question to the 2020 census. Thank you.

All right, thank you very much. That was an excellent introduction to get our discussion going this morning. I think it's always impressive to see just what it takes to pull off the decennial census, with lots of interesting things going on, so this will be an ongoing topic for the morning's discussion. Now it's my great pleasure to introduce U.S.
Senator Gary Peters, representing the state of Michigan. Senator Peters, in addition to the many other things that keep him busy, has been a great champion for science. The last time I saw him was here on campus, when he was receiving the Champion of Science Award from the Science Coalition, which is a coalition of universities including the University of Michigan. We had an event up on North Campus in engineering, where Dean Alec Gallimore and Wayne State vice president for community affairs Patrick Lindsay presented the award on behalf of the Science Coalition. That's just one of the many recognitions of Senator Peters's contributions to science. He's also been a great champion of the social sciences in particular, and of data, which of course is very important and relevant to our discussion this morning. Earlier this year he and some other members of the Senate introduced a bill called the 2020 Census Improving Data and Enhanced Accuracy Act, known as the 2020 Census IDEA Act. The Act would prohibit last-minute changes or additions to the census without proper research, study, and testing, and would ensure that information and questions that have not been submitted to Congress are not included in the census. So it's my great pleasure to introduce Senator Peters to talk more about efforts in Congress. Please join me in thanking him for all the work he does for science, research, data, and the state of Michigan. Senator Peters.

Well, thank you, David. Thank you for that introduction, and thanks to the University of Michigan for kicking off this very important symposium. I'm so pleased that the Institute for Social Research and the Ford School of Public Policy are convening this conversation. Without question, it is absolutely critical that we have a comprehensive and accurate census. But despite the importance of the census, many folks don't really appreciate exactly how important this exercise is. First, it deals a great deal with the form of government we have in
terms of representation. As an elected federal official, I can tell you that's incredibly important when it comes to redistricting and to the establishment of our legislative branches. As a member of the Senate, it doesn't matter as much, given that I represent a state and not a district, but I served in the House of Representatives for a number of years, and I can tell you that it makes an incredible difference when it comes to that body. Our founders laid the groundwork for American democracy on the vision that the U.S. House would basically be the place where every individual gets one vote. We know that isn't necessarily the case in the Senate, where you have small states that have two senators and one House member, and places like California that have a whole lot of House members and still just two senators. But James Madison described the House of Representatives as the legislative body with "an immediate dependence on, and an intimate sympathy with, the people." That's the goal, and it's important to have districts of equal size and to make sure that we're counting every individual.

Before our founders put the census in such an important role, censuses had been used to assess taxes, to confiscate property, and to draft citizens into the military. So the idea of the census wasn't new to the founders, but the concept of turning the census into a tool for democracy truly was revolutionary at the time. And while the Constitution is explicit about the census's role in democracy, the survey's ramifications, as we heard earlier, stretch well beyond representation in Congress. Information produced by the census is a lifeline for communities throughout our state here in Michigan and across the nation, and it impacts everybody's lives in very direct ways. It determines what communities get in terms of federal funding for highway projects, for police and fire department personnel, for Medicare and Medicaid for seniors, children, and working families who rely on quality
care, and for programs like Head Start that provide education for our children. But government programs, as all of you know, are not the only ones that rely on census data. Inaccurate counts will also impact the decisions of private industry, which uses the census to decide where to locate offices and where to recruit people, and the list goes on. And there are even less tangible but equally important implications of census data. For example, since individuals of Middle Eastern and North African descent do not have a designated category in the census, they are not eligible for protection under Section 203 of the Voting Rights Act, which ensures, among other things, the availability of foreign-language ballots. Researchers also use census data to monitor everything from health disparities among specific populations to employment discrimination. This is important research information for us to create a more equitable society.

The census is, without question, and I know I'm preaching to the choir here, a source of valuable information. But I think the importance of this symposium today, and of what we're going to be hearing in the hours ahead, is to recognize that the risks and challenges the Census Bureau now faces, less than two years away from 2020, are significant. The security and integrity of the census collection tools and resulting data are paramount to a successful survey. But since 2010, the risks and threats to our cyber infrastructure have grown exponentially across both the private sector and the federal government. Whether it's cyber criminals accessing troves of customer data from Target and Equifax, or foreign governments like China stealing personal information of federal employees, the threats are certainly very real and the risks are very great. And while the Census Bureau is certainly making progress in this preparedness, as we heard from the previous speaker, security vulnerabilities have not been
mitigated sufficiently, in my mind. I am a member of the Senate Homeland Security and Governmental Affairs Committee and the Senate Commerce Committee; both committees have jurisdiction over the census as well as jurisdiction over cybersecurity in the country. As a member of these committees, I've joined my colleagues to exert some influence over the planning and execution of the 2020 census. I recently joined seven of my colleagues in requesting that the Government Accountability Office assess the IT security and readiness of census systems. Without question, a breach of Americans' personal information would be absolutely disastrous, but it could also have a lasting impact on the integrity of the census for years to come by sowing distrust and discouraging response. Earlier this year, the GAO identified over 3,000 security vulnerabilities, and unless the Bureau can address them in a timely manner, the reality is that this critical survey remains vulnerable to those who would do us harm. As GAO continues to monitor the security challenges surrounding the census, I'm going to be working with both Republicans and Democrats in Congress to help make sure that Americans' personal information is indeed secure. Maintaining the integrity of the census is paramount, and we must resist (and this is critical) efforts to use it as a political tool. Like many people, I'm very concerned by Secretary Ross's abrupt decision and continued insistence on adding an untested question about citizenship that will likely impact the accuracy and the cost of the survey. Not only will Census Bureau employees have to travel to communities for in-person follow-up, a significant investment of time and taxpayer money, but adding the citizenship question in such a troubling way, I think, serves to spread the mistrust that we need to try to mitigate. As many of you know, six lawsuits have been filed to prevent the citizenship question from being added to the 2020 census. The
largest of these lawsuits is New York versus the Commerce Department. Documents filed by the Commerce Department in this case provide compelling evidence that the stated reason for adding the citizenship question, which was that the Justice Department needs the data to properly enforce the Voting Rights Act, was merely a cover for what appear to be purely political motivations. The fact that the White House is fighting to delay the trial, knowing that initial testing and preparations for the census are well underway, is unsettling, to say the least, and their continued attempts to permanently block lower-court orders allowing Secretary Ross to be questioned illustrate their fear that the truth may come out and they will be shown to have acted in bad faith. When this is juxtaposed with the Bureau's failure to include a Middle Eastern or North African, or MENA, designation despite years of intensive research and community engagement, I'm disappointed by this inconsistent use of trusted survey methods. Even minor changes to the census forms could have a significant impact on response rates, undermining data accuracy and increasing the overall cost as well. Typically, even small proposed changes to the census go through extensive research and testing, consistent with statistical analysis provisions and principles, in order to account for unintended consequences. In March, I joined my colleagues in introducing legislation that will protect the accuracy of the census and ensure any proposed changes to the count are properly studied, researched, and tested. Pretty radical concept, I understand. Among other provisions, the bill prevents the Secretary of Commerce from implementing major operational changes that have not been researched and tested for at least three years prior to the census. With the census at such a critical juncture, I have been very vocal in pushing the President to quickly nominate a new Census Bureau director. It's extremely important that a nonpartisan
entity like the Census Bureau have a nonpartisan, experienced, and scientifically qualified public servant at the helm. After leaving the position vacant for a year, President Trump nominated Steven Dillingham to be the Census Bureau director. I had the opportunity to meet with Mr. Dillingham in my office a few weeks ago and discuss what I think should be the priorities he should pursue if confirmed, and I called on him to take action to earn public trust in the census, even in this current political environment that is, let's say, filled with discord and uncertainty. I urged him to recognize which actions have been taken, or what decisions have been made, that may have contributed to a lack of trust in the census. That means being honest about the potential impacts of the citizenship question and trying to educate the public on how they can separate fact from fiction, a very big topic in today's political environment. Everyone should feel confident that the information the census collects as part of this critical effort will not be misused, mishandled, or manipulated in any way. We must all recognize our role in educating the public to contribute to a successful survey, which is central to the economic and social well-being of our country, and that's why what you are all discussing today is so important. I hope that these discussions are informative and productive, and I look forward to hearing what comes out of this very important symposium. So thank you for what you're doing. It is now my honor to welcome Jeff Morenoff, who is the director of the Population Studies Center and who will moderate this morning's panel on citizenship and politics. Thank you, everybody.

Thank you so much, Senator Peters, and happy Halloween, everyone. I think you've given us a few reasons to be potentially scared about the future of the census: security questions, which we'll deal with on the second panel, but also the issue of citizenship, which we're going to deal with on this panel. So actually, at this time, I'd
like to have the panelists come up and take their seats, and I'm going to give lengthier introductions of each, one by one, as they approach the podium. This is a really great panel that we have for you this morning on citizenship and politics. It consists of three people whom I count among my closest friends in academia and one whom I have not met yet. The first will be Barbara Anderson, whom I'll introduce in a second, followed by Jim House, Angela Ocampo, and then Kurt Metzger. Let me just say a few words about Barbara. Barbara is the Ronald A. Freedman Collegiate Professor of Sociology and Population Studies at the University of Michigan. Her research and teaching center on the relationship between social change and demographic change, as well as the area of technical demography. One of the reasons that Barbara is on this panel, in addition to her lengthy career and expertise in this area, is that she was recently the chair of the Census Bureau's scientific advisory council, or panel, and recently left that position. I'm going to leave it to her to explain the terms under which she left, but, hint, it has to do with the citizenship question. So take it away, Barbara.

Thank you. I'm sorry, I don't see the Anderson one. Okay, it was underneath something. Oh, there it is. Hi, thank you so much. As Jeff was saying, I was a member of the Census Scientific Advisory Committee, which is a congressionally mandated committee whose purpose is to give the best advice possible to the director of the Census Bureau. I was a member for seven years and chair for three years, and after being led to believe I'd be reappointed, I was not, and my term finished August 15th. When I was a member, I was officially a special federal government employee and was under some restrictions about what I could say in public. Now I'm not on the committee, so now I can say anything I want. As Al was saying, if you're a regular
Census Bureau employee, you can't say anything about policy, because it's nonpartisan. Secretary Ross announced the citizenship question on Tuesday, March 27th, and our spring meeting convened on Thursday, March 29th. After each meeting we submit an official list of comments and recommendations to the Census Bureau director, which we did, and in March about half of our comments and questions related to the citizenship question. If we look at these summarized concerns, essentially all of which were discussed in our official comments, I think that really everybody on the committee would agree with what I'm saying here; this was not contentious within the committee. It's likely to depress census response, especially among immigrants and undocumented persons. It's likely to discourage cooperation with the census by potential community partner organizations. These community partner organizations, which Al talked about, are very important. For the 2000 census, the INS, which has now been replaced by Homeland Security, suspended all raids for the months around the census to encourage response. In 2010 the raids kept on, but they also set up a program of cooperation with the community partners, and for 2010 about a quarter of a million community partners were enlisted, and it was extremely successful. From what we know, even from the focus groups from the Providence, Rhode Island, test that were obtained under the Freedom of Information Act, it is very likely that cooperation of community partners will be much more difficult in the 2020 census, mainly or overwhelmingly because of the citizenship question. These things will lead to a more expensive and lower-quality census. It will be more expensive because it doesn't cost much if everybody self-responds; what's really expensive is sending people out to talk to people, and you're going to have to do a lot more of this. It'll be much less successful, it's going to be more expensive, and it's going to
be a lower-quality census, because you're almost certain to have a much higher undercount and a much lower response rate. The 2010 effort was quite successful; the community partners were great, and it really worked. Survey research has shown that the Census Bureau has an excellent public reputation, which is not true of census bureaus in all countries, but the citizenship question is likely to severely damage that reputation and increase distrust of the purposes of all US federal government data collection efforts, and I don't know how long it's going to take to heal that. Now, to talk about a little bit of history: this is not the first time there have been political concerns related to a census. To go back in history a little bit (and students I teach know I do this all the time), between 1910 and 1920 the United States became much more urban. There was a large migration from heavily rural states to heavily urban states. In the normal course of affairs, this would have led to a reapportionment of seats in the House of Representatives, with a substantial shift in those seats from the more rural areas to the more urban areas. But throughout the 1920s, members of Congress from rural states repeatedly blocked reapportionment efforts based on the 1920 census, and this situation wasn't remedied until 1929, when a bill was passed that mandated reapportionment after the 1930 census and automatic reapportionment after each successive census, although the 1920 census itself was never used for reapportionment. So we've been in these political waters before. If we look at World War I and World War II: in World War I, the census provided other parts of the government with names and addresses of draft-age males on an individual basis, and in World War II it got worse. The United States government used individual 1940 census data to identify Japanese American households for internment. This was legal at the time due to the War Powers Act of 1941, so what they did was not
nice, but it was legal then. In 1978 a law was passed prohibiting the sharing of personal information with other government agencies for at least 72 years after data collection, and this is related to when the census data become public for research and such. The German census and German government have also had problems. The 1983 German census was suspended due to alarm over the transfer of individual religion data to other German databases. People were really agitated, and the German court ruled that this was illegal, that it could not be done under German law without individual permission. The next time, in the 1987 attempt at a census, there were extensive protests, because even though there were these new legal assurances, there was substantial distrust among the German population about the privacy provisions in the 1987 German census and over personal questions on the census, especially the request for the name of the employer. The next census in Germany wasn't until 2011, so there was a big hiatus, and it was based on population register data in combination not with a whole census but with a 10% population survey. As many of you know, through 2000 there was a long form and a short form for the US Census. About one-sixth of households received the long form, which had about 60 questions. The American Community Survey, which also has about 60 questions, replaced the long census form in 2010. The ACS surveys about 3 million households each year, and these data are collected throughout the year. But the ACS by law cannot be used for congressional reapportionment, for reallocating members of the House of Representatives. The ACS, however, is used for substantial allocation of federal government funds related to poverty status and a lot of other conditions. The ACS does ask a question about citizenship, and this has been almost totally noncontroversial. It's asked of this sample, and it's fine, and that is what's used for work related to enforcement of the Voting Rights Act. So
one comes to a question (I'm allowed to talk about policy and politics now, so I will): why add a citizenship question to the census? Wilbur Ross stated that the reason was to improve enforcement of the Voting Rights Act, for which this administration has not shown an extremely large amount of concern to date. Sure, with any question you ask on the whole census, you're going to have better, more detailed data, but you could make almost as good an argument for almost all of the 60 questions on the ACS. And the Census Bureau made an official reply to the Justice Department and to Secretary Ross showing that, if they really wanted to do a better job enforcing the Voting Rights Act, they could use the data from the ACS, and they could use estimation to make quite good estimates for everywhere else, if that were actually what they wanted to do. I thought about why in the world they are doing this. I mean, if they just wanted to depress response, especially among underrepresented and minority populations, there are a variety of easier, less controversial ways they could have achieved that purpose. So I scratched my head, and maybe I was kind of slow, but it was only a few weeks ago that I came to the conclusion, which I am 95% certain of (and I've talked to other people who don't think I'm crazy), that the purpose of adding the citizenship question is to lay the groundwork to change the whole basis for state legislative districts within states, and for reapportionment of seats in the House of Representatives, from the total population to the citizen voting-age population. Now, many people think this would be unconstitutional, but there is a non-trivial, although minority, body of legal thought that says this would not be unconstitutional. Here we go to the Supreme Court, and who knows what would happen there. To give you a little more history, in nineteen... excuse me, I should have said 2016; I had the wrong date.
I never claimed to be perfect; you know, you read it a million times and you don't find all your mistakes. In 2016, two Texans sued the state of Texas because they wanted Texas legislative districts to be allocated based on the number of people eligible to vote rather than on the total population. Texas had done this the standard way, based on the total population. The Supreme Court unanimously decided that allocation by the state of Texas based on the total population was legal, but it did not address the question of whether a law changing the allocation to the number of adult citizens would be legal or not. At the time, Justices Alito and Thomas stated that they did not think that allocation based on the eligible voter population, if there were such a law, would be illegal. In 2015 (I got the century right there, at least), Leah Libresco argued in FiveThirtyEight that districts based on persons eligible to vote (this idea has been knocking around for quite a while) could not be drawn because we don't have the information on the citizen voting-age population. Well, we didn't, until maybe now. The planned citizenship question would provide that information for the entire population. I also wanted to comment that what Al said was totally correct, which is not a surprise, he's a really good guy: Wilbur Ross clearly had the legal authority to add the citizenship question on his own, so he did not violate any law. A Secretary of Commerce, he or she, can do that on their own. However, no earlier Secretary of Commerce had acted in that way. Now, by law, as I said before, the ACS data cannot be used for reapportionment of seats in the House of Representatives, so estimates of the number of citizens for the whole population based on the ACS could not be used for this reallocation enterprise. Another reason that the ACS data could not be used is that there have been various court cases about adjusting the census count using what's
called statistical estimation, estimating the undercount overall and putting people back in. It seems to me that if the ACS were used to estimate the number of citizens for the total population, that would almost certainly get knocked down legally on the same kind of basis on which statistical estimation was not used. For some last thoughts: it's clear that the census has to count all people, but there's a legal controversy, as I was mentioning, about whether all people must be the basis for state legislative districts or for allocation of seats in the House of Representatives. Such a law, either for districts within states or for the distribution of members of Congress among states, would certainly go to the Supreme Court, but especially with the current Supreme Court, it's unclear how the Court would rule. This kind of change would certainly lead to less attention to the needs of children and non-citizens. It doesn't just restrict the count to citizens; it also doesn't count minor children under age 18, so it would decrease their influence. It would also increase the influence of states with a relatively old population. According to an analysis by the demographer Andrew Beveridge in 2016 (and I don't think it's changed much in two years), a voting-age-citizen basis would shift about five congressional seats from Democratic to Republican, which is at least something you all might be interested in knowing. Well, thank you so much.

Thank you, Barbara. I should have mentioned at the outset that we're going to take questions after all the speakers have finished, but there will be time for questions. Our next speaker is Jim House. Jim is the Angus Campbell Distinguished University Professor Emeritus of Survey Research, Public Policy, and Sociology. Jim has conducted extensive research in the areas of social psychology, the social and psychological determinants of health, and political sociology, and he's
the author of the book Beyond Obamacare: Life, Death, and Social Policy. As a member of the National Academy of Sciences, Jim recently chaired a task force on the 2020 census and published a powerful report on the citizenship question. He also happens to be my mentor, so I'm very grateful to welcome Jim House to the podium. Thank you. [Applause]

Well, thank you, Jeff, and thank you to all who are here and listening. As Jeff indicated, there's overlap, as you can see as the speakers go by, on some of the issues here. I'm here from the perspective of a committee of the National Academy of Sciences, the Committee on National Statistics, on which I served from 2012 to 2018. So, like Barbara, I'm doing this right after having served, and the last thing I was involved in was the preparation of what is called a letter report, which I'll get to at the end of the presentation, from the NAS on the question of whether a citizenship question should be added to the census. We examined that from the scientific perspective: what is the justification for this, and what's the evidence for what kind of effects it might have? I'm going to start a little broader than that, based on what you've heard. There's a larger system, called the federal statistical system, of which the census is the largest unit, and we generally take it for granted. Most of the time we assume that it's there, that it does the kinds of things that have been indicated that the census does, that other parts of it, such as the Bureau of Labor Statistics, tell us what the situation is in the country with respect to the Consumer Price Index and the unemployment rate, and that other parts of it do analysis in all kinds of areas. I'll get in a minute to more information than you ever want to know, at least in a short period of time, as to how this system is embedded in the federal government. I think what I've come to appreciate in the six years I spent
on the Committee on National Statistics is how important this is as a resource for information, certainly for science. The areas in which, as Jeff indicated, I've worked are fundamentally grounded in the data that are collected and disseminated through the census, through the National Center for Health Statistics, and through other parts of the federal statistical system. I have also gotten a greater appreciation, as Al's presentation suggested, of the breadth of use and value that these data have for the functioning of our society: for the public sector at all levels, as has been indicated, in terms of allocation of resources, and very much for the private sector as well, which Al came from and where he is very sensitive to the fact that private organizations use these data all the time in their work, as do public organizations at all levels. What I have become most aware of, in some sense, is how politically integrated, or better, enmeshed, the federal statistical system is in the broader nature of our government and the broader nature of the politics in our society. As Senator Peters's presentation suggested, it's a system that is increasingly challenged technologically, around the issues that will be talked about more later in the morning: how do you collect data, manage it, disseminate it, and do so in a way that protects the privacy of individuals? It's increasingly challenged economically, in the way that almost all aspects of our governmental and public goods infrastructure in this country are challenged: they have less money to try to do more things with. And as we're seeing in part around this particular citizenship question, it's challenged politically these days as well. It's not clear that substantial parts of our society really understand and appreciate the importance of these institutions to the functioning of a democratic society, and if we lose that appreciation, we are at risk of losing one of the key foundations of a well-functioning
democratic society. So let me just try to make a few comments here of a background nature along these lines. The first is a point that you've heard something about already: the Census Bureau and the broader US federal statistical system, or FSS as I've abbreviated it here, derives from and hence is inherently enmeshed in our political system. You've already heard from Al that the census, or at least the process of enumerating the population every ten years, is front and center in the Constitution of the United States. It's in Article I, Section 2, that there will be an enumeration, and this has been carried out in various ways from 1790 through the present, originally by US marshals, then through decennial census offices that were supervised by the Secretary of State. The Census Bureau as it exists today, or has evolved today, is a creature of the 20th and 21st centuries. A whole series of other agencies have been established in the same way to deal with subsequent needs and problems that the society recognized as they developed. The economic, occupational, and labor sector is a very large one that has been handled through the Bureau of Labor Statistics, which was organized between 1884 and 1913 and continues to function today in very important ways. One of the last of these to be formed is the Energy Information Administration, which basically grew out of the energy crisis of the 1970s and the discovery that we needed to understand how much energy we have, where it's coming from, how much it's costing us, and so on. We now have similar agencies in education, agriculture, and so on. The second point that I'd like to drive home is that this is a very decentralized system. Many other countries have a sort of central office of national statistics. Canada has something along these lines, Statistics Canada. The United Kingdom is not quite that way, but they have a central concern for it; there's an almost
cabinet-level position in the UK that's charged with protecting and promoting the ability of the statistical system in the UK to function and operate as an independent, nonpartisan, and unbiased provider of information to the society. In the US this is complicated considerably by the way things are organized here. You're never going to fully read everything in this table. I'm just going to try to see if I can... does the pointer show up? I guess it doesn't that way, so I'll have to do it this way for you to see. The important element to recognize is that every one of the statistical agencies (over here on the right side are the various statistical agencies) is embedded underneath various committees that provide appropriations and oversight from the Congress, and they are embedded in different administrative portions of the executive branch of the federal government. Every box in this figure, such as here and here and so forth, indicates a position that is a presidential appointment, so the person is nominated by the President and goes through federal confirmation hearings. The recent one for Justice Kavanaugh is an example of that on a large scale, but that happens all the time. As I'll come to in a minute, three of the directors of agencies are actually presidential appointees. I didn't realize that until our Bob Groves from here was nominated for the census and went through a full-scale nomination process. The outcome of this is that the administrators and most of the rest of the staff here are federal civil service employees, and from my experience, let me tell you, these are incredibly capable and dedicated people, and they work under increasingly difficult conditions to try to fulfill some of the things that I will indicate in a moment are the key functions that a statistical agency, or the whole system, is supposed to fulfill. Let me just point out that there are three of these directors. Normally, you'll see
pretty soon that the director reports back to somebody who is a presidential appointee, so the political system sits over this. As Barbara and other people have indicated, Secretary Wilbur Ross, as the Secretary of Commerce, operating through his undersecretaries and so forth, has the right to give instructions to the Bureau of the Census as to how they are supposed to operate. It's interesting to look at the other agencies that are involved here. The second one, which has been there for a long time, is the Bureau of Labor Statistics, and as I said, that does the CPI, the unemployment rate, and a variety of other important things about the functioning of the economy and the labor system in the country. The third one is the Energy Information Administration, which is a relatively small operation. You may be able to infer what these three have in common: these are all organizations that produce politically important and sensitive information, along with that information being important generally to the functioning of this society. So these positions and these agencies are under a great deal of political challenge all the time in terms of how they do their work. If you have time at some point, you can look through this little table. I found it quite interesting and kept learning things as I looked at it. Let me make a third point, and this gets into where my particular relevance this morning comes from. Since 1972 the National Academy of Sciences has had a group called the Committee on National Statistics, the function of which is to provide unbiased scientific advice and counsel to any and all of the federal statistical agencies, largely at their request or at the request of Congress, to consider aspects of how they function or in other ways. It's by virtue of being on this committee that I got involved in the particular issue of the citizenship question. Before I get to that, let me just indicate a couple of things that are important.
Here is the website where you can go to find a volume that the Committee on National Statistics has produced, now through six editions, and tries to issue at the beginning of each new administration to the political members of the administration and to the people in the statistical agencies, regarding what a federal statistical agency is supposed to be doing for the government and for the society more broadly. This volume is organized in terms of four basic principles for what an agency is supposed to do. First, the Census Bureau, and any other agency, is supposed to be providing information that is relevant to policy issues, and it's to be objective, accurate, timely, and usable in the policy process. Secondly, it's critical, as Al indicated in his presentation, that in doing that the agency has credibility among its data users. People who use the data need to know, understand, and believe that the data are accurate. In an age in which all information is being increasingly challenged as to whether it is, quote, real or valid or not, the federal statistical agencies stand as one of the ultimate foundations for information that can be used with confidence and trust by data users, and they've been seen that way for years. Thirdly, as Al indicated, it's critical that federal statistical agencies be trusted by the people who provide them with their data, whether those are the individuals providing data to the census or organizations that provide information for the economic censuses and so forth. They need to know and properly understand what the reasons for this are, how the data are used, and that their data are being kept confidential, and that is the law as of this point in time. And finally, in the latest edition of this book, a fourth principle was added to emphasize the importance of the independence of statistical agencies from political and other undue external influences, and I don't
think I need, in the context of the discussion to this point, to indicate why it was felt to be important at this time to emphasize that fourth principle. So in thinking about what happens with the citizenship question, or anything else that is done with the operation of a federal statistical agency like the census, one needs to keep in mind what it is going to do to these things. As people have discussed already, we're dealing with something that's highly policy relevant but that raises major issues in terms of potentially undermining the credibility of the data from the perspective of users and providers and politicizing the entire process. There is a whole series in this volume (we won't have time to go into them, but again, if you go to the website you can find them) of specific operational principles that affect these things, including confidentiality, verifiability of data, and oversight by appropriate scientific and other checks to make sure that the agency is functioning as it should. Senator Peters mentioned that he requested the GAO to take a look at what the census is doing to protect confidentiality of data, and that's a perfectly appropriate thing to do. So now let me get to the ongoing concern, controversy, or opposition, whatever you want to call it, over the addition of the citizenship question to the 2020 census. It's worth knowing a little background in thinking about this. It is not that citizenship has never been asked in the census. It has been asked in varying ways over time. There were questions about birth origins or citizenship in 1820 and 1830; there was a hiatus around the Civil War; then it picked up again in 1870 and continued from 1890 to 1950. Then it was dropped, so it's now been almost seven decades since the citizenship question was asked, except that the question was asked at the request of and at the expense of the state of
New York and the Commonwealth of Puerto Rico in 1960, to help them in understanding who was a citizen of what and where. And as has been indicated, but to help people recognize this, the census historically has been a relatively brief document that worked from the principle that the census's main function was to, quote, enumerate the population. Over time, other questions, around things like citizenship, around education, around the work that people did, and so forth, got added to the census, and it got a little bit unwieldy in certain ways. In 1970 a decision was made to separate the census into what is called the short-form census, which every household in the country gets and answers, and a long form of the census, which was given to, I think this is consistent with what Barbara said, about a sixth of the households; it has been in the range of 15 to 20 percent generally. On that there was a longer set of information about the kinds of things that I just mentioned. After 2000 the long form was discontinued and replaced by something that Barbara mentioned, that may not be clear to everybody, called the American Community Survey, which is an ongoing annual survey that over the course of a decade is intended to produce the same size and body of information that was produced from the long form, but to produce it in a way that keeps it constantly up to date with changes in the nature of the population. As Barbara indicated, currently about three million households a year are being covered via the ACS, and the ACS comes up in ways that she's already mentioned that are important.

So, as you know, the controversy here is ongoing in various places. There is the congressional legislation that Senator Peters has been a leader on. That is sitting; it is not going to go through the Congress as it is now constituted, and obviously the election will determine whether any legislation of that type will ever move forward out of the Congress. There
are lawsuits ongoing, and again, as he mentioned, the largest of those is in the state of New York. It is due to come to trial in November, and it is now being held up by another appeal, again one that Senator Peters mentioned, by the Commerce Department, to avoid Secretary Ross having to testify or otherwise give depositions or statements as part of that hearing. But whenever that logjam is passed (many of you may know about it or even be involved in some of the entities and organizations that are doing that), undoubtedly, I think, wherever it goes initially, it will probably go all the way up to the Supreme Court, and we don't know what will happen out of either of those two things. In the meantime, there is a normal process that goes on for changes of any kind that are made in the government, in which there is public comment and input allowed, and that's going on right now with respect to the census. That's the context in which the Committee on National Statistics did this letter report, trying to indicate what we think about the scientific justification, need, and potential impact of the citizenship question, and that report is also publicly available on the National Academies website. In that report we come down, based on consideration by our committee, which is independent of the one that Barbara was part of, with the conclusion that it is not a good idea to add a citizenship question at this time. And one has to recognize that what is now being talked about is adding the citizenship question to the short form, which currently has 10 questions on it, so this is adding, you know, a kind of 10% increment in questions. It's a very prominent one. It's not like this is going to be hidden in the midst of a lot of other information; it is very prominent, it's very public, and at this point anybody is going to know about it if it ends up being there. The reasons we came down against it: first was looking at the scientific question, is there a basis for this given
any of the so-far stated reasons and needs for having the citizenship question, most of which have revolved around the ability to better enforce the Voting Rights Act. And again, our conclusion, as with others, is that the American Community Survey provides all the information that has been needed for a half-century, through the long form and now through the ACS; all the information that's been needed is there to fulfill the functions that are being asked of it. Secondly, the report says that it almost certainly will impair and damage aspects of the quality of the census. Unfortunately, no one has the empirical data at this point in time to say definitively what that would be. If we could say definitively that we know from prior information that putting a question like this on here depresses the response rate by 10%, the issue would be very, very different than it is right now. We unfortunately don't have that information, but all the indications are that it is likely to do that. There are problems in this. This is treated as, quote, reinstating; Secretary Ross says they were reinstating the citizenship question. Well, that's not exactly the case, because it hasn't really been asked in exactly this context and in this form before. And as Senator Peters emphasized, the normal process in the census is extensive testing, over three to five years, of any new content that goes into the survey. That is impossible given the date that the request was made, and so this would essentially be put in there with no prior information about its effect, and it will undoubtedly, or at least very likely, drive up the costs at the same time that it's lowering the quality. As Barbara attested to, there is this prior incident which is relevant here. If it were the case that the citizenship question totally, or in very major ways, impaired the functioning and the quality of the 2020 census, it is possible that that could be used as a justification for not using the 2020 census to reapportion representation in the House of Representatives, and we would be back in a situation, though under slightly different circumstances and for different reasons, like the one that existed in the 1920s. That's obviously, as we know, a very politically consequential thing. It's not sure that that could happen, and certainly the Census Bureau itself is doing everything it can under the circumstances to try to make sure there's a really successful census. And finally, there was the conclusion about inserting a question like this where there have been, from Secretary Ross and others, statements that this information would be used to help develop a national register of citizenship. That is a use of census data that is a violation of the principles that I've just indicated, and it's a violation of the current legal statutes on the way that census information can be used or transferred in other ways. So on that basis our conclusion was that this was bad. That comment period has just ended; all of that information is being processed, and we don't know what will come out of that as of now.

So in conclusion, I would just suggest to you that, and you may not have thought so, and I'm not sure I would have thought so eight or ten years ago, the federal statistical system is a big political issue. But it really is. In the same way that the President appoints people to the courts and the Supreme Court, and that's a political issue, the President appoints people who oversee the operation of the statistical system, and the Congress in its own way advises, counsels, and legislates in that process as well. So when you're voting, to the extent that you can, think about what the implications of your vote are likely to be in terms of the operation over here, not only of the federal system; this goes down as well to the level of states, which have comparable kinds of things. If you're interested in a bunch of these issues, this is a thing that I came across in doing this. It's a law review article that provides
a very nice review of the history of the use of the census and the principles that have been applied in apportioning representation in the House of Representatives, and you'd be amazed at how many different ways and forms that has taken over the history of the country. And any of those are still possible, given the fact that there is nothing mandated in the Constitution about the way, except that the apportionment across the states be somehow appropriate to population. There are no other requirements on the nature of the districts, the size of districts, and so on. So with that, I hope you will be able to think a little bit better about the issue as it continues to go forward from here, and I look forward to the rest of the discussion, as I have enjoyed the prior presentations. So thank you very much.

Thank you so much, Jim. That was a fantastic overview of the federal statistical data system, and, along with Barbara, thank you for kind of foreshadowing the possible political implications of these issues that we're talking about. I hope that we can also make available the various reports and articles that Jim mentioned in his slides on the ISR website, so we'll be sure to do that. Our next speaker is Angela Ocampo. Angela is an LSA Collegiate Postdoctoral Fellow in the political science department here at the University of Michigan. Her research examines the political incorporation of racial, ethnic, and religious minorities as both participants and as political leaders within American institutions. Welcome, Angela.

Thank you. [Applause] Okay, I have some really cool graphs I want to show you. Also, the clicker's not working for some reason. Oh, it's just the delay; let's try a couple more things. There we go. So I'm really excited to be here and to present on a topic that I'm particularly interested in, which is the political participation and public opinion of racial and ethnic minorities. So I'm going to be specifically talking about the potential impact of
asking citizenship status for racial and ethnic minorities and immigrant communities. So by now we're all experts in sort of the history of the citizenship question, and we've heard from the two previous presentations, Barbara's and Jim's, that the citizenship question has been asked over time, but it's been asked in different ways. In earlier times it was asked as a total count of people who were foreigners not naturalized, and then the question has taken different forms, and it's been asked not only in the census but, in more recent times, in the ACS. The actual citizenship status question stopped being asked in 1950, and after 2000 the data collection on citizenship has been via the American Community Survey. As was previously mentioned by some of the other speakers, there is also some contentious history as to how this census and citizenship data has been used and has been shared with other government officials. We know there's evidence that the Census Bureau has cooperated with government officials in sharing data and information on individuals who lived in large Japanese communities, and in particular also micro-level data, so individual-level data on individuals of Japanese ancestry, and these individuals were targeted, which then led to their internment during World War II. There's another really important piece of history of how census data has been used. This is currently legal, but it draws a lot of concern and worry among individuals as to how the sharing of information between the Census Bureau and other government entities might influence or impact the targeting of communities in different ways. In 2004 the Census Bureau gave information to the Department of Homeland Security about neighborhoods with large numbers of Arab Americans, and these
were simply zip-code-level breakdowns of Arab Americans organized by country. Again, as I mentioned, this is legal, but there was serious concern as to how this was going to be used, particularly given the fact that this was in the aftermath of the September 11 attacks, and there had also been a lot of backlash against the Arab American community. So with this history in the backdrop, it's important to understand why these previous episodes are of concern for how racial and ethnic minorities might feel about answering the question on the census, and what their fears and hesitancy to answer such a question might be given such history. I want to point out that immigrant and minority communities are living in fear, and this is the current state of affairs. They're experiencing daily discrimination, threats of separation, detention, and deportation, and not just threats, right? They're actually living through this in the current times. They are being targeted, and fear is alive among these communities. We've seen an increase in immigration arrests; since 2017 there's been an increase of thirty percent. We've seen that the travel ban on individuals coming from Muslim-majority countries is still happening, and we've also seen that immigrant communities have been targeted by the Department of Justice as it attempted to rescind Deferred Action for Childhood Arrivals in 2017, a program that allowed undocumented youth to have a reprieve from deportation. So the fears are real, because there's been actual targeting of these communities under the current administration. I want to present these quotes because they really speak to the fear that I'm talking to you all about. I can tell you that these communities are targeted, but there's no other way to really understand what this fear is like among racial, ethnic, and immigrant communities. So here
we have a quote from Carmen Guevara. She's a 46-year-old woman, a native of Guatemala, and she's answering this question after it was announced that the citizenship question was going to be added to the 2020 census. She was asked, what would you do, are you going to answer? And she said, I would never answer, because I don't have papers; obviously I'm afraid, and I have a son. Right, so there's a serious concern and fear among immigrant communities that this is going to have repercussions and there's going to be a backlash, and they're afraid not just for themselves but for their families. Another individual was asked what he would do and whether or not he would answer the census. Even though we're hearing about how the Census Bureau is innovating and finding better ways to ensure that we have response rates, individuals are also going to be afraid of people coming to their door, right? So even if we try to address some of these low response rates via recontacting and via in-person knocking on their doors, individuals are going to be afraid. So Said, when asked about the 2020 census and the citizenship question, said the following: I know that no one in my neighborhood is going to be opening the door for anyone doing a survey. Right, so Said speaks to this fear of government entities and government officials, of having contact, and of being fearful of anyone coming to your door, because usually the people that are coming to the doors of many of these immigrant households are Immigration and Customs Enforcement. I want to point out that the fear is not only among one racial and ethnic group. It's among many groups, and it's because the issue of immigration, and being undocumented or being in a mixed-status family, spans across racial and ethnic groups. One of seven Asian Americans is undocumented, and this is something that sometimes isn't talked about. We only think of the issue of
immigration as being, you know, a Latino issue, but this is also something that is going to affect the Asian American population. And I want to be really clear about this: it's going to affect not only response rates of individuals who are undocumented themselves but also of individuals who live in what we call mixed-status households, and these are households that include one unauthorized adult and one U.S.-born child. So we see that there's a large population of individuals living in these mixed-status households, and, you know, the issue of immigration, and this fear, is close and it's personal. We know that on average about sixty percent of Latinos report knowing someone who is undocumented, either family or friend, and one in three Latinos report knowing someone who has faced deportation or detention. So these are not just abstract fears that are sort of floating around, right? These are concrete numbers that let us know that adding the citizenship question to the census can have real repercussions given the fear among racial and ethnic communities. I'm going to show you some additional evidence of fear among these racial and ethnic groups and communities. There was some pre-testing done by the U.S. Census Bureau and the National Advisory Committee on Racial, Ethnic, and Other Populations, and this data has revealed some concerns about confidentiality that immigrant communities feel. This data was collected from February to September of 2017, and it included focus groups and interviews in various languages. Here we have one respondent from the Arabic focus group who stated: in light of the current political situation, immigrants, specifically Arabs and Mexicans, would be so scared when they see a government interviewer at their doorsteps. So again, this gives us additional evidence that it is not only affecting one racial and ethnic group but more than one,
and that there is fear of having contact with a government employee, field staff from the U.S. Census Bureau who's coming to inquire after you haven't filled out the questionnaire online or on a tablet; people are fearful. Another respondent from the Arabic focus groups argued that immigrants are not going to trust the Census employees when they're continuously hearing contradicting messages from the media every day threatening to deport immigrants. Right, so the political climate is something that really affects the way that racial and ethnic minorities feel about their relationship to the political system and elected officials. When they're hearing contradictory messages in the media, like some of the things that we heard yesterday about the possibility of an executive order to end birthright citizenship, that's something that is sending the message to racial and ethnic minorities that the government is not to be trusted and that the administration does not have their best interests in mind. Furthermore, the field staff that were conducting these focus groups and the testing of these questions reported unusual behavior: respondents walked out of the interviews they were having within their own homes, they were visibly nervous when asked about U.S. immigration or citizenship, and the respondents were worried about even giving their legitimate names. These are some of the reports that were coming from the Census staff that was conducting a lot of this pre-testing ahead of the 2020 census. I also want to show you some really interesting and neat results that have just come my way via a tracking poll of Latino adults this election cycle. This data comes from Latino Decisions, which is a big polling firm that has been tracking the attitudes of Latinos as we are approaching the midterm election. This survey asked Latinos various questions about the U.S. census. The first one that I'm going to be showing you today is about their
responses on how they feel about the importance of implementing an accurate count of the entire Latino population. We have very high levels of agreement: seventy-one percent of Latinos believe that it's very important for the U.S. Census to implement a complete and accurate count. Adding up the very and the somewhat, we get close to 93 percent of respondents in this national survey who believe that it's important. Right, this is something that they're also concerned about; they're concerned about an accurate count. When we asked Latinos whether or not they trusted the Trump administration to keep confidential the personal information that it collected, including citizenship and the status of immigrants, and whether they felt that the Trump administration was going to share this information with other federal agencies, close to 70 percent of respondents didn't trust the Trump administration, didn't feel confident; they felt that the Trump administration was going to share this data with other federal agencies. This is a high number. Moreover, we asked individuals in this national survey how concerned they were that their answers about people's citizenship could be shared with agencies such as Immigration and Customs Enforcement, and we found that 54% of Latinos were very concerned that this was going to be shared with ICE and 25% were somewhat concerned. So accounting for both the very and the somewhat, we see that these are very high numbers of Latinos who are extremely concerned that people's answers on the citizenship question on the 2020 census are going to be shared with ICE. Right, so now, in addition to the quotes that I shared with you, we have nationally representative data that says that these concerns are real. I also want to show some additional evidence that leads us to believe that there are going to be low response rates in households that have non-citizens. There was recent research done by
the Center for Economic Studies, and a group of scholars took advantage of the fact that in 2010 there were individuals who answered both the 2010 census and the 2010 ACS. If you remember, the 2010 ACS does ask citizenship but the 2010 census does not. So these researchers took advantage of the fact that these individuals answered both the 2010 ACS and the 2010 U.S. census, the same housing units, and they wanted to compare the response rates of these households to forecast the potential effect of adding a citizenship question. They did a couple of different things, so I'm going to try to walk through them really quick. They calculated response rates for the 2010 ACS and the 2010 census for two groups of households. The first group was households that they called non-sensitive or less sensitive, and those were the households where, based on administrative records, all of the individuals in the household were citizens. So that's these households right here. And then they also calculated these self-response rates for the households that were potentially sensitive to the citizenship question, and those were the households over here that, based on administrative records, had at least one individual who was a non-citizen. One of the first things that we can see from here is that there are already higher response rates for the census. So we really can't take this as evidence that individuals are less inclined to answer or are impacted by the citizenship question, because on average, both for the non-sensitive and the sensitive households, we have higher response rates for the census and lower response rates for the ACS. This is probably due to the fact that, as I mentioned before, there's a lot of engagement with community-based organizations and partnerships,
and there's a big media campaign to try to get individuals to answer the survey. So part of this greater propensity to answer the census comes from that. But what's sort of difficult to understand is why we have different response rates within the same type of survey, either the ACS or the U.S. census, for a given sensitivity group. We see here that for households with at least one individual who's not a citizen, in the 2010 ACS, which did have the citizenship question, the response rate is forty-two point four percent. That response rate is much higher in the 2010 ACS for respondents that live in households where everyone is a citizen. So we see a big difference here. To really understand this, and to account for the fact that people are already much more likely to answer the 2010 census anyway, they did a different analysis, and they have a number of successive analyses in the paper, with a lot of robustness checks, to better understand these differences in the propensity of respondents in these sensitive households to answer at lower rates. We find here, in just a simple difference-in-differences analysis, that respondents in these sensitive households are eight point nine percentage points less likely to answer when they're asked the citizenship question. Some of these implications have already been mentioned by some of the previous presenters, and I do want to underscore that if response rates are low among racial and immigrant communities, this is going to have an effect on representation in Congress, and it's going to have an effect particularly in the places where there are large immigrant populations. A state like California could lose six congressional seats. States like New York, Texas, and Illinois will also lose congressional seats, and other states that have lower levels of immigrant populations will gain seats: Wyoming, Utah, and so on and so forth. Census data is also
used for determining federal funding that's disbursed to states for various programs, and so if we have an inaccurate count of particular racial and ethnic minorities, this is going to have a negative impact on how this data is used for the allocation to really important federal programs. Federal funding includes Medicaid, SNAP, and Medicare Part B, among others. I also want to highlight that undercounting, particularly of racial, ethnic, and immigrant communities, is going to be really problematic, and it's really going to compromise research on these particular communities. It's estimated that by 2015 the entire U.S. population is going to be majority-minority, right, so that minorities are going to account for the largest proportion of the U.S. population. So having a low estimate and a low count of racial and ethnic minorities in 2010, in one year, can influence how we understand this population over many, many years and for years to come. Thank you.

Thank you, Angela. That was really interesting and puts a real human face on a lot of the issues that we've been talking about. Our next speaker I'm very glad to welcome is Kurt Metzger, who is a demographer and, along with Ren Farley, who's also in the audience, is one of our leading experts on population change in the Detroit metro area. Kurt founded the Detroit Area Community Information System, which is now better known as Data Driven Detroit, and was recently the director of research at the United Way of Southeast Michigan. Kurt has also served as a geographic specialist at the Census Bureau and is currently the mayor of the city of Pleasant Ridge, which, for those of you who don't know, is in Oakland County. So, Your Honor, Kurt Metzger.

This is a tough gig, sixth presenter in a morning of six. I will try to make this as painless as possible. No slides. The other thing I just wanted to say is I really do appreciate, more today than ever, that I'm retired. That commute here this morning was just awful. But I'm glad to be here. What I'm trying to give you is a little bit of a presentation. Jeff gave you a little bit of my background, but I'll just walk you through some of my history to bring us up to date. You've heard a lot about the census and the way 2020 is going to be conducted. I came to the Census Bureau from Cincinnati back in 1975 to actually be what they call the geographic planning specialist, when they moved geographic planning out into the regions for the first time. We were running around updating maps, going to planning agencies, and at that time maps were actually done in Jeffersonville, Indiana. They were done on Mylar; people would rub those little block numbers on the maps, and then they would photograph them, and then they would distribute them in these big binders. So I did that, then moved into the administrative operations for the 1980 census in Detroit, running 37 district offices in terms of administrative activities in Ohio and Michigan. And then after the census I became the head of information services, where I finally got to utilize the information that was being published by the census. All that data that were being collected, how are they being used? At that time we were putting them in books, and I would load up the trunk of my car with all the census books and go out and do workshops: how do you start to find data in these books? Some of you may remember those days. It takes me way back to good times. But it was that that really gave me that love of what the census really means. I remember talking to an NAACP group in Columbus, Ohio, and having their eyes just light up, and their saying, so this is where the data that national is giving us comes from; they're telling us what our allocation is and why, and now we can actually fight for ourselves. It was that kind of empowerment, that democratization, giving people access to information that they could then use to challenge authority, to do their own kinds of planning. And so I
was with the Census Bureau up until 1990, started through the 1990 census, and then moved over to Wayne State University, which was part of the State Data Center program. The Census Bureau set up the State Data Center program throughout the United States as a way of kind of counteracting these large data companies like CACI and others that were created to run mainframe computer tapes and charge an inordinate amount of money. Really it was the private sector that had access to the data; the people on the street didn't have that access, and it was much more detailed than what you could get in the books. So the State Data Center program was set up so that you had universities and others distributing data throughout the state, but also having access to these computer tapes and being able to give them at no or low cost to people, and so adding that kind of information. We had something called MIMIC, the Michigan Metropolitan Information Center, back there in the Center for Urban Studies, and I did that up until 2005. So I ran the 2000 census when I was at MIMIC, and we did a lot of outreach and helped the Census Bureau conduct the 2000 census as much as possible. In 2005 I went to United Way as research director and actually started to say, okay, this is again now applying the data. How do we use the data? How do we do it for our own internal purposes, but also to help grantees and other partners: here are the data, how do we start to use it, how can you use it to write your proposals, et cetera. So it's always been this kind of love of how do we use this information and how valuable it is. And then in 2008 came the Skillman Foundation and the Kresge Foundation. Things were a little bit in flux in Detroit; obviously we were soon to have three mayors in three years, we were now in recession, and the foreclosure crisis was really hitting. Foundations were going to start to invest in the city, and they needed to know where they should be investing and how they were going to evaluate those investments. And so they came
to me at United Way and said, we've got 1.8 million dollars for a three-year program, could you start something? And we created what we now call Data Driven Detroit, which is still alive and well even though I've been gone for four years. It was really to give, again, information down to the lowest level of geography, down to the block level, and then aggregated with other kinds of information from other government agencies, from local government, et cetera, trying to build all that kind of information to make it available to the public, and also to really work with local governments to try to help in the 2010 census. As I said, we knew 2010 was going to be tough. In 2008 Kwame Kilpatrick was mayor when I got the gig with D3, as we call it. He soon left and was replaced by Ken Cockrel, who was there for about a year and a half, and then he was replaced by Dave Bing in 2010. So you had the city in flux, with the foreclosure crisis and probably other economic issues, and you had three different mayors coming on. We told Detroit this was going to be the toughest census that they'd had in years. They had to get out; they had to do outreach. Fortunately, the Michigan Nonprofit Association and others were doing nonprofit efforts to get out into the community. The city said, we don't have the resources, we're not going to do it, we can't do it. So while you had the state doing its efforts and other communities doing efforts, Detroit kind of begged off. The result was the city of Detroit lost two hundred and thirty-eight thousand people between 2000 and 2010. Did it really lose 238,000? We'll never know. It could be because of the undercount, because they did a lousy job getting the word out. You know, the nonprofit community and others could only do so much, but you didn't have the city really pushing it. Twenty-five percent of the population disappeared, and we've been living with that population now dropping slowly. We're at about 672,000 according to the latest estimates, and Mike
Duggan has said population growth will be the one measure of his success, if we can turn around the population. Every year we think that it might be; I'm still waiting for the 2018 estimates, and I still think Detroit will do it. But I think the importance of outreach, the importance of people getting out there and pushing the census, is very critical. So my main point is to mention an effort now going on in the state. Obviously we will have complete count committees at the community level, and we will have a complete count commission, I'm sure, at the state level, but right now we have something called, and I want to get it right, the Census 2020 Michigan Nonprofits Count Campaign. Joan Gustafson is here from the Michigan Nonprofit Association; you can ask her all the detailed questions. It is a program to mobilize nonprofits throughout the state. There's a wonderful advisory committee of groups across the state, a lot of the various minority groups and persons of color, the various groups across the state that are getting very activated. We realized that the Census Bureau telling people to fill out the census, and that it's private and it's wonderful, doesn't cut it. Even city government telling you, trust us, this is very important, we need it because of funding, because of political representation; people kind of just go, yeah, that's great. We hope that voter turnout next Tuesday will show us something that we've never seen before, but we have to wait. Certainly census participation is not on everybody's radar, and, as you just heard, a lot of people are afraid of the census, certainly more this time than before. But they do trust their nonprofits. They do trust the groups in the community: their neighbors, the groups that are providing programming for them, that are giving the kinds of aid that they need. They are trusted participants in the community, and people will listen to them. So there's a
big effort. The Michigan Nonprofit Association has combined with a number of foundations, Kellogg starting it off; over 20 foundations have now put money into a campaign that's going to be up to 4.7 million dollars, and the state's put in half a million dollars toward that. So we're really looking forward to a tremendous outreach that will kind of overcome maybe some of the other issues that we have to face going forward. I just want to say that the campaign website is becountedmi2020.com; that'll be available to you in the notes and everything after the conference. I just wanted to be really quick, but the census is kind of why I got to be where I was: just starting out, leaving graduate school in the middle of a PhD dissertation to work for the Census Bureau, and I have never looked back. So I look forward to any kinds of questions. I will stop right there.

Thank you, Kurt. Thank you to everyone in the audience for sticking with us. I know we've had a lot of speakers, but I hope you agree that this was a very informative and really provocative session. So now we're going to take questions, and as I understand it there are going to be people traveling around with microphones for people who raise their hands, because we need to get all this on tape, or on the cloud, so to speak. Thank you.

Well, thank you very much for your presentation. Our student Nicholas Jones superintends the collection of racial and ethnic data for the Census Bureau. Two years ago at the Population Association meeting he presented an exceptionally lucid paper which reported that the Census Bureau's pretesting of the MENA question was very successful; that is, a question that would treat Middle Eastern or North African as a racial category. Subsequently the decision was made not to ask that question, and I'm wondering if you could tell us who made that decision and why. And the second question, which may be more appropriate for Al Fontenot: what's the final deadline for a
decision about questions on census 2020? I assume the Secretary of Commerce cannot decide in February of 2020 that he wants a question about the number of pets in the household, or something, added to the census. What's the deadline for that?

Thank you, I was hoping that someone would ask that. On the first question, you're right about Nicholas Jones. He and his racial and ethnic branch did a fabulous job. The Census Scientific Advisory Committee looked at what he did, and the other federally mandated advisory committee, the National Advisory Committee, which especially looks at race and ethnicity, looked at it. They tested things every way imaginable and, as you said, you're completely correct, came out with fabulous results. This was both on the MENA question, but it also was related to asking a combined race and ethnicity question, because there was increasingly a problem that many Hispanics would check "some other race" in terms of racial category, because the understanding of people was not the same as the understanding of the Census Bureau. I think that the Census people thought, and the impression I had was, that this was going to go along swimmingly and it would be changed for 2020. However, as I think you know, the change in any racial or ethnic categories on official data collections like this has to be approved by the Office of Management and Budget, and when I asked high-up people, they said no, OMB said no, we're going to keep it the way it is. The way I understand it from people high up in the Census Bureau is that OMB gave no explanation whatsoever about why they said no. It was a shock to my committee, and it was a shock to a lot of census people, when they thought this was all perfectly clear and a done deal.

Hi, I'm going to respond to your question on the deadline. We have indicated publicly that we needed a final decision on questions by June of 2019 in order to get the printing done and in order to get other aspects done;
however, at cost and increased risk, we could push several months past that. But every month we push it increases the risk that potentially damages the quality of the census, and it increases the cost of getting it done. The long pole in the tent right now is the paper part of it, interestingly enough, because we have complete electronic systems that ask it without the question and now with the question; we ran the 2018 End-to-End Test without the question because it was prior to that decision. So we're concerned. I will say we don't have a print contractor at the moment, because our print contractor declared Chapter 11, and we have put it out for bid. We will be awarding a new contract in November, this month; we currently are in the final evaluation stages of the bid. Once we get a new contractor on board, that will give us an opportunity to sit with them and work through their production schedules and capabilities to refine that final deadline for changing capabilities.

Barbara would know this: how is our institutional population going to be counted? That's been a notorious problem in Michigan, because in the city of Ionia, a very small town that has no reason for existence other than staff for the six correctional facilities, most of those people come from Wayne, Genesee, or Saginaw counties, and they are very disparate. For example, in Michigan we had 51,000 people in prisons; they were all disproportionately in the western and the Upper Peninsula part of the state, and they were not counted where their homes were. This reflected tremendously on the evaluation of a place like Wayne County. With respect to people of color, more than half of the population in prison are people of color, and they're also disproportionately in poverty. So how are those going to be counted?

May I say something, and then someone else can tell me what I said that was wrong? Thank you, Rosemary. There are two aspects. One is that these are group quarter
populations, and there have been considerable problems with some of this. But also my understanding, and Al will correct me, is that incarcerated populations, in terms of the census allocation (I'm not defending it, I'm just trying to explain it), are counted at the place of the institution. Some states, I believe Maryland, changed their laws so that for allocations of legislative districts within the state they are attributed back to their place of residence before they were incarcerated, but that is not for federal purposes. And Al can give a better answer than I did.

The basis there: we made the decision in our residence criteria to continue the way we've always counted all people in the census, and that's where they are living or staying on census day. Prisoners who are incarcerated on census day are counted in the place of incarceration where they are. However, we will provide an app to any state that wants it that allows them to take our data and reapportion the locations of their persons for state redistricting purposes and for state activities. In 2010 Maryland used it, and we were testing it out at that point in time. We now will make it available to any state that wants to take that up and move people to the city from whence they were incarcerated, in terms of state redistricting activities. But from a federal standpoint, since our primary purpose is really for federal information, we will not be changing the way we count prisoners on a federal level.

Well, I'm reluctant to correct Al, but what he said was almost totally true. As I understand it, college students who don't live with their parents are counted at the place of their college; however, high school students who are at boarding school are attributed to the place of residence of their parents rather than the place of the boarding school.

You're correct, Barbara, that's an exception, but college students are counted
at the campus they're at, and prisoners are counted in the prison.

Can I ask a follow-up question? I've recently had several discussions with people in which they sort of suggested the possibility that, while the federal census is what's used for reapportionment, states actually have a lot of discretion to use other sources of information, including the ACS, including, as you suggest, an app that would put people in their home communities if they're in prisons. Is this something that we should be encouraging states to think about more in terms of redistricting, if we think that there are going to be problems in the 2020 census? It had never even occurred to me that we could use different sources of information to do these things until the last couple of weeks. I was just wondering whether people had thought about this and knew more about the legal or statistical possibilities.

I'm James Whitehorne; I'm with the Census Bureau as well. I run the Redistricting and Voting Rights Data Office, so this particular topic goes straight to the heart of some of the stuff that we work through. You are correct: states do sometimes use their own data to augment the census data. Maryland was a state that actually reallocated prisoners back to their home of record, and they actually did it for redistricting for congressional and state legislative districts, and the law actually says all local redistricting is supposed to use that new base data set that they've created. But this isn't really something new. The state of Kansas used to conduct its own census and use that data for redistricting; they still do a reallocation of students and military prior to doing the redistricting. Hawaii does the same thing: they create a resident population base where they remove some students and military that are considered non-resident before they do it. And all of these different scenarios have been
upheld by the courts, including the use of the Maryland data. You have four states now on deck to do this prisoner reallocation. You have New York State, which did it last time in 2010, and Maryland. New York only does it for state legislative districts; they still use the census counts for congressional. California is doing it as well, but only for state prisoners, so you have all these different flavors around. The fourth state is Delaware. Delaware had a law on the books to do it in 2010, and then they passed emergency legislation to delay until 2020; they didn't feel they were ready to undertake the operation. So you're very correct: people do have the ability to make modifications to the Census Bureau data, as long as the way they're doing it within the state is consistent and not arbitrary.

One thing, to bring this up: I don't think people appreciate how much the Census Bureau really works on their questions, on the way the questionnaire looks. Male and female are now left and right instead of on top of each other, so you won't have mis-strokes and such. It doesn't matter as much on electronic, but on paper it does. But if you were going to ask a citizenship question, I'm surprised that people think that the complicated version that's in the ACS is appropriate. It could just be "Are you a citizen, yes or no?" "I'm a person born abroad of US citizens": what difference does that make? It should just be a yes/no question. But the Census Bureau didn't get to test that part of it. It is so obvious that this doesn't have the hands of the Census Bureau on it, because that's not the way they do things.

You're right. The citizenship question planned for the 2020 census is directly from the ACS, which is kind of a strange question because of the concerns of some territories, so it's kind of unnecessarily complicated. But when Wilbur Ross required this, they just plunked on the ACS question without doing anything, which is very unusual and pretty
stupid. Okay.

Okay, everybody, we are going to get started. Our next session will continue our discussion of the plans for census 2020; that means you have to sit down now. I'd like to welcome you all to the next part of this morning's program. Before I introduce the speakers who are going to talk about privacy issues related to census 2020, I would like to offer a shout-out and an enormous thanks to Catherine Allen-West, who's our director of communications here at ISR. Catherine has done a fabulous job organizing this, and we really appreciate it. Our next set of speakers are going to address privacy issues around census 2020. The Census Bureau is taking some new, innovative steps to try to protect privacy and to address the issues of respondent trust that we talked about in the first session. But they also raise, I think, some interesting issues of respondent trust and user trust (what was the National Academies phrase? trust and usability, kind of), and I think that those are actually the challenges that we're going to be addressing in this session today. I'm really looking forward to hearing the presentations from the speakers. Our first speaker is John Eltinge. John is the assistant director for research and methodology at the Census Bureau. He's a member of the Federal Committee on Statistical Methodology and the Committee on Fellows at the American Statistical Association, he's co-chair of the... oh, a bunch of journals. His PhD is in statistics from the University of Iowa, so he comes from the Big Ten; we're always happy to bring people from DC back to the Big Ten. I always joke that, as you'll see, I think of Michigan as part of the rotation and the professional development of the statistical system. I don't think John has actually been in Michigan, but we're glad to have you here. John's going to speak to us about the 2020 plans for privacy protection, and I will
turn things over to him.

Thanks very much, Maggie. In keeping with Maggie's last comment, I should mention I was raised about 30 miles east of here, so perhaps at least indirectly I have a little bit of a connection with the University of Michigan. I'll also mention that we had a wonderful opening of a research data center for the Census Bureau last autumn, which happened to be in Colorado, on the Front Range of the Rocky Mountains, and in fact I had a chance then to acknowledge the fact that two leading people at the Census Bureau in the previous century, Morris Hansen and W. Edwards Deming, both in fact were raised on the Front Range, and Deming got his bachelor's degree at the University of Wyoming. So we also have connections with lots of other institutions, and we're always delighted to have a chance to come out and have discussion. A little bit of background on this particular presentation: I'd like to convey greetings from the main authors of this presentation today, Simson Garfinkel and John Abowd. Due to unavoidable constraints they are not able to participate in the event today, so except for a few introductory comments that I'll be making in the next couple of minutes, I will be presenting material that they developed and that was presented originally in a census Program Management Review a few weeks ago. At the end we will have their contact information if you're interested in details. We'll be covering only a very light amount of some very rich and very deep technical material today, so I'll try to answer a few of those questions, but you may especially want to follow up with them if you're interested in a lot of details. In addition, I'll add thanks to Catherine Allen-West, as Maggie was mentioning a moment ago, but along with the thanks there is a corresponding assignment: Catherine has available some information regarding a Federal Register notice. If you have not had a chance to see that, she has that electronically available, and both David and I are going to be saying a few
words about that Federal Register notice at the end. The deadline for responding to that is November 8th. We heartily encourage everybody, either here in the room or listening online, to participate and provide constructive responses. We'll say more about constructive responses at the end of this presentation, but we very much encourage you to follow up with that.

A little bit of general background that in many ways echoes a number of things we were hearing both in Al Fontenot's original presentation today and in the first panel discussion: any time you have any large-scale statistical program, in-depth work with that requires an organization to carry out a complex balance of multiple dimensions that can generally be categorized in at least three groups. The first one is quality. That includes notions of accuracy that we'll be emphasizing later today, but also includes other dimensions that we were hearing considered earlier this morning, for example relevance and timeliness. The second category is risk. That includes the type of disclosure risk that we're going to be talking about today, and a number of other dimensions of risk, for example system performance. And then third, we always have to spend a lot of time thinking about cost. That includes both the cash cost of carrying out a certain set of operations, but also involves other scarce resources that we have, for example available timeline, in keeping with some of the comments that we were hearing in the previous question session about deadlines and other things like that. We'll say more about that at the end. In addition, David and I had a nice chance to converse a little bit about our respective presentations previously, and I think he's going to be touching on a number of those issues as well. A second point that I'd like to emphasize is that, like most large-scale methodological innovations that you see anywhere, either in the statistical or the broader methodology world, changes that we see in disclosure avoidance procedures require us
to work at a naturally fascinating intersection of three general areas. One is what you might call general principles and insights; we'll touch on a little bit of that very briefly today. The second dimension involves technological implementation. It's great to have ideas, and it's terribly important to have the scientific insights that are offered in these areas, and then you go from that to saying, and therefore here's how we are going to have a production system that meets those criteria of quality, risk, and cost I referred to a moment ago, in a scaled form. And then the third dimension, which is also terribly important, is to have very careful attention to the practical impact that we end up having for any type of production system, in this case a disclosure avoidance system. It's reflected both in a combination of empirical results and also in user behavior. Some of that ties in with what Maggie was referring to a couple of minutes ago in terms of the quality of the work that we end up having and the resulting data that our users are able to use. We'll cover all of that at the end a little bit in my presentation, but I think David and some of our other speakers are also going to be following up on that as well.

In keeping with Al's comments at the start in his keynote address, the disclosure avoidance system is intended to ensure that the 2020 decennial data products meet legal requirements related to Title 13. That is the fundamental title under which we are authorized to collect the data, but also have a corresponding obligation to protect the privacy of those data. So in particular this disclosure avoidance system, as I'll be describing it in very brief form today, is intended to prevent improper disclosures of data about the individuals or establishments in our 2020 products. The longer version of this paper that was presented at the census Program Management Review covered four main concepts: purpose, why do we need a new disclosure avoidance
system; notions related to noise injection and differential privacy; state of the project; and some forward-looking statements. We are going to keep that relatively short, particularly those last two elements, today, but you're welcome to look online; all those materials are available online in a great deal of depth. So instead we're going to focus, first of all, on what's the purpose of a disclosure avoidance system; why do we care? The fundamental concept that motivates our work with disclosure avoidance systems in general, and also the version of it that we have developed for 2020, is focused on a notion of, quote, database reconstruction. The basic idea is displayed in this very simple graphic. On the left-hand side we have respondent data; those are the data that Al mentioned before that we are pledged to protect. Then over on the right-hand side we have the published summary data that we're going to have. The concern that is expressed and summarized with the term "database reconstruction" is: to what extent, and in what ways, is there a risk that the published summary data will allow somebody not to make aggregate inferences (that's what we want them to make from the published data) but instead to make statements about the underlying respondent data at a micro level? That's the concern, and so we can visualize that as saying: if we saw this published data, could we, quote, reconstruct the original responses? The general notion of that goes beyond simply saying, is there one single reconstruction. A crucial fact that we'll be developing in the next few minutes is that in most cases, and we'll put some footnotes on that in a moment, we in fact have many different possible "database reconstructions" that could be essentially purported to be developed from a set of published summary data. If we have a whole lot of those, and we don't have too much in the way of information to allow people to distinguish among them, you might say we
have a very large haystack and we have one needle buried in the haystack, and so our data are relatively "safe." On the other hand, and we'll get to why we have to worry about the second case in a moment, if in fact we have that needle very prominent within that haystack, not in some ways buried, then we have to worry about that a lot. That's a rough idea of what we have behind database reconstruction.

We'll start with a very simple case. This is intentionally oversimplified, just to begin developing the idea. Suppose we have for the moment a publication based on some decennial census data that involves only two attributes. The first one is age: either an individual is under 18, or they're greater than or equal to 18, in other words voting age. We also have them classified as being in just one of three different race categories. And suppose we have the very simple publication that we have on the right, perhaps for a certain block that involves ten persons living in that block, and all we report are the margins, simply saying age less than 18 for four individuals, greater than or equal to 18 for six. In the same way we have the distribution of the race classifications, as indicated at the bottom right-hand side of the slide. Then in principle a possible reconstruction of the original data, again looking only at the two attributes, would be what you have on the left-hand side. Maybe we have, for example, four individuals who are in race one, and the relatively large and relatively small figures there represent individuals who are, respectively, greater than or equal to 18 or less than 18. That's the notion of reconstruction. Now, as a footnote, you will also see discussion of, quote, re-identification. The idea of re-identification is: suppose you carry out that reconstruction and you say, here is a certain household that matches this reconstruction. If you then say, and that's the Eltinge family, that's
re-identification. So that's the distinction you will see drawn in some of the literature between reconstruction and re-identification. The worry when we focus on reconstruction is that in some ways it's an initial high-risk step toward re-identification. We're going to focus almost all attention today on reconstruction. The first possible reconstruction is what I displayed a moment ago and that we now have grayed out, but that's not the only possible reconstruction. We in fact have a large number of additional reconstructions that could be carried out, including the example that we call R2 here, in which we might have, for example, four individuals all under the age of 18 in race group one, four individuals over the age of 18 in race group two, and two individuals over the age of 18 in race group three. It turns out that when you go through all of the results in combinatorics for that, even in this relatively simple case you have over six hundred thousand possible reconstructions based on this very simple classification that you have here. Consequently we end up saying, if we lived in this nice low-dimensional world and we could stay there, then we really don't have too much to worry about in terms of reconstruction. The problem is we live in a more complex world than that, in particular for the 2010 decennial; we're using that in many ways to anchor the baseline for what we have here. In the 2010 decennial, in a certain sense, we did have 10 questions asked, but for an individual we effectively have six different attributes attached to them, things like age, race, and so on. When you go through the combinatorics for that world, it turns out that you get a variant on what mathematical statisticians in the last 20-some years have been referring to as the, quote, curse of dimensionality. Essentially the dimensionality means that you no longer can say, I have a relatively large haystack in which I'm burying a needle, but in
fact you have major problems attached to that. Here's the major problem, based again on 2010 data. We have, for the purpose of discussion here, three different files we're going to contemplate. The first one is the one that's absolutely crucial in terms of redistricting; it's referred to as the P.L. 94-171 file (it took me a year at the Census Bureau to memorize that label). The bottom line there is we have over 2.7 billion, that's billion with a B, cells represented in that. That's because of the very fine level of information that we are obligated to provide, based on the legal obligations that I was referring to before. We also had two other files that were published in 2010: the balance of Summary File 1, about 2.8 billion, and Summary File 2, with just over 2 billion. Again, huge numbers of cells attached to each of those. On the other hand, if you say, wait a minute, where did that come from? It came from a little over 300 million persons for whom we were collecting information, with the six attributes, as I was saying before, that we were referring to. So you say, wait a minute, we have effectively collected statistics in this sense of about 1.8 billion numbers, but we have effectively in parallel with that something over 7 billion equations. Go back to algebra and you say, wait a minute, I have essentially 1.8 billion unknowns and 7.7 billion equations, and pretty quickly you say, I have an overdetermined system; I in fact have a serious risk of being able to, quote, reconstruct that. There's a whole lot of detail behind that, but that's roughly the intuitive idea. Again, the dimensionality is the crucial factor in that. This has been well known for a number of decades, and as a result, over the course of time the Census Bureau has made major efforts to try to address this. In 2010 the two primary tools that were used were aggregation and swapping, and for 2020 the focus is instead going to be on noise injection and a related set of tools that are referred to as differential privacy. The
basic idea behind noise injection is that in effect you're going to take the information that you have at that fine level of aggregation and you're going to, quote, perturb it in certain ways, but not with the same kind of structure that we saw with perturbation in 2010. Differential privacy, then, is a body of tools. I won't go into the details of it here; the basic idea is that it's a way for us to effectively control the resulting trade-offs we have between the two crucial factors that Maggie was referring to in her introductory remarks. One is how well we are protecting privacy. Again dealing with the needle-in-the-haystack idea that we referred to before, effectively by injecting noise we no longer have people even sure what needle they're finding in the haystack, if they find it. On the other hand there is the question of accuracy: we have to have a high level of utility for certain purposes of the data we put out; that's the whole reason we're doing it to begin with. You will typically see in this literature a set of trade-offs that are characterized by the type of curve that I am displaying here. I won't go again into the details, but I'll highlight three main points. First of all, if you were to live at a point on that curve up near the upper right-hand corner of this graph, that's essentially a place at which we are adding very little noise. So you're back in the exercise I was describing a moment ago: we're really not doing a very good job of hiding the needle in the haystack anymore, so you have effectively a high level of privacy loss, but on the other hand you are in fact providing data at a relatively fine level with a very high level of accuracy. On the other hand, if you live in the lower left-hand corner, you effectively are in the opposite situation. You've in fact provided a very high degree of protection, with a lot of noise in your data; you provide a high level of protection, but on the
other hand you have very poor quality and a very low level of accuracy for the information that you're providing. Differential privacy has a set of tools for helping us to understand those trade-offs. Obviously you'd like to live somewhere between those two extremes, and there's a lot of further information that we can consider there. Just a quick show of hands here: how many of you in the audience are primarily survey methodologists? Okay, I see a few hands. Let me mention something in passing. The assessment that we have about where we need to be living on this curve depends in a very fundamental way on trying to understand effectively the utility that individuals, all 308 million or so of us as of 2010, attach to certain types of privacy protections, and on the other hand the utility that we attribute to a certain level of accuracy of the information that we're distributing. One of many areas of research where it would be extremely valuable for us to have further insights from our colleagues in the academic community, as well as the private sector and the government, is to understand more about ways in which we can elicit a clear understanding of utility in these very specific cases. That's effectively what the Federal Register notice that I referenced before is trying to get at in terms of use cases. There are also some very interesting methodological issues related to this. For example, about a dozen years ago Tony O'Hagan and many co-authors and co-editors had a really interesting book on elicitation of utility functions and priors. How do we take those sorts of notions, and also some related software that has been developed by David Spiegelhalter and others, how do we take either those tools or related concepts and try to use those in a structured way to take notional development of use cases and in fact get ourselves much better insights about where we want
to live. That's just one of many areas of both methodological and engineering insight that we would very much benefit from. The resulting disclosure avoidance system, once the decision is made about where we want to live on that curve, is summarized very briefly in this graph. It's essentially very similar to what I was displaying before, but with a little more detail. The idea once again is that in the red area on the left we have confidential data inside the Census Bureau: the original decennial response file, as well as various levels of unedited and then subsequently edited files. On the right we have the information that we release, for example our P.L. 94-171 data as well as summary files 1 and 2, and prospectively special tabulations as well. And again, sitting in the middle, we have the disclosure avoidance system. There's a fundamental tuning constant, referred to as epsilon, that's crucial to differential privacy calculations, and making decisions on that effectively ends up saying how we tune that middle box in our work. Now, there are both advantages and disadvantages of differential privacy approaches relative to the swapping that was used in 2010. For example, privacy guarantees can be much more tunable and provable. They're also in some sense future-proof: they are not assessed relative to what type of external data are currently available in the outside environment. That's a crucial factor. If you go back through much of the disclosure literature over the last 30-some years, much of that is essentially conditional upon what else is already available in the external environment. Privacy guarantees can be explainable and placed in the public domain, and the approach provides a reasonable degree of protection against database reconstruction. But there are disadvantages. In particular, the entire country effectively has to be processed at once to be as efficient as you possibly can be, and also there's a
set of calculations referred to as a privacy-loss budget. Every time we release additional information we're essentially having to charge to that, and if we have a finite budget attached to that, we have to be very careful about it. Going back to say a little bit more about engagement with all of our colleagues in academia, the private sector, and other government agencies: the intention is to place the entire disclosure avoidance system in the public domain as open source, excuse me, open-source code. We very much hope that our colleagues will in fact look at it and provide, we hope, a whole lot of improvements, and that it will form a basis for a great deal of enrichment of the disclosure avoidance literature. In addition, as we heard referenced before, we do ultimately have Census Bureau data that, 72 years after the census date, are released into the public domain. In particular, the 1940 data at present are fully in the public domain. As a result, we very much hope people will be able to take either this disclosure avoidance system or anything else that they may wish to use and apply it to the 1940 data. We hope that provides a very rich testbed for a very energetic discussion of a whole lot of pluses and minuses, and again of how we can improve these data over time. Finally, as I mentioned at the start, there's a Federal Register notice. It has a comment deadline of November 8. We heartily encourage everybody here to respond, and in particular to respond with concrete use cases, to help us understand as much as we possibly can about where you view high priorities to be in terms of particular data products that are prospectively coming out of the 2020 census. Thank you.

I'm having so much fun tweeting about this, I forgot I have to come up here and introduce our next speaker. Our next speaker is David Johnson, who many of you know. He's a research professor here at ISR in the Survey Research Center, and he is director of the Panel Study of
Income Dynamics. Prior to coming to Michigan, David had a long history of service in the federal statistical system, where he was chief economist at the Bureau of Economic Analysis and, before that, chief of the Social, Economic, and Housing Statistics Division of the US Census Bureau, and I believe he also had a stay at the Bureau of Labor Statistics. So he has a broad vision of the federal statistical system. He also hails from a Big Ten school, with a PhD in economics from the University of Minnesota. David.

Thanks, Maggie. So I'm gonna try to not talk about big data or no privacy, but in honor of Halloween I'm gonna talk about the scary guy with the mask, which represents disclosure avoidance, and I'm hoping to convince you that it might not be as bad as we think. I was gonna say walk you through, but I'll probably run, given the time: why we need disclosure avoidance, building on what John said; why I think the current methods at the Census Bureau might be problematic and we might want to change them; and why this new idea of noise infusion, or noise injection, might not be as scary as we think. So why do we need it? Well, I think it all starts with Title 13. As John said, Title 13 is the law that governs the privacy; it's been there since back in the 1950s. It's also the law that prevents the US Census Bureau from sharing the data with other agencies for any reason but statistical purposes. Of the three big things that you look at in Section 9, as John mentioned, the one we really focus on is that Census may not make any publication whereby the data furnished by any individual under this title can be identified. So that's the key, and the interpretation by Census is that identification doesn't have to be identifying my family, but basically reconstructing the data. And because of that, Census can't then share the rules they use to adjust the data for disclosure, because then that would allow reconstruction, re-identification, and
there are penalties of two hundred fifty thousand dollars. Most of us might be special sworn status Census employees that have to abide by these rules. This is also what protects the Census Bureau from sharing, say, immigration data with other agencies to do anything, as the first clause says, but for statistical purposes. So that's the key guiding factor. It's all the interpretation of Title 13, of what we mean by identifying somebody, and that's left to the Census Bureau, to ensure that there's no identification. The other thing is there are many reports that have been done that say we should really update our methods of disclosure avoidance and privacy protection, and the fact that Google uses it, that's a big draw. So there's this one thing called the funnel; I'm gonna skip that one. The key is there are a couple of committees. One is the Commission on Evidence-Based Policymaking, and they suggested that Census, and the federal statistical system more broadly, have to adopt state-of-the-art database, cryptography, privacy-preserving, and privacy-enhancing technologies. So that's the goal; that's what's recommended by a bipartisan commission. There's also the National Academy of Sciences report that comes out of the CNSTAT committee that was mentioned before, that Bob Groves chaired: research with academia and industry has to continue developing, so we have to work with them as they develop these new techniques, and federal agencies should adopt these modern techniques. So this is the impetus for doing this new method of disclosure avoidance. Now, what is differential privacy? Well, from the literature, it's basically the promise that you will not be affected, adversely or otherwise, by allowing your data to be used in any study or analysis, no matter what other studies, data sets, or information sources are available. So the idea is that Census has to protect your data, right, and take
into account all the other data that's out there that might be used to identify you. So this is the rub, right? If Census releases any publication, any tabulation, you can use other data to try to identify people, and it's Census's responsibility to ensure against that. The next step is this epsilon. Differential privacy would suggest the probability of identification changes by only epsilon. To me it is a big deal. The Census Bureau view usually was: you can't do anything, you have to mask every single identification. So the acknowledgement here is that there's actually a probability of identification anyway; the probability is never zero, there's always something there. And I think that's a big step for all of us in this new sense of disclosure. So how do we do it currently, or how does Census do it currently, or how did I do it when I was there? I think John mentioned it. There's aggregation, so we only release stuff at high levels. This is why, when you get microdata, you only get a PUMA identification of an area. That's a public use microdata area, which has to be over a hundred thousand people, because the idea is that if you could identify an area of under a hundred thousand, you might be able to figure out who those people are. You swap, so you could swap people across areas, or you can actually swap characteristics too. And then you top- and bottom-code. We all know that you can't have income over 150; that's just top-coded, all you see is a hundred and fifty. Then the next two are the new ones: you're adding noise to responses, or you're creating synthetic data, and I'll mention those two in turn. So John talked about this, obviously, aggregation and swapping. Again, swapping can be across areas or across characteristics. We can swap the age of people, or you change something else, and that can be problematic, and I'll show how it can be problematic if you're swapping based on those ten characteristics in the decennial and then you match
the decennial data to some other external data that has other characteristics. So swapping a couple of people, say by race, might have big impacts on what their income happens to be, and I'll show how that happens. John mentioned this point about there being 25 estimates per person. Just imagine what you could do if you had that much data: you could re-identify, reconstruct anything you want, and that's what I think the Census Bureau has shown, that you can reconstruct the data. So here's the example of where disclosure avoidance has gone wrong. This is a paper done by Trent Alexander, who's here at ISR, and Betsey Stevenson, who is at the Ford School, along with Michael Davern, where they looked at the data that was published, the Census PUMS file, the 5 percent file, linked it to the actual data in the RDC, and found out that the ratio of men to women between the public data and the internal data was goofy after age 65. This was all due to disclosure techniques to hide those people, but the disclosure technique was done in such a way that the cells we were concerned about at the Census Bureau were the five-year cells, so 65 to 70. So you can see that some of this would average out, but some of the goofy stuff occurs at these lower levels of age group. They found this in the decennial census, the 2000 census, but they also found it in the ACS and the CPS. You can see that for ages 62 to 64 the ratios are all pretty close, but at 65 and 66 they're all over the place, and the CPS is probably the worst. The ACS is bad, in green, but the CPS is also bad. At the time I was running the CPS, so because of this adjustment we had to re-release all the poverty estimates for people over 65, because it changed for men and women; the poverty estimates were impacted. And this was based on a current disclosure technique that I can't talk about, and I can't tell you what it is. This is the rub for Census. So it'd be
great if Census could say, hey, our previous method was so bad, I'll show you how bad it is. They can't do that, because that risks the violation of Title 13, of re-identification. The example of success is a recent product done by Raj Chetty, Nathaniel Hendren, and Maggie Jones and Sonya Porter at the Census Bureau, where they took the census data and linked it to the tax data that Raj uses, and they created this Opportunity Atlas. So this is basically: here's when you were a kid, and here's your income as an adult, and they mapped it all across the country for every census tract. Well, the problem is that when they did the linking, they realized that they couldn't release the data using the internal data; they had to use the disclosure-avoided data, so this would be the swapped data or other things. If you can imagine some of these areas down in the South, when you swap based on age, race, and family type, you could be swapping people who have completely different incomes. So they found that when they did the swapping and released the data, the results were different from what they had with the internal data. However, when they did the differential privacy noise infusion, they could keep a lot of the same relationships. They could do it in such a way that you keep a lot more of those correlations, depending on how you build in the differential privacy. So this is a success in how they did it. Now again, we can't know how bad it was before, and this is the problem. This gets to what I think we should do as researchers. As John said, here's the simple graph, I just stole this from a CSAC presentation, of the trade-off between accuracy and privacy. The more accuracy you get, the less privacy you're gonna have, and we want to find out what epsilon to choose, what that best privacy budget is, the privacy-loss budget. If I'm an economist, I have a linear map, I do a trade-off, I maximize it at that social optimum; there's also a better fit, so I can do that. The problem
is we have no clue what that estimated production technology looks like, right? It could look like this, which would be much better for us in terms of accuracy and privacy: I wouldn't have to trade off nearly as much accuracy for privacy. So we don't know what the loss of accuracy does. You know, I'm certain that anybody who does actual regression research using only the 2010 decennial should get another job. There's only ten variables; they're not gonna find much, okay? And I can't imagine that, for the stuff you're gonna find, how you noise-infuse that is really gonna impact the results. But if you're gonna link it to other things, there could be some other things going on. I also can't imagine, and this is what John was getting at, any of us complaining that somebody could look at me, look at the census, and figure out, oh, he's white, he's a man, he's 60, and he's married and his kids don't live there. That's what you get in the decennial. So again, the privacy loss might not be that big for the decennial. So this is what we need people's help on, to estimate those two things. There are issues with differential privacy, as Bob and Brian have looked at. There's no way to get around the fundamental law of information recovery, which is basically that with more and more data, we're gonna be able to find you. There's no way to get around that, right? Even differential privacy is not going to get around that; it's just going to set parameters on the estimates. There's really no difference between multiple releases and synthetic data sets, so we could release all these other data sets as we want, but again, eventually you might be able to figure out who these people are. There is no way to see the raw data, and this is I think the biggest criticism people are really concerned about: well, I don't get to see the real data. Well, you never get to see what the real data are, right? As Frank showed
that you just don't get to see it, right? In fact, as I said, you only got to see a five percent sample. It could be that with differential privacy you could actually get the entire 100 percent census data, just perturbed a bit. There are also problems with outliers, and they may require larger sample sizes; again, I think the sample size is a big deal here. And it introduces statistical noise. A lot of us are not comfortable with a data set that has noise in it. And, having been there and seen what we did to perturb some of the data, it's not just noise; it's noise imposed by individual people's interpretation of what we think would hide people. Whereas with differential privacy you'll actually hear: here it is, here's the epsilon, here's the shock of noise, go at it. And you'll be able to adjust your standard errors. We'd love to do everything with differential privacy. The big issue, and I think it's hidden in this, is this idea of a privacy budget. We talked about that epsilon. This one turns out to be 2.5, and what 2.5 means is that's the total. So if I have 2.5 and I'm gonna release a national estimate, a state estimate, a county estimate, a tract estimate, a block estimate, and the microdata with all these PUMAs, I have to add up all those epsilons, and all of those together have to be less than 2.5. So this is the problem: if I have this, and then I want to release a linked data set with the tax data and cut it to public-use data, that's gonna have to also be under this epsilon. So the key is, and this is again what you can reply to on the Federal Register notice: who's gonna decide what it is? Is it 10, 2.5, 86? I don't know who's gonna decide that, and I think that's where Census needs, in my opinion, us as researchers to say where we are along this spectrum and how this will affect other data sets. So let me take a couple of minutes to talk about synthetic data. So synthetic data, so
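
As a rough illustration of the privacy budget arithmetic the speaker has just described, the Python sketch below splits a total epsilon of 2.5 across the releases he lists and adds noise to a single count at each level. The shares, the choice of the Laplace mechanism, and the names `SHARES`, `allocate`, `laplace_noise`, and `noisy_count` are assumptions made for this example only; the talk does not say how the Census Bureau divides the budget or which noise distribution it uses.

```python
import math
import random

TOTAL_EPSILON = 2.5  # total privacy-loss budget quoted in the talk

# Hypothetical shares of the budget per release (illustrative only,
# not the Census Bureau's actual allocation).
SHARES = {
    "nation": 0.10,
    "state": 0.15,
    "county": 0.20,
    "tract": 0.20,
    "block": 0.20,
    "microdata": 0.15,
}


def allocate(total_epsilon, shares):
    """Split a total epsilon across releases; shares must sum to 1."""
    if not math.isclose(sum(shares.values()), 1.0):
        raise ValueError("shares must sum to 1")
    return {name: total_epsilon * share for name, share in shares.items()}


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism for a count: noise scale is sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


budget = allocate(TOTAL_EPSILON, SHARES)
rng = random.Random(2020)
for level, eps in budget.items():
    print(f"{level:10s} epsilon={eps:.3f} "
          f"noisy count of 100: {noisy_count(100, eps, rng):.1f}")
print("total spent:", round(sum(budget.values()), 3))
```

Under this simple additive accounting, giving one release a larger share means every other release gets a smaller epsilon and therefore noisier numbers, which is why any additional product, such as a linked public-use file, has to be planned inside the same total.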
there are two ways to do this. You can shock the estimate you get, and I think that's what John was showing: you shock the tract, so in that area you know the distribution of these characteristics. Or you can actually shock the internal microdata that's used to create it. And Census has been doing this for a long time. So these are four different areas where they do this. OnTheMap is the biggest one. Then the Longitudinal Business Database is one where they actually shock the microdata. SAIPE is a small area estimate which actually is used to allocate Title I funding, so here they're using a shocked, model-based estimate based on the underlying data to actually allocate funds. So it's already done. And then my favorite is the SIPP Synthetic Beta file, which is the SIPP linked to all your Social Security earnings records back to the 1950s. We can find your entire life, but we have to shock that to make it able to be released. And if you look at it, it's not that bad, and I think that's where the data starts going. So on the SSB results, with this synthetic data, they looked at the distribution of the share of household income earned by the wife. There's a sharp cliff at half: when you hit half between husband and wife, there's a big jump in the percentage shares. And if you look at the internal results, you also get the jump; it's a bigger jump. But what you could do is do your analysis on the synthetic data and send the code, this is exactly what they did, to the Census Bureau. They would run it on the internal data, and you could walk away and say, hey, I get the same results, or they're very close, I'm comfortable using the synthetic data. So what can you do? The first thing is you respond to this Federal Register notice, and again, this matters. I've been there; a number of times I've had to answer these. There was a question on the ACS that was actually retained and not changed because three hundred people
organized by Steve Ruggles sent in three hundred letters that basically said the same thing. So if you answer this and say, hey, we need these tabulations, we need this data, we want the raw data, whatever you want to say, it definitely matters, because they have to read all these and respond to every single one. The other thing is you can participate in these advisory committees, so you go to CSAC, or you go to COPAFS or APDU. COPAFS is the Council of Professional Associations on Federal Statistics; APDU is the Association of Public Data Users. Go to the meetings, make your voices heard. Don't just sit and complain about your data; provide suggestions. Talk to Census staff constantly, and do it like you should vote: early and often. And then research this idea of accuracy. Be very specific about the effects of these privacy adjustments, of what this is gonna do to the accuracy. We have no idea what this does, but you could research this and figure it out. And the final thing, and this is a big deal: the problem with disclosure avoidance is that people don't know what it is. So you go talk to the Census Bureau, and I think Trent experienced this, you can't get a straight answer, and you can't get answers from people because they don't know what the disclosure avoidance is. Or they say, I got this result, but I can't tell you because I'm sworn in under Title 13. So if you can train all the Census staff, and train all of us, on how to use this new noise infusion or new synthetic data or whatever you want to call it, I think that's the key. And this also came out of the report by Bob Groves: we have to train everybody on these new techniques. Thank you.

So thank you, David. I think we will somehow make sure that everybody has the link to the RFI so that you can submit to the Census Bureau your feedback on the kinds of information that the Census Bureau should be making available from Census 2020 and other data products. One thing I will say is that what I've
heard over and over again is that it's really important for researchers and others to articulate the use cases that you have. What is it that you want this tabulation, this kind of data, for, and how disaggregated, how much noise could you live with? Because how much of the privacy budget should get used up for this particular use case depends on what you need to have the data be useful for that purpose. And if we just say we want more data, we're less likely, I think, to get an effective response than if we say these are the use cases for which we need this kind of disaggregation or this little noise. So I really urge people to respond to that. Our last speaker is someone who's going to talk to us, I think, about another way of managing the privacy issues related to the Census Bureau, and that is Joelle Abramowitz. Joelle is an assistant research scientist here at the Survey Research Center. She came to us, also from the US Census Bureau, in 2016. She is the co-director of the Michigan Federal Statistical Research Data Center, and she has a PhD in economics, not from a Big Ten school, from the University of Washington.

Thanks, Maggie. Thanks for having me today. So as Maggie mentioned, today I'll be talking about what I think is a critical component in thinking about this question about privacy and data. We know, especially as researchers, that both data access and protecting privacy are important. Federal agencies administer censuses and surveys and collect information from administrative records. These activities produce a wealth of information that is used in a myriad of ways, from community planning to academic research. Safeguarding the information of individuals and firms providing these data is a priority. RDCs, which I'll be talking about today, facilitate access to restricted data while protecting privacy. To balance access to data and privacy, some data are made available publicly, while others are only made available on a
restricted basis. Federal Statistical Research Data Centers, which I'll call RDCs, were established to provide access to the restricted data. RDCs enable qualified researchers with approved projects to access confidential, unpublished data from the federal statistical system. This quote is from a former US Census Bureau director about RDCs: this research data center allows us to engage researchers outside of Washington in using this very important data while also protecting the public's right to privacy. Now I'll tell you a little bit about how RDCs protect privacy. RDCs provide access to restricted census data as well as data from other federal statistical agencies. Each RDC is a secure facility. All RDC research output goes through a rigorous disclosure avoidance review process to ensure that no confidential information is released. Each RDC is staffed by a Census Bureau employee, and each RDC is part of a greater network of RDCs. The network includes 29 RDCs around the country. RDCs are joint projects of the US Census Bureau and their home institutions, so we have one here at Michigan, in the basement of this building, and it is a joint project of the Census Bureau and the University of Michigan. Here's a map of all the RDCs around the country, with the existing locations in blue and locations in development in red. By having this network across the country, RDCs facilitate collaboration across locations, so that researchers in different places can be working on the same project, and as researchers move across their careers they can still be involved in this research and not be tied to one location. Working in an RDC provides access to data that are just not available elsewhere. Some data in particular are just not available publicly at all. These include establishment-level business data as well as linked household and firm data. Some data are available publicly, but the restricted versions provide more detailed information. These might include
detailed geospatial variables. There's also virtually no top or bottom coding in the RDC, and in the RDC it's also possible to link data to other non-census data. The RDCs provide Census Bureau data and, increasingly, data from other federal agencies. So while they started with only census data, with data like the decennial census and demographic surveys, and also data from economic censuses and surveys and linked business and household data, they've now expanded their scope to include data from other agencies, like the National Center for Health Statistics and the Agency for Healthcare Research and Quality, which have included their public health survey data in the RDCs, as well as the Bureau of Labor Statistics, which has included some of their labor survey data in the RDCs. And it's an exciting time, in that more agencies are looking to join the RDC system in the future. In addition to these different data sets, RDCs also facilitate linkage of different data products, providing unique research opportunities. These include linked business and household data as well as linked survey and administrative data. And in addition to all of these survey products, there are also a number of projects coming out of these sorts of linkages that are available in the RDC. An exciting piece about the RDCs is that they promote broader access to data than before. RDCs are accessible by any academic. The application process is transparent. RDC administrators and agency analysts assist researchers in preparing their applications, and there's an increasing emphasis on timely review. When we think about how we protect privacy in the system, there are a number of requirements for accessing the data. So while the data are available broadly, there are requirements to ensure privacy. One of these is that any researcher looking to use data in an RDC must submit a research proposal that goes to the federal agency that is responsible for those data. In addition, once that proposal is approved, they also have to
obtain special sworn status. So access is restricted to Census Bureau employees or researchers who have special sworn status with the Census Bureau. To do that, the researcher must either be a US citizen or have lived in the US for three years, and those obtaining special sworn status must obtain security clearance and must sign and make a sworn statement about preserving the confidentiality of the data for life. So it's not just during the time that they're working with the data; even when they are no longer working in the RDC, they can't release any restricted information. Access to these non-public data has enabled innovative research published in leading academic journals. Just some topics that are relevant to current news and policy conversations: our researchers have assessed residential mobility and the geographic distribution of the healthy. They're able to do that because they have access to more detailed geographic information in the restricted data. Our researchers have examined the relationship between environmental emissions and health outcomes. They've quantified the impact of maternal access to the birth control pill on child poverty. And they've investigated the effects of interactions with international markets on US firm performance, using data on businesses that just wouldn't be available elsewhere. These are just a sample of the wide variety of research that has come out of our RDC. So in summary, the RDC network is a vital resource for providing data access while protecting privacy. It reduces the amount and scope of publicly available data to protect privacy, so if we think about the privacy budget that David and John were talking about, this way there are fewer estimates being released, but researchers can still work with data in a protected space. It permits access to restricted data in a secure and controlled manner to facilitate important research. RDCs are increasingly necessary
as additional differential privacy measures are implemented, and understanding the role of RDCs can be valuable in helping the public have confidence in the confidentiality of their responses to federal censuses and surveys, which is important as we think back to our earlier panel and helping people have confidence when they answer the census.

We are, not surprisingly, running over, but we'll try to take time for a few questions, if people have questions of our speakers.

Just linking Joelle's presentation to the others, just to clarify: are there any implications of the differential privacy steps that are being taken for the RDCs?

The short answer is what Dave and I were both referring to briefly: you have a fixed privacy budget, and then you have to allocate it to anything, in quotes, that's released, and that "anything" includes, for example, anything that comes out of an RDC. One of the critical factors, though, is that often within an RDC you will have somebody carrying out in-depth analyses in a whole lot of different ways, for example diagnostic plots and other things like that, that they will then not be intending to release to the public except in a summary form. For example: we carried out a careful analysis of this and we did not find unusual outliers, or something like that, as opposed to actually releasing the graphs themselves, which, if you're doing outlier detection or something like that, could involve a lot of, quote, leakage, unquote, of information. And so in some ways we believe that the RDCs are an important component in managing that total epsilon, because eventually, if most of what you're releasing from there are highly aggregated results, for example coefficients or other model parameter estimates and some relatively modest information about uncertainty, like standard errors or confidence intervals or something like that, that is some leakage, but it's not nearly as much as if you're having to drop
a lot of, a very large number of, tables or something like that into the public.

So as I understand it, the raw reported data will be in the RDCs, and it's just the estimates you take out. And I don't think the Census Bureau has figured out how to adjust regression coefficients, a way to do the regression coefficients, but yeah.

I was just gonna add one other thing, which I have heard at least from John: right now there are disclosure protections on the data that are in the Census RDCs. There is some swapping and things like that. Once we have implemented differential privacy and figured out how to add a little bit of noise to, you know, your maximum likelihood estimates, which don't use that much of the privacy budget and probably don't need very much noise, once we know how to do that, then we will actually be able to use completely unprotected, that is to say unswapped, uncapped, unprotected, data in the RDCs. We will solve the problem not by distorting the underlying data but by adding a little bit of noise to the output, which I think is a better solution than messing with the underlying data when we don't know exactly what the implications of that will be for the subsequent analysis. So I think right now anybody who's working in the RDCs knows that this is a challenge, because there's a lot of uncertainty around disclosure from the RDCs. But I think in the long run it actually makes the RDCs both more important, in terms of helping to reduce the use of the privacy budget, and it also allows them to actually offer higher-quality access to data.

This is Rod Little here. It's a very complicated area. I've done a bit of research on disclosure avoidance, and developing the methods is much easier than figuring out the trade-offs and the risks. But I've always had a feeling that noise
injection is kind of a clunky way of doing it, and swapping is also a kind of clunky way of doing it, and creating synthetic data, essentially by modeling the data and then synthesizing, seems like a more promising avenue to me for dealing with these issues. I don't know if the panelists have any opinions about that. But by the way, I did invent a method that I thought had a very good acronym, so getting a good name is also important: a method called synthetic multiple imputation of keys, which becomes Smike, who's the character in Dickens who is stealing stuff. So I always loved the acronym.

I don't have an immediate overall response to that in any meaningful depth. It's obviously an empirical question to a very large degree, and that's an example where, as I was referring to at the start, you have a certain body of methodology and technology, and there's a certain level of development that's a snapshot of what we have now. We hope in X years we have something much more refined. Exploring a lot of the avenues, the options that Rod was referring to a moment ago, and understanding in a much more nuanced way where the benefits and risks are attached to each of those, is going to be, we believe, one of the real benefits that we have from doing ongoing research and getting collaboration with everybody across sectors.

[...] that yields similar results to the original dataset. So the key is how many of these do we have to make, and can they do it on the fly?

I also, I mean, I thought that was what Dave was going to say: that really these are complements, not substitutes, right? So having synthetic data out there in the public, where you can then reproduce the results, or try to reproduce the results, inside an RDC, and then the results that come out of that are differentially private, those actually allow greater access while still maintaining scientific utility. All right, I will stop answering other
questions.

I believe in the first presentation there was a reference that this needs to be done at the national level, or it's best done that way, but my recollection of the release of the P.L. 94-171 data is that it trickled out several states at a time. So how can you do that?

A more refined way of saying it is that it should be done at an omnibus level for quote-unquote all of the releases you're intending to do. That's somewhat distinct from saying we are releasing these on a certain schedule. With the structure we have for privacy budgets, it is going to be more efficient, from a statistical and again risk-privacy trade-off point of view, if we effectively allocate that budget ahead of time and say this much of the budget goes to each piece, and then specify things. David had a nice breakout of that in terms of several different components that one could contemplate. But you have to have a clear idea of what each of those components is, as opposed to somebody two years after the fact coming back in and saying, "Oh, we have the following pressing additional need." If you have not accounted for that in your privacy budget a priori, for example we've got a natural disaster or something like that and we now need a separate breakout, then that would have to be handled separately.

I had a question going back to the P.L. 94-171 data, and we love that here. So obviously the citizenship question, the data produced from that will be part of that, because that will be used for redistricting purposes. I had a question related to disclosure prevention for that. If it's released at the block level, there are something like 6 million populated census blocks; the median size in terms of population is 23, mean households, 10; and about 10% of populated census blocks have only one household. Getting back to the data sharing with other agencies, I don't expect that the underlying data will be shared, but to what extent can you
kind of ensure that some entrepreneurial data scientist at DHS won't take their visa data for where non-citizen residents are living, geolocate that, and then connect it to the...

I don't have anything to add to that except Al's comment during his presentation that the citizenship issue is currently subject to litigation, and therefore we're pretty much [unclear] on that. James, do you have anything more you want to add to that?

We are having very narrow discussions about what the P.L. file will look like from 2020, but there are no decisions yet made. So the only thing we have decided on is what we are doing for the prototype data file that's going to be released in March, which is coming off the 2018 test, and obviously there was no citizenship question there. But it's our assumption, I think, that the disclosure avoidance techniques that are being developed through differential privacy are going to protect the microdata, the micro-detail file, and then any tabulations off of that will have that protection already applied.

All right, if there are no other questions, thank you guys all for coming. We will put all the information, slides, videos, everything on our website; we're at isr.umich.edu. Thank you.

get started for next session system once the decision is made about where we want to live on that curve is summarized for
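
The "fixed privacy budget" that the panelists describe in the Q&A above can be illustrated with a minimal sketch. It assumes basic sequential composition (the epsilons of successive releases add up) and a textbook Laplace mechanism for counts. The names `PrivacyBudget`, `laplace_noise`, and `noisy_count` are hypothetical, and nothing here is meant to represent the Census Bureau's actual disclosure avoidance system.

```python
import random


class PrivacyBudget:
    """Tracks a fixed total epsilon under basic sequential composition (illustrative only)."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon):
        """Charge one release against the budget; refuse it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, budget, rng):
    """Release a count with Laplace noise, assuming sensitivity 1 (one person changes a count by 1)."""
    budget.spend(epsilon)
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2020)
    budget = PrivacyBudget(total_epsilon=1.0)   # hypothetical total budget
    # A block with the median population mentioned in the Q&A (23 people).
    print("noisy block count:", noisy_count(23, 0.4, budget, rng))
    print("remaining epsilon:", budget.remaining)
```

In this sketch every release, whether a public table or an estimate leaving an RDC, draws on the same `PrivacyBudget`, which is why an unplanned later release would fail unless epsilon had been set aside for it ahead of time.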
