Identity Finder and Project SIR FAQ

As an important component of Project SIR (Sensitive Information Remediation), an application called Identity Finder will assist UNC faculty and staff in scanning their computers and other online file space to identify sensitive information, and in securely deleting or modifying documents containing this information or migrating them to a secure storage environment.

Why are we doing this?

2011 Enterprise Risk Assessment Findings: Sensitive University information is “nearly ubiquitous”

  • Sensitive information is often stored on end user computers and central file stores.
  • Multiple copies of the same sensitive file are often found on a variety of systems.
  • Sensitive files tend to migrate with users when they are assigned a new computer or their roles transition inside the organization.
  • Older, sensitive information is seldom purged.

What are the project’s goals?

  • Seek and identify sensitive, University-owned information with Identity Finder.
      • ITS has licensed Identity Finder for all faculty, staff and appropriate students.
      • Phase I will scan for: Social Security Numbers, credit card numbers, and passport numbers. Later phases may include scanning for additional identifiers.
  • Remediate sensitive information.
      • Delete if unneeded (using Identity Finder).
      • Remove only sensitive fields (e.g., replace 123-45-6789 with xxx-xx-xxxx) (using Identity Finder).
      • If retention is required, store safely on professionally managed, central file storage.
  • Manage sensitive information appropriately going forward.
      • Appropriately classify.
      • Store safely on best-practice (for sensitive information) managed file storage.
      • Review on a schedule for deletion according to retention schedule approved by appropriate data steward.
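The "remove only sensitive fields" option above can be illustrated with a minimal Python sketch. This is not how Identity Finder itself performs redaction; the pattern and the function name mask_formatted_ssns are assumptions chosen for illustration, and only the formatted NNN-NN-NNNN case from the example above is handled.

```python
import re

# Illustrative pattern for a formatted SSN (NNN-NN-NNNN).
# This is an assumption for the sketch, not Identity Finder's actual rule.
FORMATTED_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_formatted_ssns(text: str) -> str:
    """Replace every formatted SSN in text with the placeholder xxx-xx-xxxx."""
    return FORMATTED_SSN.sub("xxx-xx-xxxx", text)


if __name__ == "__main__":
    print(mask_formatted_ssns("Employee SSN: 123-45-6789 (on file)"))
```

The rest of the document is left untouched, which is the point of this remediation option: the record remains usable while the sensitive field no longer appears in it.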

Who manages what information? Who decides what data to shred or keep? What is a “Data Steward”?

“Data Stewards” are individuals with primary responsibility for a set of data, including any sensitive information within that data. The Data Steward for institutional administrative data is typically the lead administrative department with responsibility for that data, e.g., the Finance Office for institutional financial data, Human Resources for employment data, and the Registrar’s Office for student data (please see the Institutional Data Governance Policy for a full discussion of data stewardship). As noted in the 2/15/2012 report of the “Provost’s Task Force on the Stewardship of Digital Research Data,” the PI/researcher has primary responsibility for research data, though the institution shares responsibility. Individual members of the UNC-CH community have stewardship responsibilities for data on their individual machines, including individual use storage (such as file systems) and for any institutional data which may be held on personally owned devices.

Generally the Data Steward needs to make retention decisions about their data so the steward – typically the primary user of individual use workstations or file storage – is usually the only individual who can authoritatively review sensitive information to ensure its appropriate disposition. This is why Project SIR will involve almost all UNC-CH faculty and staff in searching for and appropriately addressing sensitive information on their machines and storage.

In many cases, while an individual may occasionally need access to sensitive information, that information is available from the primary Data Steward. Prior to modern regulations regarding sensitive information, individuals would sometimes keep sensitive information for possible future reference. UNC-CH has determined the risk of these “just in case” copies outweighs the benefit, so such sensitive information should be securely deleted from individual machines and file storage.

What is the scope of the project?

  • Remediate sensitive information on all faculty and staff computers.
      • Laptop, desktop and file encryption protects files on equipment that is stolen. When a computer is running, however, the files on even an encrypted computer are at risk of exposure from computer intrusions.
  • Remediate sensitive information on select student computers (i.e., student employees using personal machines, and those who likely have sensitive information on their computers due to the nature of their studies).
  • University-owned shared and individual storage
  • University-owned servers
  • For each computer, server or storage space in scope, there are two primary tasks:
      • Execute the scan.
      • Address the match list.
          • Dismiss false positives.
          • Remediate true positives.

Time spent scanning and remediating the information will vary based on the amount of data scanned and the amount of potentially sensitive information identified by the scan. A scan can take as little as 1 hour, or more than 8 hours. Remediation of the match list may take a few minutes or a few hours.

What is the proposed timeline for the activity?

ITS will complete scans/remediation of desktops/laptops, ITS’s space and AFS space by late Spring 2014.

The Information Security Office will begin working with Project SIR early adopters in March 2014.

Early adopter departments include: Finance and Administration, Human Resources, Office of Development, School of Social Work, and College of Arts and Sciences (selected departments).

Additional campus units will begin scanning in Spring 2014. Units are asked to first scan the areas most likely to contain sensitive information. The Information Security Office can assist in identifying potential high-risk areas.

ITS will provide the following support:

  • Tools to scan and identify sensitive data
  • Documentation and Frequently Asked Questions (FAQs)
  • Project plans and lessons learned from ITS’s experience
  • Additional consultation and guidance as requested

Individual units will manage the actual scanning timeline and schedule.

A Data Steward is the individual responsible for the oversight of information in their area. Data Stewards will be asked to attest to remediation of data under their oversight.
The Project currently has a goal of remediating 90% of systems within scope before Spring 2015.

What is Identity Finder?

UNC-Chapel Hill is using Identity Finder to proactively locate and secure sensitive information on computers and servers, so that the information is not left vulnerable to potential unauthorized access. Identity Finder is an application that searches a computer for sensitive information such as social security, credit card, and passport numbers. It also provides a way to manage such information once it is identified.
Identity Finder is highly configurable. This FAQ is constructed specifically for UNC-CH’s configuration of Identity Finder as deployed in the first phase of the 2014 Sensitive Information Remediation Project. It is possible to manually reconfigure Identity Finder in ways that cause it to do more – or to do less – than the standard UNC-CH configuration. For this reason, the help documents focus on the behavior of Identity Finder as configured for UNC-CH. Please contact the Information Security Office if you have a reason to consider reconfiguring Identity Finder for your particular scans.

Why is UNC-Chapel Hill using Identity Finder?

Identity Finder is data identification software used to discover sensitive information stored on computers. As part of the University’s information security strategy, UNC-Chapel Hill is using Identity Finder to proactively locate and secure sensitive information on University-owned computers and servers, so that the information is not left vulnerable to potential unauthorized access.

Are there any recommendations for working with Identity Finder?

  • Review the help documentation.
  • Initially, you should accept the search defaults (what type of data to scan and where to scan it).
  • For your first scan, choose a system that is not mission critical, such as a desktop. Familiarize yourself with Identity Finder by running searches on that system.
  • Practice moderation. Scan only as much as you can address in a reasonable amount of time.
  • Identify in advance where you will store sensitive data should you discover it.

What is a false positive?

A false positive is a match in Identity Finder that may look like an SSN, credit card number, or passport number, but is actually just a series of numbers in a format similar to the search pattern. The numbers could be the same length or start with the same set of numbers as, for example, credit cards.
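To see why a number with the right length and prefix can still be a false positive, consider the Luhn checksum, a standard check digit scheme used by payment card numbers. The sketch below is illustrative only: this FAQ does not describe how Identity Finder validates its matches, and the function name luhn_valid is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum.

    Non-digit characters (spaces, dashes) are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    # A widely published Visa test number, and the same number with the last digit changed.
    print(luhn_valid("4111 1111 1111 1111"))
    print(luhn_valid("4111 1111 1111 1112"))
```

Both inputs have 16 digits and begin with 4, so a search based only on length and prefix would flag both, yet only the first passes the checksum. Even a number that passes such a check may still be a false positive, so each match should be reviewed before it is dismissed or remediated.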

Where does Identity Finder search on my computer?

By default, Identity Finder for Windows and Mac will search the following locations for sensitive information:

  • E-mails and Attachments
  • Files and Compressed Files

What information does Identity Finder locate?

In this first phase, the University is only scanning for three types of sensitive information:

  • Social Security Numbers: Identity Finder searches for formatted SSNs (NNN-NN-NNNN) and unformatted SSNs (NNNNNNNNN).

However, to reduce false positives, the University has implemented several restrictions for matches on an unformatted SSN. The file must also contain the keyword “SSN” or “Social Security” somewhere in the file. Also, for most files, three matches are required for this type of number before Identity Finder will report a match. For a PDF, only one match is needed, as many PDF travel forms require an SSN. Because UNC-CH PIDs typically follow the pattern of an unformatted SSN, requiring the keywords “SSN” or “Social Security” greatly reduces false positives. Formatted SSNs do not have the same restrictions.

  • Credit Card Numbers: Identity Finder searches for MasterCard, Visa, Discover, American Express, Diners Club.
  • Passport Numbers
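The SSN matching rules described above can be sketched in a few lines of Python. This is an approximation of the stated rules, not Identity Finder's implementation. The regular expressions, the case-insensitive keyword test, the reading of "three matches" as three unformatted numbers in the same file, and the name reportable_ssn_matches are all assumptions made for illustration.

```python
import re

# Assumed patterns for the sketch; the product's real patterns are not documented here.
FORMATTED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")      # NNN-NN-NNNN
UNFORMATTED = re.compile(r"\b\d{9}\b")                # NNNNNNNNN
KEYWORD = re.compile(r"\bSSN\b|\bSocial Security\b", re.IGNORECASE)


def reportable_ssn_matches(text: str, is_pdf: bool = False) -> list:
    """Return the SSN-like strings that would be reported under the stated rules."""
    # Formatted SSNs are reported without further restrictions.
    matches = FORMATTED.findall(text)

    # Unformatted SSNs need a keyword in the file and a minimum match count:
    # three for most files, one for a PDF.
    unformatted = UNFORMATTED.findall(text)
    threshold = 1 if is_pdf else 3
    if KEYWORD.search(text) and len(unformatted) >= threshold:
        matches.extend(unformatted)
    return matches


if __name__ == "__main__":
    print(reportable_ssn_matches("PID 123456789"))               # no keyword present
    print(reportable_ssn_matches("SSN 123456789", is_pdf=True))  # PDF needs one match
```

A file containing a lone nine-digit PID and no keyword produces no match in this sketch, which mirrors how the keyword requirement keeps PIDs out of the results.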

Identity Finder is configured to not reset timestamps during search. Identity Finder is also capable of scanning for additional types of data. For information on expanding the scan to include additional types of information, please contact the Information Security Office or send a Remedy ticket to ITS-Security.

Does Identity Finder search .pst and .ost files?

Yes, Identity Finder searches both .pst and .ost files. Regardless of whether Outlook is open, any mail folders attached to a user’s Outlook Profile will be searched, and if a detached .pst file is found in the file system, it will also be searched.

Does Identity Finder search image files, e.g., scanned documents?

Identity Finder can search FAX images, PDF images, TIFs, JPGs, and almost all other major image formats to identify sensitive information.
Optical Character Recognition (OCR) is used to search for text within images. The following file types are supported: bmp, dcx, gif, jbig2, jp2, jpeg, jpf, jpg, jpg2000, jpm, jpx, max, pcx, png, tfx, tif, tiff, xif, xiff, and xps. If the DPI of an image is less than 75 or greater than 2400, the recognition may fail and log an error.
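As a quick way to predict whether an image is a good OCR candidate under the limits above, the check can be sketched as follows. The extension list and DPI bounds come from this FAQ; the function name ocr_likely_to_succeed is hypothetical, and the caller is assumed to already know the image's DPI (the sketch does not read it from the file).

```python
from pathlib import Path

# File types listed in this FAQ as supported for OCR.
OCR_EXTENSIONS = {
    "bmp", "dcx", "gif", "jbig2", "jp2", "jpeg", "jpf", "jpg", "jpg2000", "jpm",
    "jpx", "max", "pcx", "png", "tfx", "tif", "tiff", "xif", "xiff", "xps",
}

# Recognition may fail below 75 DPI or above 2400 DPI.
MIN_DPI = 75
MAX_DPI = 2400


def ocr_likely_to_succeed(filename: str, dpi: int) -> bool:
    """Return True if the file type is supported and the DPI is within range."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in OCR_EXTENSIONS and MIN_DPI <= dpi <= MAX_DPI


if __name__ == "__main__":
    print(ocr_likely_to_succeed("travel_form.TIF", 300))
    print(ocr_likely_to_succeed("old_fax.tif", 72))
```

Images that fail this check, such as a low-resolution fax scan, are worth reviewing manually, since sensitive text in them may not be detected.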

Will a scan slow my computer?

The first Identity Finder scan may take some time, depending on the size of the disk and the power of the computer. We recommend starting the initial scan prior to leaving work for the day. Subsequent scans are generally fast and do not materially affect system performance.

How long does a scan take to complete?

The length of time to complete a scan depends on the amount of data being searched and your computer’s performance.

How often should I run a scan?

You can run Identity Finder at any point when you think you may have collected new sensitive information. The University is likely to establish a periodic scanning requirement based on the results of the SIR project.

How much will Identity Finder cost?

Identity Finder is a centrally funded University initiative and is available at no cost to end users (faculty and staff) at the University.

How do I reset my Identity Finder profile password?

Identity Finder provides the ability to save settings, configuration information, and sensitive data across sessions through the use of a profile password. It is not possible to recover a lost password; however, it is possible to delete a profile and create a new one. When the profile password is created, that password is used to encrypt the profile. The profile password is not stored anywhere and therefore if it is lost or forgotten, all of the information in the profile will be lost.

The following data will be lost in Identity Finder when deleting a profile:

  • Custom Folders, Remote Computers and authentication credentials
  • Only Find Identities
  • Document Overview
  • Ignore list entries
  • Password Vault entries
  • Database connection information
  • Websites list


Why is my virus scanner creating alerts during Identity Finder searches?

During the course of an Identity Finder search, anti-virus applications may create an alert for files created in a subfolder of IDFTmpDir located in the user profile folder. This is not a problem with Identity Finder, but rather indicates that the user’s system already contains one or more infected files.

The files in IDFTmpDir are created during a search, specifically and most commonly when extracting files from archives (e.g., .zip files) or when detaching them from email messages. To search these files, Identity Finder places them in a temporary folder and then attempts to open them for read access. If the file has a virus, the act of extracting or detaching the file to the temporary folder and/or the attempt to read the file may trigger the anti-virus application (depending on its configuration). If Identity Finder is configured to log Locations Searched, you may be able to determine the specific archives or messages that contain the infected file(s); however, in these instances, it is recommended that you perform a full anti-virus scan of the user’s system ensuring a search within archive files and e-mail attachments.

For additional details on the location of the user profile folder for each operating system, please refer to the Windows or Macintosh configuration guide.

Records Retention Policies

Individuals with non-master copies of sensitive information should securely delete that information, recognizing that a new copy can be obtained if it is needed. If there is a strong business need to retain a copy, it must be secured.

  • Please consult the appropriate Data Steward to understand how long various records should be retained.
  • Data owners/stewards should dispose of any sensitive information that is no longer needed.

Can I scan AFS storage?

At this time it is not possible to scan AFS storage using Identity Finder.

How often do I need to rescan?

Regular scanning is a responsibility when working with sensitive information. The University will evaluate the risk remaining after the first pass of scanning and consider the effort involved in a subsequent scan to determine how often rescans should be done. This could vary by area based on risk.