As an important component of Project SIR (Sensitive Information Remediation), an application called Identity Finder will assist UNC faculty and staff in scanning their computers and other online file space to identify sensitive information, and in securely deleting or modifying documents containing this information or migrating them to a secure storage environment.
Why are we doing this?
2011 Enterprise Risk Assessment Findings: Sensitive, University information “nearly ubiquitous”
- Sensitive information (SI) is often stored on end user computers and central shared storage
- Copies of the same sensitive file are often found on multiple systems
- Sensitive information tends to migrate with users when they are assigned a new computer or their roles change within the organization
- Older, sensitive information is seldom securely deleted
What are the project’s goals?
- Seek and identify sensitive, University-owned information using Identity Finder
- ITS has licensed a file scanning application, Identity Finder, for all faculty, staff and any students who may have SI
- Scan for: Social Security numbers, credit card numbers, and passport numbers
- Upon departmental leader request, the ITS technical team will help a department organize to scan for additional identifiers.
- Remediate sensitive information
- Delete the document containing sensitive information if it is not needed (using Identity Finder).
- If the document is needed, remove only sensitive fields (e.g., replace 123-45-6789 with xxx-xx-xxxx) (using Identity Finder)
- If retention of the sensitive information is required, store the SI safely on professionally managed, central file storage that meets the requirements of the System Administration Initiative (SAI). When essential for intensive local use, the SI may be stored on workstations or laptops that meet the required, enhanced security standards (see page 18 of the Information Security Policy) and the Sensitive Workstation Controls standards.
- Manage sensitive information into the future
- Appropriately classify information regarding whether it is sensitive
- Store safely on SAI-approved file server or on a laptop or desktop secured as described above
- Review regularly according to the retention schedule approved by the appropriate data steward
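The field-level remediation described under "Remediate sensitive information" above (replacing 123-45-6789 with xxx-xx-xxxx) can be illustrated with a minimal sketch. This is not how Identity Finder is implemented; the pattern and the function name `redact_formatted_ssns` are assumptions made only for illustration.

```python
import re

# Assumed pattern for a formatted SSN (NNN-NN-NNNN); Identity Finder's own
# matching rules are not documented here.
FORMATTED_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_formatted_ssns(text):
    """Replace each formatted SSN with a fixed mask, leaving other text intact."""
    return FORMATTED_SSN.sub("xxx-xx-xxxx", text)


print(redact_formatted_ssns("Employee SSN: 123-45-6789"))
```

Only the sensitive field is replaced, so the rest of the document remains usable.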
What is the scope of the project?
- Remediate sensitive information on all faculty and staff computers, even those that are encrypted, prioritizing those that are NOT encrypted for the first phase of the project
- Remediate sensitive information on select student computers (i.e. students who likely have sensitive, University-owned information on their computers due to the nature of their studies or employment)
- University-owned shared and individual storage running MS Windows or Mac OS X, or searchable from an Identity Finder client installed on those operating systems
- University-owned servers running Microsoft Windows prioritizing non-SAI servers for the first phase of the project
- For computers, servers or storage space within the scope of the project, there are two primary tasks:
- Perform the scan
- Review the resulting match list and resolve flagged entries
- Dismiss false positives
- Remediate true positives through file deletion, removing only the sensitive information from the file, or storing the file with SI on a SAI-approved file server
Time spent scanning and remediating the information will vary based on the amount of data scanned and the amount of potentially sensitive information identified by the scan. A scan can take as little as 1 hour, or more than 8 hours. Remediation of the match list may take a few minutes or a few hours.
What is the proposed timeline for the activity?
The ITS Information Security Office began working with Project SIR early adopters in March 2014.
ITS completed scans and remediation of desktops, laptops, ITS’s storage.unc.edu space and AFS space by June 30th, 2014.
Other campus units may begin scanning on July 28, 2014. Units are asked to scan high-risk areas first. The ITS Information Security Office can assist in identifying potential high-risk areas.
ITS will provide the following support:
- Tools to scan and identify sensitive data
- Documentation and Frequently Asked Questions (FAQs)
- Project plans and lessons learned from ITS’s experience
- Additional consultation and guidance as requested
Individual units will manage their own scanning timeline and schedule.
- End user laptops and desktops that are not encrypted should be scanned and remediated by July 15, 2015
- Windows servers that are not in SAI should be scanned and remediated by December 31, 2015
- Shared storage slices should be scanned and remediated by December 31, 2015
Who manages what information? Who decides what data to securely discard or keep? What is a “Data Steward?”
“Data Stewards” are individuals with primary responsibility for a set of data, including any sensitive information within that data. The Data Steward for institutional administrative data is typically the lead administrative department with responsibility for that data, e.g., the Finance Office for institutional financial data, Human Resources for employment data, and the Registrar’s Office for student data (please see the Institutional Data Governance Policy for a full discussion of data stewardship). As noted in the 2/15/2012 report of the “Provost’s Task Force on the Stewardship of Digital Research Data,” the PI/researcher has primary responsibility for research data, though the institution shares responsibility. Individual members of the UNC-CH community have stewardship responsibilities for data on their individual machines, including individual use storage (such as file systems) and for any institutional data which may be held on personally owned devices.
Generally the Data Steward needs to make retention decisions about their data so the steward – typically the primary user of individual use workstations or file storage – is usually the only individual who can authoritatively review sensitive information to ensure its appropriate disposition. This is why Project SIR will involve almost all UNC-CH faculty and staff in searching for and appropriately addressing sensitive information on their machines and storage.
In many cases, while an individual may occasionally need access to sensitive information, that information is available from the primary Data Steward. Prior to modern regulations regarding sensitive information, individuals would sometimes keep sensitive information for possible future reference. UNC-CH has determined the risk of these “just in case” copies outweighs the benefit, so such sensitive information should be securely deleted from individual machines and file storage.
What is Identity Finder?
Identity Finder is an application that searches a computer for sensitive information such as Social Security, credit card, and passport numbers. It also provides a way to remediate such information once it is identified.
Identity Finder is highly configurable. This FAQ is constructed specifically for UNC-CH’s configuration of Identity Finder as deployed in the first phase of the 2014 Sensitive Information Remediation Project. It is possible to manually reconfigure Identity Finder in ways that cause it to do more – or to do less – than the standard UNC-CH configuration. For this reason, the help documents focus on the behavior of Identity Finder as configured for UNC-CH. Please contact firstname.lastname@example.org if you have a reason to consider reconfiguring Identity Finder for your particular scans.
Why is UNC-Chapel Hill using Identity Finder?
Identity Finder is data identification software used to discover sensitive information stored on computers. As part of the University’s information security strategy, UNC-Chapel Hill is using Identity Finder to proactively locate and secure sensitive information on University-owned computers and servers, so that the data is not left vulnerable to potential unauthorized access.
How much will Identity Finder cost?
Identity Finder is a centrally funded, University initiative and is available at no cost to end-users (faculty and staff) at the University.
Are there any recommendations for working with Identity Finder?
- Review the help documentation
- Practice moderation. Scan only as much as you can address in a reasonable amount of time.
- Identify in advance where you will store sensitive data should you discover it. You can contact your unit’s IT support staff to arrange for secure storage.
What information does Identity Finder locate?
The University is scanning for three types of sensitive information:
- Social Security Numbers: Identity Finder searches for formatted SSNs (NNN-NN-NNNN) and unformatted SSNs (NNNNNNNNN).
- Credit Card Numbers: Identity Finder searches for MasterCard, Visa, Discover, American Express, and Diners Club numbers.
- Passport Numbers
Identity Finder is also capable of scanning for additional types of data. For information on expanding the scan to include additional types of information, please send a Remedy ticket to ITS-SIR.
What is a false positive?
A false positive is a match in Identity Finder that may look like an SSN, credit card number, or passport number, but is actually just a series of numbers in a format similar to the pattern of the search criteria. The numbers could be the same length or start with the same set of numbers as, for example, credit cards.
To reduce false positives, the University has implemented several restrictions for matches on unformatted SSNs. The file must also contain the keyword “SSN” or “Social Security” somewhere in the file. Also, for most files, three matches are required for this type of number before Identity Finder will report a match. For a PDF, only one match is needed (many travel forms require an SSN). Because UNC-CH PIDs typically follow the pattern of an unformatted SSN, the keyword requirement (“SSN” or “Social Security”) greatly reduces false positives. Formatted SSNs do not have the same restrictions.
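The restrictions described above can be sketched in a few lines of code. This is an illustration of the stated rules only, not Identity Finder's actual matching logic. The regular expressions, the case-insensitive keyword check, and the function name `reportable_ssn_matches` are all assumptions.

```python
import re

# Assumed patterns; the real Identity Finder patterns are not documented here.
FORMATTED_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")   # NNN-NN-NNNN
UNFORMATTED_SSN = re.compile(r"(?<!\d)\d{9}(?!\d)")    # NNNNNNNNN
# Assumption: keywords are matched without regard to case.
KEYWORDS = re.compile(r"SSN|Social Security", re.IGNORECASE)


def reportable_ssn_matches(text, is_pdf=False):
    """Return the SSN-like strings that would be reported under the rules above."""
    # Formatted SSNs carry no extra restrictions.
    matches = FORMATTED_SSN.findall(text)

    # Unformatted SSNs need a keyword somewhere in the file and a minimum
    # number of matches: one for a PDF, three for other files.
    unformatted = UNFORMATTED_SSN.findall(text)
    threshold = 1 if is_pdf else 3
    if KEYWORDS.search(text) and len(unformatted) >= threshold:
        matches.extend(unformatted)
    return matches
```

Under these assumptions, a file that contains a single nine-digit PID and no keyword yields no matches, while a PDF travel form containing "SSN" and one nine-digit number yields one.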
Where does Identity Finder search on my computer?
By default, Identity Finder for Windows and Mac will search the following locations for sensitive information:
- Files and Compressed Files
- E-mails and Attachments
- Removable Storage Devices
If there is a device attached to your computer, such as a USB drive or cell phone, Identity Finder will try to search that device. If you do not wish to search the device, simply perform no action (do not click “OK” or “Cancel”) when the Identity Finder “Removable Storage Detected” prompt appears. (This is a known bug in the Identity Finder client.)
Does Identity Finder search .pst and .ost files?
Yes, Identity Finder searches both .pst and .ost files. Whether or not Outlook is open, any mail folders attached to a user’s Outlook profile will be searched, and if a detached .pst file is found in the file system, it will also be searched.
When performing an IDF scan, a window may appear regarding use of your Outlook mailbox.
You should select the option to “Use Old Data” to ensure that all data is scanned during your IDF search. Choosing the “Use Temporary Mailbox” option may not include all of your previous data in the scan.
Does Identity Finder search image files, e.g., scanned documents?
Identity Finder can search FAX images, PDF images, TIFs, JPGs, and almost all other major image formats to accurately identify all sensitive information.
Optical Character Recognition (OCR) is used to search for text within images. The following file types are supported: bmp, dcx, gif, jbig2, jp2, jpeg, jpf, jpg, jpg2000, jpm, jpx, max, pcx, png, tfx, tif, tiff, xif, xiff, and xps. If the DPI of an image is less than 75 or greater than 2400, the recognition may fail and log an error.
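As a rough illustration of the limits above, the following sketch checks whether an image falls within the listed file types and DPI range before it is included in a scan. The function name `ocr_likely_to_succeed` is hypothetical, and obtaining an image's DPI is left to the caller. Identity Finder performs its own checks internally.

```python
from pathlib import Path

# File extensions listed above as supported for OCR.
OCR_EXTENSIONS = {
    "bmp", "dcx", "gif", "jbig2", "jp2", "jpeg", "jpf", "jpg", "jpg2000", "jpm",
    "jpx", "max", "pcx", "png", "tfx", "tif", "tiff", "xif", "xiff", "xps",
}
# Recognition may fail below 75 DPI or above 2400 DPI.
MIN_DPI, MAX_DPI = 75, 2400


def ocr_likely_to_succeed(filename, dpi):
    """Return True if the file type is supported and the DPI is within range."""
    ext = Path(filename).suffix.lower().lstrip(".")
    return ext in OCR_EXTENSIONS and MIN_DPI <= dpi <= MAX_DPI
```

For example, a 300 DPI TIFF passes this check, while a 72 DPI PNG does not.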
Something doesn’t seem right about the results my local scan is reporting. Who should I contact?
Contact your local IT support person. Your local IT support person will contact the SIR team to determine any troubleshooting steps that should be taken.
I store many of my files on our shared space. How will that be scanned?
Your unit IT support staff will work with the SIR team to determine a strategy for scanning your files that are stored on a shared file server.
Can I scan AFS storage?
At this time it is not possible to scan AFS storage using Identity Finder.
Will a scan slow my computer?
The first Identity Finder scan may take some time, depending on the size of the disk and the power of the computer. We recommend starting the initial scan prior to leaving work for the day. Subsequent scans are generally fast and do not materially affect system performance.
How long does a scan take to complete?
The length of time to complete a scan depends on the amount of data being searched and your computer’s performance.
How often should I run a scan?
You can run Identity Finder at any point when you think you may have collected new sensitive information. At this time, there is no mandate to conduct a scan at regular intervals. However, individuals who regularly work with sensitive information are encouraged to scan as often as necessary to ensure that no sensitive information is being stored on their laptop or desktop.
I do University work on three computers, two owned by the University and one that is personally owned. Should I scan all three?
Yes, scan all three to ensure University-owned sensitive information is not stored on the devices. Work with your local IT support staff to ensure the appropriate Identity Finder software is installed on all computers on which you conduct University business.
Can I run Identity Finder if I am at my home or from other off-campus locations?
Yes, but you must be connected through the campus VPN.
How do I reset my Identity Finder profile password?
Identity Finder provides the ability to save settings, configuration information, and sensitive data across sessions through the use of a profile password. It is not possible to recover a lost password; however, it is possible to delete a profile and create a new one. When the profile password is created, that password is used to encrypt the profile. The profile password is not stored anywhere and therefore if it is lost or forgotten, all of the information in the profile will be lost.
The following data will be lost in Identity Finder when deleting a profile:
- Custom Folders, Remote Computers and authentication credentials
- Only Find Identities
- Document Overview
- Ignore list entries
- Password Vault entries
- Database connection information
- Websites list
Why is my virus scanner creating alerts during Identity Finder searches?
During the course of an Identity Finder search, anti-virus applications may create an alert for files created in a subfolder of IDFTmpDir located in the user profile folder. This is not a problem with Identity Finder, but rather indicates that the user’s system already contains one or more infected files.
The files in IDFTmpDir are created during a search, specifically and most commonly when extracting files from archives (e.g., .zip files) or when detaching them from email messages. To search these files, Identity Finder places them in a temporary folder and then attempts to open them for read access. If the file has a virus, the act of extracting or detaching the file to the temporary folder and/or the attempt to read the file may trigger the anti-virus application (depending on its configuration). If Identity Finder is configured to log Locations Searched, you may be able to determine the specific archives or messages that contain the infected file(s); however, in these instances, it is recommended that you perform a full anti-virus scan of the user’s system ensuring a search within archive files and e-mail attachments.
For additional details on the location of the user profile folder for each operating system, please refer to the Windows or Macintosh configuration guide.
How do I generate a “Gather Data” file?
A ‘Gather Data’ file may be requested by UNC-CH staff supporting Identity Finder to pass along to IDF for analysis.
The Gather Data functionality of the IDF client collects configuration information, logs from searches executed, and other important information to be compiled into a zip file that can be sent to IDF for troubleshooting a problem that you’re having with the IDF client. Instructions on how to generate a Gather Data file are available at help.unc.edu.
Are there any records retention policies that I can refer to?
Individuals with non-master copies of sensitive information should securely delete that information, recognizing a new copy can be obtained in the event it is needed, unless there is a strong need to retain for business purposes, in which case it needs to be secured.
- Please consult the appropriate Data Steward to understand how long various records should be retained.
- Data owners/stewards should dispose of any sensitive information that is no longer needed.
- The Identity Finder Project at UNC-Chapel Hill
- Instructions for Working with Identity Finder for Windows
- Instructions for Working with Identity Finder for Mac
- “Project SIR: Remediating Your Results” How-to Video
- Identity Finder Website