Using PowerShell to report on files containing PII (Personally Identifiable Information)


26 August 2016:

This updated script also outputs the files found as a PS object that can be exported to CSV.

For example:

$PiiFiles = Get-PII -FileType 'txt'

This searches files with the txt extension in the current folder and its subfolders, writes the findings to an HTML report, and saves the file list in the $PiiFiles variable. That list can then be exported to CSV:

$PiiFiles | Export-Csv .\piilist1.csv -NoTypeInformation

Management of PII (Personally Identifiable Information) has always been a source of concern. PII includes information such as social security numbers and credit card numbers. Companies may have policies that regulate how PII is handled, and may require encrypting the files where it is stored. IT may be asked to audit or report on any files containing PII. This script does just that. It is implemented as a function in the SBTools module, available on the Microsoft Script Center Repository.

The Get-PII function uses the EnhancedHTML2 functions by Don Jones, who graciously agreed to have them included in the SBTools module.

NIST provides examples of PII such as:

  • Name, such as full name, maiden name, mother's maiden name, or alias
  • Personal identification number, such as social security number (SSN), passport number, driver's license number, taxpayer identification number, patient identification number, and financial account or credit card number
  • Address information, such as street address or email address
  • Asset information, such as Internet Protocol (IP) or Media Access Control (MAC) address or other host-specific persistent static identifier that consistently links to a particular person or small, well defined group of people
  • Telephone numbers, including mobile, business, and personal numbers
  • Personal characteristics, including photographic image (especially of face or other distinguishing characteristic), x-rays, fingerprints, or other biometric image or template data (e.g., retina scan, voice signature, facial geometry)
  • Information identifying personally owned property, such as vehicle registration number or title number and related information
  • Information about an individual that is linked or linkable to one of the above (e.g., date of birth, place of birth, race, religion, weight, activities, geographical indicators, employment information, medical information, education information, financial information).

This script searches for and reports on social security numbers and credit card numbers only. It can be modified to detect additional PII patterns. Feel free to post a comment if you’d like to see more patterns added.
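As a rough illustration of what adding a pattern could involve, the sketch below keeps regular expressions in a table and reports every match found in a string. The $PiiPatterns table, the Find-PiiInText function, and the regexes themselves are assumptions made for this example. They are not taken from Get-PII and are deliberately simple, so expect false positives.

```powershell
# Hypothetical pattern table: illustrative regexes, NOT the ones Get-PII uses.
$PiiPatterns = [ordered]@{
    SSN        = '\b\d{3}-\d{2}-\d{4}\b'                  # e.g. 123-45-6789
    CreditCard = '\b(?:\d{4}[- ]?){3}\d{4}\b'             # 16 digits, optional dash/space separators
    Email      = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'   # example of an added pattern
}

# Hypothetical helper: returns one object per match, tagged with the pattern name.
function Find-PiiInText {
    param(
        [Parameter(Mandatory)][AllowEmptyString()][string]$Text,
        [System.Collections.IDictionary]$Patterns = $PiiPatterns
    )
    foreach ($name in $Patterns.Keys) {
        foreach ($match in [regex]::Matches($Text, $Patterns[$name])) {
            [pscustomobject]@{ Type = $name; Value = $match.Value }
        }
    }
}

# Usage
Find-PiiInText -Text 'SSN 123-45-6789 and card 4111-1111-1111-1111'
```

Adding another kind of PII in this sketch means adding one more entry to the table. A real scanner would likely also validate candidates (for example, a checksum test on card numbers) to cut down on false positives.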

To search on a set of folders and report on files containing PII use a command like:

Get-PII "txt","csv","doc?" "D:\Sandbox","\\Server1\Install\Script?"

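This searches the folders D:\Sandbox and \\Server1\Install\Script? (and their subfolders) for files with the extensions txt, csv, and doc?, and compiles an HTML report of any files that contain PII. The ? is a single-character wildcard, so doc? matches extensions such as docx and docm.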
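The folder-and-extension walk described above could be sketched roughly as follows. Find-PiiFile, its parameters, and its output properties are hypothetical names for this example, and the default patterns are simplified. The real Get-PII function may work differently, and it also produces an HTML report, which this sketch does not.

```powershell
# Hypothetical sketch of a recursive PII file scan; not the actual Get-PII implementation.
function Find-PiiFile {
    param(
        [string[]]$FileType = 'txt',
        [string[]]$Path     = '.',
        # Simplified example patterns: SSN, then 16-digit card number
        [string[]]$Pattern  = @('\b\d{3}-\d{2}-\d{4}\b', '\b(?:\d{4}[- ]?){3}\d{4}\b')
    )
    foreach ($folder in $Path) {
        foreach ($ext in $FileType) {
            # -Filter accepts wildcards, so an extension such as 'doc?' works here too
            Get-ChildItem -Path $folder -Filter "*.$ext" -File -Recurse -ErrorAction SilentlyContinue |
                ForEach-Object {
                    # Select-String emits one result per line that matches any pattern
                    $hits = @(Select-String -LiteralPath $_.FullName -Pattern $Pattern -ErrorAction SilentlyContinue)
                    if ($hits.Count -gt 0) {
                        [pscustomobject]@{
                            File          = $_.FullName
                            MatchingLines = $hits.Count
                        }
                    }
                }
        }
    }
}

# Usage: scan the current folder tree for txt and csv files
Find-PiiFile -FileType 'txt','csv' -Path '.'
```

Because the function emits plain objects, its output can be piped to Export-Csv in the same way as the $PiiFiles example earlier in the post.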

The command output looks like:

[Image: PII1]

The HTML report looks like:

[Image: PII2]

The sample file looks like:

[Image: PII3]

9 responses

  1. Kurt

    Thanks for this script. I note that it doesn’t search .docx files directly – I had to unzip a document caught by our firewall to verify that it contained data matching the patterns. It ended up being a false positive, but the script proved useful nonetheless.

    Kurt

    July 15, 2015 at 2:03 pm

  2. Pattern search tools like this script can only search files that are not compressed or encrypted. Common compression and encryption techniques would often interfere with the text pattern a script or tool is trying to detect.

    July 16, 2015 at 5:46 pm
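For the .docx case raised above: a .docx file is a ZIP archive, which is why a plain text search over the raw file finds nothing. One possible workaround, sketched below, is to read the main document part out of the archive and search the extracted text instead. Get-DocxText is a hypothetical helper written for this example and is not part of Get-PII or SBTools. It only reads word/document.xml, so headers, footers, and comments are ignored, and it does not help with encrypted files.

```powershell
# Hypothetical helper: extract visible text from a .docx so it can be pattern-searched.
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Get-DocxText {
    param([Parameter(Mandatory)][string]$Path)

    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $zip = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
    try {
        # The main body of a Word document lives in this archive entry
        $entry = $zip.GetEntry('word/document.xml')
        if (-not $entry) { return '' }
        $reader = New-Object System.IO.StreamReader($entry.Open())
        try { $xml = $reader.ReadToEnd() } finally { $reader.Dispose() }
    }
    finally { $zip.Dispose() }

    # End of paragraph becomes a newline; all other tags are dropped so that
    # text split across several runs is joined back together
    $text = ($xml -replace '</w:p>', "`n") -replace '<[^>]+>', ''
    [System.Net.WebUtility]::HtmlDecode($text)
}

# Usage (assumes a file named tarjeta.docx exists in the current folder):
# Get-DocxText -Path .\tarjeta.docx | Select-String -Pattern '\b\d{16}\b'
```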

  3. Mark Horvat

    Hi there,
    Great script!

    I have one issue though, while it produces a html report, how do I convert the html report to csv format ready for data mining?

    Thanks in advance!

    Mark

    August 10, 2016 at 6:41 pm

  4. hernan

    Hello, when I run the following command, it shows me nothing. Could you help me? What am I doing wrong? Thanks
    PS D:\Descargas Chrome> ./Get-PII2 "txt","csv","doc?" "C:\Users\myUser\Documents"

    Security warning
    Run only scripts that you trust. While scripts from the internet can be useful, this script can potentially harm
    your computer. If you trust this script, use the Unblock-File cmdlet to allow the script to run without this
    warning message. Do you want to run D:\Descargas Chrome\Get-PII2.ps1?
    [N] Do not run [Z] Run once [U] Suspend [?] Help (default is “N”): Z
    PS D:\Descargas Chrome>

    April 7, 2017 at 3:47 pm

  5. Import-Module .\Get-PII2.ps1
    Get-PII "txt","csv","doc?" "C:\Users\myUser\Documents"

    April 7, 2017 at 10:56 pm

  6. hernan

    Hello, the PS filename is Get-PII2. Do I have to change the name of the file? If I try Get-PII, it doesn't recognize the command. Sorry, I don't know how to call the script.

    April 9, 2017 at 4:54 pm

    • Import-Module .\Get-PII2.ps1 # loads the script functions
      help Get-PII -ShowWindow # shows how to use the Get-PII function

      April 10, 2017 at 1:15 pm

  7. hernan

    SamB, thank you for your answer. I could run the script successfully, but got no results:

    command ran:
    Get-PII "txt","csv","doc?" "C:\Users\hfi\Documents\testingCreditCard"

    results:
    Searching for files with txt extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    Searching for files with csv extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    Searching for files with doc? extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    No files found with PII in files with extension(s): ‘txt, csv, doc?’ in folder(s) ‘C:\Users\hfi\Documents\testingCreditCard’

    I have multiple files in \testingCreditCard:
    tarjeta1.txt: 4234-1234-1234-1234
    tarjeta2.txt: 4111-1111-1111-1111
    tarjeta.docx: 4556981799413951

    however, no patterns were found.

    Thank you very much

    April 10, 2017 at 3:19 pm

  8. hernan

    Hello Samb, could you read my question? Thank you in advance.

    April 12, 2017 at 9:29 am
