Using PowerShell to report on files containing PII (Personally Identifiable Information)

26 August 2016:

This updated script also outputs the files found as a PS object that can be exported to CSV.

For example:

$PiiFiles = Get-PII -FileType 'txt'

This searches files with the txt extension in the current folder and its subfolders, writes the findings to an HTML report, and saves the file list in the $PiiFiles variable. That variable can then be exported to CSV:

$PiiFiles | Export-Csv .\piilist1.csv -NoTypeInformation

Management of PII (Personally Identifiable Information) has always been a source of concern. PII includes information such as social security numbers and credit card numbers. Companies may have policies that regulate how this data is handled, and may require encrypting the files where it is stored. IT may be asked to audit or report on any files containing PII. This script does just that. The script is implemented as a function in the SBTools module, available on the Microsoft Script Center Repository.

The Get-PII function uses the EnhancedHTML2 functions by Don Jones, who graciously agreed to have them included in the SBTools module.

NIST provides examples of PII such as:

  • Name, such as full name, maiden name, mother's maiden name, or alias
  • Personal identification number, such as social security number (SSN), passport number, driver's license number, taxpayer identification number, patient identification number, and financial account or credit card number
  • Address information, such as street address or email address
  • Asset information, such as Internet Protocol (IP) or Media Access Control (MAC) address or other host-specific persistent static identifier that consistently links to a particular person or small, well-defined group of people
  • Telephone numbers, including mobile, business, and personal numbers
  • Personal characteristics, including photographic image (especially of face or other distinguishing characteristic), x-rays, fingerprints, or other biometric image or template data (e.g., retina scan, voice signature, facial geometry)
  • Information identifying personally owned property, such as vehicle registration number or title number and related information
  • Information about an individual that is linked or linkable to one of the above (e.g., date of birth, place of birth, race, religion, weight, activities, geographical indicators, employment information, medical information, education information, financial information).

This script searches and reports only on social security numbers and credit card numbers. It can be modified to detect additional PII patterns. Feel free to post a comment if you’d like to see more patterns added.
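As a rough illustration of how this kind of pattern search can work, here is a minimal sketch built on Select-String. The regular expressions, the Luhn checksum filter, and the helper names Test-LuhnChecksum and Find-PiiPattern are assumptions made for this example; they are not taken from Get-PII, whose patterns may differ.

```powershell
# Illustrative patterns only; the patterns inside Get-PII may differ.
$PiiPattern = [ordered]@{
    SSN        = '\b\d{3}-\d{2}-\d{4}\b'
    CreditCard = '\b(?:\d{4}[- ]?){3}\d{4}\b'
}

# Luhn checksum: rejects digit strings that cannot be valid card numbers.
function Test-LuhnChecksum {
    param([Parameter(Mandatory)][string]$Number)

    $digits = $Number -replace '\D', ''      # drop dashes and spaces
    if ($digits.Length -eq 0) { return $false }

    $sum = 0
    $double = $false
    # Walk from the rightmost digit, doubling every second digit.
    for ($i = $digits.Length - 1; $i -ge 0; $i--) {
        $d = [int]($digits.Substring($i, 1))
        if ($double) {
            $d *= 2
            if ($d -gt 9) { $d -= 9 }
        }
        $sum += $d
        $double = -not $double
    }
    return (($sum % 10) -eq 0)
}

# Hypothetical helper: reports file, line number, and PII type,
# but never echoes the matched value itself.
function Find-PiiPattern {
    param([Parameter(Mandatory)][string[]]$Path)

    foreach ($type in $PiiPattern.Keys) {
        $hits = @(Select-String -Path $Path -Pattern $PiiPattern[$type] -AllMatches)
        foreach ($hit in $hits) {
            $values = @($hit.Matches | ForEach-Object { $_.Value })
            if ($type -eq 'CreditCard') {
                # Keep only candidates that pass the checksum.
                $values = @($values | Where-Object { Test-LuhnChecksum $_ })
            }
            if ($values.Count -gt 0) {
                [pscustomobject]@{
                    File = $hit.Path
                    Line = $hit.LineNumber
                    Type = $type
                    Hits = $values.Count
                }
            }
        }
    }
}
```

For example, Find-PiiPattern -Path .\*.txt lists the matching lines in the text files of the current folder. The checksum step is one possible way to cut down on false positives from arbitrary 16-digit numbers; it only applies to plain-text content.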

To search a set of folders and report on files containing PII, use a command like:

Get-PII "txt","csv","doc?" "D:\Sandbox","\\Server1\Install\Script?"

This searches the folders D:\Sandbox and \\Server1\Install\Script? for files with the extensions txt, csv, and doc?, and compiles an HTML report of any files containing PII.
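The file-gathering step of such a search can be sketched with Get-ChildItem. This is only an approximation under stated assumptions: the helper name Get-CandidateFile and its parameter names are hypothetical, and Get-PII may enumerate files differently.

```powershell
# Hypothetical helper, not part of SBTools: lists files whose extension
# matches any of the given patterns, in the given folders and their subfolders.
function Get-CandidateFile {
    param(
        [string[]]$FileType = 'txt',   # extensions; wildcards such as 'doc?' are allowed
        [string[]]$Folder   = '.'      # folders; wildcards in the path are allowed
    )

    # Turn 'txt' into '*.txt', 'doc?' into '*.doc?', and so on.
    $include = $FileType | ForEach-Object { "*.$_" }

    # -Include is combined with -Recurse so the filter applies at every level.
    Get-ChildItem -Path $Folder -Include $include -File -Recurse -ErrorAction SilentlyContinue
}
```

A call such as Get-CandidateFile -FileType 'txt','csv','doc?' -Folder 'D:\Sandbox' returns FileInfo objects that can be piped to a pattern search. Unreadable folders are skipped silently here, which suits a sketch but may hide access problems in a real audit.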

The command output looks like:


The HTML report looks like:


The sample file looks like:



15 responses

  1. Kurt

    Thanks for this script. I note that it doesn’t search .docx files directly – I had to unzip a document caught by our firewall to verify that it contained data matching the patterns. It ended up being a false positive, but the script proved useful nonetheless.


    July 15, 2015 at 2:03 pm

  2. Pattern search tools like this script can only search files that are not compressed or encrypted. Common compression and encryption techniques would often interfere with the text pattern a script or tool is trying to detect.

    July 16, 2015 at 5:46 pm

  3. Mark Horvat

    Hi there,
    Great script!

    I have one issue though: while it produces an HTML report, how do I convert the HTML report to CSV format, ready for data mining?

    Thanks in advance!


    August 10, 2016 at 6:41 pm

  4. hernan

    Hello, when I run the following command, it shows me nothing. Could you help me? What am I doing wrong? Thanks
    PS D:\Descargas Chrome> ./Get-PII2 "txt","csv","doc?" "C:\Users\myUser\Documents"

    Security warning
    Run only scripts that you trust. While scripts from the internet can be useful, this script can potentially harm
    your computer. If you trust this script, use the Unblock-File cmdlet to allow the script to run without this warning
    message. Do you want to run D:\Descargas Chrome\Get-PII2.ps1?
    [D] Do not run [R] Run once [S] Suspend [?] Help (default is "D"): R
    PS D:\Descargas Chrome>

    April 7, 2017 at 3:47 pm

  5. Import-Module .\Get-PII2.ps1
    Get-PII "txt","csv","doc?" "C:\Users\myUser\Documents"

    April 7, 2017 at 10:56 pm

  6. hernan

    Hello, the PS filename is Get-PII2. Do I have to change the name of the file? If I try Get-PII, it doesn't recognize the command. Sorry, I don't know how to call the script.

    April 9, 2017 at 4:54 pm

    • Import-Module .\Get-PII2.ps1 # loads the script functions
      help Get-PII -ShowWindow # shows how to use the Get-PII function

      April 10, 2017 at 1:15 pm

  7. hernan

    SamB, thank you for your answer. I could run the script successfully, but got no results:

    Command run:
    Get-PII "txt","csv","doc?" "C:\Users\hfi\Documents\testingCreditCard"

    Searching for files with txt extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    Searching for files with csv extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    Searching for files with doc? extension on folder C:\Users\hfi\Documents\testingCreditCard and its subfolders
    No files found with PII in files with extension(s): ‘txt, csv, doc?’ in folder(s) ‘C:\Users\hfi\Documents\testingCreditCard’

    I have multiple files in \testingCreditCard:
    tarjeta1.txt: 4234-1234-1234-1234
    tarjeta2.txt: 4111-1111-1111-1111
    tarjeta.docx: 4556981799413951

    However, no patterns were found.

    Thank you very much

    April 10, 2017 at 3:19 pm

  8. hernan

    Hello Samb, could you read my question? Thank you in advance.

    April 12, 2017 at 9:29 am

  9. Mike

    Looks like this script doesn’t work with docx or xlsx files

    June 7, 2017 at 12:06 pm

  10. DeeDub

    Worked for me. All I had to do was set $fileExtensions = "txt","csv","xls*","ppt*","doc*","pdf" and then set my location $SearchFolders = "\\sever\wherever". Lastly, I ran Get-PII $fileExtensions $SearchFolders.

    June 19, 2017 at 10:33 am

  11. Luke

    This script does not work for xls, ppt, or pdf documents. The Get-Content command only works for files in a text format, such as TXT and CSV files. Try running Get-Content against a docx file. You will find the output is compiled output and not readable text. It would be helpful if there were a command line tool that could read the contents of multiple file types. Since most people are unlikely to save this information in TXT or CSV format, there is a strong need to have this script updated for other file formats.

    January 30, 2018 at 5:38 pm

  12. ExchangeRanger

    First off, this is a great script! Kudos to you! Secondly, as Luke stated, it seems it cannot search .docx and .xlsx files, either because of the newer formats or some kind of permission issue. .txt and .pdf work in the same location. Could you possibly point us in the right direction? Thanks!

    June 13, 2018 at 5:20 pm

  13. ExchangeRanger

    I will note, .doc and .xls do work after saving as those file extensions from .xlsx/.docx.

    June 13, 2018 at 5:25 pm

  14. Peter Asp

    Can SSNs starting with 0000 be excluded? Our student IDs are 9 digits and start with 4 zeros. Thank you, Peter

    August 30, 2018 at 4:32 pm
