Get-Duplicates (or Unique items)
- Written by June Blender
- Last Updated: 14 April 2016
- Created: 23 October 2014
I take a lot of free online coding classes, mainly from Coursera and Udacity, and I’ve picked up a lot of programming tricks in other languages that are easily translated to Windows PowerShell.
In a Java class on Udacity, I learned a cool way to find duplicates in any collection. It uses the fact that the keys in a hash table must be unique. The Add method throws an “Item has already been added” error if you try to add a key that’s already in the hash table.
In this example, I try to add “Day” to a hash table that already has a “Day” key. The value is arbitrary.
$hash = @{ Day = "Wednesday"; Weather = "Sunny" }
$hash.Add("Day", "Friday")
ERROR: Exception calling "Add" with "2" argument(s): "Item has already been added.
Key in dictionary: 'Day'  Key being added: 'Day'"
Test.ps1 (15): ERROR: At Line: 15 char: 1
ERROR: + $hash.Add("Day", "Friday")
ERROR: + ~~~~~~~~~~~~~~~~~~~~~~~~~~
ERROR:     + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
ERROR:     + FullyQualifiedErrorId : ArgumentException
To detect a duplicate in a small collection, create a hash table and add the items to the hash table as keys.
$hash = @{ }
"a", "b", "a", "c", "d" | ForEach { $hash.Add($_, 0) }
Use a Try block to add each item as a key with a value of 0. If a MethodInvocationException occurs in the Try block, instead of erroring out and interrupting the script, control passes to the Catch block. I use the Catch block to save the duplicates in an array.
$Items = "a", "b", "a", "c", "d"
$hash = @{ }
$duplicates = @()
foreach ($item in $Items)
{
    try
    {
        $hash.Add($item, 0)
    }
    catch [System.Management.Automation.MethodInvocationException]
    {
        $duplicates += $item
    }
}
You can return the duplicates that you saved and/or the unique items, which are the keys in the hash table.
$hash.Keys
c
a
d
b
For the final version of my little script, I convert the hash table to an ordered dictionary, which preserves the order in which the keys were added. I also allow users to pipe the items to the script by adding the ValueFromPipeline parameter attribute and the Process block that supports it.
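A quick sketch of the difference (standard PowerShell syntax; the sample values are mine):

```powershell
# A plain hash table stores keys in an arbitrary order...
$plain = @{ }
"a", "b", "c" | ForEach-Object { $plain.Add($_, 0) }

# ...but an ordered dictionary returns keys in the order they were added.
$ordered = [ordered]@{ }
"a", "b", "c" | ForEach-Object { $ordered.Add($_, 0) }
$ordered.Keys    # a, b, c -- insertion order preserved
```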
<#
	.SYNOPSIS
		Gets duplicates or unique values in a collection.

	.DESCRIPTION
		The Get-Duplicates.ps1 script takes a collection and returns the
		duplicates (by default) or unique members (use the Unique switch
		parameter).

	.PARAMETER Items
		Enter a collection of items. You can also pipe the items to
		Get-Duplicates.ps1.

	.PARAMETER Unique
		Returns unique items instead of duplicates. By default,
		Get-Duplicates.ps1 returns only duplicates.

	.EXAMPLE
		PS C:\> .\Get-Duplicates.ps1 -Items 1,2,3,2,4
		2

	.EXAMPLE
		PS C:\> 1,2,3,2,4 | .\Get-Duplicates.ps1
		2

	.EXAMPLE
		PS C:\> .\Get-Duplicates.ps1 -Items 1,2,3,2,4 -Unique
		1
		2
		3
		4

	.INPUTS
		System.Object[]

	.OUTPUTS
		System.Object[]

	.NOTES
		===========================================================================
		Created with: 	SAPIEN Technologies, Inc., PowerShell Studio 2014 v4.1.72
		Created on:   	10/15/2014 9:34 AM
		Created by:   	June Blender (juneb)
		===========================================================================
#>
param
(
	[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
	[Object[]] $Items,

	[Parameter(Mandatory = $false)]
	[Switch] $Unique
)

Begin
{
	$hash = [ordered]@{ }
	$duplicates = @()
}
Process
{
	foreach ($item in $Items)
	{
		try
		{
			$hash.Add($item, 0)
		}
		catch [System.Management.Automation.MethodInvocationException]
		{
			$duplicates += $item
		}
	}
}
End
{
	if ($Unique)
	{
		return $hash.Keys
	}
	elseif ($duplicates)
	{
		return $duplicates
	}
}
Remember that qualification about small collections of items? This strategy is a little programming trick that is not optimized for large data sets. For those, stick with Microsoft.PowerShell.Utility\Get-Unique and other optimized methods.
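For comparison, here’s how the same data could be handled with the pipeline cmdlets (a minimal sketch; Group-Object, Sort-Object, and Get-Unique are standard cmdlets, and the sample data is mine):

```powershell
$Items = "a", "b", "a", "c", "d"

# Duplicates: group identical items and keep the groups with more than one member.
$dupes = $Items | Group-Object | Where-Object { $_.Count -gt 1 } | ForEach-Object { $_.Name }

# Unique items: Get-Unique compares adjacent items, so it needs sorted input.
$unique = $Items | Sort-Object | Get-Unique
```

Note that Sort-Object | Get-Unique returns the unique items in sorted order, not in the order in which they first appeared.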
June Blender is a technology evangelist at SAPIEN Technologies, Inc. You can follow her on Twitter at @juneb_get_help.
For licensed customers, use the forum associated with your product in our Product Support Forums for Registered Customers.
For users of trial versions, please post in our Former and Future Customers - Questions forum.