In Windows PowerShell, the command pipeline is the magic that lets us string together simple, easily understood commands into a complex command with significant power. It makes generic commands, like Get-Item, Sort-Object, and Set-Content, function like true reusable parts, so custom commands never need to re-create that functionality.

To make a pipeline work, the output of one command must be the input to the next command. This article explains how to write parameters that take input from other commands in the pipeline. We'll go step-by-step and explain the details.

Step 1: Simple function

Let's start with a very simple function. This one takes a directory and sums the size of the files inside. It ignores subdirectories and it doesn't verify that the $Directory is a directory and not a file, but I want to keep it simple.

 

function Get-DirectoryFileSize
{
    param
    (
	[Parameter(Mandatory = $true)]
	[string]
	$Directory
    )
	
    if (Test-Path -Path $Directory)
    {
	Get-ChildItem -Path $Directory -File | ForEach-Object {$size += $_.Length}
	[PSCustomObject]@{'Directory' = $Directory; 'SizeInMB' = $size / 1MB}	
    }
    else
    {
	Write-Error "Cannot find directory: $Directory"
    }
}

It works just fine. Nothing fancy.

PS C:\> Get-DirectoryFileSize -Directory $pshome\en-US            
                                                                         
Directory                                                    SizeInMB        
---------                                                    --------        
C:\Windows\System32\WindowsPowerShell\v1.0\en-US     12.1714696884155

But what if I want to pipe a directory path to Get-DirectoryFileSize? Currently, that won't work. The parameter binder doesn't bind the piped path value to the Directory parameter, so PowerShell prompts for the mandatory value instead. Let's fix that.

PS C:\> ."C:\ps-test\Get-DirFileSize.ps1"                          
PS C:\> Get-Item $pshome | Get-DirectoryFileSize                   
                                                                          
cmdlet Get-DirectoryFileSize at command pipeline position 2               
Supply values for the following parameters:                               
Directory: 

Step 2: Value From Pipeline

I want to enable the Directory parameter to take input from the pipeline. It's really very simple. Add the ValueFromPipeline argument to the Parameter attribute. This argument tells the PowerShell parameter binder to associate values that are piped to the function with this parameter.

[Parameter(ValueFromPipeline = $true)]

Here's our revised function with the added ValueFromPipeline argument.

function Get-DirectoryFileSize
{
    param
    (
	[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
	[string]	      
	$Directory
    )
	
    if (Test-Path -Path $Directory)
    {
	Get-ChildItem -Path $Directory -File | ForEach-Object {$size += $_.Length}
		
	[PSCustomObject]@{'Directory' = $Directory; 'SizeInMB' = $size / 1MB}
    }
    else
    {
	Write-Error "Cannot find directory: $Directory"
    }
}

Now, when you pipe a directory path to Get-DirectoryFileSize, it works.

PS C:\> Get-Item $pshome | Get-DirectoryFileSize          
                                                                 
Directory                                              SizeInMB      
---------                                              --------      
C:\Windows\System32\WindowsPowerShell\v1.0     1.73065853118896
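
If you'd rather confirm the binding without piping real paths, you can ask PowerShell what it knows about the parameter. Here's a minimal sketch that reads the ValueFromPipeline setting back from the function's metadata. It repeats the Step 2 param block with an empty body, because only the declaration matters here, and the $parameter and $attribute variable names are mine, not part of the function.

```powershell
# Step 2 declaration only; the body is intentionally empty for this check.
function Get-DirectoryFileSize
{
    param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]
        $Directory
    )
}

# Get the metadata for the Directory parameter.
$parameter = (Get-Command -Name Get-DirectoryFileSize).Parameters['Directory']

# Pick out the Parameter attribute and read its ValueFromPipeline setting.
$attribute = $parameter.Attributes |
    Where-Object { $_ -is [System.Management.Automation.ParameterAttribute] }

$attribute.ValueFromPipeline   # True
```

If the last line returns False, the binder won't associate piped values with the parameter.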

"No way!" you might howl in protest. "I took a class and read a book that said that you must add BEGIN, PROCESS, and END blocks when taking a value from the pipeline."

Not true. BEGIN, PROCESS, and END blocks are required for ValueFromPipeline (and ValueFromPipelineByPropertyName) only when the function must handle a collection of piped objects. When just one object is piped, they're not required.

Let's see why that's true.

Step 3: Take a Collection

To enable the Directory parameter to take more than one path string, add array brackets ( [ ] ) to the String type.

[string[]]

So, the parameter declaration now looks like this.

param
(
    [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
    [string[]]
    $Directory
)

But, when we call the function with two values for the Directory parameter, we get an odd result.

PS C:\> Get-DirectoryFileSize -Directory $pshome, $pshome\en-us                                               
                                                                                                                     
Directory   SizeInMB      
---------   --------      
{C:\Windows\System32\WindowsPowerShell\v1.0, C:\Windows\System32\WindowsPowerShell\v1.0\en-us} 13.9021282196045

The value of the Directory parameter was an array:

@(C:\Windows\System32\WindowsPowerShell\v1.0, C:\Windows\System32\WindowsPowerShell\v1.0\en-us)

So instead of getting the size of the files in each directory, we got the combined size of the files in both directories.

And, we were lucky! The array value works only because the Path parameters of Test-Path and Get-ChildItem take an array value. Otherwise, it would have generated an error.
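
You can check this for yourself. Here's a small sketch that reads the declared type of each Path parameter from the command metadata. A type that ends in [] accepts an array. The $pathTypes variable and the property names on the output object are just for illustration.

```powershell
# Look up the declared type of the -Path parameter on each cmdlet.
$pathTypes = foreach ($name in 'Test-Path', 'Get-ChildItem')
{
    $type = (Get-Command -Name $name).Parameters['Path'].ParameterType
    [PSCustomObject]@{ Command = $name; PathType = $type.FullName }
}

$pathTypes   # PathType is System.String[] for both commands
```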

To process each array value independently, wrap the script logic in a ForEach loop. Inside the loop, be sure to replace $Directory with the item value, in this case, $d. I also added a line of code that resets $size to 0 for each item so we don't end up with a cumulative sum.

function Get-DirectoryFileSize
{
    param
    (
	[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
	[string[]]
	$Directory
    )
	
    foreach ($d in $Directory)
    {
        # Reset the running total so each directory gets its own sum.
        $size = 0
        if (Test-Path -Path $d)
        {
            Get-ChildItem -Path $d -File | ForEach-Object {$size += $_.Length}
            [PSCustomObject]@{'Directory' = $d; 'SizeInMB' = $size / 1MB}
        }
        else
        {
            Write-Error "Cannot find directory: $d"
        }
    }
}

And, here's the result:

PS C:\> Get-DirectoryFileSize -Directory $PSHome, $PSHome\en-US    
                                                                          
Directory                                                SizeInMB         
---------                                                --------         
C:\Windows\System32\WindowsPowerShell\v1.0       1.73065853118896         
C:\Windows\System32\WindowsPowerShell\v1.0\en-US 12.1714696884155

But, what if we pipe an array of paths to the function?

Step 4: Pipe a collection

If we pipe an array of paths to the function, we get an odd result. The function processes only the last value in the collection.

PS C:\> "$PSHome", "$PSHome\en-US" | Get-DirectoryFileSize           
                                                                            
Directory                                                    SizeInMB           
---------                                                    --------           
C:\Windows\System32\WindowsPowerShell\v1.0\en-US     12.1714696884155           
                                                                            
                                                                            
PS C:\> "$PSHome\en-US", "$PSHome" | Get-DirectoryFileSize           
                                                                            
Directory                                                    SizeInMB                 
---------                                                    --------                 
C:\Windows\System32\WindowsPowerShell\v1.0           1.73065853118896

When you set a breakpoint on the first line of code in the function and debug it, you can see that Directory has only one value, the last value in the collection.

Here's where the BEGIN, PROCESS, and END blocks are required.

When a function has BEGIN, PROCESS, and END blocks:

  • The BEGIN block runs once, before the first item in the collection.
  • The PROCESS block runs once for each item in the collection.
  • The END block runs once, after every item in the collection has been processed.

When a function doesn't have BEGIN, PROCESS, and END blocks, the entire function body is considered to be an END block and it runs after the last item in the collection. That's why the value of the parameter is the last item in the collection.
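
The run order is easy to see with a pair of throwaway functions. This is only a sketch: Show-BlockOrder and Show-NoBlocks are hypothetical names that I made up for the demo, and the input strings stand in for real paths.

```powershell
# Hypothetical demo: each block emits a string so the run order is visible.
function Show-BlockOrder
{
    param
    (
        [Parameter(ValueFromPipeline = $true)]
        [string]
        $Item
    )

    BEGIN   { "BEGIN" }
    PROCESS { "PROCESS: $Item" }
    END     { "END" }
}

# Hypothetical demo with no named blocks: the whole body acts as an END block.
function Show-NoBlocks
{
    param
    (
        [Parameter(ValueFromPipeline = $true)]
        [string]
        $Item
    )

    "BODY: $Item"
}

$withBlocks = 'a', 'b', 'c' | Show-BlockOrder
$noBlocks   = 'a', 'b', 'c' | Show-NoBlocks

$withBlocks   # BEGIN, PROCESS: a, PROCESS: b, PROCESS: c, END
$noBlocks     # BODY: c
```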

So, to manage an array that's piped to the function, we need to add BEGIN, PROCESS, and END blocks.

function Get-DirectoryFileSize
{
    param
    (
	[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
	[string[]]
	$Directory
    )
	
    BEGIN {}
    PROCESS {
        foreach ($d in $Directory)
        {
            # Reset the running total so each directory gets its own sum.
            $size = 0
            if (Test-Path -Path $d)
            {
                Get-ChildItem -Path $d -File | ForEach-Object {$size += $_.Length}
                [PSCustomObject]@{'Directory' = $d; 'SizeInMB' = $size / 1MB}
            }
            else
            {
                Write-Error "Cannot find directory: $d"
            }
        }
    }
    END {}
}

 

Now, the function works correctly with piped values.

PS C:\> "$PSHome\en-US", "$PSHome" | Get-DirectoryFileSize 
                                                                  
Directory                                                SizeInMB 
---------                                                -------- 
C:\Windows\System32\WindowsPowerShell\v1.0\en-US 12.1714696884155 
C:\Windows\System32\WindowsPowerShell\v1.0       1.73065853118896

And, with values specified for the parameter.

PS C:\> Get-DirectoryFileSize -Directory "$PSHome\en-US", "$PSHome"    
                                                                              
Directory                                                SizeInMB             
---------                                                --------             
C:\Windows\System32\WindowsPowerShell\v1.0\en-US 12.1714696884155             
C:\Windows\System32\WindowsPowerShell\v1.0       1.73065853118896

By setting a breakpoint in the BEGIN block, you can also see that while BEGIN is processing, the value of the Directory parameter is $null. That's because it runs before any piped values are processed.


And, while the END block is processing, the value of Directory is the last item in the collection.
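
If you don't have a debugger handy, you can see the same thing by having each block report the value it sees. This is a sketch with a hypothetical function name, Show-DirectoryValue, and made-up input strings. It uses the same parameter declaration as our function, but it doesn't touch the file system.

```powershell
# Hypothetical demo: report the value of $Directory inside each block.
function Show-DirectoryValue
{
    param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string[]]
        $Directory
    )

    BEGIN   { "BEGIN:   is null = $($null -eq $Directory)" }
    PROCESS { "PROCESS: $Directory" }
    END     { "END:     $Directory" }
}

$values = 'first', 'second' | Show-DirectoryValue
$values
```

The BEGIN line reports that the value is null, each PROCESS line shows one piped item, and the END line shows the last item again.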


Epilogue: Do we still need ForEach?

We're essentially done with this function. It works just fine. But the curious among us (okay, me) wonder about this. If the PROCESS block runs once for each object in the pipeline, do we still need the ForEach loop inside the PROCESS block?

To test, delete or comment out the ForEach loop. Remember to change each $d back to $Directory.

function Get-DirectoryFileSize
{
    param
    (
	[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
	[string[]]
	$Directory
    )
	
    BEGIN {}
    PROCESS {
	$size = 0
	if (Test-Path -Path $Directory)
	{
	    Get-ChildItem -Path $Directory -File | ForEach-Object {$size += $_.Length}
            [PSCustomObject]@{'Directory' = $Directory; 'SizeInMB' = $size / 1MB}
	}
	else
	{
	    Write-Error "Cannot find directory: $Directory"
	}
    }	
    END {}
}

The pipeline version of the command still works.

PS C:\> "$PSHome\en-US", "$PSHome" | Get-DirectoryFileSize

Directory                                                  SizeInMB
---------                                                  --------
{C:\Windows\System32\WindowsPowerShell\v1.0\en-US} 12.1714696884155
{C:\Windows\System32\WindowsPowerShell\v1.0}       1.73065853118896

But the version with parameter values gets the entire array as one object, so it returns one combined total.

PS C:\> Get-DirectoryFileSize -Directory "$PSHome\en-US", "$PSHome"
Directory SizeInMB
--------- --------
{C:\Windows\System32\WindowsPowerShell\v1.0\en-US, C:\Windows\System32\WindowsPowerShell\v1.0} 13.9021282196045

So, you need both.

  • ForEach loop: To manage an array of parameter values
  • PROCESS block: To manage an array of piped values
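
Here's the same pattern boiled down to a sketch that you can paste into any session. Get-NameLength is a hypothetical function that I made up for this example, and it works on plain strings instead of directories. I left out the empty BEGIN and END blocks, because a function can declare a PROCESS block by itself.

```powershell
# Hypothetical demo: PROCESS handles piped values, foreach handles array arguments.
function Get-NameLength
{
    param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string[]]
        $Name
    )

    PROCESS
    {
        foreach ($n in $Name)
        {
            [PSCustomObject]@{ Name = $n; Characters = $n.Length }
        }
    }
}

# Both call styles return one object per string.
$fromPipeline  = 'alpha', 'be' | Get-NameLength
$fromParameter = Get-NameLength -Name 'alpha', 'be'

$fromPipeline
$fromParameter
```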

And, here's the version of this function that I actually use.

function Get-DirectoryFileSize
{
	param
	(
		[Parameter(Mandatory, ValueFromPipeline)]
		[ValidateScript({ $_ | ForEach-Object {(Get-Item $_).PSIsContainer}})]
		[string[]]
		$Directory,
		
		[switch]
		$Recurse
	)
	
	BEGIN {}
	PROCESS
	{
		foreach ($folder in $Directory)
		{
			$size = 0
			if ($files = Get-ChildItem $folder -Recurse:$Recurse -File)
			{
				$files | ForEach-Object {
					$size += $_.Length
				}
				
				[PSCustomObject]@{
					'Directory' = $folder; 'SizeInMB' = $size / 1MB
				}
			}			
		}
	}
	END {}
}

 

June Blender is a technology evangelist at SAPIEN Technologies, Inc. and a Microsoft Cloud and Datacenter MVP. You can follow her on Twitter at @juneb_get_help.

Copyright © 2024 SAPIEN Technologies, Inc.