Splitting a string in Bash means slicing it into several parts at a delimiter: a specific symbol, character, or substring that separates the parts. The steps to split a string in Bash are:
- Initialize a string variable with any sentence or text. You can also read text from any file or user input.
- Choose a separator based on which you want to split the string.
- Read the string and split it using the IFS variable, the readarray command, the tr command, etc.
- Store the split words in an array or in separate variables.
- Finally, print the split words using a loop, echo, or printf.
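The steps above can be sketched as follows. This is a minimal illustration, assuming a comma as the separator; the sample text and the names text, delimiter, and colors are made up for the example.

```shell
#!/bin/bash
# Step 1: initialize the string (the sample text is illustrative)
text="red,green,blue"
# Step 2: choose the separator
delimiter=","
# Steps 3 and 4: split with IFS and read, storing the words in an array
IFS="$delimiter" read -ra colors <<< "$text"
# Step 5: print each word on its own line
printf '%s\n' "${colors[@]}"
```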
You can use the following 8 methods to split strings in Bash:
- Using IFS variable:
IFS='Delimiter' read -a <array_name> <<< <string_variable>
- Using “readarray” command:
readarray -d "delimiter" -t <array_name> <<< <string_variable>
- Using parameter expansion:
echo "${STRING_VARIABLE#*DELIMITER}"
echo "${STRING_VARIABLE##*DELIMITER}"
echo "${STRING_VARIABLE%DELIMITER*}"
echo "${STRING_VARIABLE%%DELIMITER*}"
- Using positional parameters:
set -- $STRING_VARIABLE
- Using “tr” command:
array_name=($(echo $STRING_VARIABLE | tr "DELIMITER" "REPLACING_DELIMITER"))
- Using “awk” command:
echo "$STRING_VARIABLE" | awk -F 'DELIMITER' '{print $1,$2,$3,...}'
- Using “sed” command:
SPLIT_STRING=$(echo "$STRING_VARIABLE" | sed 's/DELIMITER/REPLACING_DELIMITER/g')
- Using “cut” command:
echo "$STRING_VARIABLE" | cut -d DELIMITER -f FIELD_NUMBER
This article will walk you through these 8 methods to split strings in Bash.
1. Using IFS Variable
IFS (Internal Field Separator) is a special shell variable that tells Bash where to split a string into fields. Any character, such as a newline (\n) or a dash (-), can serve as the delimiter. By default, IFS holds the whitespace characters space, tab, and newline, but you can assign any other set of characters to it.
Here’s an example:
#!/bin/bash
string="Learn-to-split-string-with-LinuxSimply"
#Set dash as delimiter
IFS='-' read -a array <<< "$string"
#Print the split string using the loop
for word in "${array[@]}"; do
echo "$word"
done
The script sets the dash - as the delimiter. In the snippet IFS='-' read -a array <<< "$string", the read command reads the string variable, separates it into individual words at each dash, and stores those words in the array. Here, the -a option tells the “read” command to store the fields in an array. Because the IFS assignment is written as a prefix to read, it applies to that command only. After splitting, the for loop iterates through each element of the array and prints it.
Note: You can assign IFS with any particular character to split the string according to that character or symbol.
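To illustrate that an IFS assignment written as a prefix to read does not leak into the rest of the script, here is a small sketch. The sample record and the names saved_ifs and ifs_state are assumptions made for the example.

```shell
#!/bin/bash
record="alice:x:1000"      # illustrative sample
saved_ifs=$IFS             # remember the value before the split
# The IFS=':' prefix applies to this read command only
IFS=':' read -ra fields <<< "$record"
echo "Number of fields: ${#fields[@]}"
if [[ "$IFS" == "$saved_ifs" ]]; then
    ifs_state="unchanged"
else
    ifs_state="changed"
fi
echo "IFS after read: $ifs_state"
```

Writing IFS=':' on its own line instead would change it for every later command in the script.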
2. Using “readarray” Command
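The readarray command (also available as mapfile) reads lines from standard input, such as a here-string or a file, and stores them in an array. To split a string, use readarray with the -d option to specify the delimiter and the -t option to strip that delimiter from each element, as in readarray -d "delimiter" -t array_name <<< "$string_variable". The -d option requires Bash 4.4 or later. Note that a here-string (<<<) appends a newline to its input, so the last element keeps a trailing newline unless the string is supplied without one, for example with printf '%s' through process substitution.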
Check the following script to split the string using “readarray” command:
#!/bin/bash
string="Learn-to-split-string-with-LinuxSimply"
#split the string with readarray; printf '%s' avoids the trailing newline that <<< would append
readarray -d "-" -t array < <(printf '%s' "$string")
#print the array
for word in "${array[@]}";
do
printf "%s\n" "$word"
done
The readarray command reads the string into the array. The -d "-" option specifies the dash - as the delimiter, and -t removes that trailing delimiter from each element. The printf "%s\n" "$word" statement prints each element of the array on its own line. When you run the script, the output shows the string split into words at each dash.
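The following sketch compares two ways of feeding a string to readarray -d and shows what ends up in the last element. It assumes Bash 4.4 or later; the sample string and the array names are made up for the example.

```shell
#!/bin/bash
string="a-b-c"   # illustrative sample
# A here-string appends a newline, so the last element keeps it
readarray -d "-" -t with_here_string <<< "$string"
# printf '%s' sends the string without a trailing newline
readarray -d "-" -t with_printf < <(printf '%s' "$string")
# %q prints each value in a shell-quoted form that makes a newline visible
printf 'here-string last element: %q\n' "${with_here_string[2]}"
printf 'printf last element:      %q\n' "${with_printf[2]}"
```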
Note: You can pass any single character to the -d option to split the string according to that character or symbol.
3. Using Parameter Expansion
Parameter expansion is a Bash feature that finds, replaces, or modifies parameter values. To split a string, use the expressions ${string#*delimiter} and ${string##*delimiter} to remove a prefix from the string, and ${string%delimiter*} and ${string%%delimiter*} to remove a suffix.
Here’s an example:
#!/bin/bash
string='Bash@Split@String@Parameter@Expansion'
#Remove the shortest match of substring from the beginning
echo "Splits the string at the shortest match of *@ from starting"
echo "${string#*@}"
echo
#Remove the longest match of substring from the beginning
echo "Splits the string at the longest match of *@ from starting"
echo "${string##*@}"
echo
#Remove the shortest match of substring from the end
echo "Splits the string at the shortest match of @* from the end"
echo "${string%@*}"
echo
#Remove the longest match of substring from the end
echo "Splits the string at the longest match of @* from the end"
echo "${string%%@*}"
Here, ${string#*@} removes the shortest match of *@ (anything followed by an @) from the beginning of the string. It removes “Bash@” and gives “Split@String@Parameter@Expansion”.
${string##*@} removes the longest match of *@ from the beginning of the string. It removes everything up to the last @ and gives “Expansion”.
${string%@*} removes the shortest match of @* from the end of the string. It removes “@Expansion” and gives “Bash@Split@String@Parameter”.
${string%%@*} removes the longest match of @* from the end of the string. It removes everything from the first @ onward and gives “Bash”.
The script splits the string in four different ways according to the parameter expansion expression.
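A common use of these expressions is splitting a string in two at the first occurrence of a delimiter. The sketch below assumes a hypothetical key=value entry; the names entry, key, and value are made up for the example.

```shell
#!/bin/bash
entry="PATH_PREFIX=/usr/local"   # hypothetical key=value pair
key="${entry%%=*}"    # drop the longest suffix starting at the first '='
value="${entry#*=}"   # drop the shortest prefix ending at the first '='
echo "key:   $key"
echo "value: $value"
```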
4. Using Positional Parameter
In Bash, positional parameters are special variables that automatically store the arguments passed to a script or function. They are numbered sequentially (1, 2, 3, and so on) and are accessed with the dollar sign $ followed by the number. To set positional parameters, the “set” command is used.
To split a string using the positional parameter with the “set” command, you can follow the script below:
#!/bin/bash
string="Split string using positional parameter"
# Set positional parameters to the words in the string
set -- $string
# Iterate through positional parameters using a loop
for word in "$@"; do
echo "$word"
done
The set -- $string statement sets the positional parameters ($1, $2, $3, etc.) to each word of the string. The variable is left unquoted on purpose so that Bash splits it at the characters in IFS. The double dash -- prevents ambiguity in case $string begins with a dash -, which the set command might otherwise interpret as an option. Here, each word of the string becomes one positional parameter.
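The same technique works with a delimiter other than whitespace. The sketch below wraps it in a function so that the IFS change stays local; split_by_comma is a hypothetical helper written for this example, not a Bash builtin, and it assumes globbing was enabled before the call.

```shell
#!/bin/bash
# split_by_comma is a hypothetical helper, not a Bash builtin
split_by_comma() {
    local IFS=','   # local keeps the change inside the function
    set -f          # disable globbing so * or ? in the input stay literal
    set -- $1       # unquoted on purpose: word splitting does the work
    set +f          # re-enable globbing
    count=$#        # number of fields found
    second=$2       # the second field
}
split_by_comma "apt,yum,*"
echo "count=$count second=$second"
```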
5. Using “tr” Command
The tr command manipulates text data by translating or removing characters to perform transformation operations. To split a string using “tr”, use the below bash script:
#!/bin/bash
# Input string
string="Ubuntu, RHEL, Fedora, Kali, CentOS"
# Use tr command to replace the delimiter character with a newline
array=($(echo "$string" | tr "," "\n"))
# Print element of the array
for element in "${array[@]}"; do
echo "$element"
done
Here, the echo command pipes the string to the tr command, which replaces every occurrence of the comma , with a newline character \n. This turns the string into multiple lines, and the surrounding array assignment stores each resulting word as an element.
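An array assignment of the form array=($(...)) also splits at spaces, so it breaks fields that contain them. As a sketch of one way around this, the output of tr can be read line by line with readarray -t; the sample string and the name distros are made up for the example.

```shell
#!/bin/bash
string="Red Hat,Arch Linux,Linux Mint"   # illustrative sample
# tr turns each comma into a newline; readarray -t stores one line per element
readarray -t distros < <(printf '%s' "$string" | tr ',' '\n')
for distro in "${distros[@]}"; do
    echo "[$distro]"
done
```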
6. Using “awk” Command
The awk command is a text-processing tool suited to pattern matching, extraction, and data manipulation on files or streams. To split a string, use the “awk” command with the -F option, which sets the field separator. You can check the following example:
#!/bin/bash
string="ubuntu:fedora:rhel:centos"
echo "$string" | awk -F ':' '{print $1,$2,$3}'
Here, the output of the echo command is piped to the awk command. The option -F sets the colon : as the field separator, so awk splits the input string into fields at each colon. The action {print $1,$2,$3} prints the first three fields, separated by spaces because the comma in print inserts awk's output field separator. The script displays ubuntu fedora rhel.
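When the number of fields is not known in advance, awk's built-in NF variable can drive a loop over all of them. The following is a small sketch of that idea; the variable name fields is made up for the example.

```shell
#!/bin/bash
string="ubuntu:fedora:rhel:centos"
# NF is the number of fields awk found on the line
fields=$(echo "$string" | awk -F ':' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
echo "$fields"
```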
7. Using “sed” Command
The sed command is a powerful tool for text stream processing. It performs operations such as search, replace, insert, and delete. To split a string, use its s (substitute) command to replace each delimiter with a newline.
To split a string using the “sed” command in Bash, you can use the following script:
#!/bin/bash
string="Bash,Loop,String,Split"
echo "$string" | sed 's/,/\
/g'
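The script uses the sed command to substitute every comma , with a newline throughout the string, effectively splitting the string into separate lines. The backslash followed by a line break inside the expression is the portable way to write a newline in the replacement.
8. Using “cut” Command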
The cut command extracts specific parts of its input. To split a string in Bash, use the cut command with the option -d to specify the separator and -f to select the field. Check the following script to split a string using the “cut” command:
#!/bin/bash
string="Bash,Loop,String,Split"
echo "$string" | cut -d ',' -f1
echo "$string" | cut -d ',' -f2
echo "$string" | cut -d ',' -f3
echo "$string" | cut -d ',' -f4
Here, -d ',' specifies the comma as the delimiter, and -f1, -f2, -f3, and -f4 select the fields (parts) to extract. The output displays each part on a separate line.
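The -f option also accepts lists and ranges of fields, which avoids calling cut once per field. A brief sketch, with the variable names first_two and from_third made up for the example:

```shell
#!/bin/bash
string="Bash,Loop,String,Split"
first_two=$(echo "$string" | cut -d ',' -f1,2)   # fields 1 and 2
from_third=$(echo "$string" | cut -d ',' -f3-)   # field 3 to the end
echo "$first_two"
echo "$from_third"
```

Note that cut keeps the original delimiter between the fields it prints.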
8 Bash String Split Examples
This section presents 8 Bash split string examples based on various delimiters like space, colon, comma, substring, etc. Working through them is a good way to practice splitting strings in Bash:
1. Split String by Space into Array
You can split a string into small segments and store them in an array using the “readarray” command with the -d option and an array name. Check the script below for an example of splitting a string into an array:
#!/bin/bash
# Input String
string="Split the string into Array"
#Split the string by space; printf '%s' avoids the trailing newline that <<< would append
readarray -d " " -t array < <(printf '%s' "$string")
#print the array
i=0
for element in "${array[@]}"; do
echo "Array[$i]: $element"
i=$((i+1))
done
Here, the readarray command reads the string from standard input, divides it into fields at each space (-d " "), strips the delimiter from each field (-t), and stores the fields in an array named “array”. At last, the array is printed using a for loop.
2. Bash Split String into Variable
You can split a string and store the elements into variables in multiple ways. You can check the approaches from below:
2.1 Using IFS Variable: To split a string and store the split words in multiple variables, check the script:
#!/bin/bash
pkg="apt,yum,pacman"
#Set comma as delimiter for this read command only
IFS=',' read -r Var1 Var2 Var3 <<< "$pkg"
#Print the variables
echo "Value of Var1: $Var1"
echo "Value of Var2: $Var2"
echo "Value of Var3: $Var3"
Here, IFS is set to the comma ,. Then, the read command takes the string pkg, splits it, and stores the parts in three separate variables Var1, Var2, and Var3.
Here’s another example:
IFS=- read v1 v2 v3 v4 <<< Split-string-into-variables
echo "$v1 $v2 $v3 $v4"
Here, the dash-separated string has been split into four separate variables (v1, v2, v3, v4) using the IFS variable and the read command.
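When the string has more fields than there are variable names, read assigns the whole remainder, delimiters included, to the last variable. A short sketch with made-up sample data and names:

```shell
#!/bin/bash
line="2024-05-17-backup-full"   # illustrative sample
# Three names, five fields: the last variable receives the remainder
IFS=- read -r year month rest <<< "$line"
echo "year=$year month=$month rest=$rest"
```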
2.2 Using Cut Command: To split a string into variables, use the cut command as below:
string="Linux-Ubuntu-Red hat"
v1=$(echo "$string" | cut -f1 -d -)
v2=$(echo "$string" | cut -f2 -d -)
v3=$(echo "$string" | cut -f3 -d -)
echo "$v1"
echo "$v2"
echo "$v3"
The string has been cut into three variables (v1, v2, v3) based on the dash - delimiter. The output shows the variables after splitting.
3. Bash Split String by “@”
To split a string by setting the @ symbol in the IFS variable, check the following script:
#!/bin/bash
string="Split@the@string@using@symbol"
#Set @ as delimiter
IFS='@'
read -ra splitstr <<< "$string"
#Print the split string using the loop
for word in "${splitstr[@]}";
do
echo "$word"
done
Upon running the script, you’ll get the string split into words, one per line.
4. Bash Split String by Colon
To split a string using a colon, set the colon : as the field separator in the IFS variable and use the “read” command with the option -ra to split the string as before:
#!/bin/bash
string="Split:the:string:using:symbol"
#Set colon as delimiter
IFS=':'
read -ra array <<< "$string"
To split the string by colon using the “readarray” command, use the following expression after declaring the string variable:
readarray -d : -t array < <(printf '%s' "$string")
5. Bash Split String by Comma
Set field separator as comma (‘,’) using either IFS variable or “readarray” command and split the string as below:
IFS=','
read -ra array <<< "$string"
Alternatively, you can use the following expression:
# Split the string based on the delimiter ','
readarray -d , -t array < <(printf '%s' "$string")
If you want to split a string on several different delimiters, list all of them in IFS. IFS is a plain set of characters, not a pattern, so square brackets are not needed (they would themselves be treated as delimiters):
#!/bin/bash
string="Split:This-String,With_Multiple;Separators"
# Set IFS to all the separator characters
IFS=':,;_-'
# Read the string into an array
read -ra array <<< "$string"
# Print each element of the array
for element in "${array[@]}"; do
echo "$element"
done
Here, you can see that the script has split the string at each of the five delimiters (colon, comma, semicolon, underscore, dash) specified in the IFS variable.
6. Bash Split Multiline String
Multiline strings can be split into substrings using various approaches such as using parameter expansion, readarray command, and IFS variable. Here’s how:
6.1. Using Parameter Expansion: You can use parameter expansion to split a multiline string as in the below script:
#!/bin/bash
string=$'Multiline string is split now.
Each line is stored as a substring.
Now print the substring.
Here’s it.'
substring=("${string//$'\n'/ }")
for word in "${substring[@]}"; do
echo "$word"
done
The expression substring=("${string//$'\n'/ }") uses parameter expansion to replace newline characters $'\n' with spaces, transforming the multiline string into a space-separated string. Because the expansion is quoted, the result is stored as a single element of the array named substring. As you run the script, you’ll see that the newline characters in the multiline string have been replaced with spaces. That’s why the output shows the string as one paragraph.
6.2 Using “readarray” Command: To split the multiline string using “readarray” command, follow the script below:
#!/bin/bash
string=$'Multiline string is split now.
Each line is stored as a substring.
Now print the substring.
Here’s it.'
readarray -t substring <<<"$string"
#Print the substrings
i=1
for word in "${substring[@]}"; do
echo "Substring $i: $word"
((i=i+1))
done
Here, readarray reads one line of the string into each array element, and the -t option removes the trailing newline (‘\n’) from each of them. The script splits the multiline string into separate substrings at each newline character.
6.3 Using IFS Variable: If you want to split the multiline string using the IFS variable, replace the readarray -t substring <<<"$string" line of the above script with the following line:
IFS=$'\n' read -rd '' -a substring <<<"$string"
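6.4 Using IFS Variable and Parameter Expansion: You can also set IFS to a newline and let the unquoted parameter expansion ${string} undergo word splitting inside an array assignment. This changes IFS for the rest of the script and leaves glob characters active, so save and restore IFS if other code depends on it. To do that, use the following lines instead of the previous one:
IFS=$'\n'
substring=(${string})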
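Another widely used pattern for multiline strings is a while loop around read, which processes one line at a time and also works in Bash versions without readarray. This is a sketch under that assumption; the sample string and the name lines are made up for the example.

```shell
#!/bin/bash
string=$'first line\nsecond line\nthird line'   # illustrative sample
lines=()
# IFS= keeps leading/trailing spaces; -r keeps backslashes literal
while IFS= read -r line; do
    lines+=("$line")
done <<< "$string"
echo "Read ${#lines[@]} lines"
```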
7. Bash Split String by a Substring or Multi-character Delimiter
The multi-character delimiter is a sequence of characters that separates different segments of a string or text. To split a string using a multi-character delimiter, follow the script below:
#!/bin/bash
# Read the main string
string1="distroUbuntu distroRHEL distroKali distroFedora "
#Define the multi-character delimiter
delimiter="distro"
#concatenate the delimiter with the string
string=$string1$delimiter
#split the string based on the delimiter
array=()
while [[ $string ]]; do
array+=( "${string%%$delimiter*}" )
string=${string#*$delimiter}
done
#print the elements of the array
for val in "${array[@]}"
do
echo -n "$val"
done
The statement string=$string1$delimiter appends the delimiter to string1 so that the last segment is also terminated by a delimiter. Then a while loop splits the string iteratively. Here, array+=( "${string%%$delimiter*}" ) removes the longest suffix matching $delimiter* from the end of the string and appends what remains, the text before the first delimiter, to the array. Then string=${string#*$delimiter} removes the shortest prefix matching *$delimiter from the beginning of the string, discarding the segment that was just stored. The loop continues until the string variable becomes empty, at which point the splitting is complete. Because the input starts with the delimiter, the first array element is empty.
The output shows the string split at the substring “distro”.
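A shorter alternative, sketched here under the assumption that the string itself contains no newlines, is to replace every occurrence of the multi-character delimiter with a newline and then read the result line by line. The sample string and the names nl, joined, and parts are made up for the example.

```shell
#!/bin/bash
string="Ubuntu::RHEL::Kali"   # illustrative sample
delimiter="::"
nl=$'\n'
# Replace every delimiter with a newline (quoting makes the delimiter literal)
joined=${string//"$delimiter"/$nl}
# Read one element per line
readarray -t parts <<< "$joined"
printf '%s\n' "${parts[@]}"
```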
8. Bash Split String by Newline Characters
You can use parameter expansion with the newline character $'\n' to split a string at a line break as below:
x='some
thing'
y=${x%$'\n'*}
z=${x#*$'\n'}
echo "$y"
echo "$z"
After running the script, y holds the text before the newline (some) and z holds the text after it (thing).
Solved: Handling Empty Fields
A common problem while splitting a string into an array in Bash is dealing with empty fields. If your string has consecutive separators, Bash may consider them as empty fields. Let me illustrate this with an example:
#!/bin/bash
string='Bash::String::Splitting'
IFS=':' read -a array <<< "$string"
i=0
for element in "${array[@]}"; do
echo "array[$i]":"$element"
i=$((i+1))
done
#Output
#array[0]:Bash
#array[1]:
#array[2]:String
#array[3]:
#array[4]:Splitting
Here you can see from the output that the elements of ‘array[1]’ and ‘array[3]’ are empty fields. Bash interprets consecutive colons as an empty field here.
To solve this, tokenize the string by using a loop to iterate over the array elements and print them only if they are not empty. Follow the script below:
#!/bin/bash
string='Bash::Scripting'
IFS=':' read -a array <<< "$string"
# Print non-empty elements of the array
for element in "${array[@]}"; do
if [[ -n "$element" ]]; then
echo "$element"
fi
done
Here, the condition -n "$element" within the if statement checks whether the element is a non-empty string. If an element is not empty, it gets printed.
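If the non-empty fields are needed later in the script rather than just printed, they can be collected into a second array. A minimal sketch, with the array name clean made up for the example:

```shell
#!/bin/bash
string='Bash::String::Splitting'
IFS=':' read -ra array <<< "$string"
clean=()
for element in "${array[@]}"; do
    # Keep only non-empty elements
    if [[ -n "$element" ]]; then
        clean+=("$element")
    fi
done
echo "Before: ${#array[@]} elements, after: ${#clean[@]} elements"
```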
What is the Difference Between Split and Tokenize?
While “splitting” generally means dividing a string into parts based on a specific criterion (like a delimiter), “tokenizing” has a broader meaning and often involves breaking a text into meaningful units (tokens) based on certain rules or patterns. For example:
#!/bin/bash
#bash split string example
sentence="Alice, Bob, and Charlie"
IFS=', ' read -ra split_result <<< "$sentence"
for part in "${split_result[@]}"; do
echo "$part"
done
#Output
#Alice
#Bob
#and
#Charlie
Let’s have an example of tokenizing below:
#!/bin/bash
#tokenizing example
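sentence="Tokenization and splitting are almost the same."
#A regular-expression RS needs an awk that supports it, such as gawk
echo "$sentence" | awk 'BEGIN {RS="[^A-Za-z0-9]+"} {print $0}'
#Output: one token per line
#Tokenization, and, splitting, are, almost, the, same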
Tokenizing involves breaking a string into meaningful words or phrases, whereas splitting a string refers to breaking it based on a specified delimiter.
Best Practices to Split String
When you’re splitting strings in Bash, here are some important conventions to keep in mind:
- Quote Variables: Always quote your variable to prevent word splitting and pathname expansion. This ensures that the string is treated as a single entity, especially when it contains spaces or special characters.
- Save and Restore IFS: If you’re changing the way strings are split (using IFS), save its original value and restore it afterwards. This prevents unexpected issues in other parts of your script.
- Test with Different Inputs: Always test your scripts with different inputs to ensure they handle edge cases correctly.
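The save-and-restore convention can be sketched as follows. The sample string and the names old_ifs and parts are made up for the example, and the sketch assumes globbing was enabled beforehand.

```shell
#!/bin/bash
string="one,two,three"   # illustrative sample
old_ifs=$IFS             # save the current value
IFS=','
set -f                   # keep glob characters literal during splitting
parts=($string)          # unquoted on purpose so word splitting applies
set +f
IFS=$old_ifs             # restore the saved value
echo "${parts[1]}"
```

Prefixing a single command, as in IFS=',' read -ra parts <<< "$string", avoids the need to restore anything and is usually the simpler choice.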
Practice Tasks on Bash Split String
To sharpen your Bash string-splitting skills, try to solve the following problems using the concepts covered above.
- Write a Bash script that takes a comma-separated list of items as an input string and prints each item on a new line.
- Write a Bash script that splits a string by a delimiter and prints the N-th element.
- Write a bash script that takes a URL as input and extracts the domain name.
- Suppose a given file contains IP addresses on each line. Extract those that belong to a specific subnet. For example, extract addresses within the 192.168.1.0/24.
- Create a Bash script that takes a sentence as input and reverses the order of words. The scripts should then print the modified sentence.
Conclusion
In conclusion, splitting a string in Bash is a valuable skill for extracting data from log files and processing user input, and it builds on the various string manipulation methods covered above. Hope this article helped you gain expertise in Bash string splitting.
People Also Ask
Why is it necessary to split a string?
Splitting a string is necessary for various purposes including programming and scripting. Here are some key reasons why it’s necessary:
- Data Extraction
- Configuration Parsing
- Text Processing and Filtering
- Path manipulation
- User Input handling
What happens if the delimiter is not found in the string?
If the delimiter is not found in the string, the splitting operation typically returns the entire string as a single element, as there is no delimiter to separate the string into distinct parts.
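This behavior is easy to check with a string that does not contain the chosen delimiter. The sample string and the array name parts are made up for the example.

```shell
#!/bin/bash
string="no-commas-here"   # illustrative sample
# The comma never occurs, so the whole string becomes one element
IFS=',' read -ra parts <<< "$string"
echo "Elements: ${#parts[@]}"
echo "First:    ${parts[0]}"
```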
To handle leading and trailing whitespace during splitting in Bash, you can follow the script below:
#!/bin/bash
string=" John is 17 years old "
# Set IFS to include space
IFS=" "
# Tokenize using the read command
read -r -a tokens <<< "$string"
# Print each token
for token in "${tokens[@]}"; do
echo "$token"
done
What is the difference between ‘${variable%%pattern}’ and ‘${variable#pattern}’ in string splitting?
The syntax ${variable%%pattern} removes the longest match from the end, whereas ${variable#pattern} removes the shortest match from the beginning of the variable content based on the specified pattern.
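The difference between the shortest and longest matches is easiest to see on a string with more than one delimiter. The sketch below uses a made-up file name with two dots; all variable names are illustrative.

```shell
#!/bin/bash
file="archive.tar.gz"        # illustrative sample
short_suffix="${file%.*}"    # archive.tar
long_suffix="${file%%.*}"    # archive
short_prefix="${file#*.}"    # tar.gz
long_prefix="${file##*.}"    # gz
printf '%s\n' "$short_suffix" "$long_suffix" "$short_prefix" "$long_prefix"
```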
Can I split a string into characters in Bash?
Yes, you can split a string into individual characters using substring expansion inside a loop. Here’s a simple example:
#!/bin/bash
string="Hello"
# Split the string into characters
for ((i=0; i<${#string}; i++)); do
echo "${string:$i:1}"
done
Here, ${#string} gives the length of the string and ${string:$i:1} extracts a substring of length 1, starting from position $i in the string.
How do I split a string in Bash?
To split a string in Bash, use the IFS variable, readarray command, or cut command. For example, to split a string on the dash - delimiter, assign the delimiter to the IFS variable and read the string into an array. In this case, the syntax will be: IFS='-' read -ra array <<< "$string".
How do you split a string in Bash without IFS?
To split a string in Bash without IFS, use the cut command, readarray command, awk command, etc. For example, to split a string on the comma , delimiter without using the IFS variable, use the readarray command with the syntax: readarray -d "," -t array <<< "$string".