Strings in Elixir are written between double quotes and are encoded in UTF-8. In C and C++, the default strings are sequences of single bytes, so each character can take only 256 different values (plain ASCII defines just 128 of them). UTF-8, by contrast, can encode all 1,112,064 valid Unicode code points. Because Elixir strings use UTF-8, they can also contain characters such as ö and ł.
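As a small illustration of code points, the sketch below uses the ? prefix, which returns the integer code point of the character that follows it. The characters chosen are just examples.

```elixir
# ? in front of a character returns its Unicode code point as an integer.
IO.puts(?a)   # 97
IO.puts(?ö)   # 246
IO.puts(?ł)   # 322

# String.codepoints/1 splits a string into its individual code points.
IO.inspect(String.codepoints("öł"))
```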
To create a string variable, simply assign a string to a variable −
str = "Hello world"
To print this to your console, simply call the IO.puts function and pass it the variable str −
str = "Hello world"
IO.puts(str)
The above program generates the following result −
Hello world
You can create an empty string using the string literal, "". For example,
a = ""
if String.length(a) === 0 do
  IO.puts("a is an empty string")
end
The above program generates the following result.
a is an empty string
String interpolation is a way to construct a new string from a mix of constants, variables, literals, and expressions by including their values inside a string literal. Elixir supports string interpolation: to use a variable or expression inside a string, wrap it in curly braces and put a '#' sign in front of the opening brace.
For example,
x = "Apocalypse"
y = "X-men #{x}"
IO.puts(y)
This will take the value of x and substitute it in y. The above code will generate the following result −
X-men Apocalypse
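Interpolation is not limited to plain variables. The following sketch, with made-up variable names, shows that any expression can appear inside #{} and that non-string values such as integers are converted to strings automatically.

```elixir
name = "Elixir"
count = 3

# Function calls and arithmetic are evaluated first, then converted to strings.
message = "#{name} has #{String.length(name)} letters and #{count + 1} is #{count} plus one"
IO.puts(message)
# prints: Elixir has 6 letters and 4 is 3 plus one
```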
We have already seen the use of String concatenation in previous chapters. The '<>' operator is used to concatenate strings in Elixir. To concatenate 2 strings,
x = "Dark"
y = "Knight"
z = x <> " " <> y
IO.puts(z)
The above code generates the following result −
Dark Knight
To get the length of a string, we use the String.length function. Pass the string as an argument and it returns the number of characters (Unicode graphemes) in it. For example,
IO.puts(String.length("Hello"))
The above program generates the following result −
5
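Note that the length of a string is not necessarily its size in bytes. As a minimal sketch (the sample word is arbitrary), String.length counts graphemes, while the byte_size function counts the bytes of the UTF-8 encoding:

```elixir
word = "hełło"

graphemes = String.length(word)  # ł counts as one character
bytes = byte_size(word)          # but each ł takes 2 bytes in UTF-8

IO.puts("#{graphemes} graphemes, #{bytes} bytes")
# prints: 5 graphemes, 7 bytes
```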
To reverse a string, pass it to the String.reverse function. For example,
IO.puts(String.reverse("Elixir"))
The above program generates the following result −
rixilE
To compare 2 strings, we can use the == or the === operators. For example,
var_1 = "Hello world"
var_2 = "Hello Elixir"
if var_1 === var_2 do
  IO.puts("#{var_1} and #{var_2} are the same")
else
  IO.puts("#{var_1} and #{var_2} are not the same")
end
The above program generates the following result −
Hello world and Hello Elixir are not the same
We have already seen the use of the =~ string match operator. To check whether a string matches a regex, we can use either this operator or the String.match? function. For example,
IO.puts(String.match?("foo", ~r/foo/))
IO.puts(String.match?("bar", ~r/foo/))
The above program generates the following result −
true
false
The same can also be achieved by using the =~ operator. For example,
IO.puts("foo" =~ ~r/foo/)
The above program generates the following result −
true
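As a further sketch of the =~ operator, the right-hand side can be either a regex or a plain string. With a plain string, the operator checks whether the left-hand string contains it. The variable names here are only illustrative.

```elixir
# Regex on the right: true if the pattern matches anywhere in the string.
regex_match = "hello world" =~ ~r/wor/

# Plain string on the right: true if it occurs inside the left string.
substring_match = "hello world" =~ "world"

# An anchored regex that does not match.
no_match = "hello world" =~ ~r/^world/

IO.puts(regex_match)      # true
IO.puts(substring_match)  # true
IO.puts(no_match)         # false
```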
Elixir supports a large number of functions related to strings, some of the most used are listed in the following table.
Sr.No. | Function and its Purpose |
---|---|
1 | at(string, position) Returns the grapheme at the given zero-based position of the UTF-8 string. If position is out of range, it returns nil |
2 | capitalize(string) Converts the first character in the given string to uppercase and the remainder to lowercase |
3 | contains?(string, contents) Checks if string contains any of the given contents |
4 | downcase(string) Converts all characters in the given string to lowercase |
5 | ends_with?(string, suffixes) Returns true if string ends with any of the suffixes given |
6 | first(string) Returns the first grapheme from a UTF-8 string, nil if the string is empty |
7 | last(string) Returns the last grapheme from a UTF-8 string, nil if the string is empty |
8 | replace(subject, pattern, replacement, options \\ []) Returns a new string created by replacing occurrences of pattern in subject with replacement |
9 | slice(string, start, len) Returns a substring starting at the offset start, and of length len |
10 | split(string) Divides a string into substrings at each Unicode whitespace occurrence with leading and trailing whitespace ignored. Groups of whitespace are treated as a single occurrence. Divisions do not occur on non-breaking whitespace |
11 | upcase(string) Converts all characters in the given string to uppercase |
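The following sketch exercises several of the functions from the table, all of which live in the String module. The sample string is arbitrary, and IO.inspect is used so that the quotes and nil values are visible in the output.

```elixir
s = "hello Elixir world"

IO.inspect(String.at(s, 0))                          # "h"
IO.inspect(String.at(s, 100))                        # nil (out of range)
IO.inspect(String.capitalize("eLIXIR"))              # "Elixir"
IO.inspect(String.contains?(s, ["foo", "Elixir"]))   # true
IO.inspect(String.ends_with?(s, "world"))            # true
IO.inspect(String.first(s))                          # "h"
IO.inspect(String.last(s))                           # "d"
IO.inspect(String.replace(s, "l", "L"))              # "heLLo ELixir worLd"
IO.inspect(String.slice(s, 6, 6))                    # "Elixir"
IO.inspect(String.split("  a  b c "))                # ["a", "b", "c"]
IO.inspect(String.upcase(s))                         # "HELLO ELIXIR WORLD"
```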
A binary is just a sequence of bytes. Binaries are defined using << >>. For example:
<< 0, 1, 2, 3 >>
Of course, those bytes can be organized in any way, even in a sequence that does not make them a valid string. For example,
<< 239, 191, 19 >>
Strings are also binaries. And the string concatenation operator <> is actually a Binary concatenation operator:
IO.inspect(<< 0, 1 >> <> << 2, 3 >>)
The above code generates the following result −
<<0, 1, 2, 3>>
Multi-byte characters make the relationship between strings and binaries visible. Because strings are UTF-8 encoded, a character such as ł takes up 2 bytes in the underlying binary (197 and 130).
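One way to look at the bytes behind a string is to append a null byte to it, which makes the result non-printable so that IO.inspect shows the raw byte values. This is only a sketch; the sample word is arbitrary.

```elixir
# "hełło" is a binary; appending <<0>> forces IO.inspect to show its bytes.
bytes = "hełło" <> <<0>>
IO.inspect(bytes)
# prints: <<104, 101, 197, 130, 197, 130, 111, 0>>

# Each ł shows up as the two bytes 197, 130.
IO.inspect(is_binary("hełło"))  # true
```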
Since each number in a binary is meant to represent one byte, a value greater than 255 is truncated to its lowest 8 bits. To prevent this, we use the size modifier to specify how many bits we want that number to take. For example −
IO.inspect(<< 256 >>) # truncated, prints <<0>>
IO.inspect(<< 256 :: size(16) >>) # takes 16 bits (2 bytes), prints <<1, 0>>
The above program will generate the following result −
<<0>>
<<1, 0>>
We can also use the utf8 modifier, which treats the number as a Unicode code point and stores its UTF-8 encoding in the binary. When the result is printed, the corresponding character appears in the output −
IO.puts(<< 256 :: utf8 >>)
The above program generates the following result −
Ā
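To make the effect of the utf8 modifier more concrete, the sketch below encodes the code point 322 (the character ł) and inspects the resulting binary. The variable name is illustrative.

```elixir
# 322 is the code point of ł; the utf8 modifier stores its 2-byte encoding.
encoded = <<322::utf8>>

IO.puts(encoded)                      # ł
IO.inspect(byte_size(encoded))        # 2
IO.inspect(encoded == <<197, 130>>)   # true
IO.inspect(encoded == "ł")            # true
```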
We also have a function called is_binary that checks if a given value is a binary. Note that only values whose size in bits is a multiple of 8 are binaries.
If we define a binary using the size modifier and pass it a value that is not a multiple of 8, we end up with a bitstring instead of a binary. For example,
bs = << 1 :: size(1) >>
IO.inspect(bs)
IO.puts(is_binary(bs))
IO.puts(is_bitstring(bs))
The above program generates the following result −
<<1::size(1)>>
false
true
This means that variable bs is not a binary but rather a bitstring. We can also say that a binary is a bitstring where the number of bits is divisible by 8. Pattern matching works on binaries as well as bitstrings in the same way.
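As a rough sketch of pattern matching on binaries and bitstrings, the example below splits values into a leading part and the remainder. The binary, utf8, size, and bitstring modifiers are part of Elixir's binary syntax; the variable names are made up for illustration.

```elixir
# Match the first byte and keep the remaining bytes as a binary.
<<head, rest::binary>> = "abc"
IO.inspect(head)  # 97
IO.inspect(rest)  # "bc"

# Match the first UTF-8 code point instead of the first byte.
<<cp::utf8, tail::binary>> = "łódź"
IO.inspect(cp)    # 322
IO.inspect(tail)  # "ódź"

# Bitstrings match the same way: take 3 bits, then the remaining bits.
<<a::size(3), b::bitstring>> = <<0b101::size(3), 1::size(1)>>
IO.inspect(a)     # 5
IO.inspect(b)     # <<1::size(1)>>
```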