Elixir - Strings



Strings in Elixir are written between double quotes and are encoded in UTF-8. In C and C++, a plain char string stores one byte per character, so a single-byte encoding can represent at most 256 different characters (ASCII itself defines only 128). Unicode, by contrast, defines 1,112,064 valid code points, and UTF-8 can encode every one of them. Since strings use UTF-8, we can also use characters such as ö and ł.

Create a String

To create a string variable, simply assign a string to a variable −

str = "Hello world"

To print this to your console, simply call the IO.puts function and pass it the variable str −

str = "Hello world"
IO.puts(str)

The above program generates the following result −

Hello world

Empty Strings

You can create an empty string using the string literal, "". For example,

a = ""
if String.length(a) === 0 do
   IO.puts("a is an empty string")
end

The above program generates the following result.

a is an empty string

String Interpolation

String interpolation is a way to construct a new string value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. Elixir supports string interpolation: to use a variable in a string, wrap it in curly braces and prefix the braces with a '#' sign.

For example,

x = "Apocalypse" 
y = "X-men #{x}"
IO.puts(y)

This will take the value of x and substitute it in y. The above code will generate the following result −

X-men Apocalypse
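
Interpolation is not limited to variables; any expression can be placed inside #{}. The following is a minimal sketch, with variable names chosen only for illustration:

```elixir
name = "Elixir"
count = 3

# Any expression may appear inside #{}; non-string values such as
# integers are converted to text automatically.
summary = "#{name} has #{String.length(name)} letters and #{count + 1} is #{count} plus one"
IO.puts(summary)
# prints: Elixir has 6 letters and 4 is 3 plus one
```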

String Concatenation

We have already seen the use of String concatenation in previous chapters. The '<>' operator is used to concatenate strings in Elixir. To concatenate 2 strings,

x = "Dark"
y = "Knight"
z = x <> " " <> y
IO.puts(z)

The above code generates the following result −

Dark Knight
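
The '<>' operator can also appear on the left side of a match to strip a known prefix, and Enum.join offers an alternative when many pieces must be joined. A short sketch, with illustrative variable names:

```elixir
# <> works in a pattern as long as the prefix is a literal string.
"Dark " <> rest = "Dark Knight"
IO.puts(rest)
# prints: Knight

# Enum.join/2 joins a list of strings with a separator,
# which avoids chaining several <> operators.
title = Enum.join(["The", "Dark", "Knight"], " ")
IO.puts(title)
# prints: The Dark Knight
```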

String Length

To get the length of a string, we use the String.length function. Pass the string as an argument and it returns the number of graphemes (user-perceived characters) in it. For example,

IO.puts(String.length("Hello"))

The above program generates the following result −

5
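
For text outside ASCII, the number of graphemes and the number of bytes can differ. The sketch below compares String.length with byte_size on a sample word chosen for illustration:

```elixir
word = "hełło"

graphemes = String.length(word)  # counts user-perceived characters
bytes = byte_size(word)          # counts bytes in the UTF-8 encoding

IO.puts("#{graphemes} graphemes, #{bytes} bytes")
# prints: 5 graphemes, 7 bytes
```

Each ł occupies two bytes, which accounts for the two extra bytes.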

Reversing a String

To reverse a string, pass it to the String.reverse function. For example,

IO.puts(String.reverse("Elixir"))

The above program generates the following result −

rixilE

String Comparison

To compare 2 strings, we can use the == or the === operators. For example,

var_1 = "Hello world"
var_2 = "Hello Elixir"
if var_1 === var_2 do
   IO.puts("#{var_1} and #{var_2} are the same")
else
   IO.puts("#{var_1} and #{var_2} are not the same")
end

The above program generates the following result −

Hello world and Hello Elixir are not the same
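
String comparison is case-sensitive. One simple way to compare while ignoring case is to normalise both sides first, as in this sketch (the variable names are illustrative, and this approach is only adequate for simple text):

```elixir
a = "Hello"
b = "hello"

# Comparison is case-sensitive, so these two strings differ.
case_sensitive = a == b
IO.puts(case_sensitive)
# prints: false

# Lowercasing both sides first gives a case-insensitive comparison.
same_ignoring_case = String.downcase(a) == String.downcase(b)
IO.puts(same_ignoring_case)
# prints: true

# For strings, == and === always agree; they differ only when
# comparing integers with floats (1 == 1.0 but not 1 === 1.0).
strict = "abc" === "abc"
IO.puts(strict)
# prints: true
```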

String Matching

We have already seen the use of the =~ string match operator. To check if a string matches a regex, we can use either this operator or the String.match? function. For example,

IO.puts(String.match?("foo", ~r/foo/))
IO.puts(String.match?("bar", ~r/foo/))

The above program generates the following result −

true 
false

The same can also be achieved by using the =~ operator. For example,

IO.puts("foo" =~ ~r/foo/)

The above program generates the following result −

true
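
The =~ operator also accepts a plain string on its right-hand side, in which case it checks for a substring. When the matched text itself is needed, Regex.run can be used. A brief sketch with made-up sample strings:

```elixir
# With a plain string on the right, =~ checks for a substring.
contains_foo = "foobar" =~ "foo"
IO.puts(contains_foo)
# prints: true

# Regex.run/2 returns the full match followed by any captures,
# or nil when there is no match.
digits = Regex.run(~r/(\d+)/, "order 123")
IO.inspect(digits)
# prints: ["123", "123"]
```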

String Functions

The String module provides a large number of functions for working with strings. Some of the most used are listed in the following table.

Sr.No. Function and its Purpose

1. at(string, position)
   Returns the grapheme at the given position of the UTF-8 string. If position is out of range, it returns nil.

2. capitalize(string)
   Converts the first character in the given string to uppercase and the remainder to lowercase.

3. contains?(string, contents)
   Checks if string contains any of the given contents.

4. downcase(string)
   Converts all characters in the given string to lowercase.

5. ends_with?(string, suffixes)
   Returns true if string ends with any of the suffixes given.

6. first(string)
   Returns the first grapheme from a UTF-8 string, or nil if the string is empty.

7. last(string)
   Returns the last grapheme from a UTF-8 string, or nil if the string is empty.

8. replace(subject, pattern, replacement, options \\ [])
   Returns a new string created by replacing occurrences of pattern in subject with replacement.

9. slice(string, start, len)
   Returns a substring starting at the offset start, and of length len.

10. split(string)
    Divides a string into substrings at each Unicode whitespace occurrence, with leading and trailing whitespace ignored. Groups of whitespace are treated as a single occurrence. Divisions do not occur on non-breaking whitespace.

11. upcase(string)
    Converts all characters in the given string to uppercase.
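
The functions listed above all live in the String module. The following sketch calls each of them once on small sample inputs chosen for illustration:

```elixir
IO.inspect(String.at("elixir", 0))                       # "e"
IO.inspect(String.at("elixir", 10))                      # nil
IO.inspect(String.capitalize("hELLO"))                   # "Hello"
IO.inspect(String.contains?("elixir", ["ex", "ix"]))     # true
IO.inspect(String.downcase("HeLLo"))                     # "hello"
IO.inspect(String.ends_with?("elixir", ["xx", "ir"]))    # true
IO.inspect(String.first("elixir"))                       # "e"
IO.inspect(String.last("elixir"))                        # "r"
IO.inspect(String.replace("a,b,c", ",", "-"))            # "a-b-c"
IO.inspect(String.slice("elixir", 1, 3))                 # "lix"
IO.inspect(String.split("  foo   bar "))                 # ["foo", "bar"]
IO.inspect(String.upcase("hello"))                       # "HELLO"
```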

Binaries

A binary is just a sequence of bytes. Binaries are defined using << >>. For example:

<< 0, 1, 2, 3 >>

Of course, those bytes can be organized in any way, even in a sequence that does not make them a valid string. For example,

<< 239, 191, 19 >>

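Strings are also binaries, and the string concatenation operator <> is actually a binary concatenation operator. Here we use IO.inspect rather than IO.puts, because IO.puts would write the raw bytes to the console instead of showing them as numbers:

IO.inspect(<< 0, 1 >> <> << 2, 3 >>)

The above code generates the following result −

<<0, 1, 2, 3>>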

Note that a character such as ł takes up 2 bytes when encoded in UTF-8, so the number of bytes in a string can be larger than the number of characters in it.
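
To see the bytes behind a UTF-8 character such as ł, the string can be turned into a list of bytes or inspected as a raw binary. A minimal sketch:

```elixir
# List the bytes that make up the UTF-8 encoding of ł.
bytes = :binary.bin_to_list("ł")
IO.inspect(bytes)
# prints: [197, 130]

# Building the same binary by hand yields the same string.
same = <<197, 130>> == "ł"
IO.puts(same)
# prints: true

# Ask inspect to show the string as raw bytes instead of text.
IO.inspect("ł", binaries: :as_binaries)
# prints: <<197, 130>>
```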

Each number in a binary represents one byte, so a value above 255 is truncated to its lowest 8 bits. To avoid this, we use the size modifier to specify how many bits the number should take. For example −

IO.inspect(<< 256 >>) # truncated, prints <<0>>
IO.inspect(<< 256 :: size(16) >>) # takes 16 bits (2 bytes), prints <<1, 0>>

The above program will generate the following result −

<<0>>
<<1, 0>>
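
Sizes are given in bits, so fields smaller than a byte can be packed together as well. The sketch below shows truncation and packing side by side; the variable names are illustrative:

```elixir
# Without a size, an integer occupies 8 bits, so only the lowest
# 8 bits of 257 survive, leaving the value 1.
truncated = <<257>> == <<1>>
IO.inspect(truncated)
# prints: true

# Two 4-bit fields pack into a single byte: 0x12, which is 18.
packed = <<1::size(4), 2::size(4)>>
IO.inspect(packed)
# prints: <<18>>

# A 16-bit field occupies two bytes.
wide = <<256::size(16)>>
IO.inspect(byte_size(wide))
# prints: 2
```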

We can also use the utf8 modifier. In that case, the number is treated as a Unicode code point and is encoded as the UTF-8 bytes for that character −

IO.puts(<< 256 :: utf8 >>)

The above program generates the following result −

Ā
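
The utf8 modifier also works in the other direction: in a pattern match it decodes one code point from the front of a string. A short sketch, using a sample string chosen for illustration:

```elixir
# ?ł gives the code point of ł as an integer.
code_point = ?ł
IO.puts(code_point)
# prints: 322

# In a match, ::utf8 decodes one code point from the front of a string
# and ::binary captures whatever bytes remain.
<<first::utf8, rest::binary>> = "łabc"
IO.puts(first)
# prints: 322
IO.puts(rest)
# prints: abc
```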

We also have a function called is_binary that checks whether a given value is a binary. Note that only bitstrings whose size is a multiple of 8 bits are binaries.
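
As a quick illustration, is_binary can be applied to a few different kinds of values (the sample values are arbitrary):

```elixir
# A string, a hand-built binary, an atom, and an integer.
values = ["hello", <<1, 2, 3>>, :hello, 42]

# Only the first two are binaries.
results = Enum.map(values, &is_binary/1)
IO.inspect(results)
# prints: [true, true, false, false]
```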

Bitstrings

If we define a binary using the size modifier and the total number of bits is not a multiple of 8, we end up with a bitstring instead of a binary. A bitstring cannot be printed with IO.puts, so we use IO.inspect to display it. For example,

bs = << 1 :: size(1) >>
IO.inspect(bs)
IO.puts(is_binary(bs))
IO.puts(is_bitstring(bs))

The above program generates the following result −

<<1::size(1)>>
false
true

This means that variable bs is not a binary but rather a bitstring. We can also say that a binary is a bitstring where the number of bits is divisible by 8. Pattern matching works on binaries as well as bitstrings in the same way.
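
Pattern matching on binaries and bitstrings can be sketched as follows. The variable names are illustrative; the tail of a match uses ::binary when whole bytes remain and ::bitstring otherwise:

```elixir
# Split a binary into its first two bytes and the remainder.
<<a, b, rest::binary>> = <<1, 2, 3, 4>>
IO.inspect({a, b, rest})
# prints: {1, 2, <<3, 4>>}

# Split one byte into two 4-bit fields.
<<high::size(4), low::size(4)>> = <<0xAB>>
IO.inspect({high, low})
# prints: {10, 11}

# Match on a 3-bit bitstring: one flag bit, then a 2-bit tail.
<<flag::size(1), tail::bitstring>> = <<1::size(1), 0::size(2)>>
IO.inspect({flag, bit_size(tail)})
# prints: {1, 2}
```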
