Complex Numbers
Complex numbers demystified
Complex numbers are often introduced in an unnecessarily mystical and arcane way using the property $\sqrt{-1}=i$. Suddenly multiplication breaks and looks nothing like what we're used to.
This leads many to dismiss complex numbers as too, well, complex to understand. In reality, complex numbers are a beautiful and simple generalization of the real numbers you're already familiar with, and they can be introduced without using square roots at all! In fact, $\sqrt{-1}=i$ is just a neat consequence of complex numbers, and has nothing to do with the way they're defined.
But before we can introduce complex numbers, we have to understand another mathematical object: The field.
Fields
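The classical definition of a field is a set^{[1]} $F$ together with two operations, addition ($+$) and multiplication ($\cdot$), that satisfy the properties listed below.
Formally, this means for all $a, b, c \in F$: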
 Associativity:
$a+(b+c)=(a+b)+c \land a\cdot (b\cdot c)=(a\cdot b) \cdot c$
 Commutativity:
$a+b=b+a \land a\cdot b = b \cdot a$
 Distributivity of multiplication over addition:
$a\cdot (b+c) = (a \cdot b) + (a\cdot c)$
 Additive and multiplicative identities $0,1$, both in $F$, are defined:
$a+0 =a \land a\cdot 1 =a$
 Additive and multiplicative inverses are defined:
$\forall a \in F \space \exists (-a) \in F : a + (-a) = 0$
$\forall a \in F : a\neq 0 \space \exists a^{-1} \in F : a \cdot a^{-1} = 1$
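To make the axioms concrete, here is a minimal Python sketch that spot-checks them on a handful of sample values. It assumes `fractions.Fraction` as an exact stand-in for real numbers (floating-point rounding would otherwise break the equalities), and the helper name `check_field_axioms` is made up for this illustration. Passing on a few samples is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import product

# Exact rational samples stand in for real numbers (an assumption of this sketch).
samples = [Fraction(n, d) for n in (-3, 0, 1, 2) for d in (1, 2, 5)]


def check_field_axioms(values):
    """Return True if every field axiom holds for the given sample values."""
    for a, b, c in product(values, repeat=3):
        # Associativity of addition and multiplication
        if a + (b + c) != (a + b) + c or a * (b * c) != (a * b) * c:
            return False
        # Commutativity of addition and multiplication
        if a + b != b + a or a * b != b * a:
            return False
        # Distributivity of multiplication over addition
        if a * (b + c) != (a * b) + (a * c):
            return False
    for a in values:
        # Additive identity 0 and multiplicative identity 1
        if a + 0 != a or a * 1 != a:
            return False
        # Additive inverse
        if a + (-a) != 0:
            return False
        # Multiplicative inverse, required only for non-zero elements
        if a != 0 and a * (1 / a) != 1:
            return False
    return True


print(check_field_axioms(samples))
```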
Let's look at
Vector spaces
We can generalize the real numbers to a vector space $\mathbb{R}^n$, whose elements are $n$-tuples of real numbers.
Here, addition is defined as vector addition, and multiplication is defined component-wise: each entry of one vector is multiplied by the corresponding entry of the other.
This way, the real numbers become $\mathbb{R}^{1}$ which, per our definitions of vector addition and multiplication, collapses to our usual real number field.
The complex numbers would then just be two-dimensional tuples of real numbers, $\mathbb{R}^{2}$, right?
Well, let's look at how many of the field properties $\mathbb{R}^2$ satisfies under these definitions:
Let $\vec{a} = [a_1, a_2]$, $\vec{b} = [b_1, b_2]$, and $\vec{c} = [c_1, c_2]$ be elements of $\mathbb{R}^2$:

Associativity:
For addition, we get:
$\vec{a}+(\vec{b}+\vec{c})=(\vec{a}+\vec{b})+\vec{c}$
By expanding this using vector addition, we see that:
$\begin{bmatrix} a_1 + (b_1 + c_1) \\ a_2 + (b_2 + c_2) \end{bmatrix} = \begin{bmatrix} (a_1 + b_1) + c_1 \\ (a_2 + b_2) + c_2 \end{bmatrix}$
Which gives us two equations:
$\begin{aligned} & a_1 + (b_1 + c_1) = (a_1 + b_1) + c_1 \\ & a_2 + (b_2 + c_2) = (a_2 + b_2) + c_2 \end{aligned}$
Which we know is true from the field properties of $\mathbb{R}$.
Likewise, for multiplication, we get:
$\vec{a} \cdot (\vec{b} \cdot \vec{c}) = ( \vec{a} \cdot \vec{b}) \cdot \vec{c}$
Again, by expanding this using vector multiplication, we get:
$\begin{bmatrix} a_1 \cdot (b_1 \cdot c_1) \\ a_2 \cdot (b_2 \cdot c_2) \end{bmatrix} = \begin{bmatrix} (a_1 \cdot b_1) \cdot c_1 \\ (a_2 \cdot b_2) \cdot c_2 \end{bmatrix}$
Which we also know to be true from the field properties of $\mathbb{R}$.
We can show the same thing for commutativity and distributivity of multiplication over addition using the exact same method.
Furthermore, we can find additive and multiplicative identities of $\mathbb{R}^2$, namely $[0,0]$ and $[1,1]$.
However, not every non-zero element has a multiplicative inverse. Take $[1,0]$: for any $[x,y]$ we get $[1,0] \cdot [x,y] = [x, 0]$, which can never equal the identity $[1,1]$. Since all the requirements must be met in order to have a field, we now know that $\mathbb{R}^2$ is not a field under our definitions of vector addition and multiplication.
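A quick way to convince yourself is to search for a multiplicative inverse of $[1,0]$ under component-wise multiplication. The sketch below does just that; the function name and the search grid are illustrative assumptions. The real argument is that the second component of the product is always $0$, so it can never match the identity $[1,1]$.

```python
def componentwise_mul(a, b):
    """Multiply two 2-tuples entry by entry."""
    return (a[0] * b[0], a[1] * b[1])


identity = (1, 1)
a = (1, 0)  # a non-zero element of R^2, chosen for illustration

# Try every integer pair in a small grid as a candidate inverse of a.
candidates = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
inverses = [b for b in candidates if componentwise_mul(a, b) == identity]

print(inverses)  # [] : no candidate works

# The reason: the second entry of the product is 0 * y, which is always 0.
print(all(componentwise_mul(a, b)[1] == 0 for b in candidates))  # True
```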
Complex numbers are a solution to this problem.
Instead of using component-wise multiplication as our multiplicative operator, we use a definition that looks somewhat weird at first:
$\vec{a} \cdot \vec{b} = \begin{bmatrix} a_1 b_1 - a_2 b_2 \\ a_1 b_2 + a_2 b_1 \end{bmatrix}$
The addition operator remains the same.
Using these definitions, we recover the field properties in $\mathbb{R}^2$. Hurray!
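If you'd rather see it than take it on faith, here is a small sketch of these two operations on pairs of numbers, with spot checks of a few field properties. The function names `add`, `mul`, and `to_complex` are made up for this illustration, and the comparison against Python's built-in `complex` type is only a cross-check on sample values, not a proof.

```python
def add(a, b):
    """Vector addition on pairs."""
    return (a[0] + b[0], a[1] + b[1])


def mul(a, b):
    """Complex multiplication on pairs: [a1*b1 - a2*b2, a1*b2 + a2*b1]."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def to_complex(p):
    """Convert a pair to Python's built-in complex type for comparison."""
    return complex(p[0], p[1])


a, b, c = (1, 2), (3, -1), (-2, 5)

print(mul(a, b))  # (5, 5)

# Cross-check against the built-in complex type.
print(to_complex(mul(a, b)) == to_complex(a) * to_complex(b))  # True

# Spot checks of the field properties on these sample values.
print(mul(a, mul(b, c)) == mul(mul(a, b), c))          # associativity
print(mul(a, b) == mul(b, a))                          # commutativity
print(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)))  # distributivity
print(mul(a, (1, 0)) == a)                             # identity is [1, 0]
```

Note that the multiplicative identity under this definition is $[1,0]$ rather than $[1,1]$.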
One interesting property is that when the imaginary parts are zero, that is, $a_2 = 0$ in $\vec{a}$ and $b_2 = 0$ in $\vec{b}$, the complex numbers behave just like the real numbers:
For multiplication that is:
$\vec{a} \cdot \vec{b} = \begin{bmatrix} a_1 b_1 - 0 \cdot 0 \\ a_1 \cdot 0 + 0 \cdot b_1 \end{bmatrix}= \begin{bmatrix} a_1 b_1 \\ 0 \end{bmatrix} = a_1 b_1$
And for addition:
$\vec{a} + \vec{b}= \begin{bmatrix} a_1 + b_1 \\ 0 + 0 \end{bmatrix}= a_1 + b_1$
However, if the real part is zero, multiplying two complex numbers can result in a real number. More generally, multiplying by a complex number rotates (and scales) points in the two-dimensional plane.
For example:
$[0,1] \cdot [0,1] = [0\cdot 0 - 1\cdot 1, 0\cdot 1 + 1 \cdot 0] = [-1,0]$
If we define $[0,1]=i$, we see that:
$i\cdot i = i^2 = -1$
Which means:
$\sqrt{-1}=i$
Notice that this is a consequence of defining multiplication in a way that preserves the field properties in $\mathbb{R}^2$, and not something mystical that crazy mathematicians have pulled out of thin air.
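The rotation picture is easy to see in a few lines of code. The sketch below repeatedly multiplies the point $[1,0]$ by $[0,1]$ and records where it lands; the function name `mul` and the variable names are illustrative choices. Each multiplication is a quarter turn, so two of them take $[1,0]$ to $[-1,0]$, and four bring us back to the start.

```python
def mul(a, b):
    """Complex multiplication on pairs: [a1*b1 - a2*b2, a1*b2 + a2*b1]."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


i = (0, 1)

# i times i lands on the real axis at -1.
print(mul(i, i))  # (-1, 0)

# Multiply the point [1, 0] by i four times and record each position.
point = (1, 0)
orbit = [point]
for _ in range(4):
    point = mul(i, point)
    orbit.append(point)

print(orbit)  # [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 0)]
```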
This property also gives us a neat way of writing complex numbers. Since $i = [0,1]$, we can write the second component as a real multiple of $i$:
$a + bi = [a,b]$
This representation also gives us a simple way of finding the multiplicative inverse:
$(a+bi)^{-1} = \frac{1}{a+bi} = \frac{a}{a^2+b^2} - i\frac{b}{a^2+b^2}$
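The formula comes from multiplying the numerator and denominator of $\frac{1}{a+bi}$ by $a-bi$, which turns the denominator into the real number $a^2+b^2$. Here is a minimal sketch that checks the result: it computes the inverse of a pair as $[a/(a^2+b^2), -b/(a^2+b^2)]$ and multiplies it back. The names `mul` and `inverse` are made up for this illustration, and `fractions.Fraction` is assumed so that the arithmetic stays exact for integer inputs.

```python
from fractions import Fraction


def mul(a, b):
    """Complex multiplication on pairs: [a1*b1 - a2*b2, a1*b2 + a2*b1]."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def inverse(z):
    """Multiplicative inverse of a non-zero pair [a, b] with integer entries."""
    a, b = z
    norm_sq = a * a + b * b
    if norm_sq == 0:
        raise ZeroDivisionError("[0, 0] has no multiplicative inverse")
    return (Fraction(a, norm_sq), Fraction(-b, norm_sq))


z = (3, 4)
z_inv = inverse(z)

print(z_inv)                    # (Fraction(3, 25), Fraction(-4, 25))
print(mul(z, z_inv) == (1, 0))  # True: we land on the identity [1, 0]
```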
Complex-valued neural networks
Since the complex numbers are a field, we can use everything we know from linear algebra with complex numbers.
This means that we can create neural networks which work with complex numbers instead of the usual real numbers.
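As a taste of what "linear algebra with complex numbers" looks like in practice, here is a minimal sketch of a single linear layer (a matrix-vector product plus a bias) using Python's built-in `complex` type. The function name `complex_linear` and all the weights, biases, and inputs are invented for illustration. This is not a full network: there is no activation function and no training, and it is not necessarily how any particular library or paper builds its layers.

```python
def complex_linear(weights, bias, inputs):
    """Compute weights @ inputs + bias, where every entry is a complex number."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical 2x2 weight matrix, bias vector, and input vector.
weights = [
    [1 + 2j, 0.5j],
    [-1j, 2 + 0j],
]
bias = [0j, 1 + 1j]
inputs = [1 + 0j, 1j]

outputs = complex_linear(weights, bias, inputs)
print(outputs)
```

The code is the same as it would be for real numbers; only the number type changes, which is exactly what the field properties buy us.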
Recent theoretical research has shown that complex-valued neural networks can encode richer representations and are more robust to noise. However, the potential of complex networks remains largely untapped. ^{[2]} ^{[3]} ^{[4]} ^{[5]}
In an upcoming essay, we will discuss how to create complex-valued neural networks.
Conclusion
We have seen how we can introduce complex numbers as a natural generalization of the real numbers, and how $\sqrt{-1}=i$ is a consequence that arises from defining a multiplication operation which preserves the field properties in $\mathbb{R}^2$.
You can think of a set as analogous to a bag in programming, except that it can be infinitely large. That is, a set is just a collection of objects, such as numbers. A set doesn't need to be ordered.
Complex and Holographic Embeddings of Knowledge Graphs: A Comparison
Orthogonality of decision boundaries in complex-valued neural networks.