The objective is to accumulate enough knowledge about Erlang so that one can forget it and quickly get back to it when needed.
What makes Erlang unique?
- Erlang is unique because it models computation as a network of concurrent processes, mimicking things happening in parallel in the world.
What is a process?
A process is anything for which these assertions hold:
- it has addresses ;
- sending a message to any of its addresses causes it to receive this message ;
- it has a state ;
- given a message and its state, it can perform computations ;
- it can update its state ;
- it can send messages to addresses it knows about (and thus other processes) ;
- it can create other processes.
What is a Core?
- In a computer, a Core is a physical piece of hardware that implements an ISA and executes machine code written for it.
How does Erlang link processes and cores?
- Programming in Erlang involves expressing computations in terms of processes interlinked by message passing.
- Erlang provides a way to define these processes in its language, then maps the process instructions to machine code instructions, distributing them across the cores while maintaining the guarantees of the processes.
How to represent an actor computation?
- As a function $f : \text{Message} \times \text{State} \rightarrow \text{State} \times \text{Message}$.
- $f(\text{message}, \text{state}) = (\text{state}', \text{message}')$ means that the process updates its state from $\text{state}$ to $\text{state}'$ and sends the message $\text{message}'$ as a consequence of being in the state $\text{state}$ when receiving the message $\text{message}$.
How to specify
- By writing a module i.e. a text file in the Erlang language.
- This file is compiled to BEAM bytecode, which the virtual machine then interprets or JIT-compiles to machine code; the actual computations occur on the hardware.
What is the BEAM?
- The BEAM is a virtual machine that executes BEAM bytecode.
- A detailed description of the BEAM is given here.
- The bytecode is described here.
How to build a process?
- Write a module. For instance:
    -module(person).
    -export([init/1]).
    init(Name) -> … .
- The expression Joe = spawn(person, init, ["Joe"]) builds a process whose behaviour is specified by the module person and which is started by calling the person module's public procedure init with the value "Joe".
- Joe is the address (PID in Erlang terms) of the associated process, such that Joe ! {self(), "Hello!"} sends the message {self(), "Hello!"} to Joe, where self() is the PID of the current process sending the message.
- For Joe to reply, the person module should be completed with:
    -module(person).
    -export([init/1]).
    init(Name) -> receive {From, Content} -> … end.
  which allows Joe to match the address sent with From and the content with Content.
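As a minimal sketch of how the elided body might be filled in, the module below makes the process answer every {From, Content} message and then wait for the next one. The shape of the reply, {self(), {Name, heard, Content}}, is an assumption made for illustration; the notes do not specify it.

```erlang
-module(person).
-export([init/1]).

%% Entry point of the process. It waits for {From, Content} messages,
%% answers each one, and loops to wait for the next message.
init(Name) ->
    receive
        {From, Content} ->
            %% Assumed reply format: the replier's PID plus a small tuple.
            From ! {self(), {Name, heard, Content}},
            init(Name)
    end.
```

With this sketch, after Joe = spawn(person, init, ["Joe"]) and Joe ! {self(), "Hello!"}, the sender can pick up the answer with receive {Joe, Reply} -> Reply end.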
What is process concurrency?
Given N processes and M cores, distributing the processes across the cores in order to optimize core utilization is called process concurrency. At a given time t, if processes p1 and p2 are executing on two different cores, then they execute in parallel at t.
What are the benefits of concurrency?
- Performance: if a program is implemented in terms of processes, then: it is easier to map it to many cores than a program written sequentially.
- Scalability: more cores and memory means that processes are mapped automatically to a greater amount of resources.
- Fault tolerance: if a process crashes, it is hard for it to crash the whole network of processes since processes do not share memory.
- Clarity: since things happen in parallel in the real world, it is easier to map things to processes executing concurrently.
Is there a way to map a computation to a network of processes?
No. This process is inherently hard and cannot be automated.
Write a file server.
Specification:
- Let s = self().
- A file server is a process p = file_server:start(ADir).
- p ! {s, list_dir} causes s to receive a list of files in ADir.
- p ! {s, {get_file, "a_file.txt"}} causes s to receive the content of "a_file.txt" in ADir, if any.
Implementation:
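One possible implementation of this specification is sketched below. It assumes that replies are tagged with the server's PID, as {self(), Result}, and that Result is whatever file:list_dir/1 or file:read_file/1 returns ({ok, …} or {error, Reason}); neither choice is fixed by the specification.

```erlang
-module(file_server).
-export([start/1]).

%% Spawn the server process for directory Dir and return its PID.
start(Dir) ->
    spawn(fun() -> loop(Dir) end).

%% Serve one request, then wait for the next one.
loop(Dir) ->
    receive
        {Client, list_dir} ->
            Client ! {self(), file:list_dir(Dir)};
        {Client, {get_file, File}} ->
            Full = filename:join(Dir, File),
            Client ! {self(), file:read_file(Full)}
    end,
    loop(Dir).
```

Tagging the reply with self() lets the client tell the server's answer apart from any other message sitting in its mailbox.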
Write a file server client.
Specification:
- Given a file server fs, a client may use fs sequentially. For instance:
    …
    FS = file_server:start(ADir).
    FileList = file_client:ls(FS)
    …
  without having to be aware of the communication protocol.
Implementation:
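A minimal client sketch could hide the message protocol behind ordinary function calls. It assumes the server answers each request with a {ServerPid, Result} tuple; that reply format and the get_file/2 helper are assumptions, only ls appears in the specification above.

```erlang
-module(file_client).
-export([ls/1, get_file/2]).

%% Ask the server for its directory listing and wait for the answer.
ls(Server) ->
    Server ! {self(), list_dir},
    receive
        {Server, FileList} -> FileList
    end.

%% Ask the server for the content of File and wait for the answer.
get_file(Server, File) ->
    Server ! {self(), {get_file, File}},
    receive
        {Server, Content} -> Content
    end.
```

Both functions block until the server replies; a production version would probably add an after clause so that a dead server does not hang the caller.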
What does X = Y mean in Erlang?
- If X is an unbound variable, then: X = 123 means that X refers to the value 123, after which it cannot be assigned another value.
- More generally, it means: evaluate Y and match the result against X.
How does Joe write programs?
When I’m writing a program, my approach is to “write a bit” and then “test a bit.” I start with a small module with a few functions, and then I compile it and test it with a few commands in the shell. Once I’m happy with it, I write a few more functions, compile them, test them, and so on.
Often I haven’t really decided what sort of data structures I’ll need in my program, and as I run small examples, I can see whether the data structures I have chosen are appropriate.
I tend to “grow” programs rather than think them out completely before writing them. This way I don’t tend to make large mistakes before I discover that things have gone wrong. Above all, it’s fun, I get immediate feedback, and I see whether my ideas work as soon as I have typed in the program.
Once I’ve figured out how to do something in the shell, I usually then go and write a makefile and some code that reproduces what I’ve learned in the shell.
Given an error, why are its consequences different depending on the sequential or concurrent nature of the program?
A concurrent program means many processes, maybe millions of them: an error localized to one process is not that important compared to an error in a sequential program where you have one process. In the concurrent case, an error will have a much harder time taking down the whole program than in the sequential case.
Given a function call, what happens next?
One of these three outcomes:
- The function never returns.
- The function returns.
- An exception is raised.
How to get rid of the first outcome?
The caller may set a timeout after which an exception is raised.
How to deal with exceptions?
The caller must do something in case an exception is raised.
How should functions be called, protecting from errors?
In a sequential program, it may look like so: for a client calling f, it may be conceptually equivalent to: y = call(f, x, 5, error_handler), which means: call f on x. If it does not return within 5 seconds, raise a timeout exception. If an exception err is raised, call error_handler(err). Else: return the value and bind it to y.
In a concurrent program, we just let it crash.
What does Let It Crash mean?
- Fail as soon as possible.
- Fail with a meaningful message.
- The message should be written to a persistent error log.
- Only the programmer should see the error report, not the user.
- You should describe the behavior of functions only for valid input arguments;
- all other arguments will cause internal errors that are automatically detected;
- You should never return values when a function is called with invalid arguments;
- You should always raise an exception.
- Assume that the caller will fix it.
How to explicitly raise exceptions?
exit(Why)
throw(Why)
error(Why)
How to catch an error?
Catch all possible errors using this syntactic construct: try…catch…after…end
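To make the construct concrete, here is a small sketch that raises each of the three exception classes and catches them separately. The module and function names (try_sketch, generate/1, catcher/1) are hypothetical and chosen only for this illustration.

```erlang
-module(try_sketch).
-export([generate/1, catcher/1]).

%% Produce a normal value or one exception of each class.
generate(1) -> ok;
generate(2) -> throw(a);
generate(3) -> exit(b);
generate(4) -> error(c).

%% Report how generate(N) ended.
catcher(N) ->
    try generate(N) of
        Val -> {N, normal, Val}
    catch
        throw:X -> {N, caught, thrown, X};
        exit:X  -> {N, caught, exited, X};
        error:X -> {N, caught, error, X}
    after
        %% Cleanup code would go here; its value is discarded.
        ok
    end.
```

The after section always runs, whether or not an exception was raised, which makes it the natural place to release resources.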
How to avoid messages clogging the mailbox of a process?
The loop should have a clause that matches all messages and acts accordingly, else: unmatched messages stay in the mailbox.
How to wait for a message given a timeout in milliseconds?
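A possible answer uses the after section of receive. The sketch below is illustrative; the module name timeout_sketch and the {ok, Message} / timeout return values are assumptions.

```erlang
-module(timeout_sketch).
-export([wait/1]).

%% Wait at most Millis milliseconds for any message.
wait(Millis) ->
    receive
        Message -> {ok, Message}
    after Millis ->
        timeout
    end.
```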
How to just wait for a given amount of time?
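One way to do this is a receive with no patterns at all, only a timeout, so that nothing can ever match and the mailbox is left untouched. The module name sleep_sketch is hypothetical.

```erlang
-module(sleep_sketch).
-export([sleep/1]).

%% Suspend the calling process for Millis milliseconds.
%% No pattern is given, so no message is consumed.
sleep(Millis) ->
    receive
    after Millis ->
        ok
    end.
```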
Write a function that flushes the mailbox.
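A sketch of such a function, assuming it should simply discard everything currently in the mailbox: a timeout of 0 makes receive return immediately once no message is left. The module name flush_sketch is hypothetical.

```erlang
-module(flush_sketch).
-export([flush/0]).

%% Remove every message currently in the caller's mailbox.
flush() ->
    receive
        _Any -> flush()
    after 0 ->
        ok
    end.
```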
Implement a priority receive.
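A common way to sketch this is to scan the mailbox first for the high-priority pattern with a zero timeout, and only then accept any message. Treating {alarm, X} as the priority message is an assumption made for the example, as is the module name.

```erlang
-module(priority_sketch).
-export([priority_receive/0]).

%% Return an {alarm, X} message if one is anywhere in the mailbox;
%% otherwise block until any message arrives and return it.
priority_receive() ->
    receive
        {alarm, X} -> {alarm, X}
    after 0 ->
        receive
            Any -> Any
        end
    end.
```

Note that the first receive examines the whole mailbox, which can be slow when the mailbox is large.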
Give an interpretation of:
- When we enter a receive statement, we start a timer (but only if an after section is present in the expression).
- Take the first message in the mailbox and try to match it against Pattern1, Pattern2, and so on. If the match succeeds, the message is removed from the mailbox, and the expressions following the pattern are evaluated.
- If none of the patterns in the receive statement matches the first message in the mailbox, then the first message is removed from the mailbox and put into a “save queue.” The second message in the mailbox is then tried. This procedure is repeated until a matching message is found or until all the messages in the mailbox have been examined.
- If none of the messages in the mailbox matches, then the process is suspended and will be rescheduled for execution the next time a new message is put in the mailbox. When a new message arrives, the messages in the save queue are not rematched; only the new message is matched.
- As soon as a message has been matched, then all messages that have been put into the save queue are reentered into the mailbox in the order in which they arrived at the process. If a timer was set, it is cleared.
- If the timer elapses when we are waiting for a message, then evaluate the expressions ExpressionsTimeout and put any saved messages back into the mailbox in the order in which they arrived at the process.
How to publish the PID of a process?
Use the register API.
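As an illustration, the sketch below registers a small echo process under the atom echo, so that other processes can write echo ! Message without knowing the PID. The name echo and the module name register_sketch are hypothetical.

```erlang
-module(register_sketch).
-export([start/0]).

%% Spawn an echo process and publish its PID under the name `echo`.
start() ->
    Pid = spawn(fun loop/0),
    true = register(echo, Pid),
    Pid.

%% Send every {From, Msg} back to its sender, tagged with the name.
loop() ->
    receive
        {From, Msg} ->
            From ! {echo, Msg},
            loop()
    end.
```

whereis(echo) returns the registered PID, and unregister(echo) removes the association; the name is also removed automatically when the process dies.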
A concurrent program template.
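The following is one possible skeleton, not necessarily the template the notes had in mind: a start function, a synchronous rpc helper, and a receive loop carrying a state. The {received, Request} reply is a placeholder to be replaced by real behaviour.

```erlang
-module(ctemplate).
-export([start/0, rpc/2]).

%% Spawn the process with an empty list as its initial state.
start() ->
    spawn(fun() -> loop([]) end).

%% Send Request to Pid and wait for the tagged response.
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

loop(State) ->
    receive
        {From, Request} ->
            %% Placeholder behaviour: acknowledge the request.
            From ! {self(), {received, Request}},
            loop(State);
        _Other ->
            %% Drop anything unexpected so the mailbox does not fill up.
            loop(State)
    end.
```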
What is the problem with sequential programs and errors?
If this process dies, we might be in deep trouble since no other process can help. For this reason, sequential languages have concentrated on the prevention of failure and an emphasis on defensive programming.
How is managing errors different when a large number of concurrent processes exist?
The failure of any individual process is not so important. We usually write only a small amount of defensive code and instead concentrate on writing corrective code. We take measures to detect the errors and then correct them after they have occurred.
What is the idea on which error handling in concurrent Erlang programs is based?
- Remote detection and handling of errors
- Instead of handling an error in the process where the error occurs, we let the process die and correct the error in some other process.
- When we design a fault-tolerant system, we assume that errors will occur, that processes will crash, and that machines will fail.
- Our job is to detect the errors after they have occurred and correct them if possible.
- Let it crash
- Let some other process fix the error
- We cannot make fault-tolerant systems on one machine since the entire machine might crash, so we need at least two machines. One machine performs computations, and the other machines observe the first machine and take over if the first machine crashes.
What do we mean by error checking code and error correcting code?
- We build our applications in two parts: a part that solves the problem and a part that corrects errors if they have occurred.
- The part that solves the problem is written with as little defensive code as possible; we assume that all arguments to functions are correct and the programs will execute without errors.
- The part that corrects errors is often generic, so the same error-correcting code can be used for many different applications. For example, in database transactions if something goes wrong in the middle of a transaction, we simply abort the transaction and let the system restore the database to the state it was in before the error occurred. In an operating system, if a process crashes, we let the operating system close any open files or sockets and restore the system to a stable state.
- A,B : Processes
- A and B are linked
- What does it mean?
- It means that if one of the two dies, then: the other will receive its death message.
- P : Process
- LinkSet(P)
- What does it mean?
- LinkSet(P) : Set of processes linked to P
- A,B : Processes
- A monitors B
- What does it mean?
- If B terminates, then: A receives a down message from B.
- A : Process
- LS ≡ LinkSet(A)
- What kinds of messages are exchanged between A and LinkSet(A)?
- If any process in A U LinkSet(A) terminates or crashes, then: error signals are exchanged, else: messages are exchanged.
What kinds of processes exist?
- Normal processes and system processes.
- System processes can trap exit signals; normal processes cannot.
- A normal process becomes a system process by evaluating process_flag(trap_exit, true)
What happens if a normal process receives an error signal?
If the reason is not normal, it will terminate.
How to kill a process that refuses to die?
Send it a kill signal: exit(Pid, kill)
Build a keep alive process.
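A rough sketch, assuming "keep alive" means: run Fun in a process registered as Name, and start it again whenever it terminates. It uses a watcher process and spawn_monitor/1; the module name is hypothetical.

```erlang
-module(keep_alive_sketch).
-export([keep_alive/2]).

%% Start a watcher that keeps a process running Fun registered as Name.
keep_alive(Name, Fun) ->
    spawn(fun() -> watch(Name, Fun) end).

watch(Name, Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    %% Caveat: if Fun terminates before this line runs, register/2
    %% fails with badarg and the watcher itself crashes.
    true = register(Name, Pid),
    receive
        {'DOWN', Ref, process, Pid, _Why} ->
            watch(Name, Fun)
    end.
```

The sketch restarts the process whatever the exit reason, including normal; a more careful version would inspect the reason and limit the restart rate.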
How to build groups of processes that die together?
- Just link them.
- If one dies, then: error signals are sent to its LinkSet.
- If they are not system processes, then: they die too.
How to prevent death propagation in a group of linked processes?
- If one process in the LinkSet is a system process, then: it can trap exit signals.
- Since it traps exit signals, it does not die and does not propagate death.
How to watch a process PID and react to its exit?
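One way to sketch this is a helper process that monitors Pid and calls a handler with the exit reason. The names on_exit_sketch and on_exit/2 are assumptions for this example.

```erlang
-module(on_exit_sketch).
-export([on_exit/2]).

%% Spawn a watcher that evaluates Fun(Why) when Pid terminates.
on_exit(Pid, Fun) ->
    spawn(fun() ->
                  Ref = monitor(process, Pid),
                  receive
                      {'DOWN', Ref, process, Pid, Why} ->
                          Fun(Why)
                  end
          end).
```

If Pid is already dead when the monitor is set up, the 'DOWN' message arrives immediately with the reason noproc, so the handler still runs.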
What is an Erlang node?
It is a self-contained Erlang system containing a complete virtual machine with its own address space and own set of processes.
On which kind of network may Erlang nodes run?
- Distributed Erlang applications run in a trusted environment
- Since any node can perform any operation on any other Erlang node, a high degree of trust is involved.
- Typically distributed Erlang applications will be run on clusters on the same LAN and behind a firewall, though they can run in an open network.
If security is at risk, on which kind of network can Erlang nodes be distributed?
- Socket-based distribution
- Using TCP/IP sockets, we can write distributed applications that can run in an untrusted environment.
How to transition from a concurrent program to a distributed program?
- By using two new operations:
- Start a new Erlang node.
- Perform a remote procedure call on a remote Erlang node.
What are the systematic steps involved in developing a distributed application?
- Write and test a program in a regular nondistributed Erlang session
- Test the program on two different Erlang nodes running on the same computer.
- Test the program on two different Erlang nodes running on two physically separated computers either in the same local area network or anywhere on the Internet.
Why is interfacing with other programs necessary?
- We might use C for efficiency.
- We might want to integrate a library written in Java.
How to interface foreign language programs to Erlang?
- By running the programs outside the Erlang virtual machine in an external operating system process.
- If the foreign language code is incorrect, it will not crash the Erlang system.
- Erlang controls the external process through a device called a port.
- Erlang communicates with the external process through a byte-oriented communication channel.
- Erlang is responsible for starting and stopping the external program.
- Erlang can monitor and restart it if it crashes.
- running an OS command from within Erlang and capturing the result.
- By running the foreign language code inside the Erlang virtual machine.
- This involves linking the code with the code for the Erlang virtual machine.
- This is the unsafe way of doing things. Errors in the foreign language code might crash the Erlang system.
- Although it is unsafe, it is useful since it is more efficient than using an external process.
- Linking code into the Erlang kernel can be used only for languages like C that produce native object code and can’t be used with languages like Java that have their own virtual machines.
Describe the behaviour of a port.
- The process that created the port is called the connected process.
- Erlang communicates with external programs through objects called ports.
- If we send a message to a port, the message will be sent to the external program connected to the port.
- Messages from the external program will appear as Erlang messages that come from the ports.
- If the external program crashes, then an exit signal will be sent to the connected process.
- If the connected process dies, then the external program will be killed.
- All messages to the port must be tagged with the PID of the connected process.
- All messages from the external program are sent to the connected processes.
What does OTP stand for?
Open Telecom Platform.
What is the central concept of OTP?
- OTP behavior.
What is an OTP behavior?
- It is an application framework that is parameterized by a callback module.
- properties such as fault tolerance, scalability, dynamic-code upgrade, and so on, can be provided by the behavior itself.
- In other words, the writer of the callback does not have to worry about things such as fault tolerance because this is provided by the behavior.
What are a few necessities in building an abstraction?
- An abstraction solves a problem.
- To build an abstraction, we need to separate functional and nonfunctional parts of the problem.
- Building an abstraction is a trial and error iterative process.
Write the most essential parts of a server.
Specification
- if:
  init : State
  handle : Message × State → Message × State
- then:
  s ≡ Server(init, handle) : Server
- A server s is a process.
  - It has a state state(s) : State.
  - Given a message m with a content content(m), it computes handle(content, state) = (m', state') : Message × State
    - If possible, m' is sent back to the client.
    - Its state is updated to state'.
Implementation
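The specification above could be sketched as follows. Here handle is passed as a fun of two arguments returning {Reply, NewState}, and requests arrive as {From, Content} tuples; both conventions, and the names server_sketch, start/2 and call/2, are assumptions made for this illustration.

```erlang
-module(server_sketch).
-export([start/2, call/2]).

%% Spawn a server with initial state Init and behaviour Handle,
%% where Handle(Content, State) returns {Reply, NewState}.
start(Init, Handle) ->
    spawn(fun() -> loop(Handle, Init) end).

%% Send Content to the server and wait for its reply.
call(Server, Content) ->
    Server ! {self(), Content},
    receive
        {Server, Reply} -> Reply
    end.

loop(Handle, State) ->
    receive
        {From, Content} ->
            {Reply, NewState} = Handle(Content, State),
            From ! {self(), Reply},
            loop(Handle, NewState)
    end.
```

The loop is entirely generic: everything specific to a given server lives in Init and Handle, which is the separation that OTP behaviours later formalize with callback modules.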
Use server1 to implement a name server.
Usage example:
What is the gen_server callback structure?
- Any ≡ Request
- Any ≡ Reply
- Any ≡ Message
- Any ≡ Reason
- init : Any → State
- handle_call : Request PID State → reply × Reply × State
- synchronous calls
- handle_cast : Message State → noreply × State
- asynchronous calls
- handle_info : Message State → noreply × State
- messages not sent with call or cast
- terminate : Reason State → ok
- code_change : ??? → ???
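To connect these signatures to real code, here is a small counter written as a gen_server callback module. The module name counter_server and its API (increment/0, value/0) are hypothetical; the callback names and return tuples are the standard gen_server ones.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API (hypothetical names).
-export([start_link/0, increment/0, value/0]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).
increment()  -> gen_server:cast(?MODULE, increment).   % asynchronous
value()      -> gen_server:call(?MODULE, value).       % synchronous

init(Start) -> {ok, Start}.

handle_call(value, _From, N) -> {reply, N, N}.

handle_cast(increment, N) -> {noreply, N + 1}.

%% Messages that were not sent with call or cast end up here.
handle_info(_Msg, N) -> {noreply, N}.

terminate(_Reason, _N) -> ok.

code_change(_OldVsn, N, _Extra) -> {ok, N}.
```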
Assume that a company wants to sell prime numbers and areas, what are the main processes involved?
- A process that produces prime numbers.
- A process that computes areas.
- A supervisor to recover production when it crashes.
- A logger that will persist error reports to fix production crashes.
- Alarms to detect, for instance, overheating CPUs.
How to deploy our software?
By packaging it into an OTP application.
What is an event? What to do with them?
- An event is a piece of data that is built when something noteworthy happens.
- When an event happens, one may just tell the world about it:
EventHandler ! {event, E}
What is meant by very late binding?
- Given an event evt, the system may change its associated behavior at runtime.
- In most programming languages, the code that will get executed in reaction to evt is statically or dynamically linked.
- Everything else being equal, if the code executed is represented by code(t, handler(evt)) at a time t, then: code(t, handler(evt)) = code(t+1, handler(evt))
- In Erlang, it may not be the case. code(t+1, handler(evt)) may have been hot loaded concurrently at time t.
- In other words: the code associated with the processing of evt may be changed without stopping the system.
Why is it important to log errors?
- You want your system to work as expected.
- Things go wrong.
- You want to fix the minimal amount of code to make it right.
- Errors should give you enough insight into what went wrong to do exactly that.
- Think of it as a medical doctor that uses symptoms to posit a diagnosis and finally offer an efficient cure.
From how many points of view may the Erlang Logger be viewed?
- Programmer: to log an error, which procedure should be called?
- Configuration: where and how does the error logger store data?
- Report: how to analyse the errors?
What changes between dev and production systems regarding error logging?
- If the system is started with erl -boot start_clean (i.e. dev mode), then: simple logging is done.
- If the system is started with erl -boot start_sasl (i.e. prod mode), then: error logging is more involved.
How to configure the logger?
- Using a configuration file so that we do not need to remember configuration arguments when implementing the system.
Where are errors reported in SASL mode by default?
In the Erlang shell.
What kind of reports are produced?
- Supervisor report
- Progress report
- Crash report
What is the job of the supervisor?
- The supervisor is a process that monitors a tree of processes.
- It is the root of the tree.
- When a process fails, it may restart it, depending on some configuration.
What types of supervisor exist?
- one_for_one: If one process crashes, it is restarted.
- one_for_all: If one process crashes, all are terminated and then restarted.
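As a hedged illustration of the one_for_one strategy, the sketch below supervises a single trivial worker. The module name sup_sketch, the worker id worker1 and the restart limits (3 restarts in 10 seconds) are arbitrary choices for this example.

```erlang
-module(sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_worker/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(?MODULE, []).

%% A deliberately trivial worker: it only waits for a `stop` message.
%% It must be linked to the supervisor, hence spawn_link.
start_worker() ->
    {ok, spawn_link(fun() -> receive stop -> ok end end)}.

init([]) ->
    %% one_for_one: only the crashed child is restarted.
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
    Child = #{id => worker1,
              start => {?MODULE, start_worker, []},
              restart => permanent,
              shutdown => brutal_kill,
              type => worker,
              modules => [?MODULE]},
    {ok, {SupFlags, [Child]}}.
```

Changing strategy to one_for_all would make the supervisor terminate and restart every child whenever any one of them crashes.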
What are appname.app and appname_app.erl files?
- They describe an application in such a way that erl can start and stop the application:
    $ erl -boot start_sasl
    1> application:load(appname).
    2> application:start(appname).
    3> application:stop(appname).
    4> application:unload(appname).
What is the Erlang view of the world?
The Erlang view of the world is that everything is a process and that processes can interact only by exchanging messages.
How to reduce complexity when interfacing with programs outside Erlang?
- By using middle men.
- A Web server may receive {get, Page} and may reply with a String.
- Exterior requests may take the form of HTTP over TCP connections.
- Translating from HTTP requests to Erlang terms must be done somehow.
- A middle man may take this job.
- For each protocol, a middle man may execute the translation.
- The server stays the same.
- In other words, the middle man transforms an N × M problem into an N + M problem. It's like a world where everybody speaks English (or Mandarin): it's much easier to communicate.
- Given X = {Mod, p1, p2, …, pn}
- What does X:Func(A1, A2, …, An) mean?
- Give an example.
- It means Mod:Func(A1, A2, …, An, X). Tuple calls of this kind have been disabled by default since Erlang/OTP 21; a calling module must opt in with -compile(tuple_calls).
- Given:
    -module(counter).
    -export([bump/2, read/1]).
    bump(N, {counter, K}) -> {counter, N + K}.
    read({counter, N}) -> N.
  then:
    $ erl
    1> c(counter).
    2> X = {counter, 2}.
    3> X:read().
    2
- At time t a key/value store kv_1 is chosen.
- From t to t+99, code accumulates around kv_1 using API(kv_1).
- At time t+100 a key/value store kv_2 replaces kv_1.
- API(kv_1) ≠ API(kv_2)
- Consequence: everything breaks.
- How to avoid this pitfall?
- The Adapter Pattern.
- If kv_adapter is implemented using the adapter pattern, then: it may use different implementations to store the key/value pairs using the same interface.
    -module(kv_adapter_test).
    -compile(tuple_calls). %% needed since Erlang/OTP 21 for the M:store(…) calls below
    -export([test/0]).
    -import(kv_adapter, [new/1]).
    test() ->
        %% test the dict module
        M0 = new(dict),
        M1 = M0:store(key1, val1),
        M2 = M1:store(key2, val2),
        {ok, val1} = M2:lookup(key1),
        {ok, val2} = M2:lookup(key2),
        error = M2:lookup(nokey),
        %% test the lists module
        N0 = new(lists),
        N1 = N0:store(key1, val1),
        N2 = N1:store(key2, val2),
        {ok, val1} = N2:lookup(key1),
        {ok, val2} = N2:lookup(key2),
        error = N2:lookup(nokey),
        ok.
- Implementation:
    -module(kv_adapter).
    -export([new/1, store/3, lookup/2]).
    new(dict) -> {?MODULE, dict, dict:new()};
    new(lists) -> {?MODULE, list, []}.
    store(Key, Val, {_, dict, D}) ->
        D1 = dict:store(Key, Val, D),
        {?MODULE, dict, D1};
    store(Key, Val, {_, list, L}) ->
        L1 = lists:keystore(Key, 1, L, {Key, Val}),
        {?MODULE, list, L1}.
    lookup(Key, {_, dict, D}) ->
        dict:find(Key, D);
    lookup(Key, {_, list, L}) ->
        case lists:keysearch(Key, 1, L) of
            {value, {Key, Val}} -> {ok, Val};
            false -> error
        end.
What is intentional programming?
- If the programmer can guess what the code does just by looking at the code, then: intentional programming has been correctly implemented.
- For instance: lookup(Key, Dict) -> {ok, Value} | not_found may be used to fetch a value, search a value and test if a key exists. Instead, better provide this API:
    dict:fetch(Key, Dict) = Val | EXIT
    dict:search(Key, Dict) = {found, Val} | not_found
    dict:is_key(Key, Dict) = Boolean
How to use third-party programs?
Various solutions exist; one of them is rebar.
Why is shared state concurrency a problem?
- Shared state concurrency implies mutable state.
- Mutable state means a region of memory that changes.
- If multiple processes share state, then: they must coordinate writes to the shared state.
- In particular, they cannot write to the same region of memory at the same time.
- In order to prevent this situation from happening, mutexes have been introduced.
- Mutexes behave like locks on a region of memory.
- A process p1 locks region R so that only p1 can write to R.
- When p1 stops writing, it releases R so that others may write.
- One problem is that, when p1 locks R then crashes, then: R stays locked and all other processes are prevented from writing: computation stops.
- Another problem occurs if p1 corrupts the memory: all other processes are sent astray and do not know what to do.
How is Erlang solving shared state concurrency?
- By not having shared mutable state: there are no mutable data structures, so no locks.
- ⇒ easy to parallelize computations
Given the Erlang programming language, how to parallelize execution?
- your program must have lots of processes
- avoid side effects
- avoid sequential bottlenecks
- send small messages, do big computations
Why does having lots of processes favor parallel execution?
- All the CPUs should be busy all the time. The easiest way to achieve this is to have lots of processes.
- Preferably the processes should do similar amounts of work. It’s a bad idea to write programs where one process does a lot of work and the others do very little.
Why does avoiding side effects favor parallel execution?
- Side effects prevent concurrency.
- They may lead to concurrent writes to a single memory region, which in turn requires mutexes.
How is a Distributed Ticket Booking System linked to distributed hash tables?
- If one agency sells tickets from a set of tickets, then: the agency is a sequential bottleneck.
- If two agencies sell tickets, one with odd numbers and the other with even numbers, then: the agencies are guaranteed not to sell the same ticket twice.
- Replacing the 2 agencies by n(t) agencies that can crash while maintaining the guarantee is an active area of research which goes by the name of distributed hash table.
A list of experiments demonstrating various Elixir concepts in practice.
Whatever the source: books, documentation, blog post, …, reference whatever is useful.