A Brief Look at Synchronization

About synchronization

Synchronization is a hard problem in distributed systems: data has to be kept in sync across multiple machines so that users never read incorrect data. I will set up a scenario and explain synchronization in terms of it. I hope this makes the topic easier to follow.

The story …

Suppose there is a bank with one machine in city A and one in city B. A customer has $500 in his account. He deposits $10 in city A, and at the same time the bank pays him 1% interest in city B. So now:
City A: $510
City B: $505

If we synchronize now …

City A: $510 * 1.01 = $515.10 (pay 1% interest)
City B: $505 + $10 = $515 (deposit $10)

Concept …

The core idea of synchronization is that every machine applies all updates in the same order.
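To make the point concrete, here is a minimal sketch that replays the bank story with the two updates applied in both orders. The function names and the choice of integer cents (to avoid floating-point rounding) are my own assumptions for illustration.

```python
# Hypothetical illustration: amounts are integer cents to avoid float rounding.

def deposit_10(balance_cents):
    """Deposit $10."""
    return balance_cents + 1000


def pay_interest(balance_cents):
    """Pay 1% interest, rounded down to a whole cent."""
    return balance_cents * 101 // 100


def apply_all(balance_cents, updates):
    """Apply the updates one by one, in the given order."""
    for update in updates:
        balance_cents = update(balance_cents)
    return balance_cents


start = 50000  # $500.00

# City A saw the deposit first, city B saw the interest first.
city_a = apply_all(start, [deposit_10, pay_interest])
city_b = apply_all(start, [pay_interest, deposit_10])

# The two replicas end up with different balances.
print(city_a / 100, city_b / 100)
```

Same updates, different order, different final balance. That is exactly the inconsistency the story describes.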

Replicated State Machine (RSM) …

  • Every replica is modeled as a state machine.
    • Starting from identical states, applying the same updates in the same order yields the same final state.
  • Over time
    • If a replica goes down, start a new one.
  • Over space
    • Multiple replicas can run at the same time.
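A rough sketch of the idea, assuming the state is just one account balance and the agreed order of updates is available as a shared list. The `Replica` class and the update format are hypothetical, made up for this example.

```python
class Replica:
    """A hypothetical replica whose state is one account balance in cents."""

    def __init__(self, balance_cents):
        self.balance_cents = balance_cents

    def apply(self, update):
        kind, amount = update
        if kind == "deposit":
            self.balance_cents += amount
        elif kind == "interest":
            # amount is the interest rate in percent
            self.balance_cents = self.balance_cents * (100 + amount) // 100
        else:
            raise ValueError("unknown update: %r" % (kind,))


# The agreed order of updates, shared by every replica (an assumption here).
log = [("deposit", 1000), ("interest", 1)]

# Over space: two replicas run at the same time and apply the same log.
replica_a = Replica(50000)
replica_b = Replica(50000)
for update in log:
    replica_a.apply(update)
    replica_b.apply(update)

# Over time: a replacement replica catches up by replaying the same log.
replacement = Replica(50000)
for update in log:
    replacement.apply(update)

print(replica_a.balance_cents, replica_b.balance_cents, replacement.balance_cents)
```

All three replicas finish with the same balance because they start from the same state and apply the same updates in the same order.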

Challenge of RSM

Updates are ordered by the time they are received, but each replica may receive them at a different time, and clocks are not synchronized across replicas. On top of that, there may be clock drift, clock skew, and network latency.

Logical clocks …

1. Disregard precise time and only order events.
2. Every event has its own "logical time".
3. Order all events by their logical time.

Rules of Logical clocks

1. Each process starts with its logical clock at 0.
2. Within a process, the logical clock is incremented by 1 whenever an event happens.
3. When a process receives a message from another process, it takes the maximum of its own clock and the message's timestamp, then adds 1.
4. If event a happened before event b, then C(a) < C(b). The converse is not guaranteed in general, but ordering events by logical time (breaking ties by process ID) still gives every replica the same order.

Above figure …

Taking P0 as an example:
1. Event a happens, so the logical clock is incremented by 1: (a,1).
2. P0 receives a message from P2 with timestamp 1, the same as P0's own clock, so the maximum plus 1 (1+1) gives (b,2).
3. P0 receives a message from P1 with timestamp 1, while P0's clock is 2, so the maximum plus 1 (2+1) gives (c,3).
4. C(a) < C(b), which is consistent with event a happening before event b.
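The walkthrough for P0 can be reproduced with a few lines of code. This is a minimal sketch of the rules, not a complete implementation. The class and method names are my own, and I assume that the messages from P1 and P2 are the first event on each of those processes, so both carry timestamp 1.

```python
class LamportClock:
    """A minimal logical clock following the rules above."""

    def __init__(self):
        self.time = 0  # every process starts at 0

    def local_event(self):
        self.time += 1  # an event happened: add 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value is the message timestamp.
        return self.local_event()

    def receive(self, message_time):
        # Take the larger of the two clocks, then add 1.
        self.time = max(self.time, message_time) + 1
        return self.time


p0, p1, p2 = LamportClock(), LamportClock(), LamportClock()

a = p0.local_event()       # event a on P0
from_p2 = p2.send()        # P2's message carries timestamp 1
from_p1 = p1.send()        # P1's message carries timestamp 1
b = p0.receive(from_p2)    # max(1, 1) + 1
c = p0.receive(from_p1)    # max(2, 1) + 1

print(a, b, c)  # 1 2 3
```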


Back to the example …

Now assume:
City A: the $10 deposit has logical time (a,2)
City B: the interest payment has logical time (b,3)
Since C(a) < C(b), both machines should apply event a first and then event b.
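Here is a small sketch of what that looks like in practice: each city receives the two events in a different arrival order, but both sort by logical time before applying them. The event tuples and the `OPERATIONS` table are hypothetical names for this example, and amounts are integer cents.

```python
# Hypothetical operations on a balance held in integer cents.
OPERATIONS = {
    "deposit": lambda cents: cents + 1000,         # deposit $10
    "interest": lambda cents: cents * 101 // 100,  # pay 1% interest
}


def apply_in_logical_order(balance_cents, events):
    # Sort by (logical time, site); the site name only breaks ties.
    for _time, _site, name in sorted(events):
        balance_cents = OPERATIONS[name](balance_cents)
    return balance_cents


deposit_event = (2, "A", "deposit")
interest_event = (3, "B", "interest")

# The arrival order differs per city, but the applied order is the same.
city_a = apply_in_logical_order(50000, [deposit_event, interest_event])
city_b = apply_in_logical_order(50000, [interest_event, deposit_event])

print(city_a, city_b)  # 51510 51510
```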

Waiting time …

If every update has to wait until the event order is confirmed, a lot of time is spent waiting.

There are two ways to solve this problem:

1. Synchronize all nodes periodically, instead of synchronizing only when an update arrives.
2. Allow temporary inconsistency. This requires logging state so that updates can be rolled back.
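The second approach can be sketched roughly as follows. This is only one possible way to do it, under my own assumptions: the replica keeps a checkpoint of its initial balance plus a sorted log, applies updates as soon as they arrive, and rolls back and replays when an update with an earlier logical time shows up late. `OptimisticReplica` and `OPERATIONS` are hypothetical names.

```python
import bisect

# Hypothetical operations on a balance held in integer cents.
OPERATIONS = {
    "deposit": lambda cents: cents + 1000,         # deposit $10
    "interest": lambda cents: cents * 101 // 100,  # pay 1% interest
}


class OptimisticReplica:
    def __init__(self, balance_cents):
        self.initial_cents = balance_cents  # checkpoint to roll back to
        self.balance_cents = balance_cents
        self.log = []                       # (logical_time, name), kept sorted
        self.rollbacks = 0

    def receive(self, logical_time, name):
        entry = (logical_time, name)
        bisect.insort(self.log, entry)
        if self.log[-1] == entry:
            # Arrived in order: apply on top of the current state.
            self.balance_cents = OPERATIONS[name](self.balance_cents)
        else:
            # Arrived late: roll back to the checkpoint and replay the log.
            self.rollbacks += 1
            self.balance_cents = self.initial_cents
            for _time, logged_name in self.log:
                self.balance_cents = OPERATIONS[logged_name](self.balance_cents)


city_b = OptimisticReplica(50000)
city_b.receive(3, "interest")  # applied right away, temporarily 50500
city_b.receive(2, "deposit")   # earlier event arrives late: rollback and replay

print(city_b.balance_cents, city_b.rollbacks)  # 51510 1
```

The replica never waits, at the cost of being briefly inconsistent and having to keep enough history to undo its work.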

-MsHe

A Brief Look at the Master-Worker Architecture

Master-Worker Architecture …

This is a common architecture in distributed systems. One master communicates with several workers, and workers can also communicate with each other.

How Master-Worker works …

In terms of the master

1. The master can act as the entry point and receive external requests.
2. It assigns tasks to workers.
3. It periodically sends messages to workers to make sure they are working properly.

In terms of the worker

1. It registers itself with the master.
2. It receives tasks from the master and returns the results.
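The roles above can be sketched in a single process. This is a toy model under my own assumptions: the `Master` and `Worker` classes, the round-robin assignment, and the "square a number" task are all made up for illustration, and a real system would talk over the network instead of calling methods directly.

```python
class Master:
    def __init__(self):
        self.workers = []

    def register(self, worker):
        self.workers.append(worker)

    def healthy_workers(self):
        # Ping every registered worker and keep the ones that answer.
        return [w for w in self.workers if w.ping()]

    def submit(self, tasks):
        """Entry point: spread tasks round-robin over healthy workers."""
        workers = self.healthy_workers()
        if not workers:
            raise RuntimeError("no healthy worker available")
        results = {}
        for i, task in enumerate(tasks):
            worker = workers[i % len(workers)]
            results[task] = (worker.name, worker.run(task))
        return results


class Worker:
    def __init__(self, name, master):
        self.name = name
        self.alive = True
        master.register(self)  # the worker registers itself with the master

    def ping(self):
        return self.alive

    def run(self, task):
        return task * task  # the "task" is just squaring a number


master = Master()
w1 = Worker("w1", master)
w2 = Worker("w2", master)

first = master.submit([1, 2, 3, 4])   # tasks alternate between w1 and w2
w2.alive = False                      # simulate a failed worker
second = master.submit([1, 2, 3, 4])  # every task now goes to w1

print(first)
print(second)
```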


Advantages …

1. The master can coordinate the workload of each worker.
2. By pinging workers periodically, the master can monitor them and reassign tasks if a worker fails.

Disadvantages …

1. It requires a large amount of communication.
2. Consistency has to be ensured.
3. It needs fault tolerance (how to handle a master or worker failure).

-MsHe

A Brief Look at RPC

What is RPC …

Remote Procedure Call (RPC), as the name suggests, lets a program call a procedure that is located and executed on a remote machine. RPC is widely used in distributed systems to increase a system's computing capacity.

Advantages …

  1. RPC can run over several protocols, such as HTTP or TCP.
  2. RPC frameworks often come with their own load-balancing strategy.
  3. RPC requests are small, so transmission is efficient.

Challenges …

1. Passing parameters and results over the network takes a certain amount of time.
2. It needs fault tolerance and must ensure the correctness of data.

Process …

1. The client packs the parameters into a message and sends the message to the server.
2. The client waits (blocks) until it receives the reply from the server.
3. When the request arrives, the server machine's OS passes the message to the server stub.
4. The server stub unpacks the message, extracts the parameters, and calls the procedure.
5. Once the result is available, it is packed into a message and sent back to the client.
6. When the client machine receives the message, the client stub is unblocked, unpacks the message, and passes the result to the caller.
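The six steps can be imitated in one process. This is a simplified sketch under my own assumptions: JSON is the message format, a plain function call (`fake_network`) stands in for the OS and the network, and all names are hypothetical. A real RPC framework does much more than this.

```python
import json


def add(x, y):
    return x + y


def append_one(items):
    items.append(1)
    return items


# Procedures the (hypothetical) server exposes.
PROCEDURES = {"add": add, "append_one": append_one}


def server_stub(request_bytes):
    request = json.loads(request_bytes.decode("utf-8"))         # unpack the message
    result = PROCEDURES[request["method"]](*request["params"])  # call the procedure
    return json.dumps({"result": result}).encode("utf-8")       # pack the result


def fake_network(request_bytes):
    # Stands in for the OS and the network between client and server.
    return server_stub(request_bytes)


def client_stub(method, *params):
    message = json.dumps({"method": method, "params": params})  # pack the parameters
    reply_bytes = fake_network(message.encode("utf-8"))         # blocks until the reply
    return json.loads(reply_bytes.decode("utf-8"))["result"]    # unpack for the caller


print(client_stub("add", 2, 3))  # 5

# The server only ever sees a copy of the argument, so the caller's list is untouched.
original = [0]
returned = client_stub("append_one", original)
print(original, returned)  # [0] [0, 1]
```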


Some questions …

1. How are pointers or references passed?
The entire data structure is simply copied.

2. Why are parameters packed?
To convert them into a specific format that is suitable for transmission between machines over the network (language independence).

3. Why do parameters need to be converted to a specific format?
The client and the server might use different programming languages. With a language-independent format, the message can be read regardless of the programming language.

4. Does the client spend a long time waiting for the server's reply?
It depends on the network and on how long the server takes to compute. The call may take too long and time out. Asynchronous RPC is one solution, but I won't go into much detail here.

5. What is asynchronous RPC?
The client does not wait for the server's result. Once the client knows the server has accepted the request, it unblocks. When the server has the result, it notifies the client (callback).
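The callback idea can be sketched with the standard library. Here a thread pool stands in for the remote server, which is purely an assumption for illustration. `slow_square` and `on_done` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Pretend remote procedure that takes a while."""
    time.sleep(0.1)
    return x * x


results = []


def on_done(future):
    # Callback: runs once the "server" has finished the call.
    results.append(future.result())


with ThreadPoolExecutor(max_workers=1) as server:
    # submit() returns as soon as the request is accepted, not when it is done.
    future = server.submit(slow_square, 7)
    future.add_done_callback(on_done)

    # The client is free to do other work in the meantime.
    other_work = sum(range(10))

# Leaving the with-block waits for the outstanding call to finish.
print(other_work, results)  # 45 [49]
```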

-MsHe

A Brief Look at MapReduce

Why we need MapReduce …

More and more data is being recorded, and a single machine can no longer store or process that much information. The data needs distributed storage and processing, and MapReduce is a way to process large amounts of data quickly in a distributed fashion.


Desired properties of MapReduce …

1. Scalable
The computation runs on several machines, and machines can easily be added or removed.

2. Fault-tolerant
If one machine goes down, the other machines keep processing data.

3. Widely applicable
It is simple enough that programmers can define or customize it.

How MapReduce works …

1. Map operation
The map operation computes a value for each input key and emits intermediate <key,value> pairs.
In the figure above, the goal is to count how often each character occurs, so the map operation emits intermediate pairs such as <A,1>, <B,1>, <R,1>, …

2. Reduce operation
The reduce operation merges the intermediate <key,value> pairs that belong together and produces the final result.
In the figure above, the reduce operation merges the pairs with the same key and produces the final count for each character.

Process …

1. The raw input data is split at random into several pieces, which are sent to different machines for processing.
2. Each machine runs the map operation and produces a set of intermediate key-value pairs.
3. The data is shuffled and sorted by key.
4. The reduce operation runs and produces the final results.
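The four steps can be tried out on one machine with a character count. This is a toy sketch, not how a real MapReduce framework is implemented: the input string is my own example, the "machines" are just list elements, and the split is by fixed size rather than random.

```python
from collections import defaultdict


def map_operation(chunk):
    """Emit an intermediate <character, 1> pair for every character."""
    return [(ch, 1) for ch in chunk]


def shuffle(mapped_chunks):
    """Group the intermediate values by key and sort the keys."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_operation(key, values):
    """Merge all values of one key into the final count."""
    return key, sum(values)


raw_input = "ABRACADABRA"  # hypothetical input

# 1. Split the input into pieces (fixed-size here, for simplicity).
chunks = [raw_input[i:i + 4] for i in range(0, len(raw_input), 4)]
# 2. Run the map operation on every piece.
mapped = [map_operation(chunk) for chunk in chunks]
# 3. Shuffle and sort by key.
grouped = shuffle(mapped)
# 4. Run the reduce operation per key.
result = dict(reduce_operation(key, values) for key, values in grouped.items())

print(result)  # {'A': 5, 'B': 2, 'C': 1, 'D': 1, 'R': 2}
```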

-MsHe

A Brief Look at Distributed Systems

Definition of Distributed System …

Multiple computers that are interconnected and cooperate to provide some service.


Why we need distributed systems …

1. Overcome geographic limits
If clients are too far from the servers, messages take too long to transmit, which can cause problems, or the connection itself may fail.

2. Build reliable systems from unreliable components
No component (computers, cables, networks, …) can be guaranteed to work without failure. We have to build a stable system for customers out of these unreliable parts.

3. Aggregate several components for higher performance
With several machines, computation can be spread out to provide better performance and capacity.

Challenges …

1. Partial failures
We often cannot tell exactly which part of the system failed: is it a network problem, did a machine go down, or is it just running slowly and timing out? Failure detection is one of the challenges of distributed systems.

2. Concurrency
When several users issue operations at the same time, how should they be executed? This leads to another common problem: how to keep data consistent across servers.

3. Testing
A distributed system can run into too many situations for us to test every scenario.
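The partial-failure problem is easy to see in a tiny sketch of a timeout-based failure detector. The node names, timestamps, and timeout are made-up values for illustration. A detector like this only knows when it last heard from a node, not why it stopped hearing from it.

```python
def suspected_failed(last_heartbeat, now, timeout):
    """A node is suspected once no heartbeat arrived within the timeout."""
    return now - last_heartbeat > timeout


# Hypothetical times (in seconds) at which each node was last heard from.
last_heartbeat = {
    "crashed": 0.0,  # the machine really went down
    "slow": 0.0,     # alive, but its heartbeat is stuck behind a slow network
    "healthy": 9.5,
}

now, timeout = 10.0, 3.0
suspects = sorted(
    name for name, seen in last_heartbeat.items()
    if suspected_failed(seen, now, timeout)
)

# The detector cannot tell the crashed node from the slow one.
print(suspects)  # ['crashed', 'slow']
```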

-MsHe

The Aesthetics of Living on One Soup and One Dish

About diet …

Nowadays we often chase novelty and refinement in food. Restaurant food may be delicious, but it is often heavily seasoned. Diet is like lifestyle: the simpler it is, the more you can enjoy it.

Rice …

Rice is one of the most common staple foods in Asia, and the simplest. We barely season it when we cook it, yet we eat it every day and rarely tire of it. It is a lot like life, which is sometimes simple to the point of being bland.

Simple …

This book emphasizes eating simply. As the author says: 'The simpler a thing is, the more beautiful it is and the more it deserves respect.' Simple food with a little seasoning lets you eat with peace of mind. If you spend your time on tasting, you will discover the real flavor of the food, and eat more healthily.

Cook …

If you have time, cook for yourself. Cook simply and you will gradually come to enjoy it. You will find that simplicity is a stance, a way of thinking, and above all a way of life.

Making the simple complicated is commonplace; making the complicated simple, awesomely simple, that’s creative.

– Charles Mingus

-MsHe

Tensorflow (3)

With the environment set up, we can start learning some basic syntax. Every programmer should be familiar with 'Hello world', right?

Print

import tensorflow as tf

# set up the constant
hello = tf.constant('hello world')

# start session
sess = tf.Session()

# run the operation
print(sess.run(hello))

Basic operation

There are two ways to do basic operations.

  1. Using constant
import tensorflow as tf

# Basic operations
# set up constant

a = tf.constant(2)
b = tf.constant(3)

# print the output 
with tf.Session() as sess:
    print("a=2, b=3")
    print("Addition with constants: %i" % sess.run(a+b))
    print("Multiplication with constants: %i" % sess.run(a*b))
    print("Subtraction with constants: %i" % sess.run(a-b))
    print("Division with constants: %f" % sess.run(a/b))

2. Using placeholders

import tensorflow as tf

#Define a and b
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)

# Define operations
add = tf.add(a,b)
mul = tf.multiply(a,b)
sub = tf.subtract(a,b)
div = tf.divide(a,b)


#Print the output
with tf.Session() as sess:
    # Run every operation with variable input
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 5}))
    print("Multiplication with variables: %i" % sess.run(mul, feed_dict={a: 2, b: 5}))
    print("Subtraction with variables: %i" % sess.run(sub, feed_dict={a: 2, b: 5}))
    print("Division with variables: %f" % sess.run(div, feed_dict={a: 2, b: 5}))

In this article, we introduced some basic operations.

-MsHe

Discipline and Freedom

We must all suffer from one of two pains: the pain of discipline or the pain of regret. The difference is discipline weighs ounces while regret weighs tons.

– Jim Rohn, Author

About freedom …

Freedom is something everyone longs for. But what is real freedom? Everyone has a different answer, but I don't think it means doing only what you like without a care. If it did, so many people would not feel anxious about retirement. I think freedom is closer to knowing how to spend your own time. For people without a fixed schedule, this is a question that takes some thought.

I have never known a really successful man who deep in his heart did not understand the grind, the discipline it takes to win.

– Vince Lombardi, American Football Coach

Discipline of thought …

1. Persistence
There is always something waiting for you. Don't slack off.

2. Practice
Plan your schedule and know what you should be doing at any given time.

3. Concentration
Remember your goal so that you don't lose your direction.

4. Fear of failure
Being afraid of failing is what keeps you working hard.

5. Laugh
Laugh when you feel you can't hold on anymore. Don't forget to enjoy the pain.

Discipline of action

1. Get up early
Get up early so you have more time to get things done.

2. Work out
Your physical condition affects your efficiency. Stay healthy.

3. Plan
Be clear about what you need to do each day.

4. Do
Once you have a plan, what remains is to act. A plan is useless if you do nothing.

5. Sleep
Pay attention to your sleep. Good sleep is a habit that keeps you healthy and efficient.

I firmly believe that discipline equals freedom, and I am still working my way down that road. Persistence is hard, but I am more afraid of regret. Work hard, learn hard, and have no regrets: that is how I often motivate myself. I hope you are free.

– MsHe

Tensorflow(2)

Next …

In Tensorflow(1), we were already able to run Python from the command line. Next, to make coding easier and more efficient, I am going to pair it with VS Code. VS Code has a good interface and good plug-ins, and you can run code directly inside it. So today I am going to set up VS Code together with the Tensorflow environment.

Steps

1.Install Python

2.Set up your environment parameters

Enter your own values for the settings in the red box.
python.pythonPath: the folder where Python is installed.
python.autoComplete.extraPaths: the folder of the new environment you created yesterday.

3.Test

3.1 Create a workspace

3.2 Your configuration will show up in the Debug view

3.3 Run yesterday's code

Problem encountered

At first I did not create a workspace. I just opened a new .py file, and it failed: I could not find the configuration I had just set up, until I discovered that a workspace must exist before a configuration can be selected.

Wise men learn by other men’s mistakes; fools by their own.

-MsHe

Tensorflow (1)

Introducing Tensorflow …

Tensorflow:
1. Is a development platform for machine learning
2. Supports Python and C++
3. Comes in two versions, CPU and GPU
4. Was developed by Google

Environment …

  1. Windows 10
  2. CPU Tensorflow
  3. Python 3.7

Steps …

1.Install Anaconda
https://www.anaconda.com/distribution/#windows

2.Open Anaconda Prompt

3. input
pip install tensorflow

4.input
conda create -n tensorflow python=3.5
According to what I found online, many developers say Python 3.5 is currently the most stable version to run on, so I started with Python 3.5.

5.input
activate tensorflow
This is the tensorflow environment you just created.

6.test
import tensorflow as tf
hello = tf.constant('Hello world')
sess = tf.Session()
print(sess.run(hello))

The output

7. input
deactivate

8. Problem encountered …

When I first tried to import tensorflow, it showed "ImportError: No module named 'tensorflow'". I searched and found that many users had run into the same problem. It turned out another command was needed: conda install -c https://conda.anaconda.org/jjhelmus tensorflow. After that, everything worked fine for me.

References:
https://stackoverflow.com/questions/40472144/importerror-no-module-named-tensorflow-cant-install-tensorflow
https://en.wikipedia.org/wiki/TensorFlow
https://www.itread01.com/content/1546953667.html

Live well, love lots, and laugh often.

-MsHe